Behind the scenes of the Mixed Reality trailer shoots for Fantastic Contraption and Job Simulator for the HTC VIVE. http://kertgartner.com | http://handcraftcreative.com/ | http://owlchemylabs.com/ | http://northwaygames.com | http://radialgames.com/ |

Why Shoot a Mixed Reality Trailer?

Because it’s the best way to convey what it is actually like to be in virtual reality on a 2D screen. Translating the feeling of being in a fully interactive virtual environment onto a 4.7" screen on your iPhone is a difficult problem.

Creating a trailer that showcases first person footage from the head mounted display is the traditional method people have been using, and it gets you part way there. But it never really gives the viewer a sense of the scale or presence of room scale VR.

When you see someone inside VR interacting with their virtual environment, something clicks in your brain, and the viewer understands it in a way that just isn't the same with first person footage.

For Fantastic Contraption we explored three different methods of showcasing what it’s like to be inside VR playing the game.

  1. We composited the player into the virtual environment
  2. We brought some of the virtual environment into the real world
  3. We filmed an in-game avatar inside the environment with physical camera moves created in the real world

Even if your game isn't the best fit for mixed reality footage (methods 1 and 2), you can evoke the same sense of immersion by creating an in-game avatar that fits the style of the game and creating camera moves around that character in the virtual space. I believe that for studios that don't have the resources and budget to create full mixed reality trailers, this will be a great way to showcase their game in the best possible light. This is the method we chose for the Space Pirate Trainer trailer and the Fantastic Contraption trailer for Oculus Touch. You can read about that experience, and the pros and cons of doing it this way vs. mixed reality, here.

When Colin approached me about making a trailer for Fantastic Contraption, we wanted to take their mixed reality streaming tech and push it one step further. We wanted perfectly synced and composited footage of people in Fantastic Contraption building machines and having fun, while having a completely free camera that we could fly around them to evoke the feeling of what it's really like to be inside the game.

Here’s how we did it and what was involved. (TL;DR - this is way more work than you’d ever imagine)


Other Mixed Reality Trailers

The crew at MPC created the amazing trailer for Tilt Brush. They recently posted a behind the scenes video where you get a glimpse into their process, which is similar to ours but with a very different execution. They shot against grey and did not capture the camera moves live. The only thing captured live was the creation of the actual geometry; they then 3D tracked the live action footage and brought that camera data back into Unity. They then rotoscoped the actors off of the grey background (very labour intensive) and rendered different passes out of Unity (foreground, background, z-depth) to create a final composite in Nuke. MPC also went to the effort of painting out all of the wires coming out of the HMD. A very expensive and labour intensive process, but the results look great!

GravLab is another interesting Mixed Reality trailer. They mounted a GoPro to the HMD of the VIVE and managed to get some first person shots of the player interacting with the virtual environment using their real hands and controllers. There's also some static third person shots which look like they were shot with a webcam. The alignment isn't perfect in this trailer and the production value isn't as high as the Contraption or Tilt Brush trailers, but it's cool to see other developers creating mixed reality trailers on a much smaller budget.

Soundstage: VR Music maker is another cool example. It's clearly a lot lower budget than the Tilt Brush or Contraption trailers, but the limited camera motion still helps give the viewer a sense of depth. I would have loved to see some more dramatic camera moves, but I'm guessing they were limited by the amount of green screen, or they didn't have a camera stabilizer since their moves are all on a slider.

Mindshow is a simple trailer with some decent mixed reality shots. It looks like their camera wasn't stabilized properly since the shots are pretty shaky. I don't think they added any camera smoothing to their first person shots either, so the whole trailer has a bit of a janky feel to it overall. Good effort though!


Other Ways to Create a Mixed Reality Video

The most exciting development in Mixed Reality in recent months (as of Dec 2016) is Owlchemy Labs' mixed reality tech. They're using a stereo depth camera and shooting a person in front of a greenscreen, extracting them, and compositing them IN THE PROPER DEPTH into the game world inside of Unity. This has a number of advantages over every other method, which they cover on their blog. It's worth looking at. The problem with their method is that it only works with that one ZED stereo depth camera (for now), and the quality of the depth map and RGB image isn't nearly as good as what you'll get with a high end DSLR. Compositing in-engine can only get you so far as well; you can extract more precise keys in post production using the wide variety of mature tools available for this task (Primatte, Keylight, etc.).

ZED now offers a Unity package to make their camera work in your own project!

I have no doubt in my mind that their method will become the de facto standard for streamers/YouTubers and live showcasing of VR going forward. But for things like trailers and more serious video production, I think using higher end cameras and compositing in post is still going to be the best way to achieve a professional looking result... at least for now.

Here is an amazing article by the creators of The Unspoken. It really shows how much custom work and time is needed to tune a mixed reality experience for a specific game. They didn't want to composite in post, so they took the green screen footage and composited it live in game so they could use the in-game lighting effects to make the player seem more like they're part of the world. Very interesting!

Here is another awesome writeup by Croteam on how they created their own mixed reality solution for Serious Sam VR. They have their own engine, so they created custom solutions to a lot of the problems described below. Here's a video showing off their own camera calibration solution, which is pretty slick!


Background

Still Image from one of the first mixed reality tests by Breakwater Studios and Colin Northway. Gameplay is overlaid at 50% opacity on top of the live action.


To get an idea of where this all began, Colin and Sarah Northway came up with the rough idea for mixed reality shooting around the winter of 2015, when Breakwater Studios came by to shoot a piece for Unity. At that time, the Northways didn't have a green screen studio in their apartment for shooting mixed reality videos, but they did have the idea to tape a DK1 Vive controller to the top of their camera and capture gameplay and live action simultaneously. The results were captivating even with the footage just overlaid at 50% opacity, and I think everyone watching knew it was the beginning of something great!

Once they covered their apartment in green screen and started live streaming the game every Thursday, the results were magical. You can read about the whole live streaming process and how to do it yourself here. The last step was to get that camera moving and better calibrate it to the live action footage so everything would feel 100% locked and natural.


Research and Development

One of the first tests mounting the third VIVE controller inverted on top of my Panasonic GH2. This test was not rigid enough and inverting the controller wasn't ideal.

Once I received a VIVE Pre, the immediate problem was how to sync a third VIVE controller to it so we could use it as our virtual in-game camera. The VIVE headset only has support for two bluetooth controllers, so you need to get an extra bluetooth dongle and pair it to another VIVE controller if you want a third wireless controller. Connecting the controller via USB is easiest, but if you want wireless (which is honestly the best way to go), buy a third VIVE controller here, then a Steam Controller dongle here, and flash the firmware of the Steam Controller's dongle to pair it with the VIVE controller.

The next step is to calibrate the in-game camera - tied to the location of the third controller - to the physical camera you're using to shoot the live action. At the moment, this step is very manual and requires a lot of patience and thought to get right.

Camera Calibration

This is where a lot of the magic happens and at least for now, this step is one of the most cumbersome and finicky parts of the whole experience. For a more thorough guide on how to calibrate your camera using OBS, please refer to this guide written by Colin and Sarah Northway.

The short version of that guide: once the third VIVE controller is mounted onto your camera rig, you need to figure out the FOV of your physical lens and match it to the virtual one. This is not as straightforward as it would seem, since Unity's FOV is measured as vertical FOV, which is different from how camera lenses are measured. The calculator here will get you 'close', but you'll still need to adjust it by hand afterwards since this measurement will not be exact.
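If it helps, here's a small sketch of the conversion math. The focal length and sensor numbers in the usage comment are hypothetical examples, not our actual values, and you'll still be nudging the result by hand.

```csharp
// Rough starting point for matching Unity's vertical FOV to a physical lens.
using UnityEngine;

public static class FovHelper
{
    // Vertical FOV in degrees from focal length and sensor height (both in mm).
    public static float VerticalFov(float focalLengthMm, float sensorHeightMm)
    {
        return 2f * Mathf.Atan(sensorHeightMm / (2f * focalLengthMm)) * Mathf.Rad2Deg;
    }

    // Or convert a horizontal FOV (how lenses are usually quoted) using the frame's aspect ratio.
    public static float VerticalFromHorizontal(float horizontalFovDeg, float aspect)
    {
        float hRad = horizontalFovDeg * Mathf.Deg2Rad;
        return 2f * Mathf.Atan(Mathf.Tan(hRad / 2f) / aspect) * Mathf.Rad2Deg;
    }
}

// e.g. virtualCamera.fieldOfView = FovHelper.VerticalFov(35f, 15.6f); // hypothetical 35mm lens, APS-C-ish sensor height
```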

An example of how we calibrated the in-game camera to the physical camera with the in-game controllers overlaid on top of the DSLR output using OBS. 

Most cameras do not have a full frame sensor, and not all lenses are created equal. There are multiple factors involved here, including lens distortion and cheaper fisheye lenses (especially on webcams), that will throw your calibration off more towards the edges of the frame than at the center.

Once you have the FOV figured out, it's a matter of offsetting the position of the in-game camera in XYZ to the center of the physical camera's sensor. This likely means moving it back and down a bit. You'll also have to compensate for any rotation on the XYZ axes. This sounds simple, but it's a total pain, and it took me about 20 minutes to get it "pretty close" every time we changed the mounting position. You'll likely be adjusting all the values for a while until you start to hone in on the correct ones.
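Conceptually, the offset looks something like the sketch below, assuming the script lives on the virtual camera and follows the tracked controller. The offset numbers here are made up; in practice you dial them in by eye against the live video feed.

```csharp
// A minimal sketch of offsetting the virtual camera from the tracked controller on the rig.
using UnityEngine;

public class TrackedCameraOffset : MonoBehaviour
{
    public Transform controller;  // the third VIVE controller mounted on the camera
    public Vector3 positionOffset = new Vector3(0f, -0.06f, -0.10f); // down/back toward the sensor (placeholder values)
    public Vector3 rotationOffset = Vector3.zero;                    // small corrective rotation in degrees

    void LateUpdate()
    {
        // Apply the hand-tuned offsets in the controller's local space.
        transform.position = controller.TransformPoint(positionOffset);
        transform.rotation = controller.rotation * Quaternion.Euler(rotationOffset);
    }
}
```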

I’m hoping in the future, Valve will offer some sort of solution for this that tracks the FOV and offset of the camera using computer vision technology and this entire section will become redundant. Once we had a good calibration with my old DSLR, we did some initial tests you can see below.

Another developer has released an interesting driver that fools SteamVR into thinking there's a third controller attached even when you only have two. There's also a handy calibration tool you can use to populate your externalcamera.cfg file if you're using that method to create your mixed reality setup. I haven't explored these tools in much detail but it looks like they could be very useful until Valve releases some new tools of their own.

TribalInstincts has also created a cool tool to help with camera calibration. You can take a look at it here. Keep in mind this is only meant to work with SteamVR's internal Mixed Reality setup and not something custom made like the one for Fantastic Contraption or Serious Sam VR. 

Jaroslav Stehlik is also working on his own calibration tool here!


Camera Choice

Our initial test mounting the a7S II to the Movi Steadycam with the third VIVE controller on top. In the end, we went with a more rigid solution for the final shoot.

These tests revealed a few problems: we needed to shoot with a higher end DSLR, and we needed a stabilized camera rig. After looking at a few options, we settled on a Sony a7S II and a Movi steadicam. We went with the a7S II over a Canon or RED because there wasn't enough clearance on the Movi to mount those larger cameras with the VIVE controller on top. Only the a7S II would fit with the controller and still balance properly on the Movi.

Frame Rate: 24 vs 30 vs 60fps

We experimented with shooting all the live action footage at 60fps to match the in-game footage, but this proved to be problematic on a few levels. 

If you've ever seen The Hobbit projected in a theater with a high frame rate (HFR) projector, you know how weird live action looks at anything other than 24fps. Film is traditionally shot and projected at 24fps, and when you start to go higher, it evokes the feeling of the footage being shot on video rather than film, since we're used to video/news etc. being presented at 30i or 60i.

The higher frame rate also makes any camera shakiness/judder WAY more noticeable, and even on our Movi rig, movements that were completely smooth at 24fps felt janky and wobbly at 60fps. To keep things feeling smooth and film-like, we made the decision to capture the live action at 24fps and conform the gameplay footage to that. I added motion blur to the gameplay footage so it would composite more seamlessly into the live action.

In some situations where there isn’t as much player and/or camera movement, capturing the live action at 60fps might look totally fine. This Audio Shield mixed reality test looks great, but the focus is on the background, there’s no handheld camera to worry about, and the player is essentially static.

The crew from Handcraft came over to shoot a few test shots in my basement. I posted this test to twitter, and people LOVED the footage. We knew we were on to something, all we needed was the green screen and a plan and we would be off to the races.


Unity Additions and Gameplay Output

Once we had the camera rig figured out, the next step was to investigate the best output we could get from Unity to capture the gameplay. We captured an entire 2560x1440 display at 60fps, which gave us four 1280x720 quadrants to work with. To capture the footage at 60fps we recorded the screen using Bandicam, which is now my go-to for capturing gameplay on Windows.

The upper right quadrant was the Foreground (FG) layer on blue, which encompassed everything in front of the head mounted display (HMD). The upper left quadrant was the Background (BG) layer, which was everything behind the HMD. The lower left quadrant was a full composite of those two layers in-game so we could capture the correct shadows from all the foreground objects, and the lower right was a smoothed first person view from the HMD.

Using a 4k monitor was discussed so we could capture four 1080p quadrants, but 720p was enough resolution to work with and we didn’t want to push extra pixels to the display and risk dropping frame rate inside the HMD.
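As a rough illustration of that quadrant layout, four Unity cameras can each be given a quarter of the screen via their viewport rects. The camera names here are assumptions for the sketch, not the actual project code.

```csharp
// A minimal sketch of the 2560x1440 quadrant layout described above.
using UnityEngine;

public class QuadrantLayout : MonoBehaviour
{
    public Camera backgroundCam;          // upper left:  everything behind the HMD
    public Camera foregroundCam;          // upper right: everything in front of the HMD, on blue
    public Camera compositeCam;           // lower left:  full in-game composite (for shadows)
    public Camera smoothedFirstPersonCam; // lower right: smoothed companion view

    void Start()
    {
        // Rects are in normalized viewport coordinates, (0,0) = bottom left.
        backgroundCam.rect          = new Rect(0.0f, 0.5f, 0.5f, 0.5f);
        foregroundCam.rect          = new Rect(0.5f, 0.5f, 0.5f, 0.5f);
        compositeCam.rect           = new Rect(0.0f, 0.0f, 0.5f, 0.5f);
        smoothedFirstPersonCam.rect = new Rect(0.5f, 0.0f, 0.5f, 0.5f);
    }
}
```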

Colin experimented with various ways to split the FG and BG layers, and pure z-clipping is one of the worst ways to do it. The method he chose was to "pop" full objects from layer to layer when they crossed a threshold at the centre of the HMD. This also made compositing much easier in post, since you didn't have to deal with objects being "half in front and half in back" with seams running through the middle.
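Conceptually, the popping works something like this simplified sketch (not Colin's actual implementation): each object checks which side of a plane through the HMD it sits on and assigns itself wholesale to the FG or BG render layer.

```csharp
// A simplified sketch of popping whole objects between the FG and BG layers.
using UnityEngine;

public class LayerPopper : MonoBehaviour
{
    public Transform hmd;            // the player's head mounted display
    public Transform virtualCamera;  // the tracked third-person camera
    public int foregroundLayer;      // layer rendered only by the FG camera
    public int backgroundLayer;      // layer rendered only by the BG camera

    void LateUpdate()
    {
        // A plane through the centre of the HMD, facing the camera.
        Vector3 toCamera = (virtualCamera.position - hmd.position).normalized;
        Plane splitPlane = new Plane(toCamera, hmd.position);

        // On the camera side of the plane -> foreground; otherwise -> background.
        bool inFront = splitPlane.GetSide(transform.position);
        gameObject.layer = inFront ? foregroundLayer : backgroundLayer;
    }
}
```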

Valve's Unity plugin has a method for splitting the views into four quadrants as well, but their method places a greyscale alpha channel in the UR quadrant, FG in the UL, BG in the LL, and first person in the LR. That method is interesting because it preserves transparency data. The way you use it is by taking the FG layer, dividing it by the alpha, then post-multiplying that over top of the greenscreen layer, which is on top of the background. There are artifacts doing it this way, but they can be mitigated to some degree in post.
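In pixel terms, that order works out to something like the sketch below. This is our reading of the workflow, not code from the Valve plugin, and assumes colours in a linear 0-1 range.

```csharp
// A per-pixel sketch of the compositing order described above.
using UnityEngine;

public static class QuadrantComposite
{
    // fgPremult: FG quadrant (rendered over black), fgAlpha: the greyscale matte quadrant,
    // talent: the keyed green screen footage, talentAlpha: its key matte, bg: BG quadrant.
    public static Color Composite(Color fgPremult, float fgAlpha, Color talent, float talentAlpha, Color bg)
    {
        // Un-premultiply the foreground by dividing by its alpha (guarding against divide-by-zero).
        Color fg = fgAlpha > 0f ? fgPremult / fgAlpha : Color.clear;

        // Keyed live action over the background...
        Color comp = Color.Lerp(bg, talent, talentAlpha);

        // ...then the foreground, multiplied back by its alpha, over the top.
        return Color.Lerp(comp, fg, fgAlpha);
    }
}
```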

If you're using OBS to stream or composite using the SteamVR method, you should grab this plugin which allows you to use the Alpha channel information to extract and composite the transparency properly!

Slating and World Rotation

The other thing we needed to add was an in-game slating system to sync up shots in post production. The filenames from Bandicam are numbered sequentially, but sometimes you accidentally hit record a few extra times, or you record a take on the camera that isn't being captured on the computer. Syncing those up after the fact would be a pain, so we added a quick hotkey that would put a take number on screen, and we had a physical slate that would go in front of the camera so we could match shots visually rather than having to rely on filenames.
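A hotkey like that can be as simple as the sketch below; the key bindings and names are made up for illustration, not the actual Contraption code.

```csharp
// A minimal in-game slate: one key bumps and shows the take number, another hides it.
using UnityEngine;

public class SlateHotkey : MonoBehaviour
{
    int takeNumber = 0;
    bool showSlate = false;

    void Update()
    {
        if (Input.GetKeyDown(KeyCode.F9)) { takeNumber++; showSlate = true; } // new take: bump the number and show it
        if (Input.GetKeyDown(KeyCode.F10)) showSlate = false;                 // hide the slate before "Action!"
    }

    void OnGUI()
    {
        if (!showSlate) return;
        GUIStyle big = new GUIStyle(GUI.skin.label) { fontSize = 96 };
        GUI.Label(new Rect(40, 40, 600, 150), "TAKE " + takeNumber, big);
    }
}
```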

Another addition that proved incredibly useful was a world rotation hotkey inside the game. This allowed us to do the SteamVR room setup once; then, if we needed to change the angle inside the game, we could hit a hotkey and have the world snap around in 90 degree increments. When you're shooting on a green screen set, you'll have at best 180 degrees of space to work with, so if you want a shot from the reverse angle of what you're currently shooting, this was a super fast way to get it.
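Something along these lines does the trick; the hotkey and field names are assumptions, not the actual Contraption code.

```csharp
// A sketch of a world-rotation hotkey: spin the environment root in 90 degree steps
// around the play area so you can grab a reverse angle without redoing room setup.
using UnityEngine;

public class WorldRotationHotkey : MonoBehaviour
{
    public Transform worldRoot;      // parent of the level geometry (not the play area or cameras)
    public Transform playAreaCentre; // point to pivot around

    void Update()
    {
        if (Input.GetKeyDown(KeyCode.F8))
            worldRoot.RotateAround(playAreaCentre.position, Vector3.up, 90f);
    }
}
```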


The Green Screen Stage and Shooting on Set!

If you're going to shoot a mixed reality trailer, please do not shoot it on a poorly lit green screen in your basement. It might work for some games, but if you want wide sweeping camera moves and don't want to struggle to pull proper keys from your green screen footage, you'll save tons of time in post by renting a studio. Our rental included one half day (4h) to set up the VIVE and lighting, and one full day (8h) of shooting, which included all of the lights used to light the screen and actors, along with a grip to set up/tear down the lights and help out on set for the entire day.

We hauled my computer rig there along with all the necessary equipment. After a bit of experimentation, we decided to use a corner of the green screen cove rather than the full depth to give the camera a bit more room to maneuver. We were losing tracking when the lighthouses were set up at the far edge of the studio, so moving one into the far corner helped keep the HMD, controllers, and camera controller all tracked - most of the time.

There were situations where the camera operator was between the actor and both lighthouses and we would lose tracking in those spots. I’m hoping that in the future the VIVE will support more than 2 lighthouses so we can cover more of the stage to get 100% coverage.

All said, the shoot on the green screen went extremely well and there were very few issues that cropped up due to the massive amount of testing we did beforehand.


Capturing First Person VR Footage

We knew from the beginning that Job Simulator would be showcasing mostly first person footage due to the limited scope of the mixed reality footage we were capturing, so we wanted to make sure we could get the best possible result.

The main piece of tech needed to make good first person footage is camera smoothing. For a great overview of how that tech works, take a look at this video by Andy Moore. The Coles Notes version is that rather than displaying 1:1 what the HMD sees, it takes the movement of the HMD and smooths it out, removing all of the shakiness and quick motions our heads make while in VR.

What feels completely natural inside of the HMD can look absolutely terrible on a 2D screen. Camera smoothing is the first thing that is absolutely necessary.
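For reference, a bare-bones version of that kind of smoothing might look like the sketch below. The constants and FOV are placeholder values, and the real implementation (see Andy Moore's video) is more sophisticated; the idea is just to chase the HMD pose instead of copying it 1:1.

```csharp
// A minimal smoothed companion camera for 2D capture.
using UnityEngine;

[RequireComponent(typeof(Camera))]
public class SmoothedCompanionCamera : MonoBehaviour
{
    public Transform hmd;
    public float positionSmoothing = 4f;  // higher = snappier, lower = floatier
    public float rotationSmoothing = 3f;
    public float fieldOfView = 70f;       // a little wider than what the player sees in the HMD

    void Start()
    {
        GetComponent<Camera>().fieldOfView = fieldOfView;
    }

    void LateUpdate()
    {
        // Framerate-independent lerp factors so the smoothing feels the same at any fps.
        float pT = 1f - Mathf.Exp(-positionSmoothing * Time.deltaTime);
        float rT = 1f - Mathf.Exp(-rotationSmoothing * Time.deltaTime);

        transform.position = Vector3.Lerp(transform.position, hmd.position, pT);
        transform.rotation = Quaternion.Slerp(transform.rotation, hmd.rotation, rT);
    }
}
```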

The other addition was the ability to change the FOV of the smoothed companion camera so it matched roughly to the viewable FOV inside the HMD. We made the decision to make it a little wider, and we captured all the Job Simulator footage at 2560x1440 @ 60fps. The trailer was created at 720p, so the extra FOV and resolution allowed us to pan the footage around and zoom in on it a little bit when needed. This proved extremely useful and I’d highly recommend capturing at a larger resolution and scaling footage down to your output res when you’re editing.

Another interesting discovery is that your eyes dart around inside the HMD, and you interact with things that are in your peripheral vision a lot, and that doesn’t read at all in the 2D footage. You’re not seeing the same FOV in the HMD that you are in the 2D Gameplay being captured. It takes a minimum of two people to do this right. One person acting out the movements inside the HMD, and another person at the computer watching and directing the actor. Make no mistake, the person in the HMD is an actor, and it’s amazing how much personality can come across in a few hand and head movements.

Getting proper hand movement and timing with your head movement is incredibly difficult, and if you want the hands to be visible in the captured footage, you have to hold them up in an unnaturally high position.

Tilting your head from side to side also feels incredibly awkward in 2D first person footage, unless it's used for a specific moment (like when we drink out of the slurpee machine).

It’s incredibly time consuming to capture first person footage and we did multiple takes of every shot in the trailer to get ones that felt perfect.


Why isn’t there more mixed reality footage in the Job Simulator trailer?

The main reason is that the team at Owlchemy had a very limited amount of time to integrate the mixed reality tech into Job Simulator in time for the shoot at the beginning of March. We managed to plan for a few specific shots in the one ‘job’ that was the most art complete at the time of the shoot. The rest of the game was very much in flux, so we were only able to shoot in the Office and the team only had time to implement the z-clipping method for splitting the layers in Unity which proved very problematic to work with. We captured the best footage we could for the limited amount of time we had on set, and integrated the best clips that fit tonally and narratively with the trailer. 

Another interesting issue specific to Job Simulator is the sense of scale. Everything inside of Job Simulator is oversized and cartoony, so when you composite a person into the game, they actually feel really tiny inside the environment. You know how big a coffee mug or a computer monitor should be, so when that mug is suddenly the size of the person's head, your brain knows something's off. For Job Simulator, I'm actually scaling up the live action footage somewhat to compensate for this, and warping the footage so it looks correct when the player is interacting with the elements.

This issue was a complete surprise to me and we had no way of knowing how it would look until we actually composited the footage.


Post Production and Compositing the Trailer

Post production on the mixed reality footage was fairly straightforward if you have a visual effects background and know how to properly key and colour correct green screen footage. I wouldn’t want to be brand new to chroma keying and be faced with this mountain of footage. 

To my surprise, the compositing in the official VIVE trailer is pretty subpar, and I'm guessing they either composited it live and captured that footage, or didn't have anyone on staff with the experience necessary to composite it together as seamlessly as we did on the Contraption trailer. Or maybe they simply didn't have the time to do it properly. It's pretty simple to overlay one piece of footage over the other, but they neglected to properly key the motion blur or do any sort of colour correction, light wrap, or compositing work on the green screen footage to help integrate it into the gameplay.

To help reduce motion blur on the green screen footage, we shot at a 1/100th shutter rather than the 1/50th that would have been standard for 24fps. The reduced motion blur helped keep our keys sharper, and we added motion blur back into the footage in post using ReelSmart Motion Blur and AE's built-in Pixel Motion Blur plugin.

For keying, I used a combination of Primatte and Keylight for various shots depending on which worked best. Most shots that were wide enough to show the entire body were a combination of multiple keys, with Primatte keys focusing on the body and Keylight for the ground/shadows.

Footage Calibration and Alignment

Since it's pretty much impossible to start the camera recording and the gameplay recording at the same time, we needed a way to align the footage to the exact same frame in post. Having the slate helped us keep the shots organized, but we still needed another step to line both pieces of footage up to the exact frame.

We ended up with a process I call “Controller calibration”. Every take had a number of steps:

  1. Roll Camera
  2. Roll Gameplay
  3. Slate Camera and gameplay
  4. Controller calibration with controllers visible to camera
  5. Hide controllers
  6. Action!

Every time we did a take we had to go through all those steps, which was prone to error. Fortunately, after a few takes we got into a rhythm and things went well. The controller calibration step involved the player moving their hands/controllers horizontally and vertically in a + shape. Since the controllers are moving relatively fast, it allowed me to line up the 60fps gameplay footage to the exact frame of the 24fps footage from the camera. Once it was set, as long as the shot was short enough, it would stay aligned from that point until we cut.

One issue I ran into was something I'll call "frame drift". On the longer 1+ minute takes, after about 30 seconds or so we would lose lock by a frame or so. I have no idea why this is the case, but I'm thinking it's due to After Effects interpreting some of the 60fps footage as 59.996 or some other fractional frame rate, and after enough time has elapsed, it drops or skips a frame. It was never a big enough issue to affect anything as long as I noticed it and adjusted the footage to match where needed.

Shooting in the Living Room

One thing I really wanted to have in the trailer was shots of people playing the game in a home setting with the contraption visible in the environment. This is basically the inverse of the green screen shoot, placing the in-game objects into the real world, rather than the person into the virtual world.

We wanted the TV displaying the game for the people on the couch to watch, while also capturing it on the computer and keeping the frame rate high enough that the HMD wouldn't drop frames. That meant rendering the two VIVE displays, the 1080p TV, and my monitor, which had to be set to 1080p (split into 4 views). As a result, all the footage in the living room was captured at 540p rather than 720p, which made it a bit soft. I haven't heard any complaints, and I honestly don't think anyone other than me noticed, so people pushing for 4k capture are kinda nuts. It's way overkill in almost every situation.

One other interesting side effect of pushing the game to two displays and the VIVE is that I couldn't use the normal game capture mode inside of Bandicam. It would refuse to work, I think because it didn't know how to properly capture gameplay running on two displays. I had to use window capture mode instead, which capped the frame rate at 30fps rather than 60. The problem there is that a 30fps cadence doesn't drop down to 24fps very smoothly, so you get some judder in the gameplay footage in the house since it doesn't mesh with the 24fps footage as well. Again, I don't think anyone noticed, but it's a problem I hope to resolve in the future.

To capture the gameplay in the house, we made a few adjustments to our layering system. I didn't want to rotoscope any live action footage (it would have been too time consuming), so I always wanted 100% of the contraption to be in front of the player, regardless of whether it was in the FG or BG layer. To accomplish this, we added an option to add a blue screen key to the BG layer so we could take both the FG and BG elements and composite them in front.
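A rough sketch of what that option might look like in Unity is below. The field names are assumptions, and in practice the BG camera's culling mask would also be limited so only the contraption renders over the blue.

```csharp
// Toggle the BG camera to clear to solid blue so the BG elements can be keyed in post
// and composited in front of the live action, just like the FG layer.
using UnityEngine;

public class BackgroundBlueKey : MonoBehaviour
{
    public Camera backgroundCamera;
    public Color keyColour = Color.blue;

    CameraClearFlags savedFlags;
    Color savedColour;

    public void EnableBlueKey()
    {
        savedFlags = backgroundCamera.clearFlags;
        savedColour = backgroundCamera.backgroundColor;
        backgroundCamera.clearFlags = CameraClearFlags.SolidColor;
        backgroundCamera.backgroundColor = keyColour;
    }

    public void DisableBlueKey()
    {
        backgroundCamera.clearFlags = savedFlags;
        backgroundCamera.backgroundColor = savedColour;
    }
}
```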

Editing the Trailer

One interesting side effect of editing a trailer with people reacting to new technology is that they have very strong, emotional reactions to what they're seeing. We shot a group of friends on the couch and had to cut a lot of the best reaction shots from the trailer, because it felt like their reactions were fake and over the top. Every reaction in the trailer is completely genuine, and none of the people are trained actors at all. They were all just having a great time!

There's a fine line between a reaction feeling cheesy and feeling genuine, and I think we found the perfect balance in the Contraption trailer. Interestingly enough, when they shot the VIVE launch trailer, they ran into the exact same issue and had to cut some of their best reactions because they feared people wouldn't believe them.


Unresolved Problems and Hope for the Future

There are many things I hope Valve implements in the coming months to help make producing these videos easier for developers, streamers, and post production. I hope that in a few short months, this section will be removed :D

Support for more than two lighthouses

One major issue we ran into was tracking accuracy and complete tracking loss on the third camera controller mounted to the Movi. Since the camera operator is standing behind the rig and basically obscuring any lasers coming from behind him, we would lose tracking whenever the camera ended up between the actor and the camera operator. Ideally, I'd love support for 3-4 lighthouses so we can get full 360 degree coverage for the VIVE HMD and controllers.

Dedicated tracked ‘object’ for cameras

Hooray! As of April 2017, you can now buy a VIVE Tracker from HTC. This works as a dedicated tracking object which can be attached to a DSLR or onto a rig via the 1/4" hotshoe mount. 

One side effect is that games have to be updated to the latest SteamVR SDK in order to support them. But if they aren't, there's a handy hack available here which makes SteamVR think that the tracker is a third VIVE controller. Very slick!

Old info Here:

Obviously, rigging up the controller to the camera isn't ideal, and we need some sort of tracking disk/object that can mount to various rigs via a tripod mount or hot shoe mount. The design of this object needs to protrude enough from the back of the camera that it's more visible to the lasers and doesn't interfere with steadycam rigs like the Movi. Having both a tripod mount and a hot shoe mount on the device would be ideal for multiple mounting options, but there are cheap tripod-to-hot-shoe adapters available, so a tripod mount might be sufficient as long as it's rigid enough.

We had to zip tie the controller down SUPER hard to the mount we created because there was a lot of bounce/jitter with the controller. Rigidity of the camera/tracking object is obviously super important, so whatever needs to happen design-wise to keep it 100% solid really matters.

I was actually kind of shocked to see the way Valve mounted the controller in the VIVE trailer. It seems like doing it vertically would allow for a lot of extra bounce and isn't the most rigid of setups. Ours was mounted horizontally to make calibration easier and to give it two points of contact (along the hot shoe mount and the lens) to reduce vibration/bounce.

Support for multiple tracked camera objects

I can see situations where people would want to shoot third person VR footage from multiple cameras in real time and do switching on the fly in a livestream. Basically like a virtual sports event: you'd have multiple cameras set up and multiple operators covering the 'game' from different angles and different focal lengths. I bet this will be huge for covering virtual e-sports events down the line, but it'll also be super important for streamers/YouTubers who want to take the production values of VR videos to another level.

For the purposes of trailers and what I was doing, having 2 external cameras would probably be enough so you could capture the footage in a wide and close up simultaneously. 

Camera calibration

This is the big one that needs to have a better solution. 

Currently, we take an HDMI feed from the live action camera and a view from the in-game camera and overlay the two. Then we take the other two VIVE controllers and position them in a T-pose so they're both visible to both cameras - one closer to the camera and one off to the side and further away - and tweak the rotational and positional offsets and FOV of the virtual camera until they match the live action camera.

Once they’re close, we move them around to check the accuracy, and then adjust/repeat as needed.

This is an incredibly time consuming process, and for people who aren't familiar with how lenses work (i.e. how an FOV change vs. a z-axis position change affects the image), I think it would be too complicated for an average user to accomplish. Even calculating the FOV is difficult, since Unity uses a vertical FOV and lenses aren't measured this way. You can convert it, but even then it's not 100% accurate.

Lens FOV assumes a full frame sensor, and most consumer cameras aren't full frame. They're cropped sensors, sometimes that crop changes between stills mode and video mode, and getting the exact details of your camera's crop factor in video mode is often impossible. Not to mention that wider, cheaper lenses tend to have more fisheye distortion than professional lenses, which will throw off the calibration at the extreme sides of the image. People can also throw adapters on their lenses, which further complicates what the FOV may actually be.

All this is to say that the best and most accurate way to get a good lens calibration would be optically using computer vision tracking. 

Here’s what I hope happens:

Valve develops a computer vision application that takes the live HDMI feed from your DSLR/webcam and the view from the virtual camera and overlays the two on top of each other, similar to what OBS does now. Using optical tracking data from the DSLR feed, it adjusts the virtual camera until the two match.

Let's say the "Valve Mixed Reality kit" you can buy comes with a tracked object that is rigidly mounted to the camera. You then take one of the VIVE controllers and move it around so it's visible to both the virtual and physical cameras. Since this controller is being tracked by the lighthouses, the system knows exactly where it is in 3D space. As it's moved around the scene at different depths and to the extreme edges of what the physical camera can see - and since the VIVE controller is a known size and shape - the software can figure out where the controller is by tracking it optically in the HDMI feed from the physical camera, compare that to the virtual controller, and determine the exact FOV and offsets needed to make the virtual camera line up. Once the values are known, it feeds that data back into Unity.

Obviously there are lag issues to deal with from the HDMI feed with this solution, so I'm picturing something like: you move the controller to one side, hold it there, move it to another side, hold it there, move it back, hold it there, etc., and it just "figures it out". I know this is way easier said than done lol :D

More options in the Unity plugin for developers

I realize this is probably going to fall on the developers' shoulders more than Valve's, but it would be cool to give developers a few out of the box options. Rather than using z-clipping, popping full objects between FG/BG would be ideal in a lot of cases. That's what Colin did for Contraption, and it worked super well since you didn't have to worry about rotoscoping objects from one layer to another when an object is supposed to be in front but gets clipped between the two layers because of how its depth is sorted.

Another useful addition would be the ability to hide the hands in the mixed reality footage in a lot of cases. We added a toggle for this so any alignment/calibration issues with the controllers not being overlaid properly wouldn't be noticeable.

Rather than rendering out a dedicated alpha like the Valve plugin does, we rendered our FG layer on 100% blue, though we had the ability to change this to any colour. This allowed us to extract the alpha in post and save a quadrant, so we had FG / BG / FG+BG composite / smoothed companion cam as our four layers. I think this is a bit more useful, though it requires more experience in post to deal with the blue antialiased pixels in the FG layer.


Making Mixed Reality Trailers and Videos - VRLA Summer 2016


Special Thanks

The mixed reality trailers for Fantastic Contraption and Job Simulator wouldn't have been possible without the help of a small army of people. I'd love to thank the Northways, the team at Owlchemy Labs, and Handcraft Creative for helping with the shoot and for creating the behind the scenes video above! Specifically, Raymond from Handcraft for manning the Movi for almost 9 hours straight!

Midcan was a wonderful host for the Greenscreen shoot. All of our actors and helpers: Kevin Hnatiuk, Albertine Watson, Chris Pointion, Jon Luxford, and Alyson Shane! And last but not least, my wife and two boys for tolerating me being completely mentally absent for the month and a half we worked through all of these problems to bring these trailers to life! <3 Thank you all! <3


Questions or Comments? Please send me a tweet or shoot me an email