This page describes guidelines and best practices for capturing performance video to be used as the input to MetaHuman Animator (MHA).
Capture Hardware
One of the following device types is required to capture performances for the MetaHuman UE Plugin.
iPhone
There are two device classes for the iPhone:
iPhone 12
iPhone 13, 14 and newer models
They are very similar in most regards, but their depth sensors differ, and this needs to be reflected in the asset configuration.
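As a rough illustration of how a pipeline might choose between the two device classes, the sketch below reads the generation number from a model name. The function name, the class labels, and the parsing rule are assumptions made for this example; they are not part of the MetaHuman plugin.

```python
import re

# Hypothetical labels for the two device classes listed above.
IPHONE_12_CLASS = "iPhone 12"
IPHONE_13_PLUS_CLASS = "iPhone 13 and newer"


def iphone_device_class(model_name: str) -> str:
    """Return the device class for a name such as 'iPhone 14 Pro'.

    Assumption: the generation number follows the word 'iPhone'.
    """
    match = re.search(r"iPhone\s+(\d+)", model_name, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"cannot read a generation number from {model_name!r}")
    generation = int(match.group(1))
    if generation < 12:
        # Older models fall outside the two classes listed on this page.
        raise ValueError(f"{model_name!r} is not in either listed device class")
    return IPHONE_12_CLASS if generation == 12 else IPHONE_13_PLUS_CLASS
```

For example, `iphone_device_class("iPhone 12 Pro")` returns the iPhone 12 label, while any name with generation 13 or higher returns the other label.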
The iPhone can be head mounted or tripod mounted. The depth sensor and the camera depth of field are sufficient to accommodate some head movement.
Stereo Head-Mounted Cameras (HMC)
Stereo hardware must be head mounted. While it’s technically possible to mount this hardware on a tripod, these devices are commonly optimized to the highest possible extent for parameters other than depth of field, which makes static-mount shoots very impractical.
In theory, any stereo configuration might work. Realistically speaking, a vertical couple is the only configuration that’s been well tested and is known to produce good results.
Lighting can be on-device, studio lighting, or a combination of both.
Both visible and infrared light are supported. Visible light tends to perform better as it’s scattered less by skin, producing sharper results, but ultimately the most important attribute for footage is sharpness. If your particular shooting circumstances happen to yield sharper footage with infrared light, this is well supported.
Technoprops Stereo HMC
Capture Conditions
This section outlines the capture conditions for optimal animation results.
MHA is designed to process video data captured within an indoor environment.
Tripod
This section shows optimal and sub-optimal capture examples. These examples can be applied to both standing and seated performers.
Performer Framing
The framing of the performer should be consistent throughout each take and the entire recording session.
Body movement should be minimized, and the performer should face the iPhone camera (not the iPhone screen) at all times during the recording to maintain the optimal framing.
A slightly lower positioning with an upward-facing angle is usually preferable. It’s important that the lip seal and the inside of the upper eyelid are visible as often as possible.
Optimal framing
Below are examples of framing that is likely to produce poor quality results.
Examples of poor framing: performer out of frame (two examples), performer too low, performer too high, off-axis, performer too close.
Depth Preview in Live Link Face
The Depth Preview in the iPhone application is a useful tool for evaluating performer distance from the camera.
Depth Preview is particularly useful with fixed devices to ensure the optimal distance from the subject without clipping near detail. Gray shading indicates the correct distance from the camera.
Correct gray depth map preview: The performer is at an optimal distance from the camera.
Black artifacts: The performer is too close or too far from the camera.
Lighting
Ideal lighting is flat with no shadows. If ambient lighting is insufficient, additional lights can be attached to the tripod to produce frontal lighting that will illuminate the correct portions of the face.
Optimal lighting
Below are examples of inadequate lighting conditions that will produce poor quality results.
Environment
While it’s possible to get good results with any background, it’s good practice to keep the background uniform and darker than the face, if possible.
Nothing in the environment or the rig should block all or part of the performer’s face.
Using a Head Mount
This section shows optimal and sub-optimal capture examples. The examples here have been shot on an iPhone, and this section includes everything you should be aware of when shooting with a head-mounted iPhone.
All of the content in this section also applies to stereo couples.
Performer Framing
A slightly lower positioning with an upward-facing angle is usually preferable. It’s important that the lip seal and the inside of the upper eyelid are visible as often as possible.
Optimal framing
The following examples show poor camera framing.
Stability
With a head mount, the performer has freedom to move around the volume. However, the camera must be stable relative to the face throughout a recording. This requires a well-fitted head mount.
Lighting
The same recommendations that apply to tripod mounts also apply to head-mounted recording devices. Good lighting is uniform, shadow free, and predominantly frontal.
In a volume where the performer is free to move, it’s far more difficult to obtain the same uniform lighting in every frame as you would when the performer is static. In challenging situations, a slight underexposure is usually preferable to an overexposure.
In a challenging volume, pick your exposure to avoid "flat spots".
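One possible way to check a test frame for likely "flat spots" is to measure how much of the face region is clipped at or near white. The sketch below is an illustration only and is not part of MHA. It assumes the frame is available as rows of 8-bit grayscale values, it assumes that flat spots correspond to clipped highlights, and the clip level of 250 is an arbitrary example value.

```python
from typing import Sequence


def clipped_fraction(gray_rows: Sequence[Sequence[int]], clip_level: int = 250) -> float:
    """Fraction of pixels at or above clip_level in an 8-bit grayscale image.

    clip_level=250 is an illustrative assumption, not an MHA setting.
    """
    total = sum(len(row) for row in gray_rows)
    if total == 0:
        raise ValueError("image has no pixels")
    clipped = sum(1 for row in gray_rows for value in row if value >= clip_level)
    return clipped / total


# Tiny example: two of the six pixels are at or above the clip level.
sample = [[120, 130, 255], [110, 252, 140]]
fraction = clipped_fraction(sample)
```

A rising fraction between test frames suggests lowering the exposure, in line with the preference for slight underexposure stated above.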
Environment
The ideal environment has a flat background that contrasts with the performer’s skin color. No objects should occlude the performer's head.
Using a Stereo Head Mount
All recommendations and requirements listed above still apply, with a few additional notes and a slightly different appearance specific to stereo couples. Note that stereo couples are usually found in higher-end settings (compared to an iPhone).
Performer Framing
Optimal framing has the center of the image aligned with the upper part of the philtrum (base of nostrils).
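The alignment described above can be expressed as a simple offset between the image center and the position of the upper philtrum. The sketch below assumes you already have that position in pixel coordinates from some other tool; the function name and its parameters are hypothetical and do not come from the MetaHuman plugin.

```python
from typing import Tuple


def framing_offset(
    image_width: int, image_height: int, landmark_x: float, landmark_y: float
) -> Tuple[float, float]:
    """Offset of a landmark from the image center, as fractions of width and height.

    (0.0, 0.0) means the landmark sits exactly at the image center.
    Positive x is right of center; positive y is below center (image coordinates).
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    dx = (landmark_x - image_width / 2) / image_width
    dy = (landmark_y - image_height / 2) / image_height
    return dx, dy
```

Offsets close to zero on both axes indicate that the image center is aligned with the landmark. How much offset is acceptable is not specified here and would need to be judged against the framing examples.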
Below are several examples of poor image framing.
Stability
With a head mount, the performer has freedom to move around the volume. The camera must be stable relative to the face throughout a recording. This requires a well-fitted head mount.
Below is an example of excessive camera motion that could affect the quality of the results.
Excessive camera motion
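Camera stability relative to the face can be estimated by tracking how far a facial landmark drifts in the image over a take. The sketch below is a minimal illustration under stated assumptions: the per-frame landmark positions come from an external tracker, and the landmark is one that moves little with expression (the nose bridge is one guess). None of these names are part of MHA.

```python
import math
from typing import Sequence, Tuple


def max_landmark_drift(positions: Sequence[Tuple[float, float]]) -> float:
    """Largest distance, in pixels, between a landmark and its first-frame position.

    positions holds one (x, y) pair per frame for the same landmark.
    """
    if not positions:
        raise ValueError("at least one frame is required")
    x0, y0 = positions[0]
    return max(math.hypot(x - x0, y - y0) for x, y in positions)


# Three frames; the second frame is 3 px right and 4 px down from the first.
drift = max_landmark_drift([(100.0, 100.0), (103.0, 104.0), (100.0, 101.0)])
```

A well-fitted head mount should keep this value small throughout a recording. What counts as too much drift depends on resolution and framing and is not defined on this page.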
Focus
The lenses should be focused on the nasolabial area of the face (cheek surface to the side of the nostrils). Depending on the depth of field, it might not be possible to have all parts of the face in perfect focus.
Optimal focus
The following image shows an out-of-focus example.
Out of focus
Lighting (Visible on-board lights)
When using an HMC, both ambient and on-board lighting should be considered. Optimal lighting will have the face evenly lit, with minimal shadows and no patches of overexposure.
Optimal lighting
Below is a set of examples for sub-optimal lighting that can cause quality issues in the output animation data.
Environment
The ideal environment will have a flat background that contrasts with the performer’s skin color. No objects should block the performer’s head.
Calibration
Calibration is a critical part of HMC capture and enables high-quality data reconstruction. This section shows an example of how to calibrate correctly.
Calibration board construction
Calibration requires a calibration board (checkerboard pattern) with a size of 89.5 mm x 134.2 mm and each square measuring 7.5 mm. You can use the sample from the MetaHuman Stereo Tools plugin found in the following directory:
C:\Program Files\Epic Games\UE_5.4\Engine\Plugins\Marketplace\StereoCaptureTools\Content\StereoCaptureTools.zip\StereoCaptureTools\Calibration App\stereo-calib-grid-16v11h-7p5mm.png
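The sample file name suggests a grid of 16 squares vertically and 11 horizontally at 7.5 mm each. That reading is inferred from the file name and is an assumption, not something stated on this page. Under that assumption, the arithmetic below shows how the grid relates to the stated board size, and how to check that a print came out at full scale. All function names are hypothetical.

```python
SQUARE_MM = 7.5            # square size stated above
BOARD_MM = (89.5, 134.2)   # board size stated above (short side, long side)
GRID = (11, 16)            # assumption: squares per side, read from "16v11h"


def grid_extent_mm(squares: int, square_mm: float = SQUARE_MM) -> float:
    """Length covered by a run of squares."""
    return squares * square_mm


def border_mm(board_side_mm: float, squares: int, square_mm: float = SQUARE_MM) -> float:
    """Border left on each side if the grid is centered on the board."""
    return (board_side_mm - grid_extent_mm(squares, square_mm)) / 2


def print_scale(measured_square_mm: float, square_mm: float = SQUARE_MM) -> float:
    """Ratio of a measured printed square to the intended size (1.0 is full scale)."""
    return measured_square_mm / square_mm


short_extent = grid_extent_mm(GRID[0])          # 82.5 mm
long_extent = grid_extent_mm(GRID[1])           # 120.0 mm
short_border = border_mm(BOARD_MM[0], GRID[0])  # 3.5 mm
long_border = border_mm(BOARD_MM[1], GRID[1])   # about 7.1 mm
```

With these numbers the grid fits inside the stated board size with a small border on every side, which is consistent with the assumed reading of the file name. Measuring one printed square with calipers and passing it to `print_scale` is a quick check that the printer did not rescale the pattern.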
The pattern should be printed on high-quality paper and attached to a rigid surface with spray mount adhesive.
The calibration board should have a handle on the back so the operator can hold it without occluding the pattern.
Operation
Specific calibration footage should be taken at the start and the end of a capture session to bookend the shoot. Calibration footage may be captured at other times during the shoot if the helmet needs to be significantly adjusted or the cameras are otherwise compromised.
During calibration capture, you are attempting to paint the space with the calibration board. For each capture, move the board across the field of view (up, down, left, and right, with roll and pitch) in the plane where the performer’s face will be or was positioned. The video should be approximately 20 seconds long. See the example below.
Correct calibration
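The recommended length of roughly 20 seconds can be checked from the frame count of a calibration clip. The sketch below is an illustration only: the tolerance of 5 seconds and the function names are assumptions, and the frame rate depends on your capture settings.

```python
def clip_duration_seconds(frame_count: int, fps: float) -> float:
    """Duration of a clip given its frame count and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


def is_about_20_seconds(
    frame_count: int, fps: float, target_s: float = 20.0, tolerance_s: float = 5.0
) -> bool:
    """True if the clip length is within tolerance_s of target_s.

    tolerance_s=5.0 is an arbitrary example value, not an MHA requirement.
    """
    return abs(clip_duration_seconds(frame_count, fps) - target_s) <= tolerance_s
```

For example, a clip of 1200 frames at 60 frames per second lasts 20 seconds and passes the check, while a clip of 300 frames at the same rate lasts 5 seconds and does not.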
The following images show examples of poor calibration techniques.
Performer Requirements
This section lists the appearance parameters for MHA and what factors might have an impact on the quality of the output animation.
The system is designed to work with adult faces only.
For best results we recommend the following:
Limited facial hair only (for example, 1-2 days of stubble growth).
No glasses or sunglasses.
Nothing should occlude facial features (for example: hats, long hair, face masks, hands). Anything occluding the face will degrade results.
No heavy makeup or face paint.
No facial piercings.