The workflow to generate animation from depth data consists of the following steps:
1. Create a new MetaHuman Performance asset by right-clicking in the Content Browser, then selecting MetaHuman > MetaHuman Performance.
2. Open the MetaHuman Performance asset editor and navigate to the Details panel.
3. Select Depth Footage as the Input Type.
4. In the Footage Capture Data field, select the Capture Data asset that represents the performance take.
5. In the MetaHuman Identity field, select the MetaHuman Identity asset for the performer who features in the take.
6. In the Visualization Mesh field, select your MetaHuman face mesh.
7. Click Process to generate the facial animation.
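For batch workflows, the same setup can be scripted with Unreal's Python editor scripting. This is a minimal sketch, not an official script: the MetaHumanPerformance class, the MetaHumanPerformanceFactoryNew factory, and the editor property names are assumptions that may differ between plugin versions, and all asset paths are placeholders.

```python
import unreal

# Load the inputs (asset paths are placeholders for your project).
capture_data = unreal.load_asset("/Game/Takes/Take_01_CaptureData")
identity = unreal.load_asset("/Game/MetaHumans/Ada/Ada_Identity")

# Create the MetaHuman Performance asset. The class and factory names
# are assumptions; check what your plugin version exposes to Python.
asset_tools = unreal.AssetToolsHelpers.get_asset_tools()
performance = asset_tools.create_asset(
    asset_name="Take_01_Performance",
    package_path="/Game/Performances",
    asset_class=unreal.MetaHumanPerformance,
    factory=unreal.MetaHumanPerformanceFactoryNew(),
)

# Wire up the inputs from steps 4 and 5. The property names are
# assumptions; confirm them with help(unreal.MetaHumanPerformance).
performance.set_editor_property("footage_capture_data", capture_data)
performance.set_editor_property("identity", identity)

# Processing itself (step 7) is started with the Process button in the
# asset editor; a scripting entry point may exist in your plugin version.
```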
Depth Footage Processing Requirements
When the Input Type is set to Depth Footage, the MetaHuman Performance asset requires the following minimal setup to run:
Footage Capture Data needs to reference the performance’s footage.
MetaHuman Identity needs to reference a MetaHuman Identity asset that has been configured for the performer as they appear in the take being processed.
Once these two attributes are configured correctly, you’re ready to start processing footage.
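If you drive this from a script, a quick pre-flight check can catch a missing input before an expensive solve. A minimal sketch, assuming the same property names as above:

```python
import unreal

# Load the Performance asset (path is a placeholder).
performance = unreal.load_asset("/Game/Performances/Take_01_Performance")

# Fail fast if either required input is unset. The property names are
# assumptions; verify them against your plugin version's Python API.
for prop in ("footage_capture_data", "identity"):
    if performance.get_editor_property(prop) is None:
        raise RuntimeError(f"Required input not set: {prop}")
```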
The head and neck transform must be registered against a rest pose. By default, this happens automatically by analyzing all solved frames and selecting the most front-facing frame.
If there is a specific frame that you consider the head rest pose, you can disable the automatic selection in the Advanced attributes section and manually enter a reference frame (which must fall inside the solved range).
If no frame in the take presents a rest pose, the resulting animation will have an angular offset baked into it that will need correcting. This only applies to statically mounted shots: footage from a head-mounted device doesn't capture meaningful head movement relative to the camera in the first place.
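Scripted, this could look like the sketch below. Both property names are hypothetical stand-ins for the Advanced attributes described above; check the names your plugin version actually exposes before relying on them.

```python
import unreal

performance = unreal.load_asset("/Game/Performances/Take_01_Performance")

# Hypothetical Advanced-attribute names: turn off the automatic
# "most front-facing frame" search and pin an explicit reference frame.
performance.set_editor_property("auto_choose_head_pose_frame", False)
performance.set_editor_property("head_pose_reference_frame", 150)  # must lie inside the solved range
```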
You can process a maximum of 36,000 frames (equivalent to 10 minutes of footage at 60fps, or 20 minutes at 30fps). Your Capture Data length can be longer than this maximum.
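The cap translates into a duration that depends on your capture frame rate; the quick check below (plain Python, no Unreal APIs) makes the arithmetic explicit.

```python
MAX_FRAMES = 36_000  # processing cap

def max_duration_minutes(fps: float) -> float:
    """Longest stretch of footage that fits under the frame cap."""
    return MAX_FRAMES / fps / 60.0

print(max_duration_minutes(60.0))  # 10.0 -> ten minutes at 60 fps
print(max_duration_minutes(30.0))  # 20.0 -> twenty minutes at 30 fps
```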
To learn more about the optimal settings for this asset, read the Recommended Unreal Engine Project Settings page.
Depth Footage Time/Quality Parameters
When the Input Type is set to Depth Footage, solving animation can be very processing-intensive and time-consuming.
Below are several options that trade accuracy for faster iteration times. With these options you can inspect the quality of the results and make quick dry runs before committing to the final, more expensive solve.
The most common time tradeoff is the number of frames processed: you can set the boundaries of the footage to process, and directly affect the output, with the Start Frame to Process and End Frame to Process attributes (see the dry-run sketch after the solve types below).
The Solve Type has three options: Preview, Standard, and Additional Tweakers. While not strictly quality parameters, they are ordered here by increasing processing time:
The Preview solver is very quick, and it is used to offer frame previews while the animation is being solved. In some cases this may be all you need; in others, you might combine it with a heavily reduced frame range for a dry run on your footage to check that everything is in good order.
The Standard solver is a full-quality solver that produces animation for a large number of channels, but not every channel. This is the typical choice for final-quality solves.
The Additional Tweakers solver is similar to the Standard solver, but also produces animation for the additional channels specified by the Tweaker controls.
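A typical dry run combines a narrow frame range with the Preview solver. In this sketch the frame-range property names are guessed from the UI labels and the solve-type enum is an assumption; verify both against your plugin version's Python API.

```python
import unreal

performance = unreal.load_asset("/Game/Performances/Take_01_Performance")

# Process only a short slice of the take (names guessed from the
# "Start/End Frame to Process" UI labels).
performance.set_editor_property("start_frame_to_process", 0)
performance.set_editor_property("end_frame_to_process", 120)  # ~2 seconds at 60 fps

# Use the fastest solver for the dry run (enum name is an assumption).
performance.set_editor_property("solve_type", unreal.SolveType.PREVIEW)
```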
The Skip Filtering attribute determines whether the animation curves are post-processed for smoothness. You can skip this step to obtain the unprocessed curves. It is currently not possible to use your own filters directly with MetaHuman Performance assets.
The Skip Per Vertex Solve attribute determines whether per-vertex solving runs during processing. The attribute is enabled by default, which means per-vertex solving is turned off. Per-vertex solving is a potentially more accurate but very time-consuming solver option that lets the solver go outside the constraints of the rig by adding a tiny offset to each vertex so that it exactly matches the target solve. The result of the per-vertex solve can be applied by either the Standard or the Additional Tweakers solver.
The per-vertex solve is very dependent on the quality of the input data. It is typically only worth turning it on (by disabling this attribute) if your final target MetaHuman is a high-quality digital double rig that can also be used as the identity, and the performance was captured with a well-calibrated stereo head-mounted camera pair. Turning it on is not generally recommended unless you are working with this type of data.
Lastly, the Audio to Tongue solver requires audio, and your take might not have audio, or might not show or require tongue animation at all. The Skip Tongue Solve attribute controls whether Audio to Tongue runs.
Note that while the tongue solver does have a processing cost, it is very low: skipping this step won't save much time compared to the rest of the per-frame solve.
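All three skip flags can be toggled the same way from script. A minimal sketch, again assuming snake_case property names derived from the UI labels:

```python
import unreal

performance = unreal.load_asset("/Game/Performances/Take_01_Performance")

# True skips the smoothing pass and yields the unprocessed curves.
performance.set_editor_property("skip_filtering", False)

# Enabled (True) by default, i.e. per-vertex solving is OFF; set to
# False only for high-quality digital-double data captured on a
# calibrated stereo head-mounted camera pair.
performance.set_editor_property("skip_per_vertex_solve", True)

# True skips Audio to Tongue, e.g. when the take has no audio; the
# saving is small relative to the rest of the per-frame solve.
performance.set_editor_property("skip_tongue_solve", False)
```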