Audio Capture Recommendations | MetaHuman Documentation

The quality of the generated facial animations will depend on the clarity of the recorded audio.

Follow these recommendations to improve the animation results:

Recommendation	Reason
A sampling rate of at least 16 kHz	Lower sampling rates remove important speech-related frequencies from the signal.
Minimize background noise	Audio with a higher signal-to-noise ratio will result in cleaner animations.
Avoid reverb and echoes	The animation quality will be reduced if these effects are present in the recording.
One speaker per audio file	The feature animates to every voice present in the recording, so multiple speakers or background voices can degrade the resulting animation.
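
The sampling-rate recommendation can be checked before a recording is imported. The following is a minimal sketch, not part of the MetaHuman tooling: it assumes the recording is an uncompressed PCM WAV file and uses only Python's standard `wave` module. The function name `check_wav` and the constant `MIN_SAMPLE_RATE_HZ` are hypothetical names chosen for this example.

```python
import wave

# Minimum sampling rate taken from the recommendations table (16 kHz).
MIN_SAMPLE_RATE_HZ = 16_000


def check_wav(path):
    """Return a list of warning strings for the PCM WAV file at `path`.

    An empty list means the file meets the sampling-rate recommendation.
    This only reads the file header; it cannot detect noise, reverb,
    or multiple speakers.
    """
    warnings = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
    if rate < MIN_SAMPLE_RATE_HZ:
        warnings.append(
            f"sampling rate {rate} Hz is below the recommended "
            f"{MIN_SAMPLE_RATE_HZ} Hz"
        )
    return warnings
```

For example, `check_wav("take_01.wav")` (a hypothetical file name) returns an empty list for a 48 kHz recording and a single warning for an 8 kHz one. Resampling a low-rate file to 16 kHz would pass this check without restoring the missing frequencies, so the check is most useful on the original capture.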
