Technical blog: More details on how we built voice reporting

November 16, 2023
Today, we’re introducing voice reporting in Fortnite. Now, in addition to being able to block, mute and report other players, people will be able to use voice reporting to submit audio evidence when they report suspected violations of our Community Rules. You can read more in the Fortnite blog.

We’re sharing more details about our technical approach to explain how and why we built this feature the way we did.


Signing and attributing audio through cryptographic keys

When building the voice reporting system, we knew we needed to have confidence that the audio reported to us by players was correctly attributed to each of the participants in the voice chat. This is important because if a player is suspected of violating our Community Rules, we need to be sure that we take action against the right participant in the group chat.

To achieve this, we use public key cryptography to generate digital signatures so voice packets can be attributed to the correct participant. Each packet is signed with the sending participant’s private key, and all other users can verify it using the sender’s public key, which prevents a player from passing off audio as coming from someone else.

Now that voice reporting is live, each time a player launches Fortnite on their device, the Fortnite client uses OpenSSL to generate an Ed25519 elliptic curve key pair for signing operations. The client then sends a copy of the player’s public key (and only the public key) to our backend system, authenticating with the player’s user authentication token. In response, it receives a signed token in which Epic attests that the public key was registered in our system and associated with the player’s Epic account.
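The registration step can be sketched as follows. This is a minimal illustration, not Epic's implementation: it assumes the third-party Python `cryptography` package (which is backed by OpenSSL), and the token layout, the `issue_token` and `verify_token` helpers, and the account ID are all hypothetical. The backend is simulated locally with its own Ed25519 attestation key.

```python
from typing import Tuple

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def raw_public_bytes(public_key: Ed25519PublicKey) -> bytes:
    """Export the 32-byte raw encoding of an Ed25519 public key."""
    return public_key.public_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PublicFormat.Raw,
    )


def issue_token(backend_key: Ed25519PrivateKey, account_id: str, player_pub: bytes) -> bytes:
    """Hypothetical backend step: attest that player_pub belongs to account_id.

    Assumed layout: 64-byte signature, 32-byte public key, then the account ID.
    """
    claim = player_pub + account_id.encode("utf-8")
    return backend_key.sign(claim) + claim


def verify_token(backend_pub: Ed25519PublicKey, token: bytes) -> Tuple[str, bytes]:
    """Check the backend's signature and return (account_id, player public key)."""
    signature, claim = token[:64], token[64:]
    backend_pub.verify(signature, claim)  # raises InvalidSignature on failure
    return claim[32:].decode("utf-8"), claim[:32]


# Client side: a fresh key pair per launch; only the public half leaves the device.
player_key = Ed25519PrivateKey.generate()
player_pub = raw_public_bytes(player_key.public_key())

# Stand-in for the backend, which holds its own attestation key.
backend_key = Ed25519PrivateKey.generate()
token = issue_token(backend_key, "account-123", player_pub)
account_id, attested_pub = verify_token(backend_key.public_key(), token)

# Changing any byte of the claim invalidates the backend's signature.
try:
    verify_token(backend_key.public_key(), token[:-1] + b"X")
    tamper_detected = False
except InvalidSignature:
    tamper_detected = True
```

Anyone who holds the backend's public key can check such a token offline, which is what lets other participants trust a key they receive without contacting the backend again.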

We selected the Ed25519 algorithm for digital signatures for several reasons. Ed25519 has much better performance for both signing and verifying voice packets compared to traditional algorithms like RSA, especially on mobile devices. It also has shorter key sizes and signature blocks for equivalent protection, resulting in lower data transfer requirements, which is important in a multiplayer game where we want to minimize the data sent from each client.
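To put rough numbers on the size argument, the sketch below compares per-player signature overhead at one packet every 60 ms. The signature sizes are properties of the algorithms (an Ed25519 signature is 64 bytes; an RSA signature is as long as the modulus), and RSA-3072 is the size commonly cited as comparable in strength to Ed25519. The function name is illustrative, and the result is an upper bound that assumes a player who talks continuously.

```python
PACKET_INTERVAL_MS = 60  # from the article: roughly one voice packet per 60 ms

# Signature sizes in bytes: Ed25519 is always 64; RSA matches the modulus size.
SIGNATURE_BYTES = {"Ed25519": 64, "RSA-2048": 256, "RSA-3072": 384}


def signature_overhead_bytes_per_second(
    signature_bytes: int, interval_ms: int = PACKET_INTERVAL_MS
) -> float:
    """Bytes per second spent on signatures for one continuously talking player."""
    packets_per_second = 1000 / interval_ms
    return signature_bytes * packets_per_second


overhead = {
    name: signature_overhead_bytes_per_second(size)
    for name, size in SIGNATURE_BYTES.items()
}

for name, value in overhead.items():
    print(f"{name}: {value:.0f} bytes/s of signature data")
```

Under these assumptions, Ed25519 adds about 1 KB/s of signature data per talking player, while RSA-3072 would add six times as much.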


Voice system integration

Once the player has generated their signing key pair, the Fortnite client initializes the Epic Online Services (EOS) Voice subsystem with the key pair and the signed token from our backend. The signed token is then replicated to the backend RTCP signaling server, which validates that the token was signed by Epic and then allows the participant to join the voice channel.

When a player joins a voice channel, the signed token is replicated by our signaling server to all other participants in the channel, and the joining player receives copies of everyone else’s signed tokens. This allows all participants in the voice channel to exchange each other’s public keys and be assured that Epic’s backend systems have correctly attributed each key to its player.

As the player starts to talk in the voice chat, the EOS Voice subsystem sends voice packets from their Fortnite client to our Voice backend systems at roughly 60-millisecond intervals. Each of those packets is digitally signed using the Ed25519 private key generated when the player launched Fortnite, which gives us assurance, with non-repudiation, that the player was responsible for that audio packet.

When the Fortnite client receives inbound voice packets from the EOS Voice service, it verifies each packet’s Ed25519 signature against the sender’s public key. This ensures that each player only hears audio that can be cryptographically attributed to another participant in the channel.
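The sign-then-verify path for a single voice packet might look like the sketch below. It assumes the third-party Python `cryptography` package as a stand-in for the client's OpenSSL calls. The packet layout (sequence number, audio frame, trailing signature) and the `sign_packet` and `open_packet` helpers are hypothetical; the article does not describe the wire format.

```python
import struct
from typing import Optional, Tuple

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

SIGNATURE_LEN = 64  # Ed25519 signatures are always 64 bytes


def sign_packet(private_key: Ed25519PrivateKey, sequence: int, audio: bytes) -> bytes:
    """Build a packet: 8-byte big-endian sequence number, audio frame, signature.

    The layout is assumed for illustration. The signature covers the sequence
    number as well as the audio.
    """
    body = struct.pack(">Q", sequence) + audio
    return body + private_key.sign(body)


def open_packet(sender_pub: Ed25519PublicKey, packet: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (sequence, audio) if the signature verifies, else None."""
    if len(packet) < 8 + SIGNATURE_LEN:
        return None
    body, signature = packet[:-SIGNATURE_LEN], packet[-SIGNATURE_LEN:]
    try:
        sender_pub.verify(signature, body)
    except InvalidSignature:
        return None
    (sequence,) = struct.unpack(">Q", body[:8])
    return sequence, body[8:]


alice = Ed25519PrivateKey.generate()
mallory = Ed25519PrivateKey.generate()

packet = sign_packet(alice, 1, b"\x01\x02\x03")

# Verifies against Alice's key.
genuine = open_packet(alice.public_key(), packet)
# Signed by someone else but checked against Alice's key: rejected.
spoofed = open_packet(alice.public_key(), sign_packet(mallory, 1, b"\x01\x02\x03"))
# One flipped bit in the audio: rejected.
tampered = open_packet(alice.public_key(), packet[:8] + b"\x00" + packet[9:])
```

A receiver that plays only packets for which `open_packet` returns a value never plays audio that was not signed with the claimed sender's key.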


Voice reporting: audio capture and user choice

In addition to signing and attributing audio using cryptographic key generation, we wanted to build our voice reporting system in a way that ensured it was the participants’ devices (not the Epic Games servers) that captured audio, and that participants could affirmatively choose whether to submit audio evidence to Epic for review. We made a deliberate design choice to build a system that does not capture or monitor all voice traffic out of respect for players’ privacy and choices.

This means our backend services do not store any audio traffic as part of the system; Epic’s backend services only process the audio in transit, and it is then captured in runtime memory on the device of each participant in the voice channel. As a result, if you close your copy of Fortnite and re-open it, the buffered audio no longer exists and you cannot report previous conversations. The only way Epic ever receives a copy of the audio clip is if voice reporting is on and a participant affirmatively reports the voice chat using the in-game voice reporting feature.

When voice reporting is on, the EOS SDK begins buffering the digitally signed audio packets from all voice participants into a separate region of process memory. For performance reasons, the size of the memory buffer is constrained but it should hold approximately the last five minutes of voice chat audio. We chose a 10MB buffer size to balance performance with ensuring participants can submit enough evidence to be effective for moderation.
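A byte-capped rolling buffer of this kind can be sketched in a few lines. This is an illustration of the general technique, not the EOS SDK's implementation: the `VoiceBuffer` class and its methods are hypothetical, and "10MB" is read here as 10 MiB.

```python
from collections import deque

BUFFER_LIMIT_BYTES = 10 * 1024 * 1024  # "10MB" from the article, read as 10 MiB


class VoiceBuffer:
    """Keep the most recent signed packets, evicting the oldest past a byte cap."""

    def __init__(self, limit_bytes: int = BUFFER_LIMIT_BYTES) -> None:
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self._packets = deque()  # (sender_id, packet_bytes), oldest first

    def push(self, sender_id: str, packet: bytes) -> None:
        if len(packet) > self.limit_bytes:
            return  # a packet larger than the whole buffer is dropped
        self._packets.append((sender_id, packet))
        self.used_bytes += len(packet)
        while self.used_bytes > self.limit_bytes:
            _, oldest = self._packets.popleft()
            self.used_bytes -= len(oldest)

    def snapshot(self) -> list:
        """Copy of the buffered packets, oldest first, as a report would upload."""
        return list(self._packets)


# Rough budget: five minutes of 60 ms intervals is 5,000 intervals.
intervals = (5 * 60 * 1000) // 60
bytes_per_interval = BUFFER_LIMIT_BYTES // intervals
```

Under that reading, the budget works out to roughly 2 KB per 60 ms interval, shared across everyone speaking in that interval, which is why the five-minute figure is approximate: a busier channel fills the buffer sooner.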

When a participant submits a voice report, their Fortnite client will request a ticket from another of Epic’s backend services in order to upload its audio buffer. This audio buffer contains all captured audio from the last five minutes, including audio from other participants in the conversation. The reporter’s Fortnite client will then upload the signed audio chunks from its audio buffer into an encrypted S3 bucket. All operations to the bucket are tracked and monitored to ensure that only authorized systems interact with the uploaded audio. When the upload is finished, the Fortnite client attaches the upload ID to a player report and sends that data to our backend.

Epic backend systems ingest the uploaded audio, which consists of digitally signed, multi-track audio in 60 ms chunks. The systems then verify the digital signature on every chunk, ensuring that the account responsible for each chunk signed it with a key that Epic’s backend service countersigned when the user launched Fortnite. The audio is then transcoded into individual Opus audio tracks per participant and forwarded to Epic’s internal moderation team for review.
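The verify-then-split step can be sketched as follows. The chunk layout and the `build_tracks` function are hypothetical, and the `verify` callable stands in for the Ed25519 check against the key registered for each account. Opus transcoding is out of scope here.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# (account_id, sequence, audio_frame, signature): a hypothetical chunk layout
Chunk = Tuple[str, int, bytes, bytes]


def build_tracks(
    chunks: Iterable[Chunk],
    verify: Callable[[str, int, bytes, bytes], bool],
) -> Dict[str, List[bytes]]:
    """Group verified audio frames by account, ordered by sequence number.

    Chunks that fail verification are discarded rather than attributed.
    """
    per_account: Dict[str, Dict[int, bytes]] = defaultdict(dict)
    for account_id, sequence, audio, signature in chunks:
        if verify(account_id, sequence, audio, signature):
            # Keep the first frame seen for a given sequence number.
            per_account[account_id].setdefault(sequence, audio)
    return {
        account_id: [frames[seq] for seq in sorted(frames)]
        for account_id, frames in per_account.items()
    }


# Placeholder verifier for the demo only: accepts chunks whose signature is b"ok".
def demo_verify(account_id: str, sequence: int, audio: bytes, signature: bytes) -> bool:
    return signature == b"ok"


uploaded = [
    ("alice", 2, b"A2", b"ok"),
    ("bob", 1, b"B1", b"ok"),
    ("alice", 1, b"A1", b"ok"),
    ("alice", 3, b"A3", b"bad"),  # fails verification, so it is dropped
]
tracks = build_tracks(uploaded, demo_verify)
```

Verifying before grouping means no frame with an invalid signature ends up in any participant's track.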
 
Figure: Moderator interface for reviewing voice reports

We use automated tools built by Contex.ai to complement our human moderation efforts. Contex.ai, which joined Epic Games earlier this year, develops moderation tools that use machine learning to detect inappropriate audio, text, and images. Contex.ai’s technology is already being used to help flag content in Unreal Editor for Fortnite (UEFN) that violates our guidelines before it’s published to the community.
 
If the review process confirms the suspected violation of our rules, or finds a new one, the reported audio will be copied into a separate location under the control of our sanction management system.

All copies of the uploaded audio (including the original) will be deleted after 14 days or the duration of the sanction, whichever is longer. In the event a sanction is appealed, retention may be extended for up to 14 days so the sanction decision can be reviewed. If Epic Games needs to retain an audio clip to comply with legal obligations, it will be retained for as long as legally required.
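The retention rule reduces to a small calculation. The sketch below is one reading of the policy, not Epic's implementation: it assumes the retention clock starts at upload time and that an appeal extension is added on top of the base period, neither of which the text states explicitly. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

BASE_RETENTION = timedelta(days=14)
MAX_APPEAL_EXTENSION = timedelta(days=14)


def deletion_time(
    uploaded_at: datetime,
    sanction_duration: Optional[timedelta] = None,
    appeal_extension: timedelta = timedelta(0),
    legal_hold: bool = False,
) -> Optional[datetime]:
    """Return when the audio should be deleted, or None while a legal hold applies."""
    if legal_hold:
        return None  # retained for as long as legally required
    # 14 days or the duration of the sanction, whichever is longer.
    retention = max(BASE_RETENTION, sanction_duration or timedelta(0))
    # An appeal may extend retention, capped at 14 days.
    extension = min(max(appeal_extension, timedelta(0)), MAX_APPEAL_EXTENSION)
    return uploaded_at + retention + extension


uploaded = datetime(2023, 11, 16)
no_sanction = deletion_time(uploaded)
long_sanction = deletion_time(uploaded, sanction_duration=timedelta(days=30))
appealed = deletion_time(
    uploaded,
    sanction_duration=timedelta(days=3),
    appeal_extension=timedelta(days=20),  # capped at 14 days
)
```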

You can read the voice reporting FAQs to learn more, or visit safety.epicgames.com.
