Auto-mixing
What are some common usage scenarios?
Choosing the Mixing Mode:
How to choose the Voice Replacement mode:
Simply set the original_sound_volume parameter to 0.0. This removes the background sound, so no mixing actually takes place. All other parameters that control the background sound and sound overlay are ignored.
How to choose the Voiceover mode:
If original_sound_volume is not equal to 0.0 and the list of speaker IDs to replace with synthesis is empty (narrator_speaker_ids=[]), the synthesis overlay mode is used for all speakers.
Note: With these settings, the remove_voice parameter must be set to 0. Otherwise, all speakers will be passed to narrator_speaker_ids, and the final mode will be Smart Voiceover.
How to choose the Custom Voiceover mode:
If original_sound_volume is not equal to 0.0 and the list of speaker IDs for voice replacement with synthesis (narrator_speaker_ids) is not empty, speech replacement is performed for those speakers, and synthesis is overlaid on the speech of the remaining speakers.
How to choose the Smart Voiceover mode:
If original_sound_volume is not equal to 0.0 and all speaker IDs are listed in narrator_speaker_ids, voice replacement occurs for all speakers, which results in the Smart Voiceover mode. Similarly, if the list is empty (narrator_speaker_ids=[]) but remove_voice=1, all speakers are passed to narrator_speaker_ids.
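The selection rules above can be summarized as a small decision function. This is only an illustrative sketch, not the service's implementation: the function name resolve_mixing_mode and the all_speaker_ids argument (the full list of speakers detected in the video) are hypothetical, and the case where remove_voice=1 is combined with a non-empty narrator_speaker_ids is not described in the text, so it is treated here like any other non-empty list.

```python
def resolve_mixing_mode(original_sound_volume, narrator_speaker_ids,
                        remove_voice, all_speaker_ids):
    """Return the mixing mode implied by the parameters (illustrative only)."""
    # Rule 1: no original sound at all means plain voice replacement.
    if original_sound_volume == 0.0:
        return "Voice Replacement"

    replaced = set(narrator_speaker_ids)
    everyone = set(all_speaker_ids)  # hypothetical: all speakers in the video

    # An empty list with remove_voice=1 is expanded to every speaker.
    if not replaced and remove_voice == 1:
        replaced = everyone

    if not replaced:
        return "Voiceover"          # synthesis overlaid on all speakers
    if replaced >= everyone:
        return "Smart Voiceover"    # every speaker is replaced
    return "Custom Voiceover"       # some replaced, the rest overlaid


print(resolve_mixing_mode(0.5, [1], 0, all_speaker_ids=[1, 2]))
```

Here speaker 1 is replaced and speaker 2 is overlaid, so the sketch reports the Custom Voiceover mode.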
Speaker Control for Speech Replacement and Synthesis Overlay:
I want no sounds other than synthesis at all.
Choose the Voice Replacement mode.
I want synthesis to be overlaid on the voice of all speakers.
Choose the Voiceover mode.
I want synthesis to replace the original speech for all speakers.
Choose the Smart Voiceover mode.
I want to choose the speakers whose voices are replaced, and overlay the synthesis on top of the original speech for the rest.
Choose the Custom Voiceover mode.
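Going the other way, from a desired mode to parameter values, might look like the sketch below. The helper params_for_mode is hypothetical, as is the default original_sound_volume of 1.0 (the valid range of that parameter is not stated here). The parameter names themselves are the ones used in this section.

```python
def params_for_mode(mode, all_speaker_ids, replaced_speaker_ids=(),
                    original_sound_volume=1.0):
    """Build a parameter dict for the requested mode (illustrative only)."""
    if mode == "Voice Replacement":
        # Only synthesis is heard; the other mixing parameters are ignored.
        return {"original_sound_volume": 0.0}

    if original_sound_volume == 0.0:
        raise ValueError("original_sound_volume must be non-zero for this mode")

    if mode == "Voiceover":
        # Empty list and remove_voice=0: overlay synthesis on every speaker.
        return {"original_sound_volume": original_sound_volume,
                "narrator_speaker_ids": [],
                "remove_voice": 0}
    if mode == "Smart Voiceover":
        # List every speaker so that all of them are replaced.
        return {"original_sound_volume": original_sound_volume,
                "narrator_speaker_ids": list(all_speaker_ids)}
    if mode == "Custom Voiceover":
        if not replaced_speaker_ids:
            raise ValueError("Custom Voiceover needs at least one speaker ID")
        return {"original_sound_volume": original_sound_volume,
                "narrator_speaker_ids": list(replaced_speaker_ids)}
    raise ValueError(f"unknown mode: {mode}")


print(params_for_mode("Custom Voiceover", [1, 2, 3], replaced_speaker_ids=[2]))
```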
Adjusting Volume/Sounds within Synthesis Overlay Segments:
When voiceover is used, the original speakers’ speech is too loud compared to the synthesis.
The voiceover_muffle_db parameter controls how much the volume of the original speech is reduced where the synthesis is overlaid on it. The higher the value, the quieter the original speech.
The synthesized voice is too quiet/loud compared to the background track.
To adjust the volume of the synthesis, you can use the transfer_voice_loudness=True flag. This transfers the loudness of all lines from the original speech to the synthesis lines. Alternatively, you can manually adjust the overall loudness of the synthesis track by setting the tts_loudness parameter. Changing transfer_voice_loudness and tts_loudness at the same time is not recommended, as their combined effect on the track volume may be unpredictable.
Important: Changing these parameters does affect the volume of the original speech.
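A client-side check for the discouraged combination could be sketched as follows. The helper check_loudness_settings is hypothetical, and it assumes that an unset tts_loudness is represented by a missing key or None. How the service itself reacts to both settings is not specified here.

```python
def check_loudness_settings(params):
    """Return a list of warnings about the synthesis loudness settings."""
    warnings = []
    uses_transfer = bool(params.get("transfer_voice_loudness"))
    # Assumption: tts_loudness counts as "set" when present and not None.
    uses_manual = params.get("tts_loudness") is not None
    if uses_transfer and uses_manual:
        warnings.append(
            "transfer_voice_loudness and tts_loudness are both set; "
            "their combined effect on the track volume may be unpredictable"
        )
    return warnings


# The value -20 is an arbitrary placeholder, not a recommended setting.
print(check_loudness_settings({"transfer_voice_loudness": True,
                               "tts_loudness": -20}))
```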
Artifacts/distortions of the background are audible during the synthesis overlay.
If no client track is used, we use AI to obtain our own background track, and artifacts may appear. By default, we leave the background volume unchanged and only reduce the volume of the original speech. This behavior can be disabled by setting voiceover_fix_background_volume=False, in which case the background volume is also reduced and potential artifacts/distortions become less noticeable.
There is a client background track, but in moments of synthesis overlay, the background is very loud.
If a client track is used and the background should be quieter during the synthesis overlay, set the background_muffle_db parameter. The higher its value, the quieter the background will be during the overlay.
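To get a feel for the numbers, the sketch below converts a reduction in decibels into a linear amplitude factor. It assumes that voiceover_muffle_db and background_muffle_db are attenuations expressed in decibels, which the _db suffix suggests but this section does not state. The helper db_to_gain is hypothetical, and the formula is the standard amplitude conversion, gain = 10^(-dB/20).

```python
def db_to_gain(reduction_db):
    """Convert an attenuation in dB to a linear amplitude multiplier."""
    return 10 ** (-reduction_db / 20)


# Larger reductions give smaller multipliers, i.e. a quieter signal.
for reduction_db in (0, 6, 12, 20):
    print(f"{reduction_db:>2} dB reduction -> amplitude x {db_to_gain(reduction_db):.3f}")
```

Under this assumption, a 6 dB reduction roughly halves the amplitude and a 20 dB reduction leaves one tenth of it.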
Adjusting the Volume/Sounds outside Synthesis Overlay Segments:
The original speaker’s voice is loud, becomes quieter while the synthesis plays, and then returns to its original volume, which is very noticeable. How can it be made consistently quiet throughout the video?
To keep the volume of the original speech consistent throughout the entire video, set voiceover_fix_vocals_volume=True. This fixes the volume of the original voice at a level reduced by the selected voiceover_muffle_db value.
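The difference between the two behaviors can be pictured as a gain applied to the original voice over time. The sketch below is a simplified model, not the actual mixer: the function original_voice_gain and the overlay_segments list of (start, end) times in seconds are hypothetical, the reduction is assumed to be in decibels, and any fades at segment boundaries are ignored.

```python
def original_voice_gain(t, overlay_segments, voiceover_muffle_db,
                        voiceover_fix_vocals_volume=False):
    """Linear gain applied to the original voice at time t (simplified model)."""
    muffled = 10 ** (-voiceover_muffle_db / 20)  # assumes a reduction in dB
    if voiceover_fix_vocals_volume:
        # Reduced level is held for the whole video.
        return muffled
    # Otherwise the voice is reduced only while synthesis is playing.
    in_overlay = any(start <= t < end for start, end in overlay_segments)
    return muffled if in_overlay else 1.0


segments = [(2.0, 4.0)]  # synthesis plays from 2 s to 4 s
for t in (1.0, 3.0, 5.0):
    default = original_voice_gain(t, segments, 20)
    fixed = original_voice_gain(t, segments, 20, voiceover_fix_vocals_volume=True)
    print(f"t={t}: default gain {default:.2f}, fixed gain {fixed:.2f}")
```

In the default case the gain jumps between 1.0 and the reduced level at segment boundaries, which is the audible change described in the question. With the flag set, the gain stays at the reduced level throughout.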