May 8th, 2026
New
In addition to recreating the original speaker’s voice and performance, Emotion Transfer can now recreate the original chunk’s performance with any supported voice.
Choose the voice that fits the content, character, or localization direction while keeping the original delivery, including intonation, rhythm, intensity, and duration.
Why it matters
With this update, it is possible to:
choose any supported voice while keeping the delivery of the original speech;
save several generated takes and compare them before selecting the best one;
fine-tune intonation, intensity, variation, and duration;
recreate performance chunk by chunk, in bulk, or directly to speakers.
There is no need to choose between the right voice and the right performance. The original delivery can be preserved while using a different supported voice, giving more control over casting, style, and final dubbing quality.
Performance Mode is available for S-tier, A-tier, Expressive cloned, Standard cloned, and Voice Design voices.
Supported target languages and locales: English (US), English (British), German, Spanish (Castilian), Spanish (Latin American), French (Parisian), Italian, and Russian.
Learn more
For setup details and usage guidance, see the documentation.