Voice Cloning

Voice cloning in Dubformer Studio allows you to recreate a speaker’s voice and use it across your projects.

Two cloning modes are available: Expressive and Standard.

Expressive Voice Cloning

Compared with Standard cloning, Expressive cloning delivers more natural speech, a wider emotional range, and higher audio fidelity, which makes it the preferred choice for high-quality, production-ready content.

What it does

Produces more lifelike and engaging voiceovers
Supports a broader set of languages

Emotional control

Expressive cloning supports emotion tags, similar to S-tier voices.

Use square brackets directly in the script to guide delivery:

[laughs]
[whispers]
[sighs]
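
Because a mistyped tag is easy to miss in a long script, it can help to check the script text before pasting it into the editor. The following is a minimal local sketch, not part of Dubformer Studio: the function names and the KNOWN_TAGS set are hypothetical, and the set contains only the three tags listed above, so the full list of supported tags may differ.

```python
import re

# Assumption: only the tags shown in this guide; the supported set may be larger.
KNOWN_TAGS = {"laughs", "whispers", "sighs"}

# Matches one square-bracketed tag such as [laughs].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def find_tags(script: str) -> list[str]:
    """Return every bracketed tag in the script, lowercased, in order of appearance."""
    return [m.group(1).strip().lower() for m in TAG_PATTERN.finditer(script)]


def unknown_tags(script: str) -> list[str]:
    """Return tags that are not in KNOWN_TAGS, for example a typo such as [wispers]."""
    return [tag for tag in find_tags(script) if tag not in KNOWN_TAGS]


if __name__ == "__main__":
    line = "[whispers] I think someone is coming. [sighs] Never mind."
    print(find_tags(line))
    print(unknown_tags("[wispers] Can you hear me?"))
```

The check is deliberately conservative: it only reports tags that are absent from the local list, so extend KNOWN_TAGS as you confirm further tags in the product.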

Standard Voice Cloning

Standard cloning is a reliable and scalable option for typical dubbing workflows.

What it does

Recreates the speaker’s voice with consistent quality
Focuses on clarity and stability rather than expressive range
Optimized for high-volume production

How it works

1. During project creation

If you enable the Use automated voice selection option, you can configure voice cloning upfront.

Select the cloning mode: Expressive or Standard
The selected mode will be applied automatically

Note
You can change the cloning mode later in the Speakers tab during Prooflistening or Lip-sync tasks.

2. During Voice selection, Prooflistening, or Lip-sync tasks

Go to the Speakers tab
Click Use cloning to apply cloning to all speakers at once
Select the cloning mode (Expressive or Standard)
Click Confirm

A notification will appear indicating that cloning is in progress.

Once cloning is complete, the voices will be displayed as Expressive clone or Standard clone, depending on the selected mode.

To change the cloning mode for an individual speaker:

Click the settings (gear) icon next to the cloned voice
Select the desired mode: Expressive or Standard
Click Accept
Save progress and Process the changes

If the results need improvement, refine the reference chunks used for cloning.