Aytada generates professional voiceover as part of the video ad pipeline. The voiceover is synthesized from your ad script, delivered by an automatically selected voice matched to your persuasion trigger, and merged with your video clips during the final stitching step. You can also clone a spokesperson voice from a short audio sample and use it as a drop-in replacement on any project. Cost: 5 credits per voiceover.

Voice synthesis

Aytada uses a high-fidelity text-to-speech system as its primary voiceover model. It produces natural, expressive narration suitable for direct-response ads, brand films, and UGC-style content. Emotion tags embedded in the script (for example, [excited] at the hook or [urgently] at the CTA) allow different sections of the script to carry different emotional registers without changing the voice. If the primary synthesis service is unavailable or times out, a fast fallback model activates automatically. You receive the same credit deduction and a finished voiceover either way — the switch is transparent.
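To make the emotion-tag idea concrete, here is a minimal sketch of how a tagged script could be split into sections that each carry one emotional register. It is not Aytada's parser. The tag grammar (one lowercase word in square brackets), the `neutral` default, and the function name `split_by_emotion` are assumptions for illustration; only the `[excited]` and `[urgently]` tags come from the text above.

```python
import re
from typing import List, Tuple

# Assumption: a tag is a single lowercase word in square brackets, e.g. [excited].
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def split_by_emotion(script: str, default: str = "neutral") -> List[Tuple[str, str]]:
    """Return (emotion, text) pairs in script order.

    Text before the first tag is assigned the hypothetical default emotion.
    """
    segments: List[Tuple[str, str]] = []
    emotion = default
    position = 0
    for match in TAG_PATTERN.finditer(script):
        text = script[position:match.start()].strip()
        if text:
            segments.append((emotion, text))
        # Each tag applies to the text that follows it, until the next tag.
        emotion = match.group(1)
        position = match.end()
    tail = script[position:].strip()
    if tail:
        segments.append((emotion, tail))
    return segments


script = "[excited] Meet the bottle that keeps ice for 24 hours. [urgently] Order today."
for emotion, text in split_by_emotion(script):
    print(f"{emotion}: {text}")
```

The voice stays the same across all segments; only the emotion label attached to each section changes.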

Voice selection

Aytada automatically selects the best-matched voice for your persuasion trigger. You can also choose manually from eight available voices in the Studio.
| Voice | Character | Persuasion trigger match |
| --- | --- | --- |
| Adam | Professional male | Authority |
| Jessica | Expressive American female | Liking, Reciprocity |
| Chris | Casual male | Social Proof, Consistency |
| Charlotte | Energetic female | Scarcity & FOMO |
| George | Warm British male | Luxury / Cinematic |
| Sarah | Warm female | Emotional |
| Daniel | Deep male | Direct Response |
| Laura | Upbeat female | Trend / UGC |
Voice auto-selection uses your persuasion trigger as the primary signal. If you manually override the voice in the Studio, your selection is preserved for all future regenerations on that project.
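The selection rule described above can be sketched as a lookup with an override. This is an illustration, not Aytada's implementation. The trigger labels and voice names are copied verbatim from the voice table, while the names `VOICE_BY_TRIGGER` and `resolve_voice` are hypothetical.

```python
from typing import Optional

# Trigger-to-voice pairs copied from the voice table in this section.
VOICE_BY_TRIGGER = {
    "Authority": "Adam",
    "Liking": "Jessica",
    "Reciprocity": "Jessica",
    "Social Proof": "Chris",
    "Consistency": "Chris",
    "Scarcity & FOMO": "Charlotte",
    "Luxury / Cinematic": "George",
    "Emotional": "Sarah",
    "Direct Response": "Daniel",
    "Trend / UGC": "Laura",
}


def resolve_voice(trigger: str, manual_override: Optional[str] = None) -> str:
    """Pick a voice: a manual choice wins, otherwise the trigger decides."""
    if manual_override is not None:
        # A manual selection is kept for later regenerations on the project.
        return manual_override
    # Raises KeyError for a trigger that is not in the table.
    return VOICE_BY_TRIGGER[trigger]


print(resolve_voice("Authority"))                           # prints Adam
print(resolve_voice("Authority", manual_override="Laura"))  # prints Laura
```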

Tone-aware delivery settings

Each ad tone maps to a set of voice delivery parameters that control naturalness, expressiveness, and consistency. These settings are applied automatically when you generate voiceover.
| Ad tone | Stability | Similarity | Style | Effect |
| --- | --- | --- | --- | --- |
| Funny | 0.35 | 0.75 | 0.70 | High expressiveness, comedic timing, natural variation |
| Luxury | 0.80 | 0.85 | 0.15 | Controlled, measured, minimal stylization |
| Aggressive | 0.40 | 0.80 | 0.65 | High energy, punchy delivery |
| Minimal | 0.70 | 0.80 | 0.20 | Clean and restrained |
| Professional | 0.60 | 0.80 | 0.30 | Balanced clarity and warmth |
You do not need to configure these settings manually. Setting your ad tone in Step 2 of the wizard is sufficient.
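For readers who want to see the mapping as data, the sketch below encodes the tone table as a lookup. The numbers are taken from the table; the `DeliverySettings` class, the `TONE_SETTINGS` name, and the `settings_for` helper are hypothetical and do not describe an Aytada API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliverySettings:
    stability: float   # higher values give more consistent delivery
    similarity: float
    style: float       # higher values give more stylized delivery


# Values copied from the tone table in this section.
TONE_SETTINGS = {
    "Funny": DeliverySettings(stability=0.35, similarity=0.75, style=0.70),
    "Luxury": DeliverySettings(stability=0.80, similarity=0.85, style=0.15),
    "Aggressive": DeliverySettings(stability=0.40, similarity=0.80, style=0.65),
    "Minimal": DeliverySettings(stability=0.70, similarity=0.80, style=0.20),
    "Professional": DeliverySettings(stability=0.60, similarity=0.80, style=0.30),
}


def settings_for(tone: str) -> DeliverySettings:
    """Look up the delivery settings for an ad tone (KeyError if unknown)."""
    return TONE_SETTINGS[tone]


print(settings_for("Luxury"))
```

Read side by side, the two extremes are easy to see: Funny pairs the lowest stability with the highest style, and Luxury does the opposite.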

How voiceover is merged with video

After the voiceover is generated, it is stored as an audio asset attached to your project. During the Stitch step, the cloud render service sequences all video clips on one track and overlays the voiceover on a separate track. The video’s own audio (ambient sound, music generated natively by the video model) is automatically ducked to 30% volume so the voiceover remains clear and intelligible throughout. If you generated a background music track or jingle separately, it can be mixed in at this same step as an additional audio layer.
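As a rough illustration of what ducking to 30% means, the sketch below mixes two sample lists and converts the gain to decibels. It assumes that "30% volume" is a linear amplitude gain of 0.30 and that samples are floats in the range -1.0 to 1.0. The render service's actual mixing logic is not documented here, and `DUCK_GAIN`, `gain_to_db`, and `mix` are illustrative names.

```python
import math
from typing import List

DUCK_GAIN = 0.30  # assumption: "30% volume" is a linear amplitude gain


def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


def mix(video_audio: List[float], voiceover: List[float],
        duck_gain: float = DUCK_GAIN) -> List[float]:
    """Overlay the voiceover on the ducked video audio, sample by sample."""
    length = max(len(video_audio), len(voiceover))
    mixed = []
    for i in range(length):
        # A track that has ended contributes silence.
        bed = video_audio[i] * duck_gain if i < len(video_audio) else 0.0
        voice = voiceover[i] if i < len(voiceover) else 0.0
        # Clamp the sum so it stays inside the valid sample range.
        mixed.append(max(-1.0, min(1.0, bed + voice)))
    return mixed


print(round(gain_to_db(DUCK_GAIN), 1))  # prints -10.5
```

Under that assumption, the video's own audio sits roughly 10.5 dB below its original level while the voiceover plays at full level.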

Voice cloning

If you have a specific spokesperson — yourself, a brand character, or a talent you have rights to — you can clone their voice from a short audio sample and use it for all voiceovers on your account.
1. Record or prepare a sample

Record a clean audio sample of the voice you want to clone. The sample should be 5–30 seconds long, contain only one speaker, and be free of background music, echo, or noise. MP3 and WAV formats are both accepted.

2. Upload in the Studio

Open your project in the Studio and navigate to the Voiceover module. Select Clone a Voice, then upload the audio file.

3. Generate with the cloned voice

Once the clone is processed (typically under 30 seconds), it appears as a selectable voice in the voice picker. Select it and click Generate Voiceover. The cloned voice is used in place of the standard library voices.
Only clone voices you have explicit permission to use: your own voice, a voice actor you have licensed, or a spokesperson who has consented. Cloning a voice without permission may violate the rights of the voice owner and Aytada’s terms of service.
For best cloning results, record in a quiet room with a directional microphone. Avoid samples with music, reverb, or multiple speakers. A high-quality 10–15 second clip produces results comparable to a 30-second clip.
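Before uploading, it can help to confirm locally that a sample falls inside the 5–30 second window. The sketch below does this for WAV files with Python's standard `wave` module. It is a local pre-check only, not part of Aytada: it does not read MP3 files and cannot detect background noise or multiple speakers. The helper names, including `make_silent_wav` (used only to build a demo file in memory), are hypothetical.

```python
import io
import wave

MIN_SECONDS = 5.0   # lower bound stated in step 1
MAX_SECONDS = 30.0  # upper bound stated in step 1


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file given a path or a binary file object."""
    with wave.open(source, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_valid_sample_length(seconds: float) -> bool:
    """Check the 5-30 second window for a cloning sample."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS


def make_silent_wav(seconds: int, sample_rate: int = 8000) -> io.BytesIO:
    """Build a silent mono 16-bit WAV in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * sample_rate * seconds)
    buffer.seek(0)
    return buffer


duration = wav_duration_seconds(make_silent_wav(10))
print(duration, is_valid_sample_length(duration))  # prints 10.0 True
```

In practice you would pass the path of your own recording to `wav_duration_seconds` instead of the generated buffer.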

Regenerating voiceover

You can regenerate the voiceover on any existing project from the Studio without re-rendering the video clips. This is useful when:
  • You want to try a different voice or persuasion trigger
  • The script was edited after the initial voiceover was generated
  • You want to apply a cloned voice to a project that previously used a library voice
Each regeneration costs 5 credits.
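The credit arithmetic is simple enough to estimate up front. The sketch below assumes that the initial voiceover and every regeneration each cost the stated 5 credits and that nothing else is charged for voiceover; the function name is hypothetical.

```python
CREDITS_PER_VOICEOVER = 5  # stated cost per voiceover and per regeneration


def voiceover_credits(regenerations: int) -> int:
    """Total credits for one initial voiceover plus N regenerations."""
    if regenerations < 0:
        raise ValueError("regenerations must be zero or more")
    return CREDITS_PER_VOICEOVER * (1 + regenerations)


print(voiceover_credits(0))  # prints 5
print(voiceover_credits(3))  # prints 20
```

For example, trying three alternative voices after the first render would bring the voiceover total for that project to 20 credits.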

Generate video ads

Walk through the full video ad creation workflow, including voiceover and stitch steps.

Brand jingles

Add a background music track that mixes with your voiceover in the final render.