Voice & Language

Voice Configuration Overview

Language Configuration

| Parameter | Description | Example |
|-----------|-------------|---------|
| name | Human-readable name | "English" |
| code | Language code for STT | "en-US" |
| voice | TTS voice identifier | "rime.spore" or "elevenlabs.josh:eleven_turbo_v2_5" |

Fillers (Natural Speech)

| Parameter | Description | Example |
|-----------|-------------|---------|
| speech_fillers | Used during natural conversation pauses | ["Um", "Well", "So"] |
| function_fillers | Used while executing a function | ["Let me check...", "One moment..."] |

Adding a Language

Basic Configuration

The add_language method configures Text-to-Speech and Speech-to-Text for an agent:

| Language | Syntax |
|----------|--------|
| Python | agent.add_language("English", "en-US", "rime.spore") |
| TypeScript | agent.addLanguage({ name: 'English', code: 'en-US', voice: 'rime.spore' }) |

    from signalwire import AgentBase

    class MyAgent(AgentBase):
        def __init__(self):
            super().__init__(name="my-agent")

            # Basic language setup
            self.add_language(
                name="English",     # Display name
                code="en-US",       # Language code for STT
                voice="rime.spore"  # TTS voice
            )

Voice Format

The voice parameter uses the format engine.voice:model where model is optional:

    # Simple voice (engine.voice)
    self.add_language("English", "en-US", "rime.spore")

    # With model (engine.voice:model)
    self.add_language("English", "en-US", "elevenlabs.josh:eleven_turbo_v2_5")
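
To make the engine.voice:model format concrete, here is a minimal sketch of a parser for such strings. The parse_voice helper and the VoiceSpec type are hypothetical names introduced for illustration and are not part of the SDK; the sketch assumes the engine code never contains a dot and the voice name never contains a colon.

```python
from typing import NamedTuple, Optional


class VoiceSpec(NamedTuple):
    """Parts of a voice string (hypothetical type, not part of the SDK)."""
    engine: str
    voice: str
    model: Optional[str]


def parse_voice(spec: str) -> VoiceSpec:
    """Split 'engine.voice' or 'engine.voice:model' into its parts."""
    # The engine code is everything before the first dot.
    engine, dot, rest = spec.partition(".")
    if not dot or not engine or not rest:
        raise ValueError(f"expected 'engine.voice[:model]', got {spec!r}")
    # The optional model follows the first colon after the voice name.
    voice, colon, model = rest.partition(":")
    if not voice or (colon and not model):
        raise ValueError(f"expected 'engine.voice[:model]', got {spec!r}")
    return VoiceSpec(engine, voice, model or None)


print(parse_voice("rime.spore"))
print(parse_voice("elevenlabs.josh:eleven_turbo_v2_5"))
```

A check like this can catch a malformed voice string in configuration before it reaches add_language.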

Available TTS Engines

| Provider | Engine Code | Example Voice | Reference |
|----------|-------------|---------------|-----------|
| Amazon Polly | amazon | amazon.Joanna-Neural | Voice IDs |
| Cartesia | cartesia | cartesia.a167e0f3-df7e-4d52-a9c3-f949145efdab | Voice IDs |
| Deepgram | deepgram | deepgram.aura-asteria-en | Voice IDs |
| ElevenLabs | elevenlabs | elevenlabs.thomas | Voice IDs |
| Google Cloud | gcloud | gcloud.en-US-Casual-K | Voice IDs |
| Microsoft Azure | azure | azure.en-US-AvaNeural | Voice IDs |
| OpenAI | openai | openai.alloy | Voice IDs |
| Rime | rime | rime.luna:arcana | Voice IDs |

Filler Phrases

Add natural pauses and filler words:

    self.add_language(
        name="English",
        code="en-US",
        voice="rime.spore",
        speech_fillers=[
            "Um",
            "Well",
            "Let me think",
            "So"
        ],
        function_fillers=[
            "Let me check that for you",
            "One moment please",
            "I'm looking that up now",
            "Bear with me"
        ]
    )

Speech fillers: Used during natural conversation pauses

Function fillers: Used while the AI is executing a function

Multi-Language Support

Use code="multi" for automatic language detection and matching:

    from signalwire import AgentBase

    class MultilingualAgent(AgentBase):
        def __init__(self):
            super().__init__(name="multilingual-agent")

            # Multi-language support (auto-detects and matches caller's language)
            self.add_language(
                name="Multilingual",
                code="multi",
                voice="rime.spore"
            )

            self.prompt_add_section(
                "Language",
                "Automatically detect and match the caller's language without "
                "prompting or asking them to verify. Respond naturally in whatever "
                "language they speak."
            )

The multi code supports: English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch.

Speech recognition hints do not work when using code="multi". If you need hints for specific terms, use individual language codes instead.
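
The trade-off between code="multi" and individual language codes can be sketched as a small decision helper. The pick_stt_codes function below is a hypothetical name used only for illustration; it assumes the base language is the part of a code before the first hyphen, and it encodes just the two constraints stated above: the list of languages that multi covers, and the fact that hints do not work with multi.

```python
# Base codes for the languages that code="multi" covers, per the list above.
MULTI_LANGUAGES = {"en", "es", "fr", "de", "hi", "ru", "pt", "ja", "it", "nl"}


def pick_stt_codes(expected_codes, needs_hints):
    """Return ["multi"] when it fits, otherwise the individual codes unchanged.

    Hypothetical helper: not part of the SDK.
    """
    # Assumption: the base language is the part before the first hyphen.
    bases = {code.split("-", 1)[0] for code in expected_codes}
    if not needs_hints and bases and bases <= MULTI_LANGUAGES:
        return ["multi"]
    # Hints are required, or a language falls outside what "multi" covers.
    return list(expected_codes)


print(pick_stt_codes(["en-US", "es"], needs_hints=False))
print(pick_stt_codes(["en-US", "es"], needs_hints=True))
print(pick_stt_codes(["en-US", "ko"], needs_hints=False))
```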

For more control, configure each language individually with its own fillers:

    from signalwire import AgentBase

    class CustomMultilingualAgent(AgentBase):
        def __init__(self):
            super().__init__(name="custom-multilingual")

            # English (primary)
            self.add_language(
                name="English", code="en-US", voice="rime.spore",
                speech_fillers=["Um", "Well", "So"],
                function_fillers=["Let me check that"]
            )

            # Spanish
            self.add_language(
                name="Spanish", code="es-MX", voice="rime.luna",
                speech_fillers=["Eh", "Pues", "Bueno"],
                function_fillers=["Dejame verificar", "Un momento"]
            )

            # French
            self.add_language(
                name="French", code="fr-FR", voice="rime.claire",
                speech_fillers=["Euh", "Alors", "Bon"],
                function_fillers=["Laissez-moi verifier", "Un instant"]
            )

            self.prompt_add_section(
                "Language",
                "Automatically detect and match the caller's language without "
                "prompting or asking them to verify."
            )
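
When the list of languages grows, the repeated add_language calls can be driven from data instead. The sketch below is one possible arrangement, not the SDK's prescribed pattern: register_languages and RecordingAgent are hypothetical names, and RecordingAgent is only a stand-in that records calls so the example runs without the SDK installed. It relies on the keyword arguments shown above (name, code, voice, speech_fillers, function_fillers).

```python
# Per-language settings, using the same keyword arguments as add_language above.
LANGUAGES = [
    {"name": "English", "code": "en-US", "voice": "rime.spore",
     "speech_fillers": ["Um", "Well", "So"],
     "function_fillers": ["Let me check that"]},
    {"name": "Spanish", "code": "es-MX", "voice": "rime.luna",
     "speech_fillers": ["Eh", "Pues", "Bueno"],
     "function_fillers": ["Dejame verificar", "Un momento"]},
]


def register_languages(agent, languages):
    """Call agent.add_language once per entry (hypothetical helper)."""
    for lang in languages:
        agent.add_language(**lang)


class RecordingAgent:
    """Stand-in for an agent; it only records what was registered."""

    def __init__(self):
        self.languages = []

    def add_language(self, name, code, voice,
                     speech_fillers=None, function_fillers=None):
        self.languages.append((name, code, voice))


agent = RecordingAgent()
register_languages(agent, LANGUAGES)
print(agent.languages)
```

Inside a real agent, the same loop would be called as register_languages(self, LANGUAGES) from __init__.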

Pronunciation Rules

Fix pronunciation of specific words:

| Language | Syntax |
|----------|--------|
| Python | agent.add_pronunciation(replace="ACME", with_text="Ack-me") |
| TypeScript | agent.addPronunciation({ replace: 'ACME', with: 'Ack-me' }) |

    from signalwire import AgentBase

    class AgentWithPronunciation(AgentBase):
        def __init__(self):
            super().__init__(name="pronunciation-agent")
            self.add_language("English", "en-US", "rime.spore")

            # Fix brand names
            self.add_pronunciation(replace="ACME", with_text="Ack-me")

            # Fix technical terms
            self.add_pronunciation(replace="SQL", with_text="sequel")

            # Case-insensitive matching
            self.add_pronunciation(replace="api", with_text="A P I", ignore_case=True)

            # Fix names
            self.add_pronunciation(replace="Nguyen", with_text="win")

Set Multiple Pronunciations

    # Set all pronunciations at once
    self.set_pronunciations([
        {"replace": "ACME", "with": "Ack-me"},
        {"replace": "SQL", "with": "sequel"},
        {"replace": "API", "with": "A P I", "ignore_case": True},
        {"replace": "CEO", "with": "C E O"},
        {"replace": "ASAP", "with": "A sap"}
    ])
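
To preview what a rule list would do to a sentence before a call is placed, a local simulation can help. The sketch below is an approximation only: preview_pronunciations is a hypothetical helper, and it assumes rules are applied in order as plain substring replacements, which may differ from how the platform actually matches text.

```python
import re


def preview_pronunciations(text, rules):
    """Apply replace/with rules in order (local approximation only)."""
    for rule in rules:
        flags = re.IGNORECASE if rule.get("ignore_case") else 0
        # re.escape makes the search text literal, so ".com" or "@" are safe.
        pattern = re.compile(re.escape(rule["replace"]), flags)
        # A function replacement avoids backslash interpretation in "with".
        text = pattern.sub(lambda _match, repl=rule["with"]: repl, text)
    return text


rules = [
    {"replace": "ACME", "with": "Ack-me"},
    {"replace": "SQL", "with": "sequel"},
    {"replace": "API", "with": "A P I", "ignore_case": True},
]

print(preview_pronunciations("ACME exposes a SQL api", rules))
# prints: Ack-me exposes a sequel A P I
```

Under the substring assumption, a case-insensitive rule for "API" would also match inside a word such as "rapid", which is one reason to try rules against realistic sentences first.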

Voice Selection Guide

Choosing the right TTS engine and voice significantly impacts caller experience.

Use Case Recommendations

| Use Case | Recommended Voice Style |
|----------|-------------------------|
| Customer Service | Warm, friendly (rime.spore) |
| Technical Support | Clear, professional (rime.marsh) |
| Sales | Energetic, persuasive (elevenlabs voices) |
| Healthcare | Calm, reassuring |
| Legal/Finance | Formal, authoritative |

TTS Engine Comparison

| Engine | Latency | Quality | Cost | Best For |
|--------|---------|---------|------|----------|
| Rime | Very fast | Good | Low | Production, low-latency needs |
| ElevenLabs | Medium | Excellent | Higher | Premium experiences, emotion |
| Google Cloud | Medium | Very good | Medium | Multilingual, SSML features |
| Amazon Polly | Fast | Good | Low | AWS integration, Neural voices |
| OpenAI | Medium | Excellent | Medium | Natural conversation style |
| Azure | Medium | Very good | Medium | Microsoft ecosystem |
| Deepgram | Fast | Good | Medium | Speech-focused applications |
| Cartesia | Fast | Good | Medium | Specialized voices |

Choosing an Engine

Prioritize latency (Rime, Polly, Deepgram):

  • Interactive conversations where quick response matters
  • High-volume production systems
  • Cost-sensitive deployments

Prioritize quality (ElevenLabs, OpenAI):

  • Premium customer experiences
  • Brand-sensitive applications
  • When voice quality directly impacts business outcomes

Prioritize features (Google Cloud, Azure):

  • Need SSML for fine-grained control
  • Complex multilingual requirements
  • Specific enterprise integrations
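
The three priorities above can be written down as a lookup table, which is handy when the choice of engine is driven by deployment configuration. This is a sketch under assumptions: shortlist_engines is a hypothetical helper, and the mapping simply restates the groupings above using the engine codes from the Available TTS Engines table.

```python
# Groupings from the guidance above, keyed by what the deployment prioritizes.
ENGINES_BY_PRIORITY = {
    "latency": ["rime", "amazon", "deepgram"],
    "quality": ["elevenlabs", "openai"],
    "features": ["gcloud", "azure"],
}


def shortlist_engines(priority):
    """Return candidate engine codes for a priority (hypothetical helper)."""
    try:
        # Return a copy so callers cannot modify the shared table.
        return list(ENGINES_BY_PRIORITY[priority])
    except KeyError:
        valid = ", ".join(sorted(ENGINES_BY_PRIORITY))
        raise ValueError(f"unknown priority {priority!r}; use one of: {valid}") from None


print(shortlist_engines("latency"))
```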

Testing and Evaluation Process

Before selecting a voice for production:

  1. Create test content with domain-specific terms, company names, and typical phrases
  2. Test multiple candidates from your shortlisted engines
  3. Evaluate each voice for pronunciation accuracy, natural pacing, emotional appropriateness, and handling of numbers, dates, and prices
  4. Test with real users if possible, such as internal team members or beta callers
  5. Measure latency in your deployment environment
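
For the latency step, a small timing harness is usually enough to compare candidates. The sketch below is generic and makes no assumption about any engine's API: measure_latency takes any callable that synthesizes a phrase, and fake_synthesize is a placeholder standing in for a real request to a TTS engine from your deployment environment.

```python
import statistics
import time


def measure_latency(synthesize, phrases, runs=3):
    """Time synthesize(phrase) for each phrase, `runs` times each."""
    samples = []
    for phrase in phrases:
        for _ in range(runs):
            start = time.perf_counter()
            synthesize(phrase)
            samples.append(time.perf_counter() - start)
    return {
        "samples": len(samples),
        "mean_s": statistics.mean(samples),
        "max_s": max(samples),
    }


def fake_synthesize(text):
    """Placeholder for a real TTS request; it only sleeps briefly."""
    time.sleep(0.01)


test_phrases = ["Your order total is $42.50", "Welcome to ACME support"]
stats = measure_latency(fake_synthesize, test_phrases)
print(stats)
```

Using phrases that contain prices, dates, and brand names lets the same run double as a pronunciation check.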

Dynamic Voice Selection

Change voice based on context:

    from signalwire import AgentBase

    class DynamicVoiceAgent(AgentBase):
        # Voice and persona name for each department
        DEPARTMENT_VOICES = {
            "support": {"voice": "rime.spore", "name": "Alex"},
            "sales": {"voice": "rime.marsh", "name": "Jordan"},
            "billing": {"voice": "rime.coral", "name": "Morgan"}
        }

        def __init__(self):
            super().__init__(name="dynamic-voice")

        def on_swml_request(self, request_data=None, callback_path=None, request=None):
            call_data = (request_data or {}).get("call", {})
            called_num = call_data.get("to", "")

            # Pick the department from the number that was dialed
            if "555-1000" in called_num:
                dept = "support"
            elif "555-2000" in called_num:
                dept = "sales"
            else:
                dept = "billing"

            config = self.DEPARTMENT_VOICES[dept]
            self.add_language("English", "en-US", config["voice"])
            self.prompt_add_section("Role", f"You are {config['name']}, a {dept} representative.")
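
The routing decision in the example above can be pulled out into a plain function so it can be tested without placing a call. This is a sketch of one way to do that: department_for_number is a hypothetical helper, and the number fragments are the same placeholder values used in the example.

```python
# Number fragments and the department each one maps to (placeholder values).
DEPARTMENT_BY_FRAGMENT = [
    ("555-1000", "support"),
    ("555-2000", "sales"),
]
DEFAULT_DEPARTMENT = "billing"


def department_for_number(called_num):
    """Return the department for a dialed number (hypothetical helper)."""
    for fragment, dept in DEPARTMENT_BY_FRAGMENT:
        if fragment in called_num:
            return dept
    # Anything unrecognized, including an empty string, goes to the default.
    return DEFAULT_DEPARTMENT


print(department_for_number("+1-800-555-1000"))
print(department_for_number("+1-800-555-9999"))
```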

Language Codes Reference

Supported language codes:

| Language | Codes |
|----------|-------|
| Multilingual | multi (English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, Dutch) |
| Arabic | ar, ar-AE, ar-SA, ar-QA, ar-KW, ar-SY, ar-LB, ar-PS, ar-JO, ar-EG, ar-SD, ar-TD, ar-MA, ar-DZ, ar-TN, ar-IQ, ar-IR |
| Belarusian | be |
| Bengali | bn |
| Bosnian | bs |
| Bulgarian | bg |
| Catalan | ca |
| Croatian | hr |
| Czech | cs |
| Danish | da, da-DK |
| Dutch | nl |
| English | en, en-US, en-AU, en-GB, en-IN, en-NZ |
| Estonian | et |
| Finnish | fi |
| Flemish | nl-BE |
| French | fr, fr-CA |
| German | de |
| German (Switzerland) | de-CH |
| Greek | el |
| Hebrew | he |
| Hindi | hi |
| Hungarian | hu |
| Indonesian | id |
| Italian | it |
| Japanese | ja |
| Kannada | kn |
| Korean | ko, ko-KR |
| Latvian | lv |
| Lithuanian | lt |
| Macedonian | mk |
| Malay | ms |
| Marathi | mr |
| Norwegian | no |
| Persian | fa |
| Polish | pl |
| Portuguese | pt, pt-BR, pt-PT |
| Romanian | ro |
| Russian | ru |
| Serbian | sr |
| Slovak | sk |
| Slovenian | sl |
| Spanish | es, es-419 |
| Swedish | sv, sv-SE |
| Tagalog | tl |
| Tamil | ta |
| Telugu | te |
| Turkish | tr |
| Ukrainian | uk |
| Urdu | ur |
| Vietnamese | vi |
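
Configuration can be checked against this table before it is deployed. The sketch below is illustrative only: resolve_language_code is a hypothetical helper, SUPPORTED_CODES is a short excerpt of the table rather than the full list, and falling back from a regional code to its base language is an assumption of this sketch. This page does not state how the platform treats regional codes that are absent from the table.

```python
# Excerpt of the table above; extend with the remaining rows as needed.
SUPPORTED_CODES = {
    "multi",
    "en", "en-US", "en-AU", "en-GB", "en-IN", "en-NZ",
    "es", "es-419",
    "fr", "fr-CA",
    "de", "de-CH",
    "pt", "pt-BR", "pt-PT",
}


def resolve_language_code(code, supported=SUPPORTED_CODES):
    """Return a listed code, falling back to the base language if needed."""
    if code in supported:
        return code
    # Assumption: the base language is the part before the first hyphen.
    base = code.split("-", 1)[0]
    if base in supported:
        return base
    raise ValueError(f"language code {code!r} is not in the supported set")


print(resolve_language_code("en-GB"))
print(resolve_language_code("fr-FR"))
```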

Complete Voice Configuration Example

    from signalwire import AgentBase

    class FullyConfiguredVoiceAgent(AgentBase):
        def __init__(self):
            super().__init__(name="voice-configured")

            # Primary language with all options
            self.add_language(
                name="English", code="en-US", voice="rime.spore",
                speech_fillers=["Um", "Well", "Let me see", "So"],
                function_fillers=[
                    "Let me look that up for you",
                    "One moment while I check",
                    "I'm searching for that now",
                    "Just a second"
                ]
            )

            # Secondary language
            self.add_language(
                name="Spanish", code="es-MX", voice="rime.luna",
                speech_fillers=["Pues", "Bueno"],
                function_fillers=["Un momento", "Dejame ver"]
            )

            # Pronunciation fixes
            self.set_pronunciations([
                {"replace": "ACME", "with": "Ack-me"},
                {"replace": "www", "with": "dub dub dub"},
                {"replace": ".com", "with": "dot com"},
                {"replace": "@", "with": "at"}
            ])

            self.prompt_add_section("Role", "You are a friendly customer service agent.")