Voices and languages

SignalWire integrates natively with leading third-party text-to-speech (TTS) providers. This guide describes supported engines, voices, and languages. Refer to each provider’s documentation for up-to-date model details and service information.

Browse and audition voices

Choose a provider to browse its full voice catalog. Press play to audition a voice, then use "Copy config" to copy the engine and voice values into your SWML or SDK code. Each provider's complete voice list lives on its reference page, linked in the Usage column of the provider table below.

Compare providers and models

SignalWire’s TTS providers offer a wide range of voice engines optimized for various applications. Select a provider, model, and voice according to the following considerations:

Language support: At time of writing, engine language support is as follows. Consult each provider’s reference documentation for the most up-to-date information.

  • Rime voices support English, Spanish, French, and German.
  • Deepgram voices support English, Spanish, German, French, Dutch, Italian, and Japanese.
  • Amazon Polly, Azure, Cartesia, and Google Cloud offer a wide range of supported languages.
  • Inworld voices support English plus Arabic, Chinese (Mandarin), Dutch, French, German, Hebrew, Hindi, Italian, Japanese, Korean, Polish, Portuguese, Russian, and Spanish.
  • MiniMax voices span more than 20 languages, including English, Spanish, Portuguese, French, German, Italian, Chinese, Japanese, and Korean, with automatic language detection.
  • All ElevenLabs and OpenAI voices are fully multilingual.
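
As a rough illustration, the enumerated lists above can be encoded as a lookup for shortlisting providers by language. This is a minimal sketch: `ENUMERATED`, `BROAD`, and `providers_for` are hypothetical names, the engine keys are taken from the sample voice ID strings in this guide, and the data only mirrors the lists at time of writing. Providers described as having broad or fully multilingual support are not enumerated, so they are returned separately for manual checking.

```python
# Hypothetical lookup mirroring the enumerated lists in this guide.
# The data may be out of date; each provider's documentation is authoritative.
ENUMERATED = {
    "rime": {"english", "spanish", "french", "german"},
    "deepgram": {
        "english", "spanish", "german", "french",
        "dutch", "italian", "japanese",
    },
    "inworld": {
        "english", "arabic", "chinese (mandarin)", "dutch", "french",
        "german", "hebrew", "hindi", "italian", "japanese", "korean",
        "polish", "portuguese", "russian", "spanish",
    },
}

# Providers this guide describes as broad or fully multilingual.
# Their language lists are not enumerated here, so verify them separately.
BROAD = ["amazon", "azure", "cartesia", "gcloud", "minimax", "elevenlabs", "openai"]


def providers_for(language):
    """Return (providers that list the language, providers to check manually)."""
    wanted = language.strip().lower()
    listed = sorted(name for name, langs in ENUMERATED.items() if wanted in langs)
    return listed, list(BROAD)


listed, check_manually = providers_for("Dutch")
print(listed)
```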

SSML support: Google Cloud and Amazon Polly support SSML (Speech Synthesis Markup Language) as a string wrapped in <speak> tags. Consult Google Cloud’s SSML docs for details. Refer to the Amazon Polly docs for more information on using SSML and supported SSML tags.
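
Because the SSML payload is an XML string, plain text should be escaped before it is wrapped in <speak> tags. The following is a minimal sketch of that step; `to_ssml` and its `pause_ms` parameter are hypothetical helpers, not part of any SignalWire SDK. The optional `<break>` element is a standard SSML tag, but confirm the tags each provider accepts in its own documentation.

```python
from xml.sax.saxutils import escape


def to_ssml(text, pause_ms=None):
    """Wrap plain text in <speak> tags, escaping XML special characters."""
    # escape() converts &, <, and > so the resulting markup stays well-formed.
    body = escape(text)
    if pause_ms is not None:
        # Append a trailing pause, expressed in milliseconds.
        body += f'<break time="{int(pause_ms)}ms"/>'
    return f"<speak>{body}</speak>"


print(to_ssml("Terms & conditions apply.", pause_ms=500))
```

The resulting string would then be passed wherever your SWML or SDK code supplies the text to speak with a Google Cloud or Amazon Polly voice.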

Use voice identifier strings

Compose voice identifier strings using the following general format:

engine.voice:model
| Identifier | Requirement | Description |
|---|---|---|
| engine | required | The TTS provider (e.g., elevenlabs, rime, openai) |
| voice | required | The voice identifier (name or ID depending on engine) |
| model | optional | Model variant (not all engines support this) |
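
The format above can be sketched as a small parser and formatter. This is an illustration only: `VoiceId`, `parse_voice_id`, and `format_voice_id` are hypothetical names, and the splitting rule (engine ends at the first ".", model starts after the first ":" that follows) is an assumption inferred from the sample strings in this guide, not SignalWire's documented parsing behavior.

```python
from typing import NamedTuple, Optional


class VoiceId(NamedTuple):
    engine: str
    voice: str
    model: Optional[str] = None


def parse_voice_id(value: str) -> VoiceId:
    """Split 'engine.voice:model' into its parts (model is optional)."""
    # Assumption: the engine name never contains a dot, so split on the first one.
    engine, dot, rest = value.partition(".")
    if not dot or not engine or not rest:
        raise ValueError(f"expected 'engine.voice[:model]', got {value!r}")
    # Model names may contain dots (e.g. version numbers), so split on ':' only.
    voice, colon, model = rest.partition(":")
    if not voice or (colon and not model):
        raise ValueError(f"expected 'engine.voice[:model]', got {value!r}")
    return VoiceId(engine, voice, model or None)


def format_voice_id(parts: VoiceId) -> str:
    """Compose a voice identifier string, omitting the model when absent."""
    base = f"{parts.engine}.{parts.voice}"
    return f"{base}:{parts.model}" if parts.model else base


print(parse_voice_id("rime.luna:arcana"))
print(format_voice_id(VoiceId("openai", "alloy")))
```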

Voice ID strings are case-insensitive, so the following strings are equivalent:

gcloud.en-US-Neural2-A
gcloud.en-us-neural2-a
GCLOUD.EN-US-NEURAL2-A
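
The platform handles this matching itself, but case-insensitivity also matters on the client side, for example when deduplicating voice IDs or using them as cache keys. A minimal sketch, using hypothetical helper names `canonical_voice_id` and `same_voice_id`:

```python
def canonical_voice_id(value: str) -> str:
    """Return a case-folded form suitable for comparisons and dictionary keys."""
    return value.strip().casefold()


def same_voice_id(a: str, b: str) -> bool:
    """Compare two voice ID strings without regard to letter case."""
    return canonical_voice_id(a) == canonical_voice_id(b)


variants = [
    "gcloud.en-US-Neural2-A",
    "gcloud.en-us-neural2-a",
    "GCLOUD.EN-US-NEURAL2-A",
]
# All three variants collapse to a single canonical key.
unique = {canonical_voice_id(v) for v in variants}
print(len(unique))
```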

For detailed instructions for each provider, consult the voice ID references linked in the Usage column of the table below.

| TTS provider | Sample voice ID string | Usage |
|---|---|---|
| Amazon Polly | amazon.Joanna-Neural | Reference |
| Azure | azure.en-US-AvaNeural | Reference |
| Cartesia | cartesia.a167e0f3-df7e-4d52-a9c3-f949145efdab | Reference |
| Deepgram | deepgram.aura-asteria-en | Reference |
| ElevenLabs | elevenlabs.thomas | Reference |
| Google Cloud | gcloud.en-US-Casual-K | Reference |
| Inworld | inworld.Lauren:inworld-tts-1.5-mini | Reference |
| MiniMax | minimax.English_CalmWoman:speech-2.6-turbo | Reference |
| OpenAI | openai.alloy | Reference |
| Rime | rime.luna:arcana | Reference |


Pricing

See the Voice API Pricing page for up-to-date pricing information.