Voices and languages


SignalWire integrates natively with leading third-party text-to-speech (TTS) providers. This guide describes supported engines, voices, and languages. Refer to each provider’s documentation for up-to-date model details and service information.

Browse and audition voices

Choose a provider to browse and audition its full voice catalog. Press play to audition a voice, and use copy config to grab the engine and voice values for your SWML or SDK code. Each provider’s complete voice list lives on its reference page, linked in the table below.

Compare providers and models

SignalWire’s TTS providers offer a wide range of voice engines optimized for various applications. Select a provider, model, and voice according to the following considerations:

Language support: At the time of writing, engine language support is as follows. Consult each provider’s reference documentation for the most up-to-date information.

Rime voices support English, Spanish, French, and German.
Deepgram voices support English, Spanish, German, French, Dutch, Italian, and Japanese.
Amazon Polly, Azure, Cartesia, and Google Cloud offer a wide range of supported languages.
Inworld voices support English plus Arabic, Chinese (Mandarin), Dutch, French, German, Hebrew, Hindi, Italian, Japanese, Korean, Polish, Portuguese, Russian, and Spanish.
MiniMax voices span more than 20 languages, including English, Spanish, Portuguese, French, German, Italian, Chinese, Japanese, and Korean, with automatic language detection.
All ElevenLabs and OpenAI voices are fully multilingual.

SSML support: Google Cloud and Amazon Polly accept SSML (Speech Synthesis Markup Language) as a string wrapped in <speak> tags. For usage details and the SSML tags each provider supports, consult Google Cloud’s SSML docs and the Amazon Polly docs.
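As a rough illustration of the "string wrapped in <speak> tags" requirement, the following Python sketch builds such a string from plain text. The helper name `wrap_ssml` and its `pause_ms` parameter are hypothetical and not part of any SignalWire SDK. The optional `<break>` element is a standard SSML tag, but check the provider docs before relying on any particular tag.

```python
from xml.sax.saxutils import escape


def wrap_ssml(text: str, pause_ms: int = 0) -> str:
    """Wrap plain text in <speak> tags (hypothetical helper).

    XML special characters (&, <, >) in the text are escaped so that
    the result stays well-formed SSML.
    """
    body = escape(text)
    if pause_ms > 0:
        # Append a trailing pause using the standard SSML <break> element.
        body += f'<break time="{pause_ms}ms"/>'
    return f"<speak>{body}</speak>"


if __name__ == "__main__":
    print(wrap_ssml("Thanks for calling Tom & Jerry's.", pause_ms=300))
```

Escaping matters because a bare `&` or `<` in the spoken text would otherwise produce invalid markup.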

Use voice identifier strings

Compose voice identifier strings using the following general format:

engine.voice:model

Identifier	Required	Description
`engine`	Yes	The TTS provider (e.g., `elevenlabs`, `rime`, `openai`)
`voice`	Yes	The voice identifier (name or ID depending on engine)
`model`	No	Model variant (not all engines support this)
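The `engine.voice:model` format can be composed and split programmatically. The sketch below is a minimal Python example, not SignalWire code: `VoiceId`, `format_voice_id`, and `parse_voice_id` are hypothetical names. It assumes the engine name never contains a period and the voice never contains a colon, which holds for the sample strings in this guide but is not guaranteed by it.

```python
from typing import NamedTuple, Optional


class VoiceId(NamedTuple):
    """Parts of a voice identifier string (hypothetical container)."""
    engine: str
    voice: str
    model: Optional[str] = None


def format_voice_id(engine: str, voice: str, model: Optional[str] = None) -> str:
    """Compose 'engine.voice' or 'engine.voice:model'."""
    base = f"{engine}.{voice}"
    return f"{base}:{model}" if model else base


def parse_voice_id(value: str) -> VoiceId:
    """Split an identifier into engine, voice, and optional model.

    Assumption: the first '.' ends the engine name and the first ':'
    after it starts the model, so a model such as 'speech-2.6-turbo'
    may itself contain periods.
    """
    engine, dot, rest = value.partition(".")
    voice, _, model = rest.partition(":")
    if not dot or not engine or not voice:
        raise ValueError(f"expected 'engine.voice[:model]', got {value!r}")
    return VoiceId(engine, voice, model or None)


if __name__ == "__main__":
    print(parse_voice_id("rime.luna:arcana"))
    print(format_voice_id("openai", "alloy"))
```

Splitting on the first period only keeps model names with version numbers intact.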

Because voice ID strings are case-insensitive, the following strings are equivalent:

gcloud.en-US-Neural2-A
gcloud.en-us-neural2-a
GCLOUD.EN-US-NEURAL2-A
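If your own code stores or compares voice ID strings (for example, as dictionary keys or in a configuration check), it may help to normalize case first so that equivalent strings match. The helper `same_voice` below is a hypothetical Python sketch of that idea, not a platform API.

```python
def same_voice(a: str, b: str) -> bool:
    """Return True when two voice ID strings differ only by letter case."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    variants = [
        "gcloud.en-US-Neural2-A",
        "gcloud.en-us-neural2-a",
        "GCLOUD.EN-US-NEURAL2-A",
    ]
    # Every variant should match the first one.
    print(all(same_voice(variants[0], v) for v in variants))
```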

For provider-specific instructions, consult the voice ID reference linked in the Usage column of the table below.

TTS provider	Sample voice ID string	Usage
Amazon Polly	`amazon.Joanna-Neural`	Reference
Azure	`azure.en-US-AvaNeural`	Reference
Cartesia	`cartesia.a167e0f3-df7e-4d52-a9c3-f949145efdab`	Reference
Deepgram	`deepgram.aura-asteria-en`	Reference
ElevenLabs	`elevenlabs.thomas`	Reference
Google Cloud	`gcloud.en-US-Casual-K`	Reference
Inworld	`inworld.Lauren:inworld-tts-1.5-mini`	Reference
MiniMax	`minimax.English_CalmWoman:speech-2.6-turbo`	Reference
OpenAI	`openai.alloy`	Reference
Rime	`rime.luna:arcana`	Reference

Pricing

See the Voice API Pricing page for up-to-date pricing information.