Create a Real Time Voice Translation App with SWML | SignalWire

Translate anyone on a call in real time

Bringing the babel fish to life

Real-time voice translation has lived in the realm of science fiction for decades. The babel fish from The Hitchhiker’s Guide to the Galaxy was our inspiration: a small yellow fish that could instantly translate any spoken language by being placed in someone’s ear. This allowed any person to hear any spoken language as their own native language, making universal communication possible.

SignalWire’s live_translate isn't science fiction. It's a production-ready technology that developers can use right now to build real-time translation into phone calls, IVRs, or contact centers, powered by a modern telecom stack that integrates AI directly into the media layer.

What is live_translate?

live_translate enables real-time voice translation for live phone calls. It’s a feature of SignalWire Markup Language (SWML), a YAML- or JSON-based markup language for defining many types of workflows within SignalWire, including AI agents. The live_translate method allows developers to:

  • Translate a live conversation between two parties in real time

  • Specify translation direction from the local caller, the remote caller, or both

  • Inject translated messages into the call dynamically

  • Generate live AI summaries of what was said

  • Start and stop translation on demand

This is part of SignalWire’s Call Fabric infrastructure, where AI is embedded directly in the telecom stack. That means low latency, integrated audio processing, and precise control through SWML scripting.


Few usable call translation systems exist today; SignalWire gives you the tools to build your own.

In many modern AI communication systems, media is streamed out through WebSockets, processed in the cloud, then returned with high latency. This causes delays, fragmented context, and poor conversational flow, which is especially painful in customer support.

SignalWire eliminates these flaws by handling everything within the media stack itself. The result is sub-500ms round-trip latency with no external infrastructure required.

This approach is fast, reliable, and scalable. It also means developers can focus on building applications instead of piecing together various services.

How to build real-time translation with SWML

The following code samples show how live_translate is used inside a SWML script.

Start a live translation session

The SWML script for this step answers an incoming call and starts recording it. It then starts a live translation session with the start action, translating the conversation from English to Spanish. Once the translation session has started, the script connects the call to a destination number.

This script uses the ElevenLabs voice rachel. You can use any of SignalWire’s high-quality voices from leading text-to-speech providers.
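The steps described above can be sketched in YAML roughly as follows. This is a minimal sketch, not a definitive script: the webhook URL and phone numbers are placeholders, the language codes en-US and es-ES are assumed, and the parameter names (from_lang, to_lang, from_voice, to_voice, direction) reflect our reading of the SWML reference, so verify them against the current documentation before use.

```yaml
# Illustrative sketch only: URLs, numbers, and language codes are placeholders.
version: 1.0.0
sections:
  main:
    # Answer the incoming call and start recording it.
    - answer: {}
    - record_call:
        format: wav
        stereo: true
    # Start a live translation session from English to Spanish.
    - live_translate:
        action:
          start:
            webhook: "https://example.com/webhook"  # placeholder endpoint
            from_lang: en-US                        # assumed language code
            to_lang: es-ES                          # assumed language code
            from_voice: elevenlabs.rachel
            to_voice: elevenlabs.rachel
            direction:                              # translate both parties
              - local-caller
              - remote-caller
    # Connect the call to a destination number (placeholder numbers).
    - connect:
        from: "+15555550100"
        to: "+15555550101"
```

Listing both local-caller and remote-caller under direction asks for translation in both directions; listing only one would limit translation to that party.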


Stop the translation mid-call

After the call is connected, the script plays a message telling the caller that the translation session is ending.

The script then stops the live translation by passing the stop action. Here, translation starts during the call and stops after the message plays, which is a useful pattern for support scenarios or time-limited translation windows.
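A minimal sketch of this stop pattern, under the same assumptions as any SWML example here (placeholder webhook and phone numbers, assumed language codes, and parameter names that should be checked against the SWML reference), might look like this:

```yaml
# Illustrative sketch only: URLs, numbers, and the spoken message are placeholders.
version: 1.0.0
sections:
  main:
    - answer: {}
    # Start translating so there is a session to stop later.
    - live_translate:
        action:
          start:
            webhook: "https://example.com/webhook"  # placeholder endpoint
            from_lang: en-US                        # assumed language code
            to_lang: es-ES                          # assumed language code
            direction:
              - local-caller
              - remote-caller
    - connect:
        from: "+15555550100"
        to: "+15555550101"
    # Tell the caller that translation is about to end.
    - play:
        url: "say:The translation session is ending."
    # Stop the live translation session.
    - live_translate:
        action: stop
```

Note that stop is passed as a plain string value for action rather than as an object with parameters.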


Summarize the conversation

The script initiates a summarize action to recap the conversation. The action sends the conversation summary to a webhook, and its prompt parameter tells the AI how to summarize the conversation.
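As a rough sketch, a summarize step could be written as below. The webhook URL, phone numbers, and prompt text are hypothetical, and the surrounding steps are assumed for context only; confirm parameter names against the SWML reference.

```yaml
# Illustrative sketch only: URLs, numbers, and prompt wording are placeholders.
version: 1.0.0
sections:
  main:
    - answer: {}
    - live_translate:
        action:
          start:
            webhook: "https://example.com/webhook"  # placeholder endpoint
            from_lang: en-US                        # assumed language code
            to_lang: es-ES                          # assumed language code
            direction:
              - local-caller
              - remote-caller
    - connect:
        from: "+15555550100"
        to: "+15555550101"
    # Ask the AI for a recap and send it to the webhook.
    - live_translate:
        action:
          summarize:
            webhook: "https://example.com/summary"  # placeholder endpoint
            prompt: "Summarize the key points of this conversation."
```

The prompt is free-form text, so it can be adjusted to request a particular format, such as action items or a one-paragraph recap.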


Inject a message into the call

Once the call is answered and recording has started, the script connects the call to a destination number. After the call is connected, the script injects a custom message into the conversation for the remote caller to hear.
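The flow described above might be sketched as follows. The phone numbers and message text are placeholders, and the message and direction parameter names are assumptions based on our reading of the SWML reference. The sketch follows the steps exactly as described and does not show a start action.

```yaml
# Illustrative sketch only: numbers and message text are placeholders.
version: 1.0.0
sections:
  main:
    # Answer the call and start recording.
    - answer: {}
    - record_call:
        format: wav
        stereo: true
    # Connect to the destination number (placeholder numbers).
    - connect:
        from: "+15555550100"
        to: "+15555550101"
    # Inject a custom message for the remote caller to hear.
    - live_translate:
        action:
          inject:
            message: "This is an injected message."
            direction: remote-caller
```

Setting direction to local-caller instead would target the other party on the call.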


The inject action could be used for notifications, automated instructions, or even alerts during a live call.

Built for developers, from the creators of FreeSWITCH

Everything described here runs inside SignalWire’s Programmable Unified Communications (PUC) architecture, not on separate cloud services or third-party APIs. This means that:

  • The media never leaves the SignalWire stack

  • Translation, speech synthesis, and transcription are all orchestrated via SWML, using declarative scripting

  • You control when, how, and what gets translated or recorded

  • You can mix AI Agents, Rooms, or call flows all from the same framework

Translation is just one feature of what can be built using SWML. It can be combined with digital agents, AI summarization, and real-time routing logic to build fully automated, intelligent voice experiences across any telecom channel: PSTN, SIP, WebRTC, or mobile apps.

Use cases

Developers can use live_translate to power:

  • Multilingual customer service

  • In-call live interpretation

  • Medical support across language barriers

  • Live translation for global conferencing

  • Global sales and intake

Because translation is programmable, the same script can easily adapt to different voices, languages, and contexts with just a few variable changes.

Start building today and connect customers globally with live translation

With SignalWire’s live_translate, the idea of real-time, multilingual phone conversations is no longer a dream for the future. It's available right now in a few lines of SWML.

This is not just a new feature. It’s part of a broader shift in telecom, where voice, AI, and automation converge into one programmable, developer-first stack.

It’s never been easier to build real-time translation into your apps and contact centers. Sign up today for a SignalWire space and join the conversation on our community Discord.
