Multi-vendor voice AI creates failure modes that do not appear in development. One stack eliminates the vendor chain that breaks under load.
Your telephony provider has per-account concurrent call limits. Your speech-to-text provider has request rate limits. Your language model has token rate limits. Each scales differently. When one throttles, the entire pipeline degrades.
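The effective capacity of a serial vendor chain is the tightest single quota, not the sum. A minimal sketch with made-up limits (none of these names or numbers are any vendor's real quota):

```python
# Hypothetical per-vendor limits, expressed as concurrent calls each stage
# can sustain. Purely illustrative numbers.
vendor_limits = {
    "telephony": 50,  # concurrent call cap
    "stt": 30,        # streaming sessions before rate limiting
    "llm": 20,        # concurrent requests within the token budget
    "tts": 40,        # simultaneous synthesis streams
}

def pipeline_capacity(limits: dict) -> int:
    """A serial pipeline carries only as many calls as its tightest vendor."""
    return min(limits.values())

print(pipeline_capacity(vendor_limits))  # 20 — the LLM cap bounds everything
```

Scaling any single vendor above that minimum buys nothing; the pipeline still throttles at the weakest link.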
Two webhook deliveries for the same call arrive at your server concurrently. Both read the current state. Both mutate it. The last write wins, and one update is silently lost. This never happens with a single concurrent call in development.
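The lost update can be sketched with two Python threads standing in for concurrent webhook handlers. The barrier is illustrative only, there to force deterministically the overlap that a loaded server produces on timing alone:

```python
import threading

call_state = {"events": 0}
barrier = threading.Barrier(2)

def webhook_handler():
    current = call_state["events"]       # both handlers read 0
    barrier.wait()                       # widen the race window deliberately
    call_state["events"] = current + 1   # last write wins: one update is lost

threads = [threading.Thread(target=webhook_handler) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(call_state["events"])  # 1, not 2 — one webhook's update vanished
```

The same read-modify-write race occurs against an external database or cache; a lock fixes this toy example, but distributed webhook delivery gives you no single place to put the lock.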
Speech-to-text delays push language model calls later, which push text-to-speech later, which overlap with the next conversation turn. Each hop compounds delay. Six vendors means six potential bottlenecks.
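Turn latency is the serial sum of every processing stage plus every network hop between vendors. With assumed, purely illustrative per-stage numbers in milliseconds:

```python
# Illustrative latencies in ms (assumed, not measurements of any vendor).
stages = {
    "telephony->stt": 80,   # network hop
    "stt": 150,             # transcription
    "stt->llm": 60,         # network hop
    "llm": 400,             # generation
    "llm->tts": 60,         # network hop
    "tts": 120,             # synthesis
}

total = sum(stages.values())
network_only = stages["telephony->stt"] + stages["stt->llm"] + stages["llm->tts"]
print(total, network_only)  # 870 200 — hops alone add ~23% to the turn
```

A single stack removes the inter-vendor hops; the processing stages remain, but the purely network-borne share of the delay goes to zero.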
The caller interrupts during AI generation. Your app receives the full response and sends it to text-to-speech. The caller hears an answer to a question they already corrected.
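In-process generation can be cancelled the instant barge-in is detected, with no round-trip to an external vendor. A minimal asyncio sketch (the sleeps stand in for generation time and the caller's interruption; nothing here is the platform's actual API):

```python
import asyncio

async def generate_response() -> str:
    # Stand-in for a streaming LLM call; the sleep represents generation time.
    await asyncio.sleep(1.0)
    return "answer to the original question"

async def handle_turn() -> str:
    task = asyncio.create_task(generate_response())
    await asyncio.sleep(0.1)  # caller barges in mid-generation
    task.cancel()             # in-process cancel: zero network round-trip
    try:
        await task
    except asyncio.CancelledError:
        return "generation cancelled before TTS"
    return "stale response sent to TTS"

print(asyncio.run(handle_turn()))  # generation cancelled before TTS
```

When the cancellation must instead travel over the network to a separate LLM vendor, the response often completes before the cancel arrives, and the stale answer reaches text-to-speech anyway.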
```python
from signalwire_agents import AgentBase
from signalwire_agents.core.function_result import SwaigFunctionResult


class SupportAgent(AgentBase):
    def __init__(self):
        super().__init__(name="Support Agent", route="/support")
        self.prompt_add_section(
            "Instructions",
            body="You are a customer support agent. "
                 "Greet the caller and resolve their issue.",
        )
        self.add_language("English", "en-US", "rime.spore:mistv2")

    @AgentBase.tool(name="check_order")
    def check_order(self, order_id: str):
        """Check the status of a customer order.

        Args:
            order_id: The order ID to look up
        """
        return SwaigFunctionResult(f"Order {order_id}: shipped, ETA April 2nd")


agent = SupportAgent()
agent.run()
```

| Failure Mode | Multi-Vendor Cause | Single-Stack Resolution |
|---|---|---|
| Scaling mismatch | Each vendor throttles at different thresholds | Uniform scaling, no cross-vendor bottlenecks |
| Race conditions | Distributed state across vendors | Sequential state processing within the platform |
| Out-of-order events | Network-dependent webhook delivery | In-process event ordering |
| Zombie calls | State desync between systems | Platform owns the full call lifecycle |
| Stale context | Network-delayed cancellation after interruption | In-process cancellation, zero round-trip |
| Cascading latency | Each network hop compounds delay | Zero internal network hops between components |

| Error Type | Fatal? | Platform Response |
|---|---|---|
| STT failure | No | Recovery phrase asks caller to repeat |
| LLM timeout | No | Retry with fallback model |
| TTS failure | No | Fallback voice or text-based response |
| Tool handler timeout | No | Inform caller, retry or skip |
| Tool handler error | No | Error-specific recovery phrase |
| Network partition | Yes | Graceful hangup with state capture |
| Authentication failure | Yes | Redirect to appropriate flow |
Define the agent in Python or YAML. Test locally with real phone calls.
Ship the same code to production. The architecture does not change with scale.
The system that handles one call handles ten thousand calls the same way. No vendor chain to coordinate.
Per-component latency, barge-in analytics, and error classification are built in. No third-party APM required.
In development, you test one call at a time on a fast network. In production, multiple concurrent calls hit vendors with different scaling limits, different failure modes, and different SLAs. The failure modes are structural, not bugs in your code.
State lives inside the platform. The media engine holds it. Events are processed sequentially within each call's context. There are no concurrent webhook deliveries and no external state stores to race against.
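Sequential per-call processing can be sketched as one worker draining one queue per call. This is an illustrative asyncio model, not the media engine's actual internals:

```python
import asyncio

async def call_worker(events: asyncio.Queue, log: list) -> None:
    # One worker per call drains that call's queue in order; events for the
    # same call can never interleave, so handlers never race on its state.
    while True:
        event = await events.get()
        if event is None:
            break
        log.append(event)  # mutate this call's state, one event at a time

async def main() -> list:
    queue, log = asyncio.Queue(), []
    worker = asyncio.create_task(call_worker(queue, log))
    for event in ["call.started", "speech.detected", "speech.ended", None]:
        await queue.put(event)
    await worker
    return log

print(asyncio.run(main()))  # events processed in the order they were enqueued
```

Contrast with webhooks: each delivery is an independent HTTP request with no ordering guarantee, so the serialization the queue provides for free must be rebuilt by hand on your server.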
The platform classifies the error into one of 10 types. Non-fatal errors trigger automatic recovery (retry, fallback model, recovery phrase). Fatal errors execute a graceful shutdown and capture final state.
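The fatal/non-fatal split described above can be sketched as a simple dispatch table. The type names and recovery strings below are illustrative stand-ins, not the platform's actual classification:

```python
# Hypothetical error taxonomy mirroring the table above; illustrative only.
FATAL = {"network_partition", "authentication_failure"}

RECOVERY = {
    "stt_failure": "play recovery phrase, ask caller to repeat",
    "llm_timeout": "retry with fallback model",
    "tts_failure": "switch to fallback voice",
}

def handle_error(error_type: str) -> str:
    """Fatal errors end the call cleanly; everything else recovers in place."""
    if error_type in FATAL:
        return "graceful hangup, capture final state"
    return RECOVERY.get(error_type, "generic recovery phrase")

print(handle_error("llm_timeout"))        # retry with fallback model
print(handle_error("network_partition"))  # graceful hangup, capture final state
```

The design point is that recovery policy lives in one place; in a multi-vendor stack, each vendor surfaces failures in its own format and your application must normalize them before any policy can apply.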
Yes. Both the Python agent framework and the declarative YAML format can wrap existing conversation logic. Your tool handlers receive authoritative call context from the platform.
The infrastructure processes 2.7 billion minutes annually across 2,000+ companies. Built by the team behind FreeSWITCH, the open-source telephony engine powering carrier-grade deployments worldwide.
Build on a single stack where production behaves the same as development. No vendor chain. No surprises under load.