SignalWire just made multi-agent voice apps as simple to build as web apps.
The new SignalWire Agents SDK gives you the tools, design patterns, and agent templates to launch multi-agent voice apps and scale them in a fraction of the time—without getting bogged down in SIP, transfers that don't work, arcane telephony issues, and complex DevOps.
If you want to replace an outdated IVR with AI, create your own agent building studio, or build an AI-operated contact center from scratch, this is a great place to start.
We know there's a ton of noise in the "multi-agent" and "voice AI" space, but this SDK is pure signal. Run through these steps to find out for yourself:
1. Install the SDK in your agentic coding tool of choice (Cursor, Windsurf, Claude Code, or Codex)
2. Point your coding agent to the "Fred" tutorial that's packaged with the SDK
3. Prompt your coding agent of choice to build your first agent
4. Connect your local server to SignalWire's infrastructure with ngrok
5. Test the conversation flow from the CLI
6. Buy a phone number and point it to your agent, or route calls to it from any PBX with SIP
That's it: you’ve prototyped your first agent.
Why we built an Agents SDK
In 2025, if you're looking for voice agents that can handle basic tasks—simple lead qualification, booking appointments, debt collection, or Q&A with RAG—there are plenty of available solutions.
Need something more advanced?
Multi-agent orchestration for triage or automating complex phone calls
Multi-tenant capabilities for white-labeled agent builders
Personalization or A/B testing frameworks
Omni-channel agents that maintain context across voice, chat, video, and text messaging channels
Reliable transfers to other AI agents or humans with context intact
Easy integrations with contact center infrastructure
Advanced testing and debugging tools with detailed latency metrics and SIP errors
You start running into walls.
Not anymore.
Tentpole concept: Voice agents as microservices
Each agent is a web app with HTTP endpoints, an AI persona with access to specific tools, a communication endpoint that handles voice, video, and messaging, and a scalable service ready for production deployment.
This design pattern means you're not just scripting behaviors with step-by-step prompts or visual workflow builders.
You're building teams of real-time voice agents that can be deployed, scaled, and managed like any modern microservice.
If you can build a web app, you can build a multi-agent voice app
You can connect your agents to phone numbers and integrate them with contact centers and business phone systems with SIP.
The parallels to web development are intentional:
Routes: Agents have URL endpoints like web apps (YOURDOMAIN/sales-agent)
Request/Response: Uses standard HTTP patterns
Agent Components: Create custom skills, prompt sections, and reusable agents. Organize them the way you'd organize UI components and API integrations in a modern web app
Serverless Deployments: Instant serverless deployment across Lambda, Google Cloud Functions, and Azure
Dynamic Configurations: Configure agents at runtime using parameters from the HTTP request. Build personalized agents, multi-tenant agent studios, and A/B testing frameworks (/sales-agent?persona=vip&test=long_prompt)
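To make the web-development parallel concrete, here's a stdlib-only sketch (not the SDK's actual API; the parameter names are invented) of how one route plus query parameters can yield a per-request agent configuration:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical prompt variants for an A/B test; not part of the SDK.
BASE_PROMPTS = {
    "default": "You are a helpful sales agent.",
    "long_prompt_test": "You are a consultative sales agent. Ask discovery questions before pitching.",
}

def configure_agent(url: str) -> dict:
    """Build a per-request agent config from the route and query parameters."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    persona = params.get("persona", ["default"])[0]
    variant = params.get("test", ["default"])[0]
    return {
        "route": parsed.path,
        "persona": persona,
        "prompt": BASE_PROMPTS.get(variant, BASE_PROMPTS["default"]),
    }

config = configure_agent("/sales-agent?persona=vip&test=long_prompt_test")
```

The same pattern scales to multi-tenant studios: the tenant ID becomes just another request parameter the agent is configured from.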
No telecom expertise or SIP middleware required
With other platforms and DIY implementations, you'll need telephony expertise to navigate issues with SIP, call transfers, PBX integrations, arcane voice codecs, and WebRTC-to-SIP translations.
But why should you need a decade of telecommunications development experience or telephony middleware to build voice agents that reliably transfer calls to SIP endpoints?
The SignalWire Agents SDK abstracts away all of this complexity. This is all you need to set up an agent that greets incoming callers:
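With the SDK, a few lines of agent code generate the response document for you. As a rough illustration of what's happening underneath, here's a hand-built, SWML-style JSON response a greeter endpoint might return (field names simplified for illustration, not the exact schema):

```python
import json

# Hand-built for illustration only: the SDK produces this document for you.
def greeter_response() -> str:
    doc = {
        "version": "1.0.0",
        "sections": {
            "main": [
                {
                    "ai": {
                        "prompt": {
                            "text": (
                                "You are a friendly receptionist. "
                                "Greet the caller and ask how you can help."
                            )
                        }
                    }
                }
            ]
        },
    }
    return json.dumps(doc)
```

Because the response is just JSON over HTTP, any web framework (or serverless function) can serve it.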
Simply add a skill to set up transfers to phone numbers or SIP addresses:
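As a sketch of the pattern (the skill name and parameters below are invented stand-ins, not the SDK's real schema), adding transfers is a single registration call:

```python
# Minimal model of a skill registry; the SDK's actual API differs.
class Agent:
    def __init__(self):
        self.skills = {}

    def add_skill(self, name, params=None):
        """Register a named skill with optional configuration."""
        self.skills[name] = params or {}

agent = Agent()
# The one-liner: a transfer skill mapping departments to SIP URIs or numbers.
agent.add_skill("transfer", {
    "destinations": {
        "sales": "sip:sales@example.sip.signalwire.com",
        "support": "+15550100",
    },
})
```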
No complex SIP configurations to wrestle with, no codecs to negotiate, no DTMF handling to debug, and no low-level telephony protocols to learn (or hire a team to figure out).
It just works.
The SDK also includes an array of advanced features we think you’ll love.
Composable Skills: One line tools
The SDK includes a framework for building a library of custom tools for your voice agents. We call these tools Skills.
Once you've defined a skill, adding its capabilities to any agent requires one line of code:
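The pattern looks roughly like this; the registry and class names below are an illustrative model, not the SDK's actual API:

```python
from datetime import datetime, timezone

# Toy skill registry: a skill bundles a capability, and an agent
# gains it with a single add_skill() call.
SKILL_REGISTRY = {}

def skill(name):
    def register(cls):
        SKILL_REGISTRY[name] = cls
        return cls
    return register

@skill("datetime")
class DateTimeSkill:
    def run(self) -> str:
        """Return the current UTC time as an ISO 8601 string."""
        return datetime.now(timezone.utc).isoformat()

class Agent:
    def __init__(self):
        self.tools = {}

    def add_skill(self, name):
        self.tools[name] = SKILL_REGISTRY[name]()

agent = Agent()
agent.add_skill("datetime")  # the one-liner
```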
We've included a few skills to get you started. Show them to your favorite coding agent if you want to build your own:
Web search + scrape. Combines Google Custom Search and BeautifulSoup to query Google and scrape results.
Datetime. Handles timezone conversions and natural language date parsing.
Math. Safely evaluates expressions.
Native vector search. Provides offline document search with a local RAG database.
And yes, you can add your favorite MCPs (and make them more secure) with the MCP Gateway.
Composable prompts: The Prompt Object Model (POM)
Traditional AI prompting relies on large text blocks that become increasingly unwieldy as prompts evolve and become more complex.
Consider a support agent that needs personality traits, technical knowledge, compliance guidelines, department routing rules, instructions on how to use tools, and error handling procedures.
In a text-based approach, these elements blend into an amorphous blob, loosely structured at best with Markdown or XML tags.
Want to test a different instruction set or change personality traits? Dig into a long text string, find the section you want to change, edit, and run your test.
POM introduces structure and composability through a section-based architecture that mirrors how humans organize complex instructions and web developers organize CSS styles.
Instead of wrestling with one massive prompt string, POM turns prompt sections into organized, maintainable components:
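A minimal model of the idea (section names are examples; the SDK's POM API differs in its details):

```python
# Toy Prompt Object Model: named sections you can render, swap, or test
# independently instead of editing one giant string.
class Prompt:
    def __init__(self):
        self.sections = {}  # insertion-ordered in Python 3.7+

    def add_section(self, title, body):
        self.sections[title] = body

    def render(self) -> str:
        """Assemble the full prompt from its sections."""
        return "\n\n".join(f"## {t}\n{b}" for t, b in self.sections.items())

prompt = Prompt()
prompt.add_section("Personality", "Friendly, concise, and patient.")
prompt.add_section("Compliance", "Never quote prices; route billing questions to a human.")

# A/B test a persona by swapping one section, leaving the rest untouched.
prompt.sections["Personality"] = "Formal and to the point."
```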
Need to update compliance guidelines? Change one section and unit test its impact. Want to A/B test different personalization prompts for 3 different personas? Swap sections dynamically. Building industry-specific variations of an advanced sales agent? Layer sections like CSS classes.
Contexts and steps: Dynamic conversation flows
To channel the unpredictability of LLMs into a more predictable, deterministic structure and make conversation design accessible without code, many players in the voice agent space have built visual conversation designers. These rigid workflows may make sense for 2025's low-latency models, and a lot less sense when models become intelligent and dynamic enough to solve problems on their own.
The SDK's Contexts and Steps framework provides structured, step-by-step conversation design patterns for today's LLMs. But unlike node-based visual workflows, its all-code approach offers a much more straightforward upgrade path when models evolve.
But contexts aren't just workflow containers. They're mini-agents within your agent. Switch contexts to completely transform personality, tools, and behavior mid-conversation.
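Here's a stdlib-only sketch of the concept, with invented context names and fields rather than the SDK's actual builder API:

```python
# Each context carries its own prompt, tools, and steps; switching contexts
# swaps all three at once.
contexts = {
    "triage": {
        "prompt": "Identify whether the caller needs sales or support.",
        "tools": ["transfer"],
        "steps": ["greet", "classify_intent", "route"],
    },
    "support": {
        "prompt": "You are a tier-1 support specialist.",
        "tools": ["knowledge_search", "create_ticket"],
        "steps": ["collect_details", "troubleshoot", "escalate_or_resolve"],
    },
}

class Conversation:
    def __init__(self, contexts, start):
        self.contexts = contexts
        self.active = start

    def switch(self, name):
        """Swap personality, tools, and steps in one move."""
        self.active = name

    @property
    def tools(self):
        return self.contexts[self.active]["tools"]

call = Conversation(contexts, start="triage")
call.switch("support")  # the caller's issue needs troubleshooting
```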
As models improve, adjusting a few parameters and prompts will transition your multi-agent teams from deterministic conversation flows to dynamic, intelligent orchestration. No top-to-bottom rewrites or node-by-node overhauls required.
Serverless functions (No webhooks required)
DataMap is a serverless function framework for building simple tools that make REST API calls without setting up and scaling webhooks. DataMap tools are ideal when you need to connect voice agents to REST APIs and structure the payload with minimal latency impact and efficient token use. They're also refreshingly simple to implement:
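A toy model of the declarative idea, with invented field names and a faked HTTP call so it runs offline (the SDK's real DataMap schema differs):

```python
from string import Template

# The tool is declared as data: a URL template plus an output mapping,
# instead of webhook code you host and scale yourself.
ORDER_STATUS_TOOL = {
    "name": "check_order",
    "url": "https://api.example.com/orders/${order_id}",
    "output": "Order ${order_id} is ${status}.",
}

def run_tool(tool, args, http_get):
    """Expand the URL, fetch JSON, and map it into a spoken response."""
    url = Template(tool["url"]).substitute(args)
    payload = http_get(url)  # injected so the sketch stays offline
    return Template(tool["output"]).substitute({**args, **payload})

# Fake transport standing in for the platform's server-side HTTP call.
fake_get = lambda url: {"status": "shipped"}
result = run_tool(ORDER_STATUS_TOOL, {"order_id": "A123"}, http_get=fake_get)
```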
DataMap tools are 100% serverless. They execute on SignalWire's servers and handle authentication, retries, error cases, and response processing automatically.
Use them for simple tools like customer lookups in a CRM, checking the status of a flight, or changing a reservation. When your agents need more advanced tools, use Skills.
Local (offline) knowledge search: Embedded RAG for every agent and multi-agent team
In long-running tasks (like deep research or agentic coding) multi-agent systems work in parallel—spawning sub-agents that complete parts of a task and report back to an orchestrator who synthesizes their outputs.
But voice applications work differently. It rarely makes sense to call an agent and wait 3-10 minutes on the phone for its task to complete.
Because of this distinction, multi-agent orchestration in real-time apps is fundamentally sequential:
1. A greeter agent identifies intent and routes to sales or support.
2. Tier 1 support triages issues and escalates when needed.
3. Specialized tier 2 agents handle product-specific troubleshooting.
With embedded RAG DBs, each agent has built-in access to exactly what it needs to complete its task: Tier 1 support agents have product catalogs and warranty policies. Tier 2 agents have detailed product manuals and troubleshooting guides (in multiple languages) for the products they specialize in.
SignalWire's RAG search uses SQLite databases with hybrid keyword and similarity search.
The architecture also supports hosting these databases on a network, making them useful as persistent data stores or shared knowledge bases for multi-agent teams.
For voice applications where workflows are well-defined and knowledge bases are relatively stable, embedding RAG inside each agent eliminates an entire class of operational complexity.
Every agent is self-contained, with the knowledge it needs to complete its task.
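As a toy illustration of hybrid keyword-plus-similarity search over SQLite (a deliberately simplified stand-in for the SDK's real search, which uses proper embeddings and indexing):

```python
import sqlite3
from collections import Counter
from math import sqrt

# Two tiny "documents" an agent might carry.
DOCS = [
    ("warranty", "Warranty claims require proof of purchase within 12 months."),
    ("reset", "Hold the power button for ten seconds to factory reset the router."),
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (title TEXT, body TEXT)")
db.executemany("INSERT INTO docs VALUES (?, ?)", DOCS)

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query):
    """Rank rows by keyword overlap plus bag-of-words similarity."""
    q = Counter(query.lower().split())
    scored = []
    for title, body in db.execute("SELECT title, body FROM docs"):
        words = Counter(body.lower().split())
        keyword_hits = sum(1 for t in q if t in words)
        scored.append((keyword_hits + cosine(q, words), title))
    return max(scored)[1]
```

In production you'd use real embeddings rather than word counts, but the self-contained shape is the point: the database ships inside the agent.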
Prefab agents: Voice agent design patterns
Prefab agents are a bit like the "prebuilt UI components" you might repurpose to build landing pages or web apps, except for voice UI and multi-agent voice apps. We've included a handful of working examples that embody our current best practices, and we'll be adding more regularly:
Info Gatherer Agent: Collect structured data with validation at each step. Like a web form, but for voice bots
FAQBot: The voice agent equivalent of a support portal or knowledge base
Concierge Agent: Answer questions about the business, and wire it up to booking and ordering systems to complete transactions
SurveyAgent: Multi-step survey agent with branching logic. If the InfoGathererAgent is the voice agent version of a web form, the SurveyAgent is like your favorite survey builder or quiz app
Receptionist Agent: Identify caller intent and transfer to the correct department. Transfers can point to traditional phone numbers, SIP endpoints, or other voice agents
Just like web UI components, each prefab voice agent is a complete, working agent you can customize or use as inspiration in your own multi-agent voice apps.
CLI testing tools
No more deploying to production to test. No need to spin up a chatbot to test your voice agents. No more print debugging. You can now unit test every feature of your agent or simulate entire multi-agent call flows from the command line.
Optimized for your favorite coding agents
The SDK repo comes bundled with extensive documentation, tutorials, and example patterns that you can feed directly into your coding agent or LLM-integrated IDE of choice.
Quick Start Guide: Get an agent deployed in 5 minutes
Comprehensive API Reference: Every method documented
Real Examples: Not just "foo" and "bar" and weather
Architecture Guides: Understand the why, not just the how
Tutorials for single agent and multi-agent deployments with multiple custom skills
COMING SOON: Cursor Rules and CLAUDE.md files for setting your coding agents up for success
What are you waiting for? Discover the SignalWire Agents SDK.