Python Agents SDK

One SDK Replaces Five Vendors

You are managing five SDKs, five invoices, and five escalation paths to make one voice call. There is a simpler architecture.

2,000+ companies in production
< 1.2s typical AI response latency
$0.16 per minute, AI processing
2.7B minutes processed

The problem with bolt-on voice AI

Building a voice AI agent today means stitching together a telephony provider, a speech-to-text service, a language model, a text-to-speech engine, and your own orchestration layer. Your code becomes the glue between five independent systems with different timing guarantees, different error models, and different billing.

Every network hop between components adds latency. Every vendor boundary adds a failure mode. Every integration adds state you need to track. The result: 2-4 seconds of response latency, race conditions at scale, and five support queues when something breaks at 2am.
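
For illustration, the glue layer usually ends up looking something like the sketch below. The client objects (call_stream, stt_client, llm_client, tts_client) are hypothetical stand-ins, not any particular vendor's SDK; the point is how much of the conversation loop lives in your code.

python
# Hypothetical per-turn glue loop for a multi-vendor stack. Every client here
# is an illustrative stand-in, not a specific vendor SDK.
def handle_turn(call_stream, stt_client, llm_client, tts_client, history):
    audio_in = call_stream.read()                 # audio arrives from the telephony vendor
    text_in = stt_client.transcribe(audio_in)     # round trip to the speech-to-text vendor
    history.append({"role": "user", "content": text_in})

    reply = llm_client.complete(history)          # round trip to the LLM vendor
    history.append({"role": "assistant", "content": reply})

    audio_out = tts_client.synthesize(reply)      # round trip to the text-to-speech vendor
    call_stream.write(audio_out)                  # back through telephony to the caller

    # Timeouts, retries, barge-in, and partial results at each boundary are yours
    # to handle, and `history` has to stay consistent across all of them.
    return history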

The architecture difference

Multi-Vendor Stack

  • Telephony from one vendor, STT from another, LLM from a third, TTS from a fourth
  • Your application is the glue between all of them
  • State split across five systems with different timing guarantees
  • Six network hops minimum per conversational turn
  • Five SDKs, five billing relationships, five escalation paths

SignalWire

  • One platform handles telephony, STT, LLM, TTS, and media processing
  • AI runs inside the call, not alongside it
  • State lives in the platform, not your application
  • AI orchestrated inside the media stack
  • One SDK, one invoice, one support path

Build a Voice AI Agent

python
from signalwire_agents import AgentBase
from signalwire_agents.core.function_result import SwaigFunctionResult

class SupportAgent(AgentBase):
    def __init__(self):
        super().__init__(name="Support Agent", route="/support")
        self.prompt_add_section("Instructions",
            body="You are a customer support agent. "
                 "Greet the caller and resolve their issue.")
        self.add_language("English", "en-US", "rime.spore:mistv2")

    @AgentBase.tool(name="check_order")
    def check_order(self, order_id: str):
        """Check the status of a customer order.

        Args:
            order_id: The order ID to look up
        """
        return SwaigFunctionResult(f"Order {order_id}: shipped, ETA April 2nd")

agent = SupportAgent()
agent.run()

Add tool calling for business logic

When the model invokes a tool, the platform sends the request to your agent and your agent returns the result. The platform handles everything between the caller and your code.

python
@agent.tool("check_availability",
    description="Check available appointment slots for a given date",
    parameters={
        "date": {"type": "string", "description": "Date in YYYY-MM-DD format"}
    })
def check_availability(args):
    slots = scheduling_api.get_slots(args["date"])
    if slots:
        return f"Available times: {', '.join(slots)}"
    return "No availability on that date."

@agent.tool("book_appointment",
    description="Book an appointment",
    parameters={
        "date": {"type": "string"},
        "time": {"type": "string"},
        "patient_name": {"type": "string"}
    })
def book_appointment(args):
    scheduling_api.book(args["date"], args["time"], args["patient_name"])
    return (f"Appointment confirmed for {args['patient_name']} "
            f"on {args['date']} at {args['time']}")

Structure conversations with steps

Each step has its own prompt scope and tool scope. The model in the "identify" step cannot access scheduling tools. The model in the "schedule" step cannot access patient lookup. The model handles language; your code handles truth.

python
agent.add_step(
    name="identify",
    prompt="Ask for the caller's name and date of birth to look up their account.",
    tools=["lookup_patient"],
    transitions={"patient_found": "schedule", "not_found": "new_patient"}
)

agent.add_step(
    name="schedule",
    prompt="Help the caller book, reschedule, or cancel an appointment.",
    tools=["check_availability", "book_appointment", "cancel_appointment"],
    transitions={"billing_question": "transfer_front_desk", "done": "farewell"}
)
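
The steps above reference lookup_patient and cancel_appointment, which are not defined on this page. Sketches of both follow, using the same decorator pattern as the tools above; patients_api, like scheduling_api, is an illustrative placeholder for your own backend.

python
@agent.tool("lookup_patient",
    description="Look up a patient account by name and date of birth",
    parameters={
        "name": {"type": "string"},
        "date_of_birth": {"type": "string", "description": "Date in YYYY-MM-DD format"}
    })
def lookup_patient(args):
    # patients_api is a placeholder for your own record system
    patient = patients_api.find(args["name"], args["date_of_birth"])
    if patient:
        return f"Found account for {patient['name']}, ID {patient['account_id']}"
    return "No matching patient record found."

@agent.tool("cancel_appointment",
    description="Cancel an existing appointment",
    parameters={
        "date": {"type": "string"},
        "patient_name": {"type": "string"}
    })
def cancel_appointment(args):
    scheduling_api.cancel(args["date"], args["patient_name"])
    return f"Appointment for {args['patient_name']} on {args['date']} has been cancelled."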

What makes this architecture different

AI embedded in the media stack

The AI kernel orchestrates speech recognition, language model inference, and speech synthesis from inside the media stack, with direct access to the audio stream. That eliminates the network hops between the audio and the orchestration layer: 800-1200 ms typical response latency, versus 2-4 seconds for bolt-on architectures.

Platform-native state management

Call state, conversation context, and step transitions live inside the platform. Your application does not reconstruct state from webhooks or manage distributed state across vendors.

Model-agnostic design

The platform integrates with multiple LLM providers. Switch models without changing agent code. When models commoditize, your investment in agent architecture remains.

Pre-built skill packages

Modular tool packages for common tasks: datetime parsing, search integration, data lookups. Write your own skills for reusable business logic across agents.
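
A minimal sketch of attaching skills to an agent, assuming the SDK's add_skill interface; the skill names shown here are illustrative, and some skills take configuration such as API credentials.

python
from signalwire_agents import AgentBase

class ReceptionAgent(AgentBase):
    def __init__(self):
        super().__init__(name="Reception Agent", route="/reception")
        # Skill names are illustrative; check the SDK reference for the
        # packages shipped with your version and their parameters.
        self.add_skill("datetime")     # date and time parsing for phrases like "next Tuesday at 3"
        self.add_skill("web_search")   # search integration, typically configured with API credentials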

From pip install to production

1. Install the SDK
pip install signalwire-agents. One package, no vendor chain to configure.

2. Define your agent
Set the prompt, add tools for your business logic, define conversation steps if needed.

3. Test locally
Run the agent as a standard Python process. Call it from a browser or SIP client.

4. Deploy anywhere
Container, VM, or serverless function. Your agent is a standard Python microservice. Connect a phone number and start taking calls.
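
As a sketch of steps 3 and 4, the agent file is an ordinary Python entry point; the same file runs locally and inside a container, VM, or serverless runtime. The module name support_agent is illustrative; SupportAgent is the class defined earlier on this page.

python
# support_service.py: an ordinary Python entry point, nothing deployment-specific
from support_agent import SupportAgent  # module name is illustrative; class defined above

agent = SupportAgent()

if __name__ == "__main__":
    # Launched directly (python support_service.py), the agent runs as a
    # standard Python service you can call from a browser or SIP client.
    agent.run()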

Declarative flows and real-time control

Declarative call flows in YAML

Define multi-step call flows as configuration. The platform executes them. No server infrastructure required for standard patterns.

Real-time bidirectional control

For applications that need to react to events outside the call (supervisor intervention, CRM triggers), a persistent bidirectional connection provides live control over active calls.

REST API for active calls

Transfer, inject prompts, and modify state on any active call via REST calls. Common operations without maintaining a persistent connection.

FAQ

Can I bring my own LLM provider?

Yes. The platform is model-agnostic. You can use OpenAI, Anthropic, Google, or self-hosted models. Switch providers without changing your agent code.

What does $0.16/min include?

AI processing: speech-to-text, language model inference, text-to-speech, and orchestration. Transport (PSTN, SIP) is billed separately at carrier rates. One invoice. No per-component billing for AI.

How does this compare to building on Twilio + Deepgram + OpenAI?

That stack requires your application to orchestrate audio between three services, manage state across all of them, and handle failures independently. SignalWire's AI kernel orchestrates all of those components from inside the media stack. Fewer hops, fewer failure modes, one vendor to call.

Can I start with YAML and move to Python later?

Yes. Declarative YAML flows and the Python SDK work on the same platform. Start with YAML for standard patterns, switch to Python when you need custom logic. Both approaches deploy the same way.


One stack. One SDK. Ship this week.

Install the Python SDK and build your first voice AI agent in an afternoon.