One Stack Means One Set of Logs | SignalWire
Observability for Voice AI

When your voice AI spans five vendors, diagnosing a failure means correlating logs from five systems with no shared trace ID, timestamp format, or error taxonomy.

  • 5 min: incident diagnosis on SignalWire
  • 4 hrs: typical multi-vendor diagnosis
  • 10: typed error categories
  • 1: unified log stream

Trusted by 2,000+ companies

The problem

Multi-vendor stacks hide failures in the gaps

No shared correlation IDs

Twilio has a CallSid. Deepgram has a request_id. ElevenLabs has a generation_id. Correlating these requires custom mapping code that you build and maintain.
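
The mapping code in question might look roughly like the sketch below. The three ID names come from the vendors mentioned above; the `CallCorrelation` record, the `CorrelationIndex` class, and all sample ID values are hypothetical names introduced for illustration, not part of any vendor SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CallCorrelation:
    """One phone call plus the vendor-specific IDs attached to it (hypothetical)."""
    call_sid: str                                                  # telephony ID
    stt_request_ids: List[str] = field(default_factory=list)      # one per STT request
    tts_generation_ids: List[str] = field(default_factory=list)   # one per TTS generation


class CorrelationIndex:
    """In-memory index that your own code has to populate and keep in sync."""

    def __init__(self) -> None:
        self._by_call: Dict[str, CallCorrelation] = {}
        self._call_for_id: Dict[str, str] = {}  # any vendor ID -> call_sid

    def start_call(self, call_sid: str) -> None:
        self._by_call[call_sid] = CallCorrelation(call_sid)
        self._call_for_id[call_sid] = call_sid

    def link_stt(self, call_sid: str, request_id: str) -> None:
        self._by_call[call_sid].stt_request_ids.append(request_id)
        self._call_for_id[request_id] = call_sid

    def link_tts(self, call_sid: str, generation_id: str) -> None:
        self._by_call[call_sid].tts_generation_ids.append(generation_id)
        self._call_for_id[generation_id] = call_sid

    def find_call(self, vendor_id: str) -> Optional[CallCorrelation]:
        """Resolve any vendor ID back to the call it belongs to, if known."""
        call_sid = self._call_for_id.get(vendor_id)
        return self._by_call.get(call_sid) if call_sid is not None else None


# Illustrative IDs only.
index = CorrelationIndex()
index.start_call("CA-example")
index.link_stt("CA-example", "req-1")
index.link_tts("CA-example", "gen-1")
```

Every vendor added to the stack means another `link_*` method and another place where a missed call to it leaves a gap in the index.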

Incompatible error taxonomies

Twilio reports 'completed' for calls that ended normally. Your TTS provider reports a 422 for quota exhaustion. A 'successful' call in one dashboard can be a failed call for the caller.

Swallowed errors cause silent outages

A production deployment ran out of TTS characters mid-month. The orchestration layer swallowed the 422 error. Callers heard silence for three days before anyone noticed.
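
A minimal sketch of how an orchestration layer can swallow an error like this, next to a version that surfaces it. `synthesize`, `TTSQuotaError`, and both wrapper functions are hypothetical stand-ins, not a real TTS client.

```python
import logging

logger = logging.getLogger("voice_orchestrator")


class TTSQuotaError(Exception):
    """Hypothetical error for a TTS provider answering HTTP 422 (quota exhausted)."""


def synthesize(text: str, quota_left: int) -> bytes:
    """Stand-in for a TTS call: fails once the character quota is used up."""
    if len(text) > quota_left:
        raise TTSQuotaError("422: character quota exhausted")
    return text.encode("utf-8")  # placeholder for audio bytes


def speak_swallowing(text: str, quota_left: int) -> bytes:
    """Anti-pattern: the error disappears and the caller hears silence."""
    try:
        return synthesize(text, quota_left)
    except Exception:
        return b""


def speak_surfacing(text: str, quota_left: int) -> bytes:
    """Log the failure with its cause, then re-raise so monitoring can see it."""
    try:
        return synthesize(text, quota_left)
    except TTSQuotaError:
        logger.error("TTS failed: cause=quota_exhausted chars=%d", len(text))
        raise
```

With the first wrapper, every upstream system still sees a normal call; only the second leaves a trace that an alert could act on.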

No end-to-end latency measurement

You can measure STT latency and TTS latency separately, but not the total time the caller waited. That measurement spans all five vendors plus your server.
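
To make the gap concrete, the sketch below compares the time the caller actually waited with the sum of the per-component latencies. The `TurnTimings` record and its field names are assumptions, and the numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class TurnTimings:
    """Timings for one conversational turn, in milliseconds (hypothetical fields)."""
    caller_stopped_ms: int   # when the caller finished speaking
    first_audio_ms: int      # when the first reply audio reached the caller
    stt_ms: int              # latency reported by the STT vendor
    llm_ms: int              # latency reported by the LLM vendor
    tts_ms: int              # latency reported by the TTS vendor

    @property
    def caller_wait_ms(self) -> int:
        """What the caller experienced, end to end."""
        return self.first_audio_ms - self.caller_stopped_ms

    @property
    def component_sum_ms(self) -> int:
        """What the individual vendor dashboards add up to."""
        return self.stt_ms + self.llm_ms + self.tts_ms

    @property
    def unaccounted_ms(self) -> int:
        """Network hops, queuing, and server time that no vendor reports."""
        return self.caller_wait_ms - self.component_sum_ms


# Invented numbers for illustration.
turn = TurnTimings(caller_stopped_ms=10_000, first_audio_ms=11_900,
                   stt_ms=300, llm_ms=800, tts_ms=400)
```

The two timestamps that define `caller_wait_ms` are the hard part in a multi-vendor stack, because they are observed by different systems with different clocks.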

Build a Voice AI Agent

from signalwire_agents import AgentBase
from signalwire_agents.core.function_result import SwaigFunctionResult

class SupportAgent(AgentBase):
    def __init__(self):
        super().__init__(name="Support Agent", route="/support")
        # Prompt section that tells the AI how to behave on the call
        self.prompt_add_section("Instructions",
            body="You are a customer support agent. "
                 "Greet the caller and resolve their issue.")
        # Spoken language: display name, language code, and TTS voice
        self.add_language("English", "en-US", "rime.spore:mistv2")

    # Expose check_order as a tool the AI can call during the conversation
    @AgentBase.tool(name="check_order")
    def check_order(self, order_id: str):
        """Check the status of a customer order.

        Args:
            order_id: The order ID to look up
        """
        # Placeholder response; a real agent would look the order up here
        return SwaigFunctionResult(f"Order {order_id}: shipped, ETA April 2nd")

# Create the agent and start serving requests
agent = SupportAgent()
agent.run()

Diagnosing a TTS outage

Multi-vendor stack

  • Check telephony dashboard: call connected, no errors
  • Check server logs: webhook delivered, response sent
  • Check STT provider: transcription correct
  • Check LLM provider: response generated
  • Check TTS provider: discover 422 quota error
  • Correlate timestamps across the five systems checked above
  • Discover a 3-day silent outage
  • Time: 4 hours, two engineers, three time zones

SignalWire

  • Check enriched call log
  • See TTS error: type TTS, cause quota_exhausted
  • See first occurrence: 3 days ago
  • See fallback attempted on every affected call
  • Time: 5 minutes, one engineer

10-type error taxonomy

Error type          What it covers
STT errors          Transcription failures, confidence drops
TTS errors          Synthesis failures, quota exhaustion
LLM errors          Inference timeouts, token limits, rate limits
Tool errors         Tool function failures, webhook timeouts
Transfer errors     Failed transfers, destination unavailable
State errors        Variable conflicts, step transition failures
Media errors        Codec negotiation, audio quality degradation
Connection errors   Call setup, teardown, network issues
Timeout errors      End-of-speech, inactivity, session limits
Auth errors         Tool authentication, webhook signature failures
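
One way to model the ten error types listed above in your own code is an enum plus a small record. This is a sketch under assumptions: the `ErrorType` and `CallError` names, the lowercase string values, and the severity labels are illustrative and not SignalWire's wire format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class ErrorType(Enum):
    """The ten categories from the taxonomy (string values are assumptions)."""
    STT = "stt"
    TTS = "tts"
    LLM = "llm"
    TOOL = "tool"
    TRANSFER = "transfer"
    STATE = "state"
    MEDIA = "media"
    CONNECTION = "connection"
    TIMEOUT = "timeout"
    AUTH = "auth"


@dataclass(frozen=True)
class CallError:
    """A typed error with a cause, severity, and timestamp (hypothetical shape)."""
    type: ErrorType
    cause: str
    severity: str
    timestamp: datetime


# Example: the quota failure discussed on this page.
error = CallError(
    type=ErrorType.TTS,
    cause="quota_exhausted",
    severity="critical",  # assumed label
    timestamp=datetime.now(timezone.utc),
)
```

Because `type` is an enum rather than free text, code that handles errors can branch on it without parsing messages.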

Running out of TTS characters mid-month is not theoretical. It has happened to multiple production deployments, and teams had been out of quota for days before anyone noticed.

How single-stack observability works

1. Every event gets a typed structure

State transitions, tool calls, transfers, and barge-in events log with timestamps, causal triggers, and typed metadata. No manual instrumentation.

2. Per-component latency is measured inside the engine

How long STT, LLM inference, TTS, and tool calls took. Measured at the source, not approximated from external API response times.

3. Causal triggers link events

Not 'step changed from greeting to main' but 'step changed because the caller said "I need help with billing," which matched the billing_inquiry transition.'

4. Barge-in analytics surface conversation quality

When the caller interrupted, how the AI responded, and whether the interruption was a real barge-in or a conversational acknowledgment.
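
The ideas above (typed events, timestamps, and causal triggers) can be sketched as a small data structure. The `CallEvent` fields, the event kinds, and the `explain` helper are assumptions made for illustration, not the actual log schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class CallEvent:
    """One entry in a call timeline (hypothetical shape)."""
    event_id: str
    kind: str                                   # e.g. "caller_utterance", "step_change"
    timestamp_ms: int
    metadata: Dict[str, Any] = field(default_factory=dict)
    caused_by: Optional[str] = None             # event_id of the triggering event


def explain(event: CallEvent, log: Dict[str, CallEvent]) -> str:
    """Follow one causal link back and describe what triggered the event."""
    trigger = log.get(event.caused_by) if event.caused_by else None
    if trigger is None:
        return f"{event.kind}: no recorded trigger"
    return f"{event.kind} caused by {trigger.kind} ({trigger.event_id})"


utterance = CallEvent("evt-1", "caller_utterance", 4_200,
                      {"text": "I need help with billing"})
step = CallEvent("evt-2", "step_change", 4_350,
                 {"from": "greeting", "to": "main", "transition": "billing_inquiry"},
                 caused_by="evt-1")
log = {e.event_id: e for e in (utterance, step)}
```

Here `explain(step, log)` resolves the step change back to the caller utterance that triggered it, which is the kind of link a plain "step changed" log line cannot provide.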

Proactive monitoring from single-stack data

Latency drift detection

If response time increases by 200 ms over a week, the enriched log shows which component is responsible. Multi-vendor stacks require manual correlation across dashboards.
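
As a rough sketch of that attribution, given per-component latency samples from the start and end of a week, the component with the largest increase in mean latency can be found as follows. The function names and sample values are invented for illustration; how the samples are exported from the log is not shown.

```python
from statistics import fmean
from typing import Dict, List, Tuple


def drift_by_component(start: Dict[str, List[float]],
                       end: Dict[str, List[float]]) -> Dict[str, float]:
    """Change in mean latency (ms) per component between two sample windows."""
    return {name: fmean(end[name]) - fmean(start[name]) for name in start}


def largest_drift(start: Dict[str, List[float]],
                  end: Dict[str, List[float]]) -> Tuple[str, float]:
    """Component whose mean latency grew the most, with the size of the change."""
    drift = drift_by_component(start, end)
    worst = max(drift, key=drift.get)
    return worst, drift[worst]


# Invented samples, in milliseconds.
week_start = {"stt": [300, 310, 290], "llm": [800, 820, 780], "tts": [400, 410, 390]}
week_end = {"stt": [305, 295, 300], "llm": [990, 1010, 1000], "tts": [400, 405, 395]}

component, delta_ms = largest_drift(week_start, week_end)
```

In this made-up data the whole 200 ms increase comes from the LLM step, while STT and TTS are flat.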

Error rate trends

A gradual increase in STT confidence drops might indicate microphone quality issues or background noise patterns. SignalWire surfaces this as a trend in structured data.

Barge-in patterns

If callers interrupt the AI more frequently on certain prompts, barge-in analytics reveal which parts of the conversation flow need tuning.

Transfer failure rates

What percentage of transfers fail? Which destinations are unreachable? How often does context get lost? Single-stack observability answers these without custom instrumentation.
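
A sketch of how those questions reduce to simple aggregation once transfer events are structured. The `TransferEvent` fields and the sample data are assumptions for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class TransferEvent:
    """One attempted transfer (hypothetical shape)."""
    destination: str
    succeeded: bool
    context_preserved: bool


def failure_rate(events: List[TransferEvent]) -> float:
    """Fraction of transfers that failed; 0.0 when there were none."""
    if not events:
        return 0.0
    return sum(not e.succeeded for e in events) / len(events)


def failures_by_destination(events: List[TransferEvent]) -> Counter:
    """How many failed transfers each destination accounts for."""
    return Counter(e.destination for e in events if not e.succeeded)


def context_loss_rate(events: List[TransferEvent]) -> float:
    """Among successful transfers, the fraction that arrived without context."""
    done = [e for e in events if e.succeeded]
    if not done:
        return 0.0
    return sum(not e.context_preserved for e in done) / len(done)


# Invented sample data.
transfers = [
    TransferEvent("billing", True, True),
    TransferEvent("billing", False, False),
    TransferEvent("sales", True, False),
    TransferEvent("billing", False, False),
]
```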

Every vendor in the chain was good individually. The failures live in the gaps between vendors: timing mismatches, error classification differences, missing correlation IDs, and swallowed errors that no single vendor can see.

FAQ

Does SignalWire replace my monitoring tools?

No. SignalWire provides the structured event data. You can export it to your existing monitoring stack (Datadog, Grafana, custom dashboards). The difference is that the data comes from one source with one schema.

How does the error taxonomy work?

Every error is classified into one of 10 types (STT, TTS, LLM, Tool, Transfer, State, Media, Connection, Timeout, Auth) with severity, cause, and timestamp. No ambiguity about which component failed.

What is an enriched call log?

A structured record of every event in a call: state transitions, tool calls, transfers, barge-in events, per-component latency, and errors. All in one timeline with causal links between events.

Can I set alerts on specific error types?

Yes. Because errors are typed and structured, you can alert on TTS quota errors, transfer failures, latency thresholds, or any combination. No regex parsing of unstructured logs.
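
As an illustration of alerting on structured fields instead of regexes, the sketch below evaluates simple rules against a record. The record keys (`type`, `cause`, `latency_ms`), the `AlertRule` class, and the 2000 ms threshold are all assumptions; a real setup would express the same conditions in your monitoring tool.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class AlertRule:
    """A named predicate over one structured log record (hypothetical)."""
    name: str
    matches: Callable[[Record], bool]


def fire_alerts(record: Record, rules: List[AlertRule]) -> List[str]:
    """Names of all rules that match the record."""
    return [rule.name for rule in rules if rule.matches(record)]


rules = [
    AlertRule("tts_quota",
              lambda r: r.get("type") == "TTS" and r.get("cause") == "quota_exhausted"),
    AlertRule("transfer_failed",
              lambda r: r.get("type") == "Transfer"),
    AlertRule("slow_turn",
              lambda r: r.get("latency_ms", 0) > 2000),  # assumed threshold
]

# Invented records for illustration.
quota_error = {"type": "TTS", "cause": "quota_exhausted", "latency_ms": 150}
healthy_turn = {"latency_ms": 900}
```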

How does this help with the 'three days of silence' scenario?

The TTS error would appear immediately in the enriched call log as a typed TTS error with cause quota_exhausted. An alert on TTS errors would fire on the first affected call, not three days later.

Stop correlating. Start seeing.

Get one view of the entire call lifecycle, from PSTN ingress to AI processing to resolution.