Dani Plicka

Phantom transfers happen when a call transfer completes after the caller has already hung up, so a real agent receives a “successful” transfer with no caller, no audio, and no context. This article explains why event-driven, split-brain call control lets transfer execution outlive hangup finality, how concurrency and network jitter make the bug appear randomly at scale, and why a control plane model prevents the transfer from executing unless the call is still alive at the moment of action.

Orchestration Horror Stories: Phantom Call Transfers

At small scale, call transfers feel deterministic. A caller asks for a human. The system transfers the call. An agent answers. But at production scale, transfers can become unpredictable.

A transfer succeeds after the caller has already hung up. An agent answers a call with no caller. Metrics say the transfer worked, but the experience says it didn’t. Nothing is broken and nothing crashed. The system did exactly what it was told to do.

This series breaks down real failure modes that emerge when voice and AI systems scale, and what kind of architecture prevents them entirely.

Welcome to part 3: When the caller is gone… but the call still reaches a human

This post breaks down The Phantom Transfer, a failure mode that appears when call control is split across asynchronous systems. It covers why transfers can outlive the calls that triggered them, and how orchestrated call execution prevents actions from escaping reality.

The Zombie Call is about state that won’t die, and The Double Update is about state that vanishes. The Phantom Transfer is something else: A call that no longer exists still manages to interrupt a real person.

No audio. No caller. No context. Just a confused agent asking, “Hello?”

Act I: Transfers work as normal… until they don’t

Transfers are straightforward. A caller asks to speak to a human; your system initiates a transfer; the call is bridged to an agent.

At low volume, this works reliably, even if call state lives outside the call and control is driven by callbacks.

But transfers sit at the intersection of everything that breaks at scale:

Call control
Media state
External signaling
Human availability

And that makes them fertile ground for ghosts.

Act II: The caller hangs up at the worst possible moment

In this scenario, the AI agent decides to transfer the call. A transfer request is sent, but while the request is in flight, the caller hangs up.

From the caller’s perspective, the interaction is over.

But from the system’s perspective? That depends on timing.

If hangup detection and transfer execution are handled in separate systems, or even separate processes, both can succeed independently.

The result:

The caller is gone
The transfer still completes
An agent receives… nothing

It becomes a call with no caller, a transfer with no source. A phantom.
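The race described above can be reproduced in a few lines. This is a minimal sketch, not any vendor's implementation: `NaiveCallSystem`, its methods, and the delay values are all hypothetical. The transfer path checks liveness when the decision is made, but not when the command finally lands.

```python
import asyncio


class NaiveCallSystem:
    """Hypothetical split-brain setup: hangup and transfer run independently."""

    def __init__(self):
        self.caller_connected = True
        self.agent_rings = []  # what the agent side ends up seeing

    async def handle_hangup(self, delay):
        # The hangup is detected by a separate path after `delay` seconds.
        await asyncio.sleep(delay)
        self.caller_connected = False

    async def request_transfer(self, network_delay):
        # Liveness is checked at decision time only.
        if not self.caller_connected:
            return
        # The transfer command is "in flight" (assumed network latency).
        await asyncio.sleep(network_delay)
        # No re-check at the moment of action: the agent leg is created anyway.
        self.agent_rings.append({"caller_present": self.caller_connected})


async def simulate():
    system = NaiveCallSystem()
    await asyncio.gather(
        system.request_transfer(network_delay=0.05),
        system.handle_hangup(delay=0.01),
    )
    return system.agent_rings


rings = asyncio.run(simulate())
print(rings)  # the agent is rung even though the caller is already gone
```

Both handlers did what they were told. The agent still gets a call with `caller_present` set to `False`.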

Act III: Human cost, not just technical cost

The human side of the system is where the damage compounds.

Human agents waste time answering empty calls and miss real customers.

Managers see inflated transfer metrics and can’t reproduce the issue reliably.

And engineers add checks everywhere, race hangup detection against transfer execution, and introduce timing thresholds and heuristics.

Still, under enough concurrency and network jitter, the phantom returns. Because the system doesn’t actually know what’s happening in the present moment.

Why phantom transfers happen

Phantom transfers are caused by split-brain call control.

Specifically:

Transfer logic runs asynchronously
Hangup detection runs asynchronously
Neither owns the authoritative call lifecycle

Each component makes a locally correct decision. But globally, the system is wrong.

Once a transfer request escapes the call’s execution context, there’s no guarantee the call still exists when it completes.

Why event-driven architectures struggle with phantom transfers

Event-driven systems excel at throughput, but they struggle with finality. A hangup is final. A transfer must not be: it has to stay conditional on the call still existing.

When both are processed as independent events:

Ordering becomes probabilistic
Cancellation becomes best-effort
Humans become collateral damage

Retries don’t help, and more webhooks don’t help. The architecture has already lost the race.
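To see why ordering becomes probabilistic, consider a rough Monte Carlo sketch. Every number here is an assumption chosen for illustration (50 ms base latency, up to 100 ms of jitter, a hangup landing uniformly within the next second), and `phantom_rate` is a hypothetical helper, not a measurement of any real system.

```python
import random


def phantom_rate(trials, seed=0):
    """Fraction of simulated transfers that land after the caller hung up."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    phantoms = 0
    for _ in range(trials):
        # Assumed: transfer command takes 50 ms plus up to 100 ms of jitter.
        transfer_lands_at = 0.050 + rng.uniform(0.0, 0.100)
        # Assumed: the caller hangs up at a random point in the next second.
        hangup_at = rng.uniform(0.0, 1.0)
        if hangup_at < transfer_lands_at:
            phantoms += 1
    return phantoms / trials


rate = phantom_rate(100_000)
print(f"{rate:.1%} of simulated transfers land after hangup")
```

Under these assumed distributions the expected rate is about one in ten. The exact figure matters less than the shape of the problem: the outcome depends on which event wins, and no amount of retrying changes which one already won.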

The control plane is the difference between “requested” and “allowed”

In a webhook-driven model, a transfer is a request that leaves the call.

Your application decides to transfer, sends an API command, and then waits. If the caller hangs up while that command is in flight, the decision and the execution can land in two different realities. One system decides, another system acts, and neither is guaranteed to re-check liveness at the moment the action is applied.

A control plane flips that model.

In a control plane, the call is a stateful object that the platform owns. Actions like transfer are not side effects kicked off by an external server. They are operations applied inside the call’s execution context, where the platform can make one authoritative check: is this call still alive right now?

That is the difference between “requested” and “allowed.”

Requested: “My app wants to transfer.”
Allowed: “The call is still live, the leg still exists, and the transfer can still execute safely.”

If the call ended even milliseconds earlier, the transfer never executes. There is no cleanup, no rollback, and no phantom, because nothing escaped the call context in the first place.
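One way to picture "requested" versus "allowed" is a call object that checks liveness and applies the transfer under the same lock. This is an illustrative sketch only: the `Call` class, `CallState`, and the method names are hypothetical and are not SignalWire's API.

```python
import threading
from enum import Enum


class CallState(Enum):
    ACTIVE = "active"
    ENDED = "ended"


class Call:
    """Hypothetical platform-owned call object with one authoritative state."""

    def __init__(self, call_id):
        self.call_id = call_id
        self.bridged_to = None
        self._state = CallState.ACTIVE
        self._lock = threading.Lock()

    def hangup(self):
        with self._lock:
            self._state = CallState.ENDED
            self.bridged_to = None

    def transfer(self, agent):
        """Return True only if the transfer was applied to a live call."""
        with self._lock:
            # Liveness check and action happen atomically, at the moment of action.
            if self._state is not CallState.ACTIVE:
                return False  # requested, but not allowed
            self.bridged_to = agent
            return True


live_call = Call("call-1")
live_allowed = live_call.transfer("agent-42")

ended_call = Call("call-2")
ended_call.hangup()
ended_allowed = ended_call.transfer("agent-42")

print(live_allowed, ended_allowed)
```

Because the check and the bridge share one lock, a hangup can land before the transfer or after it, but never in between.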

This is why SignalWire treats transfers as first-class control flow, not asynchronous side effects. The system doesn’t ask, “Should I transfer?” It asks, “Can this transfer still happen right now?” If the answer is no, nothing happens.

This is the core idea behind SignalWire’s orchestration model: logic stays attached to the call. Transfers, holds, bridges, and handoffs execute where call state is authoritative, not where state is reconstructed after the fact. Call control stops being event-chasing and becomes controlled, serialized execution.
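Serialized execution can be sketched as a per-call actor: every operation for a call goes through one inbox and is applied in order by one consumer. The `CallActor` class and its operation names are assumptions made for illustration, not a description of how any particular platform is built.

```python
import asyncio


class CallActor:
    """Hypothetical per-call actor: one inbox, one consumer, one source of truth."""

    def __init__(self):
        self.alive = True
        self.log = []
        self._inbox = asyncio.Queue()

    async def send(self, op, arg=None):
        await self._inbox.put((op, arg))

    async def run(self):
        # Operations are applied strictly one at a time, in arrival order.
        while True:
            op, arg = await self._inbox.get()
            if op == "stop":
                break
            if op == "hangup":
                self.alive = False
                self.log.append("hangup")
            elif op == "transfer":
                if self.alive:
                    self.log.append(f"transfer -> {arg}")
                else:
                    # The call is already over, so the transfer never executes.
                    self.log.append("transfer dropped")


async def demo():
    actor = CallActor()
    await actor.send("hangup")
    await actor.send("transfer", "agent-42")
    await actor.send("stop")
    await actor.run()
    return actor.log


log = asyncio.run(demo())
print(log)
```

With a single consumer there is no interleaving to reason about: a transfer queued behind a hangup sees the ended call and is dropped before any agent is rung.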

The real lesson of the phantom transfer

When call control is externalized, actions can outlive the call, and humans see the consequences.

If your system allows:

Transfers without authoritative liveness
Decisions without execution guarantees
Control flow split across services

Then phantom calls aren’t edge cases. At scale, it’s only a matter of time until one finds you…

Next, we’ll look at The Stale Response: when your AI answers a question the caller stopped caring about thirty seconds ago, and nobody can explain why.

Read the rest of the series:

Part 1: The Zombie Call
Part 2: The Double Update
Part 3: The Phantom Transfer
Part 4: The Stale Response
Part 5: The Agent Disappears

While you wait for the next installments in this series, join our community of developers on Discord.

Frequently asked questions

What is a phantom call transfer?

A phantom call transfer is when a transfer completes after the caller has already disconnected, resulting in an agent answering an “empty” call with no caller audio or context.

Why do phantom transfers happen at scale but not in testing?

They depend on timing, concurrency, and network jitter. Under low volume, the transfer request and hangup detection usually happen in a predictable order. Under production load, those events overlap, and independent systems can each make a locally correct decision that produces a globally wrong outcome.

How can an agent receive a transferred call with no caller?

If transfer execution and hangup detection are handled asynchronously in separate components, the hangup can be finalized in one place while the transfer completes in another. The agent leg can still be created and connected even though the caller leg no longer exists.

What does a control plane have to do with call transfers?

A control plane owns the call lifecycle as the system of record. Instead of executing transfers as out-of-band requests, it applies the transfer inside the call’s execution context, where it can validate liveness at the moment of action and block transfers that can no longer happen safely.

How do you prevent phantom transfers in voice and AI systems?

Prevent them by keeping transfer decisions and execution inside a single call execution context that owns call liveness. In a control-plane model, a transfer is only allowed to execute if the call is still alive at the moment of action. Otherwise it never executes, so there is no rollback, cleanup, or phantom call to reach an agent.
