Voice AI to live-agent handoff: the patterns that survive production
- Heads of Ops
- CX directors
The single most predictive measure of post-launch satisfaction is not containment; it is the experience of the escalated caller. Get the handoff right and a 30%-contained deployment outperforms a 60%-contained one with blind transfers.
Five handoff patterns and when to use each
| Pattern | Use when | Context carried | Failure mode |
|---|---|---|---|
| Warm transfer with context | Caller is on the line, intent captured, action partially complete | Transcript summary, intent, identity, action attempted, reason for escalation | Summary too long, agent ignores it |
| Cold transfer with screen-pop | High volume, simple intents, agent can pick up the context visually | Identity, intent code, two-line summary, link to full transcript | Screen-pop fails to load before agent answers |
| Asynchronous follow-up | Action requires back-office work, no SLA on real-time response | Full transcript, intent, caller-preferred channel and time | Follow-up SLA not measured |
| Scheduled callback | Caller declines to wait, queue depth is the constraint | Identity, intent, agreed callback window, transcript summary | Callback misses the agreed window |
| Supervised handoff | First 30 days of any new intent; first 90 days for high-risk intents in regulated industries | Full transcript, AI confidence score per turn, supervisor flag on any low-confidence turn | Supervisor staffing not held past launch |
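One way to make the "Use when" column operational is a small routing function. The sketch below is illustrative only: the `CallState` field names and the order in which the conditions are checked are assumptions, not something the table prescribes, and a real deployment would derive these signals from its own platform.

```python
from dataclasses import dataclass
from enum import Enum


class Pattern(Enum):
    WARM_TRANSFER = "warm transfer with context"
    COLD_SCREEN_POP = "cold transfer with screen-pop"
    ASYNC_FOLLOW_UP = "asynchronous follow-up"
    SCHEDULED_CALLBACK = "scheduled callback"
    SUPERVISED = "supervised handoff"


@dataclass
class CallState:
    # Hypothetical signals; map them to whatever your platform exposes.
    intent_under_supervision: bool  # intent is still inside its supervised window
    needs_back_office_work: bool    # no SLA on a real-time response
    caller_declined_to_wait: bool   # queue depth is the constraint
    simple_intent: bool             # agent can pick up the context visually


def choose_pattern(state: CallState) -> Pattern:
    """Pick a handoff pattern. The precedence order is an assumption."""
    if state.intent_under_supervision:
        return Pattern.SUPERVISED
    if state.needs_back_office_work:
        return Pattern.ASYNC_FOLLOW_UP
    if state.caller_declined_to_wait:
        return Pattern.SCHEDULED_CALLBACK
    if state.simple_intent:
        return Pattern.COLD_SCREEN_POP
    # Default: caller on the line, intent captured, action partially complete.
    return Pattern.WARM_TRANSFER
```

Keeping the decision in one function makes it easy to tag every escalated call with the pattern that was chosen, which later reporting depends on.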
What context to carry — the non-negotiables
Every handoff carries the same five fields regardless of pattern. Anything less and the caller re-explains, which is the single biggest CSAT killer at handoff.
- Identity — verified or stated, with the verification status named
- Intent — captured in machine-readable form, not just a free-text summary
- Action attempted — what the AI tried to do, and the system response if any
- Reason for escalation — named, not inferred from absence
- Caller emotional state — flagged if frustration, distress, or vulnerability signal was detected
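The five fields can be enforced as a structured payload rather than left to convention. The following is a minimal sketch, assuming a Python service builds the handoff record; the class name, field names, and the `missing_fields` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HandoffContext:
    identity: str                  # who the caller is, verified or stated
    identity_verified: bool        # the verification status, named explicitly
    intent_code: str               # machine-readable intent, not free text
    action_attempted: str          # what the AI tried to do
    system_response: Optional[str] # response from the system, if any
    escalation_reason: str         # named reason, never inferred from absence
    emotional_flag: Optional[str]  # e.g. "frustration"; None if no signal detected


# Text fields that must be non-empty on every handoff (assumed rule).
REQUIRED_TEXT_FIELDS = ("identity", "intent_code", "action_attempted", "escalation_reason")


def missing_fields(ctx: HandoffContext) -> List[str]:
    """Return the names of required fields that are blank."""
    return [name for name in REQUIRED_TEXT_FIELDS if not getattr(ctx, name).strip()]
```

A handoff whose `missing_fields` result is non-empty is one where the caller is likely to be asked to re-explain, so it is worth logging before the transfer completes.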
Measuring handoff quality in production
Four metrics, every week, on every handoff pattern. Without them the handoff seam stays opaque and degrades silently.
- Re-explanation rate — % of escalated calls where the agent asks the caller to re-state the intent
- Handoff handle-time penalty — agent handle time on escalated calls vs baseline pre-AI calls for the same intent
- Post-handoff CSAT — escalated calls scored separately, not blended with contained
- Repeat-contact within 7 days following handoff — separate from overall re-contact
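The four metrics reduce to simple aggregates over tagged call records. The sketch below assumes each escalated call has already been annotated with the fields shown; the record shape, field names, and function names are hypothetical, and the handle-time penalty is expressed here as a mean difference in seconds.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class EscalatedCall:
    pattern: str                     # handoff pattern used for this call
    agent_asked_to_restate: bool     # agent asked the caller to re-state the intent
    handle_time_s: float             # agent handle time on this call
    baseline_handle_time_s: float    # pre-AI baseline for the same intent
    csat: Optional[float]            # None when the caller gave no score
    recontacted_within_7_days: bool  # repeat contact following the handoff


def handoff_metrics(calls: List[EscalatedCall]) -> Dict[str, Optional[float]]:
    """Compute the four handoff metrics for one set of escalated calls."""
    if not calls:
        raise ValueError("no escalated calls to measure")
    n = len(calls)
    scored = [c.csat for c in calls if c.csat is not None]
    return {
        "re_explanation_rate": sum(c.agent_asked_to_restate for c in calls) / n,
        "handle_time_penalty_s": mean(
            [c.handle_time_s - c.baseline_handle_time_s for c in calls]
        ),
        "post_handoff_csat": mean(scored) if scored else None,
        "repeat_contact_7d_rate": sum(c.recontacted_within_7_days for c in calls) / n,
    }


def metrics_by_pattern(calls: List[EscalatedCall]) -> Dict[str, Dict[str, Optional[float]]]:
    """Report the same metrics separately for each handoff pattern."""
    grouped: Dict[str, List[EscalatedCall]] = defaultdict(list)
    for call in calls:
        grouped[call.pattern].append(call)
    return {pattern: handoff_metrics(group) for pattern, group in grouped.items()}
```

Running `metrics_by_pattern` over one week of escalated calls gives the per-pattern view; contained calls are deliberately left out so the CSAT figure is not blended.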
What the supervised-handoff pattern actually costs
Every new intent gets the supervised pattern for the first 30 days post-launch, extended to 90 days for regulated or high-risk intents. A supervisor, usually a senior contact-centre operator, listens to a sample of live AI calls and intervenes live when a turn is flagged as low-confidence. The cost is 0.1 to 0.2 FTE per concurrent live deployment for the first quarter. Skipping it saves that FTE up front and costs two-fold in post-launch firefighting.
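Flagging low-confidence turns for the supervisor can be as simple as a threshold check over per-turn scores. This is a sketch under stated assumptions: the 0.6 threshold and the function name are illustrative, and the right cut-off depends on how your vendor's confidence score is calibrated.

```python
from typing import List

# Assumed threshold; tune it against your own confidence score distribution.
LOW_CONFIDENCE_THRESHOLD = 0.6


def flag_low_confidence_turns(
    turn_confidences: List[float],
    threshold: float = LOW_CONFIDENCE_THRESHOLD,
) -> List[int]:
    """Return the zero-based indices of turns scoring below the threshold."""
    return [i for i, score in enumerate(turn_confidences) if score < threshold]


if __name__ == "__main__":
    # A four-turn call where the second and fourth turns need a supervisor's eye.
    print(flag_low_confidence_turns([0.9, 0.55, 0.8, 0.3]))
```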
Pull last week's escalated calls. Tag the first ten by handoff pattern and measure the re-explanation rate. If you cannot, that is your week-one observability gap.
- Handoff experience predicts post-launch CSAT more reliably than containment rate.
- Five named patterns cover the realistic cases — match pattern to intent and queue depth, not vendor preference.
- Every handoff carries identity, intent, action attempted, escalation reason, and emotional state — no exceptions.
- Measure re-explanation rate, handoff handle-time penalty, post-handoff CSAT, and post-handoff re-contact separately every week.
- Run the supervised pattern for 30 days on new intents, 90 days on regulated ones.
Frequently asked questions
- Why is warm transfer not always the right answer?
- Because warm transfer at high volume queues callers behind agents who have to read the summary before answering. For simple intents, screen-pop is faster and the context still arrives in time.
- What is the single biggest cause of bad handoffs?
- The transcript summary is too long. Agents do not read past line three. Two lines and a structured intent code is the right shape for cold transfer; three to five lines for warm.
- How long should the supervised pattern run for a new intent?
- Thirty days for low-risk intents, ninety days for regulated or high-risk intents. Removing the supervisor before failure modes have surfaced at scale is the most common cause of post-launch incident clusters.
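The summary-length guidance in the answers above (two lines for cold transfer, up to five for warm) can be checked before the summary reaches the agent. A minimal sketch, assuming summaries arrive as newline-separated text; the function and constant names are hypothetical.

```python
# Maximum summary length in lines, per transfer type, from the guidance above.
MAX_SUMMARY_LINES = {"cold": 2, "warm": 5}


def summary_fits(summary: str, transfer_type: str) -> bool:
    """True if the summary is within the line budget for this transfer type."""
    lines = [line for line in summary.splitlines() if line.strip()]
    return len(lines) <= MAX_SUMMARY_LINES[transfer_type]
```

A summary that fails the check is a candidate for re-summarising or truncating rather than sending as-is, since agents tend not to read past the third line.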
Terms used in this guide
- Voice AI: software that answers the phone, understands what the caller wants, and takes action, rather than just a smarter IVR.
- IVR replacement: swapping menus and keypad input for natural conversation and actual resolution.
Lewis Crook: 20 years in enterprise technology, from FTSE 100 voice deployments to over a million AI-handled minutes a month across Asia-Pacific. Buyer, builder, and now working with CX leaders on enterprise voice AI. Writes The Voice AI Brief.