Voice AI containment rate: what's real vs what vendors claim

CX directors
VP / COO
Heads of Ops

By Lewis Crook. Published June 15, 2026

Bottom line up front

Containment rate is the single most-quoted and least-defined number in enterprise voice AI. Vendor figures of 70%+ are not wrong, but they usually measure a narrower denominator than the one a CX leader cares about.

What containment rate measures

Containment rate is the share of calls handled end-to-end by the automated system without escalation to a human agent. It is sometimes called deflection rate, automation rate, or self-service rate; the term varies by vendor and platform.

The number depends entirely on two definitions: the numerator (what counts as "handled") and the denominator (which calls are in scope). Two deployments quoting 65% containment can differ by 30 percentage points on a like-for-like comparison.

Where the definitions diverge

The most common adjustments that inflate a reported containment rate are: excluding calls abandoned in the first 10 seconds, excluding calls routed to a human at the IVR before the AI was offered, excluding out-of-hours calls, and counting any call that did not transfer as "contained" even if the customer called back the next day. Four questions expose which of these a quoted figure relies on:

Numerator: does "handled" require the customer's stated intent to be resolved, or only that no transfer occurred?
Denominator: are abandoned, out-of-hours, and pre-routed calls included?
Time window: is re-contact within 7 or 14 days deducted from the numerator?
Intent scope: is containment measured across all calls or only intents the AI is configured to handle?
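
To make the four questions concrete, here is a minimal Python sketch that scores the same ten calls under two sets of answers. The Call and Definition classes, their field names, and the sample data are hypothetical and chosen only for illustration; a real call log will need its own mapping onto these flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    # Hypothetical per-call flags; real logs will name these differently.
    transferred: bool = False       # escalated to a human agent
    resolved: bool = False          # fulfilled action or explicit confirmation
    abandoned_early: bool = False   # hung up in the first 10 seconds
    out_of_hours: bool = False
    pre_routed: bool = False        # sent to a human at the IVR before the AI
    in_ai_scope: bool = True        # intent the AI is configured to handle
    recontacted: bool = False       # same intent returned inside the window


@dataclass(frozen=True)
class Definition:
    require_resolution: bool   # numerator question
    include_all_calls: bool    # denominator question
    deduct_recontact: bool     # time-window question
    all_intents: bool          # intent-scope question


def containment_rate(calls, d):
    """Share of in-pool calls that count as contained under definition d."""
    pool = [
        c for c in calls
        if (d.include_all_calls
            or not (c.abandoned_early or c.out_of_hours or c.pre_routed))
        and (d.all_intents or c.in_ai_scope)
    ]
    if not pool:
        return 0.0

    def contained(c):
        if c.transferred or c.pre_routed:
            return False
        if d.require_resolution and not c.resolved:
            return False
        if d.deduct_recontact and c.recontacted:
            return False
        return True

    return sum(contained(c) for c in pool) / len(pool)


# Ten illustrative calls: six clean resolutions, one re-contact,
# one transfer, one short abandon, one pre-routed call.
calls = (
    [Call(resolved=True)] * 6
    + [Call(resolved=True, recontacted=True)]
    + [Call(transferred=True)]
    + [Call(abandoned_early=True)]
    + [Call(pre_routed=True)]
)

lenient = containment_rate(calls, Definition(False, False, False, False))
strict = containment_rate(calls, Definition(True, True, True, True))
print(f"lenient: {lenient:.1%}  strict: {strict:.1%}")
```

On this toy data the lenient answers give 87.5% and the strict answers give 60.0%, from exactly the same calls. The sample is invented, so the size of the gap carries no weight; the point is only that the definition, not the system, moved the number.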

A defensible measurement

A measurement defensible to a finance team and a regulator typically includes all inbound calls in scope, requires evidence of resolution (a fulfilled action or an explicit confirmation), and subtracts 7-day re-contact for the same intent. Numbers calculated this way are usually 15–30 percentage points lower than the vendor headline.

What a healthy production range looks like

Across the deployments I have seen, defensible containment on a representative call mix sits in three bands. Transactional intents (balance, status, simple changes) routinely reach 60–80%. Mixed intents (billing questions, account changes) tend to land at 30–50%. Complex intents (claims, disputes, retention) typically remain under 30% unless heavily redesigned. Blended figures depend on the intent mix.

The formula in full, with every adjustment named

A defensible containment formula has four moving parts and each one is contestable. Writing the full version down before a vendor proposal arrives is the cheapest way to keep an evaluation honest.

Net containment = (contained calls − re-contact within window) / (in-scope calls − exclusions)

Each clause is a negotiation. "Contained" requires a definition of resolution. "Re-contact window" is usually 7 days; some teams use 14 for claims or disputes. "In-scope" decides whether out-of-hours, abandoned, and pre-routed calls are counted. "Exclusions" decides whether intents the AI is not configured to handle are removed from the denominator or counted as failures.
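
The formula can be written down as a few lines of Python so that both sides of a negotiation plug their numbers into the same function. This is a sketch only: the function name, the parameter names, and the monthly figures below are hypothetical, and it assumes none of the excluded calls were counted as contained in the first place.

```python
def net_containment(contained, recontact, in_scope, exclusions):
    """(contained - recontact) / (in_scope - exclusions), as a fraction."""
    denominator = in_scope - exclusions
    if denominator <= 0:
        raise ValueError("in_scope minus exclusions must be positive")
    return (contained - recontact) / denominator


# Hypothetical month: 10,000 inbound calls, 6,000 ended without transfer
# and with a resolution event, 800 of those came back within 7 days.

# Headline-style reading: ignore re-contact, exclude 2,000 calls.
headline = net_containment(6000, 0, 10000, 2000)

# Defensible reading: deduct re-contact, exclude nothing.
net = net_containment(6000, 800, 10000, 0)

print(f"headline: {headline:.0%}  net: {net:.0%}")
```

With these invented inputs the headline reading is 75% and the net reading is 52%. Neither figure comes from a real deployment; the useful part is agreeing on the four arguments before anyone quotes a result.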

Five vendor headline boosters and what each one hides

Most reported containment figures use one or more of the following adjustments. None is fraudulent on its own; combined, they routinely lift the headline by 15 to 30 percentage points.

Excluding short abandons (under 10 seconds) — hides callers who hung up because they wanted a human
Excluding pre-routed calls — hides intents the IVR removed from the AI's denominator before scoring
Counting any non-transfer as contained — hides callers who hung up in frustration mid-flow
Ignoring re-contact — hides intents that bounced back the next day
Reporting on the configured intent set only — hides the long tail the AI cannot handle
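
The five boosters stack, and a small waterfall makes the stacking visible. The sketch below starts from a defensible rate and applies each adjustment in the order listed above. All counts are hypothetical, the dictionary keys are illustrative names, and out-of-scope intents are assumed to end in a transfer.

```python
def booster_waterfall(c):
    """Return (label, rate) pairs after each booster is applied in turn."""
    numerator = c["resolved"] - c["recontact"]
    denominator = c["total"]
    steps = [("Defensible baseline", numerator / denominator)]

    denominator -= c["short_abandons"]
    steps.append(("1. Exclude short abandons", numerator / denominator))

    denominator -= c["pre_routed"]
    steps.append(("2. Exclude pre-routed calls", numerator / denominator))

    numerator += c["hung_up_mid_flow"]
    steps.append(("3. Count any non-transfer as contained",
                  numerator / denominator))

    numerator += c["recontact"]
    steps.append(("4. Ignore re-contact", numerator / denominator))

    denominator -= c["out_of_scope"]
    steps.append(("5. Configured intents only", numerator / denominator))
    return steps


# Hypothetical 1,000-call sample; the first six categories do not overlap
# and sum to the total. "recontact" is a subset of "resolved".
counts = {
    "total": 1000,
    "short_abandons": 50,
    "pre_routed": 100,
    "out_of_scope": 100,
    "transferred": 200,
    "hung_up_mid_flow": 50,
    "resolved": 500,
    "recontact": 50,
}

steps = booster_waterfall(counts)
for label, rate in steps:
    print(f"{label:<40} {rate:6.1%}")
```

On these invented counts the rate climbs from 45.0% to roughly 73.3% without a single call being handled differently. Asking a vendor to fill in the same table, one row per adjustment, is a quick way to see which boosters sit behind a headline.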

How to set a containment baseline before launch

A baseline measured before launch is the only defence against post-launch goalpost-moving. Pull a recent four-week sample of calls handled by the existing IVR or a human-only queue, classify each by intent, and tag whether the call ended with a resolution event (a fulfilled action, an explicit confirmation, or no re-contact within the window). The blended baseline becomes the comparison point — and crucially, it sets the measurement methodology that the AI deployment must use.

Skip this step and the post-launch containment rate will be measured against whatever methodology produces the most flattering number. With it, the comparison is honest and the operating-model team has a target it can chase.
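
As a rough sketch of the baseline step, the snippet below takes a sample of calls that have already been classified by intent and tagged with a resolution flag, then reports the per-intent and blended baseline. The intent labels, the tuple layout, and the ten sample rows are hypothetical; the tagging itself (deciding what counts as a resolution event) is the real work and is not automated here.

```python
from collections import defaultdict


def baseline_by_intent(calls):
    """calls: iterable of (intent, resolved) pairs. Returns (per_intent, blended)."""
    totals = defaultdict(int)
    resolved = defaultdict(int)
    for intent, ok in calls:
        totals[intent] += 1
        resolved[intent] += int(ok)
    if not totals:
        raise ValueError("sample is empty")
    per_intent = {i: resolved[i] / totals[i] for i in totals}
    blended = sum(resolved.values()) / sum(totals.values())
    return per_intent, blended


# Hypothetical extract from a four-week sample, already tagged by hand.
sample = [
    ("balance", True), ("balance", True), ("balance", True), ("balance", False),
    ("billing", True), ("billing", True), ("billing", False), ("billing", False),
    ("claims", False), ("claims", False),
]

per_intent, blended = baseline_by_intent(sample)
for intent, rate in per_intent.items():
    print(f"{intent:<10} {rate:.0%}")
print(f"{'blended':<10} {blended:.0%}")
```

Keeping the per-intent rates alongside the blended figure matters: after launch, a change in the blended number can then be traced to either a change in handling or a change in the intent mix.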

The three bands worth modelling separately

Blended containment numbers hide more than they reveal. Splitting the call mix into three intent bands produces a more honest model and a more usable target for the operating-model team.

Transactional band — balance, status, simple changes — production containment of 60–80% is achievable
Mixed band — billing questions, account changes, scheduling — production containment of 30–50% is realistic
Complex band — claims, disputes, retention, exceptions — production containment under 30% unless heavily redesigned
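
A blended figure is just a mix-weighted average of the band rates, which is why the same system can report very different headlines on different call mixes. The sketch below assumes midpoint rates of 70% and 40% for the first two bands and an illustrative 20% for the complex band; those rates and both call mixes are assumptions for the example, not measurements.

```python
# Assumed band rates: midpoints of the ranges above, 20% picked for "under 30%".
BAND_RATES = {"transactional": 0.70, "mixed": 0.40, "complex": 0.20}


def blended_containment(mix, rates=BAND_RATES):
    """mix maps band name to its share of call volume; shares must sum to 1."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix shares must sum to 1")
    return sum(share * rates[band] for band, share in mix.items())


# Two hypothetical call mixes handled by the same system.
transactional_heavy = {"transactional": 0.80, "mixed": 0.15, "complex": 0.05}
complex_enterprise = {"transactional": 0.20, "mixed": 0.40, "complex": 0.40}

print(f"transactional-heavy mix: {blended_containment(transactional_heavy):.0%}")
print(f"complex enterprise mix:  {blended_containment(complex_enterprise):.0%}")
```

Under these assumptions the two mixes blend to 63% and 38% with identical per-band performance, which is the arithmetic behind modelling each band separately rather than chasing a single blended target.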

What to do when the vendor and the finance team disagree

They will. The vendor will quote the headline; finance will demand net containment with re-contact deducted. The right answer is to publish both, separately and side by side, with a one-line explanation of the methodology gap. Suppressing one or the other invites a longer audit later. Publishing both forces a productive conversation about which number the business is actually being asked to plan against.

15–30pp: typical gap between vendor-headline containment and a defensibly measured rate. Source: practitioner observations, 2024–2026.

Key takeaways

Containment rate is the most-cited and most loosely defined metric in voice AI procurement.
Vendor headlines and defensibly measured rates typically differ by 15–30 percentage points.
A defensible measurement requires evidence of resolution and subtracts 7-day re-contact for the same intent.
Transactional intents commonly reach 60–80%; blended figures on complex enterprise call mixes more often sit at 25–45%.
Compare against your own baseline, not a blended vendor average.

Frequently asked questions

What is a good voice AI containment rate? There is no single good number; it depends entirely on intent mix. A blended 35% on a complex enterprise call mix can be a stronger result than a blended 65% on a transactional mix. Compare against your own baseline, not a vendor average.
How is containment rate different from deflection rate? The terms are used interchangeably by most vendors. Where they differ, deflection usually refers to calls prevented from reaching the queue at all, and containment to calls that entered the AI flow and were not escalated. Always check the specific definition in a vendor proposal.
Should re-contact be subtracted from containment? Yes, for any internal measurement. A call that is "contained" today but returns tomorrow for the same intent has not been resolved; counting it as containment overstates the system's effectiveness.
Why are vendor containment rates often higher than measured? Most vendor figures exclude short abandons, out-of-hours calls, and out-of-scope intents, and do not subtract re-contact. Each adjustment is individually defensible; together they typically lift the reported figure by 15–30 percentage points.

Terms used in this guide

Containment rate: the percentage of calls the automation finished on its own.
Autonomous resolution rate: containment rate that survives re-contact.
Intent recognition: figuring out what the caller actually wants.

Last reviewed: 2026-06-15. This guide is updated when production patterns shift; see the corrections page to flag anything that no longer matches reality.

About the author

Lewis Crook

Practitioner writer on enterprise voice AI

Lewis Crook: 20 years in enterprise technology, from FTSE 100 voice deployments to over a million AI-handled minutes a month across Asia-Pacific. Buyer, builder, and now working with CX leaders on enterprise voice AI. Writes The Voice AI Brief.
