Voice AI vs legacy IVR: the honest unit economics

  • CX directors
  • VP / COO
  • Heads of Ops
By Lewis Crook
Bottom line up front

Voice AI is cheaper than a live agent and more expensive than a touch-tone IVR. The real question is not cost per call but cost per resolved call — and most vendor ROI models quietly assume containment rates that production deployments do not hit.

What does a voice AI call actually cost?

A voice AI call carries four cost lines: speech-to-text, large language model inference, text-to-speech, and telephony. The first three are costs an IVR does not carry at all. As of 2026, fully loaded per-minute costs for an enterprise-grade stack typically sit in the range of a few cents to low double-digit cents per minute, depending on model choice and how aggressively prompts and retrieval are cached.

A legacy IVR, by contrast, has effectively zero variable cost per call once licensed — its expense is the maintenance burden on the team that owns the call flows.

On a like-for-like minute basis, voice AI is therefore more expensive than IVR and roughly an order of magnitude cheaper than a live agent. That is the easy part of the comparison.

Why cost per call is the wrong unit

The number that actually matters is cost per resolved call — total platform and telephony spend divided by the number of calls the system handled end-to-end without escalating to a human.

This changes the comparison. A voice AI deployment with a 25% containment rate at $0.30 per call has a cost per resolved call of $1.20 ($0.30 ÷ 0.25). The same deployment at 55% containment has a cost per resolved call of about $0.55 ($0.30 ÷ 0.55). Vendor ROI models almost always quote the second number; production deployments often deliver something closer to the first.
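The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a full cost model: the function name is ours, and it assumes a single blended cost per call that does not itself vary with containment.

```python
def cost_per_resolved_call(cost_per_call: float, containment_rate: float) -> float:
    """Spend per call divided by the share of calls resolved without escalation."""
    if not 0 < containment_rate <= 1:
        raise ValueError("containment_rate must be in (0, 1]")
    return cost_per_call / containment_rate


# The two containment rates used in the text, at $0.30 per call.
for rate in (0.25, 0.55):
    print(f"{rate:.0%} containment: ${cost_per_resolved_call(0.30, rate):.2f} per resolved call")
```

Running the same function over your own measured containment rate, rather than the one on the vendor slide, is the quickest sanity check on any quoted figure.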

What does a real-world unit-economics model look like?

A defensible model has at least four inputs that vendor decks usually compress into one: measured containment rate, average handle time on contained calls, average handle time on escalated calls (which is often longer than a non-AI call because the customer has already explained the problem once), and the fully-loaded cost of the human agent who picks up the escalation.

  • Measured containment rate from a representative call sample, not a curated demo set
  • Average handle time on contained vs escalated calls
  • Re-contact rate — calls that come back within 7 days for the same issue
  • Fully loaded agent cost including supervision, QA, and attrition
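One way to fold the re-contact input into the containment figure is sketched below. The multiplicative form is an assumption on our part (it treats the re-contact rate as measured on contained calls only), and the 32% and 10% inputs are hypothetical.

```python
def autonomous_resolution_rate(containment_rate: float, recontact_rate: float) -> float:
    """Containment that survives re-contact.

    recontact_rate is the share of contained calls that come back within
    7 days for the same issue (assumed to be measured on contained calls only).
    """
    if not (0 <= containment_rate <= 1 and 0 <= recontact_rate <= 1):
        raise ValueError("rates must be between 0 and 1")
    return containment_rate * (1 - recontact_rate)


# Hypothetical inputs: 32% measured containment, 10% of contained calls return.
print(f"{autonomous_resolution_rate(0.32, 0.10):.1%}")
```

If the re-contact rate is measured across all calls rather than contained calls only, the adjustment needs a different denominator, so it is worth confirming how the figure was collected before using it.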

Where the real ROI usually shows up

In most enterprise call centre automation programmes, the largest economic lever is not labour replacement on the contained portion — it is reducing average handle time and re-contact rate on the calls that still escalate. A voice AI that captures intent, identity, and verification before transfer can take 30–90 seconds off a human-handled call. At enterprise volumes, that line item often exceeds the savings from containment.
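To get a feel for the size of that line item, the sketch below prices a 30-second and a 90-second reduction. The function name is ours; the 680,000 escalated calls and £42 an hour agent cost are borrowed from the worked example later in this piece purely for illustration.

```python
def handle_time_saving(escalated_calls: int, seconds_saved: float,
                       agent_cost_per_hour: float) -> float:
    """Annual agent cost avoided by shortening each escalated call."""
    hours_saved = escalated_calls * seconds_saved / 3600
    return hours_saved * agent_cost_per_hour


# Illustrative inputs only: 680,000 escalated calls a year, GBP 42 per agent hour.
for seconds in (30, 90):
    saving = handle_time_saving(680_000, seconds, 42.0)
    print(f"{seconds}s saved per call: £{saving:,.0f} a year")
```

This is a gross figure. The AI minutes spent capturing intent before transfer still have to be netted off against it.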

Note on terminology

This piece uses UK spelling for technical terms ("call centre automation"). Most US readers know the same category as call center automation; the unit economics do not change.

A worked example: 1,000,000 inbound calls a year

Take a UK insurer with a million inbound calls a year, an average handle time of 7 minutes, and a fully loaded agent cost of £42 an hour. The pre-AI cost base is roughly £4.9 million in direct handling, before overhead.

Drop a voice AI in front of the queue with a measured net containment of 32%, a realistic blended figure for a mid-complexity claims and servicing mix in the first year. Set the AI at £0.16 per minute, including telephony. The 320,000 contained calls average 3.5 minutes on the AI, costing roughly £179,000. The 680,000 escalated calls each pay 75 seconds of AI capture before transfer, then 6 minutes of agent time. Pre-transfer AI cost lands at about £136,000 and the post-transfer agent time at about £2.86 million.

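Total: £3.17 million, a saving of roughly £1.7 million. Splitting that saving by source: the contained calls avoid about £1.57 million of agent time (320,000 calls at 7 minutes) at an AI cost of £179,000, a net £1.39 million. The escalated calls each save one minute of agent time, about £476,000 in total, at a pre-transfer AI cost of £136,000, a net £340,000. At 32% containment the contained calls still carry most of the saving; the handle-time line gains weight as containment falls or as more time comes off each escalated call.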
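The worked example can be reproduced with a short script, which makes it easier to swap in your own inputs. This is a sketch under the stated assumptions: the `Scenario` class and `cost_lines` function are names introduced here, and the model ignores overhead, re-contact, and operating-model cost.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    calls: int                      # annual inbound calls
    containment: float              # measured net containment (0 to 1)
    baseline_aht_min: float         # pre-AI average handle time, minutes
    agent_cost_per_hour: float      # fully loaded, GBP
    ai_cost_per_min: float          # including telephony, GBP
    contained_ai_min: float         # AI minutes on a contained call
    pre_transfer_ai_min: float      # AI minutes before an escalation
    post_transfer_agent_min: float  # agent minutes after an escalation


def cost_lines(s: Scenario) -> dict[str, float]:
    """Return the pre-AI baseline and the three post-AI cost lines."""
    contained = round(s.calls * s.containment)
    escalated = s.calls - contained
    agent_per_min = s.agent_cost_per_hour / 60
    return {
        "baseline_agent": s.calls * s.baseline_aht_min * agent_per_min,
        "contained_ai": contained * s.contained_ai_min * s.ai_cost_per_min,
        "pre_transfer_ai": escalated * s.pre_transfer_ai_min * s.ai_cost_per_min,
        "post_transfer_agent": escalated * s.post_transfer_agent_min * agent_per_min,
    }


# Inputs taken from the insurer example in the text.
insurer = Scenario(1_000_000, 0.32, 7.0, 42.0, 0.16, 3.5, 1.25, 6.0)
lines = cost_lines(insurer)
post_ai_total = (lines["contained_ai"] + lines["pre_transfer_ai"]
                 + lines["post_transfer_agent"])
saving = lines["baseline_agent"] - post_ai_total

for name, value in lines.items():
    print(f"{name:>20}: £{value:,.0f}")
print(f"{'post_ai_total':>20}: £{post_ai_total:,.0f}")
print(f"{'saving':>20}: £{saving:,.0f}")
```

Keeping the three post-AI lines separate, rather than returning a single total, makes it easy to see which line moves when containment or handle time changes.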

The four numbers vendor decks usually omit

A defensible model includes four lines that vendor slides usually skip. Without them, the case overstates savings by 30 to 60%.

  • Pre-transfer AI minutes on escalated calls — the AI does not stop charging the moment it hands off
  • Post-transfer handle-time penalty when the agent has to re-anchor a customer who has already explained the issue once
  • Re-contact within 7 days for the same intent, charged to the channel that produced the re-contact, not the one that received it
  • Operating-model cost — conversation owner, platform owner, and observability tooling, typically £150k–£400k a year for a single high-volume deployment

Pricing models and what each one optimises for

Three pricing models dominate the enterprise market and each pushes behaviour in a different direction. Per-minute pricing is simple and transfers no risk; it rewards short calls regardless of resolution. Per-resolution pricing transfers containment risk to the vendor and aligns incentives but forces an agreement on what counts as resolved before the contract is signed. Platform pricing decouples cost from volume entirely and tends to favour buyers running at predictable scale.

Convert all three to cost per resolved call before comparing, and stress-test each at 0.5x and 2x your modelled containment. Per-minute almost always looks cheapest at the assumed rate and worst at half of it. Per-resolution is the inverse. The right answer is rarely the headline.
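A stress test of this kind might look like the sketch below. Every price in it is a hypothetical placeholder rather than a market quote, and it makes one simplifying assumption: the AI minutes per call stay constant as containment changes.

```python
# All prices below are hypothetical placeholders, not market quotes.
CALLS = 1_000_000            # annual inbound calls
MODELLED_CONTAINMENT = 0.32  # containment assumed in the business case
AI_MIN_PER_CALL = 2.0        # simplification: same AI minutes on every call
PRICE_PER_MIN = 0.16         # per-minute pricing, GBP
PRICE_PER_RESOLUTION = 1.20  # per-resolution pricing, GBP per resolved call
PLATFORM_FEE = 400_000       # platform pricing, GBP per year


def cost_per_resolved(pricing: str, containment: float) -> float:
    """Annual spend under one pricing model divided by resolved calls."""
    resolved = CALLS * containment
    if pricing == "per_minute":
        spend = CALLS * AI_MIN_PER_CALL * PRICE_PER_MIN
    elif pricing == "per_resolution":
        spend = resolved * PRICE_PER_RESOLUTION
    elif pricing == "platform":
        spend = PLATFORM_FEE
    else:
        raise ValueError(f"unknown pricing model: {pricing}")
    return spend / resolved


# Stress-test at half, exactly, and double the modelled containment.
for multiplier in (0.5, 1.0, 2.0):
    containment = min(1.0, MODELLED_CONTAINMENT * multiplier)
    row = {p: round(cost_per_resolved(p, containment), 2)
           for p in ("per_minute", "per_resolution", "platform")}
    print(f"{multiplier}x containment ({containment:.0%}): {row}")
```

With these made-up prices, per-minute is cheapest at the modelled rate and per-resolution is cheapest at half of it. Different prices will reorder the table, which is the reason to run the test with the numbers in your own contract.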

Where the case quietly breaks

Three patterns flip a positive business case negative inside the first year of production, and none of them are visible during a proof of value.

First, intent mix shifts as marketing campaigns or product launches push new call drivers into the queue. A voice AI tuned on last quarter's mix can lose 10 to 15 points of containment overnight. Second, integration latency degrades as the systems of record evolve; calls that used to resolve in 90 seconds drift to 130 and customers start escalating themselves out of the AI flow. Third, the operating model never gets staffed — the conversation owner role is left to a transformation team that loses interest by quarter two, and drift goes unaddressed.

Questions to ask before the financial case is signed

Five questions usually expose whether a business case will survive contact with production. Ask all five before approving.

  • Show me the call sample the containment rate in this model was measured on, and how it maps to my actual intent mix.
  • What is the assumed re-contact rate within 7 days, and what evidence supports it?
  • What is the pre-transfer AI minute charge on escalated calls assumed to be?
  • What handle-time penalty on escalated calls is built in for the customer re-explaining the issue?
  • What is the named operating-model cost, and which team owns it after go-live?

A regulated-industry variant: UK retail bank, FCA-leaning

The numbers above are for an insurer; the shape changes for a UK retail bank under the FCA Consumer Duty regime. Take a £2.5bn-revenue retail bank with 800,000 inbound calls a year, a more conservative 22% net containment in year one (regulated intents are harder), and an agent cost of £36 an hour fully loaded. Pre-AI direct handling lands at roughly £3.6 million, which implies an average handle time of about 7.5 minutes.

Post-AI, at the same £0.16 per AI minute as the insurer example: 176,000 contained calls at 3 minutes of AI time cost ~£85,000. The 624,000 escalated calls each pay 90 seconds of AI before transfer and 5.5 minutes of agent time afterwards. Pre-transfer AI lands at ~£150,000 and post-transfer agent time at ~£2.06 million. Total: £2.30 million, a £1.3 million saving.

But the FCA variant also carries three lines the insurer model does not: vulnerable-customer detection telemetry and human-handover audit (~£80k/year of conversation-owner time), a Consumer Duty fair-value evidence pack (~£40k/year of analyst time), and a DPIA refresh every twelve months (~£15k/year). Together these come to about £135,000 a year, so the defensible saving is closer to £1.16 million: still material, but roughly 10% below the bank's own £1.3 million headline.
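The bank variant can be checked the same way. The sketch below assumes the AI rate is the same £0.16 per minute used for the insurer (the ~£85,000 figure is consistent with that rate) and takes the £3.6 million baseline as given. Because it works from unrounded inputs, its outputs differ from the rounded figures in the text by up to about £10,000.

```python
CALLS = 800_000
CONTAINMENT = 0.22
AGENT_PER_MIN = 36 / 60   # GBP 36 an hour, fully loaded
AI_PER_MIN = 0.16         # assumption: same AI rate as the insurer example
BASELINE = 3_600_000      # pre-AI direct handling, as stated in the text

contained = round(CALLS * CONTAINMENT)
escalated = CALLS - contained

contained_ai = contained * 3.0 * AI_PER_MIN            # 3 AI minutes per contained call
pre_transfer_ai = escalated * 1.5 * AI_PER_MIN         # 90 seconds of AI before transfer
post_transfer_agent = escalated * 5.5 * AGENT_PER_MIN  # 5.5 agent minutes after transfer
post_ai_total = contained_ai + pre_transfer_ai + post_transfer_agent

# Annual regulatory lines named in the text, GBP.
compliance = 80_000 + 40_000 + 15_000

headline_saving = BASELINE - post_ai_total
defensible_saving = headline_saving - compliance

print(f"Post-AI total:      £{post_ai_total:,.0f}")
print(f"Headline saving:    £{headline_saving:,.0f}")
print(f"Compliance lines:   £{compliance:,.0f}")
print(f"Defensible saving:  £{defensible_saving:,.0f}")
print(f"Share of headline absorbed by compliance: {compliance / headline_saving:.1%}")
```

The compliance lines are fixed annual costs, so their share of the saving rises if containment comes in below 22%.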

20–60%: range of containment rates reported by enterprise voice AI deployments. Source: aggregated vendor case studies and analyst reports, 2024–2026.
30–90s: typical handle-time reduction on escalated calls when AI captures intent before transfer. Source: practitioner observations across multiple deployments.

Key takeaways
  • Voice AI is roughly an order of magnitude cheaper per call than a live agent, and more expensive per minute than an IVR.
  • The right unit is cost per resolved call, not cost per call or per minute.
  • Vendor ROI models quietly assume containment rates production rarely hits — model your own measured rate.
  • The largest economic lever is usually AHT reduction on calls that still escalate, not labour replacement on contained calls.
  • Subtract 7-day re-contact for the same intent — a contained call that returns has not been resolved.

Frequently asked questions

Is voice AI cheaper than a live agent?
On a per-call basis, yes — fully loaded voice AI costs are roughly an order of magnitude lower than a live human agent. The relevant comparison, however, is cost per resolved call, which depends on the measured containment rate of the specific deployment.
Is voice AI cheaper than an IVR?
No. A legacy IVR has effectively zero variable cost per call. Voice AI adds speech-to-text, LLM, and text-to-speech costs that an IVR does not carry. Voice AI wins on resolution rate and customer experience, not on raw per-minute cost.
What is a realistic containment rate to model?
Containment varies sharply by use case. Account-balance and order-status calls regularly reach 60–80% containment; complex billing or claims calls more often sit in the 15–35% band in production. Model your specific intent mix, not a blended vendor average.
What is usually missing from vendor ROI models?
Three things: measured rather than projected containment, the handle-time penalty on escalated calls when the customer has to re-explain, and re-contact rate within 7 days. All three move the cost per resolved call materially.

Terms used in this guide

  • Voice AI: software that answers the phone, understands what the caller wants, and takes action; not just a smarter IVR.
  • Containment rate: the percentage of calls the automation finished on its own.
  • Autonomous resolution rate: containment rate that survives re-contact.
  • IVR replacement: swapping menus and keypad input for natural conversation and actual resolution.
Last reviewed: 2026-06-15. This guide is updated when production patterns shift; see the corrections page to flag anything that no longer matches reality.
About the author
Lewis Crook
Practitioner writer on enterprise voice AI

Lewis Crook: 20 years in enterprise technology, from FTSE 100 voice deployments to over a million AI-handled minutes a month across Asia-Pacific. Buyer, builder, and now working with CX leaders on enterprise voice AI. Writes The Voice AI Brief.
