Voice AI pricing models: per-minute, per-resolution, and platform compared

Procurement / IT-Sec
VP / COO
CX directors

By Lewis Crook
Published June 15, 2026

Bottom line up front

Three pricing models dominate enterprise voice AI: per-minute, per-resolution, and platform. Each transfers risk differently. Convert all three to cost per resolved call before comparing; the model with the lowest headline rate is rarely the cheapest once you do.

Per-minute pricing

Per-minute is the simplest model and the easiest to compare on a rate card. It transfers no containment risk to the vendor — the buyer pays for every minute the AI is on the line regardless of outcome.

Typical 2026 enterprise per-minute pricing for a fully managed voice AI stack lands in the range of $0.10 to $0.35 per minute, with model choice and telephony bundling driving the largest swings. Self-hosted or build-it-yourself stacks routinely run lower per minute but higher on total cost of ownership.

Per-resolution pricing

Per-resolution pricing charges only for calls the AI resolves end-to-end, with a written definition of 'resolved' attached to the contract. It transfers containment risk to the vendor — which both aligns incentives and forces the contract to define resolution before signing.

Typical 2026 enterprise per-resolution pricing lands in the range of $1.00 to $4.00 per resolved call, depending on call complexity and what 'resolved' includes. The model is most defensible when the buyer is early in its voice-AI journey and the containment rate is genuinely uncertain.

Platform pricing

Platform pricing decouples cost from per-call volume. The buyer pays a fixed annual platform fee plus a usage component that is small relative to the platform fee. The model favours buyers running at predictable, high volume and tends to disadvantage low-volume or seasonal deployments.
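A break-even calculation is one way to make "predictable, high volume" concrete. The sketch below is illustrative, not a vendor formula: it assumes the platform usage component is billed on every AI minute and that per-minute pricing carries no fixed fee. The function name is hypothetical, and the example figures (£750k fee, £0.04 usage, £0.14 per minute) are borrowed from the worked GBP example later in this guide.

```python
def break_even_minutes(platform_fee: float,
                       platform_usage_rate: float,
                       per_minute_rate: float) -> float:
    """Annual AI minutes above which platform pricing undercuts per-minute.

    Assumed cost shapes (illustrative only):
      platform   = platform_fee + platform_usage_rate * minutes
      per-minute = per_minute_rate * minutes
    """
    if per_minute_rate <= platform_usage_rate:
        raise ValueError("platform usage rate must be below the per-minute rate")
    return platform_fee / (per_minute_rate - platform_usage_rate)


minutes = break_even_minutes(platform_fee=750_000,
                             platform_usage_rate=0.04,
                             per_minute_rate=0.14)
print(f"Break-even at about {minutes:,.0f} AI minutes a year")
```

On those figures the break-even sits at roughly 7.5 million AI minutes a year, above the 6 million minutes in the worked example, which is consistent with per-minute (~£840k) undercutting platform (~£990k) there.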

Cost per resolved call — the only fair comparison

Convert each model to cost per resolved call using your modelled containment rate, then stress-test at 0.5x and 2x that rate. Three things almost always come out of that exercise: per-minute looks best at the assumed containment and worst at half of it; per-resolution is the inverse; platform pricing wins decisively at high volume and loses badly at low volume. The right answer is almost never the headline.
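The conversion and stress test can be sketched in a few lines. This is a simplified model under stated assumptions: per-minute spend does not change with containment, per-resolution has no floor, and platform usage is billed on all AI minutes. All names and input figures are hypothetical and chosen only to show the ranking moving.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    per_minute: float        # price per AI minute
    per_resolution: float    # price per resolved call
    platform_fee: float      # fixed annual platform fee
    platform_usage: float    # platform usage price per AI minute


def cost_per_resolved_call(calls: int, ai_minutes_per_call: float,
                           containment: float, rates: Rates) -> dict:
    """Vendor spend divided by resolved calls, for each pricing model."""
    resolved = calls * containment
    minutes = calls * ai_minutes_per_call
    spend = {
        # Per-minute: every AI minute is billed, whatever the outcome.
        "per_minute": minutes * rates.per_minute,
        # Per-resolution: only resolved calls are billed (no floor here).
        "per_resolution": resolved * rates.per_resolution,
        # Platform: fixed fee plus a usage charge on all AI minutes.
        "platform": rates.platform_fee + minutes * rates.platform_usage,
    }
    return {model: total / resolved for model, total in spend.items()}


def stress_test(calls, ai_minutes_per_call, containment, rates,
                multipliers=(0.5, 1.0, 2.0)):
    """Re-run the conversion at 0.5x, 1x and 2x the modelled containment."""
    results = {}
    for m in multipliers:
        rate = min(containment * m, 1.0)  # containment cannot exceed 100%
        results[rate] = cost_per_resolved_call(
            calls, ai_minutes_per_call, rate, rates)
    return results


# Hypothetical inputs, not taken from any vendor proposal.
rates = Rates(per_minute=0.15, per_resolution=2.50,
              platform_fee=50_000, platform_usage=0.03)
table = stress_test(calls=100_000, ai_minutes_per_call=4,
                    containment=0.30, rates=rates)
for rate, costs in table.items():
    cheapest = min(costs, key=costs.get)
    summary = ", ".join(f"{k} {v:.2f}" for k, v in costs.items())
    print(f"containment {rate:.0%}: {summary} -> cheapest: {cheapest}")
```

On these made-up inputs per-minute is cheapest at the modelled 30% containment and per-resolution is cheapest at 15%. Different inputs will order the models differently, which is the point of running the exercise on your own numbers.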

Hidden costs that move the answer

Five line items move the comparison materially and are almost never on the rate card: telephony pass-through, model API charges if not bundled, integration / connector fees, professional services for the operating model, and the change order for any custom voice or guardrail work. A 30 to 60% uplift on the headline number is normal once these are included.

A vendor-proposal teardown, line by line

Most enterprise voice AI proposals collapse onto a single per-minute or per-resolution headline. The defensible read is to expand that headline into the eight cost lines that actually show up on the first invoice and the four that show up later. Doing this once across three proposals usually re-orders the shortlist.

Headline rate — per-minute, per-resolution, or platform; check what is included and what is bundled separately
Telephony pass-through — SIP termination, inbound DIDs, geographic surcharges; often a 10–25% uplift
Model API charges — sometimes bundled into the headline, often passed through with a margin; ask whether you can bring your own model contract
Speech-to-text and text-to-speech — usually bundled, sometimes itemised; voice cloning typically extra
Recording and storage — retention beyond a default window is almost always a chargeable line
Integration connectors — generic connectors usually free, named systems of record sometimes a per-connector fee
Professional services — implementation, conversation design, prompt engineering; budgeted at 15–35% of year-one platform spend in practice
Change orders — the line item that catches everybody; tariff for net-new intents, prompt redesigns, and custom voices
Volume floor — the minimum committed minutes or resolutions; rarely advertised, almost always in the contract
Annual uplift — most contracts now embed a 3–7% annual price uplift; negotiate it down at signature, not at renewal
Egress and audit access — exporting transcripts or recordings in bulk for QA or audit sometimes carries an API charge
Exit fee — data return and provable destruction at contract end; non-zero, sometimes punitive
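The teardown above can be turned into a small model that expands a headline into cost lines and then compounds the annual uplift over the contract term. This is a sketch with hypothetical amounts: the percentages are picked from inside the ranges quoted in the list, the headline is treated as the platform spend that services are budgeted against, and professional services are assumed to be a one-off. Real proposals may structure any of these differently.

```python
def year_one_lines(headline: float, telephony_uplift: float,
                   services_share: float, other_lines: dict) -> dict:
    """Expand a headline annual figure into itemised year-one cost lines."""
    lines = {
        "headline": headline,
        "telephony_pass_through": headline * telephony_uplift,
        "professional_services": headline * services_share,
    }
    lines.update(other_lines)
    return lines


def recurring_by_year(recurring_year_one: float, annual_uplift: float,
                      years: int) -> list:
    """Recurring spend per contract year with a compounding annual uplift."""
    return [recurring_year_one * (1 + annual_uplift) ** y for y in range(years)]


# Hypothetical proposal figures.
lines = year_one_lines(
    headline=400_000,
    telephony_uplift=0.15,   # inside the 10-25% range quoted above
    services_share=0.25,     # inside the 15-35% range quoted above
    other_lines={"recording_storage": 12_000, "connectors": 8_000},
)
year_one_total = sum(lines.values())

# Assumption: professional services are one-off; everything else recurs.
recurring = year_one_total - lines["professional_services"]
three_years = recurring_by_year(recurring, annual_uplift=0.05, years=3)

print(f"Year-one total: {year_one_total:,.0f}")
print("Recurring by year:", [f"{v:,.0f}" for v in three_years])
```

Running the same function across each shortlisted proposal, with that vendor's own line items filled in, gives a like-for-like total rather than a comparison of headlines.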

A worked GBP example: 600,000 calls a year

Take a UK contact centre with 600,000 inbound calls a year, average pre-AI handle time of 6 minutes, and a fully loaded agent cost of £38 an hour. Pre-AI direct handling cost is roughly £2.28 million.

Run the same volume against three pricing models at 35% measured net containment, which is 210,000 contained calls. Per-minute at £0.14/minute lands at a vendor cost of ~£840k (assuming 10 minutes of AI exposure per call across contained and pre-transfer, or 6 million AI minutes a year). Per-resolution at £1.80/resolved-call lands at ~£378k for the 210,000 contained calls plus a per-minute floor on escalated calls (~£294k), for a total of ~£672k. Platform pricing at £750k flat plus £0.04/minute usage on the same 6 million minutes (~£240k) lands at ~£990k.
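The figures above can be reproduced with a few lines of arithmetic. The sketch below makes two assumptions explicit: the ~£294k floor on escalated calls is taken as stated, because the text does not give its derivation, and the £0.04 platform component is applied to all 6 million AI minutes, which is the reading that reproduces ~£990k. Variable names are illustrative.

```python
CALLS = 600_000
HANDLE_TIME_MIN = 6          # pre-AI average handle time
AGENT_COST_PER_HOUR = 38.0   # fully loaded, GBP
CONTAINMENT = 0.35
AI_MIN_PER_CALL = 10         # AI exposure per call, as assumed in the text

# Pre-AI direct handling cost: calls x hours per call x hourly cost.
baseline = CALLS * HANDLE_TIME_MIN / 60 * AGENT_COST_PER_HOUR

resolved = CALLS * CONTAINMENT
ai_minutes = CALLS * AI_MIN_PER_CALL

spend = {
    "per_minute": ai_minutes * 0.14,
    # Resolution fees plus the stated floor on escalated calls (taken as given).
    "per_resolution": resolved * 1.80 + 294_000,
    # Fixed fee plus usage on all AI minutes (an inference from the ~990k total).
    "platform": 750_000 + ai_minutes * 0.04,
}

print(f"Pre-AI baseline: £{baseline:,.0f}")
for model, total in spend.items():
    print(f"{model}: £{total:,.0f} total, "
          f"£{total / resolved:.2f} per resolved call")
```

Expressed per resolved call, the three models come out at about £4.00, £3.20 and £4.71 respectively at 35% containment. Re-running at other containment rates needs an assumption about how AI exposure and the floor scale, which the example does not state.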

On these inputs per-resolution is cheapest at the modelled containment. Drop containment to 18% and per-minute wins on cost but per-resolution still wins on risk transfer. Push containment to 50% and platform pricing pulls ahead. The right answer changes with the containment assumption; pretending it doesn't is the most common pricing-model mistake.

Hybrid models that actually balance risk

Three hybrid patterns now appear in most enterprise proposals; each shifts risk in a different direction. Choose the one that matches what you actually need protected.

Per-resolution with a per-minute floor — protects the vendor against you running a low-containment deployment forever; appropriate when both sides believe containment will land above the floor
Platform with per-resolution overage — buyer absorbs base risk in exchange for capped marginal cost above forecast; appropriate at high volume with low variance
Per-minute with a containment SLA — vendor refunds a percentage of fees if measured net containment drops below an agreed band; rare, but the most aligned of the three
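Each hybrid can be written as a small billing function, which makes it easier to see where the risk sits. These are sketches under assumptions, not contract language: the floor is modelled here as a minimum bill (some contracts instead add a per-minute charge on escalated calls only), the overage applies above an included number of resolutions, and the SLA refund is a flat share of fees. All function and parameter names are hypothetical.

```python
def per_resolution_with_floor(resolved_calls: int, price_per_resolution: float,
                              ai_minutes: float, floor_rate_per_minute: float) -> float:
    """Buyer pays resolution fees, but never less than the per-minute floor."""
    resolution_bill = resolved_calls * price_per_resolution
    floor_bill = ai_minutes * floor_rate_per_minute
    return max(resolution_bill, floor_bill)


def platform_with_overage(resolved_calls: int, platform_fee: float,
                          included_resolutions: int, overage_price: float) -> float:
    """Fixed fee covers a forecast volume; extra resolutions are billed each."""
    extra = max(0, resolved_calls - included_resolutions)
    return platform_fee + extra * overage_price


def per_minute_with_containment_sla(ai_minutes: float, rate: float,
                                    measured_containment: float,
                                    sla_floor: float, refund_share: float) -> float:
    """Per-minute fees, partly refunded if containment misses the agreed band."""
    fees = ai_minutes * rate
    if measured_containment < sla_floor:
        fees *= (1 - refund_share)
    return fees


# Hypothetical usage: the floor binds only when containment is low.
print(per_resolution_with_floor(200_000, 1.80, 3_000_000, 0.10))
print(per_resolution_with_floor(100_000, 1.80, 3_000_000, 0.10))
print(platform_with_overage(260_000, 750_000, 250_000, 1.00))
print(per_minute_with_containment_sla(1_000_000, 0.14, 0.30, 0.35, 0.10))
```

In the first two calls the bill is driven by resolutions when containment is healthy and by the floor when it is not, which is the vendor protection the first pattern describes.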

The five negotiation points worth the most

Across the proposals practitioners see, the same five negotiation points consistently move year-one spend by 15–30%. Each is easier to negotiate before signature than after.

Annual uplift cap — anchored to a published index (CPI), not to vendor list
Definition of 'resolved' — written into the contract, with worked examples and a stated re-contact window
Volume floor — proportional to your forecast confidence, not the vendor's pipeline target
Professional services — fixed-price phases against named milestones, not time-and-materials
Exit terms — data export format, timeline, and proof-of-destruction; the cheapest leverage you will negotiate
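For the second point, a written definition of 'resolved' is easier to negotiate when both sides can compute it from call records. The sketch below shows one possible rule: a contained call counts as resolved only if the same caller does not return with the same intent inside the re-contact window. The record fields, the 7-day window and the function names are assumptions for illustration, not a standard definition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Call:
    caller: str
    intent: str
    started: datetime
    contained: bool  # the AI finished the call without a human


def containment_rate(calls: list) -> float:
    """Share of calls the automation finished on its own."""
    return sum(c.contained for c in calls) / len(calls)


def resolution_rate_net_of_recontact(calls: list,
                                     window: timedelta = timedelta(days=7)) -> float:
    """Share of calls contained AND not followed by a same-intent re-contact."""
    ordered = sorted(calls, key=lambda c: c.started)
    survived = 0
    for i, call in enumerate(ordered):
        if not call.contained:
            continue
        recontacted = any(
            later.caller == call.caller
            and later.intent == call.intent
            and later.started - call.started <= window
            for later in ordered[i + 1:]
        )
        if not recontacted:
            survived += 1
    return survived / len(calls)


# Hypothetical call log.
log = [
    Call("A", "billing", datetime(2026, 1, 1), True),
    Call("A", "billing", datetime(2026, 1, 3), True),   # re-contact of the first
    Call("B", "address", datetime(2026, 1, 2), True),
    Call("C", "billing", datetime(2026, 1, 2), False),  # escalated to a human
]
print(containment_rate(log))
print(resolution_rate_net_of_recontact(log))
```

In this toy log three of four calls are contained, but only two survive the re-contact test, so the billable resolution rate is 50% rather than 75%. The gap between those two numbers is what the contract definition decides.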

$0.10–$0.35: Typical 2026 enterprise per-minute pricing for a managed voice AI stack. Source: Aggregated vendor proposals, 2024–2026.

$1.00–$4.00: Typical 2026 enterprise per-resolution pricing band. Source: Aggregated vendor proposals, 2024–2026.

30–60%: Typical uplift from headline rate to fully-loaded cost. Source: Practitioner observations across multiple enterprise programmes.

Key takeaways

Three models dominate: per-minute, per-resolution, and platform.
Per-minute transfers no containment risk; per-resolution transfers it to the vendor; platform decouples cost from per-call volume.
Convert all three to cost per resolved call at your modelled containment, then stress-test at 0.5x and 2x.
Five hidden costs (telephony, model API, integration, services, change orders) routinely add 30–60% to the headline.
Operating-model labour — usually 0.5–1.0 FTE — is the most under-modelled cost in vendor proposals.

Frequently asked questions

Which voice AI pricing model is cheapest? None universally. Per-minute is cheapest at high measured containment, per-resolution is cheapest at uncertain or low containment, and platform pricing is cheapest at high predictable volume. Convert to cost per resolved call before deciding.
How is 'resolved' defined in per-resolution pricing? It varies by contract and is the single most negotiated definition in voice-AI procurement. The defensible definition requires evidence of resolution (a fulfilled action or explicit caller confirmation) and subtracts re-contact within a defined window for the same intent.
Do vendors negotiate pricing models? Yes, especially for enterprise commitments. Hybrid models (per-resolution with a per-minute floor, or platform with a per-resolution overage) are common in 2026 enterprise contracts.
What is the most under-modelled cost? Operating-model labour. A 0.5 to 1.0 FTE conversation owner plus engineering on-call is rarely in the vendor proposal and is the most common reason real cost diverges from rate-card cost.

Terms used in this guide

Voice AI: Voice AI is software that answers the phone, understands what the caller wants, and takes action; it is not just a smarter IVR.
Containment rate: Containment rate is the percentage of calls the automation finished on its own.
Autonomous resolution rate: Autonomous resolution rate is containment rate that survives re-contact.

Last reviewed: 2026-06-15. This guide is updated when production patterns shift; see the corrections page to flag anything that no longer matches reality.

About the author

Lewis Crook

Practitioner writer on enterprise voice AI

Lewis Crook has 20 years in enterprise technology, from FTSE 100 voice deployments to over a million AI-handled minutes a month across Asia-Pacific. Buyer, builder, and now working with CX leaders on enterprise voice AI. Writes The Voice AI Brief.
