Voice AI platform categories: how to shortlist without a beauty parade
Pick the platform category first. Four categories carry enterprise voice AI today. Each maps to a different operating model and a different risk profile. The vendor shortlist becomes obvious once the category is right.
Hyperscaler conversational platforms
Cloud-native conversational AI bundled with the rest of the cloud stack. Sold to teams already standardised on one hyperscaler.
Best fit:
- Enterprises already deep on one cloud, with procurement aligned to it
- Use cases where audit, residency, and commercial model can live inside an existing master agreement
- Programmes that need ML/data services alongside the voice layer
Poor fit:
- Teams that want a conversation owner outside engineering
- Workloads that need end-to-end latency under 1 second and cannot invest in serious optimisation
- Buyers who care about model choice independent of the hyperscaler's own stack
- Control surface: Engineering-heavy. Strong infrastructure controls; conversation editing typically requires deploys.
- Pricing shape: Per-minute or per-request, often with committed-use discounts via the master agreement.
- Risk profile: Lock-in is the main risk. Migration cost is real because identity, audit, and observability typically rely on the hyperscaler's adjacent services.
Contact-centre-native voice AI
Voice AI shipped as a module of an existing CCaaS (contact centre as a service) or other contact-centre platform. Sold to the contact-centre team rather than the AI team.
Best fit:
- Operations teams whose primary integration surface is already the CCaaS
- Programmes that need queue logic, workforce management, and reporting in one place
- Buyers who weight operational fit over leading-edge model choice
Poor fit:
- Use cases that need deep writes into systems of record outside the CCaaS
- Teams that need model isolation guarantees or bring-your-own-LLM
- Workloads where the underlying CCaaS routing logic is itself the bottleneck
- Control surface: Conversation owner sits inside the contact-centre operations team. Editing is usually approachable; engineering still owns integrations.
- Pricing shape: Per-minute or per-interaction on top of CCaaS seat licensing. Watch the minimum charge (floor) on escalated calls.
- Risk profile: Capped ceiling. Strong out-of-the-box operations, but harder to push beyond the CCaaS's integration boundaries.
Voice-AI-native platforms
Independent platforms whose product is voice AI itself — orchestration, model choice, telephony, observability.
Best fit:
- Programmes that want model choice independent of cloud and CCaaS
- Use cases that need integration depth into multiple systems of record
- Teams that have an opinion on latency, barge-in, and per-call observability
Poor fit:
- Buyers without a sponsor outside the contact centre; these platforms expect product partnership, not procurement-only engagement
- Programmes that need workforce management, queue logic, and full CCaaS reporting bundled in
- Organisations that cannot run a controlled editor in operations without an engineering ticket
- Control surface: Designed for a non-engineer conversation owner with versioned config, diff review, staging, and rollback.
- Pricing shape: Per-minute or per-resolved-call. Per-resolved-call is the more honest commercial model when the platform will accept it.
- Risk profile: Newer balance sheets and shorter track records. The product fit is usually best; the procurement comfort is usually lowest.
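To make the choice between the two pricing shapes concrete, the sketch below compares a per-minute bill with a per-resolved-call bill and computes the resolution rate at which both cost the same. Every function name and number here is a hypothetical placeholder, not a quote from any platform; substitute figures from your own traffic and the candidate's rate card.

```python
def per_minute_cost(calls: int, avg_minutes: float, rate_per_minute: float) -> float:
    """Total bill when every handled minute is charged."""
    return calls * avg_minutes * rate_per_minute


def per_resolved_cost(calls: int, resolution_rate: float, price_per_resolved: float) -> float:
    """Total bill when only resolved calls are charged."""
    return calls * resolution_rate * price_per_resolved


def break_even_resolution_rate(avg_minutes: float, rate_per_minute: float,
                               price_per_resolved: float) -> float:
    """Resolution rate at which both pricing shapes cost the same per call."""
    return (avg_minutes * rate_per_minute) / price_per_resolved


# Hypothetical inputs, for illustration only.
calls = 10_000
avg_minutes = 4.0
rate_per_minute = 0.10
price_per_resolved = 0.80
resolution_rate = 0.40

minute_bill = per_minute_cost(calls, avg_minutes, rate_per_minute)
resolved_bill = per_resolved_cost(calls, resolution_rate, price_per_resolved)
break_even = break_even_resolution_rate(avg_minutes, rate_per_minute, price_per_resolved)

print(f"per-minute bill:       {minute_bill:,.2f}")
print(f"per-resolved bill:     {resolved_bill:,.2f}")
print(f"break-even resolution: {break_even:.0%}")
```

Below the break-even resolution rate the per-resolved-call bill is the smaller of the two, because unresolved calls are not charged; above it, per-minute pricing is cheaper. Running the comparison at the resolution rate you actually measure, rather than the one in the sales deck, shows who carries the cost of unresolved calls.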
Build-your-own on a voice stack
Compose your own stack from ASR (speech recognition), LLM, TTS (speech synthesis), telephony, and orchestration components, most often when one or more of those layers is open-source.
Best fit:
- Teams with platform engineering capacity and a real reason to own the stack (latency, residency, model independence, IP)
- Use cases where the AI is a product differentiator, not an operating expense
- Organisations whose data or threat model makes any vendor dependency in the audio path unacceptable
Poor fit:
- Programmes where the business case is labour cost reduction, full stop
- Teams that do not have an operating model for prompt and intent change separate from code deploys
- Anyone who treats observability and audit as a phase-two problem
- Control surface: Whatever you build. The control surface is itself a meaningful design deliverable, not a free feature.
- Pricing shape: Component cost, usually with lower variable cost at scale and materially higher fixed engineering cost.
- Risk profile: Two failure modes: under-investing in the operating model, and under-investing in observability. Both surface in month four.
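If latency is the reason to own the stack, per-stage timing has to be part of observability from the first call. The sketch below is one minimal way to check a p95 end-to-end budget and find the slowest stage. The stage names, the sample timings, and the 1000 ms budget are all assumptions for illustration, and summing stages assumes they run strictly one after another; a streaming pipeline overlaps stages and needs a different measurement.

```python
STAGES = ("telephony", "asr", "llm", "tts")  # hypothetical stage names


def end_to_end_ms(turn: dict[str, float]) -> float:
    """Sum the stage timings for one conversational turn (sequential stages assumed)."""
    return sum(turn[stage] for stage in STAGES)


def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n)."""
    if not values:
        raise ValueError("need at least one measurement")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def within_budget(turns: list[dict[str, float]], budget_ms: float = 1000.0) -> bool:
    """True when the p95 end-to-end latency fits the budget."""
    return p95([end_to_end_ms(t) for t in turns]) <= budget_ms


def slowest_stage(turns: list[dict[str, float]]) -> str:
    """Stage with the highest p95 latency, i.e. where to optimise first."""
    return max(STAGES, key=lambda s: p95([t[s] for t in turns]))


# Hypothetical measurements in milliseconds, for illustration only.
fast = {"telephony": 80, "asr": 200, "llm": 350, "tts": 150}
slow = {"telephony": 80, "asr": 250, "llm": 900, "tts": 200}
turns = [fast] * 18 + [slow] * 2

print("within budget:", within_budget(turns))
print("slowest stage:", slowest_stage(turns))
```

The point of keeping per-stage numbers rather than a single total is that a missed budget immediately names the component to work on.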
Decision questions that select the category
| Question | What it points to |
|---|---|
| Do you already have a contact-centre platform you cannot displace? | Start with that platform's voice AI module; only widen the shortlist if its integration ceiling blocks the use case. |
| Is your sponsor in engineering or in contact-centre operations? | Engineering sponsor with platform capacity → voice-AI-native or build. Operations sponsor → contact-centre-native first. |
| Is model choice an explicit requirement? | Voice-AI-native or build. Hyperscaler and CCaaS modules constrain model choice by design. |
| Is the business case labour reduction, or product differentiation? | Labour reduction → buy. Product differentiation → build is on the table if the engineering capacity is real. |
| What latency budget do you actually need? | Sub-second p95 under load → voice-AI-native or build; contact-centre-native modules struggle to stay sub-second under realistic load. |
| How will a non-engineer change an intent? | If they cannot, the operating model will collapse in month three. Any category can support this — none does it by default. |
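The table can be read as a small scoring routine. The sketch below is one possible encoding, not a prescribed method: the field names, the `shortlist` function, and the weights (2 for an undisplaceable contact-centre platform, 1 for everything else) are assumptions chosen only to make the table's logic executable.

```python
from dataclasses import dataclass

CATEGORIES = ("hyperscaler", "contact-centre-native", "voice-ai-native", "build")


@dataclass
class Answers:
    # Hypothetical field names mirroring the questions in the table.
    has_fixed_ccaas: bool                 # contact-centre platform you cannot displace
    ccaas_ceiling_blocks_use_case: bool   # its integration ceiling blocks the use case
    sponsor: str                          # "engineering" or "operations"
    has_platform_capacity: bool
    model_choice_required: bool
    business_case: str                    # "labour_reduction" or "differentiation"
    needs_sub_second_p95: bool
    non_engineer_can_edit_intents: bool


def shortlist(a: Answers) -> tuple[list[str], list[str]]:
    """Return (categories ranked by score, warnings). Weights are illustrative."""
    score = {c: 0 for c in CATEGORIES}
    warnings: list[str] = []

    if a.has_fixed_ccaas and not a.ccaas_ceiling_blocks_use_case:
        score["contact-centre-native"] += 2
    if a.sponsor == "engineering" and a.has_platform_capacity:
        score["voice-ai-native"] += 1
        score["build"] += 1
    elif a.sponsor == "operations":
        score["contact-centre-native"] += 1
    if a.model_choice_required:
        score["voice-ai-native"] += 1
        score["build"] += 1
    if a.needs_sub_second_p95:
        score["voice-ai-native"] += 1
        score["build"] += 1
    if a.business_case == "differentiation" and a.has_platform_capacity:
        score["build"] += 1

    # Labour reduction points to buying, so building is taken off the list.
    excluded = {"build"} if a.business_case == "labour_reduction" else set()
    if not a.non_engineer_can_edit_intents:
        warnings.append("No non-engineer path to change an intent; fix this in any category.")

    ranked = sorted(
        (c for c in CATEGORIES if c not in excluded and score[c] > 0),
        key=lambda c: (-score[c], CATEGORIES.index(c)),
    )
    return ranked, warnings


ops_led = Answers(
    has_fixed_ccaas=True, ccaas_ceiling_blocks_use_case=False,
    sponsor="operations", has_platform_capacity=False,
    model_choice_required=False, business_case="labour_reduction",
    needs_sub_second_p95=False, non_engineer_can_edit_intents=True,
)
print(shortlist(ops_led))
```

Because no row in the table points towards the hyperscaler category, this encoding never scores it; that category is selected on the fit criteria in its own section instead. The ranking is a conversation starter for the sponsor, not a substitute for the evidence gathered per candidate.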
No platforms are named, ranked, or recommended on this site. We accept no vendor money. The category descriptions above are the deliverable — once the category fits, vendor selection inside it is a matter of applying the evaluation matrix to three candidates with real evidence.