When the cheapest AI voice vendor answered zero patient calls: the VERA framework
Originally published on LinkedIn. Reproduced here for archival reference.
A large healthcare network selected an AI voice vendor on price. Zero live patient calls were processed before the engagement was terminated. The Vendor Evaluation & Risk Assessment (VERA) framework — two gateway criteria and six weighted domains — exists to stop that failure mode in any regulated, high-volume operation.
The setup
Before joining the vendor side, I spent the better part of a year on the buy side of an AI voice procurement for a large multi-site healthcare enterprise. The organisation operated across multiple regions, ran a single mainstream practice management system (PMS) across every site, and had aggressive expansion plans.
I scanned the market. Seven AI voice vendors made the shortlist. I built an evaluation framework, ran the assessments, and produced a recommendation. The buyer bypassed the recommendation and selected on price.
The chosen vendor — a credible global enterprise software brand with big logos and public-sector deployments — passed every pre-sale criterion on paper. Implementation had three milestones: telephony integration, PMS integration via HL7v2, and pilot call testing with real patients. None were met. The engagement was wound down before the pilot ever ran.
What actually broke
Three distinct failures, in order of severity.
- Latency. Humans take conversational turns with gaps of roughly 200 milliseconds. Sub-second is the baseline for natural feel; past about 1.5 seconds the call reads as a broken connection. The deployed system averaged over 4 seconds, with spikes past 20. Patients hung up. The ones who didn't talked over the system, which compounded the latency.
- Accuracy. The system confidently produced wrong information — wrong appointment slots, wrong practitioner names, wrong availability. In a retail call centre that is a complaint; in healthcare it is clinical risk and regulatory exposure. The root cause was inadequate PMS integration.
- Delivery capability. The implementation team had never integrated with that PMS and had never worked inside local primary-care workflows. What was sold as 'configuration' was new development work the team could not do. There was no escalation path and no senior healthcare engineer to bring in.
The gap conventional procurement cannot see
Healthcare procurement frameworks were built for EMRs and clinical decision-support tools. They evaluate static products: you install the thing, you configure the thing, the thing runs. AI voice is not that. AI voice is a real-time autonomous system having unscripted conversations with patients about their healthcare.
The procurement question is no longer 'does the product work' — it is 'can this vendor deliver a working production system into our specific environment, against our specific systems, under our specific regulatory regime.' That question has three layers: compliance risk, technical integration risk, and implementation delivery risk.
Layers one and two are visible in an RFI. Layer three is invisible until after the contract is signed — and it is where this vendor failed completely.
VERA gateway 1 — Data Sovereignty (not residency)
Residency is where data is stored. Sovereignty is whose laws and authorities can reach it. Under GDPR-style regimes, you do not have to permanently store data abroad to make a cross-border transfer — making it available to a processor in another country counts, including transient real-time access. The moment live patient voice is routed to an overseas speech-to-text or inference service, you have almost certainly made a cross-border transfer, even though your database never left the country.
More than half of the vendors evaluated stored data in-country but routed live voice processing through US infrastructure. Several did not understand the distinction. When asked where the speech-to-text inference ran, they could not answer.
Where data physically sits does not, by itself, determine which governments can compel access to it. The US CLOUD Act lets US authorities compel US-based providers to hand over data in their control regardless of which country the servers are in. The legal ground under transatlantic transfers also moves: the EU framework authorising them has already been struck down once (Schrems II, 2020) and rebuilt (the 2023 Data Privacy Framework, itself now under appeal). The lesson is not the case name — it is that an adequacy decision can be revoked, so you want sovereignty you control. Outsourcing the processing does not outsource the accountability. You remain the controller.
VERA gateway 2 — Regulatory Compliance
Health-data rules are jurisdiction-specific, and that is the point: you map obligations to where care is delivered and where patients sit, rather than assuming one country's health-privacy law travels with the technology. In the US the sector-specific regime is HIPAA; in Europe, health data is 'special category' data under GDPR with heightened safeguards. They are not interchangeable, and a vendor quoting the wrong one at you is a tell.
To pass this gateway, a vendor should provide:
- A jurisdiction-specific assessment for the actual market being bought for — not a generic policy retrofitted from somewhere else.
- A clear position on AI-specific rules now arriving. The EU AI Act's duty to tell people they are speaking to an AI takes effect in 2026, with heavier 'high-risk' obligations pushed to late 2027 under a 2026 amendment.
- Clarity on medical-device classification, which turns on intended use. An appointment-booking assistant is generally not a medical device; one that assesses symptoms or steers clinical decisions can be — and that triggers a far heavier regulatory pathway.
The six weighted VERA domains
If a vendor fails either gateway, the assessment stops. Price is not a tiebreaker until the gates are passed. The weighted domains, in order:
- PMS / systems integration — verified, not promised. Sandbox test against the actual system. Reject vendors who have never seen the customer's PMS before.
- Telephony — domestic SIP, domestic PSTN, documented redundancy.
- Clinical safety — human handoff path, override controls, adverse-event reporting, patient AI disclosure, bias monitoring.
- Scalability — references at comparable scale, multi-site configuration management, centralised administration.
- Implementation delivery capability — the domain that did not exist before this failure. Named, healthcare-experienced delivery team. Milestone-based schedule with written acceptance criteria. Executive escalation path. Independent reference interviews with clients of comparable size — and you call them yourselves, not the references the vendor hands you.
- Commercial — price comes last, after everything else.
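The gate-then-weight logic above can be sketched in a few lines of Python. This is a minimal illustration, not the author's scoring model: the article ranks the six domains but publishes no weights, so the numbers in `DOMAIN_WEIGHTS`, the 0 to 5 scoring scale, and all names (`VendorAssessment`, `vera_score`, `rank_vendors`) are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical weights: the article orders the domains but gives no numbers.
DOMAIN_WEIGHTS = {
    "integration": 0.25,
    "telephony": 0.20,
    "clinical_safety": 0.20,
    "scalability": 0.15,
    "delivery_capability": 0.15,
    "commercial": 0.05,  # price comes last
}

GATEWAYS = ("data_sovereignty", "regulatory_compliance")


@dataclass
class VendorAssessment:
    name: str
    gateways: dict       # gateway name -> True if passed
    domain_scores: dict  # domain name -> score on an assumed 0 to 5 scale


def vera_score(assessment):
    """Return the weighted score, or None if any gateway failed."""
    # Gateways are pass/fail: a failure stops the assessment before scoring,
    # so a low price can never compensate for it.
    if not all(assessment.gateways.get(g, False) for g in GATEWAYS):
        return None
    return sum(
        weight * assessment.domain_scores[domain]
        for domain, weight in DOMAIN_WEIGHTS.items()
    )


def rank_vendors(assessments):
    """Rank only the vendors that cleared both gateways, best score first."""
    scored = [(a.name, vera_score(a)) for a in assessments]
    passed = [(name, score) for name, score in scored if score is not None]
    return sorted(passed, key=lambda pair: pair[1], reverse=True)
```

Returning `None` for a gateway failure, rather than a score of zero, keeps a failed vendor out of the ranking entirely instead of leaving it at the bottom of the list.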
What changed after
The replacement vendor cleared both gateways. Voice processing inside the customer's jurisdiction. Compliance documentation written for the actual regulatory regime, not retrofitted from somewhere else. PMS integration verified in a sandbox before contract. Independent reference calls with comparable healthcare networks, all confirming milestone delivery within agreed timelines. Calls got answered.
Three lessons for operators
A polished demo predicts nothing about delivery. The failed vendor's demo was the best of the seven; their delivery was the worst. A demo measures sales capability. Implementation measures engineering capability. These are different functions inside the vendor, and you must assess them separately. The most reliable way is to make the vendor prove it before contract — sandbox the integration, simulate the calls, test against your actual systems and edge cases. What a vendor will not demonstrate before you sign, they usually cannot deliver after.
Price is the last filter, not the first. Every unit saved on AI voice procurement evaporates the moment the system goes down on a Monday at 9am with dozens of patients in the queue. The buyer spent more on the failed engagement than a year of the right vendor would have cost.
Data sovereignty is the question most vendors cannot answer cleanly. If you ask only one question in your next AI voice evaluation, ask where the speech-to-text inference physically runs. Get it in writing. Get the subprocessors named. If they hedge, the gate is closed.
The sandbox-integration test that catches delivery risk
Implementation delivery capability is invisible in an RFI and is where most vendor failures actually happen. The single most reliable test is to make the vendor demonstrate a working integration against your actual system in a sandbox — before contract, against your real PMS or CRM, with your data structures, under your authentication.
- Pick three intents that span the platform's claimed capabilities — one read, one write, one orchestration across systems
- Provide sanitised production data structures, not synthetic samples — the difference exposes assumptions the vendor made about your schema
- Demand the integration runs against your sandbox environment, not the vendor's emulator — emulators hide latency, auth quirks, and edge-case error responses
- Require the vendor to name the engineer who will lead the integration — not the account manager, not the solutions architect on the pitch
- Measure end-to-end latency under realistic load (parallel sessions), not single-threaded; production deployments fail on tail latency, not median
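The last point, measuring tail latency under parallel load, might look something like the sketch below. It assumes a hypothetical `run_turn` callable that performs one conversational turn end to end against your sandbox; the 1.5 second threshold comes from the latency discussion earlier in this article, and the nearest-rank percentile is one simple choice among several.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor

# From the article: past about 1.5 s the call reads as a broken connection.
BROKEN_CALL_SECONDS = 1.5


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def timed_turn(run_turn):
    """Time one call to run_turn (hypothetical: one full turn against the sandbox)."""
    start = time.perf_counter()
    run_turn()
    return time.perf_counter() - start


def measure_parallel(run_turn, sessions, turns_per_session):
    """Run several sessions concurrently and collect every turn's latency."""
    def one_session(session_index):
        return [timed_turn(run_turn) for _ in range(turns_per_session)]

    with ThreadPoolExecutor(max_workers=sessions) as pool:
        per_session = list(pool.map(one_session, range(sessions)))
    return [latency for session in per_session for latency in session]


def summarise(latencies):
    """Report median, tail, and the share of turns past the 'broken' threshold."""
    slow = sum(1 for value in latencies if value > BROKEN_CALL_SECONDS)
    return {
        "p50": percentile(latencies, 50),
        "p95": percentile(latencies, 95),
        "p99": percentile(latencies, 99),
        "share_over_threshold": slow / len(latencies),
    }
```

A system can post a comfortable median and still lose one caller in ten; reporting p95, p99, and the share of turns over the threshold alongside p50 makes that visible before contract.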
The sovereignty conversation script
Sovereignty questions are easy to ask and hard for a non-specialist vendor to answer. This four-question script consistently exposes the gap between marketing claims and actual data-flow.
- Where does the speech-to-text inference physically run, per call leg, per region? Name the data centre and the legal entity that operates it.
- Where does the language model inference physically run? Name the model provider, the deployment region, and the data-handling clause that governs retention.
- Where is the call recording stored, and which sub-processors have access? Provide the sub-processor list and the change-notification process.
- Under which jurisdiction's law could those processors be compelled to disclose call data? Name the regimes (CLOUD Act, equivalents) you have assessed and the mitigation in place.
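One way to keep the answers to these four questions comparable across vendors is to record them as structured data and flag anything that leaves the home jurisdiction. The sketch below is illustrative only: `ProcessingStage`, its field names, and the country codes in the usage example are assumptions, and the flags are prompts for legal review rather than legal conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingStage:
    stage: str                  # e.g. "speech_to_text", "llm_inference"
    processing_country: str     # where the workload physically runs
    operator_jurisdiction: str  # home jurisdiction of the operating legal entity
    named_in_writing: bool      # data centre and entity confirmed in writing?


def sovereignty_findings(stages, home_country):
    """List the points a reviewer should question, one line per issue."""
    findings = []
    for s in stages:
        if not s.named_in_writing:
            findings.append(f"{s.stage}: location or operator not confirmed in writing")
        if s.processing_country != home_country:
            findings.append(
                f"{s.stage}: processed in {s.processing_country}, likely a cross-border transfer"
            )
        if s.operator_jurisdiction != home_country:
            findings.append(
                f"{s.stage}: operator may be compellable under {s.operator_jurisdiction} law"
            )
    return findings


# Illustrative values only: storage stays at home, live inference does not.
stages = [
    ProcessingStage("recording_storage", "AU", "AU", True),
    ProcessingStage("speech_to_text", "US", "US", False),
]
for finding in sovereignty_findings(stages, home_country="AU"):
    print(finding)
```

The example mirrors the pattern described above: storage can be in-country while the speech-to-text leg is not, and only a per-stage record shows the difference.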
The reference-call protocol
Vendor-supplied references usually answer pre-screened questions for pre-screened buyers. The references worth talking to are the ones you find yourself — competitors of comparable size who deployed the same platform against comparable systems. Three questions per call usually surface what matters.
- What did the vendor under-deliver on, and how was it resolved?
- If you were re-running the procurement today, what would you weight differently?
- What did the post-launch operating model actually cost you, in FTE and tooling?
Why VERA generalises beyond healthcare
The VERA framework was born from a healthcare procurement failure, but the structure applies to any regulated, high-volume voice AI deployment. Financial services, insurance, public sector, and utilities all share the same three properties: real-time data flow under regulatory scrutiny, customers who cannot tolerate high error rates, and an implementation surface that exceeds what vendor RFI responses describe.
Where healthcare uses HIPAA or special-category GDPR, financial services use FCA conduct rules and PCI; insurance uses sector-specific complaint logging; the public sector uses accessibility duties and procurement transparency. The gateway pattern — fail either, you are out — and the weighted-domain pattern with implementation delivery capability as a named domain transfer cleanly. Only the regulatory checklist changes.
Key takeaways
- Choose on capability and delivery risk; price is the last filter, not the first.
- Data sovereignty (whose laws reach the data in real time) is distinct from residency and is the question most vendors cannot answer cleanly.
- Compliance must be jurisdiction-specific for where care is delivered — not a retrofitted generic policy.
- Implementation delivery capability is invisible in an RFI and is where most failures happen. Demand a named team with relevant experience and independent references you call yourself.
- Make the vendor prove integration in a sandbox before contract. What they will not demonstrate before signing, they usually cannot deliver after.
Frequently asked questions
- What is the VERA framework?
- Vendor Evaluation & Risk Assessment: two gateway criteria (data sovereignty and regulatory compliance) and six weighted domains (integration, telephony, clinical safety, scalability, implementation delivery capability, and commercial). A vendor that fails either gateway is out, regardless of price.
- Why distinguish data sovereignty from data residency?
- Residency is where data is stored. Sovereignty is whose laws and authorities can reach it. Live voice routed to an overseas inference service is a cross-border transfer even if the database never leaves the country — and outsourcing the processing does not outsource the accountability.
- Is an AI appointment-booking assistant a medical device?
- Generally not. Classification turns on intended use. An assistant that assesses symptoms or steers clinical decisions can fall into the medical-device pathway, which is significantly heavier than non-clinical scheduling automation.
- What is the single most useful question to ask an AI voice vendor?
- Where does the speech-to-text inference physically run, and which subprocessors handle it. Get the answer in writing. Vendors that hedge on this question rarely have a defensible sovereignty story.
Terms used in this guide
- Voice AI: software that answers the phone, understands what the caller wants, and takes action, not just a smarter IVR.
- Containment rate: the percentage of calls the automation finished on its own.
- IVR replacement: swapping menus and keypad input for natural conversation and actual resolution.
Lewis Crook: 20 years in enterprise technology, from FTSE 100 voice deployments to over a million AI-handled minutes a month across Asia-Pacific. Buyer, builder, and now working with CX leaders on enterprise voice AI. Writes The Voice AI Brief.