Nine chapters.
Everything TransVoix does.
A walk through every shippable capability — voice, language, trust, integration. Read top-to-bottom for the full tour, or jump straight to what you came for using the rail on the left.
Speak in their voice.
The agent's voice is cloned once at enrollment and used in both directions — outbound and inbound, in either language. The customer hears the agent. The agent hears the customer. Neither hears a generic TTS.
Bidirectional voice clone.
One enrollment, both directions, both languages. Refreshed quarterly or on demand. Prosody, accent, and micro-pauses preserved end-to-end.
Sub-second time-to-speak.
~740 ms p50. Slow enough that the customer doesn't get whiplash, fast enough that they don't start repeating themselves.
Languages that switch mid-call.
6 languages live across 30 bidirectional pairs, 9 more in beta, 13 on the roadmap. The customer never picks a menu — we detect, switch, and keep going. If the call rotates between three languages in five minutes, the agent stays in their native language and the platform handles the rest.
Auto-detect & mid-call switch.
No IVR menus, no language prompts. The customer changes their mind, or another household member takes the phone. The platform tracks it.
Glossary you control.
Each tenant owns a termbase: locked translations, pronunciations, dialect variants. Versioned, signed, A/B-able per queue. The model defers to your terms before its own.
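As a rough illustration of "the model defers to your terms before its own", here is a minimal sketch of a termbase lookup with a model fallback. The names `TermEntry`, `translate_term`, and the callback `model_fallback` are hypothetical and chosen for this example only; they are not TransVoix's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class TermEntry:
    """One locked term in a tenant's termbase (hypothetical shape)."""
    source: str    # term as it appears in the source language
    target: str    # translation the tenant has locked
    version: int   # termbase version the entry belongs to


def translate_term(term: str,
                   termbase: Dict[str, TermEntry],
                   model_fallback: Callable[[str], str]) -> str:
    """Return the tenant's locked translation if one exists,
    otherwise fall back to whatever the model would have produced."""
    entry = termbase.get(term.lower())
    if entry is not None:
        return entry.target
    return model_fallback(term)


# Usage: the locked term wins; unknown terms go to the model.
termbase = {"deductible": TermEntry("deductible", "franquicia", version=3)}
locked = translate_term("Deductible", termbase, lambda t: "deducible")
unlocked = translate_term("copay", termbase, lambda t: "copago")
```

Keying the termbase on a lowercased source term is an assumption of this sketch; a production termbase would likely also handle inflection and dialect variants.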
Audit-ready by default.
The hard part of voice AI in regulated industries isn't the voice — it's the audit. Every utterance is signed, every correction is append-only, every minute is exportable.
Corrections that append, never edit.
A mistranslation gets a signed correction utterance threaded into the audit log. The original is preserved. Reviewers see both. Regulators see both.
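One way to picture an append-only correction trail is a hash-chained log in which a correction is a new entry pointing back at the original, which is never modified. The sketch below assumes a SHA-256 hash chain as a stand-in for real signatures, and the names `AuditLog`, `LogEntry`, and `add_correction` are hypothetical; the platform's actual log format is not described here.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional, Tuple


def _digest(kind: str, text: str, corrects: Optional[int], prev_hash: str) -> str:
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps(
        {"kind": kind, "text": text, "corrects": corrects, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class LogEntry:
    kind: str                  # "utterance" or "correction"
    text: str
    corrects: Optional[int]    # index of the entry being corrected, if any
    prev_hash: str
    entry_hash: str


class AuditLog:
    """Append-only log: entries can be added, never edited or removed."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._entries: List[LogEntry] = []

    def _append(self, kind: str, text: str, corrects: Optional[int] = None) -> int:
        prev = self._entries[-1].entry_hash if self._entries else self.GENESIS
        entry = LogEntry(kind, text, corrects, prev, _digest(kind, text, corrects, prev))
        self._entries.append(entry)
        return len(self._entries) - 1

    def add_utterance(self, text: str) -> int:
        return self._append("utterance", text)

    def add_correction(self, corrects: int, text: str) -> int:
        if not 0 <= corrects < len(self._entries):
            raise IndexError("correction must reference an existing entry")
        return self._append("correction", text, corrects)

    def entries(self) -> Tuple[LogEntry, ...]:
        return tuple(self._entries)

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks it."""
        prev = self.GENESIS
        for e in self._entries:
            if e.prev_hash != prev or e.entry_hash != _digest(e.kind, e.text, e.corrects, prev):
                return False
            prev = e.entry_hash
        return True


log = AuditLog()
original = log.add_utterance("Take two tablets daily")
fix = log.add_correction(original, "Take one tablet twice daily")
```

After the correction, both entries remain in the log, so a reviewer can read the original and the fix side by side. A hash chain only gives tamper evidence; a real deployment would sign entries with a private key.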
Provenance, signed.
One C2PA manifest per call. Cryptographically signed. Verifiable without our cooperation. The audit answer to 'where did this audio come from' lives in the file itself.
PII redaction by zone, not globally.
Different queues have different rules. Healthcare redacts SSN/DOB/MRN before storage. Sales redacts only credit cards on export. Both honoured.
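The healthcare and sales rules above could be expressed as a small per-zone, per-stage configuration. This is a sketch under stated assumptions: the zone names, stage names (`storage`, `export`), and the regular expressions are illustrative simplifications (for example, the card pattern only covers 16-digit numbers and the MRN format is invented), not the platform's real detectors.

```python
import re

# Simplified, illustrative patterns; real PII detection is more involved.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "dob": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
    "mrn": re.compile(r"\bMRN[ :]*\d{6,10}\b"),  # hypothetical MRN format
    "card": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
}

# Which categories each zone redacts, and at which stage.
ZONE_RULES = {
    "healthcare": {"storage": ["ssn", "dob", "mrn"]},
    "sales": {"export": ["card"]},
}


def redact(text: str, zone: str, stage: str) -> str:
    """Apply only the redactions configured for this zone at this stage."""
    for name in ZONE_RULES.get(zone, {}).get(stage, []):
        text = PATTERNS[name].sub(f"[{name.upper()}]", text)
    return text


stored = redact("SSN 123-45-6789, born 01/02/1980, MRN: 1234567",
                "healthcare", "storage")
exported = redact("Card 4111 1111 1111 1111 on file", "sales", "export")
```

Keeping the rules in data rather than code means a new queue can get its own policy without touching the redaction function.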
Plugs into the stack you have.
Generic by design. Anything that speaks SIP, WebRTC, or HTTPS connects in days, not quarters. We don't ship a contact-centre suite — we slot into the one you bought.
Standard protocols.
SIP, WebRTC, PSTN, REST, webhook, gRPC. If your contact-centre platform speaks any of them — and they all do — we connect in days.
Write-back that lands clean.
Call summary, glossary hits, corrections, redactions, manifest URL, outcome — written back to the customer record as structured fields. Your reps don't transcribe; your analysts get clean data.
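To make "structured fields" concrete, the write-back could look something like the record below. The field names, the `CallWriteBack` type, and the example URL are assumptions made for illustration; the actual schema depends on the customer-record system being written to.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class CallWriteBack:
    """Hypothetical shape of the per-call record written to the CRM."""
    call_summary: str
    glossary_hits: List[str]
    corrections: List[str]
    redactions: List[str]
    manifest_url: str
    outcome: str


def to_crm_payload(record: CallWriteBack) -> str:
    """Serialise the record as JSON with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = CallWriteBack(
    call_summary="Customer asked about plan renewal.",
    glossary_hits=["deductible"],
    corrections=[],
    redactions=["ssn"],
    manifest_url="https://example.com/manifests/call-123",  # placeholder URL
    outcome="resolved",
)
payload = to_crm_payload(record)
```

Each item from the list above maps to one typed field, so downstream analytics can query outcomes or glossary hits without parsing free text.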
Against the two real alternatives.
The other paths most teams weigh: a human interpreter on the line, or a generic MT-plus-TTS layer cobbled together. We optimise for the calls where the voice and the audit trail both matter.
Things buyers ask.
Does TransVoix replace our human interpreters?
How good is the cloned voice?
What happens if the model mistranslates something critical?
Can we control terminology — drug names, plan codes, internal jargon?
Where does the audio actually live?
How long does setup take?
See it on your own queue.
Free. We align scope on a discovery call first, then you run 500 minutes or 30 days (whichever comes first) across 3–5 agents — no fee, no credit card, no commitment.