Gradium Translates Voice Live — Faster Than GPT
Gradium launched two real-time speech translation models that beat OpenAI's gpt-realtime-translate on speed and accuracy — available via API today.
Evgenii Arsentev · PhD
3.0 seconds. That's how long Gradium's new models take, on average, to hear you speak in English and produce translated speech in French, German, Spanish, or Portuguese. OpenAI's gpt-realtime-translate takes 3.6 seconds. Google's Gemini Live Translate comes in at 2.9 seconds. Gradium sits between them on speed, but on accuracy, by its own reported numbers, it leads both.
The startup just launched two models: stt-translate, which turns your speech directly into translated text, and s2s-translate, which goes all the way to translated speech on the other end. Together they cover 20 translation directions, meaning every source-to-target combination of English, French, German, Spanish, and Portuguese.
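As a quick check on the coverage figure: five languages give 5 × 4 = 20 ordered source-to-target directions, which is 10 pairs if each pair is counted once for both directions. A small Python sketch that enumerates them (the two-letter codes are illustrative, not identifiers confirmed by Gradium):

```python
from itertools import combinations, permutations

# Illustrative ISO 639-1 codes; the identifiers Gradium's API expects may differ.
LANGUAGES = ["en", "fr", "de", "es", "pt"]

# Ordered (source, target) directions: en->fr and fr->en count separately.
directions = list(permutations(LANGUAGES, 2))

# Unordered pairs: en<->fr counts once.
pairs = list(combinations(LANGUAGES, 2))

print(len(directions))  # 20
print(len(pairs))       # 10
```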
Why This Approach Is Different
Most voice translation today works as a relay race: one system transcribes what you said, a second translates the text, a third reads it back in the target language. Each handoff adds a delay and creates a new place for errors — a mistranscription in step one quietly poisons everything downstream.
Gradium collapses this into two models instead of three. Their stt-translate handles transcription and translation in a single pass, without the handoff. The whole exchange runs over one persistent connection, which is why the latency stays tight even as accuracy holds up. The s2s-translate model also lets you pick the output voice or clone an existing one — something OpenAI's realtime translation doesn't offer.
On benchmarks, Gradium reports leading gpt-realtime-translate and gemini-3.5-live-translate on BLEU (a standard metric that scores how much of a system's output overlaps, word sequence by word sequence, with human reference translations), and beating gpt-realtime-translate on MetricX, a learned translation-quality metric, as well. These numbers are self-reported, which is worth keeping in mind, but the architecture at least makes the claim plausible.
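To make the BLEU figure less abstract, here is a minimal sketch of its core ingredient, clipped n-gram precision. It is a simplification for illustration only: full BLEU combines precisions for n = 1 to 4 and applies a brevity penalty, and the example sentences are invented rather than taken from Gradium's benchmark.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also appear in the reference.

    Each n-gram is credited at most as many times as it occurs in the
    reference, so repeating a correct word is not rewarded.
    """
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total

reference = "the cat is on the mat"  # invented example sentence
print(clipped_precision("the cat sat on the mat", reference, n=1))   # 5 of 6 unigrams match
print(clipped_precision("the cat sat on the mat", reference, n=2))   # 3 of 5 bigrams match
print(clipped_precision("the the the the the the", reference, n=1))  # clipped to 2 of 6
```

For real evaluations, an established implementation such as sacreBLEU is a safer choice than hand-rolled scoring, since tokenization details change the result.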
What This Means If You're Building
If you're working on anything voice-powered — a meeting assistant, language-learning app, multilingual customer service bot, or just an experiment with live translation — Gradium is now a third serious option alongside OpenAI and Google. The API is live at gradium.ai/translate. A Python SDK with async streaming is available, so wiring it in doesn't require rebuilding your setup.
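The source does not show Gradium's SDK interface, so the following is only a sketch of the general shape of async streaming translation: chunks go in over one long-lived session, and translated results come back as they become available. `FakeTranslateSession`, its methods, and the `target_language` parameter are hypothetical stand-ins built on the standard library; they are not Gradium's API.

```python
import asyncio
from typing import AsyncIterator, List

class FakeTranslateSession:
    """Hypothetical stand-in for a streaming translation session.

    NOT Gradium's SDK: it tags and upper-cases text so the example runs offline.
    """

    def __init__(self, target_language: str) -> None:
        self.target_language = target_language
        self._outbox: asyncio.Queue = asyncio.Queue()

    async def send_chunk(self, chunk: str) -> None:
        # A real client would send audio bytes over a persistent connection.
        await asyncio.sleep(0)  # yield control, as a network write would
        await self._outbox.put(f"[{self.target_language}] {chunk.upper()}")

    async def close(self) -> None:
        await self._outbox.put(None)  # sentinel: no more results

    async def results(self) -> AsyncIterator[str]:
        while True:
            item = await self._outbox.get()
            if item is None:
                return
            yield item

async def translate_stream(chunks: List[str], target_language: str) -> List[str]:
    session = FakeTranslateSession(target_language)

    async def producer() -> None:
        for chunk in chunks:
            await session.send_chunk(chunk)
        await session.close()

    # Send and receive concurrently over the same session.
    send_task = asyncio.create_task(producer())
    received = [text async for text in session.results()]
    await send_task
    return received

if __name__ == "__main__":
    print(asyncio.run(translate_stream(["hello", "how are you"], "fr")))
```

The point of the pattern is that sending and receiving overlap: the caller does not wait for the whole utterance before the first translated segment arrives. A real integration would replace the stub with the vendor's documented client.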
Five languages only for now — no Hindi, Mandarin, Japanese, Arabic, or Russian yet. Gradium is a new name with no public pricing, so it's early days. But the direction is clear: real-time voice translation is becoming a commodity API, and the gap between the best options is narrowing fast.
Author
Evgenii Arsentev
PhD · Chief Product Officer at a tech company
Source: marktechpost.com