A 4-stage pipeline.
Sub-second latency.
MOD Translate is a single live pipeline that captures audio, transcribes it, translates it, and delivers it back to listeners — all in less time than it takes you to draw breath between sentences.
01
Capture
Audio enters from your broadcaster device
A phone, a laptop, or our hardware appliance sends an audio chunk every 250 ms over a persistent WebSocket. Each chunk is archived to object storage at the same time, so the audio from every service can be recovered.
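For a sense of scale, here is a minimal sketch of what one 250 ms chunk amounts to. It assumes 16 kHz mono 16-bit PCM; the audio format and the function names are illustrative assumptions, not a documented wire format.

```typescript
// Assumed audio format: 16 kHz, mono, 16-bit PCM (not confirmed by the text above).
const SAMPLE_RATE_HZ = 16000;
const BYTES_PER_SAMPLE = 2;
const CHUNK_MS = 250; // chunk interval stated in the Capture step

// Bytes in one chunk: samples per chunk times bytes per sample.
function chunkSizeBytes(
  sampleRateHz: number = SAMPLE_RATE_HZ,
  chunkMs: number = CHUNK_MS,
): number {
  return ((sampleRateHz * chunkMs) / 1000) * BYTES_PER_SAMPLE;
}

// Split a PCM buffer into fixed-size chunks; the last chunk may be shorter.
function splitIntoChunks(pcm: Uint8Array, size: number): Uint8Array[] {
  if (size <= 0) throw new RangeError("size must be positive");
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < pcm.length; offset += size) {
    chunks.push(pcm.subarray(offset, offset + size));
  }
  return chunks;
}
```

Under these assumptions a chunk is 8,000 bytes, so a broadcaster uploads roughly 32 KB of raw audio per second before any compression.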
02
Transcribe
Streaming speech-to-text in the speaker's language
We run two ASR engines in parallel — Deepgram Nova for low latency and AssemblyAI Universal for accuracy — and fall back automatically if either lags. Your custom glossary is injected at this layer.
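One way to picture the automatic fallback is a small selection rule over the latest result from each engine. This is a hedged sketch only: the type names, the `pickTranscript` function, and the 1,500 ms lag threshold are assumptions, and no vendor API is called here.

```typescript
type EngineName = "fast" | "accurate"; // hypothetical labels for the two engines

interface EngineResult {
  engine: EngineName;
  text: string;
  audioEndMs: number; // how far into the live audio this engine has transcribed
}

// How far an engine trails the live audio; a missing result counts as infinitely late.
function lagMs(result: EngineResult | null, nowAudioMs: number): number {
  return result === null ? Infinity : nowAudioMs - result.audioEndMs;
}

// Prefer the accurate engine unless it lags too far; then fall back to the fast one.
function pickTranscript(
  fast: EngineResult | null,
  accurate: EngineResult | null,
  nowAudioMs: number,
  maxLagMs: number = 1500, // assumed threshold
): EngineResult | null {
  if (lagMs(accurate, nowAudioMs) <= maxLagMs) return accurate;
  if (lagMs(fast, nowAudioMs) <= maxLagMs) return fast;
  // Both engines lag or are missing: return whichever is less far behind.
  return lagMs(accurate, nowAudioMs) <= lagMs(fast, nowAudioMs) ? accurate : fast;
}
```

The rule favors accuracy when both engines keep up and only trades it away when the accurate engine falls behind the speaker.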
03
Translate
Sentence-by-sentence translation, not word-by-word
A sentence buffer collects partial transcripts, fires translation at natural sentence boundaries, and rewrites in flight if context corrects an earlier phrase. Your glossary maps proper nouns and theological terms.
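The sentence buffer idea can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `SentenceBuffer` class is hypothetical, it detects boundaries with punctuation only, and it does not attempt the in-flight rewriting described above.

```typescript
class SentenceBuffer {
  private pending = "";

  // Append a transcript fragment and return any sentences it completes.
  push(fragment: string): string[] {
    this.pending += fragment;
    const sentences: string[] = [];
    // A sentence ends at ., ! or ? followed by whitespace, so "3.5" does not split
    // and a trailing period waits for more text before firing.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    while (boundary.exec(this.pending) !== null) {
      const end = boundary.lastIndex; // index just past the matched boundary
      sentences.push(this.pending.slice(start, end).trim());
      start = end;
    }
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Return whatever remains, for example when the speaker pauses or the session ends.
  flush(): string | null {
    const rest = this.pending.trim();
    this.pending = "";
    return rest.length > 0 ? rest : null;
  }
}
```

A punctuation rule like this would split on abbreviations such as "Dr. Smith", so a production boundary detector would need more context than this sketch uses.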
04
Deliver
To listeners on any device, in any language
Translated text streams over Server-Sent Events to listener phones. Optional neural TTS audio streams in parallel. Each language is its own channel — adding listeners is free; adding languages is one click.
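For readers curious what a caption looks like on the wire, here is a small sketch of one Server-Sent Events frame. The `id`, `event`, and `data` fields come from the SSE format itself; the `caption` event name, the `CaptionEvent` shape, and the channel path are assumptions made for illustration.

```typescript
interface CaptionEvent {
  id: number; // increasing counter, lets a client resume after a reconnect
  text: string; // one translated sentence
}

// Serialize one caption as an SSE frame. Each line of text needs its own
// "data:" field, and a blank line terminates the frame.
function toSseFrame(ev: CaptionEvent): string {
  const dataLines = ev.text.split(/\r?\n/).map((line) => `data: ${line}`);
  return [`id: ${ev.id}`, "event: caption", ...dataLines, "", ""].join("\n");
}

// Hypothetical URL layout: one stream per session and language.
function channelPath(sessionId: string, language: string): string {
  return `/sessions/${encodeURIComponent(sessionId)}/captions/${encodeURIComponent(language)}`;
}
```

With one path per language, a new listener only subscribes to an existing stream, which is consistent with the claim that adding listeners costs nothing extra on the translation side.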
Under the hood
Built on the Cloudflare edge.
Every component runs as close to your listeners as physically possible — because translation latency is the cost of feeling left out.
One Durable Object per service
Each live session gets its own coordinator, which manages the broadcaster, the listener pool, and translation state. Session state stays with that coordinator, so a listener who reconnects lands back in the same session.
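The one-coordinator-per-session idea can be illustrated with a plain in-memory registry. This is not the Durable Objects API: on Cloudflare the platform routes each session ID to its object, whereas the `SessionCoordinator` class and `coordinatorFor` function below are hypothetical stand-ins that show the same lookup behavior.

```typescript
class SessionCoordinator {
  readonly sessionId: string;
  private readonly listeners = new Map<string, string>(); // listenerId -> language

  constructor(sessionId: string) {
    this.sessionId = sessionId;
  }

  // Joining twice with the same listener ID (a reconnect) updates the entry in place.
  join(listenerId: string, language: string): void {
    this.listeners.set(listenerId, language);
  }

  listenerCount(): number {
    return this.listeners.size;
  }
}

const coordinators = new Map<string, SessionCoordinator>();

// The same session ID always resolves to the same coordinator instance.
function coordinatorFor(sessionId: string): SessionCoordinator {
  let coordinator = coordinators.get(sessionId);
  if (coordinator === undefined) {
    coordinator = new SessionCoordinator(sessionId);
    coordinators.set(sessionId, coordinator);
  }
  return coordinator;
}
```

Because every request for a session reaches the same instance, a reconnecting listener finds the existing state instead of starting a new session.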
Audio archived to R2
Every chunk is written to a per-session prefix. After the service, we stitch them into a single file. Recordings stay yours; we never train on them.
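A per-session prefix might look like the sketch below. The key layout, the `.pcm` extension, and both function names are assumptions; the stitch step works on an in-memory map standing in for the bucket, so no R2 calls are made.

```typescript
// Hypothetical key layout: sessions/<sessionId>/chunks/<zero-padded sequence>.pcm
// Zero padding makes alphabetical key order match chunk order.
function chunkKey(sessionId: string, sequence: number): string {
  if (!Number.isInteger(sequence) || sequence < 0) {
    throw new RangeError("sequence must be a non-negative integer");
  }
  return `sessions/${sessionId}/chunks/${String(sequence).padStart(8, "0")}.pcm`;
}

// Concatenate chunks in key order into a single buffer.
function stitch(chunks: Map<string, Uint8Array>): Uint8Array {
  const keys = Array.from(chunks.keys()).sort();
  const parts = keys.map((key) => chunks.get(key) as Uint8Array);
  const total = parts.reduce((sum, part) => sum + part.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}
```

Sorting by key means chunks that were uploaded out of order still end up in the right place in the final recording.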
Transcripts in D1
Translated lines land in a relational store you can query. Word-level timestamps, language codes, speaker attribution. Export anytime as JSON or CSV.
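As an illustration of the CSV export, here is a minimal sketch. The `TranscriptLine` shape and the column names are assumptions rather than the actual schema, and the rows here are line-level for brevity.

```typescript
interface TranscriptLine {
  startMs: number;
  endMs: number;
  language: string; // language code, for example "es"
  speaker: string;
  text: string;
}

// Quote a field only when it contains a quote, comma, or line break;
// embedded quotes are doubled, following common CSV practice (RFC 4180).
function csvField(value: string | number): string {
  const s = String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

function toCsv(lines: TranscriptLine[]): string {
  const header = "start_ms,end_ms,language,speaker,text";
  const rows = lines.map((l) =>
    [l.startMs, l.endMs, l.language, l.speaker, l.text].map(csvField).join(","),
  );
  return [header, ...rows].join("\r\n");
}
```

The JSON export needs no helper of this kind, since `JSON.stringify` on the same rows already yields valid output.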
Glossaries you control
Add proper nouns, theological terms, and language-specific overrides. Glossaries apply at both transcription and translation, so 'Yahweh' never becomes 'Jehovah' downstream.
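A glossary override on the translation side could be enforced with a pass like the one below. This is a sketch under assumptions: the `Glossary` type and `applyGlossary` function are hypothetical, and how terms are injected at the transcription layer depends on the speech engine and is not shown.

```typescript
type Glossary = Map<string, string>; // unwanted rendering -> required rendering

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace whole-word occurrences of each glossary term in a single pass.
// Longer terms are tried first so a phrase wins over a word it contains.
function applyGlossary(text: string, glossary: Glossary): string {
  const terms = Array.from(glossary.keys()).sort((a, b) => b.length - a.length);
  if (terms.length === 0) return text;
  const pattern = new RegExp(`\\b(?:${terms.map(escapeRegExp).join("|")})\\b`, "g");
  return text.replace(pattern, (match) => glossary.get(match) ?? match);
}
```

For example, a glossary mapping "Jehovah" to "Yahweh" turns "Jehovah is my shepherd" into "Yahweh is my shepherd". The `\b` word boundary only recognizes ASCII word characters, so terms in other scripts would need a different matching rule.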
Pacing presets
Quick (700 ms target), Standard (1.1 s), or Careful (1.6 s). Trade latency for fluency depending on whether you’re translating energetic preaching or deliberate teaching.
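The presets map naturally onto a small lookup table. The targets below are the ones listed above; the `shouldFlush` policy is purely an assumption about how a target might be used, namely as the longest the buffer waits for a sentence boundary before translating what it has.

```typescript
type Pacing = "quick" | "standard" | "careful";

// Latency targets from the presets above, in milliseconds.
const PACING_TARGET_MS: Record<Pacing, number> = {
  quick: 700,
  standard: 1100,
  careful: 1600,
};

// Hypothetical policy: translate at a sentence boundary, or once the buffered
// speech has waited as long as the preset allows.
function shouldFlush(
  pacing: Pacing,
  bufferedMs: number,
  atSentenceBoundary: boolean,
): boolean {
  return atSentenceBoundary || bufferedMs >= PACING_TARGET_MS[pacing];
}
```

Under this policy a longer target gives the buffer more time to reach a natural boundary, which is where the extra fluency would come from.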
Failover, not fingers-crossed
If a vendor falters mid-service, we hot-swap to the backup in under two seconds. Your service never pauses for our infrastructure problems.
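A stall detector is one simple way to drive a hot swap like this. The sketch below is an assumption about the mechanism, not a description of it: the `FailoverSwitch` class is hypothetical, and the 1,500 ms stall limit is chosen only so the switch would trip inside the two-second budget.

```typescript
type Vendor = "primary" | "backup";

class FailoverSwitch {
  private active: Vendor = "primary";
  private readonly stallLimitMs: number;

  constructor(stallLimitMs: number = 1500) { // assumed limit
    this.stallLimitMs = stallLimitMs;
  }

  // Call on a regular tick with the time the primary last produced output.
  // Once the primary stalls past the limit, stay on the backup for the
  // rest of the service instead of flapping back and forth.
  update(nowMs: number, primaryLastResultAtMs: number): Vendor {
    if (this.active === "primary" && nowMs - primaryLastResultAtMs > this.stallLimitMs) {
      this.active = "backup";
    }
    return this.active;
  }
}
```

Taking timestamps as arguments keeps the switch free of timers, which makes the behavior easy to test with fixed values.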
What you ship
The deliverables a host actually cares about.
During the service
- Listener QR + URL
- Captions in 30+ languages
- Optional voice in earbuds
- Decision Moment cards
- Live moderator console
After the service
- Full audio recording
- Transcript in every language
- Listener analytics by language
- Decision response ledger
- Shareable replay link
Every week
- Glossary refinements
- Accuracy reports
- Service health digest
- New languages on request
- Direct line to our team
Make every word land.
Run a free live test in your own service. We will help you set it up in 20 minutes.