Replacing a $480/mo cloud transcription vendor with on-device voice
An operator running clinical notes through a per-minute cloud vendor moved the entire workflow on-device in a single deploy session.
A solo operator was transcribing clinical notes through a cloud transcription vendor at roughly $480/month, billed per minute. Because billing was per minute, the bill rose as usage grew.
Worse than the cost was the exposure: sensitive patient audio was leaving the operator's machine and sitting with a third-party vendor. On top of that came the renewal anxiety of depending on someone else's pricing and uptime.
We deployed a private voice stack on the operator's own hardware: Whisper for high-accuracy speech-to-text, packaged as a self-contained service with a clean local API.
Model selection was tuned for the operator's accent and vocabulary, and the workflow was wired so a recording becomes a structured note without anything leaving the laptop.
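One way to make "tuned for the operator's accent and vocabulary" concrete is to score each candidate Whisper model against a few of the operator's own recordings with hand-checked transcripts. The sketch below is an illustration under assumptions, not the procedure used in this deploy: `word_error_rate`, `pick_model`, and `whisper_transcribe` are hypothetical helper names. The `whisper.load_model` and `model.transcribe(..., initial_prompt=...)` calls come from the open-source `openai-whisper` package, where `initial_prompt` can be used to hint domain vocabulary.

```python
import string

_PUNCT = str.maketrans("", "", string.punctuation)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().translate(_PUNCT).split()
    hyp = hypothesis.lower().translate(_PUNCT).split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def pick_model(samples, transcribe_with,
               candidates=("tiny", "base", "small", "medium")):
    """Return (best_model_name, mean_wer_by_model).

    samples: list of (audio_path, reference_transcript) pairs.
    transcribe_with: callable (model_name, audio_path) -> transcript text.
    """
    scores = {}
    for name in candidates:
        errors = [word_error_rate(ref, transcribe_with(name, path))
                  for path, ref in samples]
        scores[name] = sum(errors) / len(errors)
    return min(scores, key=scores.get), scores


def whisper_transcribe(model_name, audio_path, initial_prompt=None):
    """Transcribe locally with openai-whisper (imported lazily).

    Assumption: initial_prompt carries a short list of domain terms.
    """
    import whisper
    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, initial_prompt=initial_prompt)
    return result["text"]
```

In use, `pick_model(samples, whisper_transcribe)` would run every candidate over the sample set and return the one with the lowest mean error. Larger models are slower, so the lowest-error model is not automatically the right choice on a laptop.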
The whole deploy — install, model selection, and a working integration — happened in a single session.
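The recording-to-note path described above can be sketched in a few lines. This is a minimal illustration, not the packaged service itself: the `Note` fields, the `build_note` and `transcribe_to_note` names, and the `notes` output directory are all hypothetical. It assumes the `openai-whisper` package and `ffmpeg` are installed, and that the model weights have already been downloaded once, after which transcription runs without a network connection.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class Note:
    """Hypothetical structured-note shape; adapt the fields to the real workflow."""
    source_file: str
    created_at: str
    language: str
    text: str
    segments: list = field(default_factory=list)


def build_note(audio_path, result, now=None) -> Note:
    """Turn a Whisper result dict ("text", "language", "segments") into a Note."""
    now = now or datetime.now(timezone.utc)
    segments = [
        {"start": round(seg["start"], 2),
         "end": round(seg["end"], 2),
         "text": seg["text"].strip()}
        for seg in result.get("segments", [])
    ]
    return Note(
        source_file=Path(audio_path).name,
        created_at=now.isoformat(),
        language=result.get("language", "unknown"),
        text=result["text"].strip(),
        segments=segments,
    )


def transcribe_to_note(audio_path, model_name="base", out_dir="notes"):
    """Transcribe one recording locally and write the note as JSON on disk."""
    import whisper  # imported lazily so build_note works without it
    model = whisper.load_model(model_name)
    result = model.transcribe(str(audio_path))
    note = build_note(audio_path, result)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    target = out / (Path(audio_path).stem + ".json")
    target.write_text(json.dumps(asdict(note), indent=2), encoding="utf-8")
    return target
```

Keeping `build_note` separate from the Whisper call means the note format can be changed and tested without loading a model, and the audio and the resulting JSON both stay on the local disk.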
Recording ──▶ Whisper (local) ──▶ Structured note
                     │
                     ▼
        Stays on operator's machine
[ no cloud vendor · no per-minute meter · no audio leaving ]
The $480/month cloud vendor was eliminated entirely. The replacement runs on hardware the operator already owned.
No audio leaves the machine, removing the third-party privacy exposure and the renewal anxiety.
Transcription is now effectively free at the margin, so the operator uses it more, not less.
- Local voice is no longer a research project: on commodity hardware it's a same-day deploy.
- For privacy-sensitive work, 'no audio leaves the machine' is a stronger selling point than raw accuracy.
- Removing per-minute billing changes behavior: people use the tool more once the meter is gone.