I built my own dictation app instead of renting one
Leif
A coworker showed me a dictation app he liked: you talk, it transcribes, an AI cleans the text up. I looked at it for a minute, saw it was a subscription at somewhere around twelve dollars a month per person, and figured I could build that myself. So I did. It is called Quill, and it runs entirely on your machine.
I talk more than I type. For the volume of prompts and chat and notes I get through in a day, speaking is faster, and a tool that pastes into whatever field I am focused on takes a real bite out of the typing. The problem with the paid ones is not only the bill. It is that your voice and your transcripts go to someone else's cloud. I did not want that for something I use all day.
Fully local, by default
Quill is a pipeline of four stages, and the two that matter both run on-device. Transcription is Whisper, through whisper-rs, using Metal on Apple Silicon and the CPU elsewhere. Then an embedded llama.cpp runs a polish pass over the raw transcript, using a model whose hash is pinned and checked before first use. Your audio never leaves the machine. There is no API key and no transcript round-trip, and telemetry and crash reporting are both off by default. The daemon will not even log your raw transcripts.
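To make "off by default" concrete, here is a minimal sketch of how those defaults could look in Rust. The type, field, and function names are made up for illustration and are not Quill's actual code. The assumption is that privacy settings start disabled and that the log helper only ever sees metadata about a transcript, never the text.

```rust
/// Hypothetical settings type; the names are illustrative, not Quill's real config.
#[derive(Debug, Clone, PartialEq)]
struct PrivacySettings {
    telemetry: bool,
    crash_reporting: bool,
}

impl Default for PrivacySettings {
    /// Both reporting channels start off, so a fresh install sends nothing.
    fn default() -> Self {
        Self {
            telemetry: false,
            crash_reporting: false,
        }
    }
}

/// Builds the log line for a finished dictation (hypothetical helper).
/// It records only the character count, so the transcript text itself
/// never reaches the log.
fn log_line(transcript: &str) -> String {
    format!("transcribed {} chars", transcript.chars().count())
}
```

Calling `PrivacySettings::default()` yields both flags set to `false`, and `log_line` has no code path that includes the text, which is a stronger guarantee than a flag someone could flip.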
The fast path is one key: capture, transcribe, paste, no polish, lowest latency. The enhanced path adds polish, and polish comes in styles, because a bug report and an email do not want the same cleanup. There is technical, which leaves code and command names alone, plus email, bullets, concise, and a no-polish passthrough for the raw Whisper text.
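The two paths and the styles map naturally onto a couple of enums. The sketch below is one way to model them, not Quill's source: the type names, the `plan` function, and the string keys are all assumptions, and naming the four stages capture, transcribe, polish, and paste is my reading of the description above.

```rust
use std::str::FromStr;

/// Polish styles named in the post; variant names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Style {
    Technical,
    Email,
    Bullets,
    Concise,
    Passthrough,
}

/// The two ways to trigger a dictation (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Path {
    Fast,
    Enhanced(Style),
}

/// Pipeline stages, assumed to be capture, transcribe, polish, paste.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    Capture,
    Transcribe,
    Polish(Style),
    Paste,
}

/// Returns the stages a given path runs, in order.
/// The fast path and the passthrough style both skip the polish stage.
fn plan(path: Path) -> Vec<Stage> {
    let mut stages = vec![Stage::Capture, Stage::Transcribe];
    if let Path::Enhanced(style) = path {
        if style != Style::Passthrough {
            stages.push(Stage::Polish(style));
        }
    }
    stages.push(Stage::Paste);
    stages
}

impl FromStr for Style {
    type Err = String;

    /// Parses a style key; the key strings here are assumptions.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "technical" => Ok(Style::Technical),
            "email" => Ok(Style::Email),
            "bullets" => Ok(Style::Bullets),
            "concise" => Ok(Style::Concise),
            "passthrough" => Ok(Style::Passthrough),
            other => Err(format!("unknown style: {other}")),
        }
    }
}
```

With this shape, `plan(Path::Fast)` and `plan(Path::Enhanced(Style::Passthrough))` produce the same three stages, while any other style inserts a polish stage before the paste.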
Free local, paid cloud
The local experience is the free one, and I mean the whole thing, not a crippled trial. Quill is fast and private with nothing to pay. Pro is for what genuinely needs a server: syncing across your devices, and the option to use cloud models for speech-to-text or for polishing and translating when you want more than the local models give you. That will be a small subscription or a one-time lifetime price, because I would rather sell you the thing once than rent you your own voice.
It is an invite-only alpha right now, gated through our Discord while we dogfood it. But the shape is set: talk instead of type, keep it on your machine, pay only if you want the cloud.