A larger feature release that bundles work queued behind three hotfix cycles. Three themes: an opt-in Quickshell OSD frontend, a Soniox cloud streaming backend, and a rebuild of meeting mode around source diarization. Release artifacts are now signed end-to-end in CI by a dedicated release key, removing the manual signing step from every cut.
Quickshell OSD frontend (opt-in)
A QML-native on-screen display that replaces the GTK4 surface for users who already run a Quickshell-based desktop. Three composed surfaces ship with the package: a waveform overlay during recording, an engine picker overlay for switching transcription backends without leaving the keyboard, and a meeting controls overlay for starting and stopping meeting captures. A new voxtype-audio-bridge sidecar streams audio levels to the QML side as NDJSON over a UNIX socket.
Why use it: if you already run Quickshell, the OSD blends into your existing shell instead of pulling in a GTK4 surface. The QML stack composes with Hyprland / Niri layer-shell rules cleanly, and the waveform renders at full compositor refresh rate.
[osd]
frontend = "quickshell"
The default frontend stays gtk4. Run voxtype setup quickshell to install the QML files and the sidecar binary into the right Quickshell config paths. Known issue: the Quickshell overlay captures pointer events across the whole screen (#440). This will be fixed in v0.7.6.
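For anyone who wants to consume the sidecar's output from their own tooling, the NDJSON-over-UNIX-socket transport can be read roughly as below. This is a minimal sketch, not the shipped QML client: the socket path is whatever your install uses, and the "level" field in the usage note is a hypothetical message shape, since the release notes do not document the schema.

```python
import json
import socket
from typing import Iterable, Iterator


def parse_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line (NDJSON framing)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a malformed or truncated line instead of aborting
        if isinstance(obj, dict):
            yield obj


def read_messages(socket_path: str) -> Iterator[dict]:
    """Connect to a UNIX stream socket and yield messages as they arrive.

    socket_path is an assumption: pass the path your sidecar listens on.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        with sock.makefile("r", encoding="utf-8") as stream:
            yield from parse_ndjson(stream)
```

Usage would look like `for msg in read_messages(path): print(msg.get("level"))`, assuming each message carries a numeric level under that (hypothetical) key.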
Soniox cloud streaming backend (#411)
A new engine that streams audio over a WebSocket to Soniox and types each partial transcript at the cursor as it arrives. Lower end-to-end latency than the local Parakeet streaming pipeline, at the cost of network dependence and a third-party API key.
Why use it: the lowest-latency option when you can tolerate cloud transcription. Good for live captioning, long-form interview dictation, or any case where the half-second cost of local Parakeet streaming feels too long.
engine = "soniox"
[soniox]
api_key = "your-soniox-api-key"
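Typing partial transcripts as they arrive means each new partial has to be reconciled with what is already at the cursor. One common way to do that is to keep the shared prefix and retype only the changed tail. The sketch below illustrates that idea only; it is an assumption about the general technique, not a description of how voxtype's Soniox engine is implemented, and typing_delta is a hypothetical helper name.

```python
def typing_delta(previous: str, current: str) -> tuple[int, str]:
    """Return (backspaces, text) that turns `previous` into `current`.

    The common prefix is left untouched; only the differing tail is
    deleted and retyped.
    """
    common = 0
    for a, b in zip(previous, current):
        if a != b:
            break
        common += 1
    return len(previous) - common, current[common:]


# A partial is extended: nothing to delete, only new characters to type.
print(typing_delta("hello wor", "hello world"))   # (0, 'ld')
# A partial is revised: one backspace, then the corrected tail.
print(typing_delta("hello word", "hello world"))  # (1, 'ld')
```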
Meeting mode: source diarization and VAD sub-windows
Three changes converge: a source-diarization path that attributes each speaker by which input device they came from (#341, contributed by sjug), VAD sub-windows that let the ECAPA-TDNN ML diarizer run on shorter chunks for better turn-by-turn accuracy (#418), and a per-meeting --diarization CLI flag so you can switch backends without editing the config file (#420).
Why use it: meetings with a clear host-plus-remote structure (loopback for remote audio, mic for the host) now get correct speaker labels without any ML cost. The per-meeting flag means you can run one meeting with full ML diarization and the next with cheap source diarization without restarting the daemon.
# Source-based (fast, works when speakers are on different input devices)
voxtype meeting start --diarization source
# ML-based (slower, works on a single mixed audio stream)
voxtype meeting start --diarization ml
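To make the source-based path concrete: when each speaker arrives on a different input device, labeling reduces to a lookup on the device plus a sort by time, with no model involved. The following is an illustrative sketch under that assumption. The Segment type, the "mic" and "loopback" source names, and the "Host" and "Remote" labels are all hypothetical and are not voxtype's internal types.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float   # seconds from the start of the meeting
    source: str    # hypothetical device tag: "mic" or "loopback"
    text: str


# Hypothetical mapping from capture device to speaker label.
SPEAKER_BY_SOURCE = {"mic": "Host", "loopback": "Remote"}


def label_by_source(segments: list[Segment]) -> list[tuple[str, str]]:
    """Interleave segments from all devices by start time and label each one."""
    ordered = sorted(segments, key=lambda s: s.start)
    return [(SPEAKER_BY_SOURCE.get(s.source, "Unknown"), s.text) for s in ordered]
```

This also shows why the approach cannot separate two people sharing one microphone: both land on the same source, which is the case the ML path exists for.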
Release signing migrated to a dedicated CI key (#437)
Every release artifact (binaries, companion .so files, SHA256SUMS.txt, .deb, .rpm, and the GitHub source archive) is now signed in CI by a dedicated release primary key (9CCF7915...) that is cross-signed by the offline maintainer key. Both fingerprints are in voxtype-bin's validpgpkeys, so yay auto-fetches the new key from a keyserver on first upgrade with no manual gpg --recv-keys step.
Why this matters: v0.7.4 used a back-signed signing subkey that yay could not auto-fetch by keyid, which broke yay -Syu voxtype-bin for every user with a stale local keyring. The standalone primary key used for v0.7.5 resolves correctly on every keyserver implementation. The signing step itself now runs fully in CI, removing the maintainer-laptop step between tag and release.
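If you download artifacts by hand rather than through a package manager, you can check them against SHA256SUMS.txt yourself. The sketch below assumes the file uses the usual sha256sum layout (hex digest, whitespace, file name); that layout is an assumption, not something stated in these notes. It only checks integrity: the signature on the sums file still has to be verified separately with gpg.

```python
import hashlib
from pathlib import Path


def parse_sums(text: str) -> dict[str, str]:
    """Parse sha256sum-style lines ("<hex digest>  <file name>") into a dict."""
    sums = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        digest, name = parts
        # Binary-mode entries are written as "*name"; drop the marker.
        sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large artifacts need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path: Path, sums: dict[str, str]) -> bool:
    """True only if the file is listed and its digest matches."""
    expected = sums.get(path.name)
    return expected is not None and sha256_of(path) == expected
```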
Variant safety improvements
Two changes around the /usr/bin/voxtype dispatcher wrapper:
- Wrapper decision uses basename, not canonicalize (#443). The previous logic followed symlinks and could pick the wrong dispatch shape when /usr/bin/voxtype already pointed at a non-canonical path. Closed-set basename match is more reliable and doesn't depend on filesystem state.
- TUI surfaces engine-vs-binary mismatch (#450). A persistent banner appears across every section when the configured engine cannot be served by the running binary. F2 jumps to the General section's variant picker. The daemon also fires a desktop notification at startup so users who do not open the TUI still see the warning.
Smaller fixes
- Streaming Parakeet validator accepts the bobNight naming convention (closes the v0.7.4 known issue).
- Per-language XKB variants for eitype (#424, contributed by AlexCzar).
- Cohere ONNX remote-path prefix fix for the new onnx/ subdirectory layout (#363, contributed by jpds).
- CPU __cpuid wrapped in unsafe block for rustc 1.91+ (#419, contributed by materemias).
- NixOS CI workflow now builds on every PR (closes #369).
- wl-copy fallback under X11 sessions (#346).
- dotool daemon fast path (#410).
- Audio socket listener watchdog (#391, #392).
Acknowledgments
- sjug for the source-diarization patch (#341).
- AlexCzar for the per-language XKB variants (#424).
- jpds for the Cohere ONNX path fix (#363).
- materemias for the rustc 1.91 __cpuid fix (#419).
Downloads
Full changelog, signed binaries, and SHA256 sums: v0.7.5 on GitHub.