The Week Voice AI Went Local

Voice AI weekly digest

Davit Baghdasaryan

Apr 13, 2026

Top Updates 💪

Krisp brings Accent Conversion to YouTube with a free Chrome extension for 2.7B users. (LinkedIn)
Google quietly ships AI Edge Eloquent, a free offline-first dictation app for iOS running on-device Gemma models with filler removal and no subscription. (TechCrunch)

Mistral launches Voxtral TTS, a 4B open-weights streaming speech model in 9 languages that beats ElevenLabs Flash v2.5 in voice cloning win rates. (Slator)
ByteDance introduces Seeduplex, a native full-duplex speech LLM that listens while speaking and halves the false interruption rate compared with half-duplex Doubao. (ByteDance Seed)
Willow launches Atlas-1, a new frontier STT model built on human-powered transcription infrastructure that claims to beat ElevenLabs, Deepgram, and OpenAI. (VP-Land)
Telnyx launches LiveKit on Telnyx, a hosted platform running LiveKit agents on Telnyx infrastructure with 50% lower cost and sub-200ms latency. (Telecom Reseller)
Natter raises $23M Series A led by Renegade Partners to replace enterprise surveys with AI-moderated 1:1 video conversations at scale. (VentureBurn)
Twilio Q4 voice AI revenue grew 60% as the company closed its biggest enterprise deal ever and repositioned as AI infrastructure. (CX Today)
Regal AI launches Copilot, a self-improving voice agent builder that learns from call outcomes and flags underperformance automatically. (SiliconANGLE)
Exotel acqui-hires Dubverse core team to lead conversation quality analytics and AI, deepening its voice AI stack for Indian enterprises. (TechCircle)
Californians sue Sutter and MemorialCare over use of Abridge AI scribe that allegedly recorded doctor-patient visits without clear patient consent. (Ars Technica)
Five9 expands Fusion ecosystem with AI Agent Connect API, letting enterprises wire voice AI agents into third-party systems and Assembled WFM. (Yahoo Finance)
Weya AI open-sources Hush, an 8MB speech enhancement model with 1.8M params that isolates the primary speaker in under 1ms per frame, CPU-only. (IndianWeb2)
Shunya Labs launches voice AI platform for dubbing, translation, lip-sync, and low-shot voice cloning for entertainment localization. (Passionate in Marketing)
Beaver AI launches Magic Whiteboard, a privacy-first meeting assistant that transcribes in real time but never records or stores audio. (PRWeb)

Engineering Corner 😎

AWS on Nova Multimodal Embeddings for semantic audio search across tone, emotion, and events, unified with text/image/video in a single vector space. (AWS Blog)

Voxtral TTS surgery: deep-dive into reconstructing codec audio from intermediate model states. (Towards Data Science)
Kokoro 82M TTS runs fully offline on CPU with 8 languages and 26 voices in a ~350MB footprint. (Geeky Gadgets)
docker-whisper: self-hosted Whisper ASR in a container for easy local deployment. (GitHub)
Browser-based STT with Whisper: tutorial on running Whisper inference entirely in the browser. (dev.to)
Lightweight offline TTS for Node.js using a minimal dependency chain. (dev.to)
Designing a real-time voice agent with RAG, SIP, and compliance guardrails. (HackerNoon)
Open-source Amazon Lex connector for Cisco Webex Contact Center that adds virtual agents without a platform rebuild. (AWS APN Blog)
