Top Updates 💪
Soprano: an 80M-parameter open-source TTS model (X)
Top ten language AI use cases in 2025 (Slator)
Resemble AI drops Chatterbox Turbo, an open-source TTS model (The Decoder)
ByteDance launches voice-driven AI workspace AnyGen (EqualOcean)
Gnani.ai launches Vachana STT, a foundational Indic STT model (CXO Today)
Google Health AI releases a conformer-based medical STT (MarkTechPost)
Gemini adds dynamic pacing control for natural speech (Small Business Trends)
Hyper AI audio glasses debut at CES as a voice recorder (USA Today)
Jeff Dean on how a compute-intensive speech recognition feature made Google develop its own TPUs in 2015 (OfficeChai)
Alibaba open-sources voice interaction model Fun-Audio-Chat (Pandaily)
TicNote AI-powered voice recorder launches in the Philippines (Technobaboy)
NotebookLM may introduce long ‘Lecture’ audio mode (Business Standard)
NARRIS partners with Heartfulness Institute for speech AI (SMEStreet)
How TTS technology is transforming content creation (TopTrade)
Why they chose voice over chat for AI interviews (DEV)
Engineering Corner 😎
Asterisk AI voice agent (GitHub)
Deploy Mistral AI’s Voxtral on Amazon SageMaker AI (AWS)
SpeakerLM: End-to-end versatile speaker diarization and recognition with multimodal large language models (arXiv)
Whisper statistics 2026 (About Chromebooks)
Implementing real-time streaming with VAPI for live support chat systems (DEV)
An intelligent English-speaking training system using generative AI and speech recognition (MDPI)
MiraTTS: A fine-tune of Spark-TTS (GitHub)
Voice chat using WebGPU in a browser (X)
24/7 answering service powered by AI phone answering (TechBullion)
Human and AI voice identities evoke shared neural signatures during speaker recognition across changes in speech content and prosody (bioRxiv)

