Updates from MS, Nvidia, Amazon and much more!

Voice AI weekly digest

Dec 08, 2025

Krisp is hiring!
Krisp’s SDK team has three key openings: Sr Product Manager, Sr Solution Engineer, and BD Manager. If you know exceptional people who might be a great fit, please share these roles with them. Thank you 🙏

Top Updates 💪

Microsoft released VibeVoice-Realtime-0.5B (X)
Amazon Nova 2 Sonic for real-time conversational AI (AWS)
NVIDIA released a new suite of open-weight models (X)
ByteDance rolls out AI voice assistant on Chinese smartphones (Reuters)
Amazon Connect introduces agentic self-service voice features (AWS)
Krisp wins CXA Innovation Awards for Best Use of AI (LinkedIn)
SynthFlow AI launches OpenAI-powered BELL framework (MarTechSeries)
StepFun AI releases Step-Audio-R1, a new audio LLM (MarkTechPost)
Agentic Voice AI converses at human speed (SoundHound)
Deepgram launches streaming speech & voice agents on SageMaker (Morningstar)
AI voice startup Gradium nabs $70M seed (TechCrunch)
Amazon Connect adds support for third-party STT and TTS AI models (AWS)
Pixel Recorder with enhanced clear voice and account switcher (WebProNews)
Telnyx & CommsPlus expand carrier-grade voice and AI in ANZ (GlobeNewswire)
3CLogic boosts voice-enabled customer service with ServiceNow (PR Newswire)
Voice AI agents for scheduling and customer reminders (AnalyticsWeek)
The economics of Voice AI: 1 cent/minute with Falcon’s architecture (FromDev)
Aseto AI advances Greek and Cypriot speech recognition (Philenews)
ElevenLabs competes with tech giants in AI voice market: Forbes analysis reveals business opportunities in natural-AI audio tools (Blockchain.news)

Voice AI Podcast 🎙️

Engineering Corner 😎

New Qwen3-TTS is here (X)

The Massive Sound Embedding Benchmark (MSEB): Definitive, open-source platform for measuring machine sound intelligence (Google Research Blog)
Integrate Voice AI with Salesforce for enhanced customer support (DEV)
30+ AI Agents from growing SaaS and interesting startups (StarCIO)
Vision-Agents: AI agents that watch, listen, and understand video (GitHub)
Speech recognition API for voice input (DEV)
Speech synthesis API for TTS (DEV)
How to use Google Docs’ Gemini audio TTS feature (Times of India)
VMEG: AI video dubbing online (VMEG)
VibeVoice: A frontier open-source TTS model (Hugging Face)
VoxCPM: Tokenizer-free TTS and true-to-life voice cloning (X)
Toucan: TTS for over 7000 languages (X)
