ChatGPT voice mode in conversations and much more this week!

Voice AI weekly digest

Dec 01, 2025

Krisp is hiring!
Krisp’s SDK team has three key openings: Sr Product Manager, Sr Solution Engineer and BD Manager. If you know exceptional people who might be a great fit, please share these roles with them. Thank you 🙏

Top Updates 💪

ChatGPT brings voice mode into conversations (PCMag)
Microsoft Dragon Copilot expands to radiologists (Microsoft)
Tencent, US startup Cartesia advances voice AI for developers (Tech in Asia)
Krisp launches 3.5x smaller Voice Isolation model for Voice AI Agents (LinkedIn)
SecretsAI unveils global AI voice-call generator (The Globe and Mail)
ElevenLabs enters South Korea to build Asia voice AI hub (Chosun Business)
SageMaker AI Inference enables bidirectional streaming (AWS)
Generative AI drives rapid growth in voice cloning market (OpenPR)
Voice AIs’ missing piece: The ability to listen while they talk (Fast Company)
Speechify pivots to voice input with Chrome AI assistant (TechBuzz.ai)
ElevenLabs crosses $300M ARR milestone (Tech in Asia)
TGH adopts Hyro’s voice AI agent skills (HospitalManagement.net)
StreamUnlimited launches customizable voice LLM reference integration for audio agent products (PR Newswire)
Telnyx and Yeastar partner on unified-communications control (GlobeNewswire)
Comulytic launches dual-scenario AI recording device (PR Newswire)
Hedy AI introduces Topic Insights, industry’s first cross-session meeting intelligence technology (Manila Times)
EAR-BUS F06: AI live call translator for smartphone with 0.3-sec delay (Gadgetify)

Engineering Corner 😎

Spatial audio improves UX in AI live speech translation, research finds (Slator)
VoiceRadar: Voice deepfake detection using micro-frequency and compositional analysis (Security Boulevard)
AIVocal TTS transforms text into natural expressive audio (NerdBot)
How to turn WhatsApp voice messages into text (Times of India)
Step-Audio-R1 technical report (arXiv)
Asterisk AI Voice Agent (GitHub)
Dia2: A streaming dialogue TTS model created by Nari Labs (GitHub)
Unmasking bias: How vocal cues skew speech translation (DEV)
Watch Gemini 3 spin up client-ready voice AI sites (Geeky Gadgets)

Voice AI Newsletter
