OpenAI adds noise cancellation to improve its audio models

Voice AI weekly digest

Davit Baghdasaryan

Mar 24, 2025

Top Updates 💪

OpenAI adds noise cancellation and semantic VAD to improve its models for AI Voice Agents (WinBuzzer)
Krisp launches noise cancellation SDK to improve turn-taking for AI voice agents (PRWeb)
Google adds its voice model Chirp 3 to its Vertex AI platform (TechCrunch)
MacWhisper 1.2 adds top-requested feature to leading transcription app (9to5Mac)
What we know about PlayAI TTS and voice cloning platform (TechRadar)
Speech-to-speech translation market to reach $800M by 2030 (GlobeNewswire)
Taco Bell accelerates AI innovation with Nvidia (PYMNTS)
Webex rolls out AI tools to boost customer support (Webex Blog)
Genesys launches AI for supervisors to enhance employee performance (Genesys)
Talkdesk adds autonomous AI voice agents to its CX toolbox (CX Today)
Google’s CX Expansion: Disrupting AI-powered contact centers (Everest Group)
Anthropic to launch voice mode soon (Analytics India Mag)
Cisco paves the way with Agentic AI collaboration (CRN India)
AI agent startup reshapes the future of AI-powered conversations (TechBullion)
Rising AI self-service adoption drives 8x8 CPaaS APIs growth (ANTARA News)

Voice AI Podcast 🎙️

In case you missed the latest episode of Voice AI Podcast…

AI Agents running on WebRTC | Russ d'Sa (Co-Founder & CEO of LiveKit)

Davit Baghdasaryan and Russ d'Sa

March 20, 2025

In the Future of Voice AI series of interviews, I ask my guests three questions:
- What problems do you currently see in Enterprise Voice AI?
- How does your company solve these problems?
- What solutions do you envision in the next 5 years?

Engineering Corner 😎

OpenAI’s audio models in the API (YouTube)
Advancements in speech recognition: A systematic review of deep learning transformer models (IEEE Xplore)
Low-resource speech recognition of radiotelephony communications based on continuous learning of in-domain and out-of-domain knowledge (IEEE Xplore)
A unit-diffusion model for code-switching speech synthesis (IEEE Xplore)
The unofficial guide to OpenAI realtime WebRTC API (WebRTC Hacks)
Speech-to-speech foundation models pave the way for seamless multilingual interactions (MarkTechPost)
How to build a real-time AI communication agent (Geeky Gadgets)
NVIDIA open sources Canary AI models for multilingual speech recognition and translation (Medium)
Improve AI voice assistant voice detection with turn detection and diarization (Geeky Gadgets)
