Crazy week in Voice AI! 🤯

Voice AI weekly digest

Davit Baghdasaryan

Mar 06, 2025

Top Updates 💪

Hume launches TTS model Octave with customizable AI voices (VentureBeat)
ElevenLabs is launching its own STT model (TechCrunch)
ConverzAI raises $16M for AI recruiters with 30% efficiency boost (VentureBeat)
Salesforce and Google bring Gemini to Agentforce, enabling more customer choice in major partnership expansion (Salesforce)
Announcing free, unlimited access to Think Deeper and Voice (Microsoft)
Empowering innovation: The next generation of the Phi family (Microsoft)
Real-time translation, accent smoothing, AI agents – Krisp & CX Today explore the future of CX (CX Today)
Scammers use voice clips to create AI clones (CNET)
Zoom secures its largest-ever contact center deal (CX Today)
Speechmatics unveils speaker diarization to improve meetings AI (UC Today)
How AI voice will change advertising (Voices)
Telnyx unveils Voice AI for human-like conversations at scale (GlobeNewswire)
Deepdub partners with AWS to advance AI media localization (PR Newswire)
Cresta announces rapidly scaling AI voice agents in production (PR Newswire)
Bliro raises €28M for AI-powered conversation intelligence platform (Tech)
Bridgetown Research raises $19M in Series A funding (Finsmes)
GibberLink: Breakthrough in how voice assistants communicate AI-to-AI (eWeek)

Voice AI Podcast 🎙️

In case you missed the latest episode of Voice AI Podcast…

Immersive Experiences with Voice AI | Alex Bordanova (Chief Product & Technology Officer at Voicemod AI)

Davit Baghdasaryan

February 27, 2025

In the Future of Voice AI series of interviews, I ask my guests three questions:
- What problems do you currently see in Enterprise Voice AI?
- How does your company solve these problems?
- What solutions do you envision in the next 5 years?

Engineering Corner 😎

Speech emotion recognition using fine-tuned wav2vec 2.0 and neural controlled differential equations classifier (ResearchGate)
The technical blueprint behind Superdial’s healthcare voice agents (OpusResearch)
Deepgram’s STT model secret: synthetic data generation (DataScienceCentral)
How the Emilia dataset advances multilingual voice synthesis (MarkTechPost)
Combining TF-GridNet and mixture encoder for continuous speech separation for meeting transcription (arXiv)
Enhancing multimodal AI: bridging audio, text, and vector search (Dev)
Unlocking scalable audio transcription with Gemini (Cloud.google)
Best AI voice agents (Play)
