Sam Altman is right and very wrong about AI-faked voices. And much more!

Voice AI weekly digest

Davit Baghdasaryan

Jul 28, 2025

Top Updates 💪

OpenAI CEO Sam Altman is right and very wrong about AI-faked voices (WashingtonPost)
Microsoft is doubling down on multilingual large language models – and Europe stands to benefit the most (ITPro)
Voice AI landscape in Europe 2025 (Sifted)
Freed says 20,000 clinicians are using its medical AI transcription ‘scribe,’ but competition is rising fast (VentureBeat)
Amazon is acquiring Bee, maker of a wearable AI assistant that listens to conversations (Geekwire)
Voice Fraud Prevention in the Age of AI and Hybrid Work (UCToday)
Gupshup raises $60M+ to expand its conversational AI and messaging platform (SiliconAngle)
AI voice company Hyper raises $6.3M to help automate 911 calls (TechCrunch)
Voxtral technical report (X)
Why AI Should Prioritize Conversations Over Automation in Outbound Sales (UniteAI)
AI Voice Assistants in UC: Build, Buy, or Bridge? (UCToday)
Speechmatics shipped realtime speaker diarization for voice agents (X)
How AI speech-to-text technology is tuning in to a digital Saudi Arabia (ArabNews)
Best AI Meeting Notes Assistants for Fintech Teams (Medium)
SayWrite.ai Is Redefining Productivity with AI-Powered Voice Note Taking (Medium)
Amplify Launches Custom AI-Powered Automatic Speech Recognition System (TheJournal)
Hume AI delivers speech models on SambaCloud (SambaNova)
Lightning Captions: Real-Time Transcription and Translation for the Classroom (PRNews)
Leena AI unveils conversational AI ‘colleagues’ for the enterprise (ComputerWorld)

Voice AI Podcast 🎙️

In case you missed the latest episode of Voice AI Podcast…

Engineering Corner 😎

Introducing Version 2 of Higgs Audio Generation (Boson)
NAR-SREC: Nonautoregressive End-to-End Speech Recognition With Error Correction Decoder
RapFlow-TTS: Rapid and High-Fidelity Text-to-Speech with Improved Consistency Flow Matching
ZipVoice: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching
Conan: A Chunkwise Online Network for Zero-Shot Adaptive Voice Conversion
Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models
Mureka TTS V1: A new voice model from Mureka AI, integrated into their AI music and audio platform
Micdrop v2: An open-source set of TypeScript packages for building real-time voice conversations with AI agents
macos-local-voice-agents: A new open-source GitHub repository for running Pipecat voice AI agents locally on macOS
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Speaker Disentanglement of Speech Pre-trained Model Based on Interpretability
SpecASR: Accelerating LLM-based Automatic Speech Recognition via Speculative Decoding
