Updates from Google, MS, Twilio, Alibaba, Genesys and more 🔥
Voice AI weekly digest
Top Updates 💪
- Google’s Gemini AI can now analyse your audio files (Business Standard) 
- Microsoft’s Copilot AI TTS gets new, cleaner “scripted mode” (PCWorld) 
- Twilio adds OpenAI API for seamless AI voice warm transfers (WebProNews) 
- Alibaba unveils its most powerful speech model (36Kr Europe) 
- SoundHound AI acquires Interactions for $60M (CX Today) 
- Genesys deepens its ServiceNow partnership, releases new agentic AI orchestration capabilities (CX Today) 
- Podonos raises $2.4M in pre-seed funding (FINSMES) 
- Verbit and Deepdub partner to automate multilingual dubbing (PR Newswire) 
- Milagro and Revmo AI partner to transform restaurant guest engagement with conversational voice AI (PR Newswire) 
- Intella raises $12.5M to offer AI STT for 25+ Arabic dialects (MENAbytes) 
- TwinMind raises $5.7M to launch AI second brain for note-taking (Dataconomy) 
- AiOla brings voice AI to complex industries – with Snowflake (Snowflake) 
- Why big investors are all ears for voice AI startups (Crunchbase) 
- Ralph Lauren begins rollout of AI conversational shopping experience (PYMNTS) 
- Deepdub launches Lightning, a real-time voice model (Yahoo Finance) 
- AI phone call assistant: Merging human touch with voice AI (London Daily News) 
- How AI transcription is revolutionizing business communication (TechBullion) 
- Bolna AI bets on powering all voice models (Analytics India Magazine) 
- RingCentral acquires CommunityWFM to expand RingCX portfolio (RingCentral) 
- CallTrackingMetrics launches VoiceAI for contact centers (PR Newswire) 
- Startup Hello Patient raises $22.5M in Series A (FierceHealthcare) 
- AI voice agents helped improve accuracy of blood pressure measurements in older adults (News-Medical) 
Voice AI Podcast 🎙️
In case you missed the latest episode of Voice AI Podcast…
Beyond Cascades to Speech-to-Speech | Anshul Shrivastava & Kumar Saurav (Co-Founders at Vodex.ai)
In the Future of Voice AI series of interviews, I ask my guests three questions:
- What problems do you currently see in Enterprise Voice AI?
- How does your company solve these problems?
- What solutions do you envision in the next 5 years?
Engineering Corner 😎
- Best TTS Chrome extensions (AboutChromebooks) 
- OLMoASR: a series of open automatic speech recognition models (AllenAI) 
- Dual-stream former: A dual-branch transformer architecture for visual speech recognition (MDPI) 
- Building a speech-enhancement and automatic speech recognition pipeline in Python using SpeechBrain (MarkTechPost) 
- The future of conversational voice AI is here by Pannalabs.ai (DEV) 
- Voice recognition vs speech recognition: What you need to know (ClickUp Blog) 
- Stanford & UC Santa Cruz launch benchmark for audio-language models (Slator) 
- News Corp Australia rolls out TTS audio innovation (News Corp Australia) 
Fullband 2025 registration is open 🔔
Join us for Fullband 2025, Krisp’s annual voice innovation conference, where CX leaders, analysts, and innovators explore the role of voice in customer contact and the technologies shaping its future.



