Updates from Google, MS, Twilio, Alibaba, Genesys and more 🔥

Voice AI weekly digest

Davit Baghdasaryan

Sep 15, 2025

Top Updates 💪

Google’s Gemini AI can now analyse your audio files (Business Standard)
Microsoft’s Copilot AI TTS gets new, cleaner “scripted mode” (PCWorld)
Twilio adds OpenAI API for seamless AI voice warm transfers (WebProNews)
Alibaba unveils its most powerful speech model (36Kr Europe)
SoundHound AI acquires Interactions for $60M (CX Today)
Genesys deepens its ServiceNow partnership, releases new agentic AI orchestration capabilities (CX Today)
Podonos raises $2.4M in pre-seed funding (FINSMES)
Verbit and Deepdub partner to automate multilingual dubbing (PR Newswire)
Milagro and Revmo AI partner to transform restaurant guest engagement with conversational voice AI (PR Newswire)
Intella raises $12.5M to offer AI STT for 25+ Arabic dialects (MENAbytes)
TwinMind raises $5.7M to launch AI second brain for note-taking (Dataconomy)
AiOla brings voice AI to complex industries – with Snowflake (Snowflake)
Why big investors are all ears for voice AI startups (Crunchbase)
Ralph Lauren begins rollout of AI conversational shopping experience (PYMNTS)
Deepdub launches Lightning, a real-time voice model (Yahoo Finance)
AI phone call assistant: Merging human touch with voice AI (London Daily News)
How AI transcription is revolutionizing business communication (TechBullion)
Bolna AI bets on powering all voice models (Analytics India Magazine)
RingCentral acquires CommunityWFM to expand RingCX portfolio (RingCentral)
CallTrackingMetrics launches VoiceAI for contact centers (PR Newswire)
Startup Hello Patient raised $22.5M in Series A (FierceHealthcare)
AI voice agents helped improve accuracy of blood pressure measurements in older adults (News-Medical)

Voice AI Podcast 🎙️

In case you missed the latest episode of Voice AI Podcast…

Engineering Corner 😎

Best TTS Chrome extensions (AboutChromebooks)
OLMoASR: a series of open automatic speech recognition models (AllenAI)
Dual-stream former: A dual-branch transformer architecture for visual speech recognition (MDPI)
Building a speech-enhancement and automatic speech recognition pipeline in Python using SpeechBrain (MarkTechPost)
The future of conversational voice AI is here by Pannalabs.ai (DEV)
Voice recognition vs speech recognition: What you need to know (ClickUp Blog)
Stanford & UC Santa Cruz launch benchmark for audio-language models (Slator)
News Corp Australia rolls out TTS audio innovation (Newscorp Australia)

Fullband 2025 registration is open 🔔

Join us for Fullband 2025, Krisp’s annual voice innovation conference, where CX leaders, analysts, and innovators explore the role of voice in customer contact and the technologies shaping its future.

Voice AI Newsletter
