Google I/O Goes Voice-First, Corti Beats OpenAI on Medical STT

Voice AI weekly digest

Davit Baghdasaryan

May 25, 2026

Top Updates 💪

Google adds voice to Gmail, Docs and Keep, letting users search their inbox and dictate by speaking instead of typing. (TechCrunch)
Google unveils audio-powered smart glasses at I/O 2026, taking on Meta in the wearable AI race. (TechCrunch)
Spotify launches an ElevenLabs-powered tool that lets authors create audiobooks from text. (TechCrunch)
Corti’s Symphony model outperforms OpenAI’s Whisper on medical terminology accuracy for speech-to-text. (VentureBeat)
Zoom opens its AI Translator and Summarizer as standalone APIs for third-party developers. (Zoom | Slator)
Twilio shares surged 60% as voice AI adoption accelerates across its communications platform. (Sebastian Barros)
Zendesk expands its AI agents across ChatGPT, Gemini, voice and messaging channels. (TechRadar)
Kardome ships its voice AI in LG OLED TVs, reaching mass-market consumers for the first time. (AudioXpress)
Amazon’s Alexa can now generate full podcast episodes on any topic you ask for. (TechBuzz)
Alibaba releases Qwen3.5 LiveTranslate Flash, a real-time interpreter covering 60 languages at 2.8-second latency. (MarkTechPost)
NTSB shuts down its public docket after people used AI to recreate dead pilots’ voices from spectrograms. (Engadget)
Columbia researchers report a successful first human trial of a brain-controlled hearing system that isolates one speaker in noise. (Medscape)
iProov launches a deepfake detection system designed specifically for enterprise video calls. (FinanceFeeds)
Halsa Global launches Voice IQ, a Salesforce-native conversational AI for enterprise sales. (Newswire)
Korean tech firms double down on voice AI with localized models and in-car assistants. (Korea Times)
Tamber launches its AI music creation platform after raising $5M from Adobe Ventures. (Music Business Worldwide)
TalkSign launches Palm 1.0 and Echo 1.0, AI models for sign language recognition and generation. (TechCabal)
CMU research shows adding audio cues like typing sounds makes AI feel more human but also more rude. (TechXplore)
Office workers shift from typing to voice dictation as AI transcription apps go mainstream. (The Week)
Synthflow AI handles over 5 million calls a month as call centres move to voice AI at scale. (Tech.eu)

Engineering Corner 😎

AWS publishes a guide to building real-time voice apps with SageMaker AI and vLLM using bidirectional streaming. (AWS Blog)
VoiceBox is an open-source voice cloning app that runs locally from 3 seconds of audio with no cloud uploads. (TechTimes)
Vowen is a free offline voice dictation tool for Windows and macOS that transcribes speech system-wide. (MajorGeeks)
NoteSnip turns video transcripts into source-grounded AI study notes across YouTube, podcasts and PDFs. (Dev.to)
IEEE Spectrum covers how Māori researchers are building indigenous AI voice models to preserve te reo Māori. (IEEE Spectrum)
Memeburn ranks the best AI voice generators of 2026 by use case, from cloning to e-learning. (Memeburn)

Voice AI Newsletter
