Events
AI Engineer World’s Fair, a flagship AI engineering conference, features a dedicated Voice & Realtime AI miniconference this year (Jun 29-Jul 2, SF | AI Engineer)
Low Latency Lounge by Deepgram is an invite-only evening for engineers building the fastest AI in the stack, cohosted by Together AI and Runware (Jun 30, SF | LUMA)
Real-Time Voice AI × Device Builders Meetup, “Give Voice to Robots!”, runs alongside IVS Kyoto (Jul 2, Kyoto | Voice AI Space)
Top Updates 💪
AssemblyAI launches Universal-3.5 Pro Realtime, the first streaming STT model that takes the agent’s question as input. (AssemblyAI Blog)
Five9 launches Voice AI Agents and AI Agent Studio at CCW, bringing agentic CX to enterprise contact centers. (CX Today)
Krisp launches Voice Security, bringing deepfake and fraud detection to contact centers. (CX Today)
CallMiner launches real-time AI guidance that lets contact center agents initiate AI assistance on demand with human-in-the-loop controls. (BusinessWire)
Assort Health raises $120M Series C led by Menlo Ventures at a $1.2B valuation to scale its voice AI agent platform across healthcare. (Fierce Healthcare)
Prosper AI raises $30M Series A led by a16z to scale its autonomous patient journey platform, reporting 5x revenue growth in six months. (HackerNoon)
Coval, founded by an ex-Waymo engineer, raises $28M Series A led by Norwest to advance its voice AI evaluation and testing platform. (Pulse2)
Kotoba Technologies raises $10M seed led by Kindred Ventures for its real-time East Asian voice translation platform with sub-2s latency. (VentureBeat)
Valence AI raises $5M seed and secures US patents on real-time emotional detection from live speech. (PR Newswire)
TELUS Digital partners with ElevenLabs as a preferred implementation partner to scale voice AI alongside frontline customer care teams. (PR Newswire)
OpenAI’s GPT-Bidi-1 leaks as a full-duplex voice model that can listen and speak simultaneously, enabling true bidirectional conversation. (Crypto Briefing)
Conduent unveils a next-gen CX platform with real-time translation across 90+ languages to accelerate agent performance. (Conduent)
Speechify brings free voice typing to all iPhone and Mac users, adding AI-powered dictation across every app. (9to5Mac)
Modulate launches an AI music detection API with 95% precision across 76 genres to help platforms verify AI-generated music. (Morningstar)
ByteDance releases Seed Audio 1.0, a unified model that generates speech, music, and ambient sound from a single architecture. (CityBuzz)
Amazon launches Alexa Plus Hindi beta in India, targeting 600M+ Hindi speakers with its upgraded AI assistant. (The Next Web)
ElevenLabs adopts Google’s SynthID watermarking to tag all AI-generated speech, making synthetic voices easier to detect. (Digital Trends)
Shure says audio quality is now the critical bottleneck for AI-powered meetings, and microphone clarity drives everything. (InAVate)
Attention Labs launches SAA, a selective auditory attention layer that lets voice AI detect when it is being directly addressed. (Dispatch)
Deepgram and Fortanix partner to run voice AI on-premises with NVIDIA confidential computing, keeping audio data encrypted during processing. (RadioInfo)
Engineering Corner 😎
Gradium releases STT-Translate and S2S-Translate, real-time speech translation models that beat GPT Realtime Translate on accuracy and latency. (MarkTechPost)
AWS publishes a full tutorial on building a healthcare appointment agent with Amazon Nova 2 Sonic and Bedrock AgentCore. (AWS Blog)
AssemblyAI shares four techniques for prompting Claude to build production-ready voice agents in about 30 seconds. (AssemblyAI Blog)
Deepgram discusses voice AI infrastructure and the path to production-grade agents on the Telecom Reseller podcast. (Telecom Reseller)
ACL 2026 publishes 10 voice AI papers covering noise-robust ASR, accented speech recognition, environment-aware TTS, controllable speech synthesis, multi-speaker diarization, and multilingual translation. (ACL Anthology)

