This GPT-4o voice demo is 🤯 Updates from Meta, PolyAI, GreyLabs, WellSaid and much more 🔥

Davit Baghdasaryan

Aug 12, 2024

GPT-4o voice mode early access demo showing its capabilities:

AI researchers developed a new Listening-While-Speaking Language Model:

Top Updates 💪

SoundHound acquires Amelia AI for $80M after it raised $189M+
OpenAI finds that GPT-4o does some truly bizarre stuff sometimes
PolyAI partners with AWS to boost next-gen voice AI in customer service
How AI is reshaping the customer experience
WhatsApp will add the ability to convert speech to text
Bolt upgrades Driver App chat with ability to translate speech to text

SoftBank’s balancing act; Analysing conversations with GenAI
RingCentral sees double-digit revenue growth, enjoys a surge in RingCX bookings
GreyLabs AI bets on GenAI to analyze customer conversations and get insights
Interra Systems presents Media QC, monitoring, analysis solutions at IBC2024
Trial lawyer’s AI-powered voice tool brought Lori Cohen back to her life
Google Project Astra: The AI assistant we have been waiting for?
WellSaid unveils verbal cues, phonetic respellings, and enhanced security
Ema raises $36M to build universal AI employees for enterprises
People can now speak all languages in their own voice with GalaxyVoice.ai
Bee raises $7M for its wearable AI assistant that learns from your conversations

Noteworthy 💪

How popular is machine translation post-editing?
The bright future of voice self-service

Transforming business communication: The power of AI-driven phone calls
FCC to require improved closed captioning accessibility for English and Spanish

UC round table: Conversational intelligence and analytics
This caller does not exist: Using AI to conduct vishing attacks
Deepfakes: The AI scam you didn’t see coming

A guide to AI voice agents for business owners and leaders
TTS and virtual reality agents in primary school classroom environments
AI in business: Elevating CX and energising employees

Science and Demo Corner 😎

ByteDance researchers present cross-language agent
Real-time speech translation on a VoIP number
Guidebook to reduce latency for Azure STT and TTS applications
Audio-powered robots: A new frontier in AI development
VioLA: Conditional language models for STT, TTS, and translation
A Beginner’s guide to TTS algorithms with real-life examples
Speech recognition: Metrics and architecture
Multi-granularity generative error correction with LLM for joint accent and STT
Keyword guided target speech recognition
Tibetan speech synthesis based on pre-trained mixture alignment FastSpeech2
Speaker identification in single track productions

Discussion about this post

I've never been so excited to read a newsletter, thanks for it!
