Moshi beats OpenAI's voice model 👀 Updates from NVIDIA, Twilio, MS, ElevenLabs and much more 🔥
Kyutai’s AI voice assistant Moshi beat OpenAI to one of ChatGPT's most anticipated features. According to Kyutai, Moshi can speak in various accents and supports 70 different emotional and speaking styles.
Top Updates 💪
Conversational AI market to grow to $87B by 2034
Speech-to-text API market estimated to reach $12.1 billion by 2031
NVIDIA NeMo T5-TTS model tackles hallucinations in speech synthesis
Twilio announces a mobile app for CCaaS, “Personalized IVR”, & more
ElevenLabs Voice Isolator: Remove background noise from films, podcasts, and interviews
Microsoft to add name mispronunciation detector to Teams
Philips Speech and Sembly AI join to introduce AI meeting transcription solution
The best AI Meeting Assistant: 12 options for productivity
Microsoft expands Azure AI Speech with multilingual voices and avatars
Vida, provider of AI voice agents, raises $3M funding
Best Text-to-Speech software in 2024
Noteworthy 💪
Zoom vs Zoom Workplace: Is Zoom Workplace Just Zoom?
Are voice notes the latest weapon in the deepfake arsenal?
Get more from audio data with Conversational Intelligence
Make your voice chatbots more engaging with new text to speech features
Assessment of Pepper Robot’s STT system through the lens of machine learning
Science and Demo Corner 😎
VALL-E 2: Breakthrough in human-like TTS technology
Moshi: A GPT-4o-like model that can see, hear, and talk natively
Enhancing AI-powered video to text transcription on Google Cloud
Audio AI: The best AI voice generators & text to voice tools
Central Kurdish TTS with novel end-to-end transformer training
Enhancing multilingual speech recognition in air traffic control
Best video translators in 2024
Best Ringly.IO alternatives
Best Voiceplug.ai alternatives