Google presents on-device, real-time voice translation 👀, updates from DeepMind, Krisp, Zoom, NICE and others 🔥
Google presents the first on-device, real-time speech-to-speech translation model
Top Updates 💪
DeepMind’s new AI generates soundtracks and dialogue for videos
RingSense receives “Significant” sales performance enhancements
Meta has created a way to watermark AI-generated speech
Krisp launches AI Accent Localization SDK Early Access
Parsing Mpower: NICE’s integrated “CX AI” offering
HeyGen raises $60M in Series A funding
Meta releases five new AI models for audio and visual research
Zoom adds new agent-assist, translation, & SMS capabilities to its Contact Center
Spring Labs introduces AI copilot for fintechs
IZEA introduces AI voice cloning and speech synthesis in FormAI
NXP introduces audio DSPs with AI audio functions for infotainment
Tandem Health raises $9.5M to scale its healthcare co-pilot
GreyLabs AI raises seed funding for speech analytics in banking and fintech
Hark, provider of a Voice of Customer (VoC) platform, raises $3.5M in seed funding
Noteworthy 💪
WhatsApp works on a Voice Note to Transcribe feature: Here is how it may work
Listen to this page: Chrome's new text-to-speech feature
Voiceitt Chrome extension empowers people with speech disabilities
The power of AI transcription for streamlined communications
Vocal robots: Who am I speaking with?
How to build a startup in real-time AI speech translation
New research alert: How AI is changing employee and customer experiences
Science and Demo Corner 😎
Toucan TTS: MIT-licensed text-to-speech in 7000 languages
Host the Whisper model with streaming mode on Amazon EKS and Ray Serve
Domain adaptive dual-relaxation regression for speech emotion recognition
AI-powered virtual assistants for businesses
Simplify transcription with Oracle Cloud Infrastructure generative AI and speech
Designing the API for building voice assistants, with Nikhil Gupta from Vapi
Seismic for meetings: What sales meetings have been missing
Waveform-domain speech enhancement using spectrogram encoding for ASR
DFNet: Decoupled fusion network for dialectal speech recognition
Adversarial meta sampling for multilingual low-resource speech recognition
Conformer-based speech recognition on extreme edge-computing devices