An NVIDIA research team introduces Audio Flamingo, a groundbreaking audio language model that incorporates in-context learning (ICL), retrieval-augmented generation (RAG), and multi-turn dialogue capabilities, achieving state-of-the-art (SOTA) performance across various audio understanding tasks.
Top Updates 💪
FCC makes AI-generated voices in robocalls illegal
Satya Nadella: India is leading the world with AI translation system Bhashini
Roblox rolls out real-time AI translation for all users
Microsoft’s new AI features for Windows 11 are coming later this year
Evolving "Hey Google": Will Gemini rewrite the future of digital assistants?
Echo AI unveils expansion in Conversation Intelligence
AI can use human perception to help tune out noisy audio
Noteworthy 💪
Why AI-generated audio is so hard to detect
Crafting AI that understands the symphony of speech
Crosstalk: Voice AI that speaks when you're done talking, and stops when you interrupt it
IBM warns of AI-powered audio-jacking in live calls
Audio cloning can take over a phone call without the speakers knowing
TransLinguist enhances product with voice speech AI
Cresta wins Five9 VoiceStream innovation partner of the year award
AI-powered voice-based agents for enterprises: two key challenges
Science and Demo Corner 😎
AI voice cloning and synthetic voice creation using MetaVoice 1B
A paper from Apple proposes a new way to cut WER in ASR systems
OpenVoice: versatile instant voice cloning
AI voices vs voice actors for training
It's never too late: Fusing acoustic information into LLM for ASR
Re-tell: API for building human-like conversational voice AI
Audio Notes is live: generate concise notes from spoken words