<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Voice AI Newsletter]]></title><description><![CDATA[Voice AI insights from Krisp's CEO]]></description><link>https://voice-ai-newsletter.krisp.ai</link><image><url>https://substackcdn.com/image/fetch/$s_!YLgs!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F831a2f7e-d0a7-4e3d-87a8-c42c65d0b71c_1000x1000.png</url><title>Voice AI Newsletter</title><link>https://voice-ai-newsletter.krisp.ai</link></image><generator>Substack</generator><lastBuildDate>Fri, 10 Apr 2026 10:15:31 GMT</lastBuildDate><atom:link href="https://voice-ai-newsletter.krisp.ai/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Krisp Technologies]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[krispai@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[krispai@substack.com]]></itunes:email><itunes:name><![CDATA[Davit Baghdasaryan]]></itunes:name></itunes:owner><itunes:author><![CDATA[Davit Baghdasaryan]]></itunes:author><googleplay:owner><![CDATA[krispai@substack.com]]></googleplay:owner><googleplay:email><![CDATA[krispai@substack.com]]></googleplay:email><googleplay:author><![CDATA[Davit Baghdasaryan]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Microsoft Enters the Voice AI Race]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/microsoft-enters-the-voice-ai-race</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/microsoft-enters-the-voice-ai-race</guid><dc:creator><![CDATA[Davit 
Baghdasaryan]]></dc:creator><pubDate>Mon, 06 Apr 2026 14:03:48 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9fa20f24-6261-4c1c-a5a2-644f8de51ddd_1902x946.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p><strong>Microsoft launches MAI-Transcribe-1 and MAI-Voice-1</strong> - Two new in-house models: a batch transcription model (top 25 languages, 2.5x faster than Azure Fast) and a voice generation model that produces 60s of audio in 1s. Available now in Foundry. (<a href="https://venturebeat.com/technology/microsoft-launches-3-new-ai-models-in-direct-shot-at-openai-and-google">VentureBeat</a>) (<a href="https://microsoft.ai/news/today-were-announcing-3-new-world-class-mai-models-available-in-foundry/">Microsoft AI</a>)</p></li></ul><ul><li><p><strong>Microsoft open-sources VibeVoice</strong> - Family of TTS and ASR models under MIT license. TTS handles up to 90 minutes with 4 speakers. ASR transcribes 60-minute audio in a single pass with speaker diarization. Already at 27K GitHub stars. (<a href="https://github.com/microsoft/VibeVoice">GitHub</a>)</p></li><li><p><strong>Alibaba releases Qwen3.5-Omni</strong> - Native multimodal model processing text, audio, video in one pipeline. Speech recognition in 113 languages, generation in 36. Built-in turn-taking recognition that distinguishes backchanneling from real interruptions. Closed source, API only. (<a href="https://www.marktechpost.com/2026/03/30/alibaba-qwen-team-releases-qwen3-5-omni-a-native-multimodal-model-for-text-audio-video-and-realtime-interaction/">MarkTechPost</a>)</p></li><li><p><strong>Modulate launches Velma Deepfake Detect</strong> - Synthetic voice detection API ranked #1 on the HuggingFace Deepfake Speech leaderboard. Claims 578x lower cost than the next-best model, making always-on call monitoring viable. 
(<a href="https://gamesbeat.com/modulatess-velma-deepfake-detect-focuses-on-synthetic-voice-detection/">GamesBeat</a>)</p></li><li><p><strong>CNTXT AI launches Munsit</strong> - Arabic voice AI platform combining ASR and TTS across 25+ dialects. Already processing over a million minutes of audio for 250+ government and enterprise orgs in the UAE. (<a href="https://www.zawya.com/en/press-release/companies-news/cntxt-ai-launches-munsit-the-worlds-most-accurate-arabic-voice-ai-as-demand-for-ai-services-accelerates-across-the-uae-fw5z241m">Zawya</a>)</p></li><li><p><strong>Retell AI makes Wing VC Enterprise Tech 30</strong> - Voice AI agent platform hit $50M ARR and powers 50M+ real-time AI phone calls per month. One of three voice AI companies on the list. (<a href="https://www.globenewswire.com/news-release/2026/04/03/3268014/0/en/Voice-AI-Startup-Retell-AI-Named-to-Wing-VC-Enterprise-Tech-30-2026-List-Celebrating-the-Best-of-Enterprise-Tech.html">GlobeNewswire</a>)</p></li><li><p><strong>Speechify launches Windows app with on-device models</strong> - Local Whisper-based transcription and neural TTS on Copilot+ PCs and GPUs. No cloud needed. Competing with Wispr Flow and Superwhisper. (<a href="https://techcrunch.com/2026/03/31/speechifys-windows-app-uses-local-models-for-transcription-and-dictation/">TechCrunch</a>)</p></li><li><p><strong>The hidden cost of agentic AI callers</strong> - Some B2B contact centers seeing 15-20% of inbound volume from AI agents at peak. They wait forever, consume resources, and extract operational data. Detection is key. (<a href="https://www.symnexconsulting.com/blog/hidden-cost-of-agentic-ai-callers">SymNex</a>)</p></li><li><p><strong>AudioShake ships real-time audio separation SDK</strong> - Source separation for iOS, Android, Windows, Linux. Ranked #1 in Meta&#8217;s SAM audio benchmarks. Used by Warner, Universal, Sony, Disney. Now available for edge deployment. 
(<a href="https://slator.com/ai-audio-separation-audioshake/">Slator</a>)</p></li><li><p><strong>AI voice scams surge with 3-second cloning</strong> - Scammers cloning family members&#8217; voices from short social media clips. BBB and FTC warnings. AI-generated voice fraud up 1,200% in 2025. (<a href="https://www.moneycontrol.com/news/business/personal-finance/think-it-s-your-family-calling-why-ai-voice-scams-are-getting-harder-to-spot-13878976.html">MoneyControl</a>)</p></li><li><p><strong>MiraVoice raises $6.3M</strong> - AI voice agent for long-form phone surveys (120+ questions, 40+ min). Seed round led by Unusual Ventures. (<a href="https://news.crunchbase.com/venture/ai-interviewer-miravoice-raises-seed-funding-unusual/">Crunchbase</a>)</p></li><li><p><strong>Gnani.ai raises $10M Series B</strong> - India&#8217;s leading voice AI platform, 30M+ voice interactions daily in 12+ languages. Also launched Inya VoiceOS, a 5B-parameter voice-to-voice model. (<a href="https://www.businesstoday.in/technology/story/gnaniai-raises-10-million-in-funding-from-aavishkaar-capital-to-scale-global-voice-ai-push-523218-2026-03-31">BusinessToday</a>)</p></li><li><p><strong>Insight Health raises $11M Series A</strong> - Voice and chat AI agents for clinical admin: patient screening, referral processing, EHR documentation. Integrated with athenahealth. (<a href="https://www.mobihealthnews.com/news/insight-health-raises-11m-scale-clinical-ai-agents">MobiHealthNews</a>)</p></li><li><p><strong>Google Gboard adds Bluetooth mic for voice typing</strong> - Finally lets you dictate through connected earbuds instead of phone mic. Rolling out via server-side update. 
(<a href="https://www.androidauthority.com/gboard-voice-typing-bluetooth-earbuds-3652971/">Android Authority</a>)</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p><strong>Orpheus-FastAPI</strong> - Self-hosted TTS server with OpenAI-compatible API. 8 voices, emotion tags, long-form batching. Connects to llama.cpp, LM Studio, GPUStack. Apache 2.0. (<a href="https://github.com/Lex-au/Orpheus-FastAPI">GitHub</a>)</p></li><li><p><strong>MeloTTS</strong> - Multi-lingual TTS library by MyShell.ai. English (4 accents), Spanish, French, Chinese, Japanese, Korean. Runs in real time on CPU. MIT license. (<a href="https://github.com/myshell-ai/MeloTTS">GitHub</a>)</p></li><li><p><strong>Build a voice-enabled AI agent in n8n</strong> - Step-by-step tutorial for wiring up voice input/output in n8n workflows. (<a href="https://dev.to/kfuras/build-a-voice-enabled-ai-agent-in-n8n-3oke">dev.to</a>)</p></li><li><p><strong>How to choose the best STT API for voice agents</strong> - Comparison of latency, accuracy, and cost tradeoffs across providers. (<a href="https://hackernoon.com/how-to-choose-the-best-speech-to-text-api-for-voice-agents">HackerNoon</a>)</p></li><li><p><strong>The hidden audio bias in audio-visual speech recognition</strong> - Analysis of how AV-ASR models over-rely on audio, undermining the visual modality. 
(<a href="https://hackernoon.com/the-hidden-audio-bias-inside-audio-visual-speech-recognition">HackerNoon</a>)</p></li><li><p><strong>Why speech recognition APIs need a different architecture</strong> - Smallest AI on designing ASR for real-time voice agent use cases vs batch transcription. (<a href="https://dev.to/smallestai-community/why-speech-recognition-api-requires-a-different-architecture-46ed">dev.to</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Krisp Is Nominated for 3 Webby Awards]]></title><description><![CDATA[Your vote decides who wins.]]></description><link>https://voice-ai-newsletter.krisp.ai/p/krisp-is-nominated-for-3-webby-awards</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/krisp-is-nominated-for-3-webby-awards</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Thu, 
02 Apr 2026 13:54:40 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/49535f6e-5c33-47eb-bad8-55b20a62bb00_1000x700.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ou3b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ou3b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ou3b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ou3b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ou3b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ou3b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg" width="1200" height="300" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:86772,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://voice-ai-newsletter.krisp.ai/i/192785886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ou3b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ou3b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ou3b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ou3b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2585b1fa-9f16-4b73-9cd2-7267d4fe0f9a_1200x300.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>Krisp has been nominated for three 2026 Webby Awards for Technical Achievement, Developer Tools &amp; APIs, and Best Use of AI Voice &amp; Conversational 
Interface.</p><p>The Webby Awards are one of the most recognized honors in digital technology. Getting nominated in three categories, all tied to voice AI, is a meaningful signal of where this space is headed and the work the team has put in to get us here.</p><p>The People&#8217;s Voice Award is decided solely by public vote. </p><p><strong>If you follow this newsletter, you already believe in what we&#8217;re building, and we'd love your support.</strong></p><h3><strong>Click each link below to vote:</strong></h3><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://vote.webbyawards.com/PublicVoting#/2026/apps-software-immersive/app-excellence/technical-achievement&quot;,&quot;text&quot;:&quot;Technical Achievement&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://vote.webbyawards.com/PublicVoting#/2026/apps-software-immersive/app-excellence/technical-achievement"><span>Technical Achievement</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://vote.webbyawards.com/PublicVoting#/2026/apps-software-immersive/business-software-services/developer-tools-apis&quot;,&quot;text&quot;:&quot;Developer Tools &amp; APIs&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://vote.webbyawards.com/PublicVoting#/2026/apps-software-immersive/business-software-services/developer-tools-apis"><span>Developer Tools &amp; APIs</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://wbby.co/58853N&quot;,&quot;text&quot;:&quot;AI Voice &amp; Conversational Interface&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://wbby.co/58853N"><span>AI Voice &amp; Conversational Interface</span></a></p><p><strong>Or type &#8220;Krisp&#8221; into the category search bar and our 
nominations will surface for one-click voting.</strong></p><p>You can cast one vote per category. Voting closes April 16.</p><p>Thank you, and more soon.</p><p>&#8212; Davit</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Voice AI Newsletter! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[3 New Open-Source Voice Models Drop in One Week]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/3-new-open-source-voice-models-drop</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/3-new-open-source-voice-models-drop</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 30 Mar 2026 14:03:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/c7a3c93a-0840-497a-84dd-fb57fc8122d8_1296x872.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p><strong>Mistral launches Voxtral TTS</strong> - Open-weight 4B TTS model. 9 languages, 90ms TTFA, 6x RTF. Runs on consumer GPUs. Mistral claims it beats ElevenLabs on quality benchmarks. 
(<a href="https://techcrunch.com/2026/03/26/mistral-releases-a-new-open-source-model-for-speech-generation/">TechCrunch</a>) (<a href="https://mistral.ai/news/voxtral-tts">Mistral blog</a>)</p></li></ul><ul><li><p><strong>Cohere releases Transcribe</strong> - Open-source 2B ASR model built for edge. 14 languages, 5.42 avg WER on HF Open ASR leaderboard, beating Zoom Scribe v1, IBM Granite 4.0, ElevenLabs Scribe v2, and Qwen3-ASR. Free via API and HuggingFace. (<a href="https://techcrunch.com/2026/03/26/cohere-launches-an-open-source-voice-model-specifically-for-transcription/">TechCrunch</a>) (<a href="https://cohere.com/blog/transcribe">Cohere blog</a>)</p></li><li><p><strong>Google ships Gemini 3.1 Flash Live + Search Live goes global</strong> - Real-time voice/video model with native function calling. 90.8% on ComplexFuncBench Audio (~20% jump over prev gen). Now powers Search Live in 200+ countries with voice and camera input. (<a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-live/">Google blog</a>) (<a href="https://techcrunch.com/2026/03/26/google-is-launching-search-live-globally/">TechCrunch</a>)</p></li><li><p><strong>Smallest AI launches Lightning V3</strong> - 3.89 MOS in conversational evals, claims to beat OpenAI, Cartesia, and ElevenLabs. 15 languages with auto-detection and mid-sentence switching. Voice cloning from 5-15s of audio. (<a href="https://smallest.ai/blog/introducing-lightning-v3">Smallest.ai blog</a>)</p></li><li><p><strong>Amazon Polly adds Bidirectional Streaming</strong> - Stream text to Polly token-by-token as your LLM generates it, get audio back in real time over HTTP/2. 39% faster than batch approach, collapses 27 API calls to 1 on a 970-word passage. GA now. 
(<a href="https://aws.amazon.com/blogs/machine-learning/introducing-amazon-polly-bidirectional-streaming-real-time-speech-synthesis-for-conversational-ai/">AWS blog</a>)</p></li><li><p><strong>AWS adds WebRTC to Bedrock AgentCore</strong> - Pipecat voice agents now run on AgentCore Runtime with bidirectional WebSocket and WebRTC. Supports barge-in. Ready-to-deploy examples with Pipecat, Nova Sonic, LiveKit, and Strands SDK. (<a href="https://aws.amazon.com/blogs/machine-learning/deploy-voice-agents-with-pipecat-and-amazon-bedrock-agentcore-runtime-part-1/">AWS blog</a>)</p></li><li><p><strong>Genesys reports record Q4</strong> - Genesys Cloud at ~$2.6B ARR, 35%+ YoY growth. 70%+ of customers now on AI. AI-powered conversations up 120% YoY. AI is 20% of new ACV, with 10+ deals where AI exceeded half the contract value. (<a href="https://www.genesys.com/company/newsroom/announcements/genesys-reports-record-fourth-quarter-as-organizations-accelerate-the-adoption-of-ai-powered-experience-orchestration">Genesys</a>)</p></li><li><p><strong>Artificial Analysis updates voice benchmarks</strong> - AA-WER v2.0 adds conversational AI, EU Parliament speech, and financial call datasets. ElevenLabs Scribe v2 leads at 2.3% WER. Best value: Mistral Voxtral Small at 3.0% WER / $4 per 1K min. TTS Arena: Inworld TTS-1.5-Max at #1, ELO 1,160. (<a href="https://x.com/ArtificialAnlys/status/2037195442489090485?s=20">X post</a>)</p></li><li><p><strong>AI chatbots handle 60%+ of banking support</strong> - BofA Erica: 1.5B+ interactions, 98% resolved without human. Klarna AI: 66% of inquiries, saving $40M/yr. Gartner projects $80B in contact center labor cost cuts in 2026. (<a href="https://techbullion.com/why-ai-chatbots-are-handling-over-60-of-banking-customer-support/">TechBullion</a>)</p></li><li><p><strong>The economics of AI vs human agents</strong> - Voice AI now costs ~$0.40/call vs $7-12 for a human agent: 90-95% cost reduction per interaction. 
Analysis of how this is reshaping contact center staffing. (<a href="https://medium.datadriveninvestor.com/the-silence-of-the-call-center-openai-just-cut-40-of-call-center-jobs-in-one-week-df82cc10fc61">Medium</a>)</p></li><li><p><strong>Agentic Voice AI goes mainstream</strong> - 1 in 10 customer service interactions projected to be fully automated by agentic voice AI in 2026. 80% of businesses plan to deploy. RingCentral shipped AIR Pro, an agentic voice platform embedded in its comms stack. (<a href="https://telecomreseller.com/2026/03/24/agentic-voice-ai-for-business/">Telecom Reseller</a>)</p></li><li><p><strong>Salesforce Agentforce Contact Center</strong> - Native CCaaS unifying voice, digital channels, CRM, and AI agents in one stack. Voice now built into the CRM on Hyperforce. GA since Feb 23. (<a href="https://cloudwars.com/ai/salesforce-agentforce-contact-center-brings-unified-data-and-ai-agents-to-customer-service/">Cloud Wars</a>)</p></li><li><p><strong>Otter.ai hits 35M users, $100M ARR</strong> - Sam Liang interview. $100M ARR with &lt;200 employees ($500K+ rev/employee). #14 on Forbes 2026 Best Startup Employers. Liang: 2026 is &#8220;the year of the voice.&#8221; (<a href="https://youtu.be/7yMetPnsFT0">YouTube</a>)</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p><strong>Gladia open-sources WER normalization library</strong> - Normalizes transcripts before computing WER to eliminate false penalties from formatting differences (&#8220;$50&#8221; vs &#8220;fifty dollars&#8221;). 
Configurable YAML pipelines for fair cross-engine ASR comparison. (<a href="https://github.com/gladiaio/normalization">GitHub</a>) (<a href="https://www.linkedin.com/posts/gevorg-minasyan-42475a132_tts-normalization-comparisons-activity-7442931104967319552-Baog/">LinkedIn - Gevorg Minasyan</a>)</p></li></ul><ul><li><p><strong>MacWhisper</strong> - Mac-native local transcription using Whisper and Nvidia Parakeet. 300K copies sold. Batch processing, YouTube transcription, auto-recording Zoom/Teams/Webex. All on-device. (<a href="https://www.trendhunter.com/amp/trends/macwhisper">Trend Hunter</a>)</p></li><li><p><strong>Logan Kilpatrick on Gemini 3 Flash</strong> - Google DeepMind&#8217;s Logan Kilpatrick discusses the latest Gemini model capabilities. (<a href="https://x.com/OfficialLoganK/status/2037187750005240307?s=20">X post</a>)</p></li><li><p><strong>Google Docs adds Gemini-powered audio proofreading</strong> - &#8220;Listen to this&#8221; reads docs aloud with AI voices. 0.5x-2x playback. Also ships audio summaries: condenses long docs into ~3min podcast-style recaps. Desktop, English only for now. (<a href="https://www.makeuseof.com/google-docs-hidden-audio-feature-proofread/">MakeUseOf</a>)</p></li><li><p><strong>Rekam AI</strong> - All-in-one voice platform: TTS, STT, voice cloning, custom voice creation. 2,000+ voices, 20+ languages. Free unlimited tier for Kokoro models. (<a href="https://dynamicbusiness.com/ai-tools/rekam-ai-ai-voice-platform-overview.html">Dynamic Business</a>)</p></li><li><p><strong>Klassifier</strong> - AI-powered audio classification tool. (<a href="https://www.trendhunter.com/trends/klassifier">Trend Hunter</a>)</p></li><li><p><strong>ViciStack on call center AI voice agents</strong> - Overview of real-time conversation handling, reduced wait times, and automated workflows in production contact centers. 
(<a href="https://vicistack.com/blog/call-center-ai-voice-agents">ViciStack</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Scaling STT systems | Maxime Gaudin (CTO at Gladia)]]></title><description><![CDATA[Watch now | In the Future of Voice AI series of interviews, I ask three questions to my guests: - What problems do you currently see in Enterprise Voice AI? - How does your company solve these problems? 
- What solutions do you envision in the next 5 years?]]></description><link>https://voice-ai-newsletter.krisp.ai/p/scaling-stt-systems-maxime-gaudin</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/scaling-stt-systems-maxime-gaudin</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Thu, 26 Mar 2026 13:10:44 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/191781305/5ff3cfd6530c40911705194eb9bac5f9.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<pre><code><code>In the Future of Voice AI series of interviews, I ask three questions to my guests:

- What problems do you currently see in Enterprise Voice AI?
- How does your company solve these problems?
- What solutions do you envision in the next 5 years?</code></code></pre><p>This episode&#8217;s guest is <a href="https://www.linkedin.com/in/maxime-gaudin/">Maxime Gaudin</a>, CTO at <a href="https://www.gladia.io/">Gladia</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M_NR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M_NR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png 424w, https://substackcdn.com/image/fetch/$s_!M_NR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png 848w, https://substackcdn.com/image/fetch/$s_!M_NR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!M_NR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M_NR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png" width="1200" height="1200" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1200,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:512303,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://voice-ai-newsletter.krisp.ai/i/191781305?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!M_NR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png 424w, https://substackcdn.com/image/fetch/$s_!M_NR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png 848w, https://substackcdn.com/image/fetch/$s_!M_NR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!M_NR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a26818e-78f7-4be0-86b5-1468d7f09021_1200x1200.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Former Co-Founder &amp; CTO at Matcha and CTO at MadKudu through its private equity acquisition. Among the first employees at Malt, where he helped scale the company from 6 to 250 people and over &#8364;12M in monthly transaction volume. He earned his Master's degree in Computer Science from INSA Lyon and Polytechnique Montr&#233;al, Canada. Throughout his career, he has built and scaled products across B2B SaaS, data intelligence, and speech AI, from early-stage founding to leading engineering organizations through hypergrowth and acquisitions.</p><p><a href="http://www.gladia.io">Gladia</a> was founded in 2022 by Jean-Louis Queguiner and Jonathan Soto with a mission to help companies leverage cutting-edge AI and retrieve actionable insights from audio data. 
Its API supports advanced speech recognition features in over 100 languages, with exceptional accuracy, for both asynchronous and real-time transcription.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.youtube.com/@futureofvoiceai&quot;,&quot;text&quot;:&quot;Listen on YouTube&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.youtube.com/@futureofvoiceai"><span>Listen on YouTube</span></a></p><h3><strong>Recap Video</strong></h3><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;9bea8880-52b7-4f24-a617-2bc892ef35a1&quot;,&quot;duration&quot;:null}"></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Voice AI Newsletter! 
Subscribe for free to receive weekly updates.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h3><strong>Takeaways</strong></h3><ul><li><p>Winning isn&#8217;t just about model quality; it is about surviving brutal tradeoffs between latency, cost, and scale.</p></li><li><p>The real challenge is not training one great model; it is running it cheaply enough to meet market pricing without breaking performance.</p></li><li><p>STT is getting commoditized so fast that providers have to chase better accuracy while selling at margins that keep shrinking.</p></li><li><p>Big models don&#8217;t matter if they are too expensive to run at scale.</p></li><li><p>Real-time voice AI lives or dies under a hard latency budget, and staying under 300 milliseconds leaves little room for mistakes.</p></li><li><p>The industry obsession with one model that does everything may be the wrong path if smaller specialist models can outperform it in the moments that matter.</p></li><li><p>Every model upgrade is risky because improving one language or task can make another one worse.</p></li><li><p>Testing speech systems is harder than people admit because teams know something broke but don&#8217;t know what.</p></li><li><p>General transcription errors can be patched by an LLM, but once a name, phone number, email, or address is lost, it is gone.</p></li><li><p>The next edge in voice AI may come from tiny models trained for high-value details like PII, not from one giant model trying to handle everything.</p></li><li><p>Email addresses sound simple until real accents, pauses, corrections, and spelling cues expose how messy spoken language really is.</p></li><li><p>The companies that win enterprise voice AI will be the 
ones that orchestrate many narrow models well, not the ones chasing a single universal model.</p></li><li><p>Infrastructure strategy is becoming a product decision because legal rules, traffic spikes, and customer use cases all change what &#8220;best&#8221; deployment looks like.</p></li><li><p>Cloud scaling breaks down under real-time traffic spikes, such as emergency calls.</p></li><li><p>Paying for managed infrastructure while also staffing a large DevOps team wastes money.</p></li><li><p>Customers want one vendor for everything, even if quality drops.</p></li><li><p>The market will reward depth over breadth if a vendor can become truly exceptional in one painful, business-critical part of the voice stack.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Scale AI launches real-world voice AI benchmark]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/scale-ai-launches-real-world-voice</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/scale-ai-launches-real-world-voice</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 23 Mar 2026 14:02:58 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/e054f56f-17b9-4ab0-9d17-c2b2b5aa237b_1880x1022.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p>Scale AI launches the first real-world voice AI benchmark (<a href="https://venturebeat.com/data/scale-ai-launches-voice-showdown-the-first-real-world-benchmark-for-voice-ai">VentureBeat</a>)</p></li><li><p>NVIDIA releases Nemotron 3 VoiceChat speech-to-speech model (<a href="https://x.com/ArtificialAnlys/status/2033642073052868861?s=20">X</a>)</p></li><li><p>Krisp launches MCP integration with Claude (<a href="https://www.linkedin.com/feed/update/urn:li:activity:7440383581539033088/">LinkedIn</a>)</p></li><li><p>Amazon Connect voice AI agents now support 13 new languages (<a 
href="https://aws.amazon.com/about-aws/whats-new/2026/03/amazon-connect-voice-ai-agents-13-languages/">AWS</a>)</p></li><li><p>Modulate launches Velma Transcribe: High-performance transcription for real-world conversations at 90% lower cost (<a href="https://www.enterprisenews.com/press-release/story/89859/modulate-launches-velma-transcribe-high-performance-transcription-for-real-world-conversations-at-90-lower-cost/">Enterprise News</a>)</p></li><li><p>Google News could soon give you a convenient new way to consume its audio briefings (<a href="https://www.androidauthority.com/google-news-read-ai-audio-briefings-transcript-apk-teardown-3649402/">Android Authority</a>)</p></li><li><p>AI notetaking devices that record and transcribe your meetings (<a href="https://techcrunch.com/2026/03/20/ai-notetaker-hardware-devices-pins-pendants-record-transcribe/">TechCrunch</a>)</p></li><li><p>Krisp has been named a Palomarr Leader across Accent Conversion, Noise Cancellation, Voice Translation (<a href="https://www.linkedin.com/feed/update/urn:li:activity:7439664507603357696/">LinkedIn</a>)</p></li><li><p>Amazon Connect adds new generative TTS voices and expands regions (<a href="https://aws.amazon.com/about-aws/whats-new/2026/03/amazon-connect-adds-generative-text-to-speech-voices/">AWS</a>)</p></li><li><p>Ringover launches enhanced AI assistant ask Empower 2.0 (<a href="https://aithority.com/machine-learning/ringover-launches-enhanced-ai-assistant-ask-empower-2-0/">AIThority</a>)</p></li><li><p>WhatsApp upgrade &#8212; calls will sound completely different (<a href="https://nokiapoweruser.com/whatsapp-just-got-a-game-changing-upgrade-calls-will-sound-completely-different/">Nokia Power User</a>)</p></li><li><p>8x8 Engage launches globally for frontline teams (<a href="https://www.cmswire.com/customer-experience/8x8-engage-launches-globally-for-frontline-teams/">CMSWire</a>)</p></li><li><p>Itel unveils Zeno AI Weaver voice recorder in India (<a 
href="https://www.gadgets360.com/ai/news/itel-zeno-ai-weaver-voice-recorder-price-in-india-unveil-specifications-features-11233496">Gadgets360</a>) </p></li><li><p>AI voice cloning &amp; synthesis are shaping the future of digital voices (<a href="https://www.techtimes.com/articles/315169/20260317/ai-voice-cloning-voice-synthesis-technology-are-shaping-future-digital-voices.htm">TechTimes</a>)</p></li><li><p>How businesses are replacing IVR with conversational AI (<a href="https://socialmediaexplorer.com/business-innovation-2/ai-voice-agents-in-2026-how-businesses-are-replacing-ivr-with-conversational-ai-that-actually-works/">Social Media Explorer</a>)</p></li><li><p>Bandicam launches AI feature to transcribe video to text on Mac (<a href="https://martechseries.com/video/bandicam-launches-ai-feature-to-transcribe-video-to-text-on-mac/">MarTech Series</a>)</p></li><li><p>The mounting cost of voice fraud: revenue loss, broken trust (<a href="https://www.retaildive.com/spons/the-mounting-cost-of-voice-fraud-revenue-loss-broken-trust-and-operationa/814409/">Retail Dive</a>)</p></li><li><p>Robinhood&#8217;s startup fund invests $35M in Stripe and AI audio firm (<a href="https://www.theblock.co/amp/post/393910/robinhoods-startup-fund-invests-roughly-35-million-across-stripe-and-ai-audio-firm">The Block</a>)</p></li><li><p>Ezra raises $3.2M in seed funding (<a href="https://www.finsmes.com/2026/03/ezra-raises-3-2m-in-seed-funding.html">FinSMEs</a>)</p></li><li><p>WellSaid closes venture debt funding (<a href="https://www.finsmes.com/2026/03/wellsaid-closes-venture-debt-funding.html">FinSMEs</a>)</p></li></ul><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>VoXtream2: Full-stream TTS with dynamic speaking-rate control (<a href="https://www.linkedin.com/posts/ntorgashov_tts-texttospeech-streaming-activity-7439675674191147008-BOlV/">LinkedIn</a>)</p></li><li><p>Adaptive AI voice layer for real-time communication (<a href="https://dev.to/peacebinflow/adaptive-ai-voice-layerfor-real-time-communication-32gf">Dev</a>)</p></li><li><p>Utterly: Transcribe speech privately on Apple devices, offline (<a href="https://betalist.com/startups/utterly">BetaList</a>)</p></li><li><p>MiniMax 2.7: GLM-5 at 1/3 cost SOTA open model (<a href="https://news.smol.ai/issues/26-03-18-not-much/">Smol AI News</a>)</p></li><li><p>Best STT APIs to build an AI notetaker in 2026 (<a href="https://hackernoon.com/best-speech-to-text-apis-to-build-an-ai-notetaker-in-2026">Hacker Noon</a>)</p></li><li><p>PersonaOps: A voice-to-data intelligence system powered by Notion MCP (<a href="https://dev.to/peacebinflow/personaopsa-voice-to-data-intelligence-systempowered-by-notion-mcp-4m2">Dev</a>)</p></li><li><p>Google AI releases WAXAL: Multilingual African speech dataset (<a href="https://www.marktechpost.com/2026/03/17/google-ai-releases-waxal-a-multilingual-african-speech-dataset-for-training-automatic-speech-recognition-and-text-to-speech-models/">MarktechPost</a>)</p></li><li><p>WhisperWeb processes STT directly within the browser (<a href="https://www.trendhunter.com/trends/local-ai-transcription">Trend Hunter</a>)</p></li><li><p>Why building voice AI agents is still so hard (<a href="https://dev.to/dograh/why-building-voice-ai-agents-is-still-so-hard-and-why-we-started-dograh-2gcc">Dev</a>)</p></li><li><p>OpenVoiceUI: AI voice agent app generates live canvas pages (<a 
href="https://dev.to/mcerqua/openvoiceui-ai-voice-agent-app-generates-live-canvas-pages-using-openclaw-33i9">Dev</a>)</p></li><li><p>Vietnamese automatic speech recognition (<a href="https://tldr.takara.ai/p/2603.14779">TLDR Takara</a>)</p></li><li><p>VoiceType AI transcribes, edits, and auto-formats your speech (<a href="https://www.trendhunter.com/trends/voicetype-ai">Trend Hunter</a>)</p></li><li><p>Speech synthesis API for TTS (<a href="https://dev.to/omriluz1/speech-synthesis-api-for-text-to-speech-1b1c">Dev</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Updates from Salesforce, RingCentral, MS and others!]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/updates-from-salesforce-ringcentral</link><guid 
isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/updates-from-salesforce-ringcentral</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 16 Mar 2026 14:01:40 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/c85a9eb3-d05d-4cb0-a5f2-61faf1f2e0ba_1280x720.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p>Salesforce launches Agentforce Contact Centre (<a href="https://cxm.world/customer-experience/salesforce-launches-agentforce-contact-centre-ai-voice-crm-finally-unified-in-one-ccaas-platform/">CXM World</a>)</p></li><li><p>RingCentral unveils AIR Pro at Enterprise Connect (<a href="https://www.cxtoday.com/contact-center/enterprise-connect-ringcentral-air-pro-voice-ai/">CX Today</a>)</p></li><li><p>Microsoft announces Custom Voice for Dynamics 365 Contact Center (<a href="https://www.microsoft.com/en-us/dynamics-365/blog/it-professional/2026/03/06/custom-neural-voices-dynamics-365-contact-center/">Microsoft</a>)</p></li><li><p>Intron launches voice AI supporting 57 African languages (<a href="https://kenyanwallstreet.com/intron-launches-new-voice-ai-service-sahara-v2">Kenyan Wallstreet</a>)</p></li><li><p>Krisp launches customer accent conversion for global contact centers (<a href="https://www.cxtoday.com/contact-center/krisp-customer-accent-conversion-contact-centers/">CX Today</a>)</p></li><li><p>Voice and language intelligence market size in 2026 (<a href="https://www.precedenceresearch.com/voice-and-language-intelligence-market">Precedence Research</a>)</p></li><li><p>Hume AI appoints new CEO (<a href="https://www.prnewswire.com/news-releases/hume-ai-appoints-new-ceo-302668103.html">PR Newswire</a>)</p></li><li><p>ElevenLabs pledges to restore 1 million voices at SXSW (<a href="http://findarticles.com/elevenlabs-pledges-to-restore-1-million-voices-at-sxsw/">FindArticles</a>)</p></li><li><p>AI customer support startup Wonderful AI raises $150 
million (<a href="https://www.bloomberg.com/news/articles/2026-03-12/ai-customer-support-startup-wonderful-ai-raises-150-million">Bloomberg</a>)</p></li><li><p>Devnagri AI launches multilingual enterprise speech AI (<a href="https://martechseries.com/predictive-ai/ai-platforms-machine-learning/devnagri-ai-launches-speech-ai-to-power-multilingual-voice-workflows-for-enterprises/">MarTech Series</a>)</p></li><li><p>Spectrum Business and RingCentral expand partnership (<a href="https://corporate.charter.com/newsroom/spectrum-business-and-ring-central-expand-partnership">Charter Corporate</a>)</p></li><li><p>CallMiner adds AI classifiers, custom summaries to CX platform (<a href="https://www.cmswire.com/contact-center/callminer-adds-ai-classifiers-custom-summaries-to-cx-platform/">CMSWire</a>)</p></li><li><p>Sakura adds speech synthesis API to AI platform (<a href="https://www.telecompaper.com/news/sakura-adds-speech-synthesis-api-to-ai-platform-launches-research-notebook-beta--1564747">Telecompaper</a>)</p></li><li><p>Agora removes barriers to scalable voice AI agents (<a href="https://www.globenewswire.com/news-release/2026/03/11/3253909/0/en/Agora-Removes-Barriers-to-Scalable-Voice-AI-Agents.html">Globe Newswire</a>)</p></li><li><p>ThinkrrAI advances its voice AI strategy (<a href="https://www.manilatimes.net/2026/03/08/tmt-newswire/globenewswire/thinkrrai-advances-its-voice-ai-strategy-under-cmo-cody-getchell-amid-growing-demand-for-ai-driven-automation/2295418">Manila Times</a>)</p></li><li><p>How voicemail-to-email transcription can create privacy exposure (<a href="https://www.paubox.com/blog/how-voicemail-to-email-transcription-can-create-privacy-exposure">Paubox</a>)</p></li><li><p>Outbound AI voice agents in Vodia v70 (<a href="https://telecomreseller.com/2026/03/11/outbound-ai-voice-agents-in-vodia-v70/">Telecom Reseller</a>)</p></li><li><p>Conversational AI solutions: Benefits, challenges &amp; best practices (<a 
href="https://www.nextiva.com/blog/conversational-ai-solutions.html">Nextiva</a>)</p></li><li><p>AI ring startup takes on OpenAI And Meta In Wearables (<a href="https://www.upstartsmedia.com/p/sandbar-stream-ai-ring-raises-23m">Upstarts Media</a>)</p></li><li><p>Together AI launches voice agent platform with sub-700ms latency (<a href="https://www.mexc.com/news/917035">MEXC</a>)</p></li><li><p>Sinch unveils Voice Relay to power AI-driven calls (<a href="https://telconews.com.au/story/sinch-unveils-voice-relay-to-power-ai-driven-calls">Telco News</a>)</p></li><li><p>Ex-Apple engineer&#8217;s voice-only pendant raises $5M (<a href="https://www.techbuzz.ai/articles/ex-apple-engineer-s-voice-only-pendant-raises-5m">TechBuzz AI</a>)<br></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Hume AI: First open source TTS model, TADA (<a href="https://x.com/hume_ai/status/2031401003078062578?s=20">X</a>)</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;10fd5511-c88d-4bbe-853c-c348f6a7e58b&quot;,&quot;duration&quot;:null}"></div></li><li><p>How developers can bring voice AI into telephony applications (<a href="https://www.infoworld.com/article/4136039/how-developers-can-bring-voice-ai-into-telephony-applications.html">InfoWorld</a>)</p></li><li><p>This AI can hear, translate, and speak back in 100 languages (<a href="https://hackernoon.com/this-ai-can-hear-translate-and-speak-back-in-100-languages?source=rss">Hacker Noon</a>)</p></li><li><p>KrishokBondhu: A retrieval-augmented voice-based 
agricultural advisory call center for Bengali farmers (<a href="https://arxiv.org/abs/2510.18355">arXiv</a>)</p></li><li><p>Causal prosody mediation for TTS: Counterfactual training of duration, pitch, and energy in FastSpeech2 (<a href="https://tldr.takara.ai/p/2603.11683">TLDR Takara</a>)</p></li><li><p>The future of clearer speech is multimodal (<a href="https://hackernoon.com/the-future-of-clearer-speech-is-multimodal">Hacker Noon</a>)</p></li><li><p>Fish Audio S2, a new generation of expressive TTS with controllable emotion (<a href="https://x.com/FishAudio/status/2031411140820152560?s=20">X</a>)</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;3502fd21-629a-41da-8a0a-925075508906&quot;,&quot;duration&quot;:null}"></div></li><li><p>JEPA-v0: Audio encoder for real-time speech translation (<a href="https://www.startpinch.com/research/en/jepa-encoder-translation/">StartPinch</a>)</p></li><li><p>Human brain and AI speech recognition decode speech similarly (<a href="https://techxplore.com/news/2026-03-human-brain-ai-speech-recognition.html">TechXplore</a>)</p></li><li><p>Cybersecurity and forensic audio analysis: Deepfake detection based on MFCC, audio-text disconsistency, and prosodic features (<a href="https://www.scirp.org/journal/paperinformation?paperid=150057">SCIRP</a>)</p></li><li><p>Voice isolation iPhone guide (<a href="https://thinkdesignblog.com/voice-isolation-iphone-guide/">Think Design Blog</a>)</p></li><li><p>Gemini embedding 2: Natively multimodal embedding model (<a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-embedding-2/">Google Blog</a>)</p></li><li><p>Building a TTS engine in pure C (<a href="https://dev.to/gabrielemastrapasqua/building-a-text-to-speech-engine-in-pure-c-59h4">Dev</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Crazy Week 🔥 Updates from Anthropic, OpenAI, Krisp, Assembly and much more! 
]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/crazy-week-updates-from-anthropic</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/crazy-week-updates-from-anthropic</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 09 Mar 2026 14:03:10 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1c900b4a-e7f2-4f24-b0f3-6c4aa1cee74a_1456x816.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p>Assembly launches Universal-3 Pro Streaming (<a href="https://www.assemblyai.com/blog/universal-3-pro-streaming">Assembly</a>)</p></li><li><p>Anthropic launches voice mode for Claude Code (<a href="https://mlq.ai/news/anthropic-launches-voice-mode-for-claude-code/">MLQ</a>)</p></li><li><p>OpenAI develops a &#8216;bidirectional&#8217; audio model to boost voice assistants (<a href="https://www.theinformation.com/newsletters/ai-agenda/openai-develops-bidirectional-audio-model-boost-voice-assistants">The Information</a>)</p></li><li><p>Krisp launches listener-side, real-time Accent Conversion (<a href="https://siliconangle.com/2026/03/03/krisp-launches-listener-side-real-time-accent-conversion/">SiliconANGLE</a>)</p></li><li><p>AI vocal cloning and the limits of voice-based authentication (<a href="https://bisi.org.uk/reports/when-voice-is-no-longer-proof-ai-vocal-cloning-and-the-limits-of-voice-based-authentication">BISI</a>)</p></li><li><p>Huawei launches next-generation voice virtual agents (<a href="https://www.huawei.com/en/news/2026/3/mwc-voice-interaction-aicc">Huawei</a>)</p></li><li><p>Modulate adds nuance to voice analysis (<a href="https://www.nojitter.com/ai-automation/modulate-adds-nuance-to-voice-analysis">NoJitter</a>)</p></li><li><p>Deutsche Telekom partners with ElevenLabs to bring AI assistant to calls (<a 
href="https://www.wired.com/story/deutsche-telekom-elevenlabs-ai-phone-calls-mwc-2026/">Wired</a>)</p></li><li><p>Alibaba Tongyi unveils Fun-CosyVoice3.5 and Fun-AudioGen-VD with FreeStyle voice generation (<a href="https://pandaily.com/alibaba-tongyi-unveils-fun-cosy-voice3-5-and-fun-audio-gen-vd-with-free-style-voice-generation">Pandaily</a>)</p></li><li><p>Voice AI platform VoiceLine raises 10M EUR in series A (<a href="https://slator.com/voiceline-raises-10m/">Slator</a>)</p></li><li><p>LevelAI expands agentic CX platform (<a href="https://customerservicemanager.com/level-ai-expands-agentic-cx-platform-to-deliver-human-quality-virtual-agents/">Customer Service Manager</a>)</p></li><li><p>Talkdesk CX accelerates patient access with agentic AI (<a href="https://www.globenewswire.com/news-release/2026/03/05/3250425/0/en/Talkdesk-Customer-Experience-Automation-accelerates-patient-access-with-agentic-AI-orchestration.html">GlobeNewswire</a>)</p></li><li><p>Syntiant to showcase always-on AI voice solutions (<a href="https://www.globenewswire.com/news-release/2026/03/05/3250619/0/en/Syntiant-to-Showcase-Always-On-AI-Voice-Solutions-at-Embedded-World-2026-with-Seltech.html">GlobeNewswire</a>)</p></li><li><p>ElevenLabs &amp; Google dominate Artificial Analysis&#8217; STT benchmark (<a href="https://the-decoder.com/elevenlabs-and-google-dominate-artificial-analysis-updated-speech-to-text-benchmark/">The Decoder</a>)</p></li><li><p>DiligenceSquared uses AI to make M&amp;A research affordable (<a href="https://techcrunch.com/2026/03/05/diligencesquared-uses-ai-voice-agents-to-make-ma-research-affordable/">TechCrunch</a>)</p></li><li><p>3CLogic chosen to enhance ServiceNow-driven managed services (<a href="https://www.prnewswire.com/news-releases/3clogic-chosen-by-apex-systems-to-enhance-servicenow-driven-managed-services-302701229.html">PR Newswire</a>)</p></li>
<li><p>How large-scale speech models will impact voice AI (<a href="https://www.forbes.com/councils/forbestechcouncil/2026/02/26/how-large-scale-speech-models-will-impact-voice-ai/">Forbes</a>)</p></li><li><p>Why advanced voice agents require owning the voice stack (<a href="https://www.callcentrehelper.com/performance-voice-agents-voice-stack-271986.htm">Call Centre Helper</a>)</p></li><li><p>iFLYTEK Globally Launches AI Glasses and AI Interpret Mic (<a href="https://www.globenewswire.com/news-release/2026/03/05/3250195/0/en/iFLYTEK-Globally-Launches-AI-Glasses-and-AI-Interpret-Mic-Showcasing-Full-Scenario-AI-Translation-Solutions-at-MWC26.html">GlobeNewswire</a>)</p></li><li><p>Meeami Technologies, Alif Semiconductor to demonstrate ultra-efficient edge AI noise suppression (<a href="https://www.blufftontoday.com/press-release/story/57127/meeami-technologies-alif-semiconductor-to-demonstrate-ultra-efficient-edge-ai-noise-suppression-at-embedded-world-2026/">Bluffton Today</a>)</p></li><li><p>Sensory brings always-on AI speech and biometrics to Snapdragon Wear Elite (<a href="https://www.democratandchronicle.com/press-release/story/159304/sensory-brings-always-on-ai-speech-and-biometrics-to-snapdragon-wear-elite/">Democrat and Chronicle</a>)</p></li></ul><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Spectre I, the first smart device to stop unwanted audio recordings (<a 
href="https://x.com/aidaxbaradari/status/2028864606568067491?s=20">X</a>)</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;ad3c2fdf-a0e1-41a5-afd7-21720152a267&quot;,&quot;duration&quot;:null}"></div></li><li><p>Google releases <a href="https://x.com/GoogleResearch/status/2030012702668865784?s=20">WAXAL</a>. This open-access dataset delivers 2,400+ hours of high-quality speech data for 27 Sub-Saharan African languages, serving 100M+ speakers</p></li><li><p>Introducing KokoClone: Kokoro TTS, but it clones voices now (<a href="https://www.reddit.com/r/StableDiffusion/comments/1rjsgtd/kokoro_tts_but_it_clones_voices_now_introducing/">Reddit</a>)</p></li><li><p>VietSuperSpeech: A large-scale Vietnamese conversational speech dataset (<a href="https://arxiv.org/abs/2603.01894">arXiv</a>)</p></li><li><p>ZeSTA: Zero-shot TTS augmentation with domain-conditioned training for data-efficient personalized speech synthesis (<a href="https://tldr.takara.ai/p/2603.04219">Takara TLDR</a>)</p></li><li><p>How to compare latency and accuracy in voice recognition (<a href="https://www.goodcall.com/voice-ai/how-to-compare-latency-and-accuracy-in-voice-recognition">Goodcall</a>)</p></li><li><p>FineVoice review: Voice cloning in 30 seconds (<a href="https://www.unite.ai/finevoice-review/">Unite.AI</a>)</p></li><li><p>Improving automatic speech recognition for kids (<a href="https://drivendata.co/blog/child-asr-word-benchmark">DrivenData</a>)</p></li><li><p>Comparing STT algorithms for transcribing survey voice data (<a href="https://academic.oup.com/poq/article-abstract/89/4/1154/8418151?login=false">Oxford Academic</a>)</p></li><li><p>Top 10 voice AI agent platforms: Features, pros, cons &amp; comparison (<a href="https://www.bestdevops.com/top-10-voice-ai-agent-platforms-features-pros-cons-comparison/">Best DevOps</a>)</p></li><li><p>Best voice AI for fraud detection workflows (<a 
href="https://www.goodcall.com/voice-ai/best-voice-ai-for-fraud-detection-workflows">Goodcall</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Krisp Introduced Listener-Side Accent Conversion]]></title><description><![CDATA[A real-time Voice AI layer that improves understanding across meetings, contact centers, and AI agents]]></description><link>https://voice-ai-newsletter.krisp.ai/p/krisp-introduced-listener-side-accent</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/krisp-introduced-listener-side-accent</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Tue, 03 Mar 2026 16:19:42 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/189778557/574b54a15f1796935f66600ebefe240c.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<h3>Krisp just launched a revolutionary 
new technology: Listener-side Accent Conversion.</h3><p>Krisp now supports bidirectional Accent Conversion, bringing clarity to both sides of live conversations. This is an industry first.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://krisp.ai/ai-accent-conversion/listener/&quot;,&quot;text&quot;:&quot;See what we built&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://krisp.ai/ai-accent-conversion/listener/"><span>See what we built</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.linkedin.com/feed/update/urn:li:activity:7434599180188180480/&quot;,&quot;text&quot;:&quot;Read Arto&#8217;s LinkedIn post&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.linkedin.com/feed/update/urn:li:activity:7434599180188180480/"><span>Read Arto&#8217;s LinkedIn post</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Zoom Launches Virtual Agent 3.0 and more news!]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/zoom-launches-virtual-agent-30-and</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/zoom-launches-virtual-agent-30-and</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 02 Mar 2026 14:01:33 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/97327a53-2406-4a95-be8a-9a328907c8bd_850x425.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p>Zoom Launches Virtual Agent 3.0 (<a href="https://www.cxtoday.com/contact-center/zoom-virtual-agent-3-0-chatbot-resolution/">CX Today</a>)</p></li><li><p>Speechify can now transcribe and summarize your meetings (<a 
href="https://9to5mac.com/2026/02/27/speechify-can-now-transcribe-and-summarize-your-meetings/">9to5Mac</a>)</p></li><li><p>RingCentral, OpenAI partner to elevate voice AI as the new frontier (<a href="https://www.uctoday.com/unified-communications/openai-ringcentral-enterprise-voice-ai/">UC Today</a>)</p></li><li><p>Genesys vs NICE vs Five9: The right CCaaS platform for enterprise AI (<a href="https://www.cxtoday.com/contact-center/genesys-vs-nice-vs-five9">CX Today</a>)</p></li><li><p>Deepgram and IBM introduce new enterprise voice AI features (<a href="https://newsroom.ibm.com/2026-02-24-deepgram-and-ibm-introduce-advanced-voice-capabilities-for-enterprise-ai">IBM Newsroom</a>)</p></li><li><p>Telekom CoMind redefines AI-powered voice and chat bots (<a href="https://www.telekom.com/en/media/media-information/archive/telekom-comind-1102658">Deutsche Telekom</a>)</p></li><li><p>Qwen TTS ships local TTS: Voice cloning in 3 seconds (<a href="https://www.geeky-gadgets.com/qwen-tts-local-text-to-speech/">Geeky Gadgets</a>)</p></li><li><p>Amazon lets users choose conversation styles (<a href="https://commstrader.com/technology/amazon-lets-users-choose-brief-chill-or-sweet-conversation-styles/">CommsTrader</a>)</p></li><li><p>ServiceNow launches AI that resolves tickets 99% faster than humans (<a href="https://www.cxtoday.com/contact-center/servicenow-autonomous-workforce-employeeworks-cx/">CX Today</a>)</p></li><li><p>Bad voice AI makes customers hang up &#8211; and move on (<a href="https://www.nojitter.com/ai-automation/bad-voice-ai-makes-customers-hang-up-and-move-on">No Jitter</a>)</p></li><li><p>55% of consumers use voice AI but most companies still sound like robots (<a href="https://www.forbes.com/sites/kolawolesamueladebayo/2026/02/27/55-of-consumers-use-voice-ai-but-most-companies-still-sound-like-robots/">Forbes</a>)</p></li><li><p>VoiceLine: &#8364;10M raised to scale enterprise-grade voice AI (<a 
href="https://pulse2.com/voiceline-e10-million-raised-to-scale-enterprise-grade-voice-ai-for-frontline-organizations/">Pulse 2.0</a>)</p></li><li><p>NamiTech supports AI translation at Nikkei Digital Forum Asia 2026 (<a href="https://e.vnexpress.net/news/business/namitech-provides-ai-translation-support-at-nikkei-digital-forum-in-asia-2026-5043237.html">VnExpress</a>)</p></li><li><p>How large-scale speech models will impact voice AI (<a href="https://www.forbes.com/councils/forbestechcouncil/2026/02/26/how-large-scale-speech-models-will-impact-voice-ai/">Forbes</a>)</p></li><li><p>RingCentral: Agentic AI is happening now and it&#8217;s adding value (<a href="https://siliconangle.com/2026/02/26/ringcentral-agentic-ai-happening-now-adding-value/">SiliconANGLE</a>)</p></li><li><p>How AI voice &amp; situational intelligence improve patient access (<a href="https://hitconsultant.net/2026/02/25/intelligent-access-situational-intelligence-healthcare-scheduling-ai/">HIT Consultant</a>)</p></li><li><p>Canary Speech, JubileeTV partner on AI voice biomarkers (<a href="https://www.mobihealthnews.com/news/canary-speech-jubileetv-partner-ai-voice-biomarkers-home-care">MobiHealthNews</a>)</p></li><li><p>Agora and FPT launch regional AI partnership targeting Southeast Asia&#8217;s banking and financial institutions (<a href="https://www.globenewswire.com/news-release/2026/02/24/3243775/0/en/Agora-and-FPT-Launch-Regional-AI-Partnership-Targeting-Southeast-Asia-s-Banking-and-Financial-Institutions.html">GlobeNewswire</a>)</p></li><li><p>Deepdub and Love TV create scalable localization model for European FAST channels (<a href="https://www.prnewswire.com/news-releases/deepdub-and-love-tv-channels-establish-a-scalable-model-to-overcome-localization-barriers-and-premier-previously-unreleased-content-on-european-fast-channels-302694430.html">PR Newswire</a>)</p></li></ul><h2><strong>Voice AI Podcast &#127897;&#65039;</strong></h2><div class="digest-post-embed" 
data-attrs="{&quot;nodeId&quot;:&quot;921cd566-eacd-4093-9616-fb3ff8065515&quot;,&quot;caption&quot;:&quot;In the Future of Voice AI series of interviews, I ask three questions to my guests: - What problems do you currently see in Enterprise Voice AI? - How does your company solve these problems? - What solutions do you envision in the next 5 years?&quot;,&quot;cta&quot;:&quot;Watch now&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Promptable Speech Language Models | Dylan Fox (Founder &amp; CEO at AssemblyAI)&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:32916364,&quot;name&quot;:&quot;Davit Baghdasaryan&quot;,&quot;bio&quot;:&quot;CEO &amp; Co-Founder of Krisp, early pioneer in Voice AI.\n20+ years in engineering. 18 US patent applications, ex Twilio&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23088dde-6cb0-44df-b220-5f22830cdd4c_1179x960.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-02-26T15:25:11.173Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/391ba11b-ba8e-4d16-9194-a62eb96c34d6_1920x1080.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/p/promptable-speech-language-models&quot;,&quot;section_name&quot;:&quot;Podcast&quot;,&quot;video_upload_id&quot;:&quot;0ac3478b-0a26-4171-84f1-0162c79c0f8a&quot;,&quot;id&quot;:188955472,&quot;type&quot;:&quot;podcast&quot;,&quot;reaction_count&quot;:23,&quot;comment_count&quot;:0,&quot;publication_id&quot;:2073467,&quot;publication_name&quot;:&quot;Voice AI 
Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!YLgs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F831a2f7e-d0a7-4e3d-87a8-c42c65d0b71c_1000x1000.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Voice API: Build voice agents that speak, think, and act (<a href="https://x.ai/api/voice">xAI</a>)</p></li><li><p>KittenTTS Nano: Small TTS LLM runs on CPUs without a GPU (<a href="https://www.geeky-gadgets.com/kittentts-tts-llm-model/">Geeky Gadgets</a>)</p></li><li><p>AgoraAI: An open-source voice-to-voice framework for multi-persona and multi-human interaction (<a href="https://www.mdpi.com/2076-3417/16/4/2120">MDPI</a>)</p></li><li><p>How to improve speech recognition accuracy: Tips and techniques (<a href="https://dev.to/sciforce/how-to-improve-speech-recognition-accuracy-tips-and-techniques-2ank">DEV</a>)</p></li><li><p>They tested 10 STT AI tools: these 6 saved them hours (<a href="https://startuptalky.com/speech-to-text-ai-tools-list/">StartupTalky</a>)</p></li><li><p>Vaani: Speech and translation without compromising voice data privacy (<a href="https://dev.to/mohit_kumawat_ac7e1c73556/vaani-mastering-speech-and-translation-without-compromising-voice-data-privacy-hhc">DEV</a>)</p></li><li><p>VoiceDash: Turn your voice into high&#8209;quality writing in seconds (<a 
href="https://techbullion.com/turn-your-voice-into-writing-in-seconds/">TechBullion</a>)</p></li><li><p>Building multilingual agents with Retell AI SDKs for accent adaptation (<a href="https://dev.to/callstacktech/building-multilingual-agents-with-retell-ai-sdks-for-accent-adaptation-my-journey-21n8">DEV</a>)</p></li><li><p>Alibaba&#8217;s Qwen: The Chinese AI model challenging Silicon Valley (<a href="https://hackernoon.com/alibabas-qwen-the-chinese-ai-model-challenging-silicon-valley">HackerNoon</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Promptable Speech Language Models | Dylan Fox (Founder & CEO at AssemblyAI)]]></title><description><![CDATA[Watch now | In the Future of Voice AI series of interviews, I ask three questions to my guests: - What problems do you currently see in Enterprise Voice AI? 
- How does your company solve these problems? - What solutions do you envision in the next 5 years?]]></description><link>https://voice-ai-newsletter.krisp.ai/p/promptable-speech-language-models</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/promptable-speech-language-models</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Thu, 26 Feb 2026 15:25:11 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/188955472/5478fc6882972c79af5fe08df098dbf1.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<pre><code><code>In the Future of Voice AI series of interviews, I ask three questions to my guests:

- What problems do you currently see in Enterprise Voice AI?
- How does your company solve these problems?
- What solutions do you envision in the next 5 years?</code></code></pre><p>This episode&#8217;s guest is <a href="https://www.linkedin.com/in/dylanbfox/">Dylan Fox</a>, Founder &amp; CEO at <a href="https://www.assemblyai.com/">AssemblyAI</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iWtT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iWtT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png 424w, https://substackcdn.com/image/fetch/$s_!iWtT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png 848w, https://substackcdn.com/image/fetch/$s_!iWtT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!iWtT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iWtT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png" width="1200" height="1200" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1200,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:415940,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://voice-ai-newsletter.krisp.ai/i/188955472?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iWtT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png 424w, https://substackcdn.com/image/fetch/$s_!iWtT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png 848w, https://substackcdn.com/image/fetch/$s_!iWtT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!iWtT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F274b4e69-e33d-4e2e-acf6-25a8cc225cc8_1200x1200.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Dylan started <a href="https://www.assemblyai.com/">AssemblyAI</a> in 2017 inspired by the potential of new voice-powered products like the Amazon Alexa, as well as his experience working as a research engineer at Cisco on new AI products and features. He saw an opportunity to use new AI technology to make fundamental improvements in the way that computers can understand and extract value from voice data. AssemblyAI started in Y Combinator and has now grown into a Series C company with over $115 million in funding from notable investors like Accel, Insight Partners, and Smith Point Capital. Dylan lives in Brooklyn, NY.</p><p><a href="https://www.assemblyai.com/">AssemblyAI</a> builds speech language models that serve as the foundational voice AI infrastructure for next-generation voice applications. 
Their models deliver industry-leading speech-to-text accuracy with superhuman speech understanding capabilities including speaker detection, summarization, PII redaction, and an LLM gateway &#8212; giving developers everything they need to build sophisticated voice AI products.<br><br>Universal-3 Pro, the first speech language model optimized specifically for voice AI, goes further with advanced prompting capabilities that let developers customize model behavior for their exact use case. With both async and real-time streaming support, AssemblyAI integrates directly into voice agents, AI assistants, medical scribes, real-time call analysis systems, and more. Tens of thousands of developers rely on AssemblyAI's models to power voice AI applications used by millions of end users every day.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.youtube.com/@futureofvoiceai&quot;,&quot;text&quot;:&quot;Listen on YouTube&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.youtube.com/@futureofvoiceai"><span>Listen on YouTube</span></a></p><h3><strong>Recap Video</strong></h3><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;97f96a66-aad9-4daf-8416-0155d91cd291&quot;,&quot;duration&quot;:null}"></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Voice AI Newsletter! 
Subscribe for free to receive weekly updates.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h3><strong>Takeaways</strong></h3><ul><li><p>Real-time is the new growth engine - the last ~18&#8211;20 months crossed a reliability threshold where voice use cases actually work.</p></li><li><p>The real barrier in real-time STT is not model quality, it&#8217;s running low-latency systems at massive scale without breaking.</p></li><li><p>Voice AI is quietly expanding beyond agents into robotics, consumer hardware, ambient listening, and medical scribes, which widens the market fast.</p></li><li><p>Streaming models will always be disadvantaged on &#8220;look-ahead,&#8221; so the core problem is making good calls with incomplete future context.</p></li><li><p>The old quality-vs-speed tradeoff is shrinking because hardware and model optimizations are closing the gap between streaming and batch.</p></li><li><p>The &#8216;98% accuracy&#8217; claims are meaningless because benchmarks reward clean audio, not real phone chaos and edge cases.</p></li><li><p>The industry needs hard voice evals where models look bad on purpose (WER ~50%) because that&#8217;s closer to real conditions.</p></li><li><p>The bottleneck is not model quality, it&#8217;s operating low-latency voice systems at insane scale without falling over.</p></li><li><p>Pricing is used as a growth lever: $0.21 per hour, prorated by the second, with automatic volume discounts.</p></li><li><p>The &#8220;no reservations, no concurrency limits&#8221; promise is really a bet on infra superiority, not just model quality.</p></li><li><p>Dylan&#8217;s open-source take is blunt: managing your own AI infra is a tax that slows shipping and kills 
competitiveness.</p></li><li><p>Specialization beats multimodal generalists for reliability: a model trained 100% on STT tasks is less likely to go off the rails.</p></li><li><p>Massive training data scale, not a sudden architecture breakthrough, is the main reason accuracy jumped in the last 2&#8211;3 years.</p></li><li><p>Infrastructure is becoming the hidden moat: unlimited rate limits and no concurrency negotiations remove a major bottleneck for teams shipping voice products.</p></li><li><p>Real-world performance can move business metrics, like a 15&#8211;20% lift in voice agent booking conversions from better STT.</p></li><li><p>Dylan&#8217;s adoption forecast is aggressive: we are at the start of a 100x curve, which means today&#8217;s usage is the floor, not the peak.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Krisp launches Real-Time Voice Translation SDK for Customer Experience 🔥]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/krisp-launches-real-time-voice-translation</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/krisp-launches-real-time-voice-translation</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 23 Feb 2026 14:03:13 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/e73d2ff0-c692-42f9-84f7-5d3c66509325_2728x1458.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p>Krisp launches voice translation SDK to end language barriers (<a href="https://krisp.ai/blog/real-time-voice-translation-sdk/">Krisp Blog</a>). 
It can be easily embedded into Web, Windows and Mac apps.</p></li></ul><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;2c0fb624-35d7-4ef5-902d-015ba5796df9&quot;,&quot;duration&quot;:null}"></div><ul><li><p>Async introduces an agentic AI framework for audio and video (<a href="https://martechseries.com/video/async-introduces-an-agentic-ai-framework-for-audio-and-video-production/">MarTech Series</a>)</p></li><li><p>NPR host David Greene sues Google over NotebookLM voice (<a href="https://techcrunch.com/2026/02/15/longtime-npr-host-david-greene-sues-google-over-notebooklm-voice/">TechCrunch</a>)</p></li><li><p>Sarvam AI introduces Sarvam Edge, a model that works offline (<a href="https://www.moneycontrol.com/technology/sarvam-ai-introduces-sarvam-edge-its-new-ai-model-that-works-on-smartphones-and-laptops-without-internet-article-13829206.html">Moneycontrol</a>)</p></li><li><p>GnaniAI launches zero-shot voice cloning TTS model (<a href="https://www.businesstoday.in/technology/story/gnaniai-launches-zero-shot-voice-cloning-tts-model-for-12-indic-languages-516896-2026-02-19">Business Today</a>)</p></li><li><p>India is becoming ElevenLabs&#8217; key growth engine (<a href="https://economictimes.indiatimes.com/tech/technology/india-is-becoming-elevenlabs-key-growth-engine-as-enterprises-scale-voice-ai-ceo-mati-staniszewski/articleshow/128435783.cms?from=mdr">Economic Times</a>)</p></li><li><p>TalkSign launches real-time sign language translation model (<a href="https://techcabal.com/2026/02/16/talksign-launches-real-time-sign-language-translation-model/">TechCabal</a>)</p></li><li><p>Ambiq launches SoundKit to accelerate on-device audio AI (<a href="https://www.morningstar.com/news/business-wire/20260217828579/ambiq-launches-soundkit-to-accelerate-always-on-on-device-audio-ai-for-the-edge">Morningstar</a>)</p></li><li><p>Samsung Smart Galaxy Glasses launching in 2026 (<a 
href="https://www.geeky-gadgets.com/samsung-smart-galaxy-glasses/">Geeky Gadgets</a>)</p></li><li><p>Speechmatics and Edvak EHR partner to make voice AI safe for clinical automation (<a href="https://www.globenewswire.com/news-release/2026/02/17/3239382/0/en/Speechmatics-and-Edvak-EHR-Partner-to-Make-Voice-AI-Safe-for-Clinical-Automation-at-Scale.html">GlobeNewswire</a>)</p></li><li><p>Twilio gains momentum as AI and voice demand accelerate (<a href="https://finance.yahoo.com/news/twilio-twlo-gains-momentum-ai-120701049.html">Yahoo Finance</a>)</p></li><li><p>Toyo raises &#8364;3.6 million to develop secure AI agents (<a href="https://www.eu-startups.com/2026/02/british-startup-toyo-raises-e3-6-million-to-develop-secure-ai-agents-for-non-technical-founders/">EU-Startups</a>)</p></li><li><p>Travelers launches agentic AI Claim Assistant (<a href="https://insurance-canada.ca/2026/02/20/travelers-launch-agentic-ai-claim-assistant/">Insurance Canada</a>)</p></li><li><p>How Read AI and Lucidya are redrawing the lines between meeting intelligence and customer support (<a href="https://www.webpronews.com/the-ai-notetaker-wars-heat-up-how-read-ai-and-lucidya-are-redrawing-the-lines-between-meeting-intelligence-and-customer-support/">WebProNews</a>)</p></li><li><p>RingCentral drives new era of enterprise voice AI performance (<a href="https://martechseries.com/predictive-ai/ai-platforms-machine-learning/ringcentral-drives-new-era-of-enterprise-voice-ai-performance-with-openai/">MarTech Series</a>)</p></li><li><p>Speechify&#8217;s AI Voice Research Lab launches Simba 3.0 voice model (<a href="https://www.prweb.com/releases/speechifys-ai-voice-research-lab-launches-simba-3-0-voice-model-to-power-next-generation-of-voice-ai-302692591.html">PRWeb</a>)</p></li><li><p>InterfaceAI unites CCaaS with agentic AI for community banking (<a 
href="https://www.manilatimes.net/2026/02/18/tmt-newswire/globenewswire/interfaceai-disrupts-the-status-quo-only-provider-to-unite-elite-ccaas-with-agentic-ai-for-community-banking/2280195">Manila Times</a>)</p></li><li><p>FlashLabs launches FlashAI 2.0 enterprise voice AI platform (<a href="https://www.prnewswire.com/news-releases/flashlabs-launches-flashai-2-0-enterprise-voice-ai-platform-for-human-level-ai-voice-agents-and-real-time-call-center-automation-302689532.html">PR Newswire</a>)</p></li><li><p>Why voice AI will replace chatbots on Indian websites by 2027 (<a href="https://dev.to/adarsh_kant_1d0455d2af438/why-voice-ai-will-replace-chatbots-on-indian-websites-by-2027-n98">DEV</a>)</p></li><li><p>Emvo unveils VoiceSHIELD to secure STT systems (<a href="https://smestreet.in/technology/emvo-unveils-voiceshield-to-secure-speech-to-text-systems-11132677">SME Street</a>)</p></li><li><p>Global speech AI struggles to understand India (<a href="https://cxotoday.com/media-coverage/global-speech-ai-struggles-to-understand-india-new-national-benchmark-voice-of-india-reveals/">CXO Today</a>)</p><p></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Kani-TTS-2: A 400M param open-source TTS model (<a href="https://www.marktechpost.com/2026/02/15/meet-kani-tts-2-a-400m-param-open-source-text-to-speech-model-that-runs-in-3gb-vram-with-voice-cloning-support/">MarkTechPost</a>)</p></li><li><p>Eve: A microphone recorder with real-time transcription (<a href="https://github.com/nexmoe/eve">GitHub</a>)</p></li><li><p>Best AI audio enhancers tools list (<a 
href="https://startuptalky.com/best-ai-audio-enhancers-tools-list/">StartupTalky</a>)</p></li><li><p>Build a custom AI voice agent using MirrorFly (<a href="https://dev.to/alexsam986/build-a-custom-ai-voice-agent-using-mirrorfly-rag-393">DEV</a>)</p></li><li><p>AI voice tools for better CX (<a href="https://www.nextiva.com/blog/best-ai-voice-tools.html">Nextiva</a>)</p></li><li><p>Detecting mental manipulation in speech via synthetic multi-speaker dialogue (<a href="https://aclanthology.org/2026.iwsds-1.41/">ACL Anthology</a>)</p></li><li><p>A deep neural network model of audiovisual speech recognition reports the McGurk effect (<a href="https://link.springer.com/article/10.3758/s13423-025-02846-8">Springer</a>)</p></li><li><p>Voice AI: Here&#8217;s what you need to know about it (<a href="https://hackernoon.com/voice-ai-heres-what-you-need-to-know-about-it">HackerNoon</a>)</p></li><li><p>Best 10 AI voice cloning tools in 2026 (<a href="https://www.findarticles.com/best-10-ai-voice-cloning-tools-in-2026-complete-guide-for-creators-and-businesses/?amp=1">FindArticles</a>)</p></li><li><p>How to change your voice in real time with iTop Voicy (<a href="https://programminginsider.com/how-to-change-your-voice-in-real-time-with-itop-voicy-a-step-by-step-guide/">Programming Insider</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div 
class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Hot week in Voice AI 🔥 ]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/hot-week-in-voice-ai</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/hot-week-in-voice-ai</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 16 Feb 2026 14:01:10 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/8fdd9d37-49bd-4b3c-8dc7-41d44c3faebb_1348x756.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><h2>Top Updates &#128170;</h2><ul><li><p>Sarvam AI: The Indian startup that beat Google AI and ChatGPT (<a href="https://www.indiatoday.in/education-today/news/story/what-is-sarvam-ai-the-indian-startup-that-beat-google-ai-and-chatgpt-2865144-2026-02-08">India Today</a>)</p></li><li><p>Google Docs rolling out Gemini-powered audio summaries (<a href="https://9to5google.com/2026/02/12/google-docs-audio-summaries/">9to5Google</a>)</p></li><li><p>NotebookLM turns PDFs into podcast-style conversations (<a href="https://www.webpronews.com/googles-notebooklm-turns-your-pdfs-into-podcast-style-conversations-and-its-changing-how-professionals-consume-dense-documents/">WebProNews</a>)</p></li><li><p>Apple prepares to enable third-party AI voice assistants in CarPlay (<a href="https://mlq.ai/news/apple-prepares-to-enable-third-party-ai-voice-assistants-in-carplay/">mlq.ai</a>)</p></li><li><p>The case for agentic AI in CX is 
accelerating (<a href="https://www.ringcentral.com/us/en/blog/agentic-ai-in-cx-new-research/">RingCentral</a>)</p></li><li><p>Krisp appoints CX veteran Harry Folloder to lead Enterprise Voice AI (<a href="https://krisp.ai/blog/krisp-appoints-cx-industry-veteran-harry-folloder-to-lead-enterprise-voice-ai/">Krisp</a>)</p></li><li><p>Startup aiOla strives to fix AI speech recognition woes (<a href="https://siliconangle.com/2026/02/09/aiola-strives-fix-ai-speech-recognition-woes-dynamic-routing/">SiliconANGLE</a>)</p></li><li><p>Telnyx &amp; Telarus join to accelerate AI-driven communications (<a href="https://telecomreseller.com/2026/02/13/telnyx-and-telarus-partner-to-accelerate-ai-driven-communications-across-north-america/">Telecom Reseller</a>)</p></li><li><p>T-Mobile&#8217;s live translation AI agent will be baked into your phone calls (<a href="https://www.cnet.com/tech/mobile/t-mobile-live-translation-ai-agent/">CNET</a>)</p></li><li><p>Corsound AI and IngenID partner on integrated voice security platform (<a href="https://idtechwire.com/corsound-ai-and-ingenid-partner-on-integrated-voice-security-platform/">ID Tech</a>)</p></li><li><p>Apple-backed AI model can generate sound &amp; speech from silent videos (<a href="https://9to5mac.com/2026/02/09/new-apple-backed-ai-model-can-generate-sound-and-speech-from-silent-videos/">9to5Mac</a>)</p></li><li><p>Voximplant brings Cartesia Line voice agents into real calls (<a href="https://www.globenewswire.com/news-release/2026/02/12/3237440/0/en/Voximplant-Brings-Cartesia-Line-Voice-Agents-into-Real-Calls.html">GlobeNewswire</a>)</p></li><li><p>How voice AI is reshaping customer support (<a href="https://www.nojitter.com/contact-centers/from-ivr-to-agentic-workflows-how-voice-ai-is-reshaping-customer-support">No Jitter</a>)</p></li><li><p>Kyutai releases simultaneous speech-to-speech translation model (<a 
href="https://www.marktechpost.com/2026/02/13/kyutai-releases-hibiki-zero-a3b-parameter-simultaneous-speech-to-speech-translation-model-using-grpo-reinforcement-learning-without-any-word-level-aligned-data/">MarkTechPost</a>)</p></li><li><p>Speechmatics and Boost.ai join forces to deliver &#8220;responsible&#8221; voice AI for regulated industries (<a href="https://customerservicemanager.com/speechmatics-and-boost-ai-join-forces-to-deliver-responsible-voice-ai-for-regulated-industries/">Customer Service Manager</a>)</p></li><li><p>Vonage and C3 AI partner on network-enabled, agentic AI field services solution for mobile workforces (<a href="https://telecomreseller.com/2026/02/11/vonage-and-c3-ai-partner-on-network-enabled-agentic-ai-field-services-solution-for-mobile-workforces/">Telecom Reseller</a>)</p></li><li><p>Travel Outlook unveils major upgrades to Annette, the Virtual Hotel Agent, powered by PolyAI (<a href="https://www.hotelnewsresource.com/article139925.html?utm_source=chatgpt.com">Hotel News Resource</a>)</p></li><li><p>How Dyna.Ai and Ejada Systems are betting big on AI-powered call centers in Saudi Arabia (<a href="https://www.webpronews.com/from-hong-kong-to-riyadh-how-dyna-ai-and-ejada-systems-are-betting-big-on-ai-powered-call-centers-in-saudi-arabia/?utm_source=chatgpt.com">WebProNews</a>)</p></li><li><p>Simple AI raises $14M to build voice AI agents that sell (<a href="https://finance.yahoo.com/news/simple-ai-announces-14m-first-170000321.html">Yahoo Finance</a>)</p></li><li><p>VCONIC and Speechmatics announce strategic partnership to transform conversation intelligence in healthcare and financial services (<a href="https://www.manilatimes.net/2026/02/10/tmt-newswire/globenewswire/vconic-and-speechmatics-announce-strategic-partnership-to-transform-conversation-intelligence-in-healthcare-and-financial-services/2275183">The Manila Times</a>)</p></li><li><p>Secai raises 6.2M Series A to scale healthcare automation (<a
href="https://financialpost.com/globe-newswire/voxira-ai-agent-developer-secai-raises-6-2m-series-a-to-scale-healthcare-automation">Financial Post</a>)</p></li><li><p>Newo raises 25M Series A to scale AI voice infrastructure for small businesses (<a href="https://www.startup365.fr/newo-raises-25m-series-a-led-by-ratmir-timashev-to-scale-ai-voice-infrastructure-for-small-businesses-2/">Startup365</a>)</p></li><li><p>The future of voice-controlled workspaces: Use cases, implications (<a href="https://thegadgetflow.com/blog/the-future-of-voice-controlled-workspaces-technology/">Gadget Flow</a>)</p></li><li><p>Do people really want wearable AI voice recorders? (<a href="https://www.soundguys.com/do-people-want-ai-voice-recorders-152734/">SoundGuys</a>)</p><p></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>LuxTTS: Lightweight voice cloning that fits in 1GB VRAM (<a href="https://hackernoon.com/luxtts-lightweight-voice-cloning-that-fits-in-1gb-vram">HackerNoon</a>)</p></li><li><p>RoomKit, Pipecat, TEN Framework, LiveKit Agents: Choosing the right conversational AI framework (<a href="https://dev.to/quintana/roomkit-pipecat-ten-framework-livekit-agents-choosing-the-right-conversational-ai-framework-2h80">DEV</a>)</p></li><li><p>Connecting AI voice agents to SIP &amp; PSTN using NextGenSwitch (<a href="https://dev.to/masum0009/connecting-ai-voice-agents-to-sip-pstn-using-nextgenswitch-35fi">DEV</a>)</p></li><li><p>The rise of local speech recognition (<a
href="https://oatmealapp.com/blog/the-rise-of-local-speech-recognition/">Oatmeal</a>)</p></li><li><p>TranzyNote: Silent copilot for every meeting (<a href="https://tranzynote.darshix.com/">TranzyNote</a>)</p></li><li><p>A complete guide to OpenAI&#8217;s Whisper API (<a href="https://dev.to/frankdotdev/turn-audio-into-intelligence-a-complete-guide-to-openais-whisper-api-520c">DEV</a>)</p></li><li><p>Building content-safe language learning apps (<a href="https://dev.to/amit_tyagi_b6bb9dd185178e/building-content-safe-language-learning-apps-azure-content-safety-real-time-speech-translation-2cbi">DEV</a>)</p></li><li><p>Top audio to text tools in 2026 (<a href="https://techbullion.com/top-audio-to-text-tools-you-should-know-in-2026/">TechBullion</a>)</p></li><li><p>Beyond noise suppression: Dynamic distortion control loss for speech enhancement and robust ASR (<a href="https://ieeexplore.ieee.org/document/11371472">IEEE Xplore</a>)</p></li><li><p>Voxtral Realtime 4B Pure C Implementation (<a href="https://github.com/antirez/voxtral.c?utm_source=chatgpt.com">GitHub</a>)</p></li><li><p>Build voice AI in Python: complete STT developer guide 2026 (<a href="https://dev.to/stalwartcoder/build-voice-ai-in-python-complete-speech-to-text-developer-guide-2026-1oe2?utm_source=chatgpt.com">DEV</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget
show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Voxtral transcribes at the speed of sound]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/voxtral-transcribes-at-the-speed</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/voxtral-transcribes-at-the-speed</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 09 Feb 2026 14:03:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/950dad61-cd22-4436-92b2-3d23f84474c7_1550x776.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><h2>Top Updates &#128170;</h2><ul><li><p>Voxtral transcribes at the speed of sound (<a href="https://mistral.ai/news/voxtral-transcribe-2?utm_source=chatgpt.com">Mistral AI</a>)</p></li><li><p>Voice AI startup ElevenLabs raises $500 million (<a href="https://www.wsj.com/tech/ai/voice-ai-startup-elevenlabs-raises-500-million-568c0c60?utm_source=chatgpt.com">The Wall Street Journal</a>)</p></li><li><p>AssemblyAI launches Universal-3 Pro optimized for voice AI (<a href="https://www.assemblyai.com/blog/introducing-universal-3-pro?utm_source=chatgpt.com">AssemblyAI</a>)</p></li><li><p>Sarvam launches Sarvam Audio, claims to offer better accuracy than GPT-4o, Gemini 3 Flash (<a 
href="https://www.businesstoday.in/technology/news/story/sarvam-launches-sarvam-audio-claims-to-offer-better-accuracy-than-gpt-4o-gemini-3-flash-514361-2026-02-03?utm_source=chatgpt.com">Business Today</a>)</p></li><li><p>DeepL launches real-time voice translation API (<a href="https://biz.chosun.com/en/en-it/2026/02/03/R75ETTV6AJCUXDMJBSZKQEMFEY/?utm_source=chatgpt.com">CHOSUNBIZ</a>)</p></li><li><p>Speechify expands beyond TTS with Voice AI Assistant, voice typing, and AI workspace (<a href="https://martechseries.com/technology/speechify-expands-to-voice-ai-assistant-voice-typing-ai-podcasts-platform-ai-note-taking-ai-meeting-assistant-and-ai-workspace-alongside-text-to-speech-reader/?utm_source=chatgpt.com">MarTech Series</a>)</p></li><li><p>Samsung confirms smart glasses launch in 2026 (<a href="https://techstory.in/samsung-confirms-smart-glasses-launch-in-2026-a-new-era-of-wearables/?utm_source=chatgpt.com">TechStory</a>)</p></li><li><p>Audio AI tools market is booming worldwide (<a href="https://www.openpr.com/news/4377629/audio-ai-tools-market-is-booming-worldwide-major-giants-openai?utm_source=chatgpt.com">openPR</a>)</p></li><li><p>How voice AI went from taking notes to running drive-thrus (<a href="https://www.forbes.com/sites/kolawolesamueladebayo/2026/02/03/how-voice-ai-went-from-taking-notes-to-running-drive-thrus/?utm_source=chatgpt.com">Forbes</a>)</p></li><li><p>Google to evaluate conversational AI in virtual care (<a href="https://www.beckershospitalreview.com/healthcare-information-technology/telehealth/google-to-evaluate-conversational-ai-in-virtual-care/?utm_source=chatgpt.com">Becker&#8217;s Hospital Review</a>)</p></li><li><p>8x8 reports surge in AI-powered CX adoption &amp; unveils new tools (<a href="https://itbrief.co.nz/story/8x8-sees-ai-customer-interactions-surge-across-voice-chat">ITBrief</a>)</p></li><li><p>PDFgear launches free TextaVoice TTS (<a 
href="https://www.morningstar.com/news/pr-newswire/20260202la75751/pdfgear-launches-textavoice-a-truly-free-text-to-speech-challenging-expensive-tts-subscriptions">Morningstar</a>)</p></li><li><p>Whispp shows real-time on-device voice AI for whisper-to-speech (<a href="https://www.telecompaper.com/partner-content/whispp-demonstrates-real-time-on-device-voice-ai-for-whisper-to-speech-at-the-dutch-pavilion-during-mwc-barcelona-march-2-6-2026--1561087?utm_source=chatgpt.com">Telecompaper</a>)</p></li><li><p>AudioCodes expands its voice solution portfolio for Webex Calling (<a href="https://www.prnewswire.com/news-releases/audiocodes-expands-its-voice-solution-portfolio-for-webex-calling-302676130.html?utm_source=chatgpt.com">PR Newswire</a>)</p></li><li><p>How ReSpeaker brings voice AI into real-world scenarios (<a href="https://www.seeedstudio.com/blog/2026/02/03/from-hearing-clearly-to-understanding-sound-how-respeaker-brings-voice-ai-into-real-world-scenarios/">Seeed Studio</a>)</p></li><li><p>YouTube expands AI auto-dubbing to 27 languages (<a href="https://gulfnews.com/technology/media/youtube-expands-ai-auto-dubbing-to-27-languages-with-expressive-speech-1.500433271?utm_source=chatgpt.com">Gulf News</a>)</p></li><li><p>Why the telephony stack is the real bottleneck in voice AI QA (<a href="https://betanews.com/article/why-the-telephony-stack-is-the-real-bottleneck-in-voice-ai-qa/?utm_source=chatgpt.com">BetaNews</a>)</p></li><li><p>Redefining excellence for AI agents in the contact center (<a href="https://www.microsoft.com/en-us/dynamics-365/blog/it-professional/2026/02/04/ai-agent-performance-measurement/?utm_source=chatgpt.com">Microsoft</a>)</p></li><li><p>Pindrop and NICE add deepfake detection to CXone (<a href="https://www.cxtoday.com/contact-center/pindrop-nice-integration-deepfake-detection-cxone/?utm_source=chatgpt.com">CX Today</a>)</p></li><li><p>Linq raises $20M to power AI agents in text messaging &amp; voice (<a 
href="https://www.thefastmode.com/technology-solutions/46911-linq-raises-20m-series-a-to-power-ai-agents-in-text-messaging-voice?utm_source=chatgpt.com">The Fast Mode</a>)</p><p></p><p></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Cartesia Sonic 3 TTS model is available on Amazon SageMaker JumpStart (<a href="https://aws.amazon.com/about-aws/whats-new/2026/02/cartesia-sonic-3-on-sagemaker-jumpstart/?utm_source=chatgpt.com">AWS</a>)</p></li><li><p>WAXAL: A new open dataset for African speech technology (<a href="https://blog.google/intl/en-africa/company-news/outreach-and-initiatives/introducing-waxal-a-new-open-dataset-for-african-speech-technology/?utm_source=chatgpt.com">Google Blog</a>)</p></li><li><p>PaceCoach: Real-time speaking pace monitoring iOS app (<a href="https://pacecoach.co.uk/">PaceCoach</a>)</p></li><li><p>They built a free voice-to-text tool that supports 55+ languages (<a href="https://dev.to/digitalwareshub/i-built-a-free-voice-to-text-tool-that-supports-55-languagespublished-true-25ma?utm_source=chatgpt.com">DEV</a>)</p></li><li><p>Set up voice AI for scheduling appointments with Calendly using Twilio (<a href="https://dev.to/callstacktech/how-to-set-up-voice-ai-for-scheduling-appointments-with-calendly-using-twilio-30pb?utm_source=chatgpt.com">DEV</a>)</p></li><li><p>Discord2sum &#8212; meeting minutes for Discord voice (<a href="https://dev.to/tox3d/from-discord-voice-to-meeting-minutes-local-transcription-telegramslack-delivery-15dj?utm_source=chatgpt.com">DEV</a>)</p></li><li><p>DeVoice: Convert any sound or
video into precise text (<a href="https://devoice.io/">Devoice</a>)</p></li><li><p>A free desktop app for real-time transcription &amp; translation (<a href="https://dev.to/mrd999999/i-built-a-free-desktop-app-for-real-time-transcription-translation-heres-everything-it-can-do-c65?utm_source=chatgpt.com">DEV</a>)</p></li><li><p>Apple&#8217;s breakthrough in AI speech synthesis: How sound clustering could revolutionize voice generation (<a href="https://www.webpronews.com/apples-breakthrough-in-ai-speech-synthesis-how-sound-clustering-could-revolutionize-voice-generation/?utm_source=chatgpt.com">WebProNews</a>)</p></li><li><p>Children&#8217;s Speech Recognition in Slovak (<a href="https://ieeexplore.ieee.org/document/11366671">IEEE Xplore</a>)</p></li><li><p>AI-powered ESP32 TTS for DIY voice devices (<a href="https://www.hackster.io/ElectroScopeArchive/ai-powered-esp32-text-to-speech-for-diy-voice-devices-42bf7b?utm_source=chatgpt.com">Hackster.io</a>)</p></li><li><p>Paza: ASR benchmarks &amp; models for low-resource languages (<a href="https://www.microsoft.com/en-us/research/blog/paza-introducing-automatic-speech-recognition-benchmarks-and-models-for-low-resource-languages/?utm_source=chatgpt.com">Microsoft Research</a>)</p></li><li><p>FineVoice TTS enhances audio content with AI precision (<a href="https://www.findarticles.com/from-script-to-sound-finevoice-text-to-speech-enhances-your-audio-content-with-ai-precision/?amp=1">FindArticles</a>)</p></li><li><p>This simple setting changed how they used ChatGPT (<a href="https://www.techradar.com/ai-platforms-assistants/this-simple-setting-changed-how-i-used-chatgpt-forever-and-its-so-good-youll-want-to-try-it-too?utm_source=chatgpt.com">TechRadar</a>)</p><p></p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" 
data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Exciting updates this week! 🔥]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/exciting-updates-this-week</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/exciting-updates-this-week</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 02 Feb 2026 14:03:02 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/edeaf4f1-6338-4077-a13c-c5d0e6917c2f_1024x681.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><h2>Top Updates &#128170;</h2><ul><li><p>Apple acquires AI startup Q.ai for a reported $2B (<a href="https://siliconangle.com/2026/01/29/apple-acquires-ai-startup-q-ai-reported-2b/?utm_source=chatgpt.com">SiliconANGLE</a>)</p></li><li><p>Movate and Krisp partner on AI voice solutions for CX (<a
href="https://www.business-standard.com/content/press-releases-ani/movate-and-krisp-announce-strategic-partnership-to-transform-global-cx-delivery-with-ai-powered-voice-solutions-126012800857_1.html">Business Standard</a>)</p></li><li><p>Agora collaborates with Microsoft Azure AI to enable a real-time, intelligent, and interactive future across 140+ languages (<a href="https://www.microsoft.com/en/customers/story/25899-agora-azure-openai-in-foundry-models?utm_source=chatgpt.com">Microsoft</a>)</p></li><li><p>Voice AI is booming but without CX observability, it will break (<a href="https://www.cxtoday.com/contact-center/voice-ai-is-booming-but-without-cx-observability-it-will-break-operata-cs-0010/?utm_source=chatgpt.com">CX Today</a>)</p></li><li><p>Retell AI upgrades voice platform; revenue tops $40M ARR (<a href="https://markets.businessinsider.com/news/stocks/upgraded-retell-ai-voice-platform-enables-corporate-call-centers-to-deploy-infinite-ai-sales-and-support-agents-across-voice-chat-email-and-sms-company-revenue-now-exceeds-40m-arr-1035761505?utm_source=chatgpt.com">Markets Insider</a>)</p></li><li><p>Google to pay $68M over voice assistant eavesdropping claims (<a href="https://www.cbsnews.com/news/google-voice-assistant-lawsuit-settlement-68-million/?utm_source=chatgpt.com">CBS News</a>)</p></li><li><p>Moving to BPO-hosted voice AI? 
Risks &amp; path forward (<a href="https://mag.contactcenterpipeline.com/BzEd/p24/p24">Contact Center Pipeline</a>)</p></li><li><p>Five Guys extends partnership with SoundHound AI (<a href="https://www.theglobeandmail.com/investing/markets/markets-news/GlobeNewswire/37273698/five-guys-extends-partnership-with-soundhound-ai/">The Globe and Mail</a>)</p></li><li><p>Boldvoice raises $21M for AI voice coaching (<a href="https://www.prnewswire.com/news-releases/boldvoice-raises-21m-series-a-to-give-a-billion-non-native-english-speakers-their-own-ai-voice-coach-302671777.html?utm_source=chatgpt.com">PR Newswire</a>)</p></li><li><p>These California companies want you to ditch your keyboard (<a href="https://www.latimes.com/business/story/2026-01-29/thanks-to-ai-voice-dictation-more-people-are-speaking-out-their-emails-messages-code?utm_source=chatgpt.com">Los Angeles Times</a>)</p></li><li><p>CommBox unveils Era AI Voice to transform call centres (<a href="https://itbrief.co.nz/story/commbox-unveils-era-ai-voice-to-transform-call-centres">ITBrief</a>)</p></li><li><p>Germany&#8217;s largest grocery retailer turns to LYDIA Voice (<a href="https://www.pressebox.com/pressrelease/ehrhardt-partner-gmbh-co-kg-boppard-buchholz/germanys-largest-grocery-retailer-turns-to-lydia-voice/boxid/1283833">Pressebox</a>)</p></li><li><p>Telus &amp; RingCentral expand business connect with AI features (<a href="https://telecomreseller.com/2026/01/27/telus-and-ringcentral-expand-business-connect-with-ai-powered-features-for-canadian-businesses/">Telecom Reseller</a>)</p></li><li><p>How RingCentral&#8217;s agentic AI unifies the experience (<a href="https://www.ringcentral.com/us/en/blog/bridging-cx-ex-how-agentic-ai-unifies-experience/">RingCentral Blog</a>)</p></li><li><p>Synthesia raises $200M at $4B valuation for AI avatars (<a href="https://siliconangle.com/2026/01/26/synthesia-raises-200m-4b-valuation-build-worker-skills-using-ai-avatars/">SiliconANGLE</a>)</p></li><li><p>CyberloQ and 
IngenID partner to add voice biometrics and deepfake detection to location-based MFA (<a href="https://idtechwire.com/cyberloq-and-ingenid-partner-to-add-voice-biometrics-and-deepfake-detection-to-location-based-mfa/?utm_source=chatgpt.com">IDTechWire</a>)</p></li><li><p>AI-Media to showcase real-time translation and accessibility workflows at ISE 2026 (<a href="https://www.globenewswire.com/news-release/2026/01/29/3228647/0/en/AI-Media-to-Showcase-Real-Time-Translation-and-Accessibility-Workflows-at-ISE-2026-as-Multilingual-AV-Demand-Accelerates.html">GlobeNewswire</a>)</p></li><li><p>Voice AI: Come to the dark side (<a href="https://www.nojitter.com/ai-voice/voice-ai-come-to-the-dark-side?utm_source=chatgpt.com">No Jitter</a>)</p></li></ul><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Qwen3-ASR &amp; Qwen3-ForcedAligner are now open source (<a href="https://qwen.ai/blog?id=qwen3asr&amp;utm_source=chatgpt.com">Qwen</a>)</p></li><li><p>Qwen3 TTS and the case for token-based speech synthesis (<a href="https://hackernoon.com/qwen3-tts-and-the-case-for-token-based-speech-synthesis">HackerNoon</a>)</p></li><li><p>Running TTS fully in the browser with PocketTTS (<a href="https://dev.to/soasme/running-text-to-speech-fully-in-the-browser-with-pockettts-2b0m?utm_source=chatgpt.com">DEV</a>)</p></li><li><p>VIBEVOICE-ASR technical report (<a href="https://arxiv.org/abs/2601.18184">arXiv</a>)</p></li><li><p>SpatialEmb: Extract and encode spatial information for 1-stage multi-channel multi-speaker ASR on arbitrary microphone arrays (<a
href="https://arxiv.org/abs/2601.18037">arXiv</a>)</p></li><li><p>A ground-truth-free framework for validating emotions in generative AI speech synthesis (<a href="https://ieeexplore.ieee.org/document/11359665">IEEE Xplore</a>)</p></li><li><p>A wireless, battery-free artificial throat patch with deep learning for emotional speech recognition (<a href="https://advanced.onlinelibrary.wiley.com/doi/10.1002/advs.202516617?af=R">Wiley Online Library</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Updates from LiveKit, Google, ServiceNow, Nvidia and more this week 🔥]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/updates-from-livekit-google-servicenow</link><guid 
isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/updates-from-livekit-google-servicenow</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 26 Jan 2026 14:01:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!w9AT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The team at Coval has <a href="https://www.coval.dev/2026-voice-ai-report">published</a> a Voice AI 2025 report.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w9AT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w9AT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png 424w, https://substackcdn.com/image/fetch/$s_!w9AT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png 848w, https://substackcdn.com/image/fetch/$s_!w9AT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png 1272w, https://substackcdn.com/image/fetch/$s_!w9AT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!w9AT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png" width="1456" height="696" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:696,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:696356,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://voice-ai-newsletter.krisp.ai/i/185738107?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w9AT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png 424w, https://substackcdn.com/image/fetch/$s_!w9AT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png 848w, https://substackcdn.com/image/fetch/$s_!w9AT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png 1272w, https://substackcdn.com/image/fetch/$s_!w9AT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6df417-bd3d-403d-809f-702af2472b41_2714x1298.png 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>Top Updates &#128170;</h2><ul><li><p>LiveKit&#8217;s Series C: Towards the voice-driven era of computing (<a href="https://blog.livekit.io/livekit-series-c/?utm_source=chatgpt.com&amp;utm_medium=email&amp;utm_campaign=series_c">LiveKit</a>)</p></li><li><p>Google snags team behind AI voice startup Hume AI (<a href="https://techcrunch.com/2026/01/22/google-reportedly-snags-up-team-behind-ai-voice-startup-hume-ai/?utm_source=chatgpt.com">TechCrunch</a>)</p></li><li><p>ServiceNow and OpenAI push AI past chatbots into real CX work (<a
href="https://www.cxtoday.com/contact-center/servicenow-openai-partnership-ai-cx-resolution/?utm_source=chatgpt.com">CX Today</a>)</p></li><li><p>Voice AI just changed: How enterprise AI builders can benefit (<a href="https://venturebeat.com/orchestration/everything-in-voice-ai-just-changed-how-enterprise-ai-builders-can-benefit?utm_source=chatgpt.com">VentureBeat</a>)</p></li><li><p>VibeVoice-ASR: STT handling 60-minute audio in a single pass (<a href="https://www.marktechpost.com/2026/01/22/microsoft-releases-vibevoice-asr-a-unified-speech-to-text-model-designed-to-handle-60-minute-long-form-audio-in-a-single-pass/?utm_source=chatgpt.com">MarkTechPost</a>)</p></li><li><p>Deepfakes leveled up in 2025: Here&#8217;s what&#8217;s coming next (<a href="https://www.buffalo.edu/ubnow/stories/2026/01/lyu-conversation-deep-fakes-2026.html?utm_source=chatgpt.com">UBNow</a>)</p></li><li><p>Krisp appoints Vimal Nair as CGO to lead India business expansion (<a href="https://krisp.ai/blog/vimal-nair-as-chief-growth-officer-to-lead-india-expansion/">Krisp Blog</a>)</p></li><li><p>Adobe&#8217;s AI transforms PDFs into podcasts (<a href="https://www.webpronews.com/adobes-ai-transforms-pdfs-into-podcasts-reshaping-document-workflows/?utm_source=chatgpt.com">WebProNews</a>)</p></li><li><p>FlashLabs researchers release Chroma 1.0: A 4B real-time speech dialogue model with personalized voice cloning (<a href="https://www.marktechpost.com/2026/01/21/flashlabs-researchers-release-chroma-1-0-a-4b-real-time-speech-dialogue-model-with-personalized-voice-cloning/?utm_source=chatgpt.com">MarkTechPost</a>)</p></li><li><p>CareXM introduces AI voice agent to improve patient access (<a href="https://www.businesswire.com/news/home/20260119266996/en/CareXM-Introduces-AI-Voice-Agent-to-Improve-Patient-Access-and-Preserve-Clinician-Capacity">Business Wire</a>)</p></li><li><p>Vodia integrates with ElevenLabs Voice AI platform (<a 
href="https://telecomreseller.com/2026/01/22/vodia-announces-integration-with-elevenlabs-voice-ai-platform/?utm_source=chatgpt.com">Telecom Reseller</a>)</p></li><li><p>Litera brings agentic AI to iOS for Litera One platform (<a href="https://www.lawnext.com/2026/01/litera-brings-agentic-ai-to-mobile-with-new-ios-app-for-litera-one-platform.html?utm_source=chatgpt.com">LawNext</a>)</p></li><li><p>Medallia &amp; Ada partner to turn insights into action (<a href="https://customerservicemanager.com/medallia-and-ada-team-up-to-bridge-the-genai-divide-for-enterprise-cx/">Customer Service Manager</a>)</p></li><li><p>HiDock introduced live transcription &amp; translation on HiNotes (<a href="https://www.newswire.com/news/hidock-introduced-live-transcription-translation-on-hinotes-22714975?utm_source=chatgpt.com">Newswire</a>)</p></li><li><p>Deepfake-as-a-Service revolutionizing biometrics spoofing (<a href="https://www.biometricupdate.com/202601/deepfake-as-a-service-revolutionizing-biometrics-spoofing-and-identity-fraud-report?utm_source=chatgpt.com">Biometric Update</a>)</p></li><li><p>Evernote v11: A new chapter in AI-powered productivity (<a href="https://www.newswire.co.kr/newsRead.php?no=1027288&amp;sourceType=rss">Newswire Korea</a>)</p></li><li><p>The future of AI voice agents: Trends &amp; business applications (<a href="https://www.ringcentral.com/us/en/blog/future-of-ai-voice-agents-key-trends/?utm_source=chatgpt.com">RingCentral</a>)</p></li><li><p>Conversational intelligence is reshaping modern staffing decisions (<a href="https://staffingtalk.com/why-conversational-intelligence-is-quietly-reshaping-modern-staffing-decisions/?utm_source=chatgpt.com">StaffingTalk</a>)</p></li><li><p>Xiaomi smart audio glasses record meetings in a lighter design (<a href="https://www.hardwarezone.com.sg/mobile/wearables/xiaomi-mijia-smart-audio-glasses-singapore-specs-price?utm_source=chatgpt.com">HardwareZone</a>)</p></li><li><p>roverIQ launches Ava voice 
assistant for StayNTouch hotels (<a href="https://www.globenewswire.com/news-release/2026/01/23/3225040/0/en/roverIQ-Introduces-Ava-the-AI-Voice-Assistant-for-StayNTouch-Hotels-That-Answers-Calls-Manages-Reservations-and-Elevates-the-Guest-Experience.html?utm_source=chatgpt.com">GlobeNewswire</a>)</p></li><li><p>Cadence launches sixth-generation Tensilica HiFi iQ DSP for voice AI and immersive audio (<a href="https://www.newelectronics.co.uk/content/news/cadence-launches-sixth-generation-tensilica-hifi-iq-dsp-for-voice-ai-and-immersive-audio">New Electronics</a>)</p><p></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Qwen3-TTS is officially live. 
They&#8217;ve open-sourced the full family (<a href="https://x.com/Alibaba_Qwen/status/2014326211913343303">X</a>)</p></li><li><p>PersonaPlex-7B: An open-source, full-duplex conversational model (<a href="https://x.com/DataChaz/status/2013892316105417082">X</a>)</p></li></ul><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;498f9de0-c54d-4f64-ac3f-ca05a96bd63e&quot;,&quot;duration&quot;:null}"></div><ul><li><p>Supertonic-2: Lightning-fast, on-device, multilingual TTS (<a href="https://x.com/psk90_ai/status/2013952547149824430?referrer=grok-com">X</a>)</p></li><li><p>Velma: Understand the true meaning of every conversation (<a href="https://preview.modulate.ai/?utm_source=chatgpt.com">Modulate</a>)</p></li><li><p>The NVIDIA Nemotron Stack for production agents (<a href="https://hackernoon.com/the-nvidia-nemotron-stack-for-production-agents?utm_source=chatgpt.com">HackerNoon</a>)</p></li><li><p>Offline STT on iOS and macOS with Whisper Notes (<a href="https://www.trendhunter.com/trends/offline-speech-transcription?utm_source=chatgpt.com">TrendHunter</a>)</p></li><li><p>The best TTS tools: Expert tested (<a href="https://www.zdnet.com/article/best-text-to-speech-tools/">ZDNet</a>)</p></li><li><p>Voice task manager: LIA Workday (<a href="https://www.freelancer.com/projects/natural-language-processing/voice-task-manager-lia-workday?utm_source=chatgpt.com">Freelancer</a>)</p></li><li><p>AI voice agents: How to get started (<a href="https://www.socialmediaexaminer.com/ai-voice-agents-how-to-get-started/?utm_source=chatgpt.com">Social Media Examiner</a>)</p></li><li><p>Garo ASR - STT AI model for Garo language (A&#8217;chik) (<a href="https://linguistlist.org/issues/37/257/?utm_source=chatgpt.com">LINGUIST List</a>)</p></li><li><p>Severity-controllable pathological TTS for clinical applications (<a href="https://ieeexplore.ieee.org/document/11342311">IEEE Xplore</a>)</p></li><li><p>Advances 
and challenges in speech recognition and NLP (<a href="https://www.mdpi.com/2076-3417/16/2/1071">MDPI</a>)</p></li><li><p>Introducing the Gladia STT plugin in VideoSDK (<a href="https://dev.to/chaitrali_kakde/introducing-the-gladia-speech-to-text-plugin-in-videosdk-4c27?utm_source=chatgpt.com">DEV</a>)</p></li><li><p>Comparing multi-scale and pipeline models for speaker change detection (<a href="https://www.mdpi.com/2624-599X/8/1/5">MDPI</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Big week in Voice AI 🔥]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/big-week-in-voice-ai</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/big-week-in-voice-ai</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 19 Jan 2026 14:03:35 
GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cJPp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p>Parloa raises $350M at $3B valuation (<a href="https://techcrunch.com/2026/01/15/parloa-triples-its-valuation-in-8-months-to-3b-with-350m-raise/">TechCrunch</a>)</p></li><li><p>Deepgram raises $130M at $1.3B valuation (<a href="https://www.reuters.com/technology/voice-ai-startup-deepgram-raises-130-million-13-billion-valuation-2026-01-13/">Reuters</a>)</p></li><li><p>Listen Labs raises $69M to scale AI customer interviews (<a href="https://venturebeat.com/technology/listen-labs-raises-usd69m-after-viral-billboard-hiring-stunt-to-scale-ai/?utm_source=chatgpt.com">VentureBeat</a>)</p></li><li><p>Flip raises $20M after automating 300M+ voice AI calls (<a href="https://techstartups.com/2026/01/13/flip-raises-20m-series-a-after-automating-300m-customer-service-calls-with-voice-ai/?utm_source=chatgpt.com">TechStartups</a>)</p></li><li><p>VoiceRun raises $5.5M for enterprise voice AI control (<a href="https://siliconangle.com/2026/01/14/voicerun-gets-5-5m-seed-funding-give-enterprises-control-voice-ai-agents/?utm_source=chatgpt.com">SiliconANGLE</a>)</p></li><li><p>Krisp appoints Vimal Nair as Chief Growth Officer to lead India expansion (<a href="https://krisp.ai/blog/vimal-nair-as-chief-growth-officer-to-lead-india-expansion/?utm_source=chatgpt.com">Krisp</a>)</p></li><li><p>Hands-on with Bee, Amazon&#8217;s latest AI wearable (<a href="https://techcrunch.com/2026/01/12/hands-on-with-bee-amazons-latest-ai-wearable/?utm_source=chatgpt.com">TechCrunch</a>)</p></li><li><p>Meta Ray-Ban glasses add conversation focus for noise reduction (<a 
href="https://www.webpronews.com/meta-ray-ban-smart-glasses-add-ai-conversation-focus-for-noise-reduction/?utm_source=chatgpt.com">WebProNews</a>)</p></li><li><p>Dialpad launches real-time AI in Japan (<a href="https://martechseries.com/predictive-ai/ai-platforms-machine-learning/dialpad-launches-real-time-ai-in-japan/?utm_source=chatgpt.com">MarTechSeries</a>)</p></li><li><p>Speechify launches Voice AI Assistant on iOS (<a href="https://9to5mac.com/2026/01/12/speechify-launches-voice-ai-assistant-on-ios/?utm_source=chatgpt.com">9to5Mac</a>)</p></li><li><p>Voximplant brings xAI&#8217;s Grok Voice Agent to production calls (<a href="https://www.manilatimes.net/2026/01/15/tmt-newswire/globenewswire/voximplant-brings-xais-grok-voice-agent-api-to-production-ready-calls/2259729?utm_source=chatgpt.com">The Manila Times</a>)</p></li><li><p>Multimodal intelligence in finance industry: Audio intelligence (<a href="https://medium.com/google-cloud/multimodal-intelligence-in-finance-industry-audio-intelligence-03951504e341?utm_source=chatgpt.com">Medium</a>)</p></li><li><p>Canary&#8217;s AI Voice recognized as best hospitality solution (<a href="https://www.hospitalitynet.org/news/4130444.html?utm_source=chatgpt.com">Hospitality Net</a>)</p></li><li><p>RingCentral named a leader in the IDC MarketScape (<a href="https://telecomreseller.com/2026/01/14/ringcentral-named-a-leader-in-the-idc-marketscape-for-ai-enabled-contact-center-workforce-engagement-management/?utm_source=chatgpt.com">Telecom Reseller</a>)</p></li><li><p>India &#8216;talks&#8217; the AI walk (<a href="https://inc42.com/features/india-talks-the-ai-walk/?utm_source=chatgpt.com">Inc42</a>)</p></li><li><p>Willow Voice enables accurate, natural dictation (<a href="https://www.trendhunter.com/amp/trends/willow-voice?utm_source=chatgpt.com">TrendHunter</a>)</p></li><li><p>OmniSpeech brings deepfake voice detection into Zoom meetings (<a 
href="https://idtechwire.com/omnispeech-brings-deepfake-voice-detection-into-zoom-meetings/?utm_source=chatgpt.com">ID Tech</a>)</p></li><li><p>AI Voiceover Software Market to hit $105.71B by 2035 (<a href="https://market.us/report/ai-powered-voiceover-software-market/?utm_source=chatgpt.com">Market.us</a>)</p><p></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Pocket TTS: A 100M-parameter TTS model with high-quality voice (<a href="https://x.com/kyutai_labs/status/2011047335892303875?s=20">X</a>)</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;303216be-4d30-46bc-a66a-e6fe9cb08480&quot;,&quot;duration&quot;:null}"></div></li><li><p>NovaSR: Tiny audio SR model, just 52kb (<a href="https://x.com/wildmindai/status/2011071748679352726?s=20">X</a>)</p></li><li><p>StepFun Introduces Step-Audio-R1.1 (<a href="https://x.com/StepFun_ai/status/2011845838188822684?s=20">X</a>)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cJPp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!cJPp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cJPp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cJPp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cJPp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cJPp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg" width="1200" height="551" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:551,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54512,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://voice-ai-newsletter.krisp.ai/i/184940036?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cJPp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cJPp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cJPp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cJPp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F270643e5-ca54-400a-9534-4eadf273eba1_1200x551.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div></li><li><p>Why your voice agent needs structure with Pipecat Flows (<a href="https://www.daily.co/blog/beyond-the-context-window-why-your-voice-agent-needs-structure-with-pipecat-flows/?utm_source=chatgpt.com">Daily</a>)</p></li><li><p>TranslateGemma: A new suite of open translation models (<a href="https://blog.google/innovation-and-ai/technology/developers-tools/translategemma/?utm_source=chatgpt.com&amp;utm_medium=social&amp;utm_campaign&amp;utm_content">Google Blog</a>)</p></li><li><p>The best dictation and STT apps for writers (<a href="https://thewritepractice.com/speech-to-text-apps-for-write/?utm_source=chatgpt.com">The Write Practice</a>)</p></li><li><p>Agent CLI: A collection of local-first, AI-powered command-line agents (<a href="https://pypi.org/project/agent-cli/?utm_source=chatgpt.com">PyPI</a>)</p></li><li><p>Next generation medical image interpretation with MedGemma 1.5 and medical speech to text with MedASR (<a href="https://research.google/blog/next-generation-medical-image-interpretation-with-medgemma-15-and-medical-speech-to-text-with-medasr/?utm_source=chatgpt.com">Google Research Blog</a>)</p></li><li><p>Whisper.cpp 1.8.3 unleashes 12x performance boost (<a href="https://portallinuxferramentas.blogspot.com/2026/01/whispercpp-183-unleashes-12x.html?utm_source=chatgpt.com">Portal Linux Ferramentas</a>)</p></li><li><p>STEAMROLLER: A multi-agent system for inclusive automatic speech recognition for people who stutter (<a href="https://arxiv.org/abs/2601.10223?utm_source=chatgpt.com">arXiv</a>)</p></li><li><p>Implementing AI voice agents in retail: Key challenges and solutions (<a 
href="https://dev.to/rootstack/implementing-ai-voice-agents-in-retail-key-challenges-and-solutions-kb3?utm_source=chatgpt.com">DEV</a>)</p><p></p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Voice AI Agents market to grow to $47B by 2034]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/voice-ai-agents-market-to-grow-to</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/voice-ai-agents-market-to-grow-to</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 12 Jan 2026 14:01:40 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ce709f6d-85dd-4a69-af39-a86b4cc7117c_1552x1138.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p>AI voice agents 
market to grow from $3.1B to $47.5B by 2034 (<a href="https://www.unite.ai/voice-ai-is-booming-but-is-it-realistic-enough-to-make-an-impact/?utm_source=chatgpt.com">Unite.AI</a>)</p></li><li><p>Amazon&#8217;s AI assistant comes to the web with Alexa.com (<a href="https://techcrunch.com/2026/01/05/alexa-without-an-echo-amazons-ai-chatbot-comes-to-the-web-and-a-revamped-alexa-app/?utm_source=chatgpt.com">TechCrunch</a>)</p></li><li><p>ElevenLabs launches Scribe v2 (<a href="https://x.com/elevenlabsio/status/2009626517521797288?s=20">X</a>)</p></li><li><p>SoundHound AI unveils agentic voice commerce for vehicles &amp; TVs (<a href="https://www.soundhound.com/newsroom/press-releases/ces-2026-soundhound-ai-unveils-agentic-voice-commerce-for-vehicles-and-tvs-with-ai-agents-that-order-food-make-dinner-reservations-pay-for-parking-and-book-tickets-on-the-go/?utm_source=chatgpt.com">SoundHound</a>)</p></li><li><p>Voice-to-voice translation app developed by NATO &amp; UK military (<a href="https://www.ukauthority.com/articles/offline-voice-to-voice-translation-app-developed-by-nato-and-uk-military?utm_source=chatgpt.com">UKAuthority</a>)</p></li><li><p>Sentiment analysis with text and audio using AWS generative AI services (<a href="https://aws.amazon.com/blogs/machine-learning/sentiment-analysis-with-text-and-audio-using-aws-generative-ai-services-approaches-challenges-and-solutions/?utm_source=chatgpt.com">AWS</a>)</p></li><li><p>Why OpenAI is betting big on the audio AI revolution (<a href="https://aimagazine.com/news/openai-betting-big-on-audio-ai?utm_source=chatgpt.com">AI Magazine</a>)</p></li><li><p>What voice AI means for support leaders right now (<a href="https://www.forbes.com/councils/forbesbusinesscouncil/2026/01/06/what-voice-ai-means-for-support-leaders-right-now/">Forbes</a>)</p></li><li><p>Mobvoi introduces TicNote Cloud and Shadow Agent 2.0 (<a 
href="https://www.prnewswire.com/news-releases/mobvoi-unveils-ticnote-ticnote-pods-and-ticnote-watch--introducing-ticnote-cloud-and-shadow-agent-2-0--302655008.html?utm_source=chatgpt.com">PR Newswire</a>)</p></li><li><p>How TomTom brings AI into the car (<a href="https://www.webwire.com/ViewPressRel.asp?aId=348843&amp;utm_source=chatgpt.com">WebWire</a>)</p></li><li><p>Voicegain acquires TrampolineAI for healthcare contact center AI (<a href="https://www.prweb.com/releases/voicegain-acquires-trampolineai-to-deliver-end-to-end-contact-center-ai-for-healthcare-payers-302653539.html?utm_source=chatgpt.com">PRWeb</a>)</p></li><li><p>Apple&#8217;s Siri gets major AI overhaul with LLMs in spring 2026 (<a href="https://www.webpronews.com/apples-siri-gets-major-ai-overhaul-with-llms-in-spring-2026/?utm_source=chatgpt.com">WebProNews</a>)</p></li><li><p>NYU professor deploys AI voice agents for scalable oral exams to counter cheating (<a href="https://www.webpronews.com/nyu-professor-deploys-ai-voice-agents-for-scalable-oral-exams-against-cheating/?utm_source=chatgpt.com">WebProNews</a>)</p></li><li><p>EarFun unveils powerful live AI translation with new Air Pro 4+ and Clip 2 alongside new flagship Wave Pro X at CES 2026 (<a href="https://www.prnewswire.com/news-releases/earfun-unveils-powerful-live-ai-translation-with-new-air-pro-4-and-clip-2-alongside-new-flagship-wave-pro-x-at-ces-2026-302652427.html?utm_source=chatgpt.com">PR Newswire</a>)</p></li><li><p>VoAgents launches enterprise voice AI platform (<a href="https://aithority.com/machine-learning/voagents-launches-enterprise-voice-ai-platform-to-help-businesses-automate-customer-conversations-and-scale-operations/?utm_source=chatgpt.com">AiThority</a>)</p></li><li><p>AI voice earbuds are redefining calls and dictation (<a href="https://savedelete.com/article/ai-voice-earbuds-are-redefining-calls-and-dictation/?utm_source=chatgpt.com">SaveDelete</a>)</p></li><li><p>These smart glasses stole the show at CES 2026 (<a 
href="https://www.pcmag.com/news/next-level-vision-these-smart-glasses-stole-the-show-at-ces-2026">PCMag</a>)</p></li><li><p>Nagoya firm creates app to transcribe deaf users&#8217; speech (<a href="https://www.asahi.com/ajw/articles/16206171">The Asahi Shimbun</a>)</p></li><li><p>Goertek showcases full-stack innovations in acoustics and sensing (<a href="https://www.prnewswire.com/news-releases/goertek-showcases-full-stack-innovations-in-acoustics-and-sensing-at-ces-2026-302656965.html?utm_source=chatgpt.com">PR Newswire</a>)</p></li><li><p>Beauty retailer upgrades HR service with 3CLogic voice AI (<a href="https://www.prnewswire.com/news-releases/major-beauty-retailer-modernizes-hr-service-delivery-with-3clogic-voice-ai-integration-302652873.html?utm_source=chatgpt.com">PR Newswire</a>)</p></li><li><p>Presto raises $10M in funding (<a href="https://www.restaurantdive.com/news/presto-raises-10-million-funding/809128/?utm_source=chatgpt.com">Restaurant Dive</a>)</p></li><li><p>Speechify rolls out new Snoop Dogg AI voice (<a href="https://www.edtechinnovationhub.com/news/speechify-rolls-out-new-snoop-dogg-ai-voice-powered-by-simba?utm_source=chatgpt.com">EdTech Innovation Hub</a>)</p></li><li><p>Subtle&#8217;s AI earbuds challenge AirPods with voice dictation (<a href="https://www.techbuzz.ai/articles/subtle-s-ai-earbuds-challenge-airpods-with-voice-dictation?utm_source=chatgpt.com">TechBuzz</a>)</p></li><li><p>FaceOff unveils its 10th AI &#8211; synthetic audio detection (<a href="https://www.varindia.com/news/FaceOff-Unveils-It-10th-AI-%E2%80%93-Synthetic-Audio-Detection">VARINDIA</a>)</p></li><li><p>OpenAI pushes next-gen audio AI for voice-first devices (<a href="https://www.cxodigitalpulse.com/openai-advances-toward-next-generation-audio-ai-as-voice-first-devices-take-shape/?utm_source=chatgpt.com">CXO Digital Pulse</a>)</p></li><li><p>Sound Group Inc. 
launches SoundSphereAI, an innovative voice AI technology showcase platform (<a href="https://www.quiverquant.com/news/Sound+Group+Inc.+Launches+SoundSphereAI%2C+an+Innovative+Voice+AI+Technology+Showcase+Platform">QuiverQuant</a>)</p><p></p></li></ul><h2>Our Latest Article</h2><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6edb089c-15db-4569-9da7-ca187fca7937&quot;,&quot;caption&quot;:&quot;2025 produced a lot of AI activity. It also exposed where CX breaks.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Voice AI Takeaways Worth Carrying Into 2026&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:32916364,&quot;name&quot;:&quot;Davit Baghdasaryan&quot;,&quot;bio&quot;:&quot;CEO &amp; Co-Founder of Krisp, early pioneer in Voice AI.\n20+ years in engineering. 18 US patent applications, ex Twilion&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23088dde-6cb0-44df-b220-5f22830cdd4c_1179x960.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-01-08T15:35:54.819Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!x0Vp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/p/voice-ai-takeaways-worth-carrying&quot;,&quot;section_name&quot;:&quot;Articles&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:183567576,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:19,&quot;comment_count&quot;:0,&quot;publication_id&quot;:2073467,&quot;publication_name&quot;:&quot;Voice AI 
Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!YLgs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F831a2f7e-d0a7-4e3d-87a8-c42c65d0b71c_1000x1000.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Building voice agents with NVIDIA open models (<a href="https://www.daily.co/blog/building-voice-agents-with-nvidia-open-models/">Daily.co</a>) </p></li><li><p>Pipecat cloud is now generally available (<a href="https://www.daily.co/blog/pipecat-cloud-is-now-generally-available/">Daily.co</a>) </p></li><li><p>Asterisk PBX transport for Pipecat (<a href="https://www.linkedin.com/posts/nikolay-n-shakin_hey-voip-ai-lovers-ive-built-an-asterisk-activity-7414800249078071297-b4lt">LinkedIn</a>)</p></li><li><p>NVIDIA Broadcast app: AI-powered voice and video (<a href="https://www.nvidia.com/en-us/geforce/broadcasting/broadcast-app/?utm_source=chatgpt.com">NVIDIA</a>)</p></li><li><p>LTX-2 creates synchronized video &amp; audio from text prompts (<a href="https://x.com/ResearchBitesAI/status/2008946838217326874">X</a>)</p></li><li><p>Liquid AI: 5 open-weight model instances from a single architecture (<a href="https://x.com/liquidai/status/2008385294848942549">X</a>)</p></li><li><p>AirSpeech: Lightweight speech synthesis framework for home intelligent space service robots (<a 
href="https://www.mdpi.com/2079-9292/15/1/239?utm_source=chatgpt.com">MDPI</a>)</p></li><li><p>RFGETT-TTS: Robust fine-grained expressivity transfer with transformer for TTS synthesis (<a href="https://ieeexplore.ieee.org/document/11315896">IEEE Xplore</a>)</p></li><li><p>A review on Bangla TTS With human-like expressions (<a href="https://ieeexplore.ieee.org/document/11318078">IEEE Xplore</a>)</p></li><li><p>Cuneflow E-Ink notebook demo: multimodal pen + audio (<a href="https://www.youtube.com/watch?v=RfqBNfbJkC0">YouTube</a>)</p></li><li><p>VOCCI AI note-taking ring (<a href="https://www.youtube.com/watch?v=fv5YwgGCLeY">YouTube</a>)</p></li><li><p>Comparing STT algorithms for survey transcription (<a href="https://academic.oup.com/poq/advance-article-abstract/doi/10.1093/poq/nfaf056/8418151?redirectedFrom=fulltext&amp;utm_source=chatgpt.com">Oxford Academic</a>)</p></li><li><p>Voice cloning defenses are easier to undo than expected (<a href="https://www.helpnetsecurity.com/2026/01/08/voice-authentication-audio-cleanup-risk/?utm_source=chatgpt.com">Help Net Security</a>)</p></li><li><p>Mimo-audio enables few-shot learning for audio tasks (<a href="https://quantumzeitgeist.com/100-learning-mimo-audio-enables-few-shot/?utm_source=chatgpt.com">Quantum Zeitgeist</a>)</p></li><li><p>How to build a voice agent with RAG and safety guardrails (<a href="https://developer.nvidia.com/blog/how-to-build-a-voice-agent-with-rag-and-safety-guardrails/?utm_source=chatgpt.com">NVIDIA</a>)</p><p></p><p></p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" 
data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Voice AI Takeaways Worth Carrying Into 2026]]></title><description><![CDATA[2025 produced a lot of AI activity. It also exposed where CX breaks.]]></description><link>https://voice-ai-newsletter.krisp.ai/p/voice-ai-takeaways-worth-carrying</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/voice-ai-takeaways-worth-carrying</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Thu, 08 Jan 2026 15:35:54 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!x0Vp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x0Vp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!x0Vp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!x0Vp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!x0Vp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!x0Vp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x0Vp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2112047,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://voice-ai-newsletter.krisp.ai/i/183567576?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x0Vp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!x0Vp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!x0Vp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!x0Vp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb75b5748-2c8b-4a02-974a-c078ac0780a8_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>2025 produced a lot of AI activity. It also exposed where CX breaks.</p><p>As Voice AI moved from pilots to real deployments, the gaps became hard to ignore. Some approaches scaled, others added friction, and many revealed that the hardest problems in CX still live inside live conversations.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Voice AI Newsletter! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>This roundup surfaces the shifts, data, and ideas that matter heading into 2026.</p><div><hr></div><h3>1. The State of Voice AI </h3><p>Voice didn&#8217;t just improve in 2025. It showed the industry where CX actually breaks.
This report highlights where legacy systems, weak adoption, and language gaps continue to create cost and inconsistency across contact centers.</p><p><strong>Read this to understand how voice AI is being used today and where teams still struggle to scale it effectively.</strong></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;7621c5e7-c591-4569-ad89-7d387ca0e51f&quot;,&quot;caption&quot;:&quot;Voice AI in CX: What 819 Leaders Reveal About the Future of Voice&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;2025 State of Voice in CX&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:32916364,&quot;name&quot;:&quot;Davit Baghdasaryan&quot;,&quot;bio&quot;:&quot;CEO &amp; Co-Founder of Krisp, early pioneer in Voice AI.\n20+ years in engineering. 18 US patent applications, ex Twilion&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23088dde-6cb0-44df-b220-5f22830cdd4c_1179x960.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-08-28T14:30:51.846Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da64818c-0d57-45fd-8784-d1d86b8be1ca_2400x1257.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/p/2025-state-of-voice-in-cx&quot;,&quot;section_name&quot;:&quot;Articles&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:171891807,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:26,&quot;comment_count&quot;:1,&quot;publication_id&quot;:2073467,&quot;publication_name&quot;:&quot;Voice AI 
Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!YLgs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F831a2f7e-d0a7-4e3d-87a8-c42c65d0b71c_1000x1000.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><h3>2. 2026 Voice AI Productivity Predictions</h3><p>2026 won&#8217;t be about more AI. It will be about where AI holds up in live conversations and at scale. These predictions focus on where Voice AI reduces friction in real time, improves understanding, and helps agents resolve issues faster without breaking trust.</p><p><strong>Read this to understand what actually drives Voice AI productivity and why clarity, not automation, is the lever that scales.</strong></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6fd29fe3-000d-43a5-974b-834b3d2c8f1c&quot;,&quot;caption&quot;:&quot;Voice AI Productivity is entering its execution phase.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;5 Predictions for Voice AI Productivity in 2026&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:32916364,&quot;name&quot;:&quot;Davit Baghdasaryan&quot;,&quot;bio&quot;:&quot;CEO &amp; Co-Founder of Krisp, early pioneer in Voice AI.\n20+ years in engineering. 
18 US patent applications, ex Twilion&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23088dde-6cb0-44df-b220-5f22830cdd4c_1179x960.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-11-21T14:03:19.843Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!XaEY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36421de-f26a-4768-8d5a-225faeea9f81_1920x1080.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/p/5-predictions-for-voice-ai-productivity&quot;,&quot;section_name&quot;:&quot;Articles&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:179154469,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:13,&quot;comment_count&quot;:4,&quot;publication_id&quot;:2073467,&quot;publication_name&quot;:&quot;Voice AI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!YLgs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F831a2f7e-d0a7-4e3d-87a8-c42c65d0b71c_1000x1000.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><h3>3. From Fragmentation to Focus: A Playbook for Eliminating Agent Burnout</h3><p>Burnout is a downstream symptom. 
The real problem is fragmented CX systems, misaligned metrics, and automation pushed too far.</p><p>This guide breaks down how today&#8217;s contact center stacks create cognitive overload and where purposeful, real-time Voice AI can support agents without removing human judgment from the conversation.</p><p><strong>Read this to understand why burnout shows up operationally and what changes actually reduce agent load.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/hubfs/WhitePaper/202601%20Tech_vs._Humanity%20Practicality%20Guide.pdf?utm_source=substack&amp;utm_medium=newsletter&amp;utm_campaign=2026+kickoff&quot;,&quot;text&quot;:&quot;Get the Playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/hubfs/WhitePaper/202601%20Tech_vs._Humanity%20Practicality%20Guide.pdf?utm_source=substack&amp;utm_medium=newsletter&amp;utm_campaign=2026+kickoff"><span>Get the Playbook</span></a></p><div><hr></div><h2>In the News</h2><h3>4. Why 95% of AI Pilots Fail and What the 5% Do Differently</h3><p>This isn&#8217;t about models falling short. It&#8217;s about <strong>how AI is deployed</strong>. The companies that succeed aren&#8217;t leading with autonomous, customer-facing agents. They&#8217;re starting with co-pilots that augment humans, where trust is built into the workflow. 
This piece lays out what separates experimentation from production.</p><p><strong>Read this to understand why copilots scale, autonomous agents stall, and why deployment choices matter more than ambition.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.forbes.com/sites/tarungalagali/2025/10/28/why-95-of-ai-pilots-fail-and-what-the-5-do-differently/&quot;,&quot;text&quot;:&quot;Read the full story&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.forbes.com/sites/tarungalagali/2025/10/28/why-95-of-ai-pilots-fail-and-what-the-5-do-differently/"><span>Read the full story</span></a></p><div><hr></div><h3>5. The State of CX: What 2025 Taught Us</h3><p>Customers now expect more than efficiency. In 2025, AI moved from experiment to expectation, but the gap between brand promise and delivery widened. Teams advanced automation, personalization, and data use, but cracks showed up in empathy, voice of the customer, and emotional connection.</p><p><strong>Read this to understand where CX investments outpaced real customer experience and why connection still matters as much as capability.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.forbes.com/sites/tarungalagali/2025/10/28/why-95-of-ai-pilots-fail-and-what-the-5-do-differently/&quot;,&quot;text&quot;:&quot;Read the full story&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.forbes.com/sites/tarungalagali/2025/10/28/why-95-of-ai-pilots-fail-and-what-the-5-do-differently/"><span>Read the full story</span></a></p><div><hr></div><h3>6. 
48% of CX Leaders Plan to Access AI via BPO Partners</h3><p>As Voice AI adoption accelerates, BPOs are playing a larger role in making AI usable at scale and are often better positioned to operationalize it inside live contact center environments. Rather than building everything in-house, teams are turning to BPOs to reduce risk, speed deployment, and embed AI into real workflows.</p><p><strong>Read this to understand why Voice AI adoption is shifting toward partners who can operationalize it, not just vendors who sell it.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://360magazine.com/2025/08/27/48-of-cx-leaders-plan-to-access-ai-via-bpo-partners/&quot;,&quot;text&quot;:&quot;Read the full story&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://360magazine.com/2025/08/27/48-of-cx-leaders-plan-to-access-ai-via-bpo-partners/"><span>Read the full story</span></a></p><div><hr></div><h2>Worth Watching</h2><h3>7. Accent Conversion&#8217;s 85+ NPS Impact </h3><p>A concrete example of what happens when you remove friction from voice conversations at scale. The outcome wasn&#8217;t marginal. 
It was structural.</p><p><strong>Watch this to see how improving comprehension in real time can drive measurable CX outcomes, not marginal gains.</strong></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;00df8a0e-8fff-42d7-ae92-fdc6022e23fe&quot;,&quot;caption&quot;:&quot;In this special edition of the Future of Voice AI series of interviews, we're joined by industry vets to unpack: - How clarity became a measurable KPI for CX quality and trust - How TTEC identified and solved global voice challenges across regions - Real results: customer satisfaction, agent confidence, cost efficiency improvements and more&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Accent AI&#8217;s 85+ NPS Impact in India | James Bednar and Biju Pillai (TTEC)&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:32916364,&quot;name&quot;:&quot;Davit Baghdasaryan&quot;,&quot;bio&quot;:&quot;CEO &amp; Co-Founder of Krisp, early pioneer in Voice AI.\n20+ years in engineering. 
18 US patent applications, ex Twilion&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23088dde-6cb0-44df-b220-5f22830cdd4c_1179x960.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-11-06T15:35:08.685Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f812d60a-00fa-4d60-8675-495eac61b55b_1721x965.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/p/accent-ais-80-nps-impact-in-india&quot;,&quot;section_name&quot;:&quot;Podcast&quot;,&quot;video_upload_id&quot;:&quot;3220c462-aa76-4e78-b1c4-f0162485d44d&quot;,&quot;id&quot;:178026252,&quot;type&quot;:&quot;podcast&quot;,&quot;reaction_count&quot;:12,&quot;comment_count&quot;:0,&quot;publication_id&quot;:2073467,&quot;publication_name&quot;:&quot;Voice AI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!YLgs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F831a2f7e-d0a7-4e3d-87a8-c42c65d0b71c_1000x1000.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><h3>8. 
Inside the Data: The State of Voice in CX Unpacked</h3><p>The session also digs into where deployments fail, why overpromising slows adoption, and how measurable outcomes are replacing futuristic demos as the bar for investment.</p><p><strong>Watch this to understand where voice AI is delivering real value today and why pragmatism, not hype, is driving the next phase of CX adoption.</strong></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;b87e299f-50d1-413a-a213-1389c9010a5c&quot;,&quot;caption&quot;:&quot;In the Future of Voice AI series of interviews, I ask three questions to my guests: - What problems do you currently see in Enterprise Voice AI? - How does your company solve these problems? - What solutions do you envision in the next 5 years?&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Inside the Data: The State of Voice in CX Unpacked | Peter Ryan ( Ryan Strategic Advisory)&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:32916364,&quot;name&quot;:&quot;Davit Baghdasaryan&quot;,&quot;bio&quot;:&quot;CEO &amp; Co-Founder of Krisp, early pioneer in Voice AI.\n20+ years in engineering. 
18 US patent applications, ex Twilion&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23088dde-6cb0-44df-b220-5f22830cdd4c_1179x960.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-09-04T14:25:40.165Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f0aeb0c-3293-42f8-a50e-9f77a9b78bd8_1165x776.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/p/inside-the-data-the-state-of-voice&quot;,&quot;section_name&quot;:&quot;Podcast&quot;,&quot;video_upload_id&quot;:&quot;435183ab-cf9f-4912-87f0-647c9f35a6ad&quot;,&quot;id&quot;:171576167,&quot;type&quot;:&quot;podcast&quot;,&quot;reaction_count&quot;:27,&quot;comment_count&quot;:0,&quot;publication_id&quot;:2073467,&quot;publication_name&quot;:&quot;Voice AI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!YLgs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F831a2f7e-d0a7-4e3d-87a8-c42c65d0b71c_1000x1000.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><h3>9. The Rise of Voice Productivity</h3><p>A conversation on why voice remains the highest-stakes channel in CX and how teams are redefining productivity beyond speed alone. Voice productivity is about reducing friction in live conversations. 
When clarity improves, teams see fewer repeats, faster resolution, and better outcomes for both customers and agents.</p><p><strong>Watch this to understand why clarity, not automation, is the real driver of voice productivity.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://slator.com/the-rise-of-voice-productivity-with-krisp-ceo-davit-baghdasaryan/&quot;,&quot;text&quot;:&quot;Watch now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://slator.com/the-rise-of-voice-productivity-with-krisp-ceo-davit-baghdasaryan/"><span>Watch now</span></a></p><div><hr></div><h2>10. What We&#8217;re Carrying Into 2026</h2><ul><li><p><strong>Voice quality drives outcomes.</strong> When conversations break down, everything slows: resolution, satisfaction, and agent capacity. Teams that invest in clear, reliable conversations resolve issues faster, protect CSAT, and give agents more capacity to do real work.</p></li><li><p><strong>Language and accent friction is expensive.</strong> The cost shows up in repetition, longer calls, and churn long before it shows up in reports. When understanding improves, calls shorten, repeats drop, and loyalty increases across global customer bases.</p></li><li><p><strong>AI creates value in the moment.</strong> The biggest gains come from supporting agents during live interactions, not from adding more layers of automation. 
When AI reduces friction in real time, agents stay focused, errors drop, and customers get to resolution faster.</p></li></ul><p>If 2026 is about anything, it&#8217;s this: CX improves when conversations get easier for the people on both sides of the call.</p><p><strong>Voice AI in 2026 is about real-time clarity, not more tools.</strong></p>]]></content:encoded></item><item><title><![CDATA[OpenAI bets big on audio, and more this week!]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/openai-bets-big-on-audio-and-more</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/openai-bets-big-on-audio-and-more</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 05 Jan 2026 14:02:41 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2eff28c3-953b-4a17-9661-fcbd0a132d94_1296x864.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p>OpenAI bets big on audio as Silicon Valley declares war on screens (<a
href="https://techcrunch.com/2026/01/01/openai-bets-big-on-audio-as-silicon-valley-declares-war-on-screens/?utm_source=chatgpt.com">TechCrunch</a>)</p></li><li><p>Dialpad launches an Agentic AI Platform for customer service (<a href="https://www.cxtoday.com/contact-center/dialpad-launches-an-agentic-ai-platform-calling-it-a-world-first-for-customer-service/">CX Today</a>)</p></li><li><p>Wispr CEO: What a post-keyboard office might look like (<a href="https://www.computerworld.com/article/4107331/wispr-ceo-interview-post-keyboard-office.html?utm_source=chatgpt.com">Computerworld</a>)</p></li><li><p>Plaud Note Pro: an AI-powered recorder worth carrying everywhere (<a href="https://techcrunch.com/2025/12/29/plaud-note-pro-is-an-excellent-ai-powered-recorder-that-i-carry-everywhere/?utm_source=chatgpt.com">TechCrunch</a>)</p></li><li><p>Updates to Qwen Model Series and Fun-Audio-Chat-8B (<a href="https://www.alizila.com/major-updates-to-qwen-model-series-new-speech-to-speech-model-fun-audio-chat-8b-image-generation-model-tops-leaderboard/">Alizila</a>)</p></li><li><p>Krisp launches Live Captions for Call Center agents (<a href="https://krisp.ai/blog/live-captions/">Krisp Blog</a>)</p></li><li><p>Voice AI surges into 2026: Breakthroughs reshape business sectors (<a href="https://www.webpronews.com/voice-ai-surges-into-2026-breakthroughs-transform-business-sectors/?utm_source=chatgpt.com">WebProNews</a>)</p></li><li><p>Viaim to showcase the future of AI audio &amp; note-taking tech (<a href="https://ameyawdebrah.com/viaim-to-showcase-the-future-of-ai-audio-note-taking-tech-at-ces-2026/?utm_source=chatgpt.com">AmeyawDebrah</a>)</p></li><li><p>Pronounce: AI-powered speech coaching that fits your schedule (<a href="https://martech.zone/pronounce-ai-powered-speech-coaching/?utm_source=chatgpt.com">Martech Zone</a>)</p></li><li><p>How Voice AI technology is transforming business communication (<a
href="https://metapress.com/how-ai-powered-voice-technology-is-transforming-business-communication/">MetaPress</a>)</p></li><li><p>Mobvoi TicNote, officially launched in the Philippines (<a href="http://techpinas.com/2025/12/Mobvoi-TicNote-Philippines.html">TechPinas</a>)</p></li><li><p>Kinyo S06 AI smart translation TWS wireless headphones review (<a href="https://www.igeekphone.com/kinyo-s06-ai-smart-translation-tws-wireless-headphones-review-revolutionizing-communication-and-audio-quality/?utm_source=chatgpt.com">iGeekphone</a>)</p></li><li><p>Carnegie Mellon&#8217;s AI-driven speech reconstruction tool aims to bridge gap for children&#8217;s speech disorders (<a href="https://community.triblive.com/c/bridgeville-signal-item/news/3947201?utm_source=chatgpt.com">TribLIVE Community</a>)</p></li><li><p>Voice AI startup Humelo builds emotion-aware speech engine (<a href="https://www.mk.co.kr/en/business/11922669?utm_source=chatgpt.com">MK</a>)</p></li><li><p>Speech Typing aims to democratize digital accessibility (<a href="https://www.trendhunter.com/trends/speech-typing?utm_source=chatgpt.com">TrendHunter</a>)</p></li><li><p>Making voice AI work: Achieve zero hallucinations in contact centers (<a href="https://www.uctoday.com/unified-communications/making-voice-ai-work-how-to-achieve-zero-hallucinations-in-contact-centers/">UC Today</a>)</p></li><li><p>Voice AI is the new interface, but is India&#8217;s infrastructure ready? 
(<a href="https://cxotoday.com/special-reports/voice-ai-is-the-new-interface-but-is-indias-infrastructure-ready/?utm_source=chatgpt.com">CXO Today</a>)</p></li><li><p>Transforming written content into natural speech (<a href="https://www.criticalhit.net/technology/text-to-voice-technology-transforming-written-content-into-natural-speech/?utm_source=chatgpt.com">Critical Hit</a>)</p><p></p><p></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Voice agents with Amazon Nova Sonic (<a href="https://dev.to/aws-builders/voice-agents-with-amazon-nova-sonic-1p5k?utm_source=chatgpt.com">DEV</a>)</p></li><li><p>Building production-ready voice agents (<a href="https://shekhargulati.com/2026/01/03/building-production-ready-voice-agents/?utm_source=chatgpt.com">Shekhar Gulati</a>)</p></li><li><p>How to calculate ROI for voice AI agents in eCommerce (<a href="https://dev.to/callstacktech/how-to-calculate-roi-for-voice-ai-agents-in-ecommerce-a-practical-guide-59c9?utm_source=chatgpt.com">DEV</a>)</p></li><li><p>The best AI-powered dictation apps of 2025 (<a href="https://techcrunch.com/2025/12/30/the-best-ai-powered-dictation-apps-of-2025/?utm_source=chatgpt.com">TechCrunch</a>)</p></li><li><p>FunASR: Latest updates (<a href="https://pypi.org/project/funasr/?utm_source=chatgpt.com">PyPI</a>)</p><p></p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}"
data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[80M weights open-source TTS model, and much more this week!]]></title><description><![CDATA[Voice AI weekly digest]]></description><link>https://voice-ai-newsletter.krisp.ai/p/80m-weights-open-source-tts-model</link><guid isPermaLink="false">https://voice-ai-newsletter.krisp.ai/p/80m-weights-open-source-tts-model</guid><dc:creator><![CDATA[Davit Baghdasaryan]]></dc:creator><pubDate>Mon, 29 Dec 2025 14:01:08 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f3009739-bff2-4113-819f-40b2d6a55108_600x400.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Top Updates &#128170;</h2><ul><li><p>Soprano: 80M weights open-source TTS model (<a href="https://x.com/wildmindai/status/2004503960280027555?referrer=grok-com">X</a>)</p></li><li><p>Top ten language AI use cases in 2025 (<a href="https://slator.com/top-ten-language-ai-use-cases-2025/?utm_source=chatgpt.com">Slator</a>)</p></li><li><p>Resemble AI drops Chatterbox Turbo, an open-source TTS model (<a 
href="https://the-decoder.com/resemble-ai-drops-chatterbox-turbo-an-open-source-text-to-speech-model-that-clones-voices-in-five-seconds/">The Decoder</a>)</p></li><li><p>ByteDance launches voice-driven AI workspace AnyGen (<a href="https://equalocean.com/news/2025122521692?utm_source=chatgpt.com">EqualOcean</a>)</p></li><li><p>Gnani.ai launches Vachana STT, a foundational Indic STT model (<a href="https://cxotoday.com/press-release/gnani-ai-launches-vachana-stt-a-foundational-indic-speech-to-text-model-trained-on-1m-hours-under-india-ai-mission/?utm_source=chatgpt.com">CXO Today</a>)</p></li><li><p>Google Health AI releases a conformer-based medical STT (<a href="https://www.marktechpost.com/2025/12/23/google-health-ai-releases-medasr-a-conformer-based-medical-speech-to-text-model-for-clinical-dictation/?utm_source=chatgpt.com">MarkTechPost</a>)</p></li><li><p>Gemini adds dynamic pacing control for natural speech (<a href="https://smallbiztrends.com/google-gemini-introduces-dynamic-pacing-control-for-natural-speech/?utm_source=chatgpt.com">Small Business Trends</a>)</p></li><li><p>Hyper AI audio glasses debut at CES as a voice recorder (<a href="https://www.usatoday.com/press-release/story/21966/hyper-ai-audio-glasses-debut-at-ces-as-a-voice-recorder-with-transcription-alongside-capture-model-showcase/">USA Today</a>)</p></li><li><p>Jeff Dean on how a compute-intensive speech recognition feature made Google develop its own TPUs in 2015 (<a href="https://officechai.com/ai/jeff-dean-on-how-a-compute-intensive-speech-recognition-feature-made-google-develop-its-own-tpus-in-2015/?utm_source=chatgpt.com">OfficeChai</a>)</p></li><li><p>Alibaba open-sources voice interaction model Fun-Audio-Chat (<a href="https://pandaily.com/alibaba-open-sources-fun-audio-chat-a-new-end-to-end-voice-interaction-model?utm_source=chatgpt.com">Pandaily</a>)</p></li><li><p>TicNote AI-powered voice recorder launches in the Philippines (<a 
href="https://www.technobaboy.com/2025/12/27/ticnote-ai-powered-voice-recorder-launches-in-the-philippines/?utm_source=chatgpt.com">Technobaboy</a>) </p></li><li><p>NotebookLM may introduce long &#8216;Lecture&#8217; audio mode (<a href="https://www.business-standard.com/technology/tech-news/google-s-notebooklm-may-introduce-long-lecture-audio-mode-with-new-accent-125122600330_1.html?utm_source=chatgpt.com">Business Standard</a>)</p></li><li><p>NARRIS partners with Heartfulness Institute for speech AI (<a href="https://smestreet.in/technology/narris-partners-with-heartfulness-institute-for-speech-ai-10945352?utm_source=chatgpt.com">SMEStreet</a>)</p></li><li><p>How TTS technology is transforming content creation (<a href="https://www.toptrade.it/rubriche/tendenze/how-text-to-voice-technology-is-transforming-content-creation/?utm_source=chatgpt.com">TopTrade</a>)</p></li><li><p>Why they chose voice over chat for AI interviews (<a href="https://dev.to/ambalogun/why-i-chose-voice-over-chat-for-ai-interviews-and-why-it-almost-backfired-2fab?utm_source=chatgpt.com">DEV</a>)</p><p></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://voice-ai-newsletter.krisp.ai/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Engineering Corner &#128526;</h2><ul><li><p>Asterisk AI voice agent (<a href="https://github.com/hkjarral/Asterisk-AI-Voice-Agent?utm_source=chatgpt.com">GitHub</a>)</p></li><li><p>Deploy Mistral AI&#8217;s Voxtral on Amazon SageMaker AI (<a href="https://aws.amazon.com/blogs/machine-learning/deploy-mistral-ais-voxtral-on-amazon-sagemaker-ai/?utm_source=chatgpt.com">AWS</a>)</p></li><li><p>SpeakerLM: End-to-end versatile speaker diarization and recognition with 
multimodal large language models (<a href="https://arxiv.org/abs/2508.06372?utm_source=chatgpt.com">arXiv</a>)</p></li><li><p>Whisper statistics 2026 (<a href="https://www.aboutchromebooks.com/whisper-statistics/?utm_source=chatgpt.com">About Chromebooks</a>)</p></li><li><p>Implementing real-time streaming with VAPI for live support chat systems (<a href="https://dev.to/callstacktech/implementing-real-time-streaming-with-vapi-for-live-support-chat-systems-505j">DEV</a>)</p></li><li><p>An intelligent English-speaking training system using generative AI and speech recognition (<a href="https://www.mdpi.com/2076-3417/16/1/189">MDPI</a>)</p></li><li><p>MiraTTS: A fine-tune of Spark-TTS (<a href="https://github.com/ysharma3501/MiraTTS?utm_source=chatgpt.com">GitHub</a>)</p></li><li><p>Voice chat using WebGPU in a browser (<a href="https://x.com/tom_doerr/status/2004990292047397055?s=20">X</a>)</p></li><li><p>24/7 answering service powered by AI phone answering (<a href="https://techbullion.com/24-7-answering-service-powered-by-ai-phone-answering-the-future-of-business-communication/?utm_source=chatgpt.com">TechBullion</a>)</p></li><li><p>Human and AI voice identities evoke shared neural signatures during speaker recognition across changes in speech content and prosody (<a href="https://www.biorxiv.org/content/10.64898/2025.12.22.695263v1?rss=1">bioRxiv</a>)</p></li></ul><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://resources.krisp.ai/fullband-2025&quot;,&quot;text&quot;:&quot;Register now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://resources.krisp.ai/fullband-2025"><span>Register now</span></a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://voice-ai-newsletter.krisp.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" 
data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the most important news in Voice AI delivered directly to your inbox every week</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item></channel></rss>