In the Future of Voice AI series of interviews, I ask my guests three questions:
- What problems do you currently see in Enterprise Voice AI?
- How does your company solve these problems?
- What solutions do you envision in the next 5 years?
This episode’s guest is Tom Shapland, CEO at Canonical AI.
Tom Shapland is the CEO and Co-Founder of Canonical AI, where he leads work on real-time speech AI systems that sound and interact like humans. He holds a Ph.D. in plant physiology and has a strong background in applying machine learning to real-world systems. He is an alumnus of Y Combinator and is based in San Francisco.
Canonical AI helps Voice AI developers improve their Voice AI agents. They provide conversation-level analytics such as caller journeys and call failure classification. They also provide metrics on latency, interruptions, and other signal processing statistics that are relevant to Voice AI.
Takeaways
- Canonical AI builds real-time analytics tools that help developers find and fix where voice agents break.
- A lot of people hang up on voice AI agents as soon as they realize they’re talking to a bot.
- Voice AI agents fail because the core models (ASR, LLM, TTS) aren’t good enough yet.
- MVPs get shipped with patches and band-aids on top of text-based tools that aren’t built for the unique demands of speech.
- Speech-to-text still struggles with spelled-out words and letters, which ruins basic tasks like getting an email address.
- People’s reactions to voice AI are shaped by psychology, not just tech.
- Cultural habits and personal history make people instinctively distrust machines in emotional moments.
- Years of bad IVR experiences have conditioned users to expect frustration from voice systems.
- Voice AI may always be the first layer, not the final answer, for emotional or complex issues.
- Hype is everywhere, but most people still haven’t used voice AI in the wild.
- Devs don’t know what their agents are doing in production and need better visibility.
- Many teams look successful online but are stuck in MVP mode with no real usage.
- Growth comes from expanding current customers, not just landing new ones.
- Canonical gives teams insight into why calls fail so they can improve conversion.
- Real adoption will take off once users see voice AI actually solve problems.
- Voice AI needs to feel helpful and connected, not like old IVR menus.
- Devs want a platform-agnostic tool they can trust to objectively track performance and avoid vendor lock-in.
- You can’t scale voice AI without observability; flying blind doesn’t work.