In the Future of Voice AI interview series, I ask my guests three questions:
- What problems do you currently see in Enterprise Voice AI?
- How does your company solve these problems?
- What solutions do you envision in the next 5 years?
This episode’s guest is Russ d’Sa, CEO and Co-Founder of LiveKit.
Russ d'Sa is the co-founder and CEO of LiveKit. He started his first company in the 2007 batch of YC and was the 30th engineer at Twitter.
LiveKit is the network backbone for multimodal AI applications, including ChatGPT's Advanced Voice Mode and Character.ai. What began as an open-source project for real-time voice and video has evolved into a global delivery network for any modality of real-time data. Today, over 20K developers use LiveKit's APIs and tools as the real-time backbone of their applications.
Recap Video
Takeaways
- LiveKit Cloud is building the transport layer for AI computing.
- It speeds up AI communication the way flying beats driving: dedicated routes deliver data faster and more efficiently.
- Real-time communication is shifting from humans talking to humans to humans talking to AI.
- Open-source branding hurt enterprise sales; once LiveKit dropped it, Fortune 500 clients signed on.
- LiveKit hacked WebRTC's design to turn a human-to-human protocol into an AI-to-human system.
- LiveKit ran a real-time voice AI experiment combining WebRTC and ChatGPT, catching OpenAI's attention and leading to a partnership to develop Voice Mode.
- Voice AI agents today are mostly hype: glorified assistants rather than solutions to real business problems.
- Voice AI needs a stateful system that continuously listens, reacts, and adapts (a minimal sketch of such a loop follows this list).
- AI's biggest challenge in the enterprise is reliability; companies can't afford AI that only works some of the time.
- The magic of an LLM isn't executing structured workflows; it's making human-computer interaction feel fluid and intuitive.
- LiveKit is betting on a hybrid model where AI augments traditional programming logic to ensure reliability and enable natural human-AI interactions (see the second sketch below).
- Russ warns against relying solely on LLMs for structured workflows because they can produce unreliable results.
- AI is shifting from assistant to co-worker; eventually, it won't wait for human input.
- WebRTC wasn't made for AI, but LiveKit is proving it can work.
- The biggest shift in internet communication will be AI-to-AI interactions outnumbering human ones.
- These exchanges may evolve into a higher-bandwidth, non-human language, as demonstrated by the Gibberlink project born out of an ElevenLabs hackathon.
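To make the "stateful system" takeaway concrete, here is a minimal sketch of a voice agent loop: audio frames stream in continuously, finished utterances are transcribed, the model reacts, and synthesized speech streams back out, with conversation state carried across turns. The transport queues and the `transcribe`, `generate_reply`, and `synthesize` helpers are placeholders I made up for illustration, not LiveKit's actual API.

```python
import asyncio
from dataclasses import dataclass, field


# Hypothetical stand-ins for STT, LLM, and TTS; not LiveKit's real API.
async def transcribe(frame: bytes) -> str | None:
    """Pretend STT: return a finished utterance when one is detected, else None."""
    return frame.decode() if frame.endswith(b".") else None


async def generate_reply(history: list[str], utterance: str) -> str:
    """Pretend LLM call; a real agent would stream tokens as they are generated."""
    return f"(reply to: {utterance})"


async def synthesize(text: str) -> bytes:
    """Pretend TTS call returning audio bytes."""
    return text.encode()


@dataclass
class VoiceAgent:
    """A stateful loop: it keeps listening, keeps context, and reacts as audio arrives."""
    history: list[str] = field(default_factory=list)

    async def run(self, incoming: asyncio.Queue, outgoing: asyncio.Queue) -> None:
        while True:
            frame = await incoming.get()           # continuous audio input
            utterance = await transcribe(frame)    # incremental speech-to-text
            if utterance is None:
                continue                           # no complete turn yet; keep listening
            self.history.append(utterance)         # state carried across turns
            reply = await generate_reply(self.history, utterance)
            self.history.append(reply)
            await outgoing.put(await synthesize(reply))  # speak back in real time


async def main() -> None:
    mic, speaker = asyncio.Queue(), asyncio.Queue()
    agent = asyncio.create_task(VoiceAgent().run(mic, speaker))
    await mic.put(b"Where is my order.")
    print(await speaker.get())  # b'(reply to: Where is my order.)'
    agent.cancel()


asyncio.run(main())
```

The point is that the agent never "finishes": it holds conversation history and keeps reacting as audio arrives, which is what separates it from a stateless request-response assistant.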
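And here is a rough sketch of the hybrid model Russ describes, under my own assumptions: the LLM's only job is to turn free-form speech into a structured request, while plain code runs the business workflow and enforces the rules. The `call_llm` stub and the refund policy are invented for illustration.

```python
import json
from dataclasses import dataclass


# Hypothetical LLM call; in practice this would hit a real model API.
def call_llm(prompt: str) -> str:
    return json.dumps({"intent": "refund", "order_id": "A123", "amount": 42.0})


@dataclass
class RefundRequest:
    order_id: str
    amount: float


def parse_request(transcript: str) -> RefundRequest:
    """The LLM's only job: map fuzzy human speech to a structured, typed request."""
    raw = json.loads(call_llm(f"Extract a refund request as JSON: {transcript}"))
    return RefundRequest(order_id=str(raw["order_id"]), amount=float(raw["amount"]))


def process_refund(req: RefundRequest, refund_limit: float = 100.0) -> str:
    """Deterministic business logic: the rules live in code, not in the model."""
    if req.amount <= 0:
        return "rejected: amount must be positive"
    if req.amount > refund_limit:
        return "escalated: above the automatic refund limit"
    return f"refunded {req.amount:.2f} for order {req.order_id}"


if __name__ == "__main__":
    request = parse_request("Hey, I'd like my money back for order A123, it was $42.")
    print(process_refund(request))  # refunded 42.00 for order A123
```

If the model returns garbage, the typed parsing step fails loudly instead of silently executing a bad refund, which is the reliability argument for keeping structured workflows in code.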