100K parallel conversations | Scott Stephenson (Co-Founder & CEO, Deepgram)

Davit Baghdasaryan

Apr 18, 2024

In The Future of Voice AI series of interviews, I ask my guests three questions:

- What problems do you currently see in Enterprise Voice AI?
- How does your company solve these problems?
- What solutions do you envision in the next 5 years?

This episode’s guest is Scott Stephenson, Co-Founder & CEO at Deepgram.

Scott is a dark matter physicist turned deep learning entrepreneur. He earned a PhD in particle physics from the University of Michigan, where his research involved building a lab two miles underground to detect dark matter. Scott left his physics post-doc research position to found Deepgram.

Deepgram is one of the largest API companies offering Speech AI technologies such as Speech-to-Text, Audio Intelligence, and the recently launched Text-to-Speech. Deepgram’s technology provides high accuracy and naturalness across multiple languages and accents. The major use cases include contact centers, conversational AI, media transcription, and speech analytics.

Takeaways

- Deepgram builds its own ASR models, which gives it the ability to tune and scale them.
- Its infrastructure handles 100K real-time conversations (on average) at any moment of the day.
- It’s easy to get an AI model to work, but far harder to scale it at a 10x lower price.
- The vast majority of Deepgram use cases are Speech-to-Text, but Text-to-Speech is starting to take off as well.
- When competing with large companies (Google, Amazon, Microsoft, etc.), it’s important to realize that you are not really competing with the entire company but with a small technical team that is generally less motivated than your startup.
- Accuracy, speed, and price are the top three problems in Speech-to-Text.
- Speech-to-Text prices have already decreased by 10x. Another 10x decrease is unlikely in the near future, at least not for real-time use cases.
- Faster AI inference chips will allow for larger and more accurate models at the same price.
- Latency under 500 ms is critical for voice bot use cases.
- Deepgram offers super-low-latency STT and TTS today.
