This article is inspired by Ants & Aliens: Long-Term Product Vision & Strategy
“There is no point in having a 5-year plan in this industry. With each step forward, the landscape you’re walking on changes. So we have a pretty good idea of where we want to be in six months, and where we want to be in thirty years. And every six months we take another look at where we want to be in thirty years to plan out the next six months.”
Facebook’s Little Red Book
This is especially true in the era we live in today - the era of AI.
For a couple of years, I have been following this framework while thinking about the product strategy at Krisp.
Why 30 years?
We fall into a curious trap when we think about the future - we know too much to be dangerous. If I ask you to tell me where your product will be in two years, I suspect you’ll jump right into technical details. Which LLM will be most popular? Will AI gain traction with consumers?
But what if I ask you to imagine your product in 30 years? Something appealing happens when you contemplate that time horizon. It’s so far into the future that the little details have to fall away. Who the hell knows what device we’ll be using to communicate in 2046?! It’s impossible to predict. Yet it’s easy to anticipate that we’ll be using something that will be even easier, faster, more powerful, and more ubiquitous than the smartphones of today.
Let’s attempt to understand where Enterprise Voice AI will be in 30 years…
Enterprise Voice AI in 30 years
Below we have the foundational technology building blocks for Voice AI.
These technologies were created to overcome cultural and physical constraints around voice, such as the language barrier, the accent barrier, and the noise in our environment. They also make it possible to talk seamlessly to machines by voice, which has been a long-time objective for humanity.
These fundamental constraints and needs have been around for thousands of years and will continue to stick around for a long time.
Voice will stay as a primary form of communication for people
People will continue to use different languages
People will have different accents
There will be noise around us
And yes, people will want to talk to machines
What will change, of course, is the quality, speed, and cost of these technologies.
It’s safe to assume that in 30 years, all of them will be flawless, indistinguishable from humans, blazingly fast and cheap.
Bots will understand human voice input with 100% precision, across all languages
Bots will speak to humans in a way that is indistinguishable from a human (in emotion, intonation, etc.)
Humans will be able to converse with each other across any language
These technologies will be so small and fast that they will be integrated into every device, including wearables and possibly implants
Given that these technologies will be commoditized, what will be their applications in business?
Voice AI applications in business
The following are the Voice AI product categories that have traction in business today.
Conversational Intelligence (Gong, Balto, Observe, LayerAI, etc) has serious traction in Sales and Customer Service verticals
Meeting Assistants (Krisp, Otter, etc) are gaining serious traction in prosumer and team use cases
Voice Bots are a new category, gaining traction in Customer Service and Telesales
In the coming years, these categories will flourish and be transformed into bigger ones.
Meetings Wiki
Over time, Conversational Intelligence will become horizontal and will serve all functions of the company. It will morph into a knowledge base. Just as we have a wiki for documents today, we will have a wiki for meetings.
The meetings wiki will not only store passive meeting information (e.g. recordings, transcripts, summaries) but will also proactively keep us up to date on everything being discussed in the company, within the limits of each person's access permissions. This will make teams more productive.
Meetings Wiki will be integrated into a bigger Knowledge Base.
Meeting Assistants
These assistants will make us more effective communicators during meetings, both online and offline.
With real-time guidance on both what to say (ideas, facts, etc.) and how to say it (empathy, cross-language, articulation, etc.), conversations will become more effective.
Customer Bots
While there will still be a lot of human-to-human conversations with customers, businesses will use Customer Bots for the majority of inbound and outbound conversations in both telesales and customer service.
Team Bots
Our time and our availability are the most expensive resources we have as individuals. In the future, we will send our personal bots to meetings (e.g. Zoom) to talk to our colleagues and unblock them whenever necessary.
Our bots will be audio/video rendered and will possess a lot of our knowledge and up-to-date information about us. These bots will unblock our colleagues and help them make better and faster decisions.