Since the launch of ChatGPT in November 2022, the world has changed. So has our collective imagination. We realized that the future is much closer than we had thought.
Virtually every software company has revisited its business strategy and product roadmap. Contact Center platforms are no exception.
One of the primary questions product executives asked themselves was (and still is): will our current product strategy be relevant in 5 years?
One particular area of disruptive technology that has a long-tail effect in Contact Centers is AI Voice Bots. Will AI Voice Bots eventually replace Human Agents?
This is a billion-dollar question with large industry implications, and in this article we will try to explore it from various perspectives.
IVR - the early Voice Bot
Interactive Voice Response (IVR) systems are considered the basic version of modern AI Voice Bots, serving as the initial step in automating customer service in Contact Centers.
These systems use pre-recorded messages and menu options to guide callers through a series of choices, aiming to resolve their queries without human intervention.
IVR technology has proven successful in efficiently managing high volumes of calls and providing 24/7 service, thereby reducing wait times and operational costs.
However, IVR systems lack the advanced capabilities of AI Voice Bots, which use natural language processing to understand and respond to a wider range of customer queries with more nuanced and conversational interactions.
Despite this, IVR remains a foundational technology that has paved the way for more sophisticated and interactive AI solutions in Call Centers.
Benefits of modern AI Voice Bots
Modern AI Voice Bots provide several potential benefits today:
Cost Efficiency: Voice bots can significantly reduce labor costs as they don't require salaries, benefits, or breaks. Once developed and implemented, they can operate 24/7, handling multiple inquiries simultaneously.
Consistency and Scalability: Bots provide consistent responses to customer queries, ensuring uniform service quality. They can also be easily scaled to handle an increased load during peak times without the need for additional hiring or training.
Technological Advancements: Rapid advancements in AI are continuously improving the capabilities of voice bots, making their interactions more human-like and efficient.
Error Reduction: Bots eliminate human errors and provide accurate, data-driven responses. They can access and analyze large volumes of data instantly to support their interactions.
Increased Productivity: By handling routine inquiries, bots allow human agents to focus on more complex and sensitive issues, thereby increasing overall productivity and service quality.
What do surveys tell us today?
Today, customers as well as contact center leaders think that people prefer talking to people, even when the outcome and the time taken are identical 🤔
I expect this attitude will change over time as Voice AI technology matures and there are more and more early deployments of it.
Challenges faced by AI Voice Bots today
Customers usually prefer voice communication when the problem they are facing involves high emotion, high urgency, or high complexity.
High emotion: e.g. a complaint
High urgency: e.g. checking the arrival time of a train
High complexity: e.g. difficulties completing a mortgage application form
So if bots were to resolve such problems, they would need to:
Have high empathy
Work with very low latency
Understand context and perform complex tasks
Several factors make it challenging for them to fully replace human agents today:
Complex Human Emotions: Voice bots struggle to understand the nuances of human emotions. They may not effectively handle situations requiring empathy, humor, or subtlety, which are natural to human interactions.
Variability in Speech: Humans use a wide range of accents, dialects, and slang. Voice bots often have difficulty understanding and processing this variability, leading to misunderstandings or errors in communication.
Contextual Understanding: While AI has made significant strides, understanding context, especially in complex or multi-turn conversations, remains a challenge. Bots may fail to grasp the underlying intent or changes in topic, leading to irrelevant or incorrect responses.
Adaptability: Human agents can adapt to unexpected situations or new information quickly. Voice bots, however, require extensive programming and data to handle new scenarios, making them less flexible in dynamic environments.
Complex Tasks and Decision Making: Complex problem-solving and decision-making, especially where nuanced judgment is required, are areas where humans excel. Voice bots are not yet capable of handling such complexity with the same level of competence.
While technology is rapidly advancing, these challenges highlight why voice bots are not yet ready to fully replace humans, particularly in roles that require deep understanding, empathy, and adaptability.
Today, chatbots handle only a fairly small number of request types.
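The three factors listed earlier in this section (emotion, urgency, complexity) can be turned into a simple routing rule. The sketch below is only an illustration: the `Call` fields, the 0-to-1 scores, and the 0.7 threshold are assumptions made for the example, not values from any survey or product.

```python
from dataclasses import dataclass


@dataclass
class Call:
    # Hypothetical scores between 0.0 (low) and 1.0 (high).
    emotion: float
    urgency: float
    complexity: float


def route(call: Call, threshold: float = 0.7) -> str:
    """Send the call to a human if any single factor is high, else to a bot.

    The threshold is an assumed value chosen for illustration.
    """
    if max(call.emotion, call.urgency, call.complexity) >= threshold:
        return "human"
    return "bot"


# A calm, routine request goes to the bot; an upset caller goes to a human.
routine = route(Call(emotion=0.1, urgency=0.2, complexity=0.3))
complaint = route(Call(emotion=0.9, urgency=0.4, complexity=0.2))
```

As bots improve on empathy, latency, and context handling, the threshold in a rule like this would rise and more calls would stay with the bot.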
Scenario 1: Voice AI will mostly empower the Agents and won't go beyond that
One hypothesis is that voice bots won't replace human agents; they will mostly serve to assist them.
While Voice Bots can undoubtedly resolve many customer problems, the majority would still need to be solved by humans.
The following chart shows how US business opinion has changed over time. While in 2020 the majority thought that AI would indeed replace agents, over the last 3 years this opinion has changed dramatically.
More people now hold the opinion that AI will be a great assistant for agents (which is obviously true) rather than a replacement for them.
Voice AI offers a chance to give agents quick and solid support during calls. AI can advise agents on the next best step, bring up useful info from the knowledge base, and suggest strategies based on the customer's past interactions to improve sales and add upsell opportunities.
Voice AI can help agents speak with the listener's accent and language to improve comprehension.
It can listen to live call audio and start tasks such as providing information and handling back-office activities. It can also offer guidance or warnings if there is a long pause in the conversation or if something goes wrong.
Ultimately, in this scenario, Voice Bots will make agents highly productive and predictable but won't replace them over time.
Scenario 2: in 10 years, bots will handle the majority of calls
No doubt bots will be great assistants for human agents. Over the next 10 years, they will make human agents way more productive.
Agent Assist will be a thriving category.
However, over time, as Voice AI advances and demonstrates more intelligence and empathy, it will take over more and more tasks from human agents.
Imagine having agents that
don't require onboarding and training
are obsessed with your customers
have decent problem-solving skills
are polite, have a positive attitude and show great compassion
have unlimited memory and can work 24/7
And they are 10x more affordable than human agents.
Once you imagine this, it's hard to un-imagine it.
If the above is true, in the long term (10+ years), it's obvious that AI Voice Bots will be the preferred option.
My estimate is that in 10 years bots will handle about 80% of voice calls in contact centers and humans about 20%.
The current limitations in Voice AI (e.g. robotic voice, no empathy, challenges around noisy speech-to-text, etc.) are bugs. Engineers will solve these bugs one by one and we will gradually accept this transformation. This process won't happen overnight; it will proceed in small iterations.
The economic incentive (bot vs human) is so obvious and high that the market is going to do everything for the technology to catch up and then become much better than humans.
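To make the size of that incentive concrete, here is a back-of-the-envelope sketch that combines two figures from this scenario: bots being 10x more affordable and an 80%/20% bot-to-human split. It assumes the cost of a human-handled call is normalized to 1.0 and that bot and human calls are otherwise comparable, which is a simplification.

```python
def blended_cost_per_call(bot_share: float,
                          bot_cost_ratio: float,
                          human_cost: float = 1.0) -> float:
    """Average cost per call when bots handle `bot_share` of all calls.

    bot_cost_ratio is the bot cost as a fraction of the human cost,
    e.g. 0.1 for "10x more affordable". All inputs are assumptions.
    """
    if not 0.0 <= bot_share <= 1.0:
        raise ValueError("bot_share must be between 0 and 1")
    bot_cost = human_cost * bot_cost_ratio
    return bot_share * bot_cost + (1.0 - bot_share) * human_cost


# 80% of calls at one tenth of the cost, 20% at full cost.
cost = blended_cost_per_call(bot_share=0.8, bot_cost_ratio=0.1)
print(f"{cost:.2f}")  # prints 0.28
```

Under these assumptions the average cost per call drops to roughly 28% of the all-human baseline, a saving of about 72%.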
Startups building AI Voice Bots that can place or receive calls
Poly AI
Poly AI is a technology company specializing in conversational AI. They create advanced voice assistants capable of understanding and responding in natural language, aiming to provide human-like interactions. Their technology is used across various industries for customer service and other applications, leveraging deep learning to understand context and intent.
SynthFlow AI
SynthFlow AI is an all-in-one solution for building sophisticated AI agents and scaling intelligent solutions without any coding.
Their models are trained to handle different sales processes: cold calling, lead qualification, appointment scheduling, and CRM management. Example use cases:
Human-Like AI Voice
Fast Appointment Scheduling
Vocode
Vocode provides tools and abstractions to build any kind of voice-based application on top of LLMs. Examples of things you can build with Vocode include setting up LLMs to answer/make phone calls, act as personal assistants, join Zoom meetings, and more. What Vocode provides:
Conversation abstractions (streaming, turn-based)
Conversation functionality (endpointing, emotion tracking)
Integrations to all of the best speech-to-text/text-to-speech providers
Cross-platform support (telephony, web, Zoom)
Vapi
Vapi is an API for building voice assistants.
Vapi abstracts away the speech-to-speech pipeline and connects to the providers on your behalf. It applies various latency optimizations, such as end-to-end streaming and colocating servers, to squeeze out every millisecond of latency it can. It also manages the coordination of interruptions, turn-taking, and other conversational dynamics.
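For readers unfamiliar with what a speech-to-speech pipeline involves, a single conversational turn typically chains three stages: speech-to-text, a language model, and text-to-speech. The sketch below is a generic illustration with stand-in functions; it is not the API of Vapi, Vocode, or any other vendor, and every name in it is hypothetical.

```python
from typing import Callable

# Each stage is a plain function so real providers could be swapped in.
Transcriber = Callable[[bytes], str]   # speech-to-text
Responder = Callable[[str], str]       # language model
Synthesizer = Callable[[str], bytes]   # text-to-speech


def make_turn_handler(transcribe: Transcriber,
                      respond: Responder,
                      synthesize: Synthesizer) -> Callable[[bytes], bytes]:
    """Build a function that handles one caller turn, audio in and audio out."""
    def handle_turn(audio_in: bytes) -> bytes:
        text = transcribe(audio_in)    # what the caller said
        reply = respond(text)          # what the bot should answer
        return synthesize(reply)       # the spoken answer
    return handle_turn


# Stand-in stages for illustration only: "audio" is just UTF-8 text here.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


handle_turn = make_turn_handler(fake_transcribe, fake_respond, fake_synthesize)
reply_audio = handle_turn(b"hello")
```

This version runs the stages one after another, so the delays add up. Streaming each stage into the next and handling interruptions is the hard part that such platforms aim to take care of.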
I don't think so.