On-Device Speech-to-Text, 10x cheaper 🔥

Aug 29, 2024

Modern laptops and PCs increasingly ship with dedicated AI chips (NPUs).

I wrote about it here:

Voice AI Newsletter

The AI PC Era 🚀 for Call Centers is here

“The AI PC will be a sea change moment in technical innovation…

And here:

Voice AI Newsletter

The Power of On-Device Transcription in Call Centers 💪

With advancements in Speech-to-text AI and on-device AI, the call center industry is approaching a transformative change. We should start rethinking the traditional approach of cloud-based transcriptions, bringing the process directly onto the agents' devices…

2 years ago · Davit Baghdasaryan

This trend allows more Speech-to-Text workloads to be moved to on-device AI.

This is incredibly valuable in call centers and BPOs.

The problem with STT is that you first need to obtain the recordings from softphone platforms, and unfortunately that is not always possible.

- There might be no integration available with the softphone

- The cost might be prohibitive

- The customer might not allow it due to compliance requirements

- It might take days to obtain the recordings

Introducing Krisp Speech-to-Text API 🔥

At Krisp, we just launched an on-device Speech-to-Text API, specifically designed for call centers and BPOs.

- Automatically supports all voice platforms (no integration needed)

- Up to 10x cheaper than typical industry pricing

- Customer data is not sent to any servers (including Krisp's)

- Works both post-call and in real time

- Even PII/PCI is removed on-device

Who is it built for?

We have been building this technology for more than two years and are excited to bring it to our call center customers.

The solution is perfect for call centers and BPOs wanting to build Speech Analytics, Customer QA or Agent Assist technologies to make their operations more effective. Speech-to-Text is a core building block for these technologies.

It’s also perfect for enterprises that don’t want to share internal or customer data with third parties.

Since Speech-to-Text runs on the agent's device rather than on cloud servers, the pricing is disruptive: up to 10x cheaper.

How it works

The installation process is straightforward:

- Install the Krisp app on agents’ devices (the same app that provides Noise Cancellation and Accent Localization)

- Turn on Speech-to-Text from Krisp’s web admin dashboard

- Specify the private cloud where you want call transcripts to be uploaded (e.g. S3)

Once set up, as soon as a call ends, Krisp uploads the transcript to the private cloud location, with less than 1 second of latency.

If Recording is enabled, Krisp will also record calls and upload recordings to the same private cloud, along with call transcripts.
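Once transcripts land in your private cloud, downstream tools such as Speech Analytics or Customer QA can read them directly. Here is a minimal sketch of that step. The JSON layout below (`segments`, `speaker`, `start`, `end`, `text`) is purely hypothetical and is not Krisp's documented format; it only assumes what the post describes, namely text assigned to speakers with timestamps.

```python
import json
from collections import defaultdict

# Hypothetical transcript layout, invented for illustration only.
# The real file format may differ; adjust the field names to match it.
SAMPLE_TRANSCRIPT = """
{"segments": [
  {"speaker": "agent", "start": 0.0, "end": 4.2, "text": "Thanks for calling, how can I help?"},
  {"speaker": "customer", "start": 4.5, "end": 9.0, "text": "I have a question about my invoice."},
  {"speaker": "agent", "start": 9.3, "end": 12.3, "text": "Sure, let me pull that up."}
]}
"""


def talk_time_by_speaker(transcript_json: str) -> dict[str, float]:
    """Sum the speaking time (in seconds) of each speaker in one transcript."""
    transcript = json.loads(transcript_json)
    totals: dict[str, float] = defaultdict(float)
    for segment in transcript["segments"]:
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    # Round to hide floating-point noise from the subtraction.
    return {speaker: round(seconds, 2) for speaker, seconds in totals.items()}


if __name__ == "__main__":
    print(talk_time_by_speaker(SAMPLE_TRANSCRIPT))
```

Talk time per speaker is just one example of a metric; the same loop could feed keyword spotting or QA scoring once the real field names are known.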

Integration with softphone systems

The solution works automatically with top CX and voice platforms such as Genesys, Avaya, TalkDesk, Teams, Zoom, and more. No integration work is required, which simplifies the implementation process.

Accuracy

Krisp-generated transcripts go through several post-processing steps to keep transcript quality high:

- Achieves a WER (Word Error Rate) of only 7.96%

- Adds punctuation and capitalization, and formats numerical values

- Assigns text to speakers, with timestamps

- If enabled, removes PII/PCI and filler words
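For readers less familiar with the metric: WER is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. A WER of 7.96% means roughly 8 errors per 100 reference words. Below is a minimal sketch of the standard calculation. The lowercasing and whitespace tokenization are simplifying assumptions, and this is not necessarily how Krisp measured its figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumption: simple normalization (lowercase, split on whitespace).
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Because WER depends on how text is normalized (casing, punctuation, number formatting), figures from different vendors are only comparable when measured on the same test set with the same normalization.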

Supporting 4 languages

Krisp’s STT API currently supports English, German, French and Spanish. More languages will come over time.

Learn More

You can learn more here.
