Voice AI for Frontline Workers | Assaf Asbag (Chief Product & Technology Officer at aiOla)

In the Future of Voice AI interview series, I ask each guest three questions:

- What problems do you currently see in Enterprise Voice AI?
- How does your company solve these problems?
- What solutions do you envision in the next 5 years?

This episode’s guest is Assaf Asbag, Chief Product & Technology Officer at aiOla.

Listen on YouTube

Assaf Asbag is the CPTO at aiOla, leading AI-driven product innovation and enterprise solutions. He previously served as VP of AI at Playtika, where he built the AI division into a key growth engine. Assaf’s background includes advanced algorithm work at Applied Materials and leadership across engineering and data science teams. He holds B.Sc. and M.Sc. degrees in Electrical and Computer Engineering with a focus on machine learning from Ben-Gurion University, making him a recognized expert in AI and technology strategy.

aiOla's patented models and technology support over 100 languages and discern jargon, abbreviations, and acronyms, maintaining a low error rate even in noisy environments. aiOla's purpose-built technology converts manual processes in critical industries into data-driven, paperless, AI-powered workflows through cutting-edge speech recognition.

Recap Video


Takeaways

  • Turning spoken language into structured data in noisy, multilingual, and jargon-heavy environments is the real differentiator for enterprise voice AI.

  • Standard ASR models fail in frontline industries due to heavy accents, domain-specific vocabulary, and constant background noise.

  • Zero-shot keyword spotting from large jargon lists without fine-tuning can drastically cut setup time for specialized speech recognition.

  • Building proprietary, noise-heavy training datasets is essential for robust ASR performance in the real world.

  • Synthetic data generation that blends realistic noise with text-to-speech can cheaply scale model adaptation for niche environments.

  • Real-time processing is critical to making voice the primary human–technology interface, especially for operational workflows.

  • Voice AI has massive untapped potential among the world’s billion-plus frontline workers, far beyond current call center focus.

  • Incomplete or missing documentation is a hidden cost that voice-first tools can solve by capturing richer, structured information on the spot.

  • Effective enterprise AI solutions often require both a core product and flexible integration layers (SDK, API, or full app).

  • Trustworthy AI for voice will require guardrails, watermarking, bias detection, and context-aware filtering.

  • The next leap in conversational AI will be personalized, real-time adaptive systems rather than today’s generic emotion mimicry.

  • Designing for multimodal interaction (voice, text, UI) will be as important as model accuracy for user adoption.

  • AI revolutions historically create more jobs than they displace, but require new roles in monitoring, reliability, and context engineering.

  • Future speech AI should emulate human listening: diagnosing issues, correcting in real time, and adapting based on cues like pace, volume, and accent.
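The synthetic-data takeaway above can be made concrete with a minimal sketch: blending a recorded background-noise clip into text-to-speech audio at a chosen signal-to-noise ratio, so an ASR model can be adapted to a niche environment without collecting new field recordings. This is an illustrative NumPy example, not aiOla's actual pipeline; the function name and the SNR-based mixing approach are assumptions.

```python
import numpy as np

def mix_noise(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Blend a noise clip into a speech clip at a target SNR (in dB).

    speech, noise: 1-D float waveforms at the same sample rate.
    Returns a noisy copy of the speech signal.
    """
    # Loop the noise if it is shorter than the speech, then trim to length.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    # Scale the noise so that speech_power / noise_power hits the target SNR.
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    noise = noise * np.sqrt(target_noise_power / (noise_power + 1e-12))

    return speech + noise
```

In practice the `speech` array would come from a TTS engine reading jargon-heavy scripts, and `noise` from short recordings of the target environment (a factory floor, a warehouse); sweeping `snr_db` over a range yields training variants from mildly to heavily degraded audio.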
