Telecommunications

World-Class Speech AI for the Best Video Conferencing Experience

Objective

Serving accurate real-time transcriptions to millions of video conference users, improving business efficiency and customer satisfaction.

Customer

RingCentral

Use Case

Real-time Transcriptions

Technology

NVIDIA DGX A100, NVIDIA NeMo, NVIDIA Riva, NVIDIA Triton Inference Server

Accurate Transcriptions Enhance “Work from Anywhere” Collaboration

With hundreds of millions of online meetings held daily, video conferencing has become an essential tool for enterprises. Video conferencing applications use real-time transcription to power features such as live captioning and meeting summaries. RingCentral, a leading provider of unified-communications-as-a-service (UCaaS) solutions, transcribes over a billion minutes of meetings for 200,000 concurrent users on its platform. The company was looking for a transcription solution that could handle multiple accents, domain-specific jargon, and noisy environments accurately and in real time.

NVIDIA Solution

RingCentral fine-tuned NVIDIA’s state-of-the-art pretrained speech recognition models on its proprietary data with NVIDIA NeMo, an open-source framework for building conversational AI models. The models were then deployed in production using NVIDIA Riva, a GPU-accelerated SDK for deploying AI-based speech applications.

RingCentral Results

  • Word error rate reduced by more than 10 percent

  • Better quality of tasks downstream of transcription

With NVIDIA speech AI, the RingCentral team improved transcription accuracy for customers with accents from around the world and varied domain-specific vocabularies, reducing the word error rate (WER) by over 10 percent. Customers have reported substantial improvements in the quality of tasks downstream of transcription, such as meeting summarization and sentiment analysis of video conferencing and call center sessions.
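For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of reference words. The sketch below illustrates that standard definition only; it is not RingCentral's or NVIDIA's evaluation code, and the function name and sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical meeting snippet: "forecast" is misheard as "four cast",
# which counts as one substitution plus one insertion.
reference = "please share the quarterly revenue forecast"
hypothesis = "please share the quarterly revenue four cast"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")  # WER: 33.3%
```

A lower WER means fewer wrong words reach downstream features such as summarization and sentiment analysis, which is why a reduction in WER tends to show up as better quality in those tasks.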

“Using NVIDIA® Riva speech-to-text, we’re able to transcribe meeting audio in real time with high accuracy while concurrently running thousands of streams, which translates to more engaging meeting experiences for millions of RingCentral users.”

Prashant Kukde
Associate Vice President, RingCentral