Leveraging LILT and NVIDIA GPUs, the agency achieved translation throughput of up to 150,000 words per minute, benchmarked across four languages. The deployment on AWS supports scaling far beyond what was possible on premises: elastic GPU resources absorb peak workloads, so no end user waits on a queue and no mission goes without adequate support and resourcing.
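As an illustration of how that elasticity can be wired up, the sketch below uses boto3 to attach a target-tracking scaling policy to a hypothetical EC2 Auto Scaling group of GPU instances, so capacity grows automatically as load rises. The group name, policy name, and target value are illustrative assumptions, not LILT's actual configuration.

```python
# A minimal sketch, assuming an existing EC2 Auto Scaling group of GPU
# instances; the group name "lilt-inference-gpu" is hypothetical.
import boto3

autoscaling = boto3.client("autoscaling")
autoscaling.put_scaling_policy(
    AutoScalingGroupName="lilt-inference-gpu",   # hypothetical GPU instance group
    PolicyName="track-average-utilization",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 60.0,  # scale out before request queues build up
    },
)
```

With a policy like this in place, the autoscaler adds instances as utilization climbs toward the target and removes them as demand subsides, which is what keeps throughput steady during peaks.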
To achieve this result, LILT accelerated model training with NVIDIA A100 Tensor Core GPUs and model inference with NVIDIA T4 Tensor Core GPUs. Using models developed with NeMo, LILT's platform can deliver up to a 30X boost in character throughput at inference time compared with equivalent models running on CPUs. In addition, the NVIDIA AI platform enables LILT to scale its models to 5X their previous size while improving not only latency but also quality.
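To illustrate the kind of CPU-versus-GPU comparison behind a figure like 30X, here is a minimal PyTorch sketch that times forward passes of a stand-in Transformer on both devices. The model dimensions, batch size, and iteration count are arbitrary assumptions; real speedups depend on the model, precision, and hardware.

```python
# A minimal sketch of measuring the CPU-vs-GPU inference throughput gap
# with a stand-in Transformer; dimensions and batch sizes are illustrative,
# not LILT's production configuration.
import time
import torch

def benchmark(device: str, n_iters: int = 20) -> float:
    """Return approximate tokens/second for forward passes on a device."""
    model = torch.nn.Transformer(d_model=512, nhead=8,
                                 num_encoder_layers=6,
                                 num_decoder_layers=6).to(device).eval()
    src = torch.randn(64, 32, 512, device=device)  # (seq_len, batch, d_model)
    tgt = torch.randn(64, 32, 512, device=device)
    with torch.no_grad():
        model(src, tgt)                            # warm-up pass
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(n_iters):
            model(src, tgt)
        if device == "cuda":
            torch.cuda.synchronize()               # wait for GPU work to finish
    elapsed = time.perf_counter() - start
    return 64 * 32 * n_iters / elapsed             # tokens processed per second

cpu_tps = benchmark("cpu")
gpu_tps = benchmark("cuda") if torch.cuda.is_available() else 0.0
print(f"CPU: {cpu_tps:,.0f} tok/s  GPU: {gpu_tps:,.0f} tok/s  "
      f"speedup: {gpu_tps / cpu_tps:.1f}x")
```

The synchronize calls matter: CUDA kernels launch asynchronously, so timing without them would measure only launch overhead rather than actual throughput.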
NeMo is included as a part of NVIDIA AI Enterprise, which provides a production-grade, secure, end-to-end software platform for enterprises building and deploying accelerated AI software.
LILT's adaptive machine learning models improve continuously: when linguists review content and provide corrections, that input is leveraged as training data for model fine-tuning, as sketched below. This continuous context enhancement keeps LILT's models current with ever-changing colloquial language across social media and other crucial content sources. LILT also deploys a multi-faceted workflow with applications for linguists and non-linguists alike, letting each team work autonomously and apply its unique skill set for maximum efficiency in time-sensitive situations.
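A minimal PyTorch sketch of that feedback loop might look like the following: linguist-reviewed (source, corrected translation) pairs are replayed as supervised examples for incremental gradient steps. The tiny model, byte-level encoding, example data, and hyperparameters are all hypothetical stand-ins, not LILT's production pipeline.

```python
# A minimal sketch of adaptive fine-tuning from reviewed segments; every
# component here is an illustrative assumption.
import torch
import torch.nn as nn

VOCAB, D_MODEL, MAX_LEN = 256, 128, 64  # toy byte-level vocabulary

class TinySeq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.core = nn.Transformer(d_model=D_MODEL, nhead=4,
                                   num_encoder_layers=2,
                                   num_decoder_layers=2)
        self.head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src, tgt):
        h = self.core(self.embed(src), self.embed(tgt))
        return self.head(h)

def encode(text: str) -> torch.Tensor:
    """Byte-encode and pad a segment; returns shape (seq_len, batch=1)."""
    ids = list(text.encode("utf-8"))[:MAX_LEN]
    ids += [0] * (MAX_LEN - len(ids))
    return torch.tensor(ids).unsqueeze(1)

# Linguist-reviewed (source, corrected translation) pairs -- hypothetical data.
reviewed = [("The mission briefing is at noon.",
             "La réunion d'information est à midi.")]

model = TinySeq2Seq()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for src_text, tgt_text in reviewed:
    src, tgt = encode(src_text), encode(tgt_text)
    logits = model(src, tgt[:-1])                  # teacher forcing
    loss = loss_fn(logits.reshape(-1, VOCAB), tgt[1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()                                     # one adaptation step
```

The key design point is that each reviewed segment is a small, high-quality supervised example, so a handful of gradient steps can shift the model toward a reviewer's terminology without retraining from scratch.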
Learn more about how other organizations are leveraging LILT to improve their own operations and customer experiences.