Mastering Speech AI for Multilingual Multimedia Transformation
Senior Product Manager, NVIDIA
Machine Learning Engineer, OVHcloud
Building practical real-time speech AI applications requires sophisticated software that can handle natural speech, different accents, and domain-specific vocabularies across languages and environments. Learn how to build real-time multimedia transcription, from selecting and optimizing speech AI models to API deployment. We'll show you how to add subtitles and dubbing in a specific language using Riva speech recognition, text-to-speech, and translation. We'll also discuss advanced features such as speaker diarization and text/video extraction. We'll demonstrate customization techniques, such as adapting to domain-specific jargon (medical, legal, etc.) to improve speech transcription, and tuning synthesized speech for different pronunciations, tones, and accents. Finally, we'll bring everything together by showing how to build a simple web application that automatically creates subtitles and dubs in a target language.
Event:
Date:
Industry:
Level:
Topic:
NVIDIA technology: Cloud / Data Center GPU, NeMo, TensorRT, Triton