NVIDIA provides resources for financial institutions looking to use generative AI for intelligent document processing (IDP), such as building chatbots with retrieval-augmented generation (RAG) to automate loan processing or generating market insights for portfolio construction and trade execution.
Optimal Inference for Generative AI Workloads
NVIDIA NIM, part of NVIDIA AI Enterprise, is a set of easy-to-use inference microservices designed to accelerate the deployment of generative AI across your enterprise. This versatile runtime supports open community models and NVIDIA AI Foundation models from the NVIDIA API catalog, as well as custom AI models. NIM builds on NVIDIA Triton™ Inference Server, a powerful and scalable open-source platform for deploying AI models, and is optimized for large language model (LLM) inference on NVIDIA GPUs with NVIDIA® TensorRT™-LLM. NIM delivers AI inference with high throughput and low latency while preserving prediction accuracy, letting organizations confidently deploy AI applications anywhere, whether on premises or in the cloud.
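As a sketch of how an application might talk to a deployed NIM: the microservices expose an OpenAI-compatible chat-completions API, so a client only needs to assemble a standard JSON payload. The endpoint URL below is a placeholder for a local deployment, and the model name is one example identifier; substitute the values for your own environment.

```python
import json

# Placeholder endpoint for a locally deployed NIM; adjust host/port as needed.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt: str,
                       model: str = "meta/llama3-8b-instruct",
                       max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-compatible chat-completions payload for a NIM."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,  # low temperature favors factual answers
    }

payload = build_chat_request("Summarize the key terms of this loan agreement.")
body = json.dumps(payload)

# To actually send the request (requires a running NIM):
# import urllib.request
# req = urllib.request.Request(
#     NIM_URL, data=body.encode(),
#     headers={"Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the interface is OpenAI-compatible, existing client libraries and application code can usually be pointed at a NIM endpoint by changing only the base URL.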
Accelerate Data Curation
NVIDIA NeMo™ Curator is a scalable, GPU-accelerated data-curation microservice that prepares high-quality datasets for pretraining and customizing generative AI models. With it, financial institutions can train and fine-tune LLMs on financial documents. NeMo Curator streamlines data-curation tasks such as data download, text extraction, reformatting, cleaning, quality filtering, and exact/fuzzy deduplication to help reduce the burden of combing through unstructured data sources. Document-level deduplication ensures that LLMs are trained on unique documents, which can greatly reduce pretraining costs.
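To make the deduplication step concrete, here is a minimal CPU-only sketch of document-level exact and fuzzy deduplication: exact duplicates are dropped by hashing whitespace- and case-normalized text, and near-duplicates by Jaccard similarity over word shingles. This illustrates the idea only; NeMo Curator's own pipelines are GPU-accelerated and use scalable algorithms (such as MinHash-based fuzzy deduplication) rather than the pairwise comparison shown here.

```python
import hashlib

def exact_dedup(docs):
    """Drop documents that are identical after whitespace/case normalization."""
    seen, unique = set(), []
    for doc in docs:
        h = hashlib.sha256(" ".join(doc.split()).lower().encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            unique.append(doc)
    return unique

def shingles(text, n=3):
    """Set of n-word shingles used as the document's fingerprint."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))}

def fuzzy_dedup(docs, threshold=0.8):
    """Keep a document only if its shingle-set Jaccard similarity to every
    already-kept document stays below the threshold."""
    keep = []
    for doc in docs:
        s = shingles(doc)
        if all(len(s & shingles(k)) / max(1, len(s | shingles(k))) < threshold
               for k in keep):
            keep.append(doc)
    return keep
```

Running exact deduplication first is the cheaper pass; fuzzy deduplication then catches near-copies (boilerplate-heavy filings, templated disclosures) that exact hashing misses.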
Real-Time Information Retrieval
NeMo Retriever is a collection of CUDA-X™ microservices that enable semantic search of enterprise data, delivering highly accurate responses through retrieval augmentation. Developers can use these GPU-accelerated microservices for specific tasks, such as finding the relevant pieces of information within internal data needed to answer business questions, improving accuracy and reducing hallucinations.
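To illustrate the retrieval-augmentation pattern itself, here is a toy sketch: the query and documents are embedded, documents are ranked by cosine similarity, and the top matches are prepended to the prompt so the LLM answers from retrieved context rather than memory. The bag-of-words "embedding" below is a deliberately simple stand-in for the GPU-accelerated embedding models a real NeMo Retriever deployment provides, and all function names are illustrative.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy term-frequency 'embedding'; a real pipeline would call a
    GPU-accelerated embedding model instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def augment_prompt(query, docs):
    """Ground the LLM prompt in retrieved context to curb hallucinations."""
    context = "\n".join(retrieve(query, docs, k=2))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The grounding step at the end is what distinguishes RAG from plain prompting: the model is instructed to answer only from the retrieved passages, which is why retrieval quality directly drives answer accuracy.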