The RAPIDS™ Accelerator for Apache Spark is a plug-in that uses RAPIDS libraries and GPUs to accelerate data processing and machine learning pipelines on Apache Spark. It accelerates existing pipelines without requiring code changes.
Enterprises of all kinds use Apache Spark for business process analytics, loading data into data warehouses, and preprocessing data at the start of machine learning pipelines.
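Because the plug-in is enabled through Spark configuration rather than application code, an existing job can typically be switched over at submit time. The following is a minimal sketch, not an official launcher. It assumes the two settings documented for the accelerator (`spark.plugins=com.nvidia.spark.SQLPlugin` and `spark.rapids.sql.enabled=true`). The helper `build_submit_command`, the application name `etl_job.py`, and the jar file name `rapids-4-spark.jar` are hypothetical placeholders. Real deployments also need GPU resource settings suited to the cluster.

```python
import shlex

# Settings that turn on the RAPIDS Accelerator. The application itself is unchanged.
RAPIDS_CONF = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
}


def build_submit_command(app, plugin_jar, extra_conf=None):
    """Return a spark-submit argument list that loads the plug-in jar.

    `app` and `plugin_jar` are paths supplied by the caller (placeholders here).
    `extra_conf` lets the caller add or override Spark settings.
    """
    conf = {**RAPIDS_CONF, **(extra_conf or {})}
    cmd = ["spark-submit", "--jars", plugin_jar]
    # Sort keys so the generated command is stable across runs.
    for key, value in sorted(conf.items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # Hypothetical job and jar names, for illustration only.
    print(shlex.join(build_submit_command("etl_job.py", "rapids-4-spark.jar")))
```

Setting `spark.rapids.sql.enabled` to `false` through `extra_conf` keeps the plug-in loaded but runs the job on CPU, which is a convenient way to compare the two modes with the same command.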
Assistance for Migration at Scale
Automate the qualifying, testing, and configuring of your Spark jobs for GPU acceleration, using AI to optimize configurations for maximum performance.
Large-scale migrations can be cut from weeks or months to hours or days, which shortens time to value and yields significant savings. To be considered for this free service, fill out the interest form.
Evaluate your own Apache Spark workloads for GPU acceleration potential and learn how to configure a cluster for optimal cost savings.
The Cloudera and NVIDIA integration will empower us to use data-driven insights to power mission-critical use cases. We are currently implementing this integration and already seeing over 10X speed improvements at half the cost for our data engineering and data science workflows.
— Joe Ansaldi, Technical Branch Chief of Research Applied Analytics and Statistics, IRS
We’re seeing significantly faster performance with NVIDIA-accelerated Spark 3 compared to running Spark on CPUs. With these game-changing GPU performance gains, entirely new possibilities open up for enhancing AI-driven features in our full suite of Adobe Experience Cloud apps.
— William Yan, Senior Director of Machine Learning, Adobe
Our continued work with NVIDIA improves performance with RAPIDS optimizations for Apache Spark 3 and Databricks to benefit our joint customers like Adobe. These contributions lead to faster data pipelines, model training and scoring, that directly translate to more breakthroughs and insights for our community of data engineers and data scientists.
— Matei Zaharia, Original Creator of Apache Spark and Chief Technologist at Databricks
Learn how to take GPU-accelerated data analytics from development to production.
To unlock the value of AI-powered big data and learn more about the next evolution of Apache Spark, download the ebook Accelerating Apache Spark 3.x—Leveraging NVIDIA GPUs to Power the Next Era of Analytics and AI.