The time and compute cost of preprocessing can be a major factor when productionizing a machine learning solution that combines time-consuming training with a high-data-volume environment. This is especially true for the use case we're discussing. We'll review the preprocessing code for anomaly detection on sensor data from a cooling aggregate at an automotive manufacturing plant. We'll then show how to reduce the preprocessing time for the training data from nearly two days to less than 10 minutes with NVIDIA's RAPIDS stack. Along the way, we'll revisit crucial parts of the Python code and look at the performance gain from switching from Spark to RAPIDS.