CUDA Graphs are a great way to make workloads with many small kernels run faster, but when first released they supported only static workflows with constant parameters. That changed with the introduction of the “graph update” facility, which offers a better, more systematic, and nearly fail-safe way to deploy correct CUDA Graphs. We will also share a real-world production example that uses CUDA Graphs to accelerate TensorFlow inference in a customer’s search and recommender system. Prerequisite: intermediate knowledge of the CUDA programming model.
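To make the idea concrete, the sketch below shows the general capture-then-update pattern that the graph update facility enables: re-capture the work each iteration, attempt a cheap in-place update of the existing executable graph, and fall back to re-instantiation only if the update fails. This is an illustrative example, not the presenters' production code; the `scale` kernel, the loop structure, and the parameter values are assumptions, and the `cudaGraphInstantiate`/`cudaGraphExecUpdate` signatures shown are the CUDA 11-era ones (newer toolkits use slightly different overloads).

```cuda
#include <cuda_runtime.h>

// Hypothetical small kernel; a real workload would launch many such kernels.
__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float* d_data;
    cudaMalloc(&d_data, n * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    cudaGraphExec_t graphExec = nullptr;

    for (int step = 0; step < 10; ++step) {
        float factor = 1.0f + 0.1f * step;  // parameter that changes every iteration

        // Re-capture the work each iteration; kernel parameters may differ
        // from the previous capture.
        cudaGraph_t graph;
        cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
        scale<<<(n + 255) / 256, 256, 0, stream>>>(d_data, factor, n);
        cudaStreamEndCapture(stream, &graph);

        if (graphExec == nullptr) {
            // First iteration: instantiate the executable graph.
            cudaGraphInstantiate(&graphExec, graph, nullptr, nullptr, 0);
        } else {
            // Later iterations: try a cheap in-place update of the executable
            // graph; re-instantiate only if the update is not possible
            // (for example, if the topology changed).
            cudaGraphExecUpdateResult updateResult;
            cudaGraphNode_t errorNode;
            if (cudaGraphExecUpdate(graphExec, graph, &errorNode, &updateResult)
                    != cudaSuccess) {
                cudaGraphExecDestroy(graphExec);
                cudaGraphInstantiate(&graphExec, graph, nullptr, nullptr, 0);
            }
        }
        cudaGraphDestroy(graph);

        cudaGraphLaunch(graphExec, stream);
        cudaStreamSynchronize(stream);
    }

    cudaGraphExecDestroy(graphExec);
    cudaStreamDestroy(stream);
    cudaFree(d_data);
    return 0;
}
```

Because the update path handles changed parameters and the fallback handles changed topology, this pattern keeps the deployed graph correct regardless of how the workload varies between iterations, which is what makes the approach nearly fail-safe.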