CUDA graphs are a great way to speed up workloads composed of many small kernels, but when first released they supported only static workflows with constant parameters. That changed with the introduction of the "graph update" facility, which offers a more systematic, nearly fail-safe way to deploy correct CUDA graphs. We'll explain the potential pitfalls of ad-hoc methods for introducing CUDA graphs into real-life codes: while these often work, they can be cumbersome and prone to failure. We'll contrast them with an update-based method, in which an initial graph is created from a section of code and then updated as needed on each subsequent occurrence. The update is generally very fast as long as the structure of the code section represented by the graph remains the same. Prerequisite: intermediate knowledge of the CUDA programming model.
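The update-based pattern described above can be sketched roughly as follows. This is a minimal illustration, not code from the session: the `scale` kernel, the helper name `run_step`, and its parameters are all hypothetical, and the API signatures shown are those of the pre-CUDA-12 runtime (`cudaGraphExecUpdate` and `cudaGraphInstantiate` changed their parameter lists in CUDA 12). The idea is to re-capture the code section each iteration, attempt a cheap in-place `cudaGraphExecUpdate`, and fall back to re-instantiation only if the graph topology has changed.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the captured code section.
__global__ void scale(float *x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

// Hypothetical helper: call once per occurrence of the code section.
// *execGraph must be initialized to nullptr before the first call.
void run_step(cudaStream_t stream, float *x, float a, int n,
              cudaGraphExec_t *execGraph) {
    cudaGraph_t graph;

    // Re-capture the section; kernel parameters (a, n, x) may differ
    // from the previous iteration.
    cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(x, a, n);
    cudaStreamEndCapture(stream, &graph);

    if (*execGraph == nullptr) {
        // First occurrence: instantiate the executable graph.
        cudaGraphInstantiate(execGraph, graph, nullptr, nullptr, 0);
    } else {
        // Later occurrences: try the fast in-place update. If the
        // structure of the section changed, fall back to a full
        // re-instantiation, which always succeeds.
        cudaGraphExecUpdateResult updateResult;
        cudaGraphNode_t errorNode;
        if (cudaGraphExecUpdate(*execGraph, graph, &errorNode,
                                &updateResult) != cudaSuccess) {
            cudaGraphExecDestroy(*execGraph);
            cudaGraphInstantiate(execGraph, graph, nullptr, nullptr, 0);
        }
    }
    cudaGraphDestroy(graph);  // the executable graph keeps its own copy
    cudaGraphLaunch(*execGraph, stream);
}
```

Because the update path skips instantiation, it is typically much cheaper than rebuilding the executable graph, which is why the approach stays fast when only kernel parameters change between occurrences.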