Introduction to CUDA Programming and Performance Optimization
Developer Technology Engineer, NVIDIA
Highly Rated
This talk is the first in a series on core performance optimization techniques. It is intended for developers learning CUDA and teaches the basics that every CUDA developer should know to achieve good performance when writing kernels for NVIDIA GPUs. Topics include SIMT execution, control flow, global and shared memory access patterns, GPU occupancy, and identifying bottlenecks.