Parallel programming can be intimidating, but it doesn't need to be! There's a new paradigm for parallel programming that's newcomer-friendly, highly productive, and performant: tile-based programming models.
In this example-driven talk, we'll introduce you to tile-based programming in Python, C++, and Rust. We'll present cuTile, NVIDIA's new tile programming stack, and Tile IR, the new compiler stack that it is built on. You'll learn about recently announced CUDA Tile features, including multi-GPU communication, interoperability with traditional CUDA SIMT code, and support for more diverse kernels such as convolutions and stencils. We'll compare and contrast tile-based models with traditional parallel programming models. You'll see examples from a variety of domains, including HPC stencils, sparse matrix-vector multiplication (SpMV), a conjugate gradient (CG) solver, and AI models from TileGym.
By the end of the session, you'll understand how tile programming enables more intuitive, portable, and efficient development of high-performance, data-parallel applications for HPC, data science, and machine learning.