Understanding the basic memory architecture of whatever system you’re programming for is necessary to create high performance applications. Most desktop systems consist of large amounts of system memory connected to a single CPU, which may have 2 or three levels or fully coherent cache. Before you get started with CUDA, you should read this to understand the basic memory hierarchy of modern CUDA capable compute devices. Continue reading ‘CUDA Memory and Cache Architecture’ »