Understanding the basic memory architecture of whatever system you’re programming for is necessary to create high-performance applications. Most desktop systems consist of large amounts of system memory connected to a single CPU, which may have two or three levels of fully coherent cache. Before you get started with CUDA, you should read this to understand the basic memory hierarchy of modern CUDA-capable compute devices. Continue reading ‘CUDA Memory and Cache Architecture’ »
Posts tagged ‘Coherence’
High-level languages such as C, C++, C#, FORTRAN, and Java all do a great job of abstracting the hardware away from the programmer. This means that programmers generally don’t have to worry about how the hardware goes about executing their program. However, to get maximum performance out of your programs, you need to start thinking about how the hardware is actually going to execute them.