There comes a time in most complex programs when you want to ask a simple question like, ‘Have I already processed a string with this id?’ Linear searches through an array are easy to write and work well enough for small arrays. Plus, the memory overhead of a linear search is fantastic, since it has basically none. But when your arrays can contain many elements, it is time to ditch those linear searches and go with an ordered map or an unordered map. Continue reading ‘Ordered map vs. Unordered map – A Performance Study’ »
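
As a rough illustration of the three lookup strategies mentioned above, here is a minimal C++ sketch. The function names (`seen_linear`, `seen_ordered`, `seen_unordered`) and the choice of mapping each id to an `int` are assumptions made for this example, not code from the full post.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Linear search: no extra memory beyond the array itself,
// but every lookup may scan all n elements.
bool seen_linear(const std::vector<std::string>& ids, const std::string& id) {
    return std::find(ids.begin(), ids.end(), id) != ids.end();
}

// Ordered map (std::map, a balanced tree in common implementations):
// O(log n) lookups, and keys are kept in sorted order.
bool seen_ordered(const std::map<std::string, int>& ids, const std::string& id) {
    return ids.find(id) != ids.end();
}

// Unordered map (std::unordered_map, a hash table):
// O(1) lookups on average, with no ordering of keys.
bool seen_unordered(const std::unordered_map<std::string, int>& ids,
                    const std::string& id) {
    return ids.find(id) != ids.end();
}
```

All three answer the same question; what differs is how lookup time and memory use grow with the number of elements, which is what a performance comparison would need to measure.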

Understanding the basic memory architecture of whatever system you’re programming for is necessary to create high-performance applications. Most desktop systems consist of large amounts of system memory connected to a single CPU, which may have two or three levels of fully coherent cache. Before you get started with CUDA, you should read this to understand the basic memory hierarchy of modern CUDA-capable compute devices. Continue reading ‘CUDA Memory and Cache Architecture’ »