Atomic operations are often essential for multithreaded programs, especially when different threads need to access or modify the same data. Conventional multicore CPUs generally use a test-and-set instruction to manage which thread controls which data. CUDA has a much more expansive set of atomic operations. With CUDA, you can effectively perform a test-and-set using the atomicInc() instruction. However, you can also use atomic operations to actually manipulate the data itself, without the need for a lock variable. Continue reading ‘CUDA – Tutorial 5 – Performance of atomics’ »
Posts tagged ‘GPGPU’
The main strong point of CUDA is highly parallel number crunching. Fortunately, this is a very common type of problem encountered in many high performance computing problems. Here is a list of some example applications which have been created using CUDA to achieve maximum performance that is simply not possible on a CPU alone.
CUDA stands for Compute Unified Device Architecture, and is an extension of the C programming language and was created by nVidia. Using CUDA allows the programmer to take advantage of the massive parallel computing power of an nVidia graphics card in order to do general purpose computation. Continue reading ‘What is CUDA? An Introduction’ »