Welcome to the second tutorial on how to write high-performance CUDA-based applications. This tutorial covers the basics of writing a kernel and organizing threads, blocks, and grids. We will complete the application from the previous tutorial by writing its kernel function. The goal of the application is very simple: take two arrays of floating-point numbers, perform an operation on them, and store the result in a third floating-point array. We will then measure how fast the code executes on a CUDA device and compare it to a traditional CPU. The data analysis takes place toward the end of the article. Continue reading ‘CUDA – Tutorial 2 – The Kernel’ »
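To make the idea in this excerpt concrete, here is a minimal sketch of such a kernel. It is not the code from the tutorial itself. It assumes the operation is element-wise addition, and the names `addArrays` and `addOnDevice`, along with the block size of 256 threads, are illustrative choices rather than anything taken from the article. The sketch also assumes `n` is greater than zero.

```cuda
#include <cuda_runtime.h>
#include <stddef.h>

// Kernel: each thread handles one element. Addition is an assumed
// stand-in for the "operation" the tutorial performs.
__global__ void addArrays(const float *a, const float *b, float *c, int n)
{
    // Global index of this thread across the whole grid.
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // The last block may extend past the end of the arrays, so guard the access.
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host helper: copies the inputs to the device, launches the
// kernel, and copies the result back. Assumes n > 0.
cudaError_t addOnDevice(const float *hostA, const float *hostB, float *hostC, int n)
{
    float *devA = NULL;
    float *devB = NULL;
    float *devC = NULL;
    size_t bytes = (size_t)n * sizeof(float);

    cudaError_t err = cudaMalloc((void **)&devA, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&devB, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&devC, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(devA, hostA, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(devB, hostB, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        int threadsPerBlock = 256;  // assumed block size, not from the article
        // Round up so that every element is covered by some thread.
        int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        addArrays<<<blocks, threadsPerBlock>>>(devA, devB, devC, n);
        err = cudaGetLastError();   // reports launch configuration errors
    }

    // This blocking copy also waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(hostC, devC, bytes, cudaMemcpyDeviceToHost);

    // cudaFree on a null pointer is a no-op, so this is safe on every path.
    cudaFree(devA);
    cudaFree(devB);
    cudaFree(devC);
    return err;
}
```

A caller would pass three host arrays of length `n` to `addOnDevice` and check that the return value is `cudaSuccess` before reading the output array. The grid is sized by rounding up, which is why the kernel needs the `i < n` check.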
Archive for the ‘CUDA’ Category
Welcome to the first tutorial on getting started with CUDA programming. This tutorial will show you how to do calculations with your CUDA-capable GPU. Any nVidia chip that is series 8 or later is CUDA-capable. This tutorial will also give you some data on how much faster the GPU can do calculations when compared to a CPU. Continue reading ‘CUDA – Tutorial 1 – Getting Started’ »
Welcome to the first article in a series of tutorials to teach you the basics of using CUDA. These tutorials will teach you, in a user-friendly way, how CUDA works, and how to take advantage of the massive computational ability of modern GPUs.
The main strength of CUDA is highly parallel number crunching. Fortunately, this type of problem is very common in high-performance computing. Here is a list of some example applications which have been created using CUDA to achieve performance that is simply not possible on a CPU alone.
CUDA stands for Compute Unified Device Architecture. It is an extension of the C programming language created by nVidia. CUDA lets the programmer use the massively parallel computing power of an nVidia graphics card for general-purpose computation. Continue reading ‘What is CUDA? An Introduction’ »