Author Archive

This tutorial will be discussing how different threads can communicate with each other. In the previous tutorial, each thread operated without any interaction or data dependency from other threads. However, most parallel algorithms require some amount of data to be communicated between threads. Continue reading ‘CUDA – Tutorial 3 – Thread Communication’ »

Virtually all useful programs have some sort of loop in the code, whether it is a for, do, or while loop. This is especially true for all programs which take a significant amount of time to execute. Much of the time, different iterations of these loops have nothing to do with each other, therefore making these loops a prime target for parallelization. OpenMP effectively exploits these common program characteristics, so it is extremely easy to allow an OpenMP program to use multiple processors simply by adding a few lines of compiler directives into your source code. Continue reading ‘Tutorial – Parallel For Loops with OpenMP’ »

Welcome to the second tutorial in how to write high performance CUDA based applications. This tutorial will cover the basics of how to write a kernel, and how to organize threads, blocks, and grids. For this tutorial, we will complete the previous tutorial by writing a kernel function. The goal of this application is very simple. The idea is to take two arrays of floating point numbers, and perform an operation on them and store the result in a third floating point array. We will then study how fast the code executes on a CUDA device, and compare it to a traditional CPU. The data analysis will take place toward the end of the article. Continue reading ‘CUDA – Tutorial 2 – The Kernel’ »

Welcome to the first tutorial for getting started programming with CUDA. This tutorial will show you how to do calculations with your CUDA-capable GPU. Any nVidia chip with is series 8 or later is CUDA -capable. This tutorial will also give you some data on how much faster the GPU can do calculations when compared to a CPU. Continue reading ‘CUDA – Tutorial 1 – Getting Started’ »

Welcome to my tutorial on the very basics of OpenMP. OpenMP is a powerful and easy tool which makes multi-threaded programming very easy. If you would like your program to run faster on dual, or quad core computers, then your project may be very well suited to OpenMP. Continue reading ‘OpenMP tutorial – the basics’ »

Welcome to my tutorial on how to get started with writing OpenMP applications in Visual Studio. First things first, OpenMP is not available for the express or standard versions of Microsoft Visual Studio. Therefore, you will need the professional version or higher if you want to use visual studio to develop OpenMP project. Continue reading ‘Getting started with OpenMP on Visual Studio’ »

Over half of all computers sold today have more than one processor. While most new computers have two CPUs, the percentage of computers with four CPUs is steadily increasing. This trend will continue to increase well into the future. This is where OpenMP steps in.

Continue reading ‘What is OpenMP?’ »

In our previous tutorial, Thread Communication with MPI, we covered the basics of how to send data between threads. However, in the previous tutorial, only integers were sent. However, sometimes large amounts of data need to be sent between threads.

Continue reading ‘Sending large datasets in MPI’ »

Welcome to the thread communication with MPI tutorial. If you’re new to MPI, I suggest you go back and read the previous tutorials first. Otherwise, continue on to learn basic thread communication with MPI!

Continue reading ‘Thread Communication with MPI’ »

Welcome to the first article in a series of tutorials to teach you the basics of using CUDA. These tutorials will teach you, in a user-friendly way, how CUDA works, and how to take advantage of the massive computational ability of modern GPUs.

Continue reading ‘CUDA – The Basics’ »