Welcome to my tutorial on the very basics of OpenMP. OpenMP is a powerful tool that makes multi-threaded programming much easier. If you would like your program to run faster on dual- or quad-core computers, then your project may be very well suited to OpenMP.

Usually, a program will consist of parts which should only run on one thread. Other portions of the program, such as a big for loop, could run significantly faster if multiple threads were used. When writing an application in OpenMP, by default, the code you write will be executed on one, and only one, thread, called the master thread. When you get to a point in your program that should use multiple threads, you can easily use the OpenMP compiler directives to identify these sections of code. The important thing I want you to take away from this paragraph is this: some portions of your program will be serial, and other parts will be parallelized.

Basic OpenMP functions

The two most basic functions you’ll need to know are omp_get_thread_num() and omp_get_num_threads(). These two functions return the current thread number and the total number of threads, respectively. If you call these functions during a serial (non-parallelized) portion of your program, it will look as if there is 1 thread, and the thread ID is 0, which is the master thread. However, if these functions are called within a parallelized portion of your program, then you will get the true thread number and the true total number of threads.

Declaring a parallel section of code

Because code will only run on the master thread by default, we must tell the compiler which sections of code we would like parallelized across all the threads. Defining a parallel section of code can be as easy as this:

#pragma omp parallel
{
  // Insert code here
}

In the pseudo-code above, we simply give the compiler a directive, #pragma omp parallel, and use curly brackets around the region of code we want to run on all threads.

Private variables vs. Shared variables

OpenMP applications are designed to run exclusively on shared memory systems. Your personal computer is an example of a shared memory system. Sometimes, threads will need to use memory which is available to all other threads. These are ‘shared’ variables in the sense that they are available and consistent across all threads. Other times, a thread may need variables whose values differ from thread to thread; these are ‘private’ variables. Here is an example of a private variable.

int tid;
int nThreads;

#pragma omp parallel private(tid)
{
  // This statement will run on each thread.
  // If there are 4 threads, this will execute 4 times in total

  tid = omp_get_thread_num();
  printf("Running on thread %d\n", tid);

  if (tid == 0)
  {
    nThreads = omp_get_num_threads();
    printf("Total number of threads: %d\n", nThreads);
  }
}

In this example, one variable, tid, is declared as private in the parallel directive. This variable stores the current thread ID. Because each thread has its own ID, tid must be private: its value is different for each thread. The variable nThreads, on the other hand, is only written by thread 0, and the total number of threads is the same from every thread’s perspective, so it is not necessary to declare nThreads as private.

Download the example source code here