Until now, we have only talked about synchronous, blocking communication in MPI. This tutorial focuses on asynchronous, non-blocking communication. Asynchronous communication is often the key to high performance in MPI applications, and it has several advantages.
The biggest advantage is that the functions are non-blocking, which allows a process to continue computing while communication with another process is still pending. Another advantage is that, used correctly, asynchronous calls can let the MPI implementation bypass some of its internal buffers, which can substantially increase the communication bandwidth of your program. There are many flavors of asynchronous communication, but the most basic ones are MPI_Isend and MPI_Irecv.
Unlike the blocking MPI_Send, MPI_Isend is non-blocking. If a process uses this function to send data, the function returns immediately, usually before the data has finished being sent. WARNING: because the function returns before the transfer has finished, you must not change any of the data in the buffer that you passed to MPI_Isend until the send has completed. For example, the following code is not safe.
    int i = 123;
    MPI_Request myRequest;
    MPI_Isend(&i, 1, MPI_INT, 1, MY_LITTLE_TAG, MPI_COMM_WORLD, &myRequest);
    i = 234;
In the code above, notice how the address of i is passed in as the buffer, because we intend to send only one integer. Now notice how we change i directly after the MPI_Isend call. This practice is inherently unsafe, because the MPI library may read the value of i after we have changed it to 234, in which case 234 is sent instead of 123. This is a race condition, which can be very difficult to debug.
Avoiding race conditions
If a process that sends data through MPI_Isend needs to reuse the buffer for something else later on, it must wait until the asynchronous send has finished. Luckily, MPI provides a function that does exactly this: MPI_Wait. Consider the sample code below:
    int i = 123;
    MPI_Request myRequest;
    MPI_Isend(&i, 1, MPI_INT, 1, MY_LITTLE_TAG, MPI_COMM_WORLD, &myRequest);
    // do some calculations here
    // Before we re-use variable i, we need to wait until the asynchronous function call is complete
    MPI_Status myStatus;
    MPI_Wait(&myRequest, &myStatus);
    i = 234;
In the code above, you can see that before the process can reuse the buffer or variable passed into MPI_Isend, it must first call MPI_Wait to avoid any potential race condition. Until that call, the process can continue to perform calculations while the communication is taking place, which can make for more efficient parallel programs.
The receiving side
So far, we’ve only covered how to send data asynchronously, but now it’s time to discuss how to receive data asynchronously. Just as there is an MPI_Recv, there is an MPI_Irecv function. When called, this function returns immediately, which allows the process to continue doing calculations while the data is still being received. However, before the receiving process actually uses the data it is receiving, it must call MPI_Wait to ensure that the data is valid.
Performance with asynchronous receive
MPI is a fairly complex standard with many different implementations from different vendors. The main reason asynchronous communication matters is that it can make your programs more efficient and thus faster. Because sends and receives are asynchronous, a sending process could call MPI_Isend before the receiving process calls MPI_Irecv. In that case, the MPI implementation may temporarily store the message in an internal buffer somewhere on the receiving machine. However, if MPI_Irecv is called before the sending process calls MPI_Isend, the MPI implementation can start filling in the buffer you provided instead of using an internal buffer first. This eliminates a memory copy on the receiving side, which can be fairly costly for large pieces of data. Therefore, if you’re sending large pieces of data, it may be in your best interest to call MPI_Irecv before the sending process calls MPI_Isend. This can be hard to arrange, but where your application allows it, it can help squeeze out a little extra performance.
Other flavors of asynchronous communication
MPI is a complicated standard with many different flavors of functions for sending and receiving data, both synchronously and asynchronously; this tutorial shows only the basic functions that are used most often. Other, less commonly used functions may have unexpected negative effects on your program's performance, so it’s a good idea to read up on them carefully before you use them. Generally, the more complicated functions are not recommended for high performance applications.