Point to Point Communication




MPI provides a set of send and receive functions that allow the communication of typed data with an associated tag. Typing of the message contents is necessary for heterogeneous support: the type information is needed so that correct data-representation conversions can be performed as data is sent from one architecture to another. The tag allows selectivity of messages at the receiving end: one can receive on a particular tag, or one can wild-card this quantity, allowing reception of messages with any tag. Message selectivity on the source process of the message is also provided.

A fragment of code appears in figure 1 for the example of process 0 sending a message to process 1. This code executes on both process 0 and process 1. The example sends a character string. MPI_COMM_WORLD is a default communicator provided upon start-up. Among other things, a communicator serves to define the allowed set of processes involved in a communication operation. Process ranks are integers, serve to label processes, and are discovered by inquiry to a communicator (see the call to MPI_Comm_rank()). The typing of the communication is evident in the specification of MPI_CHAR. The receiving process specifies that the incoming data is to be placed in msg and that it has a maximum size of 20 elements of type MPI_CHAR. The variable status, set by MPI_Recv(), gives information on the source and tag of the message and how many elements were actually received. For example, the receiver can examine this variable to find out the actual length of the character string received.
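Figure 1 itself is not reproduced in this version of the text. A sketch of the kind of fragment it describes might look like the following; the message text and the tag value 99 are illustrative assumptions, not taken from the original figure.

```c
#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    char msg[20];
    int myrank;
    int tag = 99;             /* illustrative tag value */
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* discover this process's rank */

    if (myrank == 0) {        /* code executed only by process 0 */
        strcpy(msg, "Hello there");
        MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, 1, tag, MPI_COMM_WORLD);
    } else if (myrank == 1) { /* code executed only by process 1 */
        MPI_Recv(msg, 20, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
        printf("received: %s\n", msg);
    }

    MPI_Finalize();
    return 0;
}
```

Note that the same program runs on every process; the branch on myrank selects the sender's and receiver's roles, which is the usual SPMD style of MPI programs.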

This example employed blocking send and receive functions. The send call blocks until the send buffer can be reclaimed (i.e., after the send, process 0 can safely over-write the contents of msg). Similarly, the receive function blocks until the receive buffer actually contains the contents of the message. MPI also provides non-blocking send and receive functions that allow the possible overlap of message transmittal with computation, or the overlap of multiple message transmittals with one another. Non-blocking functions always come in two parts: the posting functions, which begin the requested operation, and the test-for-completion functions, which allow the application program to discover whether the requested operation has completed.
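The two-part structure can be sketched as follows. This is a hedged illustration of the posting and test-for-completion calls (MPI_Irecv and MPI_Wait); the tag, rank, and buffer size are assumptions carried over from the earlier example.

```c
#include "mpi.h"

/* Sketch: overlap a receive with computation on process 1.
   Assumes process 0 sends a message with tag 99, as in the
   earlier example. */
void overlapped_receive(void)
{
    char msg[20];
    MPI_Request request;
    MPI_Status status;

    /* posting function: begins the receive and returns immediately */
    MPI_Irecv(msg, 20, MPI_CHAR, 0, 99, MPI_COMM_WORLD, &request);

    /* ... useful computation proceeds here while the message
       may still be in transit ... */

    /* test-for-completion function: blocks until the receive
       has actually completed; only then is msg safe to read */
    MPI_Wait(&request, &status);
}
```

MPI_Test() could be used in place of MPI_Wait() when the program wants to poll for completion without blocking.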

This seems like rather a lot to say about a simple transmittal of data from one process to another, but there is even more. To understand why, we examine two aspects of the communication: the semantics of the communication primitives, and the underlying protocols that implement them. Consider the previous example, on process 0, after the blocking send has completed. The question arises: if the send has completed, does this tell us anything about the receiving process? Can we know that the receive has finished, or even that it has begun?

Such questions of semantics are related to the nature of the underlying protocol implementing the operations. If one wishes to implement a protocol that minimizes the copying and buffering of data, the most natural semantics might be the "rendezvous" version, in which completion of the send implies that the receive has at least been initiated. On the other hand, a protocol that attempts to block processes for the minimal amount of time will necessarily end up doing more buffering and copying of data.

The trouble is, one choice of semantics is not best for all applications, nor is it best for all architectures. Because the primary goal of MPI is to standardize the operations without sacrificing performance, the decision was made to include all the major choices for point-to-point semantics in the standard.

An additional, complicating factor is that the amount of space available for buffering is always finite. On some systems the amount of space available for buffering may be small or non-existent. For this reason, MPI does not mandate a minimal amount of buffering, and the standard is very careful about the semantics it requires.

The above complexities are manifested in MPI by the existence of modes for point-to-point communication. Both blocking and non-blocking communications have modes. The mode allows one to choose the semantics of the send operation and, in effect, to influence the underlying protocol of the transfer of data.

In standard mode the completion of the send does not necessarily mean that the matching receive has started, and no assumption should be made in the application program about whether the out-going data is buffered by MPI. In buffered mode the user can guarantee that a certain amount of buffering space is available; the catch is that the space must be explicitly provided by the application program. In synchronous mode a rendezvous semantics between sender and receiver is used. Finally, there is ready mode, which allows the user to exploit extra knowledge to simplify the protocol and potentially achieve higher performance: in a ready-mode send, the user asserts that the matching receive has already been posted.
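The four modes correspond to four send functions with identical argument lists. The following sketch shows one call of each; the destination rank, tag, and buffer size are illustrative assumptions.

```c
#include <stdlib.h>
#include "mpi.h"

/* Sketch of the four blocking send modes. All four take the same
   arguments as MPI_Send; dest = 1 and tag = 0 are illustrative. */
void send_modes(void)
{
    char msg[20] = "hello";
    int count = 6, dest = 1, tag = 0;

    /* standard mode: MPI decides whether to buffer the data */
    MPI_Send(msg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);

    /* buffered mode: the application must supply the buffer space */
    int size = 4096 + MPI_BSEND_OVERHEAD;
    void *buf = malloc(size);
    MPI_Buffer_attach(buf, size);
    MPI_Bsend(msg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
    MPI_Buffer_detach(&buf, &size);  /* blocks until buffered data is sent */
    free(buf);

    /* synchronous mode: rendezvous semantics; completes only after
       the matching receive has started */
    MPI_Ssend(msg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);

    /* ready mode: correct only if the matching receive has already
       been posted at the destination */
    MPI_Rsend(msg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
}
```

The non-blocking counterparts (MPI_Isend, MPI_Ibsend, MPI_Issend, MPI_Irsend) follow the same naming pattern.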






Jack Dongarra
Tue Jan 17 21:48:11 EST 1995