The purpose of the COMMS1, or Pingpong, benchmark [22,10] is to measure the basic communication properties of a message-passing computer. A message of variable length, n, is sent from a master processor to a slave processor. The slave receives the message into a Fortran data array, and immediately returns it to the master. Half the time for this message pingpong is recorded as the time, t, to send a message of length, n. In the COMMS2 benchmark there is a message exchange in which two processors simultaneously send messages to each other and return them. In this case advantage can be taken of bidirectional links, and a greater bandwidth can be obtained than is possible with COMMS1. In both benchmarks, the time as a function of message length is fitted by least squares using the parameters (\rnhalf) [16,19] to the following linear timing model:
t = (n + \nhalf)/\rinf (2)
so that the communication rate, r = n/t, is given by
r = \frac {\rinf}{1+\nhalf/n} = \rinf \pipe (n/\nhalf) (3)
where \pipe (x) = \frac {1}{1 + 1/x} (4)
and the startup time is
t_0 = \nhalf/\rinf (5)
In the above equations, \rinf~ is the asymptotic bandwidth of communication which is approached as the message length tends to infinity (hence the subscript), and \nhalf~ is the message length required to achieve half this asymptotic rate. Hence \nhalf~ is called the half-performance message length.
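As an illustration of how the parameters can be extracted from measured timings, the following minimal sketch fits the linear timing model of equation (2) by ordinary (unweighted) least squares. It is not the benchmark's own code: the function name `fit_linear_timing`, the synthetic data and the chosen parameter values are assumptions made for this example, and the actual benchmark may weight the fit differently.

```python
def fit_linear_timing(lengths, times):
    """Unweighted least-squares fit of t = t0 + n / r_inf.

    Returns (r_inf, n_half, t0), using n_half = t0 * r_inf from equation (5).
    """
    m = len(lengths)
    mean_n = sum(lengths) / m
    mean_t = sum(times) / m
    # Slope of the best-fit straight line through the (n, t) points.
    sxx = sum((n - mean_n) ** 2 for n in lengths)
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(lengths, times))
    slope = sxy / sxx                # equals 1 / r_inf
    t0 = mean_t - slope * mean_n     # intercept: the startup time
    r_inf = 1.0 / slope
    n_half = t0 * r_inf
    return r_inf, n_half, t0


# Synthetic, noise-free timings generated from assumed parameter values.
R_INF = 2.0e6    # assumed asymptotic bandwidth, byte/s
N_HALF = 500.0   # assumed half-performance message length, byte
lengths = [8, 64, 512, 4096, 32768]
times = [(n + N_HALF) / R_INF for n in lengths]

r_inf, n_half, t0 = fit_linear_timing(lengths, times)
print(f"r_inf = {r_inf:.6g} byte/s, n_half = {n_half:.6g} byte, t0 = {t0:.6g} s")
```

With real measurements the recovered values would of course carry the scatter of the timings; the noise-free data here only show that the fit returns the parameters used to generate them.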
The importance of the parameter \nhalf is that it provides a yardstick with which to measure message length, and thereby enables one to distinguish the two regimes of short and long messages. For long messages (n > \nhalf), the denominator in equation (3) is approximately unity and the communication rate is approximately constant at its asymptotic rate, \rinf:
r \approx \rinf (6)
For short messages (n < \nhalf), the communication rate is best expressed in the algebraically equivalent form
r = \frac {\pi_0 n} {(1+ n/\nhalf)} (7)
where \pi_0 = t_0 ^{-1} = \rinf/\nhalf (8)
For short messages, the denominator in equation (7) is approximately unity, so that
r \approx \pi_0 n = n / t_0 (9)
In sharp contrast to the approximately constant rate in the long-message limit, the communication rate in the short-message limit is seen to be approximately proportional to the message length. The constant of proportionality, \pi_0, is known as the specific performance, and can be expressed conveniently in units of kilobyte per second per byte, (kB/s)/B, or `k/s'. Unfortunately, since an SI prefix such as k cannot stand alone without a unit symbol, this unit must be written either as 10^3/s or as kHz, where Hz is the special unit name for per second (s^{-1}).
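The two regimes can be checked numerically. The sketch below evaluates the rate in the forms of equations (3) and (7) and compares it with the short-message approximation (9) and the long-message approximation (6). The function names and the parameter values (\rinf = 10^6 byte/s, \nhalf = 1000 byte) are assumptions chosen only for illustration.

```python
def rate(n, r_inf, n_half):
    """Equation (3): r = r_inf / (1 + n_half / n)."""
    return r_inf / (1.0 + n_half / n)


def rate_short_form(n, r_inf, n_half):
    """Equation (7): r = pi0 * n / (1 + n / n_half), with pi0 = r_inf / n_half."""
    pi0 = r_inf / n_half
    return pi0 * n / (1.0 + n / n_half)


R_INF = 1.0e6          # assumed asymptotic bandwidth, byte/s
N_HALF = 1000.0        # assumed half-performance message length, byte
PI0 = R_INF / N_HALF   # specific performance, 1/s

for n in (10.0, 1000.0, 100000.0):
    exact = rate(n, R_INF, N_HALF)
    same = rate_short_form(n, R_INF, N_HALF)   # algebraically identical form
    short_approx = PI0 * n                     # equation (9)
    long_approx = R_INF                        # equation (6)
    print(f"n={n:>8.0f}  r={exact:.6g}  (7)={same:.6g}  "
          f"short~{short_approx:.6g}  long~{long_approx:.6g}")
```

For a message much shorter than \nhalf the short-message approximation is close to the exact rate, for one much longer the long-message approximation is, and at n = \nhalf the rate is exactly half of \rinf.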
Thus, in general, we may say that \rinf~ characterises the long-message performance and \pi_0 the short-message performance. The COMMS1 benchmark computes all four of the above parameters, (\rinf, \nhalf, t_0, and \pi_0), because each emphasises a different aspect of performance. However only two of them are independent. In the case that there are different modes of transmission for messages shorter or longer than a certain length, the benchmark can read in this breakpoint and perform a separate least-squares fit for the two regions. An example is the Intel iPSC/860 which has a different message protocol for messages shorter than and longer than 100 byte.
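A separate fit on either side of a breakpoint might be sketched as follows. This is an illustration under stated assumptions rather than the benchmark's implementation: the function names, the synthetic two-protocol timings and their parameter values are invented for the example, and the choice to assign a message of exactly the breakpoint length to the short region is arbitrary.

```python
def fit_line(points):
    """Unweighted least-squares fit of t = t0 + n / r_inf to (n, t) pairs.

    Returns (r_inf, n_half, t0).
    """
    m = len(points)
    mean_n = sum(n for n, _ in points) / m
    mean_t = sum(t for _, t in points) / m
    sxx = sum((n - mean_n) ** 2 for n, _ in points)
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in points)
    slope = sxy / sxx
    t0 = mean_t - slope * mean_n
    r_inf = 1.0 / slope
    return r_inf, t0 * r_inf, t0


def fit_two_regions(lengths, times, breakpoint):
    """Fit messages up to and beyond the breakpoint independently."""
    # Assumption: a message of exactly `breakpoint` byte counts as short.
    short = [(n, t) for n, t in zip(lengths, times) if n <= breakpoint]
    long_ = [(n, t) for n, t in zip(lengths, times) if n > breakpoint]
    return fit_line(short), fit_line(long_)


def synthetic_time(n):
    """Hypothetical machine with a different protocol above 100 byte."""
    if n <= 100:
        return (n + 80.0) / 1.0e6     # assumed r_inf = 1e6 byte/s, n_half = 80 byte
    return (n + 400.0) / 2.5e6        # assumed r_inf = 2.5e6 byte/s, n_half = 400 byte


lengths = [10, 20, 50, 100, 200, 1000, 10000, 100000]
times = [synthetic_time(n) for n in lengths]
short_fit, long_fit = fit_two_regions(lengths, times, breakpoint=100)
print("short region (r_inf, n_half, t0):", short_fit)
print("long region  (r_inf, n_half, t0):", long_fit)
```

A single fit across both regions would blend the two protocols into one pair of parameters that describes neither well, which is the motivation for reading in the breakpoint.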
Because of the finite (and often large) value of t_0, the above is a two-parameter description of communication performance. It is therefore incorrect, and sometimes positively misleading, to quote only one of the parameters (e.g. just \rinf, as is often done) to describe the performance. The most useful pairs of parameters are (\rnhalf), (\pi_0, \nhalf) and (t_0, \rinf), depending on whether one is concerned with long messages, short messages or a direct comparison with hardware times. Note also that, although \nhalf is defined as the message length required to obtain half the asymptotic rate \rinf, the two parameters \rnhalf are sufficient to calculate the communication rate for any message length via equation (3), or equivalently using \pi_0 instead of \rinf via equation (7).
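Since only two of the four parameters are independent, any of the three pairs named above determines the other two through equations (5) and (8). The sketch below makes these conversions explicit; the type and function names are hypothetical and the numerical values are chosen only for illustration.

```python
from typing import NamedTuple


class CommParams(NamedTuple):
    r_inf: float    # asymptotic bandwidth, byte/s
    n_half: float   # half-performance message length, byte
    t0: float       # startup time, s
    pi0: float      # specific performance, 1/s


def from_rinf_nhalf(r_inf, n_half):
    # t0 from equation (5), pi0 from equation (8).
    return CommParams(r_inf, n_half, n_half / r_inf, r_inf / n_half)


def from_pi0_nhalf(pi0, n_half):
    # Equation (8) rearranged: r_inf = pi0 * n_half.
    return from_rinf_nhalf(pi0 * n_half, n_half)


def from_t0_rinf(t0, r_inf):
    # Equation (5) rearranged: n_half = t0 * r_inf.
    return from_rinf_nhalf(r_inf, t0 * r_inf)


params = from_rinf_nhalf(2.0e6, 500.0)   # assumed example values
print(params)
print(from_pi0_nhalf(params.pi0, params.n_half))
print(from_t0_rinf(params.t0, params.r_inf))
```

All three constructors describe the same machine; which pair is quoted is a matter of emphasis, not of information content.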
The COMMS1 and COMMS2 benchmarks exist as part of the Genesis benchmarks [23].