Active messaging was first introduced at the University of California at Berkeley as a means to reduce the communication latency inherent in distributed computing. The idea was driven by the discrepancy in performance between traditional multiphase communication protocols and the actual access time of the network device. The concept of AM is as elegant as it is simple. When an AM arrives, the address of a user-defined handler is read out of the message's header. This handler is then invoked with a the pointer to the message as its argument. This handler is small and compact. It performs some finite amount of work, which may include sending a reply AM, and then returns. The user's program can then continue. (This process is similar to that of an interrupt-driven device driver, the critical difference being that the user's code is executed upon reception of an interrupt from the device.) The role of the user-defined handler is merely to fill or empty the data from the network's buffers. Usually this consists of a simple read() or memcpy() and the setting of a few flags that are polled during computation. Active messages are intended to be the primitives upon which all other communication operations are based. It has been shown that both the two phase protocol used in most networking stacks and the less common three phase deadlock-free protocol map quite naturally to the AM paradigm[17]. This mapping implies that AM much more closely resembles the underlying communication model than do traditional communication protocols. Distributed applications can thus be redesigned with AM and can realize significant improvement in effective bandwidth. Active messages are not the same as remote procedure calls (RPCs). RPCs are heavyweight procedures usually consisting of many system calls. RPCs are often executed in a different process from the one that services the network. Moreover, RPCs are designed to perform a significant number of computations in the user's program only upon reception of a message. A program using active messaging, on the other hand, is working all the time and is interrupted only when there is data to be exchanged. The job of the handler specified in the message is simply to extract the data from the network and integrate it into the compute loop. In this way, computation and communication can be overlapped-hiding much of the latency encountered when using today's networks and their associated access methods.
In this paper we explore the feasibility of developing a PVM-AM layer. Any addition to the PVM suite must be portable to a wide range of platforms. Unfortunately, the portability of a piece of software is often inversely proportional to its performance in driving the hardware efficiently. Here, we equate good performance with achieving a high percentage of the hardware's theoretical bandwidth. For an effective implementation of active messaging in PVM, then, a tradeoff must be made: minimizing network access latency while maintaining a significant level of portability.