The advent of the Mark III machine [Peterson:85a] spurred rapid development of applications software. Over the previous five years, the crystalline system had proved to be a powerful tool for extracting maximum performance from the machines, but the new Mark III encouraged us to look at some of the ``programmability'' issues, which had previously been of secondary importance.
The first and most natural development was the generalization of the CrOS system for the new hardware [Johnson:86c], [Kolawa:86d]. Christened ``CrOS III,'' it gave us the flexibility of arbitrary message lengths (rather than multiples of the FIFO size) and hardware-supported collective communication: the Mark III supported simultaneous communication down multiple channels, which allowed fast cube and subcube broadcast [Fox:88a]. All of these enhancements, however, maintained the original concept of nearest-neighbor (in a hypercube) communication supported by collective communication routines that operated throughout (or on a subset of) the machine. In retrospect, the hypercube-specific nature of CrOS should not have been preserved in the major redesign that produced CrOS III.
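To make the idea of nearest-neighbor communication and cube or subcube broadcast concrete, the following is a minimal Python sketch of the textbook dimension-by-dimension broadcast on a hypercube. It is an illustration only and is not taken from CrOS III: the function names `hypercube_neighbors` and `broadcast_steps` are hypothetical, nodes are assumed to be labeled by integers whose bits give their hypercube coordinates, and the sketch models only who sends to whom, not the Mark III's simultaneous multi-channel hardware.

```python
def hypercube_neighbors(node, dim):
    """Return the nodes whose labels differ from `node` in exactly one bit.

    In a `dim`-dimensional hypercube these are the nearest neighbors,
    one per channel (bit position).
    """
    return [node ^ (1 << k) for k in range(dim)]


def broadcast_steps(root, dims):
    """Plan a broadcast from `root` over the subcube spanned by `dims`.

    `dims` lists the bit positions (channels) that define the subcube;
    passing range(d) covers the whole d-dimensional cube. The result is
    a list of steps, each a list of (sender, receiver) pairs. Every pair
    is a nearest-neighbor transfer, and the set of nodes holding the
    data doubles after each step.
    """
    have = [root]          # nodes that already hold the message
    steps = []
    for k in dims:
        # Every node that has the data forwards it across channel k.
        step = [(n, n ^ (1 << k)) for n in have]
        steps.append(step)
        have += [dst for _, dst in step]
    return steps


if __name__ == "__main__":
    print(hypercube_neighbors(5, 3))   # [4, 7, 1]
    for i, step in enumerate(broadcast_steps(0, range(3))):
        print("step", i, step)
```

Under these assumptions, a full cube of 2^d nodes is covered in d steps, and a subcube broadcast simply restricts the list of channels used.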