5.2.8 A Ray Tracer and an ``Operating System''

While the radical change that led to Cubix was happening, the non-crystalline  users were still developing alternative communication strategies. As mentioned earlier, rdsort never became popular due to the burden of bookkeeping that was placed on the user and the unfamiliarity of the interrupt concept.

The ``9 routines'' [Johnson:86a] attempted to alleviate the most burdensome issues by removing the interrupt nature of the system and performing buffering and queueing internally. The resultant system was a simple generalization of the wtELT concept, which replaced the ``channel'' argument with a processor number. As a result, messages could be sent to non-neighboring nodes. An additional level of sophistication was provided by associating a ``type'' with each message. The recipient of the messages could then sort incoming data into functional categories based on this type.
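
To make this concrete, the sketch below shows, in C, the flavor of programming the ``9 routines'' made possible. The routine names, argument order, and type constants are illustrative assumptions, not the actual interface of [Johnson:86a]; the point is that the destination is a processor number rather than a channel, and that a ``type'' lets the recipient sort incoming traffic.

    /* Interface sketch of a "9 routines"-style message system.
     * send_msg/recv_msg and the TYPE_* constants are invented
     * for illustration; [Johnson:86a] gives the real interface. */
    #include <stddef.h>

    #define TYPE_QUERY 1   /* hypothetical message types used to   */
    #define TYPE_REPLY 2   /* sort incoming data into categories   */

    /* Send len bytes to an arbitrary processor, tagged with a type.
     * Buffering, queueing, and routing through intermediate nodes
     * happen inside the library rather than in user code. */
    int send_msg(int dest_proc, int type, const void *buf, size_t len);

    /* Receive the next pending message of the given type. */
    int recv_msg(int type, void *buf, size_t maxlen);

    void example(int server_node)
    {
        double query[4] = { 0.0, 1.0, 2.0, 3.0 };
        double reply[4];

        /* Unlike wtELT, the first argument is a processor number,
         * not a hypercube channel, so non-neighbors can be
         * addressed directly. */
        send_msg(server_node, TYPE_QUERY, query, sizeof query);
        recv_msg(TYPE_REPLY, reply, sizeof reply);
    }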

The benefit to the user was a simplified programming model. The only remaining problem was how to handle this newfound freedom of sending messages to arbitrary destinations.

We had originally planned to build a ray tracer from the tools developed while studying melting. There is, however, a fundamental difference between the melting process and the distributed database searching that goes on in a ray tracer. Ray tracing is relatively simple if the whole model can be stored in each processor, but we were considering the case of a geometric database too large to fit in a single node's memory.

Melting posed problems because the exact nature of the interaction was known only statistically: we might need to communicate with all processors up to two hops away from our node, or three hops, or some indeterminate number. Other than this, however, the problem was quite homogeneous, and every node could perform the same tasks as the others.

The database search is inherently non-deterministic and badly load-balanced because it is impossible to map objects into the nodes where they will be used. As a result, database queries need to be directed through a tree structure until they find the necessary information and return it to the calling node.

A suitable methodology for this kind of exercise seemed to be a real multitasking system in which ``processes'' could be created and destroyed dynamically on the nodes, mapping naturally onto the complex database search patterns. We decided to create an operating system.

The crystalline system had been described, at least in the written word, as an operating system. Writing a real operating system, with file systems, terminals, multitasking, and so on, was clearly impossible while communication was restricted to single hops across hypercube channels. The new system, however, promised much more. The result was the ``Multitasking, Object-Oriented, Operating System'' (MOOOS, commonly known as MOOSE) [Salmon:86a]. The follow-up MOOS II is described in Section 15.2.

The basic idea was to support multitasking: running more than one process per node, with remote task creation, scheduling, semaphores, and signals, so as to include everything that a real operating system would have. The implementation proved quite troublesome and strained the capabilities of our compilers and hardware beyond their breaking points, but these difficulties were nothing compared with the problems encountered by the users.

The basic programming model was one of simple, ``lightweight'' processes communicating through message box/pipe constructs. The overall structure was vaguely reminiscent of the standard UNIX multiprocessing system: fork/exec and pipes (Figure 5.4). Unfortunately, this was by now completely alien to our programmers, who were all more familiar with the crystalline methods previously employed. In particular, problems were encountered with naming. While a process that created a child would automatically know its child's ``ID,'' it was much more difficult for siblings to identify each other, and hence, to communicate. As a natural result, it was reasonably easy to build tree structures but difficult to perform domain decompositions. Despite these problems, useful programs were developed, including the parallel ray tracer with a distributed database that had originally motivated the design [Goldsmith:86a;87a].

Figure 5.4: A ``MOOSE'' Process Tree
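
The sketch below, again in C and again with invented names (moose_spawn, moose_pipe, and so on, rather than the real interface of [Salmon:86a]), illustrates both the fork/exec-and-pipes flavor of the model in Figure 5.4 and the naming problem: the parent receives each child's ID at creation, but the siblings never learn each other's.

    /* Conjectural sketch of the MOOSE programming model. All names
     * and semantics here are assumptions made for illustration. */
    typedef int task_t;   /* ID returned to the parent at creation */
    typedef int pipe_t;   /* message box / pipe endpoint           */

    /* Create a lightweight process on a (possibly remote) node,
     * in the spirit of fork/exec; the entry point is handed a
     * pipe back to its parent. */
    task_t moose_spawn(int node, void (*entry)(pipe_t));
    pipe_t moose_pipe(task_t peer);
    int    moose_write(pipe_t p, const void *buf, int len);
    int    moose_read(pipe_t p, void *buf, int maxlen);

    static void child(pipe_t to_parent)
    {
        int result = 42;
        moose_write(to_parent, &result, sizeof result);
    }

    void build_tree(int left_node, int right_node)
    {
        /* The parent learns each child's ID when it creates it... */
        task_t l = moose_spawn(left_node,  child);
        task_t r = moose_spawn(right_node, child);
        pipe_t pl = moose_pipe(l), pr = moose_pipe(r);

        int a, b;
        moose_read(pl, &a, sizeof a);
        moose_read(pr, &b, sizeof b);
        /* ...but the two children hold no handle on each other,
         * which is why tree structures were easy and sibling-to-
         * sibling domain decompositions were hard. */
    }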

An important problem was that the hardware offered no memory protection between the lightweight processes running on a node. One had to guess how much memory to allocate to each process, and debugging became complicated when the guess was wrong. Later, the Intel iPSC provided hardware memory protection, which made life simpler ([Koller:88b] and Section 15.2).

In using MOOSE, we wanted to explore dynamic load-balancing issues. A problem with standard domain decompositions is that irregularities in the workloads assigned to processors lead to inefficiencies, since the entire simulation, proceeding in lockstep, executes at the speed of the slowest node. The Crystal Router, developed at the same time as MOOSE, offered a simpler strategy.


