Machine type | RISC-based distributed-memory multi-processor |
---|---|
Models | ADENART64, ADENART256 |
Operating system | Internal OS transparent to the user, SunOS (Sun's Unix variant) on the front-end system |
Connection structure | HX-net (see remarks) |
Compilers | ADETRAN, an extended Fortran 77 |
System parameters:
Model | ADENART64 | ADENART256 |
---|---|---|
Clock cycle | 50 ns | 50 ns |
Theor. peak performance | | |
Per Proc. (64 bits) | 10 Mflop/s | 10 Mflop/s |
Maximal (64 bits) | 0.64 Gflop/s | 2.56 Gflop/s |
Main memory | 0.5 GB | 0.5 GB |
Memory/node | 8 MB | 2 MB |
Communication bandwidth | 20 MB/s | 20 MB/s |
No. of processors | 64 | 256 |
Remarks:
The ADENART has an interesting interconnection structure that lies roughly halfway between a crossbar and a grid. The processors are organised in planes, and within each plane all processors are connected by a crossbar. Between planes, each crossbar node is connected directly to its corresponding counterpart in all other planes. So, for a processor (i,j) in a given plane, data required by processor (k,j) in the same plane can be transported by simply shifting them through the in-plane crossbar, which takes one step. For processors in different planes the number of steps is at most two: the data are first routed to the right crossbar node within one plane and, after being sent to the plane where the target processor resides, are delivered from the corresponding crossbar node to the processor that requires them. Matsushita calls this connection structure HX-net. Because of the connection structure the number of processors is constrained to be of the form $2^{2n}$; in the two models presently available n is 3 or 4 (a machine with 1024 processors, n=5, is being considered). As remarked, the complexity of the network is lower than that of a crossbar, while the efficiency is half that of a crossbar: a maximum of 2 steps instead of 1.
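The step counts described above can be illustrated with a small model. This is only a sketch under stated assumptions, not Matsushita's routing algorithm: it assumes 2^n planes of 2^n processors each (which reproduces the quoted sizes of 64, 256 and 1024 processors for n = 3, 4, 5), addresses a processor by a hypothetical pair `(plane, index)`, and counts one step for an in-plane crossbar transfer and one step for a plane-to-plane transfer between corresponding crossbar nodes. The function names are illustrative.

```python
from itertools import product


def processor_count(n: int) -> int:
    """Number of processors, assuming 2**n planes of 2**n processors each."""
    return 2 ** (2 * n)


def hx_steps(src: tuple, dst: tuple) -> int:
    """Steps needed in the simplified model.

    src and dst are (plane, index) pairs (an assumed addressing scheme).
    One step is counted if the planes differ (plane-to-plane link) and
    one step if the in-plane positions differ (in-plane crossbar).
    """
    src_plane, src_index = src
    dst_plane, dst_index = dst
    return int(src_plane != dst_plane) + int(src_index != dst_index)


def max_steps(n: int) -> int:
    """Worst-case step count over all processor pairs (brute force)."""
    side = 2 ** n
    nodes = list(product(range(side), repeat=2))
    return max(hx_steps(a, b) for a in nodes for b in nodes)


if __name__ == "__main__":
    for n in (3, 4):
        print(n, processor_count(n), max_steps(n))
```

In this model the worst case over all pairs never exceeds two steps, in line with the "at most two" figure given in the text; a full crossbar would need one step for every pair.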
Each node is built around a proprietary RISC processor with a peak speed of 20 Mflop/s in perfect pipeline mode. However, Matsushita quotes a "sustained speed" of 10 Mflop/s, and it is this figure that is used to arrive at the peak performance given in the system parameters list above. The inter-processor bandwidth is 20 MB/s, which is quite reasonable with respect to the processor speed. At this moment, however, nothing is known about the message setup overhead. Curiously enough, the amount of memory per node is 4 times larger for the ADENART64 than for the 256-processor model (8 MB against 2 MB per node). The latter memory size seems fairly small for a processor node that is meant to process large amounts of data. The front-end machine that hosts the ADENART is a Solbourne (Sun 4 compatible) workstation.
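The figures in the system parameters list follow from simple arithmetic, which the sketch below reproduces. All inputs are taken from the text; the constant and function names are illustrative, and the flops-per-word ratio assumes 64-bit (8-byte) words and ignores the unknown message setup overhead.

```python
CLOCK_NS = 50                # clock cycle from the parameter list
QUOTED_MFLOPS = 10           # "sustained speed" used for the peak figures
PIPELINE_PEAK_MFLOPS = 20    # peak in perfect pipeline mode
LINK_MB_PER_S = 20           # inter-processor bandwidth

# model name -> (number of processors, memory per node in MB)
MODELS = {"ADENART64": (64, 8), "ADENART256": (256, 2)}


def clock_mhz(cycle_ns: float) -> float:
    """Clock frequency in MHz for a cycle time given in nanoseconds."""
    return 1000 / cycle_ns


def machine_peak_mflops(procs: int) -> int:
    """Quoted machine peak: processors times the quoted per-processor speed."""
    return procs * QUOTED_MFLOPS


def total_memory_mb(procs: int, mb_per_node: int) -> int:
    """Aggregate main memory in MB."""
    return procs * mb_per_node


def flops_per_word_transferred(word_bytes: int = 8) -> float:
    """Flops a processor can do in the time one word crosses a link."""
    mwords_per_s = LINK_MB_PER_S / word_bytes
    return QUOTED_MFLOPS / mwords_per_s


if __name__ == "__main__":
    print("clock:", clock_mhz(CLOCK_NS), "MHz")
    for name, (procs, mb) in MODELS.items():
        print(name, machine_peak_mflops(procs) / 1000, "Gflop/s,",
              total_memory_mb(procs, mb), "MB total")
    print("flops per 64-bit word:", flops_per_word_transferred())
```

A 50 ns cycle corresponds to 20 MHz, so the 20 Mflop/s pipeline peak is one floating-point result per cycle and the quoted 10 Mflop/s is half of that. Both models end up with the same 512 MB (0.5 GB) of total memory, which is why the per-node memory differs by a factor of 4.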
Measured Performances: In [#kadota#
Jack Dongarra
Sat Feb 10 15:12:38 EST 1996