Machine type | RISC-based distributed-memory multi-processor |
---|---|
Models | ADENART64, ADENART256 |
Operating system | Internal OS transparent to the user, SunOS (Sun's Unix variant) on the front-end system |
Connection structure | HX-net (see remarks) |
Compilers | ADETRAN, an extended Fortran 77 |
System parameters:
Model | ADENART64 | ADENART256 |
---|---|---|
Clock cycle | 50 ns | 50 ns |
Theor. peak performance | | |
Per Proc. (64 bits) | 10 Mflop/s | 10 Mflop/s |
Maximal (64 bits) | 0.64 Gflop/s | 2.56 Gflop/s |
Main memory | 0.5 GB | 0.5 GB |
Memory/node | 8 MB | 2 MB |
Communication bandwidth | 20 MB/s | 20 MB/s |
No. of processors | 64 | 256 |
Remarks:
The ADENART has an interesting interconnection structure that is somewhere
halfway between a crossbar and a grid. The processors are organised in planes,
where for each plane all processors are connected by a crossbar. Between planes
there is a connection structure that connects each crossbar node in a plane
directly with its corresponding counterpart on all other planes. So, for a
processor (i,j) in a given plane, data that are required by processor (k,j) in the
same plane can be transported by simply shifting them through the in-plane
crossbar, which can be accomplished in one step. For processors in different
planes the number of steps is at most two. In the first step the data are
routed to the right crossbar node in one plane; in the second they are sent to the
plane where the target processor resides and passed on there from the corresponding
crossbar node to the processor that requires them. The connection structure is
called HX-net by Matsushita. Because of the connection structure the number of
processors is constrained to be of the form $2^{2n}$, and in the two
models presently available n is 3 or 4 (a machine with 1024 processors, n=5,
is being considered). As remarked, the complexity of the network is lower than
that of a crossbar: $O(2^{3n})$ instead of $O(2^{4n})$, while the efficiency is
half of that of a crossbar: a maximum of 2 steps instead of 1.
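
The routing rule described above can be illustrated with a small sketch. It is a simplified model, not Matsushita's implementation: it assumes 2^n planes of 2^n processors each (which matches the 64- and 256-processor models for n = 3 and 4), labels a processor as a hypothetical `(plane, slot)` pair, and counts only the in-plane crossbars when estimating network complexity. The names `hx_route` and `crosspoints` are invented for this illustration.

```python
def hx_route(src, dst):
    """Return the hops taken from src to dst; each hop is a (plane, slot) pair.

    Assumed model: an in-plane crossbar reaches any slot of the same plane in
    one step, and an inter-plane link joins equal slots of different planes.
    """
    (src_plane, src_slot), (dst_plane, dst_slot) = src, dst
    hops = []
    if src_slot != dst_slot:
        # Step via the in-plane crossbar to the slot matching the target.
        hops.append((src_plane, dst_slot))
    if src_plane != dst_plane:
        # Step via the inter-plane link to the same slot in the target plane.
        hops.append((dst_plane, dst_slot))
    return hops


def crosspoints(n):
    """Crosspoint counts (in-plane crossbars only, full crossbar) for 2**(2*n) processors."""
    side = 2 ** n                 # assumed: planes and processors per plane
    in_plane = side * side ** 2   # 2**n crossbars of 2**n x 2**n crosspoints
    full = (side * side) ** 2     # one crossbar over all 2**(2*n) processors
    return in_plane, full


if __name__ == "__main__":
    print(hx_route((0, 1), (0, 5)))   # same plane
    print(hx_route((0, 1), (3, 5)))   # different planes
    for n in (3, 4, 5):
        print(n, crosspoints(n))
```

Under these assumptions no route needs more than two hops, and the crosspoint count grows as 2^(3n) rather than 2^(4n).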
Each node contains a proprietary RISC processor with a peak speed of 20 Mflop/s in perfect pipeline mode. However, Matsushita quotes a "sustained speed" of 10 Mflop/s, and it is this figure that is used to arrive at the peak performance given in the system parameters list above. The inter-processor bandwidth is 20 MB/s, which is quite reasonable with respect to the processor speed. At this moment, however, nothing is known about the message setup overhead. Curiously enough, the amount of memory per node is 4 times larger for the ADENART64 than for the 256-processor model (8 MB against 2 MB per node). The latter memory size seems fairly small for a processor node that is meant to process large amounts of data. The front-end machine that hosts the ADENART is a Solbourne (Sun 4 compatible) workstation.
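
As a quick consistency check, the figures in the system parameters list can be recomputed from the per-node values. The sketch below is only arithmetic on the quoted numbers. The dictionary layout and the function name `summary` are made up for the illustration, and memory is assumed to be counted in binary units (1 GB = 1024 MB).

```python
# Per-node figures as quoted in the system parameters list.
CLOCK_NS = 50           # clock cycle in nanoseconds
SUSTAINED_MFLOPS = 10   # "sustained" 64-bit speed per processor
LINK_MB_S = 20          # inter-processor bandwidth

MODELS = {
    "ADENART64": {"procs": 64, "mem_per_node_mb": 8},
    "ADENART256": {"procs": 256, "mem_per_node_mb": 2},
}


def summary(name):
    """Derive the aggregate figures for one model from the per-node values."""
    model = MODELS[name]
    return {
        "clock_mhz": 1000 / CLOCK_NS,
        "peak_gflops": model["procs"] * SUSTAINED_MFLOPS / 1000,
        # Assumption: binary units, 1 GB = 1024 MB.
        "main_memory_gb": model["procs"] * model["mem_per_node_mb"] / 1024,
        # Bytes of communication bandwidth available per sustained flop.
        "bytes_per_flop": LINK_MB_S / SUSTAINED_MFLOPS,
    }


if __name__ == "__main__":
    for name in MODELS:
        print(name, summary(name))
```

Both models come out at 0.5 GB of total memory, which is why the memory per node drops from 8 MB to 2 MB when the processor count is quadrupled.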
Measured Performances: In [#kadota#