Next: The C-DAC PARAM 9000/SS. Up: Distributed-memory MIMD systems Previous: The Alex AVX 2.

The Avalon A12.

Machine type RISC-based distributed-memory multi-processor
Models Avalon A12
Operating system AVALON microkernel-based Unix (image compatible with Digital Unix)
Connection structure Multistage variable (see remarks)
Compilers Fortran 77, Fortran 90, HPF, ANSI C

System parameters:

Model                          A12
Clock cycle                    3.3 ns
Theor. peak performance
  Per proc. (64-bit)           600 Mflop/s
  Maximal (64-bit)             --
Memory/node                    --
Memory (maximal)               --
Communication bandwidth
  Point-to-point               128-400 MB/s
  Bisectional (full system)    --
No. of processors              --

Remarks:

The Avalon technical documentation does not give complete information on system configurations, so the list of system parameters above is somewhat incomplete. The A12 will be based on the DEC Alpha 21164 RISC processor, which has a clock cycle of 3.3 ns. Because the Alpha 21164 has dual floating-point arithmetic pipes, it will deliver a theoretical peak performance of 600 Mflop/s. The total performance of the system, however, cannot be specified because the maximum number of processors is not given. In addition to the usual first and second level caches that reside on chip, a 1 MB third level cache is provided on each A12 CPU card. The bandwidth to/from the first level cache is sufficient to transport two operands to the CPU and to ship one result back in one cycle. The second level cache has two-thirds of this bandwidth, while the third level cache can provide one 64-bit word every two cycles. The bandwidth to/from memory is 400 MB/s, or one 64-bit word every 6 cycles. The memory has two-way interleaved banks, but the size of the memory is not specified in the documentation.
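The peak performance and bandwidth figures quoted above follow from the clock cycle by simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, 1 MB is assumed to mean 10^6 bytes, and the 3.3 ns cycle time is taken at face value. With these assumptions the peak comes out at about 606 Mflop/s and the memory bandwidth at about 404 MB/s, which the text rounds to 600 Mflop/s and 400 MB/s.

```python
# Figures taken from the text; helper names are illustrative, not Avalon's.
CLOCK_CYCLE_NS = 3.3   # Alpha 21164 clock cycle
FP_PIPES = 2           # dual floating-point pipes, one result each per cycle
WORD_BYTES = 8         # one 64-bit word


def peak_mflops(cycle_ns=CLOCK_CYCLE_NS, flops_per_cycle=FP_PIPES):
    """Theoretical peak in Mflop/s: results per cycle times cycles per second."""
    cycles_per_second = 1e9 / cycle_ns
    return flops_per_cycle * cycles_per_second / 1e6


def bandwidth_mb_per_s(words, cycles, cycle_ns=CLOCK_CYCLE_NS):
    """Bandwidth in MB/s (assuming 1 MB = 10**6 bytes) when `words`
    64-bit words are moved every `cycles` clock cycles."""
    seconds = cycles * cycle_ns * 1e-9
    return words * WORD_BYTES / seconds / 1e6


if __name__ == "__main__":
    print(f"peak per processor : {peak_mflops():8.1f} Mflop/s")
    # First level cache: two operands in and one result out per cycle.
    print(f"first level cache  : {bandwidth_mb_per_s(3, 1):8.1f} MB/s")
    # Second level cache: two-thirds of the first level bandwidth.
    print(f"second level cache : {bandwidth_mb_per_s(2, 1):8.1f} MB/s")
    # Third level cache: one word every two cycles.
    print(f"third level cache  : {bandwidth_mb_per_s(1, 2):8.1f} MB/s")
    # Memory: one word every six cycles.
    print(f"memory             : {bandwidth_mb_per_s(1, 6):8.1f} MB/s")
```

The cache figures computed this way are derived from the words-per-cycle description only; the documentation itself quotes no MB/s values for the caches.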

Each CPU card contains an Alpha 21164 processor, the third level (or B) cache, and the local memory for that node. Twelve CPU cards can be housed in one crate, which has a full crossbar backplane. This yields an internode bandwidth of slightly under 400 MB/s between the cards within one crate. Apart from the 12 slots for CPU cards, there are two extra dual-channel slots that can accommodate communication cards that provide the connections to other crates. CMOS technology is used for the in-crate crossbar, whereas ECL logic is employed for the intercrate connections. The actual connections between crates are made by coaxial cables. This way of connecting provides great flexibility in the overall interconnection topology: one could build trees, toruses, or a secondary level crossbar (in the last case one crate should be filled entirely with communication cards to build a 144-processor system). The communication speed between crates is lower, but still respectable: 128 MB/s.
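The two-level structure described above (a crossbar inside each crate, slower links between crates) can be summarized in a few lines. This is a minimal sketch under stated assumptions: the consecutive numbering of CPUs across crates is hypothetical, the in-crate figure of 400 MB/s is an upper bound (the text says "slightly under"), and the function names are made up for illustration.

```python
CPUS_PER_CRATE = 12      # CPU card slots per crate (from the text)
IN_CRATE_MB_S = 400      # upper bound; the text says "slightly under 400 MB/s"
INTER_CRATE_MB_S = 128   # speed of the intercrate connections


def crate_of(cpu):
    """Crate index of a CPU, assuming CPUs are numbered consecutively
    crate by crate (a hypothetical numbering scheme)."""
    return cpu // CPUS_PER_CRATE


def link_bandwidth_mb_s(cpu_a, cpu_b):
    """Nominal point-to-point bandwidth between two distinct CPUs."""
    if cpu_a == cpu_b:
        raise ValueError("expected two different CPUs")
    if crate_of(cpu_a) == crate_of(cpu_b):
        return IN_CRATE_MB_S
    return INTER_CRATE_MB_S


def total_processors(cpu_crates):
    """Number of processors in a system with `cpu_crates` full CPU crates."""
    return cpu_crates * CPUS_PER_CRATE


if __name__ == "__main__":
    print(link_bandwidth_mb_s(0, 11))   # same crate
    print(link_bandwidth_mb_s(11, 12))  # neighbouring crates
    print(total_processors(12))         # secondary-crossbar configuration
```

Twelve full CPU crates give the 144 processors mentioned for the secondary level crossbar; the additional crate filled with communication cards is not counted because it holds no CPUs.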

I/O can be configured in various ways. It is possible to put 32-bit or 64-bit PCI expansion cards on each CPU card to obtain what Avalon calls "Type 1 I/O nodes". Also, a direct switch connection to the outside world can be made via a variant of the communication card. Depending on the number of cards, the bandwidth is 400 or 800 MB/s for this Type 3 I/O node. The Type 2 I/O node is in fact a dedicated TCP/IP connection, as needed for the control workstation that the system requires.

Measured Performances: The A12 is expected to be available by the first quarter of 1996. As yet no systems can be benchmarked, so no performance figures are known at this moment.






Jack Dongarra
Sat Feb 10 15:12:38 EST 1996