Machine type | RISC-based SMP-clustered DM-MIMD system. |
---|---|
Models | AlphaServer SC45 |
Operating system | Tru64 Unix (Compaq's flavour of Unix) |
Connection structure | Fat Tree |
Compilers | Fortran 77, HPF, C, C++ |
Vendor's information Web page | h18002.www1.hp.com/alphaserver/sc/ |
Year of introduction | 2002. |

System parameters:

Model | AlphaServer SC45 |
---|---|
Clock cycle | 1.25 GHz |
Theor. peak performance | |
Per proc. (64-bit) | 2.5 Gflop/s |
Maximal (64-bit) | 10 Tflop/s |
Main memory | ≤ 8 TB |
No. of processors | ≤ 4096 |
Memory bandwidth | |
Processor—memory | 1.33 GB/s |
Between cluster nodes | 280 MB/s |

Remarks:

The AlphaServer SC is the very high end of HP's AlphaServer line (SC stands for SuperComputer). The system is typical of the current development of SMP-based clustered systems. In the SC system the basic SMP node is the Compaq ES45, a 4-CPU SMP system with the Alpha 21264a (EV68) as its processor. The clock frequency is 1.25 GHz. The SMP node has a crossbar as its internal network with an aggregate bandwidth of 5.2 GB/s (1.33 GB/s/processor). This is sufficient to deliver 1.064 byte/clock cycle to each processor in the node simultaneously.
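The per-cycle figures follow directly from the numbers in the parameter table. The short Python sketch below redoes that arithmetic as a cross-check; the variable names are illustrative, and the number of floating-point results per clock cycle is derived from the table values rather than stated in the text.

```python
# Figures taken from the system parameters table and the remarks above.
clock_ghz = 1.25             # clock frequency, GHz
peak_per_proc_gflops = 2.5   # theoretical peak per processor, Gflop/s
max_procs = 4096             # maximum number of processors
mem_bw_per_proc_gbs = 1.33   # processor-memory bandwidth per processor, GB/s

# Floating-point results per clock cycle implied by peak rate and clock.
flops_per_cycle = peak_per_proc_gflops / clock_ghz

# Peak of a maximal configuration, converted from Gflop/s to Tflop/s.
system_peak_tflops = peak_per_proc_gflops * max_procs / 1000.0

# Bytes that memory can deliver to one processor per clock cycle.
bytes_per_cycle = mem_bw_per_proc_gbs / clock_ghz

print(f"flops per clock cycle: {flops_per_cycle:.1f}")
print(f"system peak:           {system_peak_tflops:.2f} Tflop/s")
print(f"bytes per clock cycle: {bytes_per_cycle:.3f}")
```

Under these figures a maximal system peaks at 10.24 Tflop/s, which the table rounds to 10 Tflop/s.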
Within a node the system is a shared-memory machine that allows for shared-memory parallel processing, for instance by using OpenMP. When more than four processors are required, one has to use a distributed-memory programming model: message passing with MPI or PVM, or data-parallel programming with HPF (HP/Compaq is one of the few companies that still provides its own HPF compiler).
For communication between the SMP nodes the SC uses QsNet, a network manufactured by QSW Limited. QsNet is in fact the successor to the network employed in the former Meiko CS-2 systems (see QsNet). The network has the structure of a fat tree, is based on PCI technology, and has a point-to-point bandwidth of 280 MB/s. Because of its fat tree structure the bandwidth in the upper level of the network is 340 MB/s sustained. According to the documentation the peak bandwidth is “500 MB/s per server”, without further specification, which looks impressive but is not very informative. QSW claims a very low latency of 5 µs for MPI messages.
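To get a feeling for what the latency and bandwidth figures imply together, a common first-order estimate of point-to-point transfer time is latency plus message size divided by bandwidth. The Python sketch below applies that simple model, which is an assumption made here and not something taken from the QSW documentation, to the 5 µs latency and 280 MB/s bandwidth quoted above. The function names are hypothetical, and 1 MB is assumed to be 10^6 bytes.

```python
LATENCY_US = 5.0        # MPI latency claimed by QSW, in microseconds
BANDWIDTH_MB_S = 280.0  # point-to-point bandwidth, MB/s (1 MB = 1e6 bytes assumed)


def transfer_time_us(message_bytes: int) -> float:
    """Estimated time in microseconds for one message: latency + size / bandwidth."""
    # With 1 MB = 1e6 bytes, 1 MB/s equals 1 byte per microsecond.
    return LATENCY_US + message_bytes / BANDWIDTH_MB_S


def half_performance_length_bytes() -> float:
    """Message size at which half of the asymptotic bandwidth is reached in this model."""
    return LATENCY_US * BANDWIDTH_MB_S


for size in (8, 1024, 65536, 1048576):
    t = transfer_time_us(size)
    # bytes per microsecond is numerically the same as MB/s
    print(f"{size:>8} bytes: {t:9.2f} us, effective {size / t:7.1f} MB/s")

print(f"half-performance length: {half_performance_length_bytes():.0f} bytes")
```

In this model messages shorter than about 1400 bytes are dominated by latency, so the quoted bandwidth is only approached for much larger messages.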
Measured performance: In [42] a performance of 13.88 Tflop/s was reported for a 2-way cluster of fully configured AlphaServer SC45s (8192 processors) solving a full linear system of order 633,000, with an efficiency of 68.0%.
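The efficiency relates the measured speed to the theoretical peak of the configuration. As a rough check, the Python sketch below assumes that every one of the 8192 processors has the 2.5 Gflop/s peak given in the parameter table; the variable names are illustrative.

```python
measured_tflops = 13.88      # reported performance, Tflop/s
processors = 8192            # processors in the 2-way cluster
peak_per_proc_gflops = 2.5   # per-processor peak from the table (assumed for all processors)

# Theoretical peak of the whole configuration, in Tflop/s.
peak_tflops = processors * peak_per_proc_gflops / 1000.0

# Efficiency is measured performance as a fraction of theoretical peak.
efficiency = measured_tflops / peak_tflops

print(f"theoretical peak: {peak_tflops:.2f} Tflop/s")
print(f"efficiency:       {efficiency:.1%}")
```

With these inputs the ratio comes out close to the reported figure of about 68%.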