
The IBM eServer p690

Machine type                      RISC-based distributed-memory multi-processor cluster
Models                            IBM eServer p690
Operating system                  AIX (IBM's Unix variant), Linux
Connection structure              Ω-switch
Compilers                         XL Fortran (Fortran 90), (HPF), XL C, C++
Vendor's information Web page     www-1.ibm.com/servers/eserver/pseries/hardware/highend/p690.html
Year of introduction              2002 (16/32-CPU POWER4+ SMP)

System parameters:

Model                             eServer p690
Clock cycle                       1.9 GHz
Theor. peak performance
  Per proc. (64-bit)              7.6 Gflop/s
  Per 16-proc. HPC node           121.6 Gflop/s
  Per 32-proc. Turbo node         243.2 Gflop/s
  Maximal                         124.5 Tflop/s
Main memory
  Memory/node                     ≤ 1 TB
  Memory/maximal                  512 TB
No. of processors                 8–16,384
Communication bandwidth
  Node-to-node (bidirectional)    2 GB/s
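
The peak figures in the table above follow from simple arithmetic, which the sketch below reproduces. The figure of 4 floating-point operations per cycle is an inference from the table (7.6 Gflop/s divided by 1.9 GHz), not a number the table states; the function and constant names are illustrative only.

```python
# Figures from the system-parameter table; FLOPS_PER_CYCLE is an inference
# (7.6 Gflop/s divided by 1.9 GHz), not a value the table states.
CLOCK_GHZ = 1.9
FLOPS_PER_CYCLE = 4
MAX_PROCESSORS = 16384


def peak_gflops(processors, clock_ghz=CLOCK_GHZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in Gflop/s: processors x clock (GHz) x flops per cycle."""
    return processors * clock_ghz * flops_per_cycle


if __name__ == "__main__":
    for label, n in [("per processor", 1),
                     ("16-proc. HPC node", 16),
                     ("32-proc. Turbo node", 32)]:
        print(f"{label:<20} {peak_gflops(n):8.1f} Gflop/s")
    # The table quotes the maximal configuration in Tflop/s.
    print(f"{'maximal':<20} {peak_gflops(MAX_PROCESSORS) / 1000:8.1f} Tflop/s")
```

Passing a different `clock_ghz` gives the peak for other processor variants under the same assumption about flops per cycle.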

Remarks:

The eServer p690 is the successor of the RS/6000 SP. It retains much of the macro structure of that system: multi-CPU nodes are connected within a frame either by a dedicated switch or by other means, such as switched Ethernet. The structure of the nodes, however, has changed considerably; see the description of the POWER4+ processor.
The so-called Federation switch is the fourth generation of the high-performance interconnects made for the p690 series. Like its predecessors, it is an Ω-switch as described in the section on SM-MIMD systems. It has a bi-directional link speed of 2 GB/s and an MPI latency of 5–7 µs. Although only the highest-speed communication option, the high-performance switch, is listed here, a wide range of other options can be chosen instead, e.g., Gbit Ethernet.
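
To give a feeling for what the quoted latency and link speed mean for message passing, the following sketch applies the common first-order model in which the transfer time is the latency plus the message size divided by the bandwidth. This model is an assumption, not something measured here, and it treats the quoted 2 GB/s as the bandwidth available to a single message, which is an optimistic upper bound because that figure is bi-directional. All names are illustrative.

```python
# First-order point-to-point model: T(n) = latency + n / bandwidth.
# Assumption: the quoted 2 GB/s is used as an upper bound for one message.
BANDWIDTH_BYTES_PER_S = 2e9
LATENCIES_S = (5e-6, 7e-6)  # quoted MPI latency range


def transfer_time(nbytes, latency_s, bandwidth=BANDWIDTH_BYTES_PER_S):
    """Estimated time in seconds to deliver a message of nbytes."""
    return latency_s + nbytes / bandwidth


def half_performance_length(latency_s, bandwidth=BANDWIDTH_BYTES_PER_S):
    """Message size in bytes at which half of the bandwidth is attained."""
    return latency_s * bandwidth


if __name__ == "__main__":
    for lat in LATENCIES_S:
        n_half = half_performance_length(lat)
        print(f"latency {lat * 1e6:.0f} us: n_1/2 = {n_half / 1000:.0f} kB")
        for nbytes in (1_000, 100_000, 10_000_000):
            t = transfer_time(nbytes, lat)
            print(f"  {nbytes:>10} bytes: {t * 1e6:10.1f} us")
```

Under these assumptions, messages much smaller than the half-performance length are dominated by latency, while much larger ones approach the link speed.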
Applications can be run using PVM or MPI. IBM used to support High Performance Fortran, both a proprietary version and a compiler from the Portland Group. It is not clear whether this is still the case. IBM uses its own PVM version from which the data format converter XDR has been stripped. This results in a lower overhead at the cost of generality. The MPI implementation, MPI-F, is likewise optimised for p690-based systems. As the nodes are in effect shared-memory SMP systems, OpenMP can be employed within a node for shared-memory parallelism, and it can be freely mixed with MPI if needed. In addition to its own AIX OS, IBM also supports some Linux distributions: the professional versions of both Red Hat and SuSE Linux are available for the p690 series.

The standard commercial models that are marketed contain up to 128 nodes. However, systems with up to 512 nodes can be built on special request. This largest configuration is used in the table above (although no system larger than 128 nodes has been sold yet). A POWER5-based p690 system might come onto the market soon, but no definite plans in this direction are known.

Measured Performances:
In [42] a performance of 6188 Gflop/s is reported for a 1600-processor system with the slightly slower 1.7 GHz variant of the processor, solving a dense linear system of order N = 355,000; this amounts to an efficiency of 57%. A system with 8 Turbo nodes was reported to attain 737 Gflop/s out of a peak of 1331 Gflop/s on a linear system of order 285,000, an efficiency of 55%. As this type of application operates primarily from the L1 cache, the roughly similar efficiencies are as expected.
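
The efficiencies quoted above are the ratio of measured speed to theoretical peak. The sketch below reproduces them. For the 1600-processor system the peak is not given in the text, so it is computed assuming 4 floating-point operations per cycle (inferred from 7.6 Gflop/s at 1.9 GHz); for the 8-node system the peak of 1331 Gflop/s is taken directly from the text.

```python
FLOPS_PER_CYCLE = 4  # assumption: inferred from 7.6 Gflop/s at 1.9 GHz


def efficiency(measured_gflops, peak_gflops):
    """Fraction of the theoretical peak attained by a measured run."""
    return measured_gflops / peak_gflops


# Peak of the 1600-processor, 1.7 GHz system under the assumption above.
peak_1600 = 1600 * 1.7 * FLOPS_PER_CYCLE

runs = [
    ("1600 processors, 1.7 GHz", 6188.0, peak_1600),
    ("8 Turbo nodes", 737.0, 1331.0),  # peak as quoted in the text
]

if __name__ == "__main__":
    for label, measured, peak in runs:
        eff = efficiency(measured, peak)
        print(f"{label:<26} {measured:7.0f} / {peak:7.0f} Gflop/s = {eff:.1%}")
```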




Aad van der Steen
Wed Oct 13 13:22:23 CEST 2004