Machine type | Shared-memory SMP system. |
---|---|
Models | TX-7 i9010, i9510. |
Operating system | Linux, HP-UX (HP's Unix variant). |
Connection structure | Crossbar. |
Compilers | Fortran 90, HPF, ANSI C, C++ |
Vendor's information Web page | http://www.hpce.nec.com/465.0.html |
Year of introduction | 2002. |
System parameters:
Model | i9010 | i9510 |
---|---|---|
Clock cycle | 1.5 GHz | 1.5 GHz |
Theor. peak performance | ||
Per Proc. (64 bits) | 6 Gflop/s | 6 Gflop/s |
Maximal | 96 Gflop/s | 192 Gflop/s |
Main memory | ≤ 64 GB | ≤ 128 GB |
No. of processors | 16 | 32 |
Remarks:
The TX-7 series is offered in four models, of which we discuss only the two largest. The TX-7 is another of the Itanium 2-based servers that recently appeared on the market (see also the Bull NovaScale, the SGI Altix3000, and the Unisys ES7000). The largest configuration presently offered is the TX-7/i9510 with 32 1.5 GHz Itanium 2 processors. NEC already had some experience with Itanium servers, having offered 16-processor Itanium 1 servers under the name AzusA, so the TX-7 systems can be seen as a second generation.
The processors are connected by a flat crossbar. NEC still sells its TX-7s with the full range of Itanium 2 processors that Intel offers: 1.3, 1.4, and 1.5 GHz parts with L3 caches of 3--6 MB, depending on the clock frequency (see the Itanium 2 for a full description).
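To relate the processor choices above to the peak figures in the System parameters table, the following minimal sketch computes theoretical peak performance from the clock frequency. It assumes 4 floating-point operations per clock cycle per processor, a value inferred from the table (6 Gflop/s at 1.5 GHz) rather than stated in the text; the function name `peak_gflops` is illustrative only.

```python
# Assumption: 4 floating-point operations per cycle per processor,
# inferred from the table entry of 6 Gflop/s at 1.5 GHz.
FLOPS_PER_CYCLE = 4


def peak_gflops(clock_ghz, n_procs=1, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in Gflop/s for n_procs processors at clock_ghz."""
    return clock_ghz * flops_per_cycle * n_procs


if __name__ == "__main__":
    # Clock frequencies offered for the TX-7, and the two model sizes discussed.
    for clock in (1.3, 1.4, 1.5):
        per_proc = peak_gflops(clock)
        i9010 = peak_gflops(clock, n_procs=16)
        i9510 = peak_gflops(clock, n_procs=32)
        print(f"{clock:.1f} GHz: {per_proc:.1f} per proc, "
              f"{i9010:.1f} (16 procs), {i9510:.1f} (32 procs) Gflop/s")
```

With the 1.5 GHz processor this reproduces the table values of 96 and 192 Gflop/s for the i9010 and i9510, respectively; the slower clock variants scale down proportionally under the same assumption.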
Unlike the other vendors that employ Itanium 2 processors, NEC offers its own compilers, including an HPF compiler. The latter is probably provided for compatibility with the software for the NEC SX-6, because it is hardly useful on a shared-memory system like the TX-7. The software also includes MPI and OpenMP. Apart from Linux, HP-UX is also offered as an operating system, which may be useful for migrating HP-developed applications to a TX-7.
Measured Performances:
In the spring of 2004 rather extensive experiments with the EuroBen Benchmark were performed on a 16-processor TX-7 i9010 with the 1.5 GHz variant of the processor. The MPI version of a dense matrix-vector multiply was found to run at 14.5 Gflop/s on 16 processors, while for both solving a dense linear system of size N = 2,000 and a 1-D FFT of size N = 16,384, speeds of 3.8--4.1 Gflop/s were observed (see [40]).
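To put the matrix-vector result in context, the short sketch below expresses a measured speed as a fraction of theoretical peak. It uses only the 14.5 Gflop/s figure, for which the processor count is given, and the 96 Gflop/s peak of the 16-processor i9010 from the System parameters table. The helper name `fraction_of_peak` is hypothetical. The linear-system and FFT results are left out because the text does not say on how many processors they were obtained.

```python
def fraction_of_peak(measured_gflops, peak_gflops):
    """Return measured speed as a fraction of theoretical peak."""
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return measured_gflops / peak_gflops


# Values taken from the text and the System parameters table.
PEAK_I9010_16_PROCS = 96.0   # Gflop/s, 16 processors at 1.5 GHz
MATVEC_MPI_MEASURED = 14.5   # Gflop/s, dense matrix-vector multiply (MPI)

if __name__ == "__main__":
    frac = fraction_of_peak(MATVEC_MPI_MEASURED, PEAK_I9010_16_PROCS)
    print(f"Dense matrix-vector multiply: {100 * frac:.1f}% of peak")
```

The result is roughly 15% of peak.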