
The Hitachi S3600 series.

Machine type        Vector processor
Models              S3600/120, S3600/140, S3600/160, S3600/180
Operating system    VOS3/HAP/ES (IBM MVS compatible) and OSF/1
Compilers           FORT77/HAP vectorising Fortran 77

System parameters:

Model                       S3600/120     S3600/140     S3600/160     S3600/180
Clock cycle VPU             4 ns          4 ns          4 ns          4 ns
Clock cycle scalar proc.    8 ns          8 ns          8 ns          8 ns
Theor. peak performance     0.25 Gflop/s  0.5 Gflop/s   1.0 Gflop/s   2.0 Gflop/s
Main memory                 128-256 MB    256-512 MB    256-512 MB    512-1024 MB
Extended memory             <=6 GB        <=16 GB       <=16 GB       <=16 GB

Remarks:

The speed differences between the models S3600/120-180 stem from replication of the multiply/add pipe. The /160 and /180 models have two and four sets, respectively, of a separate add pipe and a multifunctional multiply/add vector pipe. This should lead to a maximum of 3 results per clock cycle per pipe set. So, contrary to the information given by the vendor, the maximum performance of, e.g., the /180 should in some situations be 3 Gflop/s (4 pipe sets x 3 results per 4 ns cycle) instead of 2 Gflop/s.
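
The peak figures discussed above follow from simple arithmetic, sketched here in Python. The function name and parameters are illustrative only. The value of 2 results per cycle per pipe set behind the vendor's figures is an inference from the table (2 Gflop/s / (4 sets x 0.25 GHz)), not a statement from the vendor.

```python
def peak_gflops(pipe_sets, results_per_set_per_cycle, clock_ns=4.0):
    """Theoretical peak in Gflop/s (results per nanosecond equals Gflop/s)."""
    return pipe_sets * results_per_set_per_cycle / clock_ns

# Pipe-set counts for the /160 and /180 as given in the remarks.
for model, sets in (("S3600/160", 2), ("S3600/180", 4)):
    vendor = peak_gflops(sets, 2)   # assumed: 2 results/cycle per set (inferred)
    remark = peak_gflops(sets, 3)   # 3 results/cycle per set, as argued in the text
    print(f"{model}: vendor-style {vendor:.1f} Gflop/s, "
          f"3-result estimate {remark:.1f} Gflop/s")
```

With a 4 ns VPU clock, the 3-result assumption gives 3 Gflop/s for the /180, against 2 Gflop/s under the 2-result assumption.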

Note that the clock cycle of the scalar processor is twice that of the VPU. The bandwidth between memory and the CPU is 2 operands per clock cycle per arithmetic pipe set (via 1 load pipe and 1 load/store pipe), which is somewhat less than optimal: it is not possible to load two operands and store one result in one cycle. The /120 model lacks a separate load pipe; only a load/store pipe is present.
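
A rough throughput model in Python can show the effect of this limit. It rests on assumptions that are not from the source: each pipe moves one operand per cycle, the load pipe can only load, and the load/store pipe performs either a load or a store in a given cycle. The function name is hypothetical.

```python
def cycles_per_element(loads, stores, has_load_pipe=True):
    """Average memory cycles per vector element under the simplified model."""
    if not has_load_pipe:
        # Only the load/store pipe: every operand passes through it in turn.
        return float(loads + stores)
    # Two pipes share the traffic, but all stores must use the load/store pipe.
    return max((loads + stores) / 2.0, float(stores))

# y(i) = y(i) + a * x(i): load x and y, store y (2 loads, 1 store, 2 flops)
print(cycles_per_element(2, 1))                       # load + load/store pipe
print(cycles_per_element(2, 1, has_load_pipe=False))  # /120-like configuration
```

Under these assumptions the update above needs 1.5 memory cycles per element with both pipes and 3 cycles with only a load/store pipe, so memory traffic rather than the arithmetic pipes would bound the speed of such loops.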

A unique feature of the S3600, shared with its direct predecessor the S-820, is that all machines of the series are air-cooled. All other machines in this class relied on at least water cooling.

Unlike the S-820 series, the S3600 series is also marketed worldwide, not only in Japan. This is also the case for the S3800 SM-MIMD machines.

Measured performances: In [4] a speed of 851 Mflop/s for the solution of a full linear system of order 1000 is reported for the S3600/160. The S3600/180 attains a performance of 1672 Mflop/s on the same problem.
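
For orientation, the measured speeds can be set against the vendor's theoretical peak values from the system parameters table. The following Python sketch does this; the helper name is illustrative.

```python
def efficiency_percent(measured_mflops, peak_gflops):
    """Measured speed as a percentage of theoretical peak."""
    return 100.0 * measured_mflops / (peak_gflops * 1000.0)

# (measured Mflop/s from the text, vendor peak in Gflop/s from the table)
results = {"S3600/160": (851, 1.0), "S3600/180": (1672, 2.0)}
for model, (measured, peak) in results.items():
    print(f"{model}: {efficiency_percent(measured, peak):.1f}% of vendor peak")
```

This gives roughly 85% and 84% of the vendor's peak for the /160 and /180 respectively.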



Aad van der Steen
Thu Feb 27 14:09:51 MET 1997