
The Hitachi S3800 series.

Machine type        Vector processor
Models              S-3800/x60, S-3800/y8z; x = 1, 2; y = 1, 2, 4; z = 0, 2
Operating system    VOS3/HAP/ES (IBM MVS compatible) and OSF/1
Compilers           FORT77/HAP vectorising Fortran 77

System parameters:

Model                      S-3800/x60     S-3800/y8z
Clock cycle VPU            2 ns           2 ns
Clock cycle scal. proc.    6 ns           6 ns
Theor. peak performance    4-8 Gflop/s    8-32 Gflop/s
No. of processors
  Scalar                   1-2            1-4
  Vector                   1-2            1-4
Main memory                256-1024 MB    512-2048 MB
Extended memory            <=16 GB        <=32 GB

Remarks:

The S-3800 is the current top-end system of Hitachi's S-3000 series. Five different models are offered: the 160 and the 260, of which the 260 is simply the 2-CPU version of the 160, and a sub-series 180, 280, and 480, of which the 280 and 480 are the 2-CPU and 4-CPU versions of the 180. In addition, there is a model 182 with 2 scalar processors and 1 vector processor, a configuration also offered in the Fujitsu VPX200 series and for the same reason: the scheme is meant to reduce the context-switching delay between jobs.

The smallest model, the S-3800/160, has 4 multi-functional multiply/add pipes which may deliver up to 8 results per clock cycle. At the 2 ns VPU clock cycle this is equivalent to 4 Gflop/s. In the /180 the number of pipes is doubled to 8, with a corresponding peak performance of 8 Gflop/s. All models feature one or more separate divide pipes. As the processors of the multi-headed systems can work in parallel, the top model, the S-3800/480, may theoretically attain a speed of 32 Gflop/s.
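The peak figures quoted above follow from the pipe count and the 2 ns VPU clock cycle. The sketch below reproduces that arithmetic; the function name `peak_gflops` and its parameters are illustrative assumptions, and the factor of 2 results per pipe per cycle is inferred from the stated "4 pipes, up to 8 results per clock cycle".

```python
def peak_gflops(pipes, cpus=1, clock_ns=2.0, results_per_pipe=2):
    """Theoretical peak in Gflop/s.

    One result per nanosecond equals 1 Gflop/s, so the peak is simply
    the number of results per clock cycle divided by the cycle time in ns.
    results_per_pipe=2 is an assumption derived from the text
    (4 multiply/add pipes delivering up to 8 results per cycle).
    """
    results_per_cycle = pipes * results_per_pipe * cpus
    return results_per_cycle / clock_ns


print(peak_gflops(pipes=4))           # S-3800/160
print(peak_gflops(pipes=8))           # S-3800/180
print(peak_gflops(pipes=8, cpus=4))   # S-3800/480
```

With these inputs the three calls give 4, 8 and 32 Gflop/s, matching the range in the system parameters table.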

Hitachi now delivers an auto-parallelising compiler, which features parallelising compiler directives similar to those of Cray and NEC. The OSF/1 system can be run under the MVS-like VOS3/HAP/ES, but it can also be run as a native operating system.

Measured performances:

The first S-3000 system, an S-3800/480, was installed in January 1993 at the University of Tokyo. Tests with the EuroBen benchmark were done on this system in July-September 1993. During these tests a speed of 5.7 Gflop/s was observed for the evaluation of a tex2html_wrap_inline1172 degree polynomial on a single processor. In matrix-vector multiplication, speeds of 6.5 Gflop/s on one processor were measured (see [euroben, hitachi]). In [linpackbm] a speed of 28.4 Gflop/s on 4 processors is reported for the solution of a dense linear system of order 15,500. This corresponds to an efficiency of 89% relative to the theoretical peak of 32 Gflop/s.
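The efficiency quoted for the linear system is the ratio of measured speed to theoretical peak. A minimal sketch of that calculation follows; the helper name `efficiency` is hypothetical. The single-processor ratios are not stated in the text: they are derived here on the assumption that one processor of the S-3800/480 has the 8 Gflop/s peak given for the /180.

```python
def efficiency(measured_gflops, peak_gflops):
    """Fraction of the theoretical peak that was actually attained."""
    return measured_gflops / peak_gflops


# Dense linear system of order 15,500 on 4 processors (peak 32 Gflop/s).
linpack = efficiency(28.4, 32.0)

# Single-processor EuroBen results, assuming an 8 Gflop/s peak per processor.
polynomial = efficiency(5.7, 8.0)
matvec = efficiency(6.5, 8.0)

for name, value in [("linear system", linpack),
                    ("polynomial", polynomial),
                    ("matrix-vector", matvec)]:
    print(f"{name}: {value:.1%}")
```

The first ratio rounds to the 89% reported above.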






Jack Dongarra
Sat Feb 10 15:12:38 EST 1996