Next: The Digital Equipment Corp. Up: Shared-memory MIMD systems Previous: The Hitachi S3800 series.

The HP/Convex C4 series.

Machine type          Shared-memory multi-vectorprocessor
Models                C46x0, x = 1,...,4
Operating system      ConvexOS (Convex's Unix variant)
Compilers             Fortran, C, C++, ADA, Lisp
Vendors information   Web page

System parameters:

Model                          C4600
Clock cycle                    7.41 ns
Theor. peak performance
  Per proc. (64-bit prec.)     810 Mflop/s
  Per proc. (32-bit prec.)     1620 Mflop/s
  Maximal, 64-bit precision    3240 Mflop/s
  Maximal, 32-bit precision    6480 Mflop/s
No. of processors              1-4
Main memory                    <= 4 GB
Memory bandwidth
  Single proc. bandwidth       1080 MB/s


Recently (November 1995), Convex Computer Corp. became a subsidiary of Hewlett-Packard. This has, at least for the moment, no impact on the products that are marketed by HP/Convex. Both the vectorprocessors and the Exemplar SPP series (see section 3.4) will stay on the market.

The C4600 series is the fourth generation of vectorprocessors from Convex. Unlike the former C3800 series, which had a maximum of 8 processors, the C4600 series has at most four processors, in the C4640 model. A major difference with the former generations is that more functional unit sets per CPU are present: six general-purpose functional units. This brings the number of floating-point results per cycle to 6 in the ideal case. Because the functional units are general-purpose, there are more opportunities for linking or independent processing than with specialised multiply and add pipes, which increases the scheduling density of operations. In addition, some logical operations can be done in the functional units, which enables 32-bit convolutions to be done at speeds in excess of 1 Gflop/s (this is called the "extended architecture" in Convex jargon).

As in the former C3400 and C3800, GaAs components are used to arrive at the cycle time of 7.41 ns. Also as in these former models, there is a difference in speed of a factor of two between single-precision (32-bit) and double-precision (64-bit) calculations, with the 32-bit calculations being the faster.
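The theoretical peak figures in the table can be reproduced from the cycle time and the number of results per cycle. The following is a minimal sketch in Python, not vendor code; it assumes that the ideal case of 6 results per cycle applies to 64-bit arithmetic and that the factor of two for 32-bit arithmetic corresponds to 12 results per cycle. The function name `peak_mflops` is made up for this illustration.

```python
def peak_mflops(cycle_ns, results_per_cycle, n_procs=1):
    """Theoretical peak in Mflop/s from cycle time and results per cycle."""
    # One result per nanosecond corresponds to 1000 Mflop/s.
    return results_per_cycle / cycle_ns * 1000.0 * n_procs


CYCLE_NS = 7.41    # clock cycle from the system parameters table
RESULTS_64 = 6     # six functional units, one result each (ideal case)
RESULTS_32 = 12    # assumption: twice the 64-bit rate

for procs in (1, 4):
    p64 = peak_mflops(CYCLE_NS, RESULTS_64, procs)
    p32 = peak_mflops(CYCLE_NS, RESULTS_32, procs)
    print(f"{procs} proc.: {p64:7.1f} Mflop/s (64-bit), "
          f"{p32:7.1f} Mflop/s (32-bit)")
```

The computed values come out slightly below the tabulated ones (about 809.7 instead of 810 Mflop/s per processor at 64-bit precision), which suggests that the table rounds the per-processor figure first and then multiplies by the number of processors.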

As for the Convex Exemplar SPP-1200 (see 3.4.8), an "application compiler" is available that is capable of interprocedural analysis. This can greatly enhance the vectorisability of some codes and is in general beneficial in optimising large codes.

Measured performances: Traditionally, Convex systems are able to obtain a significant fraction of their theoretical peak performance. On a C220 (functionally equivalent to a C3220) 77.6 and 88.9 Mflop/s out of the theoretical 100 Mflop/s have been observed for a Fortran 77 and a library implementation of a linear system solver, respectively [rijk]. The C4600 proves to be no exception: the solution of a dense linear system of order N=1000 shows a speed of 683 Mflop/s on one processor at 64-bit precision and of 1320 Mflop/s on a C4620. At 32-bit precision, speeds of 1227 and 2252 Mflop/s were found on the C4610 and the C4620, respectively. In [linpackbm] a speed of 1.933 Gflop/s out of a maximum of 3.24 Gflop/s is reported.
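To put these measurements side by side, the fraction of theoretical peak can be computed for each case. The sketch below is in Python and is only illustrative: the peak values for the C4620 are assumed to be twice the per-processor peaks from the table, the one-processor result is assumed to refer to a C4610, and the helper `fraction_of_peak` is a name made up for this example.

```python
def fraction_of_peak(measured, peak):
    """Measured speed as a fraction of the theoretical peak (same units)."""
    return measured / peak


# (label, measured Mflop/s, theoretical peak Mflop/s)
measurements = [
    ("C220, Fortran 77",        77.6,  100.0),
    ("C220, library",           88.9,  100.0),
    ("C4610, 64-bit, N=1000",  683.0,  810.0),
    ("C4620, 64-bit, N=1000", 1320.0, 1620.0),  # assumed peak: 2 x 810
    ("C4610, 32-bit",         1227.0, 1620.0),
    ("C4620, 32-bit",         2252.0, 3240.0),  # assumed peak: 2 x 1620
    ("Reported maximum case", 1933.0, 3240.0),  # 1.933 out of 3.24 Gflop/s
]

for label, measured, peak in measurements:
    pct = 100.0 * fraction_of_peak(measured, peak)
    print(f"{label:24s} {measured:7.1f} / {peak:6.1f} Mflop/s = {pct:5.1f}%")

# Speed-up from one to two processors at 64-bit precision.
print(f"C4620 vs. C4610 (64-bit): {1320.0 / 683.0:.2f}x")
```

Under these assumptions the 64-bit results for the linear system of order 1000 lie above 80% of peak on one and two processors, while the 32-bit results and the reported maximum case reach a smaller fraction.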


Jack Dongarra
Sat Feb 10 15:12:38 EST 1996