As we expand the ScaLAPACK test suite to encompass more rigorous testing,
particularly for floating point values close to the edge of representable
numbers (as is present in the LAPACK test suite), we are reminded of additional
dangers which must be avoided in floating point arithmetic. For example, it is a
sad reflection that some compilers still do not implement complex arithmetic
carefully. In particular, unscaled complex division still occurs on certain
architectures, leading to unnecessary overflow. To handle this
difficulty ScaLAPACK, as LAPACK, restricts the range of representable numbers by
a call to routine PDLABAD (in double precision), the equivalent of the
LAPACK routine DLABAD, which, if desired, takes the square root of the
smallest and largest representable numbers for the computation to protect from
unnecessary underflow or overflow. PDLABAD calls DLABAD locally on
each process and then communicates the minimum and maximum value respectively.
Arguably we should have separate routines for real and complex arithmetic, but
since we hope that the need for DLABAD will eventually disappear we have
so far resisted taking that step.
This is particularly irritating if one machine in a network is causing us to impose unnecessary restrictions on all the machines in the network, but without this, catastrophic results can occur during computations near the overflow or underflow thresholds.
Another problem that we have encountered during testing is in the way that
subnormal (denormalized) numbers are handled on certain (near) IEEE
architectures. By default, some architectures flush subnormal numbers to
zero.
Thus, if the computation involves numbers near underflow and a subnormal number
is communicated to such a machine, the computational results may be invalid and
the subsequent behavior unpredictable. Often such machines have a compiler
switch to allow the handling of subnormal numbers, but it can be non-obvious and
we cannot guarantee that users will use such a switch.
This behavior occurred during the heterogeneous testing of the linear least squares routines when the input test matrix was a full-rank matrix scaled near underflow. During the course of the computation a subnormal number was communicated, this value was unrecognized on receipt, and a floating point exception was flagged. The execution on the processor was killed, subsequently causing the execution on the other processors to hang. As we expand the test suite we expect to discover such behavior in other parts of ScaLAPACK, since we do not believe that there was anything special about the least squares routines.
A solution would be to replace subnormal numbers either with zero, or with the nearest normal number, but we are somewhat reluctant to implement this within ScaLAPACK, since this does not seem to be the right software level at which to do this.
A simple example program to illustrate this problem is given in Appendix A.