
ScaLAPACK and the PBLAS and BLACS

The ScaLAPACK project is a continuation of the LAPACK project, and part of the project has been concerned with porting the LAPACK software to distributed memory parallel machines, producing the ScaLAPACK software package [CDDDOPSWW:DMW:95x, CDOPWW:DGJK:95x, CDW:NumAlg:95x]. Naturally, ScaLAPACK shares the goals of LAPACK mentioned above and hence uses the same approach of promoting and utilizing standards. ScaLAPACK additionally aims at scalability as the problem size and the number of processors grow on distributed memory parallel machines.

As an aid to achieving these goals, the ScaLAPACK software has been designed to look as much like the LAPACK software as possible. Because the BLAS have proven to be very useful tools both within LAPACK and outside it, the ScaLAPACK project chose to build a set of Parallel BLAS, or PBLAS [CDOPWW:DMW:95x], whose interface is as similar to that of the BLAS as possible. This decision has permitted the ScaLAPACK code to be quite similar, and sometimes nearly identical, to the analogous LAPACK code. Only one substantially new routine, matrix transposition, was added to the PBLAS, since transposition is a complicated operation in a distributed memory environment [CDW:PC:95].
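To make the correspondence concrete, the sketch below contrasts the BLAS matrix-multiply call DGEMM with its PBLAS counterpart PDGEMM: each leading-dimension argument of the BLAS is replaced by a pair of global indices and an array descriptor defining the two-dimensional block-cyclic distribution. The program is an illustration only, not taken from the paper; for simplicity it assumes a single process (a 1 x 1 BLACS grid), so the local and global matrices coincide, and the matrix size N = 4 and block size NB = 2 are arbitrary choices.

      PROGRAM PBEX
*     Illustrative sketch: the PBLAS call PDGEMM mirrors the BLAS call
*        CALL DGEMM( 'N','N', N, N, N, ONE, A, LDA, B, LDB, ZERO, C, LDC )
*     with each leading dimension replaced by global indices (IA,JA)
*     and a descriptor DESC for the block-cyclic distribution.
      INTEGER            N, NB
      PARAMETER          ( N = 4, NB = 2 )
      DOUBLE PRECISION   ONE, ZERO
      PARAMETER          ( ONE = 1.0D+0, ZERO = 0.0D+0 )
      INTEGER            ICTXT, NPROW, NPCOL, MYROW, MYCOL, INFO
      INTEGER            DESCA( 9 ), DESCB( 9 ), DESCC( 9 )
      DOUBLE PRECISION   A( N, N ), B( N, N ), C( N, N )
      INTEGER            I, J
*     Set up a 1 x 1 BLACS process grid (one process assumed).
      CALL BLACS_GET( -1, 0, ICTXT )
      CALL BLACS_GRIDINIT( ICTXT, 'Row-major', 1, 1 )
      CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )
*     Descriptors: NB x NB blocks, source process (0,0), local
*     leading dimension N.
      CALL DESCINIT( DESCA, N, N, NB, NB, 0, 0, ICTXT, N, INFO )
      CALL DESCINIT( DESCB, N, N, NB, NB, 0, 0, ICTXT, N, INFO )
      CALL DESCINIT( DESCC, N, N, NB, NB, 0, 0, ICTXT, N, INFO )
*     Fill A with the identity and B with ones.
      DO 20 J = 1, N
         DO 10 I = 1, N
            A( I, J ) = ZERO
            B( I, J ) = ONE
   10    CONTINUE
         A( J, J ) = ONE
   20 CONTINUE
*     C := A * B on the (distributed) matrices, starting at global
*     element (1,1) of each.
      CALL PDGEMM( 'N', 'N', N, N, N, ONE, A, 1, 1, DESCA,
     $             B, 1, 1, DESCB, ZERO, C, 1, 1, DESCC )
      CALL BLACS_GRIDEXIT( ICTXT )
      CALL BLACS_EXIT( 0 )
      END

Except for the descriptor arguments, the call is essentially the one a LAPACK user would write for DGEMM, which is what keeps ScaLAPACK code so close to its LAPACK counterpart.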

It is hoped that the PBLAS will provide a distributed memory standard, just as the BLAS have provided a shared memory standard. This would simplify and encourage the development of high performance, portable parallel numerical software, as well as provide manufacturers with a small set of routines to optimize. The acceptance of the PBLAS requires reasonable compromises between the competing goals of functionality and simplicity. The PBLAS, like ScaLAPACK, perform global operations and call the BLAS to perform computations on single (local) nodes. In addition, they call a set of Basic Linear Algebra Communication Subprograms, the BLACS [DW:UTK-cs:95], to perform the communication between processors. The BLACS can be thought of as playing the same role for communication that the BLAS play for computation. The software hierarchy for ScaLAPACK is illustrated in Figure 1.
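As an illustration of that role (again a sketch, not code from the paper), the following program uses the BLACS combine operation DGSUM2D to sum one value contributed by each process, leaving the result everywhere. The grid shape (a single row of NPROCS processes) and the contributed values are arbitrary choices made for the example.

      PROGRAM BLSUM
*     Illustrative sketch: a global sum with the BLACS, the
*     communication analogue of a local BLAS computation.
      INTEGER            IAM, NPROCS
      INTEGER            ICTXT, NPROW, NPCOL, MYROW, MYCOL
      DOUBLE PRECISION   X( 1 )
*     Find out how many processes there are and arrange them in a
*     single row.
      CALL BLACS_PINFO( IAM, NPROCS )
      CALL BLACS_GET( -1, 0, ICTXT )
      CALL BLACS_GRIDINIT( ICTXT, 'Row-major', 1, NPROCS )
      CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )
*     Each process contributes one value (its column index plus one).
      X( 1 ) = DBLE( MYCOL + 1 )
*     Sum X over the whole grid; RDEST = -1 leaves the result on
*     every process.
      CALL DGSUM2D( ICTXT, 'All', ' ', 1, 1, X, 1, -1, -1 )
      IF( MYROW.EQ.0 .AND. MYCOL.EQ.0 )
     $   WRITE( *, * ) 'Global sum = ', X( 1 )
      CALL BLACS_GRIDEXIT( ICTXT )
      CALL BLACS_EXIT( 0 )
      END

Because such operations hide the underlying message passing, the PBLAS and ScaLAPACK can be moved to a new machine by reimplementing only the BLACS.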

  
Figure 1: The ScaLAPACK software hierarchy.

The fact that the ScaLAPACK software has the same structure as LAPACK greatly facilitates the production, maintenance, and portability of the software, and has also enabled the user interface to be kept almost the same, making it much easier for users to port their programs between LAPACK and ScaLAPACK. Further details of the design of ScaLAPACK, together with performance results, can be found in [CDDDOPSWW:DMW:95x].



Jack Dongarra
Tue Sep 3 09:41:41 EDT 1996