Assuming that ScaLAPACK was installed correctly, users need
only make sure that they are using an appropriate number of processors
and that their matrices are efficiently distributed. Here is a
checklist to get started.
- Use the right number of processors.
- Rule of thumb: use $P = M \times N / 10^6$ processors for an $M \times N$ matrix.
This provides a local matrix of size approximately 1000 by 1000.
- Do not try to solve a small problem on too many processors.
- Do not exceed physical memory.
- Use an efficient data distribution.
- Block size (i.e., MB,NB) = 64.
- Square processor grid, $P_r = P_c = \sqrt{P}$ (equal numbers of process rows and process columns for $P$ processors).
- Use efficient machine-specific BLAS (not the Fortran 77 reference
implementation BLAS) and BLACS (nondebug, BLACSDBGLVL=0 in
Bmake.inc).
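The arithmetic behind the checklist can be sketched in a few lines of Python. This is only an illustration, not part of ScaLAPACK: the function name `suggest_layout` and its parameters are hypothetical, the grid is rounded down to the nearest perfect square, and the local size is a simple even-split estimate that ignores the small imbalance a block-cyclic distribution can introduce. It assumes 8-byte (double precision real) matrix elements.

```python
import math


def suggest_layout(m, n, target_local=1000, block=64, bytes_per_elem=8):
    """Rough starting layout for an m-by-n matrix (hypothetical helper).

    target_local: desired local matrix dimension (about 1000 per the checklist).
    block: distribution block size MB = NB (64 per the checklist).
    bytes_per_elem: assumed element size; 8 for double precision real.
    """
    # Rule of thumb: about one processor per 1000-by-1000 chunk of the matrix,
    # and never fewer than one processor for a small problem.
    p = max(1, round(m * n / target_local**2))

    # Square grid: round down to the largest side with side * side <= p.
    side = max(1, math.isqrt(p))

    # Even-split estimate of the local matrix held by each process.
    local_rows = math.ceil(m / side)
    local_cols = math.ceil(n / side)

    return {
        "grid": (side, side),
        "processes": side * side,
        "block_size": block,
        "local_shape": (local_rows, local_cols),
        # Memory for one local copy of the matrix only; workspace is extra.
        "local_bytes": local_rows * local_cols * bytes_per_elem,
    }


if __name__ == "__main__":
    layout = suggest_layout(10000, 10000)
    print(layout["grid"], layout["local_shape"], layout["local_bytes"])
```

Comparing `local_bytes` (plus whatever workspace the chosen routine needs) against the physical memory available per process is one way to check the "do not exceed physical memory" item before launching a run.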
If performance is still below what is expected, see
section 5.3. For guidelines on tuning for higher performance,
see section 5.4.
Susan Blackford
Tue May 13 09:21:01 EDT 1997