If a network meets the following guidelines, ScaLAPACK will perform
well on it (see section 5.1.1). If a network
of workstations does not meet one or more of these guidelines, read the
rest of this chapter for more information.
- The bandwidth per node, measured in megabytes per second, should be
no less than one tenth of the peak floating-point rate per node, measured in megaflops.
- The underlying network must allow simultaneous messages, which rules
out standard Ethernet and FDDI.
- Message latency should be no more than 500 microseconds.
- All processors should be similar in architecture and performance.
ScaLAPACK will be limited by
the slowest processor, and data format conversion between dissimilar
architectures significantly reduces communication performance.
- No other jobs should be allowed to execute on the processors
that are being used. If the processors are gang scheduled and there
is enough physical memory for all jobs on all processors,
this requirement may be relaxed, but we do not recommend doing so
without careful study.
- No more than one process should be executed per processor.
Vendor specifications and actual performance often differ considerably,
especially in communication latency and bandwidth. Users should make sure
that they are using the most efficient BLAS and BLACS available on their
system.
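The numeric guidelines above can be checked mechanically once bandwidth and latency have been measured on the actual system. The following is a minimal sketch in Python, not part of ScaLAPACK; the names `NodeNetwork` and `check_guidelines` are hypothetical, and the values passed in are assumed to be measured figures rather than vendor specifications.

```python
from dataclasses import dataclass


@dataclass
class NodeNetwork:
    """Measured characteristics of one node and its network (hypothetical helper)."""
    bandwidth_mb_per_s: float    # measured bandwidth per node, megabytes/second
    latency_us: float            # measured message latency, microseconds
    peak_mflops: float           # peak floating-point rate per node, megaflops
    simultaneous_messages: bool  # True if the network carries several messages at once


def check_guidelines(net):
    """Return a list of the guidelines that the given network fails."""
    problems = []
    # Bandwidth (MB/s) should be at least one tenth of the peak rate (megaflops).
    if net.bandwidth_mb_per_s < net.peak_mflops / 10.0:
        problems.append("bandwidth below one tenth of peak floating-point rate")
    # Shared-medium networks such as standard Ethernet or FDDI fail this test.
    if not net.simultaneous_messages:
        problems.append("network does not allow simultaneous messages")
    # Latency should be no more than 500 microseconds.
    if net.latency_us > 500.0:
        problems.append("message latency above 500 microseconds")
    return problems
```

An empty list means the three measurable guidelines are met; the remaining guidelines (similar processors, dedicated use, one process per processor) still have to be verified by hand.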
Susan Blackford
Tue May 13 09:21:01 EDT 1997