Goals - Port LAPACK to distributed-memory environments.
Efficiency
- Optimized compute and communication engines
- Block-partitioned algorithms (Level 3 BLAS) exploit the memory hierarchy and yield good per-node performance
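
The block-partitioning idea can be sketched in a few lines. This is only an illustration, not ScaLAPACK or BLAS code: the function name `blocked_matmul` and the `block` parameter are hypothetical, and plain Python lists stand in for the tuned Level 3 BLAS kernels a real implementation would call. The point is the loop structure: the product is computed tile by tile, so each tile of A, B, and C is reused while it is still in fast memory.

```python
def blocked_matmul(a, b, block=2):
    """Multiply list-of-lists matrices a (n x k) and b (k x m) tile by tile.

    Hypothetical helper for illustration; 'block' is the tile edge length.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer three loops walk over tiles of C, B, and A.
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for p0 in range(0, k, block):
                # Inner three loops update one tile of C from one tile
                # of A and one tile of B (the Level 3 BLAS-style step).
                for i in range(i0, min(i0 + block, n)):
                    for p in range(p0, min(p0 + block, k)):
                        aip = a[i][p]
                        for j in range(j0, min(j0 + block, m)):
                            c[i][j] += aip * b[p][j]
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
result = blocked_matmul(a, b, block=2)
```

The result does not depend on the tile size; only the memory access pattern changes, which is where the performance benefit comes from on real hardware.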
Scalability
- Performance scales as the problem size and number of processors grow
Reliability
- Whenever possible, use LAPACK algorithms and error bounds.
Portability
- Isolate machine dependencies in the BLAS and the BLACS
Flexibility
- Modularity: build a rich set of linear algebra tools: BLAS, BLACS, PBLAS
Ease-of-Use
- Calling interface similar to LAPACK