SC97 Tutorial on Linear Algebra Algorithms and Software for Large Scientific Problems
Outline
High-Performance Computing Today
Growth of Microprocessor Performance
Scalable Multiprocessors
Performance Numbers on RISC Processors
Top500 Fastest Installed Computers
The Maturation of Highly Parallel Technology
Architecture Alternatives
Directions
Challenges in Developing Distributed Memory Libraries
EISPACK and LINPACK
Memory Hierarchy
Level 1, 2 and 3 BLAS
Why Higher Level BLAS?
BLAS for Performance
History of Block Partitioned Algorithms
Block Partitioned Algorithms
LAPACK
Derivation of Blocked Algorithms: Cholesky Factorization A = U^T U
LINPACK Implementation
LAPACK Implementation
Derivation of Blocked Algorithms
LAPACK Blocked Algorithms
LAPACK Contents
LAPACK Ongoing Work
ATLAS Project (Automatically Tuned Linear Algebra Software)
ScaLAPACK
Programming Style
Overall Structure of Software
PBLAS
ScaLAPACK Structure
Choosing a Data Distribution
Possible Data Layouts
Distribution and Storage
Parallelism in ScaLAPACK
ScaLAPACK - What’s Included
Heterogeneous Computing
Performance
ScaLAPACK on a Cluster
HPF Interface for ScaLAPACK
Out of Core Approach
Out of Core Algorithm
References
Email: dongarra@cs.utk.edu
Home Page: http://www.netlib.org/utk/people/JackDongarra/