|
|
|
Clint Whaley |
|
Antoine Petitet |
|
Jack Dongarra |
|
University of Tennessee |
|
and |
|
Oak Ridge National Laboratory |
|
http://www.cs.utk.edu/~dongarra/ |
|
|
|
|
|
|
500 most powerful computers.
|
Updated every 6 months |
|
Microprocessor-based computers are the technology of
choice.
|
Result: HPC driven by commodity parts |
|
|
|
|
|
|
|
|
By taking advantage of the principle of
locality, a memory hierarchy can:
|
Present the user with as much memory as is
available in the cheapest technology. |
|
Provide access at the speed offered by the
fastest technology. |
|
|
|
|
Computing hardware doubles its speed every
eighteen months. |
|
Yet it often takes more than a year for software
to be optimized or "tuned" for performance on a newly released
CPU. |
|
The job of optimizing software to exploit the
special features of a given CPU has historically been an exercise in hand
customization. |
|
|
|
|
|
Today’s processors can achieve high performance,
but this requires extensive machine-specific hand tuning.
|
Hardware and software have a large design space
with many parameters; performance is sensitive to:
|
Blocking sizes, loop nesting permutations, loop
unrolling depths, software pipelining strategies, register allocations, and
instruction schedules. |
|
Complicated interactions with the increasingly
sophisticated micro-architectures of new microprocessors. |
|
Performance instability |
|
About a year ago there was no tuned BLAS for the
Pentium under Linux.
|
Need for quick/dynamic deployment of optimized
routines. |
|
ATLAS - Automatically Tuned Linear Algebra Software
|
PHiPAC from Berkeley
|
FFTW from MIT (http://www.fftw.org) |
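To make these parameters concrete, here is a minimal sketch, in Python and purely for illustration (ATLAS itself generates C), of a cache-blocked matrix multiply. The blocking factor NB is one of the parameters such a tuning search varies; the function name and default are assumptions for this example.

```python
# Illustrative cache-blocked matrix multiply; NB is the blocking factor.
# In a real tuned kernel the innermost loops would also be unrolled and
# register-allocated -- two more of the parameters listed above.

def matmul_blocked(A, B, n, NB=2):
    """Return C = A*B for n-by-n matrices stored as lists of lists."""
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, NB):            # block row of C
        for jj in range(0, n, NB):        # block column of C
            for kk in range(0, n, NB):    # block of the inner dimension
                for i in range(ii, min(ii + NB, n)):
                    for k in range(kk, min(kk + NB, n)):
                        a = A[i][k]       # scalar reused across the j loop
                        for j in range(jj, min(jj + NB, n)):
                            C[i][j] += a * B[k][j]
    return C
```

Blocking keeps an NB-by-NB working set in fast memory while it is reused; the best NB depends on the cache sizes of the target machine, which is why it is measured rather than fixed.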
|
|
|
|
|
An adaptive software architecture |
|
High-performance |
|
Portability |
|
Elegance |
|
|
|
|
|
|
|
|
|
ATLAS is faster than all other portable BLAS
implementations, and it is comparable to machine-specific libraries
provided by the vendor.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
ATLAS does not implement a single fixed
algorithm. |
|
The code is generated by a program that tests,
probes, and runs hundreds of experiments on the target software/hardware architecture.
|
During installation, the program generator
determines an efficient implementation: it measures the speed of different
code strategies and chooses the best using an adaptive procedure.
|
This leads to a new model of high performance
programming in which performance critical code is machine generated using
parameter optimization. |
|
|
|
|
|
Code is iteratively generated and timed until the
optimal case is found. We try:
|
Differing blocking factors (NBs)
|
Breaking false dependencies |
|
M, N and K loop unrolling |
|
On-chip multiply optimizes for: |
|
TLB access |
|
L1 cache reuse |
|
FP unit usage |
|
Memory fetch |
|
Register reuse |
|
Loop overhead minimization |
|
Takes 30 minutes to an hour to run.
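The timing loop itself can be sketched as follows — a toy Python stand-in for the install-time search, where the candidate NB values and the kernel body are illustrative assumptions, not ATLAS code:

```python
# Toy version of the install-time search: time every candidate blocking
# factor NB on this machine and keep the fastest one.
import time

def kernel(n, NB):
    """Blocked accumulate over an n-by-n problem (stand-in for the on-chip multiply)."""
    A = [[1.0] * n for _ in range(n)]
    B = [[1.0] * n for _ in range(n)]
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, NB):
        for kk in range(0, n, NB):
            for i in range(ii, min(ii + NB, n)):
                for k in range(kk, min(kk + NB, n)):
                    a = A[i][k]
                    for j in range(n):
                        C[i][j] += a * B[k][j]
    return C

def tune(n=64, candidates=(4, 8, 16, 32)):
    """Time each candidate NB and return (best_NB, best_seconds)."""
    results = []
    for NB in candidates:
        t0 = time.perf_counter()
        kernel(n, NB)
        results.append((time.perf_counter() - t0, NB))
    best_time, best_NB = min(results)
    return best_NB, best_time
```

ATLAS runs hundreds of such experiments, varying all of the parameters above at once, which is why the full search takes tens of minutes.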
|
|
|
|
|
|
Recur down to L1 cache block size |
|
Need kernel at bottom of recursion |
|
Use gemm-based kernel for portability |
|
|
|
|
|
|
|
Needs a reasonable C compiler and is focused on
superscalar (RISC) architectures.
|
Available today: www.netlib.org/atlas/ |
|
Keep a repository of kernels for specific
machines. |
|
Extend work to allow sparse matrix operations |
|
Extend work to include arbitrary code segments |
|
|
|
|
|
|
function a = rlu(a)
[m,n] = size(a);
mn = min(m,n);
if mn > 1
   nl = bitshift(mn,-1);
   % lu on left part
   a(1:m,1:nl) = rlu(a(1:m,1:nl));
   % triangular solve
   a(1:nl,nl+1:n) = (tril(a(1:nl,1:nl)) - diag(diag(a(1:nl,1:nl))) + eye(nl)) \ a(1:nl,nl+1:n);
   % matrix multiply
   a(nl+1:m,nl+1:n) = a(nl+1:m,nl+1:n) - a(nl+1:m,1:nl)*a(1:nl,nl+1:n);
   % lu on right part
   a(nl+1:m,nl+1:n) = rlu(a(nl+1:m,nl+1:n));
else
   % m-by-1 case
   a(2:m,:) = a(2:m,:)/a(1,1);
end
return
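The same recursive algorithm transcribed into Python, as an illustrative sketch: pure lists, no pivoting, and the matrix overwritten in place with the unit-lower L and U factors, just like the Matlab version above. The in-place block arguments (r0, c0, m, n) are an assumption of this transcription.

```python
# Recursive LU without pivoting, operating in place on the m-by-n block of
# `a` whose top-left corner is (r0, c0). Mirrors the Matlab rlu above.

def rlu(a, r0=0, c0=0, m=None, n=None):
    if m is None:
        m, n = len(a), len(a[0])
    mn = min(m, n)
    if mn > 1:
        nl = mn >> 1                      # same split as bitshift(mn,-1)
        # lu on left part
        rlu(a, r0, c0, m, nl)
        # triangular solve: A12 = L11 \ A12 (L11 is unit lower triangular,
        # so this is forward substitution with no divisions)
        for j in range(c0 + nl, c0 + n):
            for i in range(r0 + 1, r0 + nl):
                a[i][j] -= sum(a[i][c0 + k] * a[r0 + k][j] for k in range(i - r0))
        # matrix multiply: A22 = A22 - A21 * A12
        for i in range(r0 + nl, r0 + m):
            for j in range(c0 + nl, c0 + n):
                a[i][j] -= sum(a[i][c0 + k] * a[r0 + k][j] for k in range(nl))
        # lu on right part
        rlu(a, r0 + nl, c0 + nl, m - nl, n - nl)
    else:
        # m-by-1 case: scale the entries below the pivot
        for i in range(r0 + 1, r0 + m):
            a[i][c0] /= a[r0][c0]
```

The bulk of the work lands in the A22 update, a matrix multiply, which is exactly why a gemm-based kernel at the bottom of the recursion pays off.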
|
|
|
|
|
|
|
|
|
Larger matrix-matrix operations performed |
|
Automatic Blocking |
|
Drive the recursion down to 1 |
|
No blocksize needed |
|
Applies to Cholesky and QR factorization |
|
QR needs some care because of triangular matrix
in block Householder |
|
Unclear if the ideas apply to two-sided algorithms
(reduction for the eigenvalue problem)
|
|
|
|
|
Will be part of the ATLAS distribution. |
|
Will be created as a by-product of the ATLAS
make.
|
Same calling sequence as LAPACK. |
|
Will be callable from Fortran, but written in C. |
|
|
|
|
|
|
|
|
Software release by December 24th:
|
Level 1 and 2 BLAS implementations |
|
Multi-threading |
|
Recursive LU and LLT (QR to come) |
|
See: www.netlib.org/atlas/ |
|
Longer Term: |
|
Runtime adaptation |
|
Sparsity analysis |
|
Iterative code improvement |
|
Adaptive libraries |
|
Specialization for user applications |
|
Optimize message passing system |
|
Extend these ideas to Java directly |
|
Java LAPACK |
|
|
|
|
|
|
|
|
Performance Application Programming Interface |
|
|
|
|
|
The purpose of the PAPI project is to
design, standardize and implement a portable and efficient API to access
the hardware performance monitor counters found on most modern
microprocessors. |
|
|
|
|
|
I/D cache misses for different levels |
|
Branch mispredictions |
|
TLB misses |
|
Pipeline stalls due to memory subsystem |
|
Pipeline stalls due to resource conflicts |
|
Cache invalidations |
|
TLB invalidations |
|
Load/store count |
|
Instruction count |
|
Cycle count |
|
Floating point instruction count |
|
Integer instruction count |
|
Branch taken / not taken count |
|
|
|
|
|
|
|
Alpha |
|
AIX/Power 3/604e/604 |
|
Linux/x86 |
|
IRIX/R10k,R12k |
|
Unicos/21164 |
|
Soon |
|
Solaris/x86,Ultra |
|
Tru64/21x64 |
|
|
|
|
|
Antoine Petitet, UTK - ATLAS |
|
Clint Whaley, UTK - ATLAS |
|
Phil Mucci, UTK - PAPI |
|
Nathan Garner, UTK - PAPI |
|
|
|
For additional information see… |
|
icl.cs.utk.edu/ |
|
www.netlib.org/atlas/ |
|
www.netlib.org/utk/people/JackDongarra/ |
|
ATLAS received a 1999 R&D 100 Award |
|
|
|