Performance Issues - Cache & Bandwidth
Performance instability
- Small changes in the architecture may cause dramatic changes in delivered performance.
Latency-tolerant and bandwidth-parsimonious algorithms and software are critical
- Sometimes this means recompute rather than store/load
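The recompute-versus-store/load trade-off can be sketched as below. This is a minimal illustration, not a benchmark: the function names and the sine coefficient are hypothetical choices made for the example, and Python is used only to show the structure of the two variants. Any real benefit depends on the relative cost of arithmetic and memory traffic on the target machine.

```python
import math

def scaled_sum_table(x):
    """Store/load variant: build a coefficient table, then read it back."""
    n = len(x)
    # n extra values are written to memory and loaded again later.
    table = [math.sin(math.pi * i / n) for i in range(n)]
    return sum(table[i] * x[i] for i in range(n))

def scaled_sum_recompute(x):
    """Recompute variant: regenerate each coefficient at the point of use."""
    n = len(x)
    # No table is kept; each coefficient costs arithmetic instead of memory traffic.
    return sum(math.sin(math.pi * i / n) * x[i] for i in range(n))

if __name__ == "__main__":
    data = [float(i) for i in range(1, 9)]
    print(scaled_sum_table(data), scaled_sum_recompute(data))
```

Both variants compute the same sum. The second trades extra floating-point work for less data moved through the memory hierarchy, which is the exchange the bullet above describes.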
Need to help the compiler
- Hard to get good performance today
- Level 3 BLAS as a starting point
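Level 3 BLAS routines are matrix-matrix operations, which perform on the order of n^3 arithmetic operations on n^2 data, so each value can be reused many times once it is in cache. The sketch below illustrates that reuse pattern with a blocked matrix multiply. It is an assumption-laden teaching example: the names `matmul_naive` and `matmul_blocked` and the block size `bs` are hypothetical, and a tuned BLAS library would be used in practice rather than code like this.

```python
def matmul_naive(a, b):
    """Reference triple loop: C = A * B for lists of lists."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += a[i][p] * b[p][j]
            c[i][j] = s
    return c

def matmul_blocked(a, b, bs=2):
    """Blocked variant: work on bs-by-bs tiles so each tile is reused while resident."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, bs):              # tile row of C
        for j0 in range(0, m, bs):          # tile column of C
            for p0 in range(0, k, bs):      # tile along the inner dimension
                for i in range(i0, min(i0 + bs, n)):
                    for j in range(j0, min(j0 + bs, m)):
                        s = c[i][j]
                        for p in range(p0, min(p0 + bs, k)):
                            s += a[i][p] * b[p][j]
                        c[i][j] = s
    return c

if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    b = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    print(matmul_blocked(a, b))
```

The blocked loop nest performs the same arithmetic as the naive one. Only the order of memory accesses changes, which is why a block size chosen to fit the cache can change delivered performance substantially.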