http://www.cs.utexas.edu/users/rvdg/software.html Scalable Basic Linear Algebra Subprograms (sB_BLAS) <title_line>scalable implementation of common matrix-matrix operations <author>Almadena Chtchelkanova, John Gunnels, Greg Morrow, James Overfelt, Robert A. van de Geijn, University of Texas at Austin <contact>Robert A. van de Geijn / rvdg@cs.utexas.edu <abstract> We present straightforward techniques for a highly efficient, scalable implementation of the common matrix-matrix operations generally known as the Level 3 Basic Linear Algebra Subprograms (BLAS). This work builds on our recent discovery of a parallel matrix-matrix multiplication implementation that has yielded superior performance and requires little workspace. We show that the techniques used for matrix-matrix multiplication extend naturally to all important Level 3 BLAS, and thus this approach becomes an enabling technology for efficient parallel implementation of these routines and of libraries that use the BLAS. <keywords>GAMS D1. Elementary vector and matrix operations <category>numerical-linalg </urc>