The traditional algorithm for QR factorization is based on the use of elementary Householder matrices of the general form
where v is a column vector and
is a scalar. This leads to an algorithm with very good vector performance, especially if coded to use Level 2 BLAS.
The key to developing a block form of this algorithm is to represent a product of b elementary Householder matrices of order n as a block form of a Householder matrix . This can be done in various ways. LAPACK uses the following form [68]:
where V is an n-by-n matrix whose columns are the individual vectors
associated with the Householder matrices
, and T is an upper triangular matrix of order b. Extra work is required to compute the elements of T, but once again this is compensated for by the greater speed of applying the block form. Table 3.6 summarizes results obtained with the LAPACK routine SGEQRF /DGEQRF .
------------------------------------------------- No. of Block Values of n processors size 100 1000 ------------------------------------------------- CONVEX C-4640 1 64 81 521 CONVEX C-4640 4 64 94 1204 CRAY C90 1 128 384 859 CRAY C90 16 128 390 7641 DEC 3000-500X Alpha 1 32 50 86 IBM POWER2 model 590 1 32 108 208 IBM RISC Sys/6000-550 1 32 30 61 SGI POWER CHALLENGE 1 64 61 190 SGI POWER CHALLENGE 4 64 39 342 -------------------------------------------------