The traditional
algorithm for QR
factorization
is based on the use
of elementary Householder
matrices of the general
form
where v is a column
vector and is
a scalar. This leads
to an algorithm with
very good vector
performance, especially
if coded to use Level 2
PBLAS.
The key to developing
a distributed block form
of this algorithm is to
represent a product of
b elementary Householder
matrices of order n as
a block form of a Householder
matrix .
This can be done in various
ways. ScaLAPACK uses the
following form [31]:
where V is an n by b
matrix whose columns are the
individual vectors associated with
the Householder matrices , and T
is an upper triangular matrix
of order b. Extra work is
required to compute the
elements of T, but once again
this is compensated for by the
greater speed of applying the
block form. Table 7
summarizes results obtained with
the ScaLAPACK routine
PDGEQRF .
Table 7: Speed in Megaflop/s of PDGEQRF for Square Matrices of
Order N