The LU factorization applies a sequence of Gaussian eliminations
to form ,
where
and
are
matrices, and
is an
matrix.
is unit lower triangular (lower triangular with
1's on the main diagonal),
is upper triangular, and
is a permutation matrix, which is stored in a
vector.
At the -th step of the computation (
), it is assumed that
the
submatrix of
(
,
) is to be partitioned as follows,
where the block is
,
is
,
is
, and
is
.
is a unit lower triangular matrix, and
is an upper triangular matrix.
At first, a sequence of Gaussian eliminations is performed on the first
panel of
(i.e.,
and
).
Once this is completed, the matrices
,
, and
are known, and we can rearrange the block equations,
The LU factorization can be done by recursively applying the steps
outlined above to the matrix
.
Figure 3 shows a snapshot of the block LU factorization.
It shows how the column panel,
and
, and the row panel,
and
, are computed, and how the trailing submatrix
is updated. In the figure,
the shaded areas represent data for which the corresponding computations
are completed.
Later, row interchanges will be applied to
and
.
Figure 3: A snapshot of block LU factorization.
The computation of the above steps in the LAPACK routine, DGETRF, involves the following operations:
The corresponding parallel implementation of the ScaLAPACK routine, PDGETRF, proceeds as follows: