Suppose the matrix A is partitioned as shown in Figure
5, and we seek a factorization A=LU, where the partitioning
of L and U is also shown in Figure 5. Then we may write,
where is
,
is
,
is
, and
is
.
and
are lower triangular matrices with 1s on the main diagonal,
and
and
are upper triangular matrices.
Figure 5: Block LU factorization of the partitioned matrix A. is
,
is
,
is
,
and
is
.
and
are
lower triangular matrices with 1's on the main diagonal,
and
and
are upper triangular matrices.
Equations 1 and 2 taken together perform an LU
factorization on the first panel of A (i.e.,
and
). Once this is completed,
the matrices
,
, and
are known, and the lower
triangular system in Eq. 3 can be solved to give
.
Finally, we rearrange Eq. 4 as,
From this equation we see that the problem of finding and
reduces to finding the LU factorization of the
matrix
. This can be done by applying the steps
outlined above to
instead of to A. Repeating these steps K times,
where
where denotes the least integer greater than or equal to x,
we obtain the LU factorization of the original
matrix A. For
an in-place algorithm, A is overwritten by L and U - the 1s on the
diagonal of L do not need to be stored explicitly. Similarly,
when A is updated by Eq. 5 this may also be done in place.
After k of these K steps, the first kr columns of L and the first
kr rows of U have been evaluated, and matrix A has been updated to the
form shown in Figure 6, in which panel B is
and C is
. Step k+1 then
proceeds as follows,
Figure 6: Stage k+1 of the block LU factorization algorithm showing
how the panels B and C, and the trailing submatrix E are updated.
The trapezoidal submatrices L and U have already been factored in
previous steps. L has kr columns, and U has kr rows. In the step
shown another r columns of L and r rows of U are evaluated.
The LAPACK implementation of this form of LU factorization uses the Level 3
BLAS routines xTRSM and xGEMM to perform the triangular solve and rank-r
update. We can regard the algorithm as acting on matrices that have been
partitioned into blocks of elements, as shown in Figure
7.
Figure 7: Block-partitioned matrix A. Each block consists of
matrix elements.