At first it may appear that the sequential time of solving a factorization is of the order of the number of variables, but things are not quite that bad. Consider the special case of central differences on a regular domain of points. The variables
Figure: Wavefront solution of
from a central difference problem
on a domain of points.
on any diagonal in the domain, that is, in locations with , depend only on those on the previous diagonal, that is, with . Therefore it is possible to process the operations on such a diagonal, or `wavefront' , in parallel (see figure ), or have a vector computer pipeline them; see Van der Vorst [205][203].
Another way of vectorizing the solution of the triangular factors is to use some form of expansion of the inverses of the factors. Consider for a moment a lower triangular matrix, normalized to the form where is strictly lower triangular). Its inverse can be given as either of the following two series:
(The first series is called a ``Neumann expansion'', the second an ``Euler expansion''. Both series are finite, but their length prohibits practical use of this fact.) Parallel or vectorizable preconditioners can be derived from an incomplete factorization by taking a small number of terms in either series. Experiments indicate that a small number of terms, while giving high execution rates, yields almost the full precision of the more recursive triangular solution (see Axelsson and Eijkhout [15] and Van der Vorst [201]).
There are some practical considerations in implementing these expansion algorithms. For instance, because of the normalization the in equation () is not . Rather, if we have a preconditioner (as described in section ) described by
then we write
Now we can choose whether or not to store the product . Doing so doubles the storage requirements for the matrix, not doing so means that separate multiplications by and have to be performed in the expansion.
Figure: Preconditioning step algorithm for
a Neumann expansion
of an incomplete factorization .
Suppose then that the products and have been stored. We then define by
and replace solving a system for by computing . This algorithm is given in figure .