The term preconditioning is often used in a numerical method as an extra step designed to accelerate the convergence. Preconditioning is an important technique in iterative methods for solving large systems of linear equations. Preconditioning for eigenvalue problems is not as straightforward and has been slower to catch on. There is a large set of literature in Russian and some of it is discussed in §11.3. Ruhe and Wiberg [380,373,374] used SOR and the conjugate gradient method to find eigenvalues in the 1970s. Around the same time, quantum chemists, including Davidson [99], developed correction methods for their large symmetric matrices. Later Scott recognized that Davidson's approach is a form of preconditioning, and generalizations were given in [335]. A recent development, the Jacobi-Davidson method, has been presented in detail in earlier chapters.
In this introduction, we attempt to point out some connections, first with some comparisons between preconditioning eigenvalues and linear equations. Then there is a brief discussion of preconditioned versus unpreconditioned methods. And finally, connections are given between various preconditioned eigensolvers.
In this paragraph, we look at how the task of preconditioning compares for eigenvalues versus linear equations. We give several reasons why preconditioning eigenvalue problems is more difficult. The matrix being preconditioned is generally nearly singular. Interior eigenvalues are difficult, similar to preconditioning indefinite systems of linear equations. Unlike preconditioned linear equations, preconditioned eigenvalue methods may need some sort of restarting, even in the symmetric case. However, restarting is often necessary even for unpreconditioned Krylov eigenvalue methods that compute the eigenvector. Finally, we mention that eigenvalue preconditioning methods tend to focus on computing one eigenvalue at a time. This is different from Krylov eigenvalue methods. There is no similar added difficulty in going from unpreconditioned to preconditioned linear equation methods.
In spite of these difficulties, which no doubt contributed to the slow development in preconditioning eigenvalue problems, there is much potential in situations where a good preconditioner can be found. A number of factors should go into the decision as to whether a preconditioning method is best. These include the effectiveness of the preconditioner, the number of eigenvalues desired, whether eigenvectors are needed, and the competing methods in the particular situation. Generally, preconditioning is more competitive if not too many eigenvalues are desired and if computing the eigenvectors is required. We also want to mention that finding eigenvalues far in the interior of the spectrum is so tough for Krylov methods that either preconditioning or complete factorization of a shifted matrix must be considered. Mesh eigenproblems (see §11.3.1) is one important practical example of a class of eigenproblems, where preconditioning seems particularly promising.
The rest of this introduction looks at similarities between the various
preconditioning methods that will be described in this chapter. It was
already mentioned that preconditioning methods generally find just one
eigenvalue at a time. This is quite
different from an unpreconditioned Krylov subspace method, which can find
several eigenvalues simultaneously. Sometimes a block approach is used to
alleviate this problem for a preconditioning method;
e.g., see §11.3.9. Another similarity among preconditioning
methods is that global convergence may be trickier.
Some methods, e.g., the locally optimal
preconditioned conjugate gradient method of §11.3.8,
are practically very robust and typically converge with a random initial
guess in numerical experiments.
However, some other methods, e.g., based on inner-outer iterations,
work better if reasonable approximations to eigenvectors are
available as starting vectors. This leads us to the topic of hybrid
methods, with a Krylov approach to start with and a preconditioning
approach to follow. In fact, a number of operators could be used for the
Krylov method including ,
(where
is an approximation to a
shifted matrix
), and even the preconditioned operator
, for
and
fixed. Hybrid methods could
deliver more robustness, and they deserve further examination.
We look at one more similarity among eigenvalue preconditioning methods.
For ease of presenting, we will assume there is a standard symmetric
eigenvalue problem. It appears that common to all methods is the
operator
Next, the RQI with the preconditioned
conjugate gradient method (PCG) of §11.3.7 has the operator
in the inner iteration of PCG, where
is the Rayleigh quotient and
is the preconditioner. And
Davidson's original method, Algorithm 11.2 of §11.2.4 (see also
§11.3.6)
for standard eigenvalue problems, uses
,
where
is the diagonal of
and
is the
current approximate eigenvalue that is found by the Rayleigh-Ritz
procedure. Again this fits the form of (11.1).
So if the same
preconditioning is used, RQI (with PCG) and the Davidson method have the same
operator at the core of the method. The difference is that the Davidson
method changes
at every step, while RQI has
fixed during a
run of PCG, and also the Davidson method uses the vectors it generates to
solve an eigenvalue problem, while RQI uses them for solving linear
equations. The linear equations in RQI are an intermediate step toward
the eventual goal of solving the eigenvalue problem.
Finally we look at Jacobi-Davidson method; see §11.2.5
and §11.3.7. What is interesting is that the operator
(11.1) shows up twice. First, in the outer iteration, from the
Jacobi-Davidson correction equation (4.49), letting
, we have
. Second and more importantly, in the inner iteration
of generating the approximate solution to
using a
preconditioned Krylov subspace iterative method, the operator is
, where
is the preconditioner for
. This operator is similar to
(11.1), with an added deflation of
. The subspace generated
by this preconditioned operator is used for the solution of linear equations.
With many eigenvalue preconditioning methods using at their core closely
related operators, results with the same preconditioner might be expected
to be similar. This is not the case. The
implementation particular to each method can make a big difference, both in
convergence rates and in expenses. For example, it will be discussed in
the next section that inexact Cayley transforms (and the
Jacobi-Davidson method) of §11.2.5
are much better than inexact shift-and-invert Krylov methods, even though
they are similar. This is due partly to the fact that the
operator in the shift-and-invert method has
fixed,
instead of approaching an eigenvalue (the
is changed in the inexact
rational Krylov method, also mentioned in the next section). We can say
that all the preconditioning methods share a limitation. They
can only be as effective as
allows them to be. This
depends on how good a preconditioner
is.
The Davidson method is perhaps the purest preconditioning method in the sense that with every application of the operator, it uses the best information known (new approximations to the desired eigenpair), and it applies the vectors it generates directly to an eigenvalue problem. On the other hand, there is significant expense for every iteration. Meanwhile, for some other methods, including RQI and Jacobi-Davidson, the inner iteration of a preconditioned linear equations iterative method can be more efficient in both expense and storage. Jacobi-Davidson has this efficiency and also greater robustness than RQI because it uses a subspace for the eigenvalue problem instead of one vector, and because it uses the residual of an approximate eigenvector as the right-hand side of the linear equations, instead of the vector itself.
Finally, the simultaneous PCG method,
Algorithm 11.11 of §11.3.9,
the most promising method of §11.3,
has advantages of fast convergence of the Davidson method without its
significant cost and applies the PCG directly
to an eigenvalue problem, thus removing any need for inner iterations.
Numerical experiments show great practical robustness of the method
with respect to initial approximations, a particular choice of the
preconditioner, and condition number of the matrix .
The two sections that follow both look at some of the details of preconditioning eigenvalue problems, but with some different insights.