

Rational Krylov Subspace Method
 A. Ruhe

The rational Krylov subspace method computes eigenvalues and eigenvectors of a regular pencil,

\begin{displaymath}
(A-\lambda B)x=0\;.
\end{displaymath} (230)

It is a generalization of the shift-and-invert Arnoldi algorithm, in which several factorizations with different shifts are used in one run. In this way we get good approximations to all the eigenvalues in a union of regions around the chosen shifts; see [378]. A typical application is model reduction of a linear dynamical system, where one wants the response over a prescribed range of frequencies; see, e.g., [379] or [184].

Rational Krylov starts as a shifted and inverted Arnoldi iteration with shift $\sigma_1$ and starting vector $v_1$. It computes an orthonormal basis $V_j$ using the basic recursion,

\begin{displaymath}
(A-\sigma_1B)^{-1}BV_j=V_{j+1}H_{j+1,j}\;.
\end{displaymath} (231)

(The subscripts indicate matrix sizes: $H_{j+1,j}$ is $(j+1) \times j$, and $V_j$ has $j$ columns, each of full length $n$.) We may compute Ritz values from $H_{j,j}$ and approximate eigenvectors from $V_j$ in the usual way. Solve the small eigenvalue problem
\begin{displaymath}
H_{j,j}Z^{(j)}=Z^{(j)}\Theta^{(j)}
\end{displaymath} (232)

and get the Ritz values $\lambda^{(j)}_i=\sigma_1+1/\theta_i^{(j)}$ and Ritz vectors $y^{(j)}_i=V_jz_{.,i}^{(j)}$ for $i=1,\dots,j$. After a moderate number of steps $j$, we get good approximations to those eigenvalues $\lambda_i$ that are closest to the shift $\sigma_1$.
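As a concrete illustration, the following is a minimal sketch of such a run in Python with NumPy and SciPy. The routine names si_arnoldi and ritz_pairs and all implementation details are our own illustrative choices, not part of the published algorithm; a production code would reorthogonalize and test convergence.

\begin{verbatim}
# Minimal sketch of shift-and-invert Arnoldi for (A - lambda B)x = 0;
# illustrative only, with no locking, purging, or restarting.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def si_arnoldi(A, B, sigma, v1, m):
    """m steps of Arnoldi with the operator (A - sigma*B)^{-1} B,
    recursion (231).  Returns V (n x (m+1)) and H ((m+1) x m)."""
    n = A.shape[0]
    lu = lu_factor(A - sigma * B)          # one factorization per shift
    V = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    V[:, 0] = v1 / np.linalg.norm(v1)
    for k in range(m):
        r = lu_solve(lu, B @ V[:, k])      # r = (A - sigma B)^{-1} B v_{k+1}
        h = V[:, :k + 1].conj().T @ r      # Gram-Schmidt coefficients
        r = r - V[:, :k + 1] @ h           # (reorthogonalize in practice)
        H[:k + 1, k] = h
        H[k + 1, k] = np.linalg.norm(r)
        V[:, k + 1] = r / H[k + 1, k]
    return V, H

def ritz_pairs(V, H, sigma):
    """Ritz values lambda = sigma + 1/theta and Ritz vectors y = V z
    from the small eigenproblem (232)."""
    theta, Z = np.linalg.eig(H[:-1, :])
    return sigma + 1.0 / theta, V[:, :-1] @ Z
\end{verbatim}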

Now the idea behind the rational Krylov algorithm is to continue with a new shift $\sigma_2$ when we want approximations to the eigenvalues close to that new shift. The tricky thing is to avoid throwing away all the information gathered in the basis $V_j$, computed using the old shift $\sigma_1$. This is indeed possible if we replace the basis $V_{j+1}$ with a new basis $W_{j+1}$, which spans the same subspace as $V_{j+1}$ but can be interpreted as the orthonormal basis of an Arnoldi recursion (231) with the new shift $\sigma_2$. At the same time the trapezoidal (Hessenberg) matrix $H_{j+1,j}$ is replaced by a new matrix of the same form, $\tilde{H}_{j+1,j}$.

Rewrite the recursion (231) as

\begin{displaymath}BV_j=(A-\sigma_1B)V_{j+1}H_{j+1,j}\;.\end{displaymath}

Add the matrix $(\sigma_1-\sigma_2)BV_{j+1}H_{j+1,j}$ to both sides, so that $\sigma_1$ is replaced by the new shift $\sigma_2$ in the right-hand side,

\begin{displaymath}(\sigma_1-\sigma_2)BV_{j+1}H_{j+1,j}+BV_j=(A-\sigma_2B)V_{j+1}H_{j+1,j}\;,\end{displaymath}

and note that $B$ is the only large matrix in the left-hand side, so it can be factored out,

\begin{displaymath}
BV_{j+1}(I_{j+1,j}+(\sigma_1-\sigma_2)H_{j+1,j})=(A-\sigma_2B)V_{j+1}H_{j+1,j}\;.\end{displaymath}

The matrix $K_{j+1,j}=I_{j+1,j}+(\sigma_1-\sigma_2)H_{j+1,j}$ in the left-hand side is trapezoidal with the same pattern of nonzeros as $H_{j+1,j}$, and we may come back to an expression of the same form as the recursion (231) by premultiplying with $(A-\sigma_2B)^{-1}$ to get

\begin{displaymath}(A-\sigma_2B)^{-1}BV_{j+1}K_{j+1,j}=V_{j+1}H_{j+1,j}\;.\end{displaymath}

It remains to get rid of the factor $K_{j+1,j}$ in the left-hand side, and this is done in the following way. Compute a QR decomposition of $K_{j+1,j}$,

\begin{displaymath}K_{j+1,j}=Q_{j+1,j+1}\left[\begin{array}{c}{R_{j,j}}\\ 0\end{array}\right]\;,\end{displaymath}

substitute, and get

\begin{displaymath}(A-\sigma_2B)^{-1}BV_{j+1}Q_{j+1,j+1}
\left[\begin{array}{c}{R_{j,j}}\\ 0\end{array}\right]=V_{j+1}H_{j+1,j}\;.\end{displaymath}

Multiply from the right with the inverse of the triangular matrix $R_{j,j}$, which we know is nonsingular if the Hessenberg matrix $H_{j+1,j}$ is unreduced, and let $Q_{j+1,j}$ denote the first $j$ columns of $Q_{j+1,j+1}$,
\begin{displaymath}
(A-\sigma_2B)^{-1}BV_{j+1}Q_{j+1,j}=V_{j+1}H_{j+1,j}R_{j,j}^{-1}\;.
\end{displaymath} (233)

Introduce the new orthonormal basis $V_{j+1}Q_{j+1,j+1}$ and note that the shifted and inverted pencil (230) is represented in this new basis by the full matrix,

\begin{displaymath}L_{j+1,j}=Q_{j+1,j+1}^{\ast}H_{j+1,j}R_{j,j}^{-1},\end{displaymath}

in this new basis. We may transform this $L_{j+1,j}$ into trapezoidal (upper Hessenberg) form by applying the Householder algorithm from the bottom upwards (see [377]),

\begin{displaymath}
L_{j+1,j}= \left[\begin{array}{cc}P_j & 0 \\ 0&
1\end{array}\right]\tilde{H}_{j+1,j}P_j^{\ast}\;.\end{displaymath}
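A possible realization of this bottom-up Householder reduction is sketched below, in the same illustrative Python style as above; the routine hessenberg_from_bottom is our naming, and [377] describes the original procedure.

\begin{verbatim}
import numpy as np

def hessenberg_from_bottom(L):
    """Reduce the (j+1) x j matrix L to upper Hessenberg form by
    Householder reflections applied from the bottom row upward,
    so that L = diag(P,1) @ Htil @ P^*, i.e.
    Htil = diag(P,1)^* @ L @ P.  Returns (Htil, P)."""
    jp1, j = L.shape
    H = np.array(L, dtype=complex)
    P = np.eye(j, dtype=complex)
    for r in range(jp1 - 1, 1, -1):       # rows j+1, ..., 3 (1-indexed)
        y = H[r, :r].conj()               # conjugate of leading row part
        nrm = np.linalg.norm(y)
        if nrm == 0.0:
            continue
        beta = -np.exp(1j * np.angle(y[-1])) * nrm
        v = y.copy()
        v[-1] -= beta                     # Householder vector: R y = beta e_r
        v /= np.linalg.norm(v)
        # R = I - 2 v v^* is Hermitian unitary; apply it on the right
        # (columns 1..r), on the left (rows 1..r), and accumulate into P.
        H[:, :r] -= 2.0 * np.outer(H[:, :r] @ v, v.conj())
        H[:r, :] -= 2.0 * np.outer(v, v.conj() @ H[:r, :])
        P[:, :r] -= 2.0 * np.outer(P[:, :r] @ v, v.conj())
    return H, P
\end{verbatim}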

Multiply the recursion (233) by the orthogonal $P_j$ from the right and get

\begin{displaymath}(A-\sigma_2B)^{-1}BV_{j+1}Q_{j+1,j}P_j=V_{j+1}Q_{j+1,j+1}
\left[\begin{array}{cc}P_j & 0 \\ 0& 1\end{array}\right]\tilde{H}_{j+1,j}\end{displaymath}

or

\begin{displaymath}(A-\sigma_2B)^{-1}BW_{j}=W_{j+1}\tilde{H}_{j+1,j},\end{displaymath}

an Arnoldi recursion (231) with the new shift $\sigma_2$, the new orthonormal basis,

\begin{displaymath}W_{j+1}=V_{j+1}Q_{j+1,j+1}
\left[\begin{array}{cc}P_j & 0 \\ 0& 1\end{array}\right]\;,\end{displaymath}

and the new trapezoidal (upper Hessenberg) matrix $\tilde{H}_{j+1,j}$.
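Collecting the steps of this section, a sketch of the whole change of shift might read as follows. It is again our own illustrative code, reusing hessenberg_from_bottom from the previous sketch; a real implementation would also guard against an ill-conditioned $R_{j,j}$.

\begin{verbatim}
import numpy as np
from scipy.linalg import qr, solve_triangular

def change_shift(V, H, s_old, s_new):
    """Translate the Arnoldi recursion (231) from shift s_old to s_new:
    returns W and Htil with (A - s_new B)^{-1} B W_j = W_{j+1} Htil."""
    jp1, j = H.shape
    K = np.eye(jp1, j) + (s_old - s_new) * H           # K_{j+1,j}
    Q, R = qr(K)                                       # Q is (j+1) x (j+1)
    M = Q.conj().T @ H                                 # Q^* H
    L = solve_triangular(R[:j, :], M.T, trans='T').T   # L = Q^* H R^{-1}
    Htil, P = hessenberg_from_bottom(L)                # see sketch above
    G = np.eye(jp1, dtype=complex)
    G[:j, :j] = P                                      # G = diag(P, 1)
    return V @ (Q @ G), Htil                           # W = V Q diag(P,1)
\end{verbatim}

The triangular solve applies $R_{j,j}^{-1}$ without forming the inverse explicitly.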

We stress once again that all these transformations are effected without performing any operations on the large $n \times n$ matrices $A$ and $B$. We may even accumulate all the $Q$ and $P$ factors and avoid forming $W$ explicitly, thus avoiding all extra work on long vectors. Note also that even though we get an Arnoldi recursion, it does not start at the original starting vector $v_1$ but at $w_1$, which is a linear combination of all the vectors computed during the whole rational Krylov run.

In a practical implementation, this is combined with locking, purging, and implicit restart. First, run shifted and inverted Arnoldi with the first shift $\sigma_1$. When an appropriate number of eigenvalues around $\sigma_1$ have converged, say after $m$ steps, lock these converged eigenvalues and purge those that lie altogether outside the interesting region, leaving a $j$-step Arnoldi recursion (231) with $j \leq m$. Then introduce the new shift $\sigma_2$ and follow the description in this section to get a new basis $W_{j+1}$, which replaces $V_{j+1}$. Start at the new shift by operating on the last vector of this new basis

\begin{displaymath}r:=(A-\sigma_2B)^{-1}Bv_{j+1}\end{displaymath}

and get the next basis vector $v_{j+2}$ in the Arnoldi recursion (231) with the new shift $\sigma_2$. Continue until we get convergence for a set of eigenvalues around $\sigma_2$, and repeat the same procedure with new shifts until either all interesting eigenvalues have converged or all the shifts in the prescribed frequency range have been used.


\begin{algorithm}{Rational Krylov Subspace Method for GNHEP
}
{
\begin{tabbing}
(1) \= \quad \= \quad \= \kill
(1) \> start with starting vector $v_1$, $\Vert v_1 \Vert = 1$; set $j = 0$ \\
(2) \> for $i=1,2,\dots$ until all interesting eigenvalues have converged \\
(3) \> \> for $k=j+1,\dots,m$, expand the recursion (231) with shift $\sigma_i$: \\
 \> \> \> $r := (A-\sigma_i B)^{-1} B v_k$ \\
 \> \> \> $h_{.,k} := V_k^{\ast} r$; \quad $r := r - V_k h_{.,k}$ \\
 \> \> \> $h_{k+1,k} := \Vert r \Vert$; \quad $v_{k+1} := r/h_{k+1,k}$ \\
(4) \> \> compute Ritz values from $H_{m,m}$; lock the converged ones around $\sigma_i$ \\
 \> \> and purge those outside the region of interest, \\
 \> \> leaving a $j$-step recursion (231) with $j \leq m$ \\
(5) \> \> choose the next shift $\sigma_{i+1}$ \\
(6) \> \> factorize $I_{j+1,j}+(\sigma_i-\sigma_{i+1})H_{j+1,j} = Q_{j+1,j+1}\left[\begin{array}{c}{R_{j,j}}\\ 0\end{array}\right]$ \\
(7) \> \> set $L_{j+1,j} := Q_{j+1,j+1}^{\ast} H_{j+1,j} R_{j,j}^{-1}$ and reduce it to Hessenberg form, \\
 \> \> $L_{j+1,j} = \left[\begin{array}{cc}P_j & 0 \\ 0 & 1\end{array}\right]\tilde{H}_{j+1,j}P_j^{\ast}$ \\
(8) \> \> set $H_{j+1,j} := \tilde{H}_{j+1,j}$ and
$V_{j+1} := V_{j+1} Q_{j+1,j+1} \left[\begin{array}{cc}P_j & 0 \\ 0& 1\end{array}\right]$\end{tabbing}}
\end{algorithm}

Notice that in step (3) we perform just $m-j$ operations with the shifted and inverted matrix; the first $j$ basis vectors are those saved from the previous shift $\sigma_{i-1}$. The lock, purge, and deflation operations in step (4) are very similar to those for implicitly restarted Arnoldi as described in §7.6; we have actually used a non-Hermitian version of thick restart [463]. In step (5) we choose the new shift $\sigma_{i+1}$ as a value in the complex plane where we are interested in studying the behavior of the matrix pencil (230). However, a choice too close to an eigenvalue will result in a nearly singular matrix $A-\sigma_{i+1} B$. Moreover, a new shift at a nearly converged Ritz value will make the upper triangular factor $R_{j,j}$ in step (6) close to singular.
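A skeleton of the outer iteration, again as an illustrative Python sketch reusing change_shift from the earlier sketch, is given below. The locking and purging of step (4) are deliberately omitted, so the basis here simply grows by $m$ vectors per shift.

\begin{verbatim}
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def extend(A, B, sigma, V, H, steps):
    """Grow the recursion (231) by `steps` Arnoldi steps at shift sigma,
    continuing from the last basis vector as described in the text."""
    lu = lu_factor(A - sigma * B)
    n, j1 = V.shape
    j = j1 - 1
    V = np.hstack([V, np.zeros((n, steps), dtype=complex)])
    H = np.pad(H, ((0, steps), (0, steps)))
    for k in range(j, j + steps):
        r = lu_solve(lu, B @ V[:, k])
        h = V[:, :k + 1].conj().T @ r
        r = r - V[:, :k + 1] @ h
        H[:k + 1, k] = h
        H[k + 1, k] = np.linalg.norm(r)
        V[:, k + 1] = r / H[k + 1, k]
    return V, H

def rational_krylov(A, B, shifts, v1, m):
    """m Arnoldi steps per shift; the locking/purging of step (4) is
    omitted, so the basis keeps all computed vectors."""
    V = (v1 / np.linalg.norm(v1)).astype(complex)[:, None]
    H = np.zeros((1, 0), dtype=complex)
    for i, sigma in enumerate(shifts):
        V, H = extend(A, B, sigma, V, H, m)
        if i + 1 < len(shifts):             # steps (5)-(8): next shift
            V, H = change_shift(V, H, sigma, shifts[i + 1])
    theta, Z = np.linalg.eig(H[:-1, :])     # Ritz values at last shift
    return shifts[-1] + 1.0 / theta, V[:, :-1] @ Z
\end{verbatim}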

When the second matrix $B$ in the original pencil (230) is positive definite, we may choose to make the basis $V_j$ $B$-orthogonal. If, in addition, $A$ is Hermitian, then so is $H_{j,j}$, and we may work with tridiagonal matrices; see [322].
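As an illustration of this variant, a single $B$-orthogonalization step might look as follows (our own sketch, assuming $B$ is Hermitian positive definite and the columns of $V$ are already $B$-orthonormal):

\begin{verbatim}
import numpy as np

def b_orthogonalize(V, B, r):
    """Orthogonalize r against the B-orthonormal columns of V in the
    inner product <x, y>_B = y^* B x, and B-normalize the remainder."""
    h = V.conj().T @ (B @ r)                    # B-inner products <r, v_k>_B
    r = r - V @ h
    beta = np.sqrt((r.conj() @ (B @ r)).real)   # B-norm of the remainder
    return r / beta, np.append(h, beta)
\end{verbatim}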

This formulation of rational Krylov differs from the one used earlier [378], where the same set of basis vectors $V_j$ is used throughout and the pencil (230) is transformed into two Hessenberg matrices. We have found that the present formulation gives a more natural way to continue with a new shift and to signal when we risk losing accuracy due to linear dependence.

See the report [379] for a numerical example from model reduction.

