Personnel at Rice U.
In addition to the standard drivers, we developed a special driver for computing a partial singular value decomposition (SVD) and another for banded problems. We also added support for complex (single- or double-precision) arithmetic. These additions greatly increase the applicability of ARPACK.
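To convey the idea behind a partial-SVD driver, the following is a minimal NumPy sketch: it finds the k largest singular triplets of A by iterating on the operator A^T A and then solving a small projected problem. This is illustrative only; ARPACK's actual driver is built on a Lanczos recurrence with implicit restarting, and the function name and parameters here are hypothetical.

```python
import numpy as np

def partial_svd(A, k, n_iter=200, seed=0):
    # Hypothetical sketch: subspace iteration on A^T A stands in for
    # ARPACK's implicitly restarted Lanczos process.
    rng = np.random.default_rng(seed)
    m, n = A.shape
    V = rng.standard_normal((n, k))
    for _ in range(n_iter):
        V, _ = np.linalg.qr(V)          # re-orthonormalize the block
        V = A.T @ (A @ V)               # apply the operator A^T A
    V, _ = np.linalg.qr(V)
    # Rayleigh-Ritz step: SVD of the small m x k matrix A V
    B = A @ V
    U, s, Wt = np.linalg.svd(B, full_matrices=False)
    return U, s, V @ Wt.T               # approximate top-k triplets
```

The projected SVD at the end extracts singular values of A restricted to the converged subspace, which is the same Rayleigh-Ritz idea the drivers use.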
We implemented a Matlab version called ``speig'' that Mathworks plans to adopt in the near future as the procedure for solving large sparse eigenproblems.
We have nearly completed a users' guide for ARPACK.
Parallel ARPACK (PARPACK) is now provided as an enhancement to the serial ARPACK package. Its redesign has greatly simplified software maintenance and enforces a level of algorithmic consistency between the parallel and serial implementations.
The portability and scalability of PARPACK have been demonstrated on multiple platforms, including the SGI Power Challenge cluster at Rice, the IBM SP2 at Maui, Sun workstations, the Intel Paragon at Sandia, and the CRAY T3D at the Ohio Supercomputer Center. This work is reported in the proceedings of the 1996 Copper Mountain Conference on Iterative Methods and will appear as a forthcoming CAAM Tech. Rept. (by K. Maschhoff and D. Sorensen).
PARPACK is available via anonymous ftp from Rice University and from Netlib under the SCALAPACK directory.
The deflation technique of "locking" reduces the serial bottleneck caused by operations on the projected matrix by providing a mechanism for keeping the active portion of this matrix small. Operations on the projected matrix due to the implicit restart mechanism are replicated on each processor; this redundancy is a potential bottleneck that can prevent scalability when approximations to a large percentage of the eigenvalues are required. The locking scheme diminishes the potential for such a bottleneck.
The out-of-core solver is an extension of the symmetric ARPACK code with locking which accommodates situations where converged Ritz vectors must be stored on disk. Different strategies for maintaining orthogonality to these out-of-core vectors are under investigation.
The out-of-core solver is currently being tested at the MPRCL at Sandia Nat. Labs on the Intel Paragon.
Recent work by Lehoucq and Sorensen indicates that an unblocked Arnoldi/Lanczos method coupled with a deflation strategy can be used reliably to compute multiple and/or clustered eigenvalues provided an appropriate convergence criterion is selected. A blocked reduction may, however, prove to be more efficient in many situations.
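A toy example makes the issue concrete. In exact arithmetic, the Krylov space generated by a single starting vector sees only one direction inside a repeated eigenvalue's eigenspace, so an unblocked method alone cannot reveal the multiplicity; deflation (restarting orthogonal to converged vectors) recovers the remaining copies. The following NumPy sketch (illustrative data, power iteration standing in for Arnoldi) demonstrates both halves of the claim for a double eigenvalue.

```python
import numpy as np

A = np.diag([3.0, 3.0, 1.0])            # double eigenvalue at 3
rng = np.random.default_rng(0)
v = rng.standard_normal(3)

# The Krylov matrix [v, Av, A^2 v] has rank 2, not 3: only one
# direction of the two-dimensional eigenspace for 3 is visible.
K = np.column_stack([v, A @ v, A @ A @ v])
assert np.linalg.matrix_rank(K, tol=1e-8) == 2

# First pass: converge one eigenvector of the eigenvalue 3.
x1 = v / np.linalg.norm(v)
for _ in range(200):
    x1 = A @ x1
    x1 /= np.linalg.norm(x1)

# Deflated pass: iterate orthogonally to the converged (locked)
# vector x1, recovering the second copy of the eigenvalue 3.
x2 = rng.standard_normal(3)
for _ in range(200):
    x2 -= x1 * (x1 @ x2)                # stay orthogonal to x1
    x2 = A @ x2
    x2 /= np.linalg.norm(x2)

lam1, lam2 = x1 @ A @ x1, x2 @ A @ x2   # both Rayleigh quotients near 3
```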
Recently we have begun to evaluate the differences between the blocked and unblocked variants of Arnoldi/Lanczos methods in order to determine the necessity and feasibility of providing a blocked method in future versions of ARPACK. Basically, we would like to assess whether the benefits of a blocked reduction outweigh the added computational complexity.
In our study we first consider the careful numerical implementation of block methods. Robust deflation, orthogonalization, and restarting strategies are presented. In particular, we demonstrate how the Implicitly Restarted Arnoldi (IRA) method may be extended to block formulations. Finally, we perform a series of careful numerical experiments to assess the differences between the blocked and unblocked variants. The goal of our study is to provide the numerical analyst and software developer with a better understanding of the many issues involved.
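One standard argument for blocking can be shown in a few lines: when the block is at least as wide as a cluster, the Ritz values for the clustered eigenvalues converge at a rate governed by the gap to the eigenvalues outside the cluster rather than by the tiny gap inside it. The sketch below (hypothetical data, orthogonal subspace iteration standing in for a block Lanczos reduction) resolves a tight pair {2.0, 1.999} easily with a block of width two.

```python
import numpy as np

A = np.diag([2.0, 1.999, 0.5])          # tight cluster {2.0, 1.999}
rng = np.random.default_rng(0)

def block_ritz(A, p, iters):
    # Orthogonal (subspace) iteration with block width p; a stand-in
    # for a block Lanczos reduction in this illustration.
    V, _ = np.linalg.qr(rng.standard_normal((A.shape[0], p)))
    for _ in range(iters):
        V, _ = np.linalg.qr(A @ V)
    # Ritz values of the small p x p projected matrix
    return np.sort(np.linalg.eigvalsh(V.T @ A @ V))[::-1]

# A block of 2 covers the cluster; convergence is driven by the
# gap down to 0.5, not by the 0.001 gap inside the cluster.
ritz = block_ritz(A, 2, 60)
```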
A blocked version has been fully implemented in a Matlab test code, and preliminary results of this study will be presented at the SIAM Conference on Sparse Matrices in October. This work is in a preliminary stage; we expect to complete it near the end of the 96-97 academic year.
Software released during 95-96: