*> \brief ZHEEVX_2STAGE computes the eigenvalues and, optionally, the left and/or right eigenvectors for HE matrices * * @precisions fortran z -> s d c * * =========== DOCUMENTATION =========== * * Online html documentation available at * http://www.netlib.org/lapack/explore-html/ * *> \htmlonly *> Download ZHEEVX_2STAGE + dependencies *> *> [TGZ] *> *> [ZIP] *> *> [TXT] *> \endhtmlonly * * Definition: * =========== * * SUBROUTINE ZHEEVX_2STAGE( JOBZ, RANGE, UPLO, N, A, LDA, VL, VU, * IL, IU, ABSTOL, M, W, Z, LDZ, WORK, * LWORK, RWORK, IWORK, IFAIL, INFO ) * * IMPLICIT NONE * * .. Scalar Arguments .. * CHARACTER JOBZ, RANGE, UPLO * INTEGER IL, INFO, IU, LDA, LDZ, LWORK, M, N * DOUBLE PRECISION ABSTOL, VL, VU * .. * .. Array Arguments .. * INTEGER IFAIL( * ), IWORK( * ) * DOUBLE PRECISION RWORK( * ), W( * ) * COMPLEX*16 A( LDA, * ), WORK( * ), Z( LDZ, * ) * .. * * *> \par Purpose: * ============= *> *> \verbatim *> *> ZHEEVX_2STAGE computes selected eigenvalues and, optionally, eigenvectors *> of a complex Hermitian matrix A using the 2stage technique for *> the reduction to tridiagonal. Eigenvalues and eigenvectors can *> be selected by specifying either a range of values or a range of *> indices for the desired eigenvalues. *> \endverbatim * * Arguments: * ========== * *> \param[in] JOBZ *> \verbatim *> JOBZ is CHARACTER*1 *> = 'N': Compute eigenvalues only; *> = 'V': Compute eigenvalues and eigenvectors. *> Not available in this release. *> \endverbatim *> *> \param[in] RANGE *> \verbatim *> RANGE is CHARACTER*1 *> = 'A': all eigenvalues will be found. *> = 'V': all eigenvalues in the half-open interval (VL,VU] *> will be found. *> = 'I': the IL-th through IU-th eigenvalues will be found. *> \endverbatim *> *> \param[in] UPLO *> \verbatim *> UPLO is CHARACTER*1 *> = 'U': Upper triangle of A is stored; *> = 'L': Lower triangle of A is stored. *> \endverbatim *> *> \param[in] N *> \verbatim *> N is INTEGER *> The order of the matrix A. N >= 0. *> \endverbatim *> *> \param[in,out] A *> \verbatim *> A is COMPLEX*16 array, dimension (LDA, N) *> On entry, the Hermitian matrix A. If UPLO = 'U', the *> leading N-by-N upper triangular part of A contains the *> upper triangular part of the matrix A. If UPLO = 'L', *> the leading N-by-N lower triangular part of A contains *> the lower triangular part of the matrix A. *> On exit, the lower triangle (if UPLO='L') or the upper *> triangle (if UPLO='U') of A, including the diagonal, is *> destroyed. *> \endverbatim *> *> \param[in] LDA *> \verbatim *> LDA is INTEGER *> The leading dimension of the array A. LDA >= max(1,N). *> \endverbatim *> *> \param[in] VL *> \verbatim *> VL is DOUBLE PRECISION *> If RANGE='V', the lower bound of the interval to *> be searched for eigenvalues. VL < VU. *> Not referenced if RANGE = 'A' or 'I'. *> \endverbatim *> *> \param[in] VU *> \verbatim *> VU is DOUBLE PRECISION *> If RANGE='V', the upper bound of the interval to *> be searched for eigenvalues. VL < VU. *> Not referenced if RANGE = 'A' or 'I'. *> \endverbatim *> *> \param[in] IL *> \verbatim *> IL is INTEGER *> If RANGE='I', the index of the *> smallest eigenvalue to be returned. *> 1 <= IL <= IU <= N, if N > 0; IL = 1 and IU = 0 if N = 0. *> Not referenced if RANGE = 'A' or 'V'. *> \endverbatim *> *> \param[in] IU *> \verbatim *> IU is INTEGER *> If RANGE='I', the index of the *> largest eigenvalue to be returned. *> 1 <= IL <= IU <= N, if N > 0; IL = 1 and IU = 0 if N = 0. *> Not referenced if RANGE = 'A' or 'V'. *> \endverbatim *> *> \param[in] ABSTOL *> \verbatim *> ABSTOL is DOUBLE PRECISION *> The absolute error tolerance for the eigenvalues. *> An approximate eigenvalue is accepted as converged *> when it is determined to lie in an interval [a,b] *> of width less than or equal to *> *> ABSTOL + EPS * max( |a|,|b| ) , *> *> where EPS is the machine precision. If ABSTOL is less than *> or equal to zero, then EPS*|T| will be used in its place, *> where |T| is the 1-norm of the tridiagonal matrix obtained *> by reducing A to tridiagonal form. *> *> Eigenvalues will be computed most accurately when ABSTOL is *> set to twice the underflow threshold 2*DLAMCH('S'), not zero. *> If this routine returns with INFO>0, indicating that some *> eigenvectors did not converge, try setting ABSTOL to *> 2*DLAMCH('S'). *> *> See "Computing Small Singular Values of Bidiagonal Matrices *> with Guaranteed High Relative Accuracy," by Demmel and *> Kahan, LAPACK Working Note #3. *> \endverbatim *> *> \param[out] M *> \verbatim *> M is INTEGER *> The total number of eigenvalues found. 0 <= M <= N. *> If RANGE = 'A', M = N, and if RANGE = 'I', M = IU-IL+1. *> \endverbatim *> *> \param[out] W *> \verbatim *> W is DOUBLE PRECISION array, dimension (N) *> On normal exit, the first M elements contain the selected *> eigenvalues in ascending order. *> \endverbatim *> *> \param[out] Z *> \verbatim *> Z is COMPLEX*16 array, dimension (LDZ, max(1,M)) *> If JOBZ = 'V', then if INFO = 0, the first M columns of Z *> contain the orthonormal eigenvectors of the matrix A *> corresponding to the selected eigenvalues, with the i-th *> column of Z holding the eigenvector associated with W(i). *> If an eigenvector fails to converge, then that column of Z *> contains the latest approximation to the eigenvector, and the *> index of the eigenvector is returned in IFAIL. *> If JOBZ = 'N', then Z is not referenced. *> Note: the user must ensure that at least max(1,M) columns are *> supplied in the array Z; if RANGE = 'V', the exact value of M *> is not known in advance and an upper bound must be used. *> \endverbatim *> *> \param[in] LDZ *> \verbatim *> LDZ is INTEGER *> The leading dimension of the array Z. LDZ >= 1, and if *> JOBZ = 'V', LDZ >= max(1,N). *> \endverbatim *> *> \param[out] WORK *> \verbatim *> WORK is COMPLEX*16 array, dimension (MAX(1,LWORK)) *> On exit, if INFO = 0, WORK(1) returns the optimal LWORK. *> \endverbatim *> *> \param[in] LWORK *> \verbatim *> LWORK is INTEGER *> The length of the array WORK. LWORK >= 1, when N <= 1; *> otherwise *> If JOBZ = 'N' and N > 1, LWORK must be queried. *> LWORK = MAX(1, 8*N, dimension) where *> dimension = max(stage1,stage2) + (KD+1)*N + N *> = N*KD + N*max(KD+1,FACTOPTNB) *> + max(2*KD*KD, KD*NTHREADS) *> + (KD+1)*N + N *> where KD is the blocking size of the reduction, *> FACTOPTNB is the blocking used by the QR or LQ *> algorithm, usually FACTOPTNB=128 is a good choice *> NTHREADS is the number of threads used when *> openMP compilation is enabled, otherwise =1. *> If JOBZ = 'V' and N > 1, LWORK must be queried. Not yet available *> *> If LWORK = -1, then a workspace query is assumed; the routine *> only calculates the optimal size of the WORK array, returns *> this value as the first entry of the WORK array, and no error *> message related to LWORK is issued by XERBLA. *> \endverbatim *> *> \param[out] RWORK *> \verbatim *> RWORK is DOUBLE PRECISION array, dimension (7*N) *> \endverbatim *> *> \param[out] IWORK *> \verbatim *> IWORK is INTEGER array, dimension (5*N) *> \endverbatim *> *> \param[out] IFAIL *> \verbatim *> IFAIL is INTEGER array, dimension (N) *> If JOBZ = 'V', then if INFO = 0, the first M elements of *> IFAIL are zero. If INFO > 0, then IFAIL contains the *> indices of the eigenvectors that failed to converge. *> If JOBZ = 'N', then IFAIL is not referenced. *> \endverbatim *> *> \param[out] INFO *> \verbatim *> INFO is INTEGER *> = 0: successful exit *> < 0: if INFO = -i, the i-th argument had an illegal value *> > 0: if INFO = i, then i eigenvectors failed to converge. *> Their indices are stored in array IFAIL. *> \endverbatim * * Authors: * ======== * *> \author Univ. of Tennessee *> \author Univ. of California Berkeley *> \author Univ. of Colorado Denver *> \author NAG Ltd. * *> \date June 2016 * *> \ingroup complex16HEeigen * *> \par Further Details: * ===================== *> *> \verbatim *> *> All details about the 2stage techniques are available in: *> *> Azzam Haidar, Hatem Ltaief, and Jack Dongarra. *> Parallel reduction to condensed forms for symmetric eigenvalue problems *> using aggregated fine-grained and memory-aware kernels. In Proceedings *> of 2011 International Conference for High Performance Computing, *> Networking, Storage and Analysis (SC '11), New York, NY, USA, *> Article 8 , 11 pages. *> http://doi.acm.org/10.1145/2063384.2063394 *> *> A. Haidar, J. Kurzak, P. Luszczek, 2013. *> An improved parallel singular value algorithm and its implementation *> for multicore hardware, In Proceedings of 2013 International Conference *> for High Performance Computing, Networking, Storage and Analysis (SC '13). *> Denver, Colorado, USA, 2013. *> Article 90, 12 pages. *> http://doi.acm.org/10.1145/2503210.2503292 *> *> A. Haidar, R. Solca, S. Tomov, T. Schulthess and J. Dongarra. *> A novel hybrid CPU-GPU generalized eigensolver for electronic structure *> calculations based on fine-grained memory aware tasks. *> International Journal of High Performance Computing Applications. *> Volume 28 Issue 2, Pages 196-209, May 2014. *> http://hpc.sagepub.com/content/28/2/196 *> *> \endverbatim * * ===================================================================== SUBROUTINE ZHEEVX_2STAGE( JOBZ, RANGE, UPLO, N, A, LDA, VL, VU, $ IL, IU, ABSTOL, M, W, Z, LDZ, WORK, $ LWORK, RWORK, IWORK, IFAIL, INFO ) * IMPLICIT NONE * * -- LAPACK driver routine (version 3.8.0) -- * -- LAPACK is a software package provided by Univ. of Tennessee, -- * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..-- * June 2016 * * .. Scalar Arguments .. CHARACTER JOBZ, RANGE, UPLO INTEGER IL, INFO, IU, LDA, LDZ, LWORK, M, N DOUBLE PRECISION ABSTOL, VL, VU * .. * .. Array Arguments .. INTEGER IFAIL( * ), IWORK( * ) DOUBLE PRECISION RWORK( * ), W( * ) COMPLEX*16 A( LDA, * ), WORK( * ), Z( LDZ, * ) * .. * * ===================================================================== * * .. Parameters .. DOUBLE PRECISION ZERO, ONE PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) COMPLEX*16 CONE PARAMETER ( CONE = ( 1.0D+0, 0.0D+0 ) ) * .. * .. Local Scalars .. LOGICAL ALLEIG, INDEIG, LOWER, LQUERY, TEST, VALEIG, $ WANTZ CHARACTER ORDER INTEGER I, IINFO, IMAX, INDD, INDE, INDEE, INDIBL, $ INDISP, INDIWK, INDRWK, INDTAU, INDWRK, ISCALE, $ ITMP1, J, JJ, LLWORK, $ NSPLIT, LWMIN, LHTRD, LWTRD, KD, IB, INDHOUS DOUBLE PRECISION ABSTLL, ANRM, BIGNUM, EPS, RMAX, RMIN, SAFMIN, $ SIGMA, SMLNUM, TMP1, VLL, VUU * .. * .. External Functions .. LOGICAL LSAME INTEGER ILAENV2STAGE DOUBLE PRECISION DLAMCH, ZLANHE EXTERNAL LSAME, DLAMCH, ZLANHE, ILAENV2STAGE * .. * .. External Subroutines .. EXTERNAL DCOPY, DSCAL, DSTEBZ, DSTERF, XERBLA, ZDSCAL, $ ZLACPY, ZSTEIN, ZSTEQR, ZSWAP, ZUNGTR, ZUNMTR, $ ZHETRD_2STAGE * .. * .. Intrinsic Functions .. INTRINSIC DBLE, MAX, MIN, SQRT * .. * .. Executable Statements .. * * Test the input parameters. * LOWER = LSAME( UPLO, 'L' ) WANTZ = LSAME( JOBZ, 'V' ) ALLEIG = LSAME( RANGE, 'A' ) VALEIG = LSAME( RANGE, 'V' ) INDEIG = LSAME( RANGE, 'I' ) LQUERY = ( LWORK.EQ.-1 ) * INFO = 0 IF( .NOT.( LSAME( JOBZ, 'N' ) ) ) THEN INFO = -1 ELSE IF( .NOT.( ALLEIG .OR. VALEIG .OR. INDEIG ) ) THEN INFO = -2 ELSE IF( .NOT.( LOWER .OR. LSAME( UPLO, 'U' ) ) ) THEN INFO = -3 ELSE IF( N.LT.0 ) THEN INFO = -4 ELSE IF( LDA.LT.MAX( 1, N ) ) THEN INFO = -6 ELSE IF( VALEIG ) THEN IF( N.GT.0 .AND. VU.LE.VL ) $ INFO = -8 ELSE IF( INDEIG ) THEN IF( IL.LT.1 .OR. IL.GT.MAX( 1, N ) ) THEN INFO = -9 ELSE IF( IU.LT.MIN( N, IL ) .OR. IU.GT.N ) THEN INFO = -10 END IF END IF END IF IF( INFO.EQ.0 ) THEN IF( LDZ.LT.1 .OR. ( WANTZ .AND. LDZ.LT.N ) ) THEN INFO = -15 END IF END IF * IF( INFO.EQ.0 ) THEN IF( N.LE.1 ) THEN LWMIN = 1 WORK( 1 ) = LWMIN ELSE KD = ILAENV2STAGE( 1, 'ZHETRD_2STAGE', JOBZ, $ N, -1, -1, -1 ) IB = ILAENV2STAGE( 2, 'ZHETRD_2STAGE', JOBZ, $ N, KD, -1, -1 ) LHTRD = ILAENV2STAGE( 3, 'ZHETRD_2STAGE', JOBZ, $ N, KD, IB, -1 ) LWTRD = ILAENV2STAGE( 4, 'ZHETRD_2STAGE', JOBZ, $ N, KD, IB, -1 ) LWMIN = N + LHTRD + LWTRD WORK( 1 ) = LWMIN END IF * IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) $ INFO = -17 END IF * IF( INFO.NE.0 ) THEN CALL XERBLA( 'ZHEEVX_2STAGE', -INFO ) RETURN ELSE IF( LQUERY ) THEN RETURN END IF * * Quick return if possible * M = 0 IF( N.EQ.0 ) THEN RETURN END IF * IF( N.EQ.1 ) THEN IF( ALLEIG .OR. INDEIG ) THEN M = 1 W( 1 ) = DBLE( A( 1, 1 ) ) ELSE IF( VALEIG ) THEN IF( VL.LT.DBLE( A( 1, 1 ) ) .AND. VU.GE.DBLE( A( 1, 1 ) ) ) $ THEN M = 1 W( 1 ) = DBLE( A( 1, 1 ) ) END IF END IF IF( WANTZ ) $ Z( 1, 1 ) = CONE RETURN END IF * * Get machine constants. * SAFMIN = DLAMCH( 'Safe minimum' ) EPS = DLAMCH( 'Precision' ) SMLNUM = SAFMIN / EPS BIGNUM = ONE / SMLNUM RMIN = SQRT( SMLNUM ) RMAX = MIN( SQRT( BIGNUM ), ONE / SQRT( SQRT( SAFMIN ) ) ) * * Scale matrix to allowable range, if necessary. * ISCALE = 0 ABSTLL = ABSTOL IF( VALEIG ) THEN VLL = VL VUU = VU END IF ANRM = ZLANHE( 'M', UPLO, N, A, LDA, RWORK ) IF( ANRM.GT.ZERO .AND. ANRM.LT.RMIN ) THEN ISCALE = 1 SIGMA = RMIN / ANRM ELSE IF( ANRM.GT.RMAX ) THEN ISCALE = 1 SIGMA = RMAX / ANRM END IF IF( ISCALE.EQ.1 ) THEN IF( LOWER ) THEN DO 10 J = 1, N CALL ZDSCAL( N-J+1, SIGMA, A( J, J ), 1 ) 10 CONTINUE ELSE DO 20 J = 1, N CALL ZDSCAL( J, SIGMA, A( 1, J ), 1 ) 20 CONTINUE END IF IF( ABSTOL.GT.0 ) $ ABSTLL = ABSTOL*SIGMA IF( VALEIG ) THEN VLL = VL*SIGMA VUU = VU*SIGMA END IF END IF * * Call ZHETRD_2STAGE to reduce Hermitian matrix to tridiagonal form. * INDD = 1 INDE = INDD + N INDRWK = INDE + N INDTAU = 1 INDHOUS = INDTAU + N INDWRK = INDHOUS + LHTRD LLWORK = LWORK - INDWRK + 1 * CALL ZHETRD_2STAGE( JOBZ, UPLO, N, A, LDA, RWORK( INDD ), $ RWORK( INDE ), WORK( INDTAU ), $ WORK( INDHOUS ), LHTRD, WORK( INDWRK ), $ LLWORK, IINFO ) * * If all eigenvalues are desired and ABSTOL is less than or equal to * zero, then call DSTERF or ZUNGTR and ZSTEQR. If this fails for * some eigenvalue, then try DSTEBZ. * TEST = .FALSE. IF( INDEIG ) THEN IF( IL.EQ.1 .AND. IU.EQ.N ) THEN TEST = .TRUE. END IF END IF IF( ( ALLEIG .OR. TEST ) .AND. ( ABSTOL.LE.ZERO ) ) THEN CALL DCOPY( N, RWORK( INDD ), 1, W, 1 ) INDEE = INDRWK + 2*N IF( .NOT.WANTZ ) THEN CALL DCOPY( N-1, RWORK( INDE ), 1, RWORK( INDEE ), 1 ) CALL DSTERF( N, W, RWORK( INDEE ), INFO ) ELSE CALL ZLACPY( 'A', N, N, A, LDA, Z, LDZ ) CALL ZUNGTR( UPLO, N, Z, LDZ, WORK( INDTAU ), $ WORK( INDWRK ), LLWORK, IINFO ) CALL DCOPY( N-1, RWORK( INDE ), 1, RWORK( INDEE ), 1 ) CALL ZSTEQR( JOBZ, N, W, RWORK( INDEE ), Z, LDZ, $ RWORK( INDRWK ), INFO ) IF( INFO.EQ.0 ) THEN DO 30 I = 1, N IFAIL( I ) = 0 30 CONTINUE END IF END IF IF( INFO.EQ.0 ) THEN M = N GO TO 40 END IF INFO = 0 END IF * * Otherwise, call DSTEBZ and, if eigenvectors are desired, ZSTEIN. * IF( WANTZ ) THEN ORDER = 'B' ELSE ORDER = 'E' END IF INDIBL = 1 INDISP = INDIBL + N INDIWK = INDISP + N CALL DSTEBZ( RANGE, ORDER, N, VLL, VUU, IL, IU, ABSTLL, $ RWORK( INDD ), RWORK( INDE ), M, NSPLIT, W, $ IWORK( INDIBL ), IWORK( INDISP ), RWORK( INDRWK ), $ IWORK( INDIWK ), INFO ) * IF( WANTZ ) THEN CALL ZSTEIN( N, RWORK( INDD ), RWORK( INDE ), M, W, $ IWORK( INDIBL ), IWORK( INDISP ), Z, LDZ, $ RWORK( INDRWK ), IWORK( INDIWK ), IFAIL, INFO ) * * Apply unitary matrix used in reduction to tridiagonal * form to eigenvectors returned by ZSTEIN. * CALL ZUNMTR( 'L', UPLO, 'N', N, M, A, LDA, WORK( INDTAU ), Z, $ LDZ, WORK( INDWRK ), LLWORK, IINFO ) END IF * * If matrix was scaled, then rescale eigenvalues appropriately. * 40 CONTINUE IF( ISCALE.EQ.1 ) THEN IF( INFO.EQ.0 ) THEN IMAX = M ELSE IMAX = INFO - 1 END IF CALL DSCAL( IMAX, ONE / SIGMA, W, 1 ) END IF * * If eigenvalues are not in order, then sort them, along with * eigenvectors. * IF( WANTZ ) THEN DO 60 J = 1, M - 1 I = 0 TMP1 = W( J ) DO 50 JJ = J + 1, M IF( W( JJ ).LT.TMP1 ) THEN I = JJ TMP1 = W( JJ ) END IF 50 CONTINUE * IF( I.NE.0 ) THEN ITMP1 = IWORK( INDIBL+I-1 ) W( I ) = W( J ) IWORK( INDIBL+I-1 ) = IWORK( INDIBL+J-1 ) W( J ) = TMP1 IWORK( INDIBL+J-1 ) = ITMP1 CALL ZSWAP( N, Z( 1, I ), 1, Z( 1, J ), 1 ) IF( INFO.NE.0 ) THEN ITMP1 = IFAIL( I ) IFAIL( I ) = IFAIL( J ) IFAIL( J ) = ITMP1 END IF END IF 60 CONTINUE END IF * * Set WORK(1) to optimal complex workspace size. * WORK( 1 ) = LWMIN * RETURN * * End of ZHEEVX_2STAGE * END