LAPACK 3.11.0
LAPACK: Linear Algebra PACKage
Loading...
Searching...
No Matches

◆ zgejsv()

subroutine zgejsv ( character*1  JOBA,
character*1  JOBU,
character*1  JOBV,
character*1  JOBR,
character*1  JOBT,
character*1  JOBP,
integer  M,
integer  N,
complex*16, dimension( lda, * )  A,
integer  LDA,
double precision, dimension( n )  SVA,
complex*16, dimension( ldu, * )  U,
integer  LDU,
complex*16, dimension( ldv, * )  V,
integer  LDV,
complex*16, dimension( lwork )  CWORK,
integer  LWORK,
double precision, dimension( lrwork )  RWORK,
integer  LRWORK,
integer, dimension( * )  IWORK,
integer  INFO 
)

ZGEJSV

Download ZGEJSV + dependencies [TGZ] [ZIP] [TXT]

Purpose:
 ZGEJSV computes the singular value decomposition (SVD) of a complex M-by-N
 matrix [A], where M >= N. The SVD of [A] is written as

              [A] = [U] * [SIGMA] * [V]^*,

 where [SIGMA] is an N-by-N (M-by-N) matrix which is zero except for its N
 diagonal elements, [U] is an M-by-N (or M-by-M) unitary matrix, and
 [V] is an N-by-N unitary matrix. The diagonal elements of [SIGMA] are
 the singular values of [A]. The columns of [U] and [V] are the left and
 the right singular vectors of [A], respectively. The matrices [U] and [V]
 are computed and stored in the arrays U and V, respectively. The diagonal
 of [SIGMA] is computed and stored in the array SVA.

Arguments:

Parameters
[in]JOBA
          JOBA is CHARACTER*1
         Specifies the level of accuracy:
       = 'C': This option works well (high relative accuracy) if A = B * D,
              with well-conditioned B and arbitrary diagonal matrix D.
              The accuracy cannot be spoiled by COLUMN scaling. The
              accuracy of the computed output depends on the condition of
              B, and the procedure aims at the best theoretical accuracy.
              The relative error max_{i=1:N}|d sigma_i| / sigma_i is
              bounded by f(M,N)*epsilon* cond(B), independent of D.
              The input matrix is preprocessed with the QRF with column
              pivoting. This initial preprocessing and preconditioning by
              a rank revealing QR factorization is common for all values of
              JOBA. Additional actions are specified as follows:
       = 'E': Computation as with 'C' with an additional estimate of the
              condition number of B. It provides a realistic error bound.
       = 'F': If A = D1 * C * D2 with ill-conditioned diagonal scalings
              D1, D2, and well-conditioned matrix C, this option gives
              higher accuracy than the 'C' option. If the structure of the
              input matrix is not known, and relative accuracy is
              desirable, then this option is advisable. The input matrix A
              is preprocessed with QR factorization with FULL (row and
              column) pivoting.
       = 'G': Computation as with 'F' with an additional estimate of the
              condition number of B, where A=B*D. If A has heavily weighted
              rows, then using this condition number gives too pessimistic
              error bound.
       = 'A': Small singular values are not well determined by the data 
              and are considered as noisy; the matrix is treated as
              numerically rank deficient. The error in the computed
              singular values is bounded by f(m,n)*epsilon*||A||.
              The computed SVD A = U * S * V^* restores A up to
              f(m,n)*epsilon*||A||.
              This gives the procedure the licence to discard (set to zero)
              all singular values below N*epsilon*||A||.
       = 'R': Similar as in 'A'. Rank revealing property of the initial
              QR factorization is used do reveal (using triangular factor)
              a gap sigma_{r+1} < epsilon * sigma_r in which case the
              numerical RANK is declared to be r. The SVD is computed with
              absolute error bounds, but more accurately than with 'A'.
[in]JOBU
          JOBU is CHARACTER*1
         Specifies whether to compute the columns of U:
       = 'U': N columns of U are returned in the array U.
       = 'F': full set of M left sing. vectors is returned in the array U.
       = 'W': U may be used as workspace of length M*N. See the description
              of U.
       = 'N': U is not computed.
[in]JOBV
          JOBV is CHARACTER*1
         Specifies whether to compute the matrix V:
       = 'V': N columns of V are returned in the array V; Jacobi rotations
              are not explicitly accumulated.
       = 'J': N columns of V are returned in the array V, but they are
              computed as the product of Jacobi rotations, if JOBT = 'N'.
       = 'W': V may be used as workspace of length N*N. See the description
              of V.
       = 'N': V is not computed.
[in]JOBR
          JOBR is CHARACTER*1
         Specifies the RANGE for the singular values. Issues the licence to
         set to zero small positive singular values if they are outside
         specified range. If A .NE. 0 is scaled so that the largest singular
         value of c*A is around SQRT(BIG), BIG=DLAMCH('O'), then JOBR issues
         the licence to kill columns of A whose norm in c*A is less than
         SQRT(SFMIN) (for JOBR = 'R'), or less than SMALL=SFMIN/EPSLN,
         where SFMIN=DLAMCH('S'), EPSLN=DLAMCH('E').
       = 'N': Do not kill small columns of c*A. This option assumes that
              BLAS and QR factorizations and triangular solvers are
              implemented to work in that range. If the condition of A
              is greater than BIG, use ZGESVJ.
       = 'R': RESTRICTED range for sigma(c*A) is [SQRT(SFMIN), SQRT(BIG)]
              (roughly, as described above). This option is recommended.
                                             ===========================
         For computing the singular values in the FULL range [SFMIN,BIG]
         use ZGESVJ.
[in]JOBT
          JOBT is CHARACTER*1
         If the matrix is square then the procedure may determine to use
         transposed A if A^* seems to be better with respect to convergence.
         If the matrix is not square, JOBT is ignored. 
         The decision is based on two values of entropy over the adjoint
         orbit of A^* * A. See the descriptions of WORK(6) and WORK(7).
       = 'T': transpose if entropy test indicates possibly faster
         convergence of Jacobi process if A^* is taken as input. If A is
         replaced with A^*, then the row pivoting is included automatically.
       = 'N': do not speculate.
         The option 'T' can be used to compute only the singular values, or
         the full SVD (U, SIGMA and V). For only one set of singular vectors
         (U or V), the caller should provide both U and V, as one of the
         matrices is used as workspace if the matrix A is transposed.
         The implementer can easily remove this constraint and make the
         code more complicated. See the descriptions of U and V.
         In general, this option is considered experimental, and 'N'; should
         be preferred. This is subject to changes in the future.
[in]JOBP
          JOBP is CHARACTER*1
         Issues the licence to introduce structured perturbations to drown
         denormalized numbers. This licence should be active if the
         denormals are poorly implemented, causing slow computation,
         especially in cases of fast convergence (!). For details see [1,2].
         For the sake of simplicity, this perturbations are included only
         when the full SVD or only the singular values are requested. The
         implementer/user can easily add the perturbation for the cases of
         computing one set of singular vectors.
       = 'P': introduce perturbation
       = 'N': do not perturb
[in]M
          M is INTEGER
         The number of rows of the input matrix A.  M >= 0.
[in]N
          N is INTEGER
         The number of columns of the input matrix A. M >= N >= 0.
[in,out]A
          A is COMPLEX*16 array, dimension (LDA,N)
          On entry, the M-by-N matrix A.
[in]LDA
          LDA is INTEGER
          The leading dimension of the array A.  LDA >= max(1,M).
[out]SVA
          SVA is DOUBLE PRECISION array, dimension (N)
          On exit,
          - For WORK(1)/WORK(2) = ONE: The singular values of A. During the
            computation SVA contains Euclidean column norms of the
            iterated matrices in the array A.
          - For WORK(1) .NE. WORK(2): The singular values of A are
            (WORK(1)/WORK(2)) * SVA(1:N). This factored form is used if
            sigma_max(A) overflows or if small singular values have been
            saved from underflow by scaling the input matrix A.
          - If JOBR='R' then some of the singular values may be returned
            as exact zeros obtained by "set to zero" because they are
            below the numerical rank threshold or are denormalized numbers.
[out]U
          U is COMPLEX*16 array, dimension ( LDU, N )
          If JOBU = 'U', then U contains on exit the M-by-N matrix of
                         the left singular vectors.
          If JOBU = 'F', then U contains on exit the M-by-M matrix of
                         the left singular vectors, including an ONB
                         of the orthogonal complement of the Range(A).
          If JOBU = 'W'  .AND. (JOBV = 'V' .AND. JOBT = 'T' .AND. M = N),
                         then U is used as workspace if the procedure
                         replaces A with A^*. In that case, [V] is computed
                         in U as left singular vectors of A^* and then
                         copied back to the V array. This 'W' option is just
                         a reminder to the caller that in this case U is
                         reserved as workspace of length N*N.
          If JOBU = 'N'  U is not referenced, unless JOBT='T'.
[in]LDU
          LDU is INTEGER
          The leading dimension of the array U,  LDU >= 1.
          IF  JOBU = 'U' or 'F' or 'W',  then LDU >= M.
[out]V
          V is COMPLEX*16 array, dimension ( LDV, N )
          If JOBV = 'V', 'J' then V contains on exit the N-by-N matrix of
                         the right singular vectors;
          If JOBV = 'W', AND (JOBU = 'U' AND JOBT = 'T' AND M = N),
                         then V is used as workspace if the pprocedure
                         replaces A with A^*. In that case, [U] is computed
                         in V as right singular vectors of A^* and then
                         copied back to the U array. This 'W' option is just
                         a reminder to the caller that in this case V is
                         reserved as workspace of length N*N.
          If JOBV = 'N'  V is not referenced, unless JOBT='T'.
[in]LDV
          LDV is INTEGER
          The leading dimension of the array V,  LDV >= 1.
          If JOBV = 'V' or 'J' or 'W', then LDV >= N.
[out]CWORK
          CWORK is COMPLEX*16 array, dimension (MAX(2,LWORK))
          If the call to ZGEJSV is a workspace query (indicated by LWORK=-1 or
          LRWORK=-1), then on exit CWORK(1) contains the required length of
          CWORK for the job parameters used in the call.
[in]LWORK
          LWORK is INTEGER
          Length of CWORK to confirm proper allocation of workspace.
          LWORK depends on the job:

          1. If only SIGMA is needed ( JOBU = 'N', JOBV = 'N' ) and
            1.1 .. no scaled condition estimate required (JOBA.NE.'E'.AND.JOBA.NE.'G'):
               LWORK >= 2*N+1. This is the minimal requirement.
               ->> For optimal performance (blocked code) the optimal value
               is LWORK >= N + (N+1)*NB. Here NB is the optimal
               block size for ZGEQP3 and ZGEQRF.
               In general, optimal LWORK is computed as
               LWORK >= max(N+LWORK(ZGEQP3),N+LWORK(ZGEQRF), LWORK(ZGESVJ)).
            1.2. .. an estimate of the scaled condition number of A is
               required (JOBA='E', or 'G'). In this case, LWORK the minimal
               requirement is LWORK >= N*N + 2*N.
               ->> For optimal performance (blocked code) the optimal value
               is LWORK >= max(N+(N+1)*NB, N*N+2*N)=N**2+2*N.
               In general, the optimal length LWORK is computed as
               LWORK >= max(N+LWORK(ZGEQP3),N+LWORK(ZGEQRF), LWORK(ZGESVJ),
                            N*N+LWORK(ZPOCON)).
          2. If SIGMA and the right singular vectors are needed (JOBV = 'V'),
             (JOBU = 'N')
            2.1   .. no scaled condition estimate requested (JOBE = 'N'):    
            -> the minimal requirement is LWORK >= 3*N.
            -> For optimal performance, 
               LWORK >= max(N+(N+1)*NB, 2*N+N*NB)=2*N+N*NB,
               where NB is the optimal block size for ZGEQP3, ZGEQRF, ZGELQ,
               ZUNMLQ. In general, the optimal length LWORK is computed as
               LWORK >= max(N+LWORK(ZGEQP3), N+LWORK(ZGESVJ),
                       N+LWORK(ZGELQF), 2*N+LWORK(ZGEQRF), N+LWORK(ZUNMLQ)).
            2.2 .. an estimate of the scaled condition number of A is
               required (JOBA='E', or 'G').
            -> the minimal requirement is LWORK >= 3*N.      
            -> For optimal performance, 
               LWORK >= max(N+(N+1)*NB, 2*N,2*N+N*NB)=2*N+N*NB,
               where NB is the optimal block size for ZGEQP3, ZGEQRF, ZGELQ,
               ZUNMLQ. In general, the optimal length LWORK is computed as
               LWORK >= max(N+LWORK(ZGEQP3), LWORK(ZPOCON), N+LWORK(ZGESVJ),
                       N+LWORK(ZGELQF), 2*N+LWORK(ZGEQRF), N+LWORK(ZUNMLQ)).   
          3. If SIGMA and the left singular vectors are needed
            3.1  .. no scaled condition estimate requested (JOBE = 'N'):
            -> the minimal requirement is LWORK >= 3*N.
            -> For optimal performance:
               if JOBU = 'U' :: LWORK >= max(3*N, N+(N+1)*NB, 2*N+N*NB)=2*N+N*NB,
               where NB is the optimal block size for ZGEQP3, ZGEQRF, ZUNMQR.
               In general, the optimal length LWORK is computed as
               LWORK >= max(N+LWORK(ZGEQP3), 2*N+LWORK(ZGEQRF), N+LWORK(ZUNMQR)). 
            3.2  .. an estimate of the scaled condition number of A is
               required (JOBA='E', or 'G').
            -> the minimal requirement is LWORK >= 3*N.
            -> For optimal performance:
               if JOBU = 'U' :: LWORK >= max(3*N, N+(N+1)*NB, 2*N+N*NB)=2*N+N*NB,
               where NB is the optimal block size for ZGEQP3, ZGEQRF, ZUNMQR.
               In general, the optimal length LWORK is computed as
               LWORK >= max(N+LWORK(ZGEQP3),N+LWORK(ZPOCON),
                        2*N+LWORK(ZGEQRF), N+LWORK(ZUNMQR)).
          4. If the full SVD is needed: (JOBU = 'U' or JOBU = 'F') and 
            4.1. if JOBV = 'V'  
               the minimal requirement is LWORK >= 5*N+2*N*N. 
            4.2. if JOBV = 'J' the minimal requirement is 
               LWORK >= 4*N+N*N.
            In both cases, the allocated CWORK can accommodate blocked runs
            of ZGEQP3, ZGEQRF, ZGELQF, SUNMQR, ZUNMLQ.

          If the call to ZGEJSV is a workspace query (indicated by LWORK=-1 or
          LRWORK=-1), then on exit CWORK(1) contains the optimal and CWORK(2) contains the
          minimal length of CWORK for the job parameters used in the call.
[out]RWORK
          RWORK is DOUBLE PRECISION array, dimension (MAX(7,LWORK))
          On exit,
          RWORK(1) = Determines the scaling factor SCALE = RWORK(2) / RWORK(1)
                    such that SCALE*SVA(1:N) are the computed singular values
                    of A. (See the description of SVA().)
          RWORK(2) = See the description of RWORK(1).
          RWORK(3) = SCONDA is an estimate for the condition number of
                    column equilibrated A. (If JOBA = 'E' or 'G')
                    SCONDA is an estimate of SQRT(||(R^* * R)^(-1)||_1).
                    It is computed using ZPOCON. It holds
                    N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA
                    where R is the triangular factor from the QRF of A.
                    However, if R is truncated and the numerical rank is
                    determined to be strictly smaller than N, SCONDA is
                    returned as -1, thus indicating that the smallest
                    singular values might be lost.

          If full SVD is needed, the following two condition numbers are
          useful for the analysis of the algorithm. They are provided for
          a developer/implementer who is familiar with the details of
          the method.

          RWORK(4) = an estimate of the scaled condition number of the
                    triangular factor in the first QR factorization.
          RWORK(5) = an estimate of the scaled condition number of the
                    triangular factor in the second QR factorization.
          The following two parameters are computed if JOBT = 'T'.
          They are provided for a developer/implementer who is familiar
          with the details of the method.
          RWORK(6) = the entropy of A^* * A :: this is the Shannon entropy
                    of diag(A^* * A) / Trace(A^* * A) taken as point in the
                    probability simplex.
          RWORK(7) = the entropy of A * A^*. (See the description of RWORK(6).)
          If the call to ZGEJSV is a workspace query (indicated by LWORK=-1 or
          LRWORK=-1), then on exit RWORK(1) contains the required length of
          RWORK for the job parameters used in the call.
[in]LRWORK
          LRWORK is INTEGER
          Length of RWORK to confirm proper allocation of workspace.
          LRWORK depends on the job:

       1. If only the singular values are requested i.e. if
          LSAME(JOBU,'N') .AND. LSAME(JOBV,'N')
          then:
          1.1. If LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G'),
               then: LRWORK = max( 7, 2 * M ).
          1.2. Otherwise, LRWORK  = max( 7,  N ).
       2. If singular values with the right singular vectors are requested
          i.e. if
          (LSAME(JOBV,'V').OR.LSAME(JOBV,'J')) .AND.
          .NOT.(LSAME(JOBU,'U').OR.LSAME(JOBU,'F'))
          then:
          2.1. If LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G'),
          then LRWORK = max( 7, 2 * M ).
          2.2. Otherwise, LRWORK  = max( 7,  N ).
       3. If singular values with the left singular vectors are requested, i.e. if
          (LSAME(JOBU,'U').OR.LSAME(JOBU,'F')) .AND.
          .NOT.(LSAME(JOBV,'V').OR.LSAME(JOBV,'J'))
          then:
          3.1. If LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G'),
          then LRWORK = max( 7, 2 * M ).
          3.2. Otherwise, LRWORK  = max( 7,  N ).
       4. If singular values with both the left and the right singular vectors
          are requested, i.e. if
          (LSAME(JOBU,'U').OR.LSAME(JOBU,'F')) .AND.
          (LSAME(JOBV,'V').OR.LSAME(JOBV,'J'))
          then:
          4.1. If LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G'),
          then LRWORK = max( 7, 2 * M ).
          4.2. Otherwise, LRWORK  = max( 7, N ).

          If, on entry, LRWORK = -1 or LWORK=-1, a workspace query is assumed and 
          the length of RWORK is returned in RWORK(1). 
[out]IWORK
          IWORK is INTEGER array, of dimension at least 4, that further depends 
          on the job:

          1. If only the singular values are requested then:
             If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') ) 
             then the length of IWORK is N+M; otherwise the length of IWORK is N.
          2. If the singular values and the right singular vectors are requested then:
             If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') ) 
             then the length of IWORK is N+M; otherwise the length of IWORK is N. 
          3. If the singular values and the left singular vectors are requested then:
             If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') ) 
             then the length of IWORK is N+M; otherwise the length of IWORK is N. 
          4. If the singular values with both the left and the right singular vectors
             are requested, then:      
             4.1. If LSAME(JOBV,'J') the length of IWORK is determined as follows:
                  If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') ) 
                  then the length of IWORK is N+M; otherwise the length of IWORK is N. 
             4.2. If LSAME(JOBV,'V') the length of IWORK is determined as follows:
                  If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') ) 
                  then the length of IWORK is 2*N+M; otherwise the length of IWORK is 2*N.
        
          On exit,
          IWORK(1) = the numerical rank determined after the initial
                     QR factorization with pivoting. See the descriptions
                     of JOBA and JOBR.
          IWORK(2) = the number of the computed nonzero singular values
          IWORK(3) = if nonzero, a warning message:
                     If IWORK(3) = 1 then some of the column norms of A
                     were denormalized floats. The requested high accuracy
                     is not warranted by the data.
          IWORK(4) = 1 or -1. If IWORK(4) = 1, then the procedure used A^* to
                     do the job as specified by the JOB parameters.
          If the call to ZGEJSV is a workspace query (indicated by LWORK = -1 or
          LRWORK = -1), then on exit IWORK(1) contains the required length of 
          IWORK for the job parameters used in the call.
[out]INFO
          INFO is INTEGER
           < 0:  if INFO = -i, then the i-th argument had an illegal value.
           = 0:  successful exit;
           > 0:  ZGEJSV  did not converge in the maximal allowed number
                 of sweeps. The computed values may be inaccurate.
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Further Details:
  ZGEJSV implements a preconditioned Jacobi SVD algorithm. It uses ZGEQP3,
  ZGEQRF, and ZGELQF as preprocessors and preconditioners. Optionally, an
  additional row pivoting can be used as a preprocessor, which in some
  cases results in much higher accuracy. An example is matrix A with the
  structure A = D1 * C * D2, where D1, D2 are arbitrarily ill-conditioned
  diagonal matrices and C is well-conditioned matrix. In that case, complete
  pivoting in the first QR factorizations provides accuracy dependent on the
  condition number of C, and independent of D1, D2. Such higher accuracy is
  not completely understood theoretically, but it works well in practice.
  Further, if A can be written as A = B*D, with well-conditioned B and some
  diagonal D, then the high accuracy is guaranteed, both theoretically and
  in software, independent of D. For more details see [1], [2].
     The computational range for the singular values can be the full range
  ( UNDERFLOW,OVERFLOW ), provided that the machine arithmetic and the BLAS
  & LAPACK routines called by ZGEJSV are implemented to work in that range.
  If that is not the case, then the restriction for safe computation with
  the singular values in the range of normalized IEEE numbers is that the
  spectral condition number kappa(A)=sigma_max(A)/sigma_min(A) does not
  overflow. This code (ZGEJSV) is best used in this restricted range,
  meaning that singular values of magnitude below ||A||_2 / DLAMCH('O') are
  returned as zeros. See JOBR for details on this.
     Further, this implementation is somewhat slower than the one described
  in [1,2] due to replacement of some non-LAPACK components, and because
  the choice of some tuning parameters in the iterative part (ZGESVJ) is
  left to the implementer on a particular machine.
     The rank revealing QR factorization (in this code: ZGEQP3) should be
  implemented as in [3]. We have a new version of ZGEQP3 under development
  that is more robust than the current one in LAPACK, with a cleaner cut in
  rank deficient cases. It will be available in the SIGMA library [4].
  If M is much larger than N, it is obvious that the initial QRF with
  column pivoting can be preprocessed by the QRF without pivoting. That
  well known trick is not used in ZGEJSV because in some cases heavy row
  weighting can be treated with complete pivoting. The overhead in cases
  M much larger than N is then only due to pivoting, but the benefits in
  terms of accuracy have prevailed. The implementer/user can incorporate
  this extra QRF step easily. The implementer can also improve data movement
  (matrix transpose, matrix copy, matrix transposed copy) - this
  implementation of ZGEJSV uses only the simplest, naive data movement.
Contributor:
Zlatko Drmac, Department of Mathematics, Faculty of Science, University of Zagreb (Zagreb, Croatia); drmac.nosp@m.@mat.nosp@m.h.hr
References:
 [1] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
     SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342.
     LAPACK Working note 169.
 [2] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
     SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362.
     LAPACK Working note 170.
 [3] Z. Drmac and Z. Bujanovic: On the failure of rank-revealing QR
     factorization software - a case study.
     ACM Trans. Math. Softw. Vol. 35, No 2 (2008), pp. 1-28.
     LAPACK Working note 176.
 [4] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
     QSVD, (H,K)-SVD computations.
     Department of Mathematics, University of Zagreb, 2008, 2016.
Bugs, examples and comments:
Please report all bugs and send interesting examples and/or comments to drmac.nosp@m.@mat.nosp@m.h.hr. Thank you.

Definition at line 566 of file zgejsv.f.

569*
570* -- LAPACK computational routine --
571* -- LAPACK is a software package provided by Univ. of Tennessee, --
572* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
573*
574* .. Scalar Arguments ..
575 IMPLICIT NONE
576 INTEGER INFO, LDA, LDU, LDV, LWORK, LRWORK, M, N
577* ..
578* .. Array Arguments ..
579 COMPLEX*16 A( LDA, * ), U( LDU, * ), V( LDV, * ),
580 $ CWORK( LWORK )
581 DOUBLE PRECISION SVA( N ), RWORK( LRWORK )
582 INTEGER IWORK( * )
583 CHARACTER*1 JOBA, JOBP, JOBR, JOBT, JOBU, JOBV
584* ..
585*
586* ===========================================================================
587*
588* .. Local Parameters ..
589 DOUBLE PRECISION ZERO, ONE
590 parameter( zero = 0.0d0, one = 1.0d0 )
591 COMPLEX*16 CZERO, CONE
592 parameter( czero = ( 0.0d0, 0.0d0 ), cone = ( 1.0d0, 0.0d0 ) )
593* ..
594* .. Local Scalars ..
595 COMPLEX*16 CTEMP
596 DOUBLE PRECISION AAPP, AAQQ, AATMAX, AATMIN, BIG, BIG1,
597 $ COND_OK, CONDR1, CONDR2, ENTRA, ENTRAT, EPSLN,
598 $ MAXPRJ, SCALEM, SCONDA, SFMIN, SMALL, TEMP1,
599 $ USCAL1, USCAL2, XSC
600 INTEGER IERR, N1, NR, NUMRANK, p, q, WARNING
601 LOGICAL ALMORT, DEFR, ERREST, GOSCAL, JRACC, KILL, LQUERY,
602 $ LSVEC, L2ABER, L2KILL, L2PERT, L2RANK, L2TRAN, NOSCAL,
603 $ ROWPIV, RSVEC, TRANSP
604*
605 INTEGER OPTWRK, MINWRK, MINRWRK, MINIWRK
606 INTEGER LWCON, LWLQF, LWQP3, LWQRF, LWUNMLQ, LWUNMQR, LWUNMQRM,
607 $ LWSVDJ, LWSVDJV, LRWQP3, LRWCON, LRWSVDJ, IWOFF
608 INTEGER LWRK_ZGELQF, LWRK_ZGEQP3, LWRK_ZGEQP3N, LWRK_ZGEQRF,
609 $ LWRK_ZGESVJ, LWRK_ZGESVJV, LWRK_ZGESVJU, LWRK_ZUNMLQ,
610 $ LWRK_ZUNMQR, LWRK_ZUNMQRM
611* ..
612* .. Local Arrays
613 COMPLEX*16 CDUMMY(1)
614 DOUBLE PRECISION RDUMMY(1)
615*
616* .. Intrinsic Functions ..
617 INTRINSIC abs, dcmplx, conjg, dlog, max, min, dble, nint, sqrt
618* ..
619* .. External Functions ..
620 DOUBLE PRECISION DLAMCH, DZNRM2
621 INTEGER IDAMAX, IZAMAX
622 LOGICAL LSAME
623 EXTERNAL idamax, izamax, lsame, dlamch, dznrm2
624* ..
625* .. External Subroutines ..
629 $ xerbla
630*
631 EXTERNAL zgesvj
632* ..
633*
634* Test the input arguments
635*
636 lsvec = lsame( jobu, 'U' ) .OR. lsame( jobu, 'F' )
637 jracc = lsame( jobv, 'J' )
638 rsvec = lsame( jobv, 'V' ) .OR. jracc
639 rowpiv = lsame( joba, 'F' ) .OR. lsame( joba, 'G' )
640 l2rank = lsame( joba, 'R' )
641 l2aber = lsame( joba, 'A' )
642 errest = lsame( joba, 'E' ) .OR. lsame( joba, 'G' )
643 l2tran = lsame( jobt, 'T' ) .AND. ( m .EQ. n )
644 l2kill = lsame( jobr, 'R' )
645 defr = lsame( jobr, 'N' )
646 l2pert = lsame( jobp, 'P' )
647*
648 lquery = ( lwork .EQ. -1 ) .OR. ( lrwork .EQ. -1 )
649*
650 IF ( .NOT.(rowpiv .OR. l2rank .OR. l2aber .OR.
651 $ errest .OR. lsame( joba, 'C' ) )) THEN
652 info = - 1
653 ELSE IF ( .NOT.( lsvec .OR. lsame( jobu, 'N' ) .OR.
654 $ ( lsame( jobu, 'W' ) .AND. rsvec .AND. l2tran ) ) ) THEN
655 info = - 2
656 ELSE IF ( .NOT.( rsvec .OR. lsame( jobv, 'N' ) .OR.
657 $ ( lsame( jobv, 'W' ) .AND. lsvec .AND. l2tran ) ) ) THEN
658 info = - 3
659 ELSE IF ( .NOT. ( l2kill .OR. defr ) ) THEN
660 info = - 4
661 ELSE IF ( .NOT. ( lsame(jobt,'T') .OR. lsame(jobt,'N') ) ) THEN
662 info = - 5
663 ELSE IF ( .NOT. ( l2pert .OR. lsame( jobp, 'N' ) ) ) THEN
664 info = - 6
665 ELSE IF ( m .LT. 0 ) THEN
666 info = - 7
667 ELSE IF ( ( n .LT. 0 ) .OR. ( n .GT. m ) ) THEN
668 info = - 8
669 ELSE IF ( lda .LT. m ) THEN
670 info = - 10
671 ELSE IF ( lsvec .AND. ( ldu .LT. m ) ) THEN
672 info = - 13
673 ELSE IF ( rsvec .AND. ( ldv .LT. n ) ) THEN
674 info = - 15
675 ELSE
676* #:)
677 info = 0
678 END IF
679*
680 IF ( info .EQ. 0 ) THEN
681* .. compute the minimal and the optimal workspace lengths
682* [[The expressions for computing the minimal and the optimal
683* values of LCWORK, LRWORK are written with a lot of redundancy and
684* can be simplified. However, this verbose form is useful for
685* maintenance and modifications of the code.]]
686*
687* .. minimal workspace length for ZGEQP3 of an M x N matrix,
688* ZGEQRF of an N x N matrix, ZGELQF of an N x N matrix,
689* ZUNMLQ for computing N x N matrix, ZUNMQR for computing N x N
690* matrix, ZUNMQR for computing M x N matrix, respectively.
691 lwqp3 = n+1
692 lwqrf = max( 1, n )
693 lwlqf = max( 1, n )
694 lwunmlq = max( 1, n )
695 lwunmqr = max( 1, n )
696 lwunmqrm = max( 1, m )
697* .. minimal workspace length for ZPOCON of an N x N matrix
698 lwcon = 2 * n
699* .. minimal workspace length for ZGESVJ of an N x N matrix,
700* without and with explicit accumulation of Jacobi rotations
701 lwsvdj = max( 2 * n, 1 )
702 lwsvdjv = max( 2 * n, 1 )
703* .. minimal REAL workspace length for ZGEQP3, ZPOCON, ZGESVJ
704 lrwqp3 = 2 * n
705 lrwcon = n
706 lrwsvdj = n
707 IF ( lquery ) THEN
708 CALL zgeqp3( m, n, a, lda, iwork, cdummy, cdummy, -1,
709 $ rdummy, ierr )
710 lwrk_zgeqp3 = int( cdummy(1) )
711 CALL zgeqrf( n, n, a, lda, cdummy, cdummy,-1, ierr )
712 lwrk_zgeqrf = int( cdummy(1) )
713 CALL zgelqf( n, n, a, lda, cdummy, cdummy,-1, ierr )
714 lwrk_zgelqf = int( cdummy(1) )
715 END IF
716 minwrk = 2
717 optwrk = 2
718 miniwrk = n
719 IF ( .NOT. (lsvec .OR. rsvec ) ) THEN
720* .. minimal and optimal sizes of the complex workspace if
721* only the singular values are requested
722 IF ( errest ) THEN
723 minwrk = max( n+lwqp3, n**2+lwcon, n+lwqrf, lwsvdj )
724 ELSE
725 minwrk = max( n+lwqp3, n+lwqrf, lwsvdj )
726 END IF
727 IF ( lquery ) THEN
728 CALL zgesvj( 'L', 'N', 'N', n, n, a, lda, sva, n, v,
729 $ ldv, cdummy, -1, rdummy, -1, ierr )
730 lwrk_zgesvj = int( cdummy(1) )
731 IF ( errest ) THEN
732 optwrk = max( n+lwrk_zgeqp3, n**2+lwcon,
733 $ n+lwrk_zgeqrf, lwrk_zgesvj )
734 ELSE
735 optwrk = max( n+lwrk_zgeqp3, n+lwrk_zgeqrf,
736 $ lwrk_zgesvj )
737 END IF
738 END IF
739 IF ( l2tran .OR. rowpiv ) THEN
740 IF ( errest ) THEN
741 minrwrk = max( 7, 2*m, lrwqp3, lrwcon, lrwsvdj )
742 ELSE
743 minrwrk = max( 7, 2*m, lrwqp3, lrwsvdj )
744 END IF
745 ELSE
746 IF ( errest ) THEN
747 minrwrk = max( 7, lrwqp3, lrwcon, lrwsvdj )
748 ELSE
749 minrwrk = max( 7, lrwqp3, lrwsvdj )
750 END IF
751 END IF
752 IF ( rowpiv .OR. l2tran ) miniwrk = miniwrk + m
753 ELSE IF ( rsvec .AND. (.NOT.lsvec) ) THEN
754* .. minimal and optimal sizes of the complex workspace if the
755* singular values and the right singular vectors are requested
756 IF ( errest ) THEN
757 minwrk = max( n+lwqp3, lwcon, lwsvdj, n+lwlqf,
758 $ 2*n+lwqrf, n+lwsvdj, n+lwunmlq )
759 ELSE
760 minwrk = max( n+lwqp3, lwsvdj, n+lwlqf, 2*n+lwqrf,
761 $ n+lwsvdj, n+lwunmlq )
762 END IF
763 IF ( lquery ) THEN
764 CALL zgesvj( 'L', 'U', 'N', n,n, u, ldu, sva, n, a,
765 $ lda, cdummy, -1, rdummy, -1, ierr )
766 lwrk_zgesvj = int( cdummy(1) )
767 CALL zunmlq( 'L', 'C', n, n, n, a, lda, cdummy,
768 $ v, ldv, cdummy, -1, ierr )
769 lwrk_zunmlq = int( cdummy(1) )
770 IF ( errest ) THEN
771 optwrk = max( n+lwrk_zgeqp3, lwcon, lwrk_zgesvj,
772 $ n+lwrk_zgelqf, 2*n+lwrk_zgeqrf,
773 $ n+lwrk_zgesvj, n+lwrk_zunmlq )
774 ELSE
775 optwrk = max( n+lwrk_zgeqp3, lwrk_zgesvj,n+lwrk_zgelqf,
776 $ 2*n+lwrk_zgeqrf, n+lwrk_zgesvj,
777 $ n+lwrk_zunmlq )
778 END IF
779 END IF
780 IF ( l2tran .OR. rowpiv ) THEN
781 IF ( errest ) THEN
782 minrwrk = max( 7, 2*m, lrwqp3, lrwsvdj, lrwcon )
783 ELSE
784 minrwrk = max( 7, 2*m, lrwqp3, lrwsvdj )
785 END IF
786 ELSE
787 IF ( errest ) THEN
788 minrwrk = max( 7, lrwqp3, lrwsvdj, lrwcon )
789 ELSE
790 minrwrk = max( 7, lrwqp3, lrwsvdj )
791 END IF
792 END IF
793 IF ( rowpiv .OR. l2tran ) miniwrk = miniwrk + m
794 ELSE IF ( lsvec .AND. (.NOT.rsvec) ) THEN
795* .. minimal and optimal sizes of the complex workspace if the
796* singular values and the left singular vectors are requested
797 IF ( errest ) THEN
798 minwrk = n + max( lwqp3,lwcon,n+lwqrf,lwsvdj,lwunmqrm )
799 ELSE
800 minwrk = n + max( lwqp3, n+lwqrf, lwsvdj, lwunmqrm )
801 END IF
802 IF ( lquery ) THEN
803 CALL zgesvj( 'L', 'U', 'N', n,n, u, ldu, sva, n, a,
804 $ lda, cdummy, -1, rdummy, -1, ierr )
805 lwrk_zgesvj = int( cdummy(1) )
806 CALL zunmqr( 'L', 'N', m, n, n, a, lda, cdummy, u,
807 $ ldu, cdummy, -1, ierr )
808 lwrk_zunmqrm = int( cdummy(1) )
809 IF ( errest ) THEN
810 optwrk = n + max( lwrk_zgeqp3, lwcon, n+lwrk_zgeqrf,
811 $ lwrk_zgesvj, lwrk_zunmqrm )
812 ELSE
813 optwrk = n + max( lwrk_zgeqp3, n+lwrk_zgeqrf,
814 $ lwrk_zgesvj, lwrk_zunmqrm )
815 END IF
816 END IF
817 IF ( l2tran .OR. rowpiv ) THEN
818 IF ( errest ) THEN
819 minrwrk = max( 7, 2*m, lrwqp3, lrwsvdj, lrwcon )
820 ELSE
821 minrwrk = max( 7, 2*m, lrwqp3, lrwsvdj )
822 END IF
823 ELSE
824 IF ( errest ) THEN
825 minrwrk = max( 7, lrwqp3, lrwsvdj, lrwcon )
826 ELSE
827 minrwrk = max( 7, lrwqp3, lrwsvdj )
828 END IF
829 END IF
830 IF ( rowpiv .OR. l2tran ) miniwrk = miniwrk + m
831 ELSE
832* .. minimal and optimal sizes of the complex workspace if the
833* full SVD is requested
834 IF ( .NOT. jracc ) THEN
835 IF ( errest ) THEN
836 minwrk = max( n+lwqp3, n+lwcon, 2*n+n**2+lwcon,
837 $ 2*n+lwqrf, 2*n+lwqp3,
838 $ 2*n+n**2+n+lwlqf, 2*n+n**2+n+n**2+lwcon,
839 $ 2*n+n**2+n+lwsvdj, 2*n+n**2+n+lwsvdjv,
840 $ 2*n+n**2+n+lwunmqr,2*n+n**2+n+lwunmlq,
841 $ n+n**2+lwsvdj, n+lwunmqrm )
842 ELSE
843 minwrk = max( n+lwqp3, 2*n+n**2+lwcon,
844 $ 2*n+lwqrf, 2*n+lwqp3,
845 $ 2*n+n**2+n+lwlqf, 2*n+n**2+n+n**2+lwcon,
846 $ 2*n+n**2+n+lwsvdj, 2*n+n**2+n+lwsvdjv,
847 $ 2*n+n**2+n+lwunmqr,2*n+n**2+n+lwunmlq,
848 $ n+n**2+lwsvdj, n+lwunmqrm )
849 END IF
850 miniwrk = miniwrk + n
851 IF ( rowpiv .OR. l2tran ) miniwrk = miniwrk + m
852 ELSE
853 IF ( errest ) THEN
854 minwrk = max( n+lwqp3, n+lwcon, 2*n+lwqrf,
855 $ 2*n+n**2+lwsvdjv, 2*n+n**2+n+lwunmqr,
856 $ n+lwunmqrm )
857 ELSE
858 minwrk = max( n+lwqp3, 2*n+lwqrf,
859 $ 2*n+n**2+lwsvdjv, 2*n+n**2+n+lwunmqr,
860 $ n+lwunmqrm )
861 END IF
862 IF ( rowpiv .OR. l2tran ) miniwrk = miniwrk + m
863 END IF
864 IF ( lquery ) THEN
865 CALL zunmqr( 'L', 'N', m, n, n, a, lda, cdummy, u,
866 $ ldu, cdummy, -1, ierr )
867 lwrk_zunmqrm = int( cdummy(1) )
868 CALL zunmqr( 'L', 'N', n, n, n, a, lda, cdummy, u,
869 $ ldu, cdummy, -1, ierr )
870 lwrk_zunmqr = int( cdummy(1) )
871 IF ( .NOT. jracc ) THEN
872 CALL zgeqp3( n,n, a, lda, iwork, cdummy,cdummy, -1,
873 $ rdummy, ierr )
874 lwrk_zgeqp3n = int( cdummy(1) )
875 CALL zgesvj( 'L', 'U', 'N', n, n, u, ldu, sva,
876 $ n, v, ldv, cdummy, -1, rdummy, -1, ierr )
877 lwrk_zgesvj = int( cdummy(1) )
878 CALL zgesvj( 'U', 'U', 'N', n, n, u, ldu, sva,
879 $ n, v, ldv, cdummy, -1, rdummy, -1, ierr )
880 lwrk_zgesvju = int( cdummy(1) )
881 CALL zgesvj( 'L', 'U', 'V', n, n, u, ldu, sva,
882 $ n, v, ldv, cdummy, -1, rdummy, -1, ierr )
883 lwrk_zgesvjv = int( cdummy(1) )
884 CALL zunmlq( 'L', 'C', n, n, n, a, lda, cdummy,
885 $ v, ldv, cdummy, -1, ierr )
886 lwrk_zunmlq = int( cdummy(1) )
887 IF ( errest ) THEN
888 optwrk = max( n+lwrk_zgeqp3, n+lwcon,
889 $ 2*n+n**2+lwcon, 2*n+lwrk_zgeqrf,
890 $ 2*n+lwrk_zgeqp3n,
891 $ 2*n+n**2+n+lwrk_zgelqf,
892 $ 2*n+n**2+n+n**2+lwcon,
893 $ 2*n+n**2+n+lwrk_zgesvj,
894 $ 2*n+n**2+n+lwrk_zgesvjv,
895 $ 2*n+n**2+n+lwrk_zunmqr,
896 $ 2*n+n**2+n+lwrk_zunmlq,
897 $ n+n**2+lwrk_zgesvju,
898 $ n+lwrk_zunmqrm )
899 ELSE
900 optwrk = max( n+lwrk_zgeqp3,
901 $ 2*n+n**2+lwcon, 2*n+lwrk_zgeqrf,
902 $ 2*n+lwrk_zgeqp3n,
903 $ 2*n+n**2+n+lwrk_zgelqf,
904 $ 2*n+n**2+n+n**2+lwcon,
905 $ 2*n+n**2+n+lwrk_zgesvj,
906 $ 2*n+n**2+n+lwrk_zgesvjv,
907 $ 2*n+n**2+n+lwrk_zunmqr,
908 $ 2*n+n**2+n+lwrk_zunmlq,
909 $ n+n**2+lwrk_zgesvju,
910 $ n+lwrk_zunmqrm )
911 END IF
912 ELSE
913 CALL zgesvj( 'L', 'U', 'V', n, n, u, ldu, sva,
914 $ n, v, ldv, cdummy, -1, rdummy, -1, ierr )
915 lwrk_zgesvjv = int( cdummy(1) )
916 CALL zunmqr( 'L', 'N', n, n, n, cdummy, n, cdummy,
917 $ v, ldv, cdummy, -1, ierr )
918 lwrk_zunmqr = int( cdummy(1) )
919 CALL zunmqr( 'L', 'N', m, n, n, a, lda, cdummy, u,
920 $ ldu, cdummy, -1, ierr )
921 lwrk_zunmqrm = int( cdummy(1) )
922 IF ( errest ) THEN
923 optwrk = max( n+lwrk_zgeqp3, n+lwcon,
924 $ 2*n+lwrk_zgeqrf, 2*n+n**2,
925 $ 2*n+n**2+lwrk_zgesvjv,
926 $ 2*n+n**2+n+lwrk_zunmqr,n+lwrk_zunmqrm )
927 ELSE
928 optwrk = max( n+lwrk_zgeqp3, 2*n+lwrk_zgeqrf,
929 $ 2*n+n**2, 2*n+n**2+lwrk_zgesvjv,
930 $ 2*n+n**2+n+lwrk_zunmqr,
931 $ n+lwrk_zunmqrm )
932 END IF
933 END IF
934 END IF
935 IF ( l2tran .OR. rowpiv ) THEN
936 minrwrk = max( 7, 2*m, lrwqp3, lrwsvdj, lrwcon )
937 ELSE
938 minrwrk = max( 7, lrwqp3, lrwsvdj, lrwcon )
939 END IF
940 END IF
941 minwrk = max( 2, minwrk )
942 optwrk = max( minwrk, optwrk )
943 IF ( lwork .LT. minwrk .AND. (.NOT.lquery) ) info = - 17
944 IF ( lrwork .LT. minrwrk .AND. (.NOT.lquery) ) info = - 19
945 END IF
946*
947 IF ( info .NE. 0 ) THEN
948* #:(
949 CALL xerbla( 'ZGEJSV', - info )
950 RETURN
951 ELSE IF ( lquery ) THEN
952 cwork(1) = optwrk
953 cwork(2) = minwrk
954 rwork(1) = minrwrk
955 iwork(1) = max( 4, miniwrk )
956 RETURN
957 END IF
958*
959* Quick return for void matrix (Y3K safe)
960* #:)
961 IF ( ( m .EQ. 0 ) .OR. ( n .EQ. 0 ) ) THEN
962 iwork(1:4) = 0
963 rwork(1:7) = 0
964 RETURN
965 ENDIF
966*
967* Determine whether the matrix U should be M x N or M x M
968*
969 IF ( lsvec ) THEN
970 n1 = n
971 IF ( lsame( jobu, 'F' ) ) n1 = m
972 END IF
973*
974* Set numerical parameters
975*
976*! NOTE: Make sure DLAMCH() does not fail on the target architecture.
977*
978 epsln = dlamch('Epsilon')
979 sfmin = dlamch('SafeMinimum')
980 small = sfmin / epsln
981 big = dlamch('O')
982* BIG = ONE / SFMIN
983*
984* Initialize SVA(1:N) = diag( ||A e_i||_2 )_1^N
985*
986*(!) If necessary, scale SVA() to protect the largest norm from
987* overflow. It is possible that this scaling pushes the smallest
988* column norm left from the underflow threshold (extreme case).
989*
990 scalem = one / sqrt(dble(m)*dble(n))
991 noscal = .true.
992 goscal = .true.
993 DO 1874 p = 1, n
994 aapp = zero
995 aaqq = one
996 CALL zlassq( m, a(1,p), 1, aapp, aaqq )
997 IF ( aapp .GT. big ) THEN
998 info = - 9
999 CALL xerbla( 'ZGEJSV', -info )
1000 RETURN
1001 END IF
1002 aaqq = sqrt(aaqq)
1003 IF ( ( aapp .LT. (big / aaqq) ) .AND. noscal ) THEN
1004 sva(p) = aapp * aaqq
1005 ELSE
1006 noscal = .false.
1007 sva(p) = aapp * ( aaqq * scalem )
1008 IF ( goscal ) THEN
1009 goscal = .false.
1010 CALL dscal( p-1, scalem, sva, 1 )
1011 END IF
1012 END IF
1013 1874 CONTINUE
1014*
1015 IF ( noscal ) scalem = one
1016*
1017 aapp = zero
1018 aaqq = big
1019 DO 4781 p = 1, n
1020 aapp = max( aapp, sva(p) )
1021 IF ( sva(p) .NE. zero ) aaqq = min( aaqq, sva(p) )
1022 4781 CONTINUE
1023*
1024* Quick return for zero M x N matrix
1025* #:)
1026 IF ( aapp .EQ. zero ) THEN
1027 IF ( lsvec ) CALL zlaset( 'G', m, n1, czero, cone, u, ldu )
1028 IF ( rsvec ) CALL zlaset( 'G', n, n, czero, cone, v, ldv )
1029 rwork(1) = one
1030 rwork(2) = one
1031 IF ( errest ) rwork(3) = one
1032 IF ( lsvec .AND. rsvec ) THEN
1033 rwork(4) = one
1034 rwork(5) = one
1035 END IF
1036 IF ( l2tran ) THEN
1037 rwork(6) = zero
1038 rwork(7) = zero
1039 END IF
1040 iwork(1) = 0
1041 iwork(2) = 0
1042 iwork(3) = 0
1043 iwork(4) = -1
1044 RETURN
1045 END IF
1046*
1047* Issue warning if denormalized column norms detected. Override the
1048* high relative accuracy request. Issue licence to kill nonzero columns
1049* (set them to zero) whose norm is less than sigma_max / BIG (roughly).
1050* #:(
1051 warning = 0
1052 IF ( aaqq .LE. sfmin ) THEN
1053 l2rank = .true.
1054 l2kill = .true.
1055 warning = 1
1056 END IF
1057*
1058* Quick return for one-column matrix
1059* #:)
1060 IF ( n .EQ. 1 ) THEN
1061*
1062 IF ( lsvec ) THEN
1063 CALL zlascl( 'G',0,0,sva(1),scalem, m,1,a(1,1),lda,ierr )
1064 CALL zlacpy( 'A', m, 1, a, lda, u, ldu )
1065* computing all M left singular vectors of the M x 1 matrix
1066 IF ( n1 .NE. n ) THEN
1067 CALL zgeqrf( m, n, u,ldu, cwork, cwork(n+1),lwork-n,ierr )
1068 CALL zungqr( m,n1,1, u,ldu,cwork,cwork(n+1),lwork-n,ierr )
1069 CALL zcopy( m, a(1,1), 1, u(1,1), 1 )
1070 END IF
1071 END IF
1072 IF ( rsvec ) THEN
1073 v(1,1) = cone
1074 END IF
1075 IF ( sva(1) .LT. (big*scalem) ) THEN
1076 sva(1) = sva(1) / scalem
1077 scalem = one
1078 END IF
1079 rwork(1) = one / scalem
1080 rwork(2) = one
1081 IF ( sva(1) .NE. zero ) THEN
1082 iwork(1) = 1
1083 IF ( ( sva(1) / scalem) .GE. sfmin ) THEN
1084 iwork(2) = 1
1085 ELSE
1086 iwork(2) = 0
1087 END IF
1088 ELSE
1089 iwork(1) = 0
1090 iwork(2) = 0
1091 END IF
1092 iwork(3) = 0
1093 iwork(4) = -1
1094 IF ( errest ) rwork(3) = one
1095 IF ( lsvec .AND. rsvec ) THEN
1096 rwork(4) = one
1097 rwork(5) = one
1098 END IF
1099 IF ( l2tran ) THEN
1100 rwork(6) = zero
1101 rwork(7) = zero
1102 END IF
1103 RETURN
1104*
1105 END IF
1106*
1107 transp = .false.
1108*
1109 aatmax = -one
1110 aatmin = big
1111 IF ( rowpiv .OR. l2tran ) THEN
1112*
1113* Compute the row norms, needed to determine row pivoting sequence
1114* (in the case of heavily row weighted A, row pivoting is strongly
1115* advised) and to collect information needed to compare the
1116* structures of A * A^* and A^* * A (in the case L2TRAN.EQ..TRUE.).
1117*
1118 IF ( l2tran ) THEN
1119 DO 1950 p = 1, m
1120 xsc = zero
1121 temp1 = one
1122 CALL zlassq( n, a(p,1), lda, xsc, temp1 )
1123* ZLASSQ gets both the ell_2 and the ell_infinity norm
1124* in one pass through the vector
1125 rwork(m+p) = xsc * scalem
1126 rwork(p) = xsc * (scalem*sqrt(temp1))
1127 aatmax = max( aatmax, rwork(p) )
1128 IF (rwork(p) .NE. zero)
1129 $ aatmin = min(aatmin,rwork(p))
1130 1950 CONTINUE
1131 ELSE
1132 DO 1904 p = 1, m
1133 rwork(m+p) = scalem*abs( a(p,izamax(n,a(p,1),lda)) )
1134 aatmax = max( aatmax, rwork(m+p) )
1135 aatmin = min( aatmin, rwork(m+p) )
1136 1904 CONTINUE
1137 END IF
1138*
1139 END IF
1140*
1141* For square matrix A try to determine whether A^* would be better
1142* input for the preconditioned Jacobi SVD, with faster convergence.
1143* The decision is based on an O(N) function of the vector of column
1144* and row norms of A, based on the Shannon entropy. This should give
1145* the right choice in most cases when the difference actually matters.
1146* It may fail and pick the slower converging side.
1147*
1148 entra = zero
1149 entrat = zero
1150 IF ( l2tran ) THEN
1151*
1152 xsc = zero
1153 temp1 = one
1154 CALL dlassq( n, sva, 1, xsc, temp1 )
1155 temp1 = one / temp1
1156*
1157 entra = zero
1158 DO 1113 p = 1, n
1159 big1 = ( ( sva(p) / xsc )**2 ) * temp1
1160 IF ( big1 .NE. zero ) entra = entra + big1 * dlog(big1)
1161 1113 CONTINUE
1162 entra = - entra / dlog(dble(n))
1163*
1164* Now, SVA().^2/Trace(A^* * A) is a point in the probability simplex.
1165* It is derived from the diagonal of A^* * A. Do the same with the
1166* diagonal of A * A^*, compute the entropy of the corresponding
1167* probability distribution. Note that A * A^* and A^* * A have the
1168* same trace.
1169*
1170 entrat = zero
1171 DO 1114 p = 1, m
1172 big1 = ( ( rwork(p) / xsc )**2 ) * temp1
1173 IF ( big1 .NE. zero ) entrat = entrat + big1 * dlog(big1)
1174 1114 CONTINUE
1175 entrat = - entrat / dlog(dble(m))
1176*
1177* Analyze the entropies and decide A or A^*. Smaller entropy
1178* usually means better input for the algorithm.
1179*
1180 transp = ( entrat .LT. entra )
1181*
1182* If A^* is better than A, take the adjoint of A. This is allowed
1183* only for square matrices, M=N.
1184 IF ( transp ) THEN
1185* In an optimal implementation, this trivial transpose
1186* should be replaced with faster transpose.
1187 DO 1115 p = 1, n - 1
1188 a(p,p) = conjg(a(p,p))
1189 DO 1116 q = p + 1, n
1190 ctemp = conjg(a(q,p))
1191 a(q,p) = conjg(a(p,q))
1192 a(p,q) = ctemp
1193 1116 CONTINUE
1194 1115 CONTINUE
1195 a(n,n) = conjg(a(n,n))
1196 DO 1117 p = 1, n
1197 rwork(m+p) = sva(p)
1198 sva(p) = rwork(p)
1199* previously computed row 2-norms are now column 2-norms
1200* of the transposed matrix
1201 1117 CONTINUE
1202 temp1 = aapp
1203 aapp = aatmax
1204 aatmax = temp1
1205 temp1 = aaqq
1206 aaqq = aatmin
1207 aatmin = temp1
1208 kill = lsvec
1209 lsvec = rsvec
1210 rsvec = kill
1211 IF ( lsvec ) n1 = n
1212*
1213 rowpiv = .true.
1214 END IF
1215*
1216 END IF
1217* END IF L2TRAN
1218*
1219* Scale the matrix so that its maximal singular value remains less
1220* than SQRT(BIG) -- the matrix is scaled so that its maximal column
1221* has Euclidean norm equal to SQRT(BIG/N). The only reason to keep
1222* SQRT(BIG) instead of BIG is the fact that ZGEJSV uses LAPACK and
1223* BLAS routines that, in some implementations, are not capable of
1224* working in the full interval [SFMIN,BIG] and that they may provoke
1225* overflows in the intermediate results. If the singular values spread
1226* from SFMIN to BIG, then ZGESVJ will compute them. So, in that case,
1227* one should use ZGESVJ instead of ZGEJSV.
1228* >> change in the April 2016 update: allow bigger range, i.e. the
1229* largest column is allowed up to BIG/N and ZGESVJ will do the rest.
1230 big1 = sqrt( big )
1231 temp1 = sqrt( big / dble(n) )
1232* TEMP1 = BIG/DBLE(N)
1233*
1234 CALL dlascl( 'G', 0, 0, aapp, temp1, n, 1, sva, n, ierr )
1235 IF ( aaqq .GT. (aapp * sfmin) ) THEN
1236 aaqq = ( aaqq / aapp ) * temp1
1237 ELSE
1238 aaqq = ( aaqq * temp1 ) / aapp
1239 END IF
1240 temp1 = temp1 * scalem
1241 CALL zlascl( 'G', 0, 0, aapp, temp1, m, n, a, lda, ierr )
1242*
1243* To undo scaling at the end of this procedure, multiply the
1244* computed singular values with USCAL2 / USCAL1.
1245*
1246 uscal1 = temp1
1247 uscal2 = aapp
1248*
1249 IF ( l2kill ) THEN
1250* L2KILL enforces computation of nonzero singular values in
1251* the restricted range of condition number of the initial A,
1252* sigma_max(A) / sigma_min(A) approx. SQRT(BIG)/SQRT(SFMIN).
1253 xsc = sqrt( sfmin )
1254 ELSE
1255 xsc = small
1256*
1257* Now, if the condition number of A is too big,
1258* sigma_max(A) / sigma_min(A) .GT. SQRT(BIG/N) * EPSLN / SFMIN,
1259* as a precaution measure, the full SVD is computed using ZGESVJ
1260* with accumulated Jacobi rotations. This provides numerically
1261* more robust computation, at the cost of slightly increased run
1262* time. Depending on the concrete implementation of BLAS and LAPACK
1263* (i.e. how they behave in presence of extreme ill-conditioning) the
1264* implementor may decide to remove this switch.
1265 IF ( ( aaqq.LT.sqrt(sfmin) ) .AND. lsvec .AND. rsvec ) THEN
1266 jracc = .true.
1267 END IF
1268*
1269 END IF
1270 IF ( aaqq .LT. xsc ) THEN
1271 DO 700 p = 1, n
1272 IF ( sva(p) .LT. xsc ) THEN
1273 CALL zlaset( 'A', m, 1, czero, czero, a(1,p), lda )
1274 sva(p) = zero
1275 END IF
1276 700 CONTINUE
1277 END IF
1278*
1279* Preconditioning using QR factorization with pivoting
1280*
1281 IF ( rowpiv ) THEN
1282* Optional row permutation (Bjoerck row pivoting):
1283* A result by Cox and Higham shows that the Bjoerck's
1284* row pivoting combined with standard column pivoting
1285* has similar effect as Powell-Reid complete pivoting.
1286* The ell-infinity norms of A are made nonincreasing.
1287 IF ( ( lsvec .AND. rsvec ) .AND. .NOT.( jracc ) ) THEN
1288 iwoff = 2*n
1289 ELSE
1290 iwoff = n
1291 END IF
1292 DO 1952 p = 1, m - 1
1293 q = idamax( m-p+1, rwork(m+p), 1 ) + p - 1
1294 iwork(iwoff+p) = q
1295 IF ( p .NE. q ) THEN
1296 temp1 = rwork(m+p)
1297 rwork(m+p) = rwork(m+q)
1298 rwork(m+q) = temp1
1299 END IF
1300 1952 CONTINUE
1301 CALL zlaswp( n, a, lda, 1, m-1, iwork(iwoff+1), 1 )
1302 END IF
1303*
1304* End of the preparation phase (scaling, optional sorting and
1305* transposing, optional flushing of small columns).
1306*
1307* Preconditioning
1308*
1309* If the full SVD is needed, the right singular vectors are computed
1310* from a matrix equation, and for that we need theoretical analysis
1311* of the Businger-Golub pivoting. So we use ZGEQP3 as the first RR QRF.
1312* In all other cases the first RR QRF can be chosen by other criteria
1313* (eg speed by replacing global with restricted window pivoting, such
1314* as in xGEQPX from TOMS # 782). Good results will be obtained using
1315* xGEQPX with properly (!) chosen numerical parameters.
1316* Any improvement of ZGEQP3 improves overall performance of ZGEJSV.
1317*
1318* A * P1 = Q1 * [ R1^* 0]^*:
1319 DO 1963 p = 1, n
1320* .. all columns are free columns
1321 iwork(p) = 0
1322 1963 CONTINUE
1323 CALL zgeqp3( m, n, a, lda, iwork, cwork, cwork(n+1), lwork-n,
1324 $ rwork, ierr )
1325*
1326* The upper triangular matrix R1 from the first QRF is inspected for
1327* rank deficiency and possibilities for deflation, or possible
1328* ill-conditioning. Depending on the user specified flag L2RANK,
1329* the procedure explores possibilities to reduce the numerical
1330* rank by inspecting the computed upper triangular factor. If
1331* L2RANK or L2ABER are up, then ZGEJSV will compute the SVD of
1332* A + dA, where ||dA|| <= f(M,N)*EPSLN.
1333*
1334 nr = 1
1335 IF ( l2aber ) THEN
1336* Standard absolute error bound suffices. All sigma_i with
1337* sigma_i < N*EPSLN*||A|| are flushed to zero. This is an
1338* aggressive enforcement of lower numerical rank by introducing a
1339* backward error of the order of N*EPSLN*||A||.
1340 temp1 = sqrt(dble(n))*epsln
1341 DO 3001 p = 2, n
1342 IF ( abs(a(p,p)) .GE. (temp1*abs(a(1,1))) ) THEN
1343 nr = nr + 1
1344 ELSE
1345 GO TO 3002
1346 END IF
1347 3001 CONTINUE
1348 3002 CONTINUE
1349 ELSE IF ( l2rank ) THEN
1350* .. similarly as above, only slightly more gentle (less aggressive).
1351* Sudden drop on the diagonal of R1 is used as the criterion for
1352* close-to-rank-deficient.
1353 temp1 = sqrt(sfmin)
1354 DO 3401 p = 2, n
1355 IF ( ( abs(a(p,p)) .LT. (epsln*abs(a(p-1,p-1))) ) .OR.
1356 $ ( abs(a(p,p)) .LT. small ) .OR.
1357 $ ( l2kill .AND. (abs(a(p,p)) .LT. temp1) ) ) GO TO 3402
1358 nr = nr + 1
1359 3401 CONTINUE
1360 3402 CONTINUE
1361*
1362 ELSE
1363* The goal is high relative accuracy. However, if the matrix
1364* has high scaled condition number the relative accuracy is in
1365* general not feasible. Later on, a condition number estimator
1366* will be deployed to estimate the scaled condition number.
1367* Here we just remove the underflowed part of the triangular
1368* factor. This prevents the situation in which the code is
1369* working hard to get the accuracy not warranted by the data.
1370 temp1 = sqrt(sfmin)
1371 DO 3301 p = 2, n
1372 IF ( ( abs(a(p,p)) .LT. small ) .OR.
1373 $ ( l2kill .AND. (abs(a(p,p)) .LT. temp1) ) ) GO TO 3302
1374 nr = nr + 1
1375 3301 CONTINUE
1376 3302 CONTINUE
1377*
1378 END IF
1379*
1380 almort = .false.
1381 IF ( nr .EQ. n ) THEN
1382 maxprj = one
1383 DO 3051 p = 2, n
1384 temp1 = abs(a(p,p)) / sva(iwork(p))
1385 maxprj = min( maxprj, temp1 )
1386 3051 CONTINUE
1387 IF ( maxprj**2 .GE. one - dble(n)*epsln ) almort = .true.
1388 END IF
1389*
1390*
1391 sconda = - one
1392 condr1 = - one
1393 condr2 = - one
1394*
1395 IF ( errest ) THEN
1396 IF ( n .EQ. nr ) THEN
1397 IF ( rsvec ) THEN
1398* .. V is available as workspace
1399 CALL zlacpy( 'U', n, n, a, lda, v, ldv )
1400 DO 3053 p = 1, n
1401 temp1 = sva(iwork(p))
1402 CALL zdscal( p, one/temp1, v(1,p), 1 )
1403 3053 CONTINUE
1404 IF ( lsvec )THEN
1405 CALL zpocon( 'U', n, v, ldv, one, temp1,
1406 $ cwork(n+1), rwork, ierr )
1407 ELSE
1408 CALL zpocon( 'U', n, v, ldv, one, temp1,
1409 $ cwork, rwork, ierr )
1410 END IF
1411*
1412 ELSE IF ( lsvec ) THEN
1413* .. U is available as workspace
1414 CALL zlacpy( 'U', n, n, a, lda, u, ldu )
1415 DO 3054 p = 1, n
1416 temp1 = sva(iwork(p))
1417 CALL zdscal( p, one/temp1, u(1,p), 1 )
1418 3054 CONTINUE
1419 CALL zpocon( 'U', n, u, ldu, one, temp1,
1420 $ cwork(n+1), rwork, ierr )
1421 ELSE
1422 CALL zlacpy( 'U', n, n, a, lda, cwork, n )
1423*[] CALL ZLACPY( 'U', N, N, A, LDA, CWORK(N+1), N )
1424* Change: here index shifted by N to the left, CWORK(1:N)
1425* not needed for SIGMA only computation
1426 DO 3052 p = 1, n
1427 temp1 = sva(iwork(p))
1428*[] CALL ZDSCAL( p, ONE/TEMP1, CWORK(N+(p-1)*N+1), 1 )
1429 CALL zdscal( p, one/temp1, cwork((p-1)*n+1), 1 )
1430 3052 CONTINUE
1431* .. the columns of R are scaled to have unit Euclidean lengths.
1432*[] CALL ZPOCON( 'U', N, CWORK(N+1), N, ONE, TEMP1,
1433*[] $ CWORK(N+N*N+1), RWORK, IERR )
1434 CALL zpocon( 'U', n, cwork, n, one, temp1,
1435 $ cwork(n*n+1), rwork, ierr )
1436*
1437 END IF
1438 IF ( temp1 .NE. zero ) THEN
1439 sconda = one / sqrt(temp1)
1440 ELSE
1441 sconda = - one
1442 END IF
1443* SCONDA is an estimate of SQRT(||(R^* * R)^(-1)||_1).
1444* N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA
1445 ELSE
1446 sconda = - one
1447 END IF
1448 END IF
1449*
1450 l2pert = l2pert .AND. ( abs( a(1,1)/a(nr,nr) ) .GT. sqrt(big1) )
1451* If there is no violent scaling, artificial perturbation is not needed.
1452*
1453* Phase 3:
1454*
1455 IF ( .NOT. ( rsvec .OR. lsvec ) ) THEN
1456*
1457* Singular Values only
1458*
1459* .. transpose A(1:NR,1:N)
1460 DO 1946 p = 1, min( n-1, nr )
1461 CALL zcopy( n-p, a(p,p+1), lda, a(p+1,p), 1 )
1462 CALL zlacgv( n-p+1, a(p,p), 1 )
1463 1946 CONTINUE
1464 IF ( nr .EQ. n ) a(n,n) = conjg(a(n,n))
1465*
1466* The following two DO-loops introduce small relative perturbation
1467* into the strict upper triangle of the lower triangular matrix.
1468* Small entries below the main diagonal are also changed.
1469* This modification is useful if the computing environment does not
1470* provide/allow FLUSH TO ZERO underflow, for it prevents many
1471* annoying denormalized numbers in case of strongly scaled matrices.
1472* The perturbation is structured so that it does not introduce any
1473* new perturbation of the singular values, and it does not destroy
1474* the job done by the preconditioner.
1475* The licence for this perturbation is in the variable L2PERT, which
1476* should be .FALSE. if FLUSH TO ZERO underflow is active.
1477*
1478 IF ( .NOT. almort ) THEN
1479*
1480 IF ( l2pert ) THEN
1481* XSC = SQRT(SMALL)
1482 xsc = epsln / dble(n)
1483 DO 4947 q = 1, nr
1484 ctemp = dcmplx(xsc*abs(a(q,q)),zero)
1485 DO 4949 p = 1, n
1486 IF ( ( (p.GT.q) .AND. (abs(a(p,q)).LE.temp1) )
1487 $ .OR. ( p .LT. q ) )
1488* $ A(p,q) = TEMP1 * ( A(p,q) / ABS(A(p,q)) )
1489 $ a(p,q) = ctemp
1490 4949 CONTINUE
1491 4947 CONTINUE
1492 ELSE
1493 CALL zlaset( 'U', nr-1,nr-1, czero,czero, a(1,2),lda )
1494 END IF
1495*
1496* .. second preconditioning using the QR factorization
1497*
1498 CALL zgeqrf( n,nr, a,lda, cwork, cwork(n+1),lwork-n, ierr )
1499*
1500* .. and transpose upper to lower triangular
1501 DO 1948 p = 1, nr - 1
1502 CALL zcopy( nr-p, a(p,p+1), lda, a(p+1,p), 1 )
1503 CALL zlacgv( nr-p+1, a(p,p), 1 )
1504 1948 CONTINUE
1505*
1506 END IF
1507*
1508* Row-cyclic Jacobi SVD algorithm with column pivoting
1509*
1510* .. again some perturbation (a "background noise") is added
1511* to drown denormals
1512 IF ( l2pert ) THEN
1513* XSC = SQRT(SMALL)
1514 xsc = epsln / dble(n)
1515 DO 1947 q = 1, nr
1516 ctemp = dcmplx(xsc*abs(a(q,q)),zero)
1517 DO 1949 p = 1, nr
1518 IF ( ( (p.GT.q) .AND. (abs(a(p,q)).LE.temp1) )
1519 $ .OR. ( p .LT. q ) )
1520* $ A(p,q) = TEMP1 * ( A(p,q) / ABS(A(p,q)) )
1521 $ a(p,q) = ctemp
1522 1949 CONTINUE
1523 1947 CONTINUE
1524 ELSE
1525 CALL zlaset( 'U', nr-1, nr-1, czero, czero, a(1,2), lda )
1526 END IF
1527*
1528* .. and one-sided Jacobi rotations are started on a lower
1529* triangular matrix (plus perturbation which is ignored in
1530* the part which destroys triangular form (confusing?!))
1531*
1532 CALL zgesvj( 'L', 'N', 'N', nr, nr, a, lda, sva,
1533 $ n, v, ldv, cwork, lwork, rwork, lrwork, info )
1534*
1535 scalem = rwork(1)
1536 numrank = nint(rwork(2))
1537*
1538*
1539 ELSE IF ( ( rsvec .AND. ( .NOT. lsvec ) .AND. ( .NOT. jracc ) )
1540 $ .OR.
1541 $ ( jracc .AND. ( .NOT. lsvec ) .AND. ( nr .NE. n ) ) ) THEN
1542*
1543* -> Singular Values and Right Singular Vectors <-
1544*
1545 IF ( almort ) THEN
1546*
1547* .. in this case NR equals N
1548 DO 1998 p = 1, nr
1549 CALL zcopy( n-p+1, a(p,p), lda, v(p,p), 1 )
1550 CALL zlacgv( n-p+1, v(p,p), 1 )
1551 1998 CONTINUE
1552 CALL zlaset( 'U', nr-1,nr-1, czero, czero, v(1,2), ldv )
1553*
1554 CALL zgesvj( 'L','U','N', n, nr, v, ldv, sva, nr, a, lda,
1555 $ cwork, lwork, rwork, lrwork, info )
1556 scalem = rwork(1)
1557 numrank = nint(rwork(2))
1558
1559 ELSE
1560*
1561* .. two more QR factorizations ( one QRF is not enough, two require
1562* accumulated product of Jacobi rotations, three are perfect )
1563*
1564 CALL zlaset( 'L', nr-1,nr-1, czero, czero, a(2,1), lda )
1565 CALL zgelqf( nr,n, a, lda, cwork, cwork(n+1), lwork-n, ierr)
1566 CALL zlacpy( 'L', nr, nr, a, lda, v, ldv )
1567 CALL zlaset( 'U', nr-1,nr-1, czero, czero, v(1,2), ldv )
1568 CALL zgeqrf( nr, nr, v, ldv, cwork(n+1), cwork(2*n+1),
1569 $ lwork-2*n, ierr )
1570 DO 8998 p = 1, nr
1571 CALL zcopy( nr-p+1, v(p,p), ldv, v(p,p), 1 )
1572 CALL zlacgv( nr-p+1, v(p,p), 1 )
1573 8998 CONTINUE
1574 CALL zlaset('U', nr-1, nr-1, czero, czero, v(1,2), ldv)
1575*
1576 CALL zgesvj( 'L', 'U','N', nr, nr, v,ldv, sva, nr, u,
1577 $ ldu, cwork(n+1), lwork-n, rwork, lrwork, info )
1578 scalem = rwork(1)
1579 numrank = nint(rwork(2))
1580 IF ( nr .LT. n ) THEN
1581 CALL zlaset( 'A',n-nr, nr, czero,czero, v(nr+1,1), ldv )
1582 CALL zlaset( 'A',nr, n-nr, czero,czero, v(1,nr+1), ldv )
1583 CALL zlaset( 'A',n-nr,n-nr,czero,cone, v(nr+1,nr+1),ldv )
1584 END IF
1585*
1586 CALL zunmlq( 'L', 'C', n, n, nr, a, lda, cwork,
1587 $ v, ldv, cwork(n+1), lwork-n, ierr )
1588*
1589 END IF
1590* .. permute the rows of V
1591* DO 8991 p = 1, N
1592* CALL ZCOPY( N, V(p,1), LDV, A(IWORK(p),1), LDA )
1593* 8991 CONTINUE
1594* CALL ZLACPY( 'All', N, N, A, LDA, V, LDV )
1595 CALL zlapmr( .false., n, n, v, ldv, iwork )
1596*
1597 IF ( transp ) THEN
1598 CALL zlacpy( 'A', n, n, v, ldv, u, ldu )
1599 END IF
1600*
1601 ELSE IF ( jracc .AND. (.NOT. lsvec) .AND. ( nr.EQ. n ) ) THEN
1602*
1603 CALL zlaset( 'L', n-1,n-1, czero, czero, a(2,1), lda )
1604*
1605 CALL zgesvj( 'U','N','V', n, n, a, lda, sva, n, v, ldv,
1606 $ cwork, lwork, rwork, lrwork, info )
1607 scalem = rwork(1)
1608 numrank = nint(rwork(2))
1609 CALL zlapmr( .false., n, n, v, ldv, iwork )
1610*
1611 ELSE IF ( lsvec .AND. ( .NOT. rsvec ) ) THEN
1612*
1613* .. Singular Values and Left Singular Vectors ..
1614*
1615* .. second preconditioning step to avoid need to accumulate
1616* Jacobi rotations in the Jacobi iterations.
1617 DO 1965 p = 1, nr
1618 CALL zcopy( n-p+1, a(p,p), lda, u(p,p), 1 )
1619 CALL zlacgv( n-p+1, u(p,p), 1 )
1620 1965 CONTINUE
1621 CALL zlaset( 'U', nr-1, nr-1, czero, czero, u(1,2), ldu )
1622*
1623 CALL zgeqrf( n, nr, u, ldu, cwork(n+1), cwork(2*n+1),
1624 $ lwork-2*n, ierr )
1625*
1626 DO 1967 p = 1, nr - 1
1627 CALL zcopy( nr-p, u(p,p+1), ldu, u(p+1,p), 1 )
1628 CALL zlacgv( n-p+1, u(p,p), 1 )
1629 1967 CONTINUE
1630 CALL zlaset( 'U', nr-1, nr-1, czero, czero, u(1,2), ldu )
1631*
1632 CALL zgesvj( 'L', 'U', 'N', nr,nr, u, ldu, sva, nr, a,
1633 $ lda, cwork(n+1), lwork-n, rwork, lrwork, info )
1634 scalem = rwork(1)
1635 numrank = nint(rwork(2))
1636*
1637 IF ( nr .LT. m ) THEN
1638 CALL zlaset( 'A', m-nr, nr,czero, czero, u(nr+1,1), ldu )
1639 IF ( nr .LT. n1 ) THEN
1640 CALL zlaset( 'A',nr, n1-nr, czero, czero, u(1,nr+1),ldu )
1641 CALL zlaset( 'A',m-nr,n1-nr,czero,cone,u(nr+1,nr+1),ldu )
1642 END IF
1643 END IF
1644*
1645 CALL zunmqr( 'L', 'N', m, n1, n, a, lda, cwork, u,
1646 $ ldu, cwork(n+1), lwork-n, ierr )
1647*
1648 IF ( rowpiv )
1649 $ CALL zlaswp( n1, u, ldu, 1, m-1, iwork(iwoff+1), -1 )
1650*
1651 DO 1974 p = 1, n1
1652 xsc = one / dznrm2( m, u(1,p), 1 )
1653 CALL zdscal( m, xsc, u(1,p), 1 )
1654 1974 CONTINUE
1655*
1656 IF ( transp ) THEN
1657 CALL zlacpy( 'A', n, n, u, ldu, v, ldv )
1658 END IF
1659*
1660 ELSE
1661*
1662* .. Full SVD ..
1663*
1664 IF ( .NOT. jracc ) THEN
1665*
1666 IF ( .NOT. almort ) THEN
1667*
1668* Second Preconditioning Step (QRF [with pivoting])
1669* Note that the composition of TRANSPOSE, QRF and TRANSPOSE is
1670* equivalent to an LQF CALL. Since in many libraries the QRF
1671* seems to be better optimized than the LQF, we do explicit
1672* transpose and use the QRF. This is subject to changes in an
1673* optimized implementation of ZGEJSV.
1674*
1675 DO 1968 p = 1, nr
1676 CALL zcopy( n-p+1, a(p,p), lda, v(p,p), 1 )
1677 CALL zlacgv( n-p+1, v(p,p), 1 )
1678 1968 CONTINUE
1679*
1680* .. the following two loops perturb small entries to avoid
1681* denormals in the second QR factorization, where they are
1682* as good as zeros. This is done to avoid painfully slow
1683* computation with denormals. The relative size of the perturbation
1684* is a parameter that can be changed by the implementer.
1685* This perturbation device will be obsolete on machines with
1686* properly implemented arithmetic.
1687* To switch it off, set L2PERT=.FALSE. To remove it from the
1688* code, remove the action under L2PERT=.TRUE., leave the ELSE part.
1689* The following two loops should be blocked and fused with the
1690* transposed copy above.
1691*
1692 IF ( l2pert ) THEN
1693 xsc = sqrt(small)
1694 DO 2969 q = 1, nr
1695 ctemp = dcmplx(xsc*abs( v(q,q) ),zero)
1696 DO 2968 p = 1, n
1697 IF ( ( p .GT. q ) .AND. ( abs(v(p,q)) .LE. temp1 )
1698 $ .OR. ( p .LT. q ) )
1699* $ V(p,q) = TEMP1 * ( V(p,q) / ABS(V(p,q)) )
1700 $ v(p,q) = ctemp
1701 IF ( p .LT. q ) v(p,q) = - v(p,q)
1702 2968 CONTINUE
1703 2969 CONTINUE
1704 ELSE
1705 CALL zlaset( 'U', nr-1, nr-1, czero, czero, v(1,2), ldv )
1706 END IF
1707*
1708* Estimate the row scaled condition number of R1
1709* (If R1 is rectangular, N > NR, then the condition number
1710* of the leading NR x NR submatrix is estimated.)
1711*
1712 CALL zlacpy( 'L', nr, nr, v, ldv, cwork(2*n+1), nr )
1713 DO 3950 p = 1, nr
1714 temp1 = dznrm2(nr-p+1,cwork(2*n+(p-1)*nr+p),1)
1715 CALL zdscal(nr-p+1,one/temp1,cwork(2*n+(p-1)*nr+p),1)
1716 3950 CONTINUE
1717 CALL zpocon('L',nr,cwork(2*n+1),nr,one,temp1,
1718 $ cwork(2*n+nr*nr+1),rwork,ierr)
1719 condr1 = one / sqrt(temp1)
1720* .. here need a second opinion on the condition number
1721* .. then assume worst case scenario
1722* R1 is OK for inverse <=> CONDR1 .LT. DBLE(N)
1723* more conservative <=> CONDR1 .LT. SQRT(DBLE(N))
1724*
1725 cond_ok = sqrt(sqrt(dble(nr)))
1726*[TP] COND_OK is a tuning parameter.
1727*
1728 IF ( condr1 .LT. cond_ok ) THEN
1729* .. the second QRF without pivoting. Note: in an optimized
1730* implementation, this QRF should be implemented as the QRF
1731* of a lower triangular matrix.
1732* R1^* = Q2 * R2
1733 CALL zgeqrf( n, nr, v, ldv, cwork(n+1), cwork(2*n+1),
1734 $ lwork-2*n, ierr )
1735*
1736 IF ( l2pert ) THEN
1737 xsc = sqrt(small)/epsln
1738 DO 3959 p = 2, nr
1739 DO 3958 q = 1, p - 1
1740 ctemp=dcmplx(xsc*min(abs(v(p,p)),abs(v(q,q))),
1741 $ zero)
1742 IF ( abs(v(q,p)) .LE. temp1 )
1743* $ V(q,p) = TEMP1 * ( V(q,p) / ABS(V(q,p)) )
1744 $ v(q,p) = ctemp
1745 3958 CONTINUE
1746 3959 CONTINUE
1747 END IF
1748*
1749 IF ( nr .NE. n )
1750 $ CALL zlacpy( 'A', n, nr, v, ldv, cwork(2*n+1), n )
1751* .. save ...
1752*
1753* .. this transposed copy should be better than naive
1754 DO 1969 p = 1, nr - 1
1755 CALL zcopy( nr-p, v(p,p+1), ldv, v(p+1,p), 1 )
1756 CALL zlacgv(nr-p+1, v(p,p), 1 )
1757 1969 CONTINUE
1758 v(nr,nr)=conjg(v(nr,nr))
1759*
1760 condr2 = condr1
1761*
1762 ELSE
1763*
1764* .. ill-conditioned case: second QRF with pivoting
1765* Note that windowed pivoting would be equally good
1766* numerically, and more run-time efficient. So, in
1767* an optimal implementation, the next call to ZGEQP3
1768* should be replaced with eg. CALL ZGEQPX (ACM TOMS #782)
1769* with properly (carefully) chosen parameters.
1770*
1771* R1^* * P2 = Q2 * R2
1772 DO 3003 p = 1, nr
1773 iwork(n+p) = 0
1774 3003 CONTINUE
1775 CALL zgeqp3( n, nr, v, ldv, iwork(n+1), cwork(n+1),
1776 $ cwork(2*n+1), lwork-2*n, rwork, ierr )
1777** CALL ZGEQRF( N, NR, V, LDV, CWORK(N+1), CWORK(2*N+1),
1778** $ LWORK-2*N, IERR )
1779 IF ( l2pert ) THEN
1780 xsc = sqrt(small)
1781 DO 3969 p = 2, nr
1782 DO 3968 q = 1, p - 1
1783 ctemp=dcmplx(xsc*min(abs(v(p,p)),abs(v(q,q))),
1784 $ zero)
1785 IF ( abs(v(q,p)) .LE. temp1 )
1786* $ V(q,p) = TEMP1 * ( V(q,p) / ABS(V(q,p)) )
1787 $ v(q,p) = ctemp
1788 3968 CONTINUE
1789 3969 CONTINUE
1790 END IF
1791*
1792 CALL zlacpy( 'A', n, nr, v, ldv, cwork(2*n+1), n )
1793*
1794 IF ( l2pert ) THEN
1795 xsc = sqrt(small)
1796 DO 8970 p = 2, nr
1797 DO 8971 q = 1, p - 1
1798 ctemp=dcmplx(xsc*min(abs(v(p,p)),abs(v(q,q))),
1799 $ zero)
1800* V(p,q) = - TEMP1*( V(q,p) / ABS(V(q,p)) )
1801 v(p,q) = - ctemp
1802 8971 CONTINUE
1803 8970 CONTINUE
1804 ELSE
1805 CALL zlaset( 'L',nr-1,nr-1,czero,czero,v(2,1),ldv )
1806 END IF
1807* Now, compute R2 = L3 * Q3, the LQ factorization.
1808 CALL zgelqf( nr, nr, v, ldv, cwork(2*n+n*nr+1),
1809 $ cwork(2*n+n*nr+nr+1), lwork-2*n-n*nr-nr, ierr )
1810* .. and estimate the condition number
1811 CALL zlacpy( 'L',nr,nr,v,ldv,cwork(2*n+n*nr+nr+1),nr )
1812 DO 4950 p = 1, nr
1813 temp1 = dznrm2( p, cwork(2*n+n*nr+nr+p), nr )
1814 CALL zdscal( p, one/temp1, cwork(2*n+n*nr+nr+p), nr )
1815 4950 CONTINUE
1816 CALL zpocon( 'L',nr,cwork(2*n+n*nr+nr+1),nr,one,temp1,
1817 $ cwork(2*n+n*nr+nr+nr*nr+1),rwork,ierr )
1818 condr2 = one / sqrt(temp1)
1819*
1820*
1821 IF ( condr2 .GE. cond_ok ) THEN
1822* .. save the Householder vectors used for Q3
1823* (this overwrites the copy of R2, as it will not be
1824* needed in this branch, but it does not overwritte the
1825* Huseholder vectors of Q2.).
1826 CALL zlacpy( 'U', nr, nr, v, ldv, cwork(2*n+1), n )
1827* .. and the rest of the information on Q3 is in
1828* WORK(2*N+N*NR+1:2*N+N*NR+N)
1829 END IF
1830*
1831 END IF
1832*
1833 IF ( l2pert ) THEN
1834 xsc = sqrt(small)
1835 DO 4968 q = 2, nr
1836 ctemp = xsc * v(q,q)
1837 DO 4969 p = 1, q - 1
1838* V(p,q) = - TEMP1*( V(p,q) / ABS(V(p,q)) )
1839 v(p,q) = - ctemp
1840 4969 CONTINUE
1841 4968 CONTINUE
1842 ELSE
1843 CALL zlaset( 'U', nr-1,nr-1, czero,czero, v(1,2), ldv )
1844 END IF
1845*
1846* Second preconditioning finished; continue with Jacobi SVD
1847* The input matrix is lower trinagular.
1848*
1849* Recover the right singular vectors as solution of a well
1850* conditioned triangular matrix equation.
1851*
1852 IF ( condr1 .LT. cond_ok ) THEN
1853*
1854 CALL zgesvj( 'L','U','N',nr,nr,v,ldv,sva,nr,u, ldu,
1855 $ cwork(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,rwork,
1856 $ lrwork, info )
1857 scalem = rwork(1)
1858 numrank = nint(rwork(2))
1859 DO 3970 p = 1, nr
1860 CALL zcopy( nr, v(1,p), 1, u(1,p), 1 )
1861 CALL zdscal( nr, sva(p), v(1,p), 1 )
1862 3970 CONTINUE
1863
1864* .. pick the right matrix equation and solve it
1865*
1866 IF ( nr .EQ. n ) THEN
1867* :)) .. best case, R1 is inverted. The solution of this matrix
1868* equation is Q2*V2 = the product of the Jacobi rotations
1869* used in ZGESVJ, premultiplied with the orthogonal matrix
1870* from the second QR factorization.
1871 CALL ztrsm('L','U','N','N', nr,nr,cone, a,lda, v,ldv)
1872 ELSE
1873* .. R1 is well conditioned, but non-square. Adjoint of R2
1874* is inverted to get the product of the Jacobi rotations
1875* used in ZGESVJ. The Q-factor from the second QR
1876* factorization is then built in explicitly.
1877 CALL ztrsm('L','U','C','N',nr,nr,cone,cwork(2*n+1),
1878 $ n,v,ldv)
1879 IF ( nr .LT. n ) THEN
1880 CALL zlaset('A',n-nr,nr,czero,czero,v(nr+1,1),ldv)
1881 CALL zlaset('A',nr,n-nr,czero,czero,v(1,nr+1),ldv)
1882 CALL zlaset('A',n-nr,n-nr,czero,cone,v(nr+1,nr+1),ldv)
1883 END IF
1884 CALL zunmqr('L','N',n,n,nr,cwork(2*n+1),n,cwork(n+1),
1885 $ v,ldv,cwork(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,ierr)
1886 END IF
1887*
1888 ELSE IF ( condr2 .LT. cond_ok ) THEN
1889*
1890* The matrix R2 is inverted. The solution of the matrix equation
1891* is Q3^* * V3 = the product of the Jacobi rotations (appplied to
1892* the lower triangular L3 from the LQ factorization of
1893* R2=L3*Q3), pre-multiplied with the transposed Q3.
1894 CALL zgesvj( 'L', 'U', 'N', nr, nr, v, ldv, sva, nr, u,
1895 $ ldu, cwork(2*n+n*nr+nr+1), lwork-2*n-n*nr-nr,
1896 $ rwork, lrwork, info )
1897 scalem = rwork(1)
1898 numrank = nint(rwork(2))
1899 DO 3870 p = 1, nr
1900 CALL zcopy( nr, v(1,p), 1, u(1,p), 1 )
1901 CALL zdscal( nr, sva(p), u(1,p), 1 )
1902 3870 CONTINUE
1903 CALL ztrsm('L','U','N','N',nr,nr,cone,cwork(2*n+1),n,
1904 $ u,ldu)
1905* .. apply the permutation from the second QR factorization
1906 DO 873 q = 1, nr
1907 DO 872 p = 1, nr
1908 cwork(2*n+n*nr+nr+iwork(n+p)) = u(p,q)
1909 872 CONTINUE
1910 DO 874 p = 1, nr
1911 u(p,q) = cwork(2*n+n*nr+nr+p)
1912 874 CONTINUE
1913 873 CONTINUE
1914 IF ( nr .LT. n ) THEN
1915 CALL zlaset( 'A',n-nr,nr,czero,czero,v(nr+1,1),ldv )
1916 CALL zlaset( 'A',nr,n-nr,czero,czero,v(1,nr+1),ldv )
1917 CALL zlaset('A',n-nr,n-nr,czero,cone,v(nr+1,nr+1),ldv)
1918 END IF
1919 CALL zunmqr( 'L','N',n,n,nr,cwork(2*n+1),n,cwork(n+1),
1920 $ v,ldv,cwork(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,ierr )
1921 ELSE
1922* Last line of defense.
1923* #:( This is a rather pathological case: no scaled condition
1924* improvement after two pivoted QR factorizations. Other
1925* possibility is that the rank revealing QR factorization
1926* or the condition estimator has failed, or the COND_OK
1927* is set very close to ONE (which is unnecessary). Normally,
1928* this branch should never be executed, but in rare cases of
1929* failure of the RRQR or condition estimator, the last line of
1930* defense ensures that ZGEJSV completes the task.
1931* Compute the full SVD of L3 using ZGESVJ with explicit
1932* accumulation of Jacobi rotations.
1933 CALL zgesvj( 'L', 'U', 'V', nr, nr, v, ldv, sva, nr, u,
1934 $ ldu, cwork(2*n+n*nr+nr+1), lwork-2*n-n*nr-nr,
1935 $ rwork, lrwork, info )
1936 scalem = rwork(1)
1937 numrank = nint(rwork(2))
1938 IF ( nr .LT. n ) THEN
1939 CALL zlaset( 'A',n-nr,nr,czero,czero,v(nr+1,1),ldv )
1940 CALL zlaset( 'A',nr,n-nr,czero,czero,v(1,nr+1),ldv )
1941 CALL zlaset('A',n-nr,n-nr,czero,cone,v(nr+1,nr+1),ldv)
1942 END IF
1943 CALL zunmqr( 'L','N',n,n,nr,cwork(2*n+1),n,cwork(n+1),
1944 $ v,ldv,cwork(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,ierr )
1945*
1946 CALL zunmlq( 'L', 'C', nr, nr, nr, cwork(2*n+1), n,
1947 $ cwork(2*n+n*nr+1), u, ldu, cwork(2*n+n*nr+nr+1),
1948 $ lwork-2*n-n*nr-nr, ierr )
1949 DO 773 q = 1, nr
1950 DO 772 p = 1, nr
1951 cwork(2*n+n*nr+nr+iwork(n+p)) = u(p,q)
1952 772 CONTINUE
1953 DO 774 p = 1, nr
1954 u(p,q) = cwork(2*n+n*nr+nr+p)
1955 774 CONTINUE
1956 773 CONTINUE
1957*
1958 END IF
1959*
1960* Permute the rows of V using the (column) permutation from the
1961* first QRF. Also, scale the columns to make them unit in
1962* Euclidean norm. This applies to all cases.
1963*
1964 temp1 = sqrt(dble(n)) * epsln
1965 DO 1972 q = 1, n
1966 DO 972 p = 1, n
1967 cwork(2*n+n*nr+nr+iwork(p)) = v(p,q)
1968 972 CONTINUE
1969 DO 973 p = 1, n
1970 v(p,q) = cwork(2*n+n*nr+nr+p)
1971 973 CONTINUE
1972 xsc = one / dznrm2( n, v(1,q), 1 )
1973 IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
1974 $ CALL zdscal( n, xsc, v(1,q), 1 )
1975 1972 CONTINUE
1976* At this moment, V contains the right singular vectors of A.
1977* Next, assemble the left singular vector matrix U (M x N).
1978 IF ( nr .LT. m ) THEN
1979 CALL zlaset('A', m-nr, nr, czero, czero, u(nr+1,1), ldu)
1980 IF ( nr .LT. n1 ) THEN
1981 CALL zlaset('A',nr,n1-nr,czero,czero,u(1,nr+1),ldu)
1982 CALL zlaset('A',m-nr,n1-nr,czero,cone,
1983 $ u(nr+1,nr+1),ldu)
1984 END IF
1985 END IF
1986*
1987* The Q matrix from the first QRF is built into the left singular
1988* matrix U. This applies to all cases.
1989*
1990 CALL zunmqr( 'L', 'N', m, n1, n, a, lda, cwork, u,
1991 $ ldu, cwork(n+1), lwork-n, ierr )
1992
1993* The columns of U are normalized. The cost is O(M*N) flops.
1994 temp1 = sqrt(dble(m)) * epsln
1995 DO 1973 p = 1, nr
1996 xsc = one / dznrm2( m, u(1,p), 1 )
1997 IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
1998 $ CALL zdscal( m, xsc, u(1,p), 1 )
1999 1973 CONTINUE
2000*
2001* If the initial QRF is computed with row pivoting, the left
2002* singular vectors must be adjusted.
2003*
2004 IF ( rowpiv )
2005 $ CALL zlaswp( n1, u, ldu, 1, m-1, iwork(iwoff+1), -1 )
2006*
2007 ELSE
2008*
2009* .. the initial matrix A has almost orthogonal columns and
2010* the second QRF is not needed
2011*
2012 CALL zlacpy( 'U', n, n, a, lda, cwork(n+1), n )
2013 IF ( l2pert ) THEN
2014 xsc = sqrt(small)
2015 DO 5970 p = 2, n
2016 ctemp = xsc * cwork( n + (p-1)*n + p )
2017 DO 5971 q = 1, p - 1
2018* CWORK(N+(q-1)*N+p)=-TEMP1 * ( CWORK(N+(p-1)*N+q) /
2019* $ ABS(CWORK(N+(p-1)*N+q)) )
2020 cwork(n+(q-1)*n+p)=-ctemp
2021 5971 CONTINUE
2022 5970 CONTINUE
2023 ELSE
2024 CALL zlaset( 'L',n-1,n-1,czero,czero,cwork(n+2),n )
2025 END IF
2026*
2027 CALL zgesvj( 'U', 'U', 'N', n, n, cwork(n+1), n, sva,
2028 $ n, u, ldu, cwork(n+n*n+1), lwork-n-n*n, rwork, lrwork,
2029 $ info )
2030*
2031 scalem = rwork(1)
2032 numrank = nint(rwork(2))
2033 DO 6970 p = 1, n
2034 CALL zcopy( n, cwork(n+(p-1)*n+1), 1, u(1,p), 1 )
2035 CALL zdscal( n, sva(p), cwork(n+(p-1)*n+1), 1 )
2036 6970 CONTINUE
2037*
2038 CALL ztrsm( 'L', 'U', 'N', 'N', n, n,
2039 $ cone, a, lda, cwork(n+1), n )
2040 DO 6972 p = 1, n
2041 CALL zcopy( n, cwork(n+p), n, v(iwork(p),1), ldv )
2042 6972 CONTINUE
2043 temp1 = sqrt(dble(n))*epsln
2044 DO 6971 p = 1, n
2045 xsc = one / dznrm2( n, v(1,p), 1 )
2046 IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
2047 $ CALL zdscal( n, xsc, v(1,p), 1 )
2048 6971 CONTINUE
2049*
2050* Assemble the left singular vector matrix U (M x N).
2051*
2052 IF ( n .LT. m ) THEN
2053 CALL zlaset( 'A', m-n, n, czero, czero, u(n+1,1), ldu )
2054 IF ( n .LT. n1 ) THEN
2055 CALL zlaset('A',n, n1-n, czero, czero, u(1,n+1),ldu)
2056 CALL zlaset( 'A',m-n,n1-n, czero, cone,u(n+1,n+1),ldu)
2057 END IF
2058 END IF
2059 CALL zunmqr( 'L', 'N', m, n1, n, a, lda, cwork, u,
2060 $ ldu, cwork(n+1), lwork-n, ierr )
2061 temp1 = sqrt(dble(m))*epsln
2062 DO 6973 p = 1, n1
2063 xsc = one / dznrm2( m, u(1,p), 1 )
2064 IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
2065 $ CALL zdscal( m, xsc, u(1,p), 1 )
2066 6973 CONTINUE
2067*
2068 IF ( rowpiv )
2069 $ CALL zlaswp( n1, u, ldu, 1, m-1, iwork(iwoff+1), -1 )
2070*
2071 END IF
2072*
2073* end of the >> almost orthogonal case << in the full SVD
2074*
2075 ELSE
2076*
2077* This branch deploys a preconditioned Jacobi SVD with explicitly
2078* accumulated rotations. It is included as optional, mainly for
2079* experimental purposes. It does perform well, and can also be used.
2080* In this implementation, this branch will be automatically activated
2081* if the condition number sigma_max(A) / sigma_min(A) is predicted
2082* to be greater than the overflow threshold. This is because the
2083* a posteriori computation of the singular vectors assumes robust
2084* implementation of BLAS and some LAPACK procedures, capable of working
2085* in presence of extreme values, e.g. when the singular values spread from
2086* the underflow to the overflow threshold.
2087*
2088 DO 7968 p = 1, nr
2089 CALL zcopy( n-p+1, a(p,p), lda, v(p,p), 1 )
2090 CALL zlacgv( n-p+1, v(p,p), 1 )
2091 7968 CONTINUE
2092*
2093 IF ( l2pert ) THEN
2094 xsc = sqrt(small/epsln)
2095 DO 5969 q = 1, nr
2096 ctemp = dcmplx(xsc*abs( v(q,q) ),zero)
2097 DO 5968 p = 1, n
2098 IF ( ( p .GT. q ) .AND. ( abs(v(p,q)) .LE. temp1 )
2099 $ .OR. ( p .LT. q ) )
2100* $ V(p,q) = TEMP1 * ( V(p,q) / ABS(V(p,q)) )
2101 $ v(p,q) = ctemp
2102 IF ( p .LT. q ) v(p,q) = - v(p,q)
2103 5968 CONTINUE
2104 5969 CONTINUE
2105 ELSE
2106 CALL zlaset( 'U', nr-1, nr-1, czero, czero, v(1,2), ldv )
2107 END IF
2108
2109 CALL zgeqrf( n, nr, v, ldv, cwork(n+1), cwork(2*n+1),
2110 $ lwork-2*n, ierr )
2111 CALL zlacpy( 'L', n, nr, v, ldv, cwork(2*n+1), n )
2112*
2113 DO 7969 p = 1, nr
2114 CALL zcopy( nr-p+1, v(p,p), ldv, u(p,p), 1 )
2115 CALL zlacgv( nr-p+1, u(p,p), 1 )
2116 7969 CONTINUE
2117
2118 IF ( l2pert ) THEN
2119 xsc = sqrt(small/epsln)
2120 DO 9970 q = 2, nr
2121 DO 9971 p = 1, q - 1
2122 ctemp = dcmplx(xsc * min(abs(u(p,p)),abs(u(q,q))),
2123 $ zero)
2124* U(p,q) = - TEMP1 * ( U(q,p) / ABS(U(q,p)) )
2125 u(p,q) = - ctemp
2126 9971 CONTINUE
2127 9970 CONTINUE
2128 ELSE
2129 CALL zlaset('U', nr-1, nr-1, czero, czero, u(1,2), ldu )
2130 END IF
2131
2132 CALL zgesvj( 'L', 'U', 'V', nr, nr, u, ldu, sva,
2133 $ n, v, ldv, cwork(2*n+n*nr+1), lwork-2*n-n*nr,
2134 $ rwork, lrwork, info )
2135 scalem = rwork(1)
2136 numrank = nint(rwork(2))
2137
2138 IF ( nr .LT. n ) THEN
2139 CALL zlaset( 'A',n-nr,nr,czero,czero,v(nr+1,1),ldv )
2140 CALL zlaset( 'A',nr,n-nr,czero,czero,v(1,nr+1),ldv )
2141 CALL zlaset( 'A',n-nr,n-nr,czero,cone,v(nr+1,nr+1),ldv )
2142 END IF
2143
2144 CALL zunmqr( 'L','N',n,n,nr,cwork(2*n+1),n,cwork(n+1),
2145 $ v,ldv,cwork(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,ierr )
2146*
2147* Permute the rows of V using the (column) permutation from the
2148* first QRF. Also, scale the columns to make them unit in
2149* Euclidean norm. This applies to all cases.
2150*
2151 temp1 = sqrt(dble(n)) * epsln
2152 DO 7972 q = 1, n
2153 DO 8972 p = 1, n
2154 cwork(2*n+n*nr+nr+iwork(p)) = v(p,q)
2155 8972 CONTINUE
2156 DO 8973 p = 1, n
2157 v(p,q) = cwork(2*n+n*nr+nr+p)
2158 8973 CONTINUE
2159 xsc = one / dznrm2( n, v(1,q), 1 )
2160 IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
2161 $ CALL zdscal( n, xsc, v(1,q), 1 )
2162 7972 CONTINUE
2163*
2164* At this moment, V contains the right singular vectors of A.
2165* Next, assemble the left singular vector matrix U (M x N).
2166*
2167 IF ( nr .LT. m ) THEN
2168 CALL zlaset( 'A', m-nr, nr, czero, czero, u(nr+1,1), ldu )
2169 IF ( nr .LT. n1 ) THEN
2170 CALL zlaset('A',nr, n1-nr, czero, czero, u(1,nr+1),ldu)
2171 CALL zlaset('A',m-nr,n1-nr, czero, cone,u(nr+1,nr+1),ldu)
2172 END IF
2173 END IF
2174*
2175 CALL zunmqr( 'L', 'N', m, n1, n, a, lda, cwork, u,
2176 $ ldu, cwork(n+1), lwork-n, ierr )
2177*
2178 IF ( rowpiv )
2179 $ CALL zlaswp( n1, u, ldu, 1, m-1, iwork(iwoff+1), -1 )
2180*
2181*
2182 END IF
2183 IF ( transp ) THEN
2184* .. swap U and V because the procedure worked on A^*
2185 DO 6974 p = 1, n
2186 CALL zswap( n, u(1,p), 1, v(1,p), 1 )
2187 6974 CONTINUE
2188 END IF
2189*
2190 END IF
2191* end of the full SVD
2192*
2193* Undo scaling, if necessary (and possible)
2194*
2195 IF ( uscal2 .LE. (big/sva(1))*uscal1 ) THEN
2196 CALL dlascl( 'G', 0, 0, uscal1, uscal2, nr, 1, sva, n, ierr )
2197 uscal1 = one
2198 uscal2 = one
2199 END IF
2200*
2201 IF ( nr .LT. n ) THEN
2202 DO 3004 p = nr+1, n
2203 sva(p) = zero
2204 3004 CONTINUE
2205 END IF
2206*
2207 rwork(1) = uscal2 * scalem
2208 rwork(2) = uscal1
2209 IF ( errest ) rwork(3) = sconda
2210 IF ( lsvec .AND. rsvec ) THEN
2211 rwork(4) = condr1
2212 rwork(5) = condr2
2213 END IF
2214 IF ( l2tran ) THEN
2215 rwork(6) = entra
2216 rwork(7) = entrat
2217 END IF
2218*
2219 iwork(1) = nr
2220 iwork(2) = numrank
2221 iwork(3) = warning
2222 IF ( transp ) THEN
2223 iwork(4) = 1
2224 ELSE
2225 iwork(4) = -1
2226 END IF
2227
2228*
2229 RETURN
2230* ..
2231* .. END OF ZGEJSV
2232* ..
double precision function dlamch(CMACH)
DLAMCH
Definition: dlamch.f:69
subroutine dlassq(n, x, incx, scl, sumsq)
DLASSQ updates a sum of squares represented in scaled form.
Definition: dlassq.f90:137
subroutine zlassq(n, x, incx, scl, sumsq)
ZLASSQ updates a sum of squares represented in scaled form.
Definition: zlassq.f90:137
subroutine dlascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
DLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: dlascl.f:143
integer function izamax(N, ZX, INCX)
IZAMAX
Definition: izamax.f:71
integer function idamax(N, DX, INCX)
IDAMAX
Definition: idamax.f:71
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:60
logical function lsame(CA, CB)
LSAME
Definition: lsame.f:53
subroutine zswap(N, ZX, INCX, ZY, INCY)
ZSWAP
Definition: zswap.f:81
subroutine zdscal(N, DA, ZX, INCX)
ZDSCAL
Definition: zdscal.f:78
subroutine zcopy(N, ZX, INCX, ZY, INCY)
ZCOPY
Definition: zcopy.f:81
subroutine ztrsm(SIDE, UPLO, TRANSA, DIAG, M, N, ALPHA, A, LDA, B, LDB)
ZTRSM
Definition: ztrsm.f:180
subroutine zgelqf(M, N, A, LDA, TAU, WORK, LWORK, INFO)
ZGELQF
Definition: zgelqf.f:143
subroutine zgeqp3(M, N, A, LDA, JPVT, TAU, WORK, LWORK, RWORK, INFO)
ZGEQP3
Definition: zgeqp3.f:159
subroutine zgesvj(JOBA, JOBU, JOBV, M, N, A, LDA, SVA, MV, V, LDV, CWORK, LWORK, RWORK, LRWORK, INFO)
ZGESVJ
Definition: zgesvj.f:351
subroutine zlascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
ZLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: zlascl.f:143
subroutine zlapmr(FORWRD, M, N, X, LDX, K)
ZLAPMR rearranges rows of a matrix as specified by a permutation vector.
Definition: zlapmr.f:104
subroutine zlaswp(N, A, LDA, K1, K2, IPIV, INCX)
ZLASWP performs a series of row interchanges on a general rectangular matrix.
Definition: zlaswp.f:115
subroutine zlacpy(UPLO, M, N, A, LDA, B, LDB)
ZLACPY copies all or part of one two-dimensional array to another.
Definition: zlacpy.f:103
subroutine zlaset(UPLO, M, N, ALPHA, BETA, A, LDA)
ZLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values.
Definition: zlaset.f:106
subroutine zlacgv(N, X, INCX)
ZLACGV conjugates a complex vector.
Definition: zlacgv.f:74
subroutine zunmlq(SIDE, TRANS, M, N, K, A, LDA, TAU, C, LDC, WORK, LWORK, INFO)
ZUNMLQ
Definition: zunmlq.f:167
subroutine zungqr(M, N, K, A, LDA, TAU, WORK, LWORK, INFO)
ZUNGQR
Definition: zungqr.f:128
subroutine zunmqr(SIDE, TRANS, M, N, K, A, LDA, TAU, C, LDC, WORK, LWORK, INFO)
ZUNMQR
Definition: zunmqr.f:167
subroutine zpocon(UPLO, N, A, LDA, ANORM, RCOND, WORK, RWORK, INFO)
ZPOCON
Definition: zpocon.f:121
subroutine dscal(N, DA, DX, INCX)
DSCAL
Definition: dscal.f:79
real(wp) function dznrm2(n, x, incx)
DZNRM2
Definition: dznrm2.f90:90
subroutine zgeqrf(M, N, A, LDA, TAU, WORK, LWORK, INFO)
ZGEQRF VARIANT: left-looking Level 3 BLAS of the algorithm.
Definition: zgeqrf.f:152
Here is the call graph for this function:
Here is the caller graph for this function: