LAPACK 3.12.0
LAPACK: Linear Algebra PACKage
Loading...
Searching...
No Matches

◆ cgesvj()

subroutine cgesvj ( character*1  joba,
character*1  jobu,
character*1  jobv,
integer  m,
integer  n,
complex, dimension( lda, * )  a,
integer  lda,
real, dimension( n )  sva,
integer  mv,
complex, dimension( ldv, * )  v,
integer  ldv,
complex, dimension( lwork )  cwork,
integer  lwork,
real, dimension( lrwork )  rwork,
integer  lrwork,
integer  info 
)

CGESVJ

Download CGESVJ + dependencies [TGZ] [ZIP] [TXT]

Purpose:
 CGESVJ computes the singular value decomposition (SVD) of a complex
 M-by-N matrix A, where M >= N. The SVD of A is written as
                                    [++]   [xx]   [x0]   [xx]
              A = U * SIGMA * V^*,  [++] = [xx] * [ox] * [xx]
                                    [++]   [xx]
 where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
 matrix, and V is an N-by-N unitary matrix. The diagonal elements
 of SIGMA are the singular values of A. The columns of U and V are the
 left and the right singular vectors of A, respectively.
Parameters
[in]JOBA
          JOBA is CHARACTER*1
          Specifies the structure of A.
          = 'L': The input matrix A is lower triangular;
          = 'U': The input matrix A is upper triangular;
          = 'G': The input matrix A is general M-by-N matrix, M >= N.
[in]JOBU
          JOBU is CHARACTER*1
          Specifies whether to compute the left singular vectors
          (columns of U):
          = 'U' or 'F': The left singular vectors corresponding to the nonzero
                 singular values are computed and returned in the leading
                 columns of A. See more details in the description of A.
                 The default numerical orthogonality threshold is set to
                 approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E').
          = 'C': Analogous to JOBU='U', except that user can control the
                 level of numerical orthogonality of the computed left
                 singular vectors. TOL can be set to TOL = CTOL*EPS, where
                 CTOL is given on input in the array WORK.
                 No CTOL smaller than ONE is allowed. CTOL greater
                 than 1 / EPS is meaningless. The option 'C'
                 can be used if M*EPS is satisfactory orthogonality
                 of the computed left singular vectors, so CTOL=M could
                 save few sweeps of Jacobi rotations.
                 See the descriptions of A and WORK(1).
          = 'N': The matrix U is not computed. However, see the
                 description of A.
[in]JOBV
          JOBV is CHARACTER*1
          Specifies whether to compute the right singular vectors, that
          is, the matrix V:
          = 'V' or 'J': the matrix V is computed and returned in the array V
          = 'A':  the Jacobi rotations are applied to the MV-by-N
                  array V. In other words, the right singular vector
                  matrix V is not computed explicitly; instead it is
                  applied to an MV-by-N matrix initially stored in the
                  first MV rows of V.
          = 'N':  the matrix V is not computed and the array V is not
                  referenced
[in]M
          M is INTEGER
          The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0.
[in]N
          N is INTEGER
          The number of columns of the input matrix A.
          M >= N >= 0.
[in,out]A
          A is COMPLEX array, dimension (LDA,N)
          On entry, the M-by-N matrix A.
          On exit,
          If JOBU = 'U' .OR. JOBU = 'C':
                 If INFO = 0 :
                 RANKA orthonormal columns of U are returned in the
                 leading RANKA columns of the array A. Here RANKA <= N
                 is the number of computed singular values of A that are
                 above the underflow threshold SLAMCH('S'). The singular
                 vectors corresponding to underflowed or zero singular
                 values are not computed. The value of RANKA is returned
                 in the array RWORK as RANKA=NINT(RWORK(2)). Also see the
                 descriptions of SVA and RWORK. The computed columns of U
                 are mutually numerically orthogonal up to approximately
                 TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU = 'C'),
                 see the description of JOBU.
                 If INFO > 0,
                 the procedure CGESVJ did not converge in the given number
                 of iterations (sweeps). In that case, the computed
                 columns of U may not be orthogonal up to TOL. The output
                 U (stored in A), SIGMA (given by the computed singular
                 values in SVA(1:N)) and V is still a decomposition of the
                 input matrix A in the sense that the residual
                 || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small.
          If JOBU = 'N':
                 If INFO = 0 :
                 Note that the left singular vectors are 'for free' in the
                 one-sided Jacobi SVD algorithm. However, if only the
                 singular values are needed, the level of numerical
                 orthogonality of U is not an issue and iterations are
                 stopped when the columns of the iterated matrix are
                 numerically orthogonal up to approximately M*EPS. Thus,
                 on exit, A contains the columns of U scaled with the
                 corresponding singular values.
                 If INFO > 0 :
                 the procedure CGESVJ did not converge in the given number
                 of iterations (sweeps).
[in]LDA
          LDA is INTEGER
          The leading dimension of the array A.  LDA >= max(1,M).
[out]SVA
          SVA is REAL array, dimension (N)
          On exit,
          If INFO = 0 :
          depending on the value SCALE = RWORK(1), we have:
                 If SCALE = ONE:
                 SVA(1:N) contains the computed singular values of A.
                 During the computation SVA contains the Euclidean column
                 norms of the iterated matrices in the array A.
                 If SCALE .NE. ONE:
                 The singular values of A are SCALE*SVA(1:N), and this
                 factored representation is due to the fact that some of the
                 singular values of A might underflow or overflow.

          If INFO > 0 :
          the procedure CGESVJ did not converge in the given number of
          iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
[in]MV
          MV is INTEGER
          If JOBV = 'A', then the product of Jacobi rotations in CGESVJ
          is applied to the first MV rows of V. See the description of JOBV.
[in,out]V
          V is COMPLEX array, dimension (LDV,N)
          If JOBV = 'V', then V contains on exit the N-by-N matrix of
                         the right singular vectors;
          If JOBV = 'A', then V contains the product of the computed right
                         singular vector matrix and the initial matrix in
                         the array V.
          If JOBV = 'N', then V is not referenced.
[in]LDV
          LDV is INTEGER
          The leading dimension of the array V, LDV >= 1.
          If JOBV = 'V', then LDV >= max(1,N).
          If JOBV = 'A', then LDV >= max(1,MV) .
[in,out]CWORK
          CWORK is COMPLEX array, dimension (max(1,LWORK))
          Used as workspace.
          If on entry LWORK = -1, then a workspace query is assumed and
          no computation is done; CWORK(1) is set to the minial (and optimal)
          length of CWORK.
[in]LWORK
          LWORK is INTEGER.
          Length of CWORK, LWORK >= M+N.
[in,out]RWORK
          RWORK is REAL array, dimension (max(6,LRWORK))
          On entry,
          If JOBU = 'C' :
          RWORK(1) = CTOL, where CTOL defines the threshold for convergence.
                    The process stops if all columns of A are mutually
                    orthogonal up to CTOL*EPS, EPS=SLAMCH('E').
                    It is required that CTOL >= ONE, i.e. it is not
                    allowed to force the routine to obtain orthogonality
                    below EPSILON.
          On exit,
          RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N)
                    are the computed singular values of A.
                    (See description of SVA().)
          RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero
                    singular values.
          RWORK(3) = NINT(RWORK(3)) is the number of the computed singular
                    values that are larger than the underflow threshold.
          RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi
                    rotations needed for numerical convergence.
          RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep.
                    This is useful information in cases when CGESVJ did
                    not converge, as it can be used to estimate whether
                    the output is still useful and for post festum analysis.
          RWORK(6) = the largest absolute value over all sines of the
                    Jacobi rotation angles in the last sweep. It can be
                    useful for a post festum analysis.
         If on entry LRWORK = -1, then a workspace query is assumed and
         no computation is done; RWORK(1) is set to the minial (and optimal)
         length of RWORK.
[in]LRWORK
         LRWORK is INTEGER
         Length of RWORK, LRWORK >= MAX(6,N).
[out]INFO
          INFO is INTEGER
          = 0:  successful exit.
          < 0:  if INFO = -i, then the i-th argument had an illegal value
          > 0:  CGESVJ did not converge in the maximal allowed number
                (NSWEEP=30) of sweeps. The output may still be useful.
                See the description of RWORK.
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Further Details:
 The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
 rotations. In the case of underflow of the tangent of the Jacobi angle, a
 modified Jacobi transformation of Drmac [3] is used. Pivot strategy uses
 column interchanges of de Rijk [1]. The relative accuracy of the computed
 singular values and the accuracy of the computed singular vectors (in
 angle metric) is as guaranteed by the theory of Demmel and Veselic [2].
 The condition number that determines the accuracy in the full rank case
 is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
 spectral condition number. The best performance of this Jacobi SVD
 procedure is achieved if used in an  accelerated version of Drmac and
 Veselic [4,5], and it is the kernel routine in the SIGMA library [6].
 Some tuning parameters (marked with [TP]) are available for the
 implementer.
 The computational range for the nonzero singular values is the  machine
 number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
 denormalized singular values can be computed with the corresponding
 gradual loss of accurate digits.
Contributor:
  ============

  Zlatko Drmac (Zagreb, Croatia)
References:
 [1] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the
    singular value decomposition on a vector computer.
    SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371.
 [2] J. Demmel and K. Veselic: Jacobi method is more accurate than QR.
 [3] Z. Drmac: Implementation of Jacobi rotations for accurate singular
    value computation in floating point arithmetic.
    SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222.
 [4] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
    SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342.
    LAPACK Working note 169.
 [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
    SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362.
    LAPACK Working note 170.
 [6] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
    QSVD, (H,K)-SVD computations.
    Department of Mathematics, University of Zagreb, 2008, 2015.
Bugs, examples and comments:
  ===========================
  Please report all bugs and send interesting test examples and comments to
  drmac@math.hr. Thank you.

Definition at line 349 of file cgesvj.f.

351*
352* -- LAPACK computational routine --
353* -- LAPACK is a software package provided by Univ. of Tennessee, --
354* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
355*
356 IMPLICIT NONE
357* .. Scalar Arguments ..
358 INTEGER INFO, LDA, LDV, LWORK, LRWORK, M, MV, N
359 CHARACTER*1 JOBA, JOBU, JOBV
360* ..
361* .. Array Arguments ..
362 COMPLEX A( LDA, * ), V( LDV, * ), CWORK( LWORK )
363 REAL RWORK( LRWORK ), SVA( N )
364* ..
365*
366* =====================================================================
367*
368* .. Local Parameters ..
369 REAL ZERO, HALF, ONE
370 parameter( zero = 0.0e0, half = 0.5e0, one = 1.0e0)
371 COMPLEX CZERO, CONE
372 parameter( czero = (0.0e0, 0.0e0), cone = (1.0e0, 0.0e0) )
373 INTEGER NSWEEP
374 parameter( nsweep = 30 )
375* ..
376* .. Local Scalars ..
377 COMPLEX AAPQ, OMPQ
378 REAL AAPP, AAPP0, AAPQ1, AAQQ, APOAQ, AQOAP, BIG,
379 $ BIGTHETA, CS, CTOL, EPSLN, MXAAPQ,
380 $ MXSINJ, ROOTBIG, ROOTEPS, ROOTSFMIN, ROOTTOL,
381 $ SKL, SFMIN, SMALL, SN, T, TEMP1, THETA, THSIGN, TOL
382 INTEGER BLSKIP, EMPTSW, i, ibr, IERR, igl, IJBLSK, ir1,
383 $ ISWROT, jbc, jgl, KBL, LKAHEAD, MVL, N2, N34,
384 $ N4, NBL, NOTROT, p, PSKIPPED, q, ROWSKIP, SWBAND
385 LOGICAL APPLV, GOSCALE, LOWER, LQUERY, LSVEC, NOSCALE, ROTOK,
386 $ RSVEC, UCTOL, UPPER
387* ..
388* ..
389* .. Intrinsic Functions ..
390 INTRINSIC abs, max, min, conjg, real, sign, sqrt
391* ..
392* .. External Functions ..
393* ..
394* from BLAS
395 REAL SCNRM2
396 COMPLEX CDOTC
397 EXTERNAL cdotc, scnrm2
398 INTEGER ISAMAX
399 EXTERNAL isamax
400* from LAPACK
401 REAL SLAMCH
402 EXTERNAL slamch
403 LOGICAL LSAME
404 EXTERNAL lsame
405* ..
406* .. External Subroutines ..
407* ..
408* from BLAS
409 EXTERNAL ccopy, crot, csscal, cswap, caxpy
410* from LAPACK
411 EXTERNAL clascl, claset, classq, slascl, xerbla
412 EXTERNAL cgsvj0, cgsvj1
413* ..
414* .. Executable Statements ..
415*
416* Test the input arguments
417*
418 lsvec = lsame( jobu, 'U' ) .OR. lsame( jobu, 'F' )
419 uctol = lsame( jobu, 'C' )
420 rsvec = lsame( jobv, 'V' ) .OR. lsame( jobv, 'J' )
421 applv = lsame( jobv, 'A' )
422 upper = lsame( joba, 'U' )
423 lower = lsame( joba, 'L' )
424*
425 lquery = ( lwork .EQ. -1 ) .OR. ( lrwork .EQ. -1 )
426 IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
427 info = -1
428 ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
429 info = -2
430 ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
431 info = -3
432 ELSE IF( m.LT.0 ) THEN
433 info = -4
434 ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
435 info = -5
436 ELSE IF( lda.LT.m ) THEN
437 info = -7
438 ELSE IF( mv.LT.0 ) THEN
439 info = -9
440 ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
441 $ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
442 info = -11
443 ELSE IF( uctol .AND. ( rwork( 1 ).LE.one ) ) THEN
444 info = -12
445 ELSE IF( lwork.LT.( m+n ) .AND. ( .NOT.lquery ) ) THEN
446 info = -13
447 ELSE IF( lrwork.LT.max( n, 6 ) .AND. ( .NOT.lquery ) ) THEN
448 info = -15
449 ELSE
450 info = 0
451 END IF
452*
453* #:(
454 IF( info.NE.0 ) THEN
455 CALL xerbla( 'CGESVJ', -info )
456 RETURN
457 ELSE IF ( lquery ) THEN
458 cwork(1) = m + n
459 rwork(1) = max( n, 6 )
460 RETURN
461 END IF
462*
463* #:) Quick return for void matrix
464*
465 IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
466*
467* Set numerical parameters
468* The stopping criterion for Jacobi rotations is
469*
470* max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS
471*
472* where EPS is the round-off and CTOL is defined as follows:
473*
474 IF( uctol ) THEN
475* ... user controlled
476 ctol = rwork( 1 )
477 ELSE
478* ... default
479 IF( lsvec .OR. rsvec .OR. applv ) THEN
480 ctol = sqrt( real( m ) )
481 ELSE
482 ctol = real( m )
483 END IF
484 END IF
485* ... and the machine dependent parameters are
486*[!] (Make sure that SLAMCH() works properly on the target machine.)
487*
488 epsln = slamch( 'Epsilon' )
489 rooteps = sqrt( epsln )
490 sfmin = slamch( 'SafeMinimum' )
491 rootsfmin = sqrt( sfmin )
492 small = sfmin / epsln
493* BIG = SLAMCH( 'Overflow' )
494 big = one / sfmin
495 rootbig = one / rootsfmin
496* LARGE = BIG / SQRT( REAL( M*N ) )
497 bigtheta = one / rooteps
498*
499 tol = ctol*epsln
500 roottol = sqrt( tol )
501*
502 IF( real( m )*epsln.GE.one ) THEN
503 info = -4
504 CALL xerbla( 'CGESVJ', -info )
505 RETURN
506 END IF
507*
508* Initialize the right singular vector matrix.
509*
510 IF( rsvec ) THEN
511 mvl = n
512 CALL claset( 'A', mvl, n, czero, cone, v, ldv )
513 ELSE IF( applv ) THEN
514 mvl = mv
515 END IF
516 rsvec = rsvec .OR. applv
517*
518* Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
519*(!) If necessary, scale A to protect the largest singular value
520* from overflow. It is possible that saving the largest singular
521* value destroys the information about the small ones.
522* This initial scaling is almost minimal in the sense that the
523* goal is to make sure that no column norm overflows, and that
524* SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
525* in A are detected, the procedure returns with INFO=-6.
526*
527 skl = one / sqrt( real( m )*real( n ) )
528 noscale = .true.
529 goscale = .true.
530*
531 IF( lower ) THEN
532* the input matrix is M-by-N lower triangular (trapezoidal)
533 DO 1874 p = 1, n
534 aapp = zero
535 aaqq = one
536 CALL classq( m-p+1, a( p, p ), 1, aapp, aaqq )
537 IF( aapp.GT.big ) THEN
538 info = -6
539 CALL xerbla( 'CGESVJ', -info )
540 RETURN
541 END IF
542 aaqq = sqrt( aaqq )
543 IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
544 sva( p ) = aapp*aaqq
545 ELSE
546 noscale = .false.
547 sva( p ) = aapp*( aaqq*skl )
548 IF( goscale ) THEN
549 goscale = .false.
550 DO 1873 q = 1, p - 1
551 sva( q ) = sva( q )*skl
552 1873 CONTINUE
553 END IF
554 END IF
555 1874 CONTINUE
556 ELSE IF( upper ) THEN
557* the input matrix is M-by-N upper triangular (trapezoidal)
558 DO 2874 p = 1, n
559 aapp = zero
560 aaqq = one
561 CALL classq( p, a( 1, p ), 1, aapp, aaqq )
562 IF( aapp.GT.big ) THEN
563 info = -6
564 CALL xerbla( 'CGESVJ', -info )
565 RETURN
566 END IF
567 aaqq = sqrt( aaqq )
568 IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
569 sva( p ) = aapp*aaqq
570 ELSE
571 noscale = .false.
572 sva( p ) = aapp*( aaqq*skl )
573 IF( goscale ) THEN
574 goscale = .false.
575 DO 2873 q = 1, p - 1
576 sva( q ) = sva( q )*skl
577 2873 CONTINUE
578 END IF
579 END IF
580 2874 CONTINUE
581 ELSE
582* the input matrix is M-by-N general dense
583 DO 3874 p = 1, n
584 aapp = zero
585 aaqq = one
586 CALL classq( m, a( 1, p ), 1, aapp, aaqq )
587 IF( aapp.GT.big ) THEN
588 info = -6
589 CALL xerbla( 'CGESVJ', -info )
590 RETURN
591 END IF
592 aaqq = sqrt( aaqq )
593 IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
594 sva( p ) = aapp*aaqq
595 ELSE
596 noscale = .false.
597 sva( p ) = aapp*( aaqq*skl )
598 IF( goscale ) THEN
599 goscale = .false.
600 DO 3873 q = 1, p - 1
601 sva( q ) = sva( q )*skl
602 3873 CONTINUE
603 END IF
604 END IF
605 3874 CONTINUE
606 END IF
607*
608 IF( noscale )skl = one
609*
610* Move the smaller part of the spectrum from the underflow threshold
611*(!) Start by determining the position of the nonzero entries of the
612* array SVA() relative to ( SFMIN, BIG ).
613*
614 aapp = zero
615 aaqq = big
616 DO 4781 p = 1, n
617 IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
618 aapp = max( aapp, sva( p ) )
619 4781 CONTINUE
620*
621* #:) Quick return for zero matrix
622*
623 IF( aapp.EQ.zero ) THEN
624 IF( lsvec )CALL claset( 'G', m, n, czero, cone, a, lda )
625 rwork( 1 ) = one
626 rwork( 2 ) = zero
627 rwork( 3 ) = zero
628 rwork( 4 ) = zero
629 rwork( 5 ) = zero
630 rwork( 6 ) = zero
631 RETURN
632 END IF
633*
634* #:) Quick return for one-column matrix
635*
636 IF( n.EQ.1 ) THEN
637 IF( lsvec )CALL clascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
638 $ a( 1, 1 ), lda, ierr )
639 rwork( 1 ) = one / skl
640 IF( sva( 1 ).GE.sfmin ) THEN
641 rwork( 2 ) = one
642 ELSE
643 rwork( 2 ) = zero
644 END IF
645 rwork( 3 ) = zero
646 rwork( 4 ) = zero
647 rwork( 5 ) = zero
648 rwork( 6 ) = zero
649 RETURN
650 END IF
651*
652* Protect small singular values from underflow, and try to
653* avoid underflows/overflows in computing Jacobi rotations.
654*
655 sn = sqrt( sfmin / epsln )
656 temp1 = sqrt( big / real( n ) )
657 IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
658 $ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
659 temp1 = min( big, temp1 / aapp )
660* AAQQ = AAQQ*TEMP1
661* AAPP = AAPP*TEMP1
662 ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
663 temp1 = min( sn / aaqq, big / ( aapp*sqrt( real( n ) ) ) )
664* AAQQ = AAQQ*TEMP1
665* AAPP = AAPP*TEMP1
666 ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
667 temp1 = max( sn / aaqq, temp1 / aapp )
668* AAQQ = AAQQ*TEMP1
669* AAPP = AAPP*TEMP1
670 ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
671 temp1 = min( sn / aaqq, big / ( sqrt( real( n ) )*aapp ) )
672* AAQQ = AAQQ*TEMP1
673* AAPP = AAPP*TEMP1
674 ELSE
675 temp1 = one
676 END IF
677*
678* Scale, if necessary
679*
680 IF( temp1.NE.one ) THEN
681 CALL slascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
682 END IF
683 skl = temp1*skl
684 IF( skl.NE.one ) THEN
685 CALL clascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
686 skl = one / skl
687 END IF
688*
689* Row-cyclic Jacobi SVD algorithm with column pivoting
690*
691 emptsw = ( n*( n-1 ) ) / 2
692 notrot = 0
693
694 DO 1868 q = 1, n
695 cwork( q ) = cone
696 1868 CONTINUE
697*
698*
699*
700 swband = 3
701*[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
702* if CGESVJ is used as a computational routine in the preconditioned
703* Jacobi SVD algorithm CGEJSV. For sweeps i=1:SWBAND the procedure
704* works on pivots inside a band-like region around the diagonal.
705* The boundaries are determined dynamically, based on the number of
706* pivots above a threshold.
707*
708 kbl = min( 8, n )
709*[TP] KBL is a tuning parameter that defines the tile size in the
710* tiling of the p-q loops of pivot pairs. In general, an optimal
711* value of KBL depends on the matrix dimensions and on the
712* parameters of the computer's memory.
713*
714 nbl = n / kbl
715 IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
716*
717 blskip = kbl**2
718*[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
719*
720 rowskip = min( 5, kbl )
721*[TP] ROWSKIP is a tuning parameter.
722*
723 lkahead = 1
724*[TP] LKAHEAD is a tuning parameter.
725*
726* Quasi block transformations, using the lower (upper) triangular
727* structure of the input matrix. The quasi-block-cycling usually
728* invokes cubic convergence. Big part of this cycle is done inside
729* canonical subspaces of dimensions less than M.
730*
731 IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
732*[TP] The number of partition levels and the actual partition are
733* tuning parameters.
734 n4 = n / 4
735 n2 = n / 2
736 n34 = 3*n4
737 IF( applv ) THEN
738 q = 0
739 ELSE
740 q = 1
741 END IF
742*
743 IF( lower ) THEN
744*
745* This works very well on lower triangular matrices, in particular
746* in the framework of the preconditioned Jacobi SVD (xGEJSV).
747* The idea is simple:
748* [+ 0 0 0] Note that Jacobi transformations of [0 0]
749* [+ + 0 0] [0 0]
750* [+ + x 0] actually work on [x 0] [x 0]
751* [+ + x x] [x x]. [x x]
752*
753 CALL cgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
754 $ cwork( n34+1 ), sva( n34+1 ), mvl,
755 $ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
756 $ 2, cwork( n+1 ), lwork-n, ierr )
757
758 CALL cgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
759 $ cwork( n2+1 ), sva( n2+1 ), mvl,
760 $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
761 $ cwork( n+1 ), lwork-n, ierr )
762
763 CALL cgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
764 $ cwork( n2+1 ), sva( n2+1 ), mvl,
765 $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
766 $ cwork( n+1 ), lwork-n, ierr )
767*
768 CALL cgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
769 $ cwork( n4+1 ), sva( n4+1 ), mvl,
770 $ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
771 $ cwork( n+1 ), lwork-n, ierr )
772*
773 CALL cgsvj0( jobv, m, n4, a, lda, cwork, sva, mvl, v, ldv,
774 $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
775 $ ierr )
776*
777 CALL cgsvj1( jobv, m, n2, n4, a, lda, cwork, sva, mvl, v,
778 $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
779 $ lwork-n, ierr )
780*
781*
782 ELSE IF( upper ) THEN
783*
784*
785 CALL cgsvj0( jobv, n4, n4, a, lda, cwork, sva, mvl, v, ldv,
786 $ epsln, sfmin, tol, 2, cwork( n+1 ), lwork-n,
787 $ ierr )
788*
789 CALL cgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, cwork( n4+1 ),
790 $ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
791 $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
792 $ ierr )
793*
794 CALL cgsvj1( jobv, n2, n2, n4, a, lda, cwork, sva, mvl, v,
795 $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
796 $ lwork-n, ierr )
797*
798 CALL cgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
799 $ cwork( n2+1 ), sva( n2+1 ), mvl,
800 $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
801 $ cwork( n+1 ), lwork-n, ierr )
802
803 END IF
804*
805 END IF
806*
807* .. Row-cyclic pivot strategy with de Rijk's pivoting ..
808*
809 DO 1993 i = 1, nsweep
810*
811* .. go go go ...
812*
813 mxaapq = zero
814 mxsinj = zero
815 iswrot = 0
816*
817 notrot = 0
818 pskipped = 0
819*
820* Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
821* 1 <= p < q <= N. This is the first step toward a blocked implementation
822* of the rotations. New implementation, based on block transformations,
823* is under development.
824*
825 DO 2000 ibr = 1, nbl
826*
827 igl = ( ibr-1 )*kbl + 1
828*
829 DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
830*
831 igl = igl + ir1*kbl
832*
833 DO 2001 p = igl, min( igl+kbl-1, n-1 )
834*
835* .. de Rijk's pivoting
836*
837 q = isamax( n-p+1, sva( p ), 1 ) + p - 1
838 IF( p.NE.q ) THEN
839 CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
840 IF( rsvec )CALL cswap( mvl, v( 1, p ), 1,
841 $ v( 1, q ), 1 )
842 temp1 = sva( p )
843 sva( p ) = sva( q )
844 sva( q ) = temp1
845 aapq = cwork(p)
846 cwork(p) = cwork(q)
847 cwork(q) = aapq
848 END IF
849*
850 IF( ir1.EQ.0 ) THEN
851*
852* Column norms are periodically updated by explicit
853* norm computation.
854*[!] Caveat:
855* Unfortunately, some BLAS implementations compute SCNRM2(M,A(1,p),1)
856* as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
857* overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
858* underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
859* Hence, SCNRM2 cannot be trusted, not even in the case when
860* the true norm is far from the under(over)flow boundaries.
861* If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF
862* below should be replaced with "AAPP = SCNRM2( M, A(1,p), 1 )".
863*
864 IF( ( sva( p ).LT.rootbig ) .AND.
865 $ ( sva( p ).GT.rootsfmin ) ) THEN
866 sva( p ) = scnrm2( m, a( 1, p ), 1 )
867 ELSE
868 temp1 = zero
869 aapp = one
870 CALL classq( m, a( 1, p ), 1, temp1, aapp )
871 sva( p ) = temp1*sqrt( aapp )
872 END IF
873 aapp = sva( p )
874 ELSE
875 aapp = sva( p )
876 END IF
877*
878 IF( aapp.GT.zero ) THEN
879*
880 pskipped = 0
881*
882 DO 2002 q = p + 1, min( igl+kbl-1, n )
883*
884 aaqq = sva( q )
885*
886 IF( aaqq.GT.zero ) THEN
887*
888 aapp0 = aapp
889 IF( aaqq.GE.one ) THEN
890 rotok = ( small*aapp ).LE.aaqq
891 IF( aapp.LT.( big / aaqq ) ) THEN
892 aapq = ( cdotc( m, a( 1, p ), 1,
893 $ a( 1, q ), 1 ) / aaqq ) / aapp
894 ELSE
895 CALL ccopy( m, a( 1, p ), 1,
896 $ cwork(n+1), 1 )
897 CALL clascl( 'G', 0, 0, aapp, one,
898 $ m, 1, cwork(n+1), lda, ierr )
899 aapq = cdotc( m, cwork(n+1), 1,
900 $ a( 1, q ), 1 ) / aaqq
901 END IF
902 ELSE
903 rotok = aapp.LE.( aaqq / small )
904 IF( aapp.GT.( small / aaqq ) ) THEN
905 aapq = ( cdotc( m, a( 1, p ), 1,
906 $ a( 1, q ), 1 ) / aapp ) / aaqq
907 ELSE
908 CALL ccopy( m, a( 1, q ), 1,
909 $ cwork(n+1), 1 )
910 CALL clascl( 'G', 0, 0, aaqq,
911 $ one, m, 1,
912 $ cwork(n+1), lda, ierr )
913 aapq = cdotc( m, a(1, p ), 1,
914 $ cwork(n+1), 1 ) / aapp
915 END IF
916 END IF
917*
918* AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q)
919 aapq1 = -abs(aapq)
920 mxaapq = max( mxaapq, -aapq1 )
921*
922* TO rotate or NOT to rotate, THAT is the question ...
923*
924 IF( abs( aapq1 ).GT.tol ) THEN
925 ompq = aapq / abs(aapq)
926*
927* .. rotate
928*[RTD] ROTATED = ROTATED + ONE
929*
930 IF( ir1.EQ.0 ) THEN
931 notrot = 0
932 pskipped = 0
933 iswrot = iswrot + 1
934 END IF
935*
936 IF( rotok ) THEN
937*
938 aqoap = aaqq / aapp
939 apoaq = aapp / aaqq
940 theta = -half*abs( aqoap-apoaq )/aapq1
941*
942 IF( abs( theta ).GT.bigtheta ) THEN
943*
944 t = half / theta
945 cs = one
946
947 CALL crot( m, a(1,p), 1, a(1,q), 1,
948 $ cs, conjg(ompq)*t )
949 IF ( rsvec ) THEN
950 CALL crot( mvl, v(1,p), 1,
951 $ v(1,q), 1, cs, conjg(ompq)*t )
952 END IF
953
954 sva( q ) = aaqq*sqrt( max( zero,
955 $ one+t*apoaq*aapq1 ) )
956 aapp = aapp*sqrt( max( zero,
957 $ one-t*aqoap*aapq1 ) )
958 mxsinj = max( mxsinj, abs( t ) )
959*
960 ELSE
961*
962* .. choose correct signum for THETA and rotate
963*
964 thsign = -sign( one, aapq1 )
965 t = one / ( theta+thsign*
966 $ sqrt( one+theta*theta ) )
967 cs = sqrt( one / ( one+t*t ) )
968 sn = t*cs
969*
970 mxsinj = max( mxsinj, abs( sn ) )
971 sva( q ) = aaqq*sqrt( max( zero,
972 $ one+t*apoaq*aapq1 ) )
973 aapp = aapp*sqrt( max( zero,
974 $ one-t*aqoap*aapq1 ) )
975*
976 CALL crot( m, a(1,p), 1, a(1,q), 1,
977 $ cs, conjg(ompq)*sn )
978 IF ( rsvec ) THEN
979 CALL crot( mvl, v(1,p), 1,
980 $ v(1,q), 1, cs, conjg(ompq)*sn )
981 END IF
982 END IF
983 cwork(p) = -cwork(q) * ompq
984*
985 ELSE
986* .. have to use modified Gram-Schmidt like transformation
987 CALL ccopy( m, a( 1, p ), 1,
988 $ cwork(n+1), 1 )
989 CALL clascl( 'G', 0, 0, aapp, one, m,
990 $ 1, cwork(n+1), lda,
991 $ ierr )
992 CALL clascl( 'G', 0, 0, aaqq, one, m,
993 $ 1, a( 1, q ), lda, ierr )
994 CALL caxpy( m, -aapq, cwork(n+1), 1,
995 $ a( 1, q ), 1 )
996 CALL clascl( 'G', 0, 0, one, aaqq, m,
997 $ 1, a( 1, q ), lda, ierr )
998 sva( q ) = aaqq*sqrt( max( zero,
999 $ one-aapq1*aapq1 ) )
1000 mxsinj = max( mxsinj, sfmin )
1001 END IF
1002* END IF ROTOK THEN ... ELSE
1003*
1004* In the case of cancellation in updating SVA(q), SVA(p)
1005* recompute SVA(q), SVA(p).
1006*
1007 IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1008 $ THEN
1009 IF( ( aaqq.LT.rootbig ) .AND.
1010 $ ( aaqq.GT.rootsfmin ) ) THEN
1011 sva( q ) = scnrm2( m, a( 1, q ), 1 )
1012 ELSE
1013 t = zero
1014 aaqq = one
1015 CALL classq( m, a( 1, q ), 1, t,
1016 $ aaqq )
1017 sva( q ) = t*sqrt( aaqq )
1018 END IF
1019 END IF
1020 IF( ( aapp / aapp0 ).LE.rooteps ) THEN
1021 IF( ( aapp.LT.rootbig ) .AND.
1022 $ ( aapp.GT.rootsfmin ) ) THEN
1023 aapp = scnrm2( m, a( 1, p ), 1 )
1024 ELSE
1025 t = zero
1026 aapp = one
1027 CALL classq( m, a( 1, p ), 1, t,
1028 $ aapp )
1029 aapp = t*sqrt( aapp )
1030 END IF
1031 sva( p ) = aapp
1032 END IF
1033*
1034 ELSE
1035* A(:,p) and A(:,q) already numerically orthogonal
1036 IF( ir1.EQ.0 )notrot = notrot + 1
1037*[RTD] SKIPPED = SKIPPED + 1
1038 pskipped = pskipped + 1
1039 END IF
1040 ELSE
1041* A(:,q) is zero column
1042 IF( ir1.EQ.0 )notrot = notrot + 1
1043 pskipped = pskipped + 1
1044 END IF
1045*
1046 IF( ( i.LE.swband ) .AND.
1047 $ ( pskipped.GT.rowskip ) ) THEN
1048 IF( ir1.EQ.0 )aapp = -aapp
1049 notrot = 0
1050 GO TO 2103
1051 END IF
1052*
1053 2002 CONTINUE
1054* END q-LOOP
1055*
1056 2103 CONTINUE
1057* bailed out of q-loop
1058*
1059 sva( p ) = aapp
1060*
1061 ELSE
1062 sva( p ) = aapp
1063 IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1064 $ notrot = notrot + min( igl+kbl-1, n ) - p
1065 END IF
1066*
1067 2001 CONTINUE
1068* end of the p-loop
1069* end of doing the block ( ibr, ibr )
1070 1002 CONTINUE
1071* end of ir1-loop
1072*
1073* ... go to the off diagonal blocks
1074*
1075 igl = ( ibr-1 )*kbl + 1
1076*
1077 DO 2010 jbc = ibr + 1, nbl
1078*
1079 jgl = ( jbc-1 )*kbl + 1
1080*
1081* doing the block at ( ibr, jbc )
1082*
1083 ijblsk = 0
1084 DO 2100 p = igl, min( igl+kbl-1, n )
1085*
1086 aapp = sva( p )
1087 IF( aapp.GT.zero ) THEN
1088*
1089 pskipped = 0
1090*
1091 DO 2200 q = jgl, min( jgl+kbl-1, n )
1092*
1093 aaqq = sva( q )
1094 IF( aaqq.GT.zero ) THEN
1095 aapp0 = aapp
1096*
1097* .. M x 2 Jacobi SVD ..
1098*
1099* Safe Gram matrix computation
1100*
1101 IF( aaqq.GE.one ) THEN
1102 IF( aapp.GE.aaqq ) THEN
1103 rotok = ( small*aapp ).LE.aaqq
1104 ELSE
1105 rotok = ( small*aaqq ).LE.aapp
1106 END IF
1107 IF( aapp.LT.( big / aaqq ) ) THEN
1108 aapq = ( cdotc( m, a( 1, p ), 1,
1109 $ a( 1, q ), 1 ) / aaqq ) / aapp
1110 ELSE
1111 CALL ccopy( m, a( 1, p ), 1,
1112 $ cwork(n+1), 1 )
1113 CALL clascl( 'G', 0, 0, aapp,
1114 $ one, m, 1,
1115 $ cwork(n+1), lda, ierr )
1116 aapq = cdotc( m, cwork(n+1), 1,
1117 $ a( 1, q ), 1 ) / aaqq
1118 END IF
1119 ELSE
1120 IF( aapp.GE.aaqq ) THEN
1121 rotok = aapp.LE.( aaqq / small )
1122 ELSE
1123 rotok = aaqq.LE.( aapp / small )
1124 END IF
1125 IF( aapp.GT.( small / aaqq ) ) THEN
1126 aapq = ( cdotc( m, a( 1, p ), 1,
1127 $ a( 1, q ), 1 ) / max(aaqq,aapp) )
1128 $ / min(aaqq,aapp)
1129 ELSE
1130 CALL ccopy( m, a( 1, q ), 1,
1131 $ cwork(n+1), 1 )
1132 CALL clascl( 'G', 0, 0, aaqq,
1133 $ one, m, 1,
1134 $ cwork(n+1), lda, ierr )
1135 aapq = cdotc( m, a( 1, p ), 1,
1136 $ cwork(n+1), 1 ) / aapp
1137 END IF
1138 END IF
1139*
1140* AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q)
1141 aapq1 = -abs(aapq)
1142 mxaapq = max( mxaapq, -aapq1 )
1143*
1144* TO rotate or NOT to rotate, THAT is the question ...
1145*
1146 IF( abs( aapq1 ).GT.tol ) THEN
1147 ompq = aapq / abs(aapq)
1148 notrot = 0
1149*[RTD] ROTATED = ROTATED + 1
1150 pskipped = 0
1151 iswrot = iswrot + 1
1152*
1153 IF( rotok ) THEN
1154*
1155 aqoap = aaqq / aapp
1156 apoaq = aapp / aaqq
1157 theta = -half*abs( aqoap-apoaq )/ aapq1
1158 IF( aaqq.GT.aapp0 )theta = -theta
1159*
1160 IF( abs( theta ).GT.bigtheta ) THEN
1161 t = half / theta
1162 cs = one
1163 CALL crot( m, a(1,p), 1, a(1,q), 1,
1164 $ cs, conjg(ompq)*t )
1165 IF( rsvec ) THEN
1166 CALL crot( mvl, v(1,p), 1,
1167 $ v(1,q), 1, cs, conjg(ompq)*t )
1168 END IF
1169 sva( q ) = aaqq*sqrt( max( zero,
1170 $ one+t*apoaq*aapq1 ) )
1171 aapp = aapp*sqrt( max( zero,
1172 $ one-t*aqoap*aapq1 ) )
1173 mxsinj = max( mxsinj, abs( t ) )
1174 ELSE
1175*
1176* .. choose correct signum for THETA and rotate
1177*
1178 thsign = -sign( one, aapq1 )
1179 IF( aaqq.GT.aapp0 )thsign = -thsign
1180 t = one / ( theta+thsign*
1181 $ sqrt( one+theta*theta ) )
1182 cs = sqrt( one / ( one+t*t ) )
1183 sn = t*cs
1184 mxsinj = max( mxsinj, abs( sn ) )
1185 sva( q ) = aaqq*sqrt( max( zero,
1186 $ one+t*apoaq*aapq1 ) )
1187 aapp = aapp*sqrt( max( zero,
1188 $ one-t*aqoap*aapq1 ) )
1189*
1190 CALL crot( m, a(1,p), 1, a(1,q), 1,
1191 $ cs, conjg(ompq)*sn )
1192 IF( rsvec ) THEN
1193 CALL crot( mvl, v(1,p), 1,
1194 $ v(1,q), 1, cs, conjg(ompq)*sn )
1195 END IF
1196 END IF
1197 cwork(p) = -cwork(q) * ompq
1198*
1199 ELSE
1200* .. have to use modified Gram-Schmidt like transformation
1201 IF( aapp.GT.aaqq ) THEN
1202 CALL ccopy( m, a( 1, p ), 1,
1203 $ cwork(n+1), 1 )
1204 CALL clascl( 'G', 0, 0, aapp, one,
1205 $ m, 1, cwork(n+1),lda,
1206 $ ierr )
1207 CALL clascl( 'G', 0, 0, aaqq, one,
1208 $ m, 1, a( 1, q ), lda,
1209 $ ierr )
1210 CALL caxpy( m, -aapq, cwork(n+1),
1211 $ 1, a( 1, q ), 1 )
1212 CALL clascl( 'G', 0, 0, one, aaqq,
1213 $ m, 1, a( 1, q ), lda,
1214 $ ierr )
1215 sva( q ) = aaqq*sqrt( max( zero,
1216 $ one-aapq1*aapq1 ) )
1217 mxsinj = max( mxsinj, sfmin )
1218 ELSE
1219 CALL ccopy( m, a( 1, q ), 1,
1220 $ cwork(n+1), 1 )
1221 CALL clascl( 'G', 0, 0, aaqq, one,
1222 $ m, 1, cwork(n+1),lda,
1223 $ ierr )
1224 CALL clascl( 'G', 0, 0, aapp, one,
1225 $ m, 1, a( 1, p ), lda,
1226 $ ierr )
1227 CALL caxpy( m, -conjg(aapq),
1228 $ cwork(n+1), 1, a( 1, p ), 1 )
1229 CALL clascl( 'G', 0, 0, one, aapp,
1230 $ m, 1, a( 1, p ), lda,
1231 $ ierr )
1232 sva( p ) = aapp*sqrt( max( zero,
1233 $ one-aapq1*aapq1 ) )
1234 mxsinj = max( mxsinj, sfmin )
1235 END IF
1236 END IF
1237* END IF ROTOK THEN ... ELSE
1238*
1239* In the case of cancellation in updating SVA(q), SVA(p)
1240* .. recompute SVA(q), SVA(p)
1241 IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1242 $ THEN
1243 IF( ( aaqq.LT.rootbig ) .AND.
1244 $ ( aaqq.GT.rootsfmin ) ) THEN
1245 sva( q ) = scnrm2( m, a( 1, q ), 1)
1246 ELSE
1247 t = zero
1248 aaqq = one
1249 CALL classq( m, a( 1, q ), 1, t,
1250 $ aaqq )
1251 sva( q ) = t*sqrt( aaqq )
1252 END IF
1253 END IF
1254 IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1255 IF( ( aapp.LT.rootbig ) .AND.
1256 $ ( aapp.GT.rootsfmin ) ) THEN
1257 aapp = scnrm2( m, a( 1, p ), 1 )
1258 ELSE
1259 t = zero
1260 aapp = one
1261 CALL classq( m, a( 1, p ), 1, t,
1262 $ aapp )
1263 aapp = t*sqrt( aapp )
1264 END IF
1265 sva( p ) = aapp
1266 END IF
1267* end of OK rotation
1268 ELSE
1269 notrot = notrot + 1
1270*[RTD] SKIPPED = SKIPPED + 1
1271 pskipped = pskipped + 1
1272 ijblsk = ijblsk + 1
1273 END IF
1274 ELSE
1275 notrot = notrot + 1
1276 pskipped = pskipped + 1
1277 ijblsk = ijblsk + 1
1278 END IF
1279*
1280 IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1281 $ THEN
1282 sva( p ) = aapp
1283 notrot = 0
1284 GO TO 2011
1285 END IF
1286 IF( ( i.LE.swband ) .AND.
1287 $ ( pskipped.GT.rowskip ) ) THEN
1288 aapp = -aapp
1289 notrot = 0
1290 GO TO 2203
1291 END IF
1292*
1293 2200 CONTINUE
1294* end of the q-loop
1295 2203 CONTINUE
1296*
1297 sva( p ) = aapp
1298*
1299 ELSE
1300*
1301 IF( aapp.EQ.zero )notrot = notrot +
1302 $ min( jgl+kbl-1, n ) - jgl + 1
1303 IF( aapp.LT.zero )notrot = 0
1304*
1305 END IF
1306*
1307 2100 CONTINUE
1308* end of the p-loop
1309 2010 CONTINUE
1310* end of the jbc-loop
1311 2011 CONTINUE
1312*2011 bailed out of the jbc-loop
1313 DO 2012 p = igl, min( igl+kbl-1, n )
1314 sva( p ) = abs( sva( p ) )
1315 2012 CONTINUE
1316***
1317 2000 CONTINUE
1318*2000 :: end of the ibr-loop
1319*
1320* .. update SVA(N)
1321 IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1322 $ THEN
1323 sva( n ) = scnrm2( m, a( 1, n ), 1 )
1324 ELSE
1325 t = zero
1326 aapp = one
1327 CALL classq( m, a( 1, n ), 1, t, aapp )
1328 sva( n ) = t*sqrt( aapp )
1329 END IF
1330*
1331* Additional steering devices
1332*
1333 IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1334 $ ( iswrot.LE.n ) ) )swband = i
1335*
1336 IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( real( n ) )*
1337 $ tol ) .AND. ( real( n )*mxaapq*mxsinj.LT.tol ) ) THEN
1338 GO TO 1994
1339 END IF
1340*
1341 IF( notrot.GE.emptsw )GO TO 1994
1342*
1343 1993 CONTINUE
1344* end i=1:NSWEEP loop
1345*
1346* #:( Reaching this point means that the procedure has not converged.
1347 info = nsweep - 1
1348 GO TO 1995
1349*
1350 1994 CONTINUE
1351* #:) Reaching this point means numerical convergence after the i-th
1352* sweep.
1353*
1354 info = 0
1355* #:) INFO = 0 confirms successful iterations.
1356 1995 CONTINUE
1357*
1358* Sort the singular values and find how many are above
1359* the underflow threshold.
1360*
1361 n2 = 0
1362 n4 = 0
1363 DO 5991 p = 1, n - 1
1364 q = isamax( n-p+1, sva( p ), 1 ) + p - 1
1365 IF( p.NE.q ) THEN
1366 temp1 = sva( p )
1367 sva( p ) = sva( q )
1368 sva( q ) = temp1
1369 CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1370 IF( rsvec )CALL cswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1371 END IF
1372 IF( sva( p ).NE.zero ) THEN
1373 n4 = n4 + 1
1374 IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1375 END IF
1376 5991 CONTINUE
1377 IF( sva( n ).NE.zero ) THEN
1378 n4 = n4 + 1
1379 IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1380 END IF
1381*
1382* Normalize the left singular vectors.
1383*
1384 IF( lsvec .OR. uctol ) THEN
1385 DO 1998 p = 1, n4
1386* CALL CSSCAL( M, ONE / SVA( p ), A( 1, p ), 1 )
1387 CALL clascl( 'G',0,0, sva(p), one, m, 1, a(1,p), m, ierr )
1388 1998 CONTINUE
1389 END IF
1390*
1391* Scale the product of Jacobi rotations.
1392*
1393 IF( rsvec ) THEN
1394 DO 2399 p = 1, n
1395 temp1 = one / scnrm2( mvl, v( 1, p ), 1 )
1396 CALL csscal( mvl, temp1, v( 1, p ), 1 )
1397 2399 CONTINUE
1398 END IF
1399*
1400* Undo scaling, if necessary (and possible).
1401 IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1402 $ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1403 $ ( sfmin / skl ) ) ) ) THEN
1404 DO 2400 p = 1, n
1405 sva( p ) = skl*sva( p )
1406 2400 CONTINUE
1407 skl = one
1408 END IF
1409*
1410 rwork( 1 ) = skl
1411* The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1412* then some of the singular values may overflow or underflow and
1413* the spectrum is given in this factored representation.
1414*
1415 rwork( 2 ) = real( n4 )
1416* N4 is the number of computed nonzero singular values of A.
1417*
1418 rwork( 3 ) = real( n2 )
1419* N2 is the number of singular values of A greater than SFMIN.
1420* If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1421* that may carry some information.
1422*
1423 rwork( 4 ) = real( i )
1424* i is the index of the last sweep before declaring convergence.
1425*
1426 rwork( 5 ) = mxaapq
1427* MXAAPQ is the largest absolute value of scaled pivots in the
1428* last sweep
1429*
1430 rwork( 6 ) = mxsinj
1431* MXSINJ is the largest absolute value of the sines of Jacobi angles
1432* in the last sweep
1433*
1434 RETURN
1435* ..
1436* .. END OF CGESVJ
1437* ..
subroutine xerbla(srname, info)
Definition cblat2.f:3285
subroutine caxpy(n, ca, cx, incx, cy, incy)
CAXPY
Definition caxpy.f:88
subroutine ccopy(n, cx, incx, cy, incy)
CCOPY
Definition ccopy.f:81
complex function cdotc(n, cx, incx, cy, incy)
CDOTC
Definition cdotc.f:83
subroutine cgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work, lwork, info)
CGSVJ0 pre-processor for the routine cgesvj.
Definition cgsvj0.f:218
subroutine cgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work, lwork, info)
CGSVJ1 pre-processor for the routine cgesvj, applies Jacobi rotations targeting only particular pivot...
Definition cgsvj1.f:236
integer function isamax(n, sx, incx)
ISAMAX
Definition isamax.f:71
real function slamch(cmach)
SLAMCH
Definition slamch.f:68
subroutine clascl(type, kl, ku, cfrom, cto, m, n, a, lda, info)
CLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition clascl.f:143
subroutine slascl(type, kl, ku, cfrom, cto, m, n, a, lda, info)
SLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition slascl.f:143
subroutine claset(uplo, m, n, alpha, beta, a, lda)
CLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values.
Definition claset.f:106
subroutine classq(n, x, incx, scale, sumsq)
CLASSQ updates a sum of squares represented in scaled form.
Definition classq.f90:124
logical function lsame(ca, cb)
LSAME
Definition lsame.f:48
real(wp) function scnrm2(n, x, incx)
SCNRM2
Definition scnrm2.f90:90
subroutine crot(n, cx, incx, cy, incy, c, s)
CROT applies a plane rotation with real cosine and complex sine to a pair of complex vectors.
Definition crot.f:103
subroutine csscal(n, sa, cx, incx)
CSSCAL
Definition csscal.f:78
subroutine cswap(n, cx, incx, cy, incy)
CSWAP
Definition cswap.f:81
Here is the call graph for this function:
Here is the caller graph for this function: