LAPACK  3.6.1 LAPACK: Linear Algebra PACKage
 subroutine cgesvj ( character*1 JOBA, character*1 JOBU, character*1 JOBV, integer M, integer N, complex, dimension( lda, * ) A, integer LDA, real, dimension( n ) SVA, integer MV, complex, dimension( ldv, * ) V, integer LDV, complex, dimension( lwork ) CWORK, integer LWORK, real, dimension( lrwork ) RWORK, integer LRWORK, integer INFO )

CGESVJ

Purpose:

Parameters
 [in] JOBA JOBA is CHARACTER* 1 Specifies the structure of A. = 'L': The input matrix A is lower triangular; = 'U': The input matrix A is upper triangular; = 'G': The input matrix A is general M-by-N matrix, M >= N. [in] JOBU JOBU is CHARACTER*1 Specifies whether to compute the left singular vectors (columns of U): = 'U': The left singular vectors corresponding to the nonzero singular values are computed and returned in the leading columns of A. See more details in the description of A. The default numerical orthogonality threshold is set to approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E'). = 'C': Analogous to JOBU='U', except that user can control the level of numerical orthogonality of the computed left singular vectors. TOL can be set to TOL = CTOL*EPS, where CTOL is given on input in the array WORK. No CTOL smaller than ONE is allowed. CTOL greater than 1 / EPS is meaningless. The option 'C' can be used if M*EPS is satisfactory orthogonality of the computed left singular vectors, so CTOL=M could save few sweeps of Jacobi rotations. See the descriptions of A and WORK(1). = 'N': The matrix U is not computed. However, see the description of A. [in] JOBV JOBV is CHARACTER*1 Specifies whether to compute the right singular vectors, that is, the matrix V: = 'V' : the matrix V is computed and returned in the array V = 'A' : the Jacobi rotations are applied to the MV-by-N array V. In other words, the right singular vector matrix V is not computed explicitly; instead it is applied to an MV-by-N matrix initially stored in the first MV rows of V. = 'N' : the matrix V is not computed and the array V is not referenced [in] M M is INTEGER The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0. [in] N N is INTEGER The number of columns of the input matrix A. M >= N >= 0. [in,out] A A is COMPLEX array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, If JOBU .EQ. 'U' .OR. JOBU .EQ. 'C': If INFO .EQ. 0 : RANKA orthonormal columns of U are returned in the leading RANKA columns of the array A. Here RANKA <= N is the number of computed singular values of A that are above the underflow threshold SLAMCH('S'). The singular vectors corresponding to underflowed or zero singular values are not computed. The value of RANKA is returned in the array RWORK as RANKA=NINT(RWORK(2)). Also see the descriptions of SVA and RWORK. The computed columns of U are mutually numerically orthogonal up to approximately TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU.EQ.'C'), see the description of JOBU. If INFO .GT. 0, the procedure CGESVJ did not converge in the given number of iterations (sweeps). In that case, the computed columns of U may not be orthogonal up to TOL. The output U (stored in A), SIGMA (given by the computed singular values in SVA(1:N)) and V is still a decomposition of the input matrix A in the sense that the residual || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small. If JOBU .EQ. 'N': If INFO .EQ. 0 : Note that the left singular vectors are 'for free' in the one-sided Jacobi SVD algorithm. However, if only the singular values are needed, the level of numerical orthogonality of U is not an issue and iterations are stopped when the columns of the iterated matrix are numerically orthogonal up to approximately M*EPS. Thus, on exit, A contains the columns of U scaled with the corresponding singular values. If INFO .GT. 0 : the procedure CGESVJ did not converge in the given number of iterations (sweeps). [in] LDA LDA is INTEGER The leading dimension of the array A. LDA >= max(1,M). [out] SVA SVA is REAL array, dimension (N) On exit, If INFO .EQ. 0 : depending on the value SCALE = RWORK(1), we have: If SCALE .EQ. ONE: SVA(1:N) contains the computed singular values of A. During the computation SVA contains the Euclidean column norms of the iterated matrices in the array A. If SCALE .NE. ONE: The singular values of A are SCALE*SVA(1:N), and this factored representation is due to the fact that some of the singular values of A might underflow or overflow. If INFO .GT. 0 : the procedure CGESVJ did not converge in the given number of iterations (sweeps) and SCALE*SVA(1:N) may not be accurate. [in] MV MV is INTEGER If JOBV .EQ. 'A', then the product of Jacobi rotations in CGESVJ is applied to the first MV rows of V. See the description of JOBV. [in,out] V V is COMPLEX array, dimension (LDV,N) If JOBV = 'V', then V contains on exit the N-by-N matrix of the right singular vectors; If JOBV = 'A', then V contains the product of the computed right singular vector matrix and the initial matrix in the array V. If JOBV = 'N', then V is not referenced. [in] LDV LDV is INTEGER The leading dimension of the array V, LDV .GE. 1. If JOBV .EQ. 'V', then LDV .GE. max(1,N). If JOBV .EQ. 'A', then LDV .GE. max(1,MV) . [in,out] CWORK CWORK is COMPLEX array, dimension M+N. Used as work space. [in] LWORK LWORK is INTEGER Length of CWORK, LWORK >= M+N. [in,out] RWORK RWORK is REAL array, dimension max(6,M+N). On entry, If JOBU .EQ. 'C' : RWORK(1) = CTOL, where CTOL defines the threshold for convergence. The process stops if all columns of A are mutually orthogonal up to CTOL*EPS, EPS=SLAMCH('E'). It is required that CTOL >= ONE, i.e. it is not allowed to force the routine to obtain orthogonality below EPSILON. On exit, RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N) are the computed singular values of A. (See description of SVA().) RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero singular values. RWORK(3) = NINT(RWORK(3)) is the number of the computed singular values that are larger than the underflow threshold. RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi rotations needed for numerical convergence. RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep. This is useful information in cases when CGESVJ did not converge, as it can be used to estimate whether the output is stil useful and for post festum analysis. RWORK(6) = the largest absolute value over all sines of the Jacobi rotation angles in the last sweep. It can be useful for a post festum analysis. [in] LRWORK LRWORK is INTEGER Length of RWORK, LRWORK >= MAX(6,N). [out] INFO INFO is INTEGER = 0 : successful exit. < 0 : if INFO = -i, then the i-th argument had an illegal value > 0 : CGESVJ did not converge in the maximal allowed number (NSWEEP=30) of sweeps. The output may still be useful. See the description of RWORK.
Date
June 2016
Further Details:
The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane rotations. In the case of underflow of the tangent of the Jacobi angle, a modified Jacobi transformation of Drmac [3] is used. Pivot strategy uses column interchanges of de Rijk [1]. The relative accuracy of the computed singular values and the accuracy of the computed singular vectors (in angle metric) is as guaranteed by the theory of Demmel and Veselic [2]. The condition number that determines the accuracy in the full rank case is essentially min_{D=diag} kappa(A*D), where kappa(.) is the spectral condition number. The best performance of this Jacobi SVD procedure is achieved if used in an accelerated version of Drmac and Veselic [4,5], and it is the kernel routine in the SIGMA library [6]. Some tunning parameters (marked with [TP]) are available for the implementer. The computational range for the nonzero singular values is the machine number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even denormalized singular values can be computed with the corresponding gradual loss of accurate digits.
Contributors:
Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany)
References:
[1] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the singular value decomposition on a vector computer. SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371. [2] J. Demmel and K. Veselic: Jacobi method is more accurate than QR. [3] Z. Drmac: Implementation of Jacobi rotations for accurate singular value computation in floating point arithmetic. SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222. [4] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I. SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342. LAPACK Working note 169. [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II. SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362. LAPACK Working note 170. [6] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV, QSVD, (H,K)-SVD computations. Department of Mathematics, University of Zagreb, 2008, 2015.
Please report all bugs and send interesting test examples and comments to drmac.nosp@m.@mat.nosp@m.h.hr. Thank you.

Definition at line 332 of file cgesvj.f.

332 *
333 * -- LAPACK computational routine (version 3.6.1) --
334 * -- LAPACK is a software package provided by Univ. of Tennessee, --
335 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
336 * June 2016
337 *
338  IMPLICIT NONE
339 * .. Scalar Arguments ..
340  INTEGER info, lda, ldv, lwork, lrwork, m, mv, n
341  CHARACTER*1 joba, jobu, jobv
342 * ..
343 * .. Array Arguments ..
344  COMPLEX a( lda, * ), v( ldv, * ), cwork( lwork )
345  REAL rwork( lrwork ), sva( n )
346 * ..
347 *
348 * =====================================================================
349 *
350 * .. Local Parameters ..
351  REAL zero, half, one
352  parameter( zero = 0.0e0, half = 0.5e0, one = 1.0e0)
353  COMPLEX czero, cone
354  parameter( czero = (0.0e0, 0.0e0), cone = (1.0e0, 0.0e0) )
355  INTEGER nsweep
356  parameter( nsweep = 30 )
357 * ..
358 * .. Local Scalars ..
359  COMPLEX aapq, ompq
360  REAL aapp, aapp0, aapq1, aaqq, apoaq, aqoap, big,
361  \$ bigtheta, cs, ctol, epsln, large, mxaapq,
362  \$ mxsinj, rootbig, rooteps, rootsfmin, roottol,
363  \$ skl, sfmin, small, sn, t, temp1, theta, thsign, tol
364  INTEGER blskip, emptsw, i, ibr, ierr, igl, ijblsk, ir1,
365  \$ iswrot, jbc, jgl, kbl, lkahead, mvl, n2, n34,
366  \$ n4, nbl, notrot, p, pskipped, q, rowskip, swband
367  LOGICAL applv, goscale, lower, lsvec, noscale, rotok,
368  \$ rsvec, uctol, upper
369 * ..
370 * ..
371 * .. Intrinsic Functions ..
372  INTRINSIC abs, amax1, amin1, conjg, float, min0, max0,
373  \$ sign, sqrt
374 * ..
375 * .. External Functions ..
376 * ..
377 * from BLAS
378  REAL scnrm2
379  COMPLEX cdotc
380  EXTERNAL cdotc, scnrm2
381  INTEGER isamax
382  EXTERNAL isamax
383 * from LAPACK
384  REAL slamch
385  EXTERNAL slamch
386  LOGICAL lsame
387  EXTERNAL lsame
388 * ..
389 * .. External Subroutines ..
390 * ..
391 * from BLAS
392  EXTERNAL ccopy, crot, csscal, cswap
393 * from LAPACK
394  EXTERNAL clascl, claset, classq, slascl, xerbla
395  EXTERNAL cgsvj0, cgsvj1
396 * ..
397 * .. Executable Statements ..
398 *
399 * Test the input arguments
400 *
401  lsvec = lsame( jobu, 'U' )
402  uctol = lsame( jobu, 'C' )
403  rsvec = lsame( jobv, 'V' )
404  applv = lsame( jobv, 'A' )
405  upper = lsame( joba, 'U' )
406  lower = lsame( joba, 'L' )
407 *
408  IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
409  info = -1
410  ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
411  info = -2
412  ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
413  info = -3
414  ELSE IF( m.LT.0 ) THEN
415  info = -4
416  ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
417  info = -5
418  ELSE IF( lda.LT.m ) THEN
419  info = -7
420  ELSE IF( mv.LT.0 ) THEN
421  info = -9
422  ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
423  \$ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
424  info = -11
425  ELSE IF( uctol .AND. ( rwork( 1 ).LE.one ) ) THEN
426  info = -12
427  ELSE IF( lwork.LT.( m+n ) ) THEN
428  info = -13
429  ELSE IF( lrwork.LT.max0( n, 6 ) ) THEN
430  info = -15
431  ELSE
432  info = 0
433  END IF
434 *
435 * #:(
436  IF( info.NE.0 ) THEN
437  CALL xerbla( 'CGESVJ', -info )
438  RETURN
439  END IF
440 *
441 * #:) Quick return for void matrix
442 *
443  IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
444 *
445 * Set numerical parameters
446 * The stopping criterion for Jacobi rotations is
447 *
448 * max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS
449 *
450 * where EPS is the round-off and CTOL is defined as follows:
451 *
452  IF( uctol ) THEN
453 * ... user controlled
454  ctol = rwork( 1 )
455  ELSE
456 * ... default
457  IF( lsvec .OR. rsvec .OR. applv ) THEN
458  ctol = sqrt( float( m ) )
459  ELSE
460  ctol = float( m )
461  END IF
462  END IF
463 * ... and the machine dependent parameters are
464 *[!] (Make sure that SLAMCH() works properly on the target machine.)
465 *
466  epsln = slamch( 'Epsilon' )
467  rooteps = sqrt( epsln )
468  sfmin = slamch( 'SafeMinimum' )
469  rootsfmin = sqrt( sfmin )
470  small = sfmin / epsln
471  big = slamch( 'Overflow' )
472 * BIG = ONE / SFMIN
473  rootbig = one / rootsfmin
474  large = big / sqrt( float( m*n ) )
475  bigtheta = one / rooteps
476 *
477  tol = ctol*epsln
478  roottol = sqrt( tol )
479 *
480  IF( float( m )*epsln.GE.one ) THEN
481  info = -4
482  CALL xerbla( 'CGESVJ', -info )
483  RETURN
484  END IF
485 *
486 * Initialize the right singular vector matrix.
487 *
488  IF( rsvec ) THEN
489  mvl = n
490  CALL claset( 'A', mvl, n, czero, cone, v, ldv )
491  ELSE IF( applv ) THEN
492  mvl = mv
493  END IF
494  rsvec = rsvec .OR. applv
495 *
496 * Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
497 *(!) If necessary, scale A to protect the largest singular value
498 * from overflow. It is possible that saving the largest singular
499 * value destroys the information about the small ones.
500 * This initial scaling is almost minimal in the sense that the
501 * goal is to make sure that no column norm overflows, and that
502 * SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
503 * in A are detected, the procedure returns with INFO=-6.
504 *
505  skl = one / sqrt( float( m )*float( n ) )
506  noscale = .true.
507  goscale = .true.
508 *
509  IF( lower ) THEN
510 * the input matrix is M-by-N lower triangular (trapezoidal)
511  DO 1874 p = 1, n
512  aapp = zero
513  aaqq = one
514  CALL classq( m-p+1, a( p, p ), 1, aapp, aaqq )
515  IF( aapp.GT.big ) THEN
516  info = -6
517  CALL xerbla( 'CGESVJ', -info )
518  RETURN
519  END IF
520  aaqq = sqrt( aaqq )
521  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
522  sva( p ) = aapp*aaqq
523  ELSE
524  noscale = .false.
525  sva( p ) = aapp*( aaqq*skl )
526  IF( goscale ) THEN
527  goscale = .false.
528  DO 1873 q = 1, p - 1
529  sva( q ) = sva( q )*skl
530  1873 CONTINUE
531  END IF
532  END IF
533  1874 CONTINUE
534  ELSE IF( upper ) THEN
535 * the input matrix is M-by-N upper triangular (trapezoidal)
536  DO 2874 p = 1, n
537  aapp = zero
538  aaqq = one
539  CALL classq( p, a( 1, p ), 1, aapp, aaqq )
540  IF( aapp.GT.big ) THEN
541  info = -6
542  CALL xerbla( 'CGESVJ', -info )
543  RETURN
544  END IF
545  aaqq = sqrt( aaqq )
546  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
547  sva( p ) = aapp*aaqq
548  ELSE
549  noscale = .false.
550  sva( p ) = aapp*( aaqq*skl )
551  IF( goscale ) THEN
552  goscale = .false.
553  DO 2873 q = 1, p - 1
554  sva( q ) = sva( q )*skl
555  2873 CONTINUE
556  END IF
557  END IF
558  2874 CONTINUE
559  ELSE
560 * the input matrix is M-by-N general dense
561  DO 3874 p = 1, n
562  aapp = zero
563  aaqq = one
564  CALL classq( m, a( 1, p ), 1, aapp, aaqq )
565  IF( aapp.GT.big ) THEN
566  info = -6
567  CALL xerbla( 'CGESVJ', -info )
568  RETURN
569  END IF
570  aaqq = sqrt( aaqq )
571  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
572  sva( p ) = aapp*aaqq
573  ELSE
574  noscale = .false.
575  sva( p ) = aapp*( aaqq*skl )
576  IF( goscale ) THEN
577  goscale = .false.
578  DO 3873 q = 1, p - 1
579  sva( q ) = sva( q )*skl
580  3873 CONTINUE
581  END IF
582  END IF
583  3874 CONTINUE
584  END IF
585 *
586  IF( noscale )skl = one
587 *
588 * Move the smaller part of the spectrum from the underflow threshold
589 *(!) Start by determining the position of the nonzero entries of the
590 * array SVA() relative to ( SFMIN, BIG ).
591 *
592  aapp = zero
593  aaqq = big
594  DO 4781 p = 1, n
595  IF( sva( p ).NE.zero )aaqq = amin1( aaqq, sva( p ) )
596  aapp = amax1( aapp, sva( p ) )
597  4781 CONTINUE
598 *
599 * #:) Quick return for zero matrix
600 *
601  IF( aapp.EQ.zero ) THEN
602  IF( lsvec )CALL claset( 'G', m, n, czero, cone, a, lda )
603  rwork( 1 ) = one
604  rwork( 2 ) = zero
605  rwork( 3 ) = zero
606  rwork( 4 ) = zero
607  rwork( 5 ) = zero
608  rwork( 6 ) = zero
609  RETURN
610  END IF
611 *
612 * #:) Quick return for one-column matrix
613 *
614  IF( n.EQ.1 ) THEN
615  IF( lsvec )CALL clascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
616  \$ a( 1, 1 ), lda, ierr )
617  rwork( 1 ) = one / skl
618  IF( sva( 1 ).GE.sfmin ) THEN
619  rwork( 2 ) = one
620  ELSE
621  rwork( 2 ) = zero
622  END IF
623  rwork( 3 ) = zero
624  rwork( 4 ) = zero
625  rwork( 5 ) = zero
626  rwork( 6 ) = zero
627  RETURN
628  END IF
629 *
630 * Protect small singular values from underflow, and try to
631 * avoid underflows/overflows in computing Jacobi rotations.
632 *
633  sn = sqrt( sfmin / epsln )
634  temp1 = sqrt( big / float( n ) )
635  IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
636  \$ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
637  temp1 = amin1( big, temp1 / aapp )
638 * AAQQ = AAQQ*TEMP1
639 * AAPP = AAPP*TEMP1
640  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
641  temp1 = amin1( sn / aaqq, big / ( aapp*sqrt( float( n ) ) ) )
642 * AAQQ = AAQQ*TEMP1
643 * AAPP = AAPP*TEMP1
644  ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
645  temp1 = amax1( sn / aaqq, temp1 / aapp )
646 * AAQQ = AAQQ*TEMP1
647 * AAPP = AAPP*TEMP1
648  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
649  temp1 = amin1( sn / aaqq, big / ( sqrt( float( n ) )*aapp ) )
650 * AAQQ = AAQQ*TEMP1
651 * AAPP = AAPP*TEMP1
652  ELSE
653  temp1 = one
654  END IF
655 *
656 * Scale, if necessary
657 *
658  IF( temp1.NE.one ) THEN
659  CALL slascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
660  END IF
661  skl = temp1*skl
662  IF( skl.NE.one ) THEN
663  CALL clascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
664  skl = one / skl
665  END IF
666 *
667 * Row-cyclic Jacobi SVD algorithm with column pivoting
668 *
669  emptsw = ( n*( n-1 ) ) / 2
670  notrot = 0
671
672  DO 1868 q = 1, n
673  cwork( q ) = cone
674  1868 CONTINUE
675 *
676 *
677 *
678  swband = 3
679 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
680 * if CGESVJ is used as a computational routine in the preconditioned
681 * Jacobi SVD algorithm CGEJSV. For sweeps i=1:SWBAND the procedure
682 * works on pivots inside a band-like region around the diagonal.
683 * The boundaries are determined dynamically, based on the number of
684 * pivots above a threshold.
685 *
686  kbl = min0( 8, n )
687 *[TP] KBL is a tuning parameter that defines the tile size in the
688 * tiling of the p-q loops of pivot pairs. In general, an optimal
689 * value of KBL depends on the matrix dimensions and on the
690 * parameters of the computer's memory.
691 *
692  nbl = n / kbl
693  IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
694 *
695  blskip = kbl**2
696 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
697 *
698  rowskip = min0( 5, kbl )
699 *[TP] ROWSKIP is a tuning parameter.
700 *
702 *[TP] LKAHEAD is a tuning parameter.
703 *
704 * Quasi block transformations, using the lower (upper) triangular
705 * structure of the input matrix. The quasi-block-cycling usually
706 * invokes cubic convergence. Big part of this cycle is done inside
707 * canonical subspaces of dimensions less than M.
708 *
709  IF( ( lower .OR. upper ) .AND. ( n.GT.max0( 64, 4*kbl ) ) ) THEN
710 *[TP] The number of partition levels and the actual partition are
711 * tuning parameters.
712  n4 = n / 4
713  n2 = n / 2
714  n34 = 3*n4
715  IF( applv ) THEN
716  q = 0
717  ELSE
718  q = 1
719  END IF
720 *
721  IF( lower ) THEN
722 *
723 * This works very well on lower triangular matrices, in particular
724 * in the framework of the preconditioned Jacobi SVD (xGEJSV).
725 * The idea is simple:
726 * [+ 0 0 0] Note that Jacobi transformations of [0 0]
727 * [+ + 0 0] [0 0]
728 * [+ + x 0] actually work on [x 0] [x 0]
729 * [+ + x x] [x x]. [x x]
730 *
731  CALL cgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
732  \$ cwork( n34+1 ), sva( n34+1 ), mvl,
733  \$ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
734  \$ 2, cwork( n+1 ), lwork-n, ierr )
735
736  CALL cgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
737  \$ cwork( n2+1 ), sva( n2+1 ), mvl,
738  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
739  \$ cwork( n+1 ), lwork-n, ierr )
740
741  CALL cgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
742  \$ cwork( n2+1 ), sva( n2+1 ), mvl,
743  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
744  \$ cwork( n+1 ), lwork-n, ierr )
745 *
746  CALL cgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
747  \$ cwork( n4+1 ), sva( n4+1 ), mvl,
748  \$ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
749  \$ cwork( n+1 ), lwork-n, ierr )
750 *
751  CALL cgsvj0( jobv, m, n4, a, lda, cwork, sva, mvl, v, ldv,
752  \$ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
753  \$ ierr )
754 *
755  CALL cgsvj1( jobv, m, n2, n4, a, lda, cwork, sva, mvl, v,
756  \$ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
757  \$ lwork-n, ierr )
758 *
759 *
760  ELSE IF( upper ) THEN
761 *
762 *
763  CALL cgsvj0( jobv, n4, n4, a, lda, cwork, sva, mvl, v, ldv,
764  \$ epsln, sfmin, tol, 2, cwork( n+1 ), lwork-n,
765  \$ ierr )
766 *
767  CALL cgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, cwork( n4+1 ),
768  \$ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
769  \$ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
770  \$ ierr )
771 *
772  CALL cgsvj1( jobv, n2, n2, n4, a, lda, cwork, sva, mvl, v,
773  \$ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
774  \$ lwork-n, ierr )
775 *
776  CALL cgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
777  \$ cwork( n2+1 ), sva( n2+1 ), mvl,
778  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
779  \$ cwork( n+1 ), lwork-n, ierr )
780
781  END IF
782 *
783  END IF
784 *
785 * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
786 *
787  DO 1993 i = 1, nsweep
788 *
789 * .. go go go ...
790 *
791  mxaapq = zero
792  mxsinj = zero
793  iswrot = 0
794 *
795  notrot = 0
796  pskipped = 0
797 *
798 * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
799 * 1 <= p < q <= N. This is the first step toward a blocked implementation
800 * of the rotations. New implementation, based on block transformations,
801 * is under development.
802 *
803  DO 2000 ibr = 1, nbl
804 *
805  igl = ( ibr-1 )*kbl + 1
806 *
807  DO 1002 ir1 = 0, min0( lkahead, nbl-ibr )
808 *
809  igl = igl + ir1*kbl
810 *
811  DO 2001 p = igl, min0( igl+kbl-1, n-1 )
812 *
813 * .. de Rijk's pivoting
814 *
815  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
816  IF( p.NE.q ) THEN
817  CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
818  IF( rsvec )CALL cswap( mvl, v( 1, p ), 1,
819  \$ v( 1, q ), 1 )
820  temp1 = sva( p )
821  sva( p ) = sva( q )
822  sva( q ) = temp1
823  aapq = cwork(p)
824  cwork(p) = cwork(q)
825  cwork(q) = aapq
826  END IF
827 *
828  IF( ir1.EQ.0 ) THEN
829 *
830 * Column norms are periodically updated by explicit
831 * norm computation.
832 *[!] Caveat:
833 * Unfortunately, some BLAS implementations compute SCNRM2(M,A(1,p),1)
834 * as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
835 * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
836 * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
837 * Hence, SCNRM2 cannot be trusted, not even in the case when
838 * the true norm is far from the under(over)flow boundaries.
839 * If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF
840 * below should be replaced with "AAPP = SCNRM2( M, A(1,p), 1 )".
841 *
842  IF( ( sva( p ).LT.rootbig ) .AND.
843  \$ ( sva( p ).GT.rootsfmin ) ) THEN
844  sva( p ) = scnrm2( m, a( 1, p ), 1 )
845  ELSE
846  temp1 = zero
847  aapp = one
848  CALL classq( m, a( 1, p ), 1, temp1, aapp )
849  sva( p ) = temp1*sqrt( aapp )
850  END IF
851  aapp = sva( p )
852  ELSE
853  aapp = sva( p )
854  END IF
855 *
856  IF( aapp.GT.zero ) THEN
857 *
858  pskipped = 0
859 *
860  DO 2002 q = p + 1, min0( igl+kbl-1, n )
861 *
862  aaqq = sva( q )
863 *
864  IF( aaqq.GT.zero ) THEN
865 *
866  aapp0 = aapp
867  IF( aaqq.GE.one ) THEN
868  rotok = ( small*aapp ).LE.aaqq
869  IF( aapp.LT.( big / aaqq ) ) THEN
870  aapq = ( cdotc( m, a( 1, p ), 1,
871  \$ a( 1, q ), 1 ) / aaqq ) / aapp
872  ELSE
873  CALL ccopy( m, a( 1, p ), 1,
874  \$ cwork(n+1), 1 )
875  CALL clascl( 'G', 0, 0, aapp, one,
876  \$ m, 1, cwork(n+1), lda, ierr )
877  aapq = cdotc( m, cwork(n+1), 1,
878  \$ a( 1, q ), 1 ) / aaqq
879  END IF
880  ELSE
881  rotok = aapp.LE.( aaqq / small )
882  IF( aapp.GT.( small / aaqq ) ) THEN
883  aapq = ( cdotc( m, a( 1, p ), 1,
884  \$ a( 1, q ), 1 ) / aaqq ) / aapp
885  ELSE
886  CALL ccopy( m, a( 1, q ), 1,
887  \$ cwork(n+1), 1 )
888  CALL clascl( 'G', 0, 0, aaqq,
889  \$ one, m, 1,
890  \$ cwork(n+1), lda, ierr )
891  aapq = cdotc( m, a(1, p ), 1,
892  \$ cwork(n+1), 1 ) / aapp
893  END IF
894  END IF
895 *
896 * AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q)
897  aapq1 = -abs(aapq)
898  mxaapq = amax1( mxaapq, -aapq1 )
899 *
900 * TO rotate or NOT to rotate, THAT is the question ...
901 *
902  IF( abs( aapq1 ).GT.tol ) THEN
903 *
904 * .. rotate
905 *[RTD] ROTATED = ROTATED + ONE
906 *
907  IF( ir1.EQ.0 ) THEN
908  notrot = 0
909  pskipped = 0
910  iswrot = iswrot + 1
911  END IF
912 *
913  IF( rotok ) THEN
914 *
915  ompq = aapq / abs(aapq)
916  aqoap = aaqq / aapp
917  apoaq = aapp / aaqq
918  theta = -half*abs( aqoap-apoaq )/aapq1
919 *
920  IF( abs( theta ).GT.bigtheta ) THEN
921 *
922  t = half / theta
923  cs = one
924
925  CALL crot( m, a(1,p), 1, a(1,q), 1,
926  \$ cs, conjg(ompq)*t )
927  IF ( rsvec ) THEN
928  CALL crot( mvl, v(1,p), 1,
929  \$ v(1,q), 1, cs, conjg(ompq)*t )
930  END IF
931
932  sva( q ) = aaqq*sqrt( amax1( zero,
933  \$ one+t*apoaq*aapq1 ) )
934  aapp = aapp*sqrt( amax1( zero,
935  \$ one-t*aqoap*aapq1 ) )
936  mxsinj = amax1( mxsinj, abs( t ) )
937 *
938  ELSE
939 *
940 * .. choose correct signum for THETA and rotate
941 *
942  thsign = -sign( one, aapq1 )
943  t = one / ( theta+thsign*
944  \$ sqrt( one+theta*theta ) )
945  cs = sqrt( one / ( one+t*t ) )
946  sn = t*cs
947 *
948  mxsinj = amax1( mxsinj, abs( sn ) )
949  sva( q ) = aaqq*sqrt( amax1( zero,
950  \$ one+t*apoaq*aapq1 ) )
951  aapp = aapp*sqrt( amax1( zero,
952  \$ one-t*aqoap*aapq1 ) )
953 *
954  CALL crot( m, a(1,p), 1, a(1,q), 1,
955  \$ cs, conjg(ompq)*sn )
956  IF ( rsvec ) THEN
957  CALL crot( mvl, v(1,p), 1,
958  \$ v(1,q), 1, cs, conjg(ompq)*sn )
959  END IF
960  END IF
961  cwork(p) = -cwork(q) * ompq
962 *
963  ELSE
964 * .. have to use modified Gram-Schmidt like transformation
965  CALL ccopy( m, a( 1, p ), 1,
966  \$ cwork(n+1), 1 )
967  CALL clascl( 'G', 0, 0, aapp, one, m,
968  \$ 1, cwork(n+1), lda,
969  \$ ierr )
970  CALL clascl( 'G', 0, 0, aaqq, one, m,
971  \$ 1, a( 1, q ), lda, ierr )
972  CALL caxpy( m, -aapq, cwork(n+1), 1,
973  \$ a( 1, q ), 1 )
974  CALL clascl( 'G', 0, 0, one, aaqq, m,
975  \$ 1, a( 1, q ), lda, ierr )
976  sva( q ) = aaqq*sqrt( amax1( zero,
977  \$ one-aapq1*aapq1 ) )
978  mxsinj = amax1( mxsinj, sfmin )
979  END IF
980 * END IF ROTOK THEN ... ELSE
981 *
982 * In the case of cancellation in updating SVA(q), SVA(p)
983 * recompute SVA(q), SVA(p).
984 *
985  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
986  \$ THEN
987  IF( ( aaqq.LT.rootbig ) .AND.
988  \$ ( aaqq.GT.rootsfmin ) ) THEN
989  sva( q ) = scnrm2( m, a( 1, q ), 1 )
990  ELSE
991  t = zero
992  aaqq = one
993  CALL classq( m, a( 1, q ), 1, t,
994  \$ aaqq )
995  sva( q ) = t*sqrt( aaqq )
996  END IF
997  END IF
998  IF( ( aapp / aapp0 ).LE.rooteps ) THEN
999  IF( ( aapp.LT.rootbig ) .AND.
1000  \$ ( aapp.GT.rootsfmin ) ) THEN
1001  aapp = scnrm2( m, a( 1, p ), 1 )
1002  ELSE
1003  t = zero
1004  aapp = one
1005  CALL classq( m, a( 1, p ), 1, t,
1006  \$ aapp )
1007  aapp = t*sqrt( aapp )
1008  END IF
1009  sva( p ) = aapp
1010  END IF
1011 *
1012  ELSE
1013 * A(:,p) and A(:,q) already numerically orthogonal
1014  IF( ir1.EQ.0 )notrot = notrot + 1
1015 *[RTD] SKIPPED = SKIPPED + 1
1016  pskipped = pskipped + 1
1017  END IF
1018  ELSE
1019 * A(:,q) is zero column
1020  IF( ir1.EQ.0 )notrot = notrot + 1
1021  pskipped = pskipped + 1
1022  END IF
1023 *
1024  IF( ( i.LE.swband ) .AND.
1025  \$ ( pskipped.GT.rowskip ) ) THEN
1026  IF( ir1.EQ.0 )aapp = -aapp
1027  notrot = 0
1028  GO TO 2103
1029  END IF
1030 *
1031  2002 CONTINUE
1032 * END q-LOOP
1033 *
1034  2103 CONTINUE
1035 * bailed out of q-loop
1036 *
1037  sva( p ) = aapp
1038 *
1039  ELSE
1040  sva( p ) = aapp
1041  IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1042  \$ notrot = notrot + min0( igl+kbl-1, n ) - p
1043  END IF
1044 *
1045  2001 CONTINUE
1046 * end of the p-loop
1047 * end of doing the block ( ibr, ibr )
1048  1002 CONTINUE
1049 * end of ir1-loop
1050 *
1051 * ... go to the off diagonal blocks
1052 *
1053  igl = ( ibr-1 )*kbl + 1
1054 *
1055  DO 2010 jbc = ibr + 1, nbl
1056 *
1057  jgl = ( jbc-1 )*kbl + 1
1058 *
1059 * doing the block at ( ibr, jbc )
1060 *
1061  ijblsk = 0
1062  DO 2100 p = igl, min0( igl+kbl-1, n )
1063 *
1064  aapp = sva( p )
1065  IF( aapp.GT.zero ) THEN
1066 *
1067  pskipped = 0
1068 *
1069  DO 2200 q = jgl, min0( jgl+kbl-1, n )
1070 *
1071  aaqq = sva( q )
1072  IF( aaqq.GT.zero ) THEN
1073  aapp0 = aapp
1074 *
1075 * .. M x 2 Jacobi SVD ..
1076 *
1077 * Safe Gram matrix computation
1078 *
1079  IF( aaqq.GE.one ) THEN
1080  IF( aapp.GE.aaqq ) THEN
1081  rotok = ( small*aapp ).LE.aaqq
1082  ELSE
1083  rotok = ( small*aaqq ).LE.aapp
1084  END IF
1085  IF( aapp.LT.( big / aaqq ) ) THEN
1086  aapq = ( cdotc( m, a( 1, p ), 1,
1087  \$ a( 1, q ), 1 ) / aaqq ) / aapp
1088  ELSE
1089  CALL ccopy( m, a( 1, p ), 1,
1090  \$ cwork(n+1), 1 )
1091  CALL clascl( 'G', 0, 0, aapp,
1092  \$ one, m, 1,
1093  \$ cwork(n+1), lda, ierr )
1094  aapq = cdotc( m, cwork(n+1), 1,
1095  \$ a( 1, q ), 1 ) / aaqq
1096  END IF
1097  ELSE
1098  IF( aapp.GE.aaqq ) THEN
1099  rotok = aapp.LE.( aaqq / small )
1100  ELSE
1101  rotok = aaqq.LE.( aapp / small )
1102  END IF
1103  IF( aapp.GT.( small / aaqq ) ) THEN
1104  aapq = ( cdotc( m, a( 1, p ), 1,
1105  \$ a( 1, q ), 1 ) / aaqq ) / aapp
1106  ELSE
1107  CALL ccopy( m, a( 1, q ), 1,
1108  \$ cwork(n+1), 1 )
1109  CALL clascl( 'G', 0, 0, aaqq,
1110  \$ one, m, 1,
1111  \$ cwork(n+1), lda, ierr )
1112  aapq = cdotc( m, a( 1, p ), 1,
1113  \$ cwork(n+1), 1 ) / aapp
1114  END IF
1115  END IF
1116 *
1117 * AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q)
1118  aapq1 = -abs(aapq)
1119  mxaapq = amax1( mxaapq, -aapq1 )
1120 *
1121 * TO rotate or NOT to rotate, THAT is the question ...
1122 *
1123  IF( abs( aapq1 ).GT.tol ) THEN
1124  notrot = 0
1125 *[RTD] ROTATED = ROTATED + 1
1126  pskipped = 0
1127  iswrot = iswrot + 1
1128 *
1129  IF( rotok ) THEN
1130 *
1131  ompq = aapq / abs(aapq)
1132  aqoap = aaqq / aapp
1133  apoaq = aapp / aaqq
1134  theta = -half*abs( aqoap-apoaq )/ aapq1
1135  IF( aaqq.GT.aapp0 )theta = -theta
1136 *
1137  IF( abs( theta ).GT.bigtheta ) THEN
1138  t = half / theta
1139  cs = one
1140  CALL crot( m, a(1,p), 1, a(1,q), 1,
1141  \$ cs, conjg(ompq)*t )
1142  IF( rsvec ) THEN
1143  CALL crot( mvl, v(1,p), 1,
1144  \$ v(1,q), 1, cs, conjg(ompq)*t )
1145  END IF
1146  sva( q ) = aaqq*sqrt( amax1( zero,
1147  \$ one+t*apoaq*aapq1 ) )
1148  aapp = aapp*sqrt( amax1( zero,
1149  \$ one-t*aqoap*aapq1 ) )
1150  mxsinj = amax1( mxsinj, abs( t ) )
1151  ELSE
1152 *
1153 * .. choose correct signum for THETA and rotate
1154 *
1155  thsign = -sign( one, aapq1 )
1156  IF( aaqq.GT.aapp0 )thsign = -thsign
1157  t = one / ( theta+thsign*
1158  \$ sqrt( one+theta*theta ) )
1159  cs = sqrt( one / ( one+t*t ) )
1160  sn = t*cs
1161  mxsinj = amax1( mxsinj, abs( sn ) )
1162  sva( q ) = aaqq*sqrt( amax1( zero,
1163  \$ one+t*apoaq*aapq1 ) )
1164  aapp = aapp*sqrt( amax1( zero,
1165  \$ one-t*aqoap*aapq1 ) )
1166 *
1167  CALL crot( m, a(1,p), 1, a(1,q), 1,
1168  \$ cs, conjg(ompq)*sn )
1169  IF( rsvec ) THEN
1170  CALL crot( mvl, v(1,p), 1,
1171  \$ v(1,q), 1, cs, conjg(ompq)*sn )
1172  END IF
1173  END IF
1174  cwork(p) = -cwork(q) * ompq
1175 *
1176  ELSE
1177 * .. have to use modified Gram-Schmidt like transformation
1178  IF( aapp.GT.aaqq ) THEN
1179  CALL ccopy( m, a( 1, p ), 1,
1180  \$ cwork(n+1), 1 )
1181  CALL clascl( 'G', 0, 0, aapp, one,
1182  \$ m, 1, cwork(n+1),lda,
1183  \$ ierr )
1184  CALL clascl( 'G', 0, 0, aaqq, one,
1185  \$ m, 1, a( 1, q ), lda,
1186  \$ ierr )
1187  CALL caxpy( m, -aapq, cwork(n+1),
1188  \$ 1, a( 1, q ), 1 )
1189  CALL clascl( 'G', 0, 0, one, aaqq,
1190  \$ m, 1, a( 1, q ), lda,
1191  \$ ierr )
1192  sva( q ) = aaqq*sqrt( amax1( zero,
1193  \$ one-aapq1*aapq1 ) )
1194  mxsinj = amax1( mxsinj, sfmin )
1195  ELSE
1196  CALL ccopy( m, a( 1, q ), 1,
1197  \$ cwork(n+1), 1 )
1198  CALL clascl( 'G', 0, 0, aaqq, one,
1199  \$ m, 1, cwork(n+1),lda,
1200  \$ ierr )
1201  CALL clascl( 'G', 0, 0, aapp, one,
1202  \$ m, 1, a( 1, p ), lda,
1203  \$ ierr )
1204  CALL caxpy( m, -conjg(aapq),
1205  \$ cwork(n+1), 1, a( 1, p ), 1 )
1206  CALL clascl( 'G', 0, 0, one, aapp,
1207  \$ m, 1, a( 1, p ), lda,
1208  \$ ierr )
1209  sva( p ) = aapp*sqrt( amax1( zero,
1210  \$ one-aapq1*aapq1 ) )
1211  mxsinj = amax1( mxsinj, sfmin )
1212  END IF
1213  END IF
1214 * END IF ROTOK THEN ... ELSE
1215 *
1216 * In the case of cancellation in updating SVA(q), SVA(p)
1217 * .. recompute SVA(q), SVA(p)
1218  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1219  \$ THEN
1220  IF( ( aaqq.LT.rootbig ) .AND.
1221  \$ ( aaqq.GT.rootsfmin ) ) THEN
1222  sva( q ) = scnrm2( m, a( 1, q ), 1)
1223  ELSE
1224  t = zero
1225  aaqq = one
1226  CALL classq( m, a( 1, q ), 1, t,
1227  \$ aaqq )
1228  sva( q ) = t*sqrt( aaqq )
1229  END IF
1230  END IF
1231  IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1232  IF( ( aapp.LT.rootbig ) .AND.
1233  \$ ( aapp.GT.rootsfmin ) ) THEN
1234  aapp = scnrm2( m, a( 1, p ), 1 )
1235  ELSE
1236  t = zero
1237  aapp = one
1238  CALL classq( m, a( 1, p ), 1, t,
1239  \$ aapp )
1240  aapp = t*sqrt( aapp )
1241  END IF
1242  sva( p ) = aapp
1243  END IF
1244 * end of OK rotation
1245  ELSE
1246  notrot = notrot + 1
1247 *[RTD] SKIPPED = SKIPPED + 1
1248  pskipped = pskipped + 1
1249  ijblsk = ijblsk + 1
1250  END IF
1251  ELSE
1252  notrot = notrot + 1
1253  pskipped = pskipped + 1
1254  ijblsk = ijblsk + 1
1255  END IF
1256 *
1257  IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1258  \$ THEN
1259  sva( p ) = aapp
1260  notrot = 0
1261  GO TO 2011
1262  END IF
1263  IF( ( i.LE.swband ) .AND.
1264  \$ ( pskipped.GT.rowskip ) ) THEN
1265  aapp = -aapp
1266  notrot = 0
1267  GO TO 2203
1268  END IF
1269 *
1270  2200 CONTINUE
1271 * end of the q-loop
1272  2203 CONTINUE
1273 *
1274  sva( p ) = aapp
1275 *
1276  ELSE
1277 *
1278  IF( aapp.EQ.zero )notrot = notrot +
1279  \$ min0( jgl+kbl-1, n ) - jgl + 1
1280  IF( aapp.LT.zero )notrot = 0
1281 *
1282  END IF
1283 *
1284  2100 CONTINUE
1285 * end of the p-loop
1286  2010 CONTINUE
1287 * end of the jbc-loop
1288  2011 CONTINUE
1289 *2011 bailed out of the jbc-loop
1290  DO 2012 p = igl, min0( igl+kbl-1, n )
1291  sva( p ) = abs( sva( p ) )
1292  2012 CONTINUE
1293 ***
1294  2000 CONTINUE
1295 *2000 :: end of the ibr-loop
1296 *
1297 * .. update SVA(N)
1298  IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1299  \$ THEN
1300  sva( n ) = scnrm2( m, a( 1, n ), 1 )
1301  ELSE
1302  t = zero
1303  aapp = one
1304  CALL classq( m, a( 1, n ), 1, t, aapp )
1305  sva( n ) = t*sqrt( aapp )
1306  END IF
1307 *
1309 *
1310  IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1311  \$ ( iswrot.LE.n ) ) )swband = i
1312 *
1313  IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( float( n ) )*
1314  \$ tol ) .AND. ( float( n )*mxaapq*mxsinj.LT.tol ) ) THEN
1315  GO TO 1994
1316  END IF
1317 *
1318  IF( notrot.GE.emptsw )GO TO 1994
1319 *
1320  1993 CONTINUE
1321 * end i=1:NSWEEP loop
1322 *
1323 * #:( Reaching this point means that the procedure has not converged.
1324  info = nsweep - 1
1325  GO TO 1995
1326 *
1327  1994 CONTINUE
1328 * #:) Reaching this point means numerical convergence after the i-th
1329 * sweep.
1330 *
1331  info = 0
1332 * #:) INFO = 0 confirms successful iterations.
1333  1995 CONTINUE
1334 *
1335 * Sort the singular values and find how many are above
1336 * the underflow threshold.
1337 *
1338  n2 = 0
1339  n4 = 0
1340  DO 5991 p = 1, n - 1
1341  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
1342  IF( p.NE.q ) THEN
1343  temp1 = sva( p )
1344  sva( p ) = sva( q )
1345  sva( q ) = temp1
1346  CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1347  IF( rsvec )CALL cswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1348  END IF
1349  IF( sva( p ).NE.zero ) THEN
1350  n4 = n4 + 1
1351  IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1352  END IF
1353  5991 CONTINUE
1354  IF( sva( n ).NE.zero ) THEN
1355  n4 = n4 + 1
1356  IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1357  END IF
1358 *
1359 * Normalize the left singular vectors.
1360 *
1361  IF( lsvec .OR. uctol ) THEN
1362  DO 1998 p = 1, n2
1363  CALL csscal( m, one / sva( p ), a( 1, p ), 1 )
1364  1998 CONTINUE
1365  END IF
1366 *
1367 * Scale the product of Jacobi rotations.
1368 *
1369  IF( rsvec ) THEN
1370  DO 2399 p = 1, n
1371  temp1 = one / scnrm2( mvl, v( 1, p ), 1 )
1372  CALL csscal( mvl, temp1, v( 1, p ), 1 )
1373  2399 CONTINUE
1374  END IF
1375 *
1376 * Undo scaling, if necessary (and possible).
1377  IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1378  \$ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1379  \$ ( sfmin / skl ) ) ) ) THEN
1380  DO 2400 p = 1, n
1381  sva( p ) = skl*sva( p )
1382  2400 CONTINUE
1383  skl = one
1384  END IF
1385 *
1386  rwork( 1 ) = skl
1387 * The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1388 * then some of the singular values may overflow or underflow and
1389 * the spectrum is given in this factored representation.
1390 *
1391  rwork( 2 ) = float( n4 )
1392 * N4 is the number of computed nonzero singular values of A.
1393 *
1394  rwork( 3 ) = float( n2 )
1395 * N2 is the number of singular values of A greater than SFMIN.
1396 * If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1397 * that may carry some information.
1398 *
1399  rwork( 4 ) = float( i )
1400 * i is the index of the last sweep before declaring convergence.
1401 *
1402  rwork( 5 ) = mxaapq
1403 * MXAAPQ is the largest absolute value of scaled pivots in the
1404 * last sweep
1405 *
1406  rwork( 6 ) = mxsinj
1407 * MXSINJ is the largest absolute value of the sines of Jacobi angles
1408 * in the last sweep
1409 *
1410  RETURN
1411 * ..
1412 * .. END OF CGESVJ
1413 * ..
real function scnrm2(N, X, INCX)
SCNRM2
Definition: scnrm2.f:56
integer function isamax(N, SX, INCX)
ISAMAX
Definition: isamax.f:53
subroutine clascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
CLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: clascl.f:145
subroutine slascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
SLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: slascl.f:145
subroutine classq(N, X, INCX, SCALE, SUMSQ)
CLASSQ updates a sum of squares represented in scaled form.
Definition: classq.f:108
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:62
subroutine claset(UPLO, M, N, ALPHA, BETA, A, LDA)
CLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values...
Definition: claset.f:108
complex function cdotc(N, CX, INCX, CY, INCY)
CDOTC
Definition: cdotc.f:54
subroutine cgsvj0(JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
CGSVJ0 pre-processor for the routine cgesvj.
Definition: cgsvj0.f:220
subroutine ccopy(N, CX, INCX, CY, INCY)
CCOPY
Definition: ccopy.f:52
subroutine cswap(N, CX, INCX, CY, INCY)
CSWAP
Definition: cswap.f:52
real function slamch(CMACH)
SLAMCH
Definition: slamch.f:69
subroutine cgsvj1(JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
CGSVJ1 pre-processor for the routine cgesvj, applies Jacobi rotations targeting only particular pivot...
Definition: cgsvj1.f:238
subroutine caxpy(N, CA, CX, INCX, CY, INCY)
CAXPY
Definition: caxpy.f:53
logical function lsame(CA, CB)
LSAME
Definition: lsame.f:55
subroutine csscal(N, SA, CX, INCX)
CSSCAL
Definition: csscal.f:54
subroutine crot(N, CX, INCX, CY, INCY, C, S)
CROT applies a plane rotation with real cosine and complex sine to a pair of complex vectors...
Definition: crot.f:105

Here is the call graph for this function:

Here is the caller graph for this function: