subroutine sgesvj	(	character*1	JOBA,
		character*1	JOBU,
		character*1	JOBV,
		integer	M,
		integer	N,
		real, dimension( lda, * )	A,
		integer	LDA,
		real, dimension( n )	SVA,
		integer	MV,
		real, dimension( ldv, * )	V,
		integer	LDV,
		real, dimension( lwork )	WORK,
		integer	LWORK,
		integer	INFO
	)

SGESVJ


Purpose:

 SGESVJ computes the singular value decomposition (SVD) of a real
 M-by-N matrix A, where M >= N. The SVD of A is written as
                                    [++]   [xx]   [x0]   [xx]
              A = U * SIGMA * V^t,  [++] = [xx] * [ox] * [xx]
                                    [++]   [xx]
 where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
 matrix, and V is an N-by-N orthogonal matrix. The diagonal elements
 of SIGMA are the singular values of A. The columns of U and V are the
 left and the right singular vectors of A, respectively.
 SGESVJ can sometimes compute tiny singular values and their singular vectors much
 more accurately than other SVD routines, see below under Further Details.
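
The decomposition described above can be illustrated with a minimal one-sided (Hestenes) Jacobi sketch in pure Python. This is not the SGESVJ implementation: it omits the scaling, de Rijk pivoting, fast scaled rotations and blocking used by the routine, and it assumes A has full column rank. The function name `one_sided_jacobi_svd` and the parameters `max_sweeps` and `tol` are hypothetical names chosen for this example.

```python
import math

def one_sided_jacobi_svd(a, max_sweeps=30, tol=1e-12):
    """Return (u, sigma, v, sweeps) with a ~= u * diag(sigma) * v^T.

    `a` is a list of m rows with n entries each, m >= n, assumed to have
    full column rank. Illustrative only: no scaling and no pivoting.
    """
    m, n = len(a), len(a[0])
    g = [row[:] for row in a]  # working copy; its columns converge to U * SIGMA
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    sweeps = 0
    for sweeps in range(1, max_sweeps + 1):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                app = sum(g[i][p] * g[i][p] for i in range(m))
                aqq = sum(g[i][q] * g[i][q] for i in range(m))
                apq = sum(g[i][p] * g[i][q] for i in range(m))
                # Scaled test: skip pairs whose cosine is already below tol.
                if abs(apq) <= tol * math.sqrt(app * aqq):
                    continue
                rotated = True
                # Plane rotation that makes columns p and q orthogonal.
                zeta = (aqq - app) / (2.0 * apq)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same rotation to the columns of G and of V.
                for mat in (g, v):
                    for row in mat:
                        xp, xq = row[p], row[q]
                        row[p] = c * xp - s * xq
                        row[q] = s * xp + c * xq
        if not rotated:
            break
    # Column norms of G are the singular values; normalized columns form U.
    sigma = [math.sqrt(sum(g[i][j] ** 2 for i in range(m))) for j in range(n)]
    u = [[g[i][j] / sigma[j] for j in range(n)] for i in range(m)]
    return u, sigma, v, sweeps
```

For a small matrix such as `[[3.0, 0.0], [4.0, 5.0], [0.0, 1.0]]`, calling `one_sided_jacobi_svd` returns factors whose product reproduces the input to rounding error. Unlike SGESVJ, the sketch does not order the singular values.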

Parameters

[in]	JOBA	JOBA is CHARACTER*1 Specifies the structure of A. = 'L': The input matrix A is lower triangular; = 'U': The input matrix A is upper triangular; = 'G': The input matrix A is a general M-by-N matrix, M >= N.
[in]	JOBU	JOBU is CHARACTER*1 Specifies whether to compute the left singular vectors (columns of U): = 'U': The left singular vectors corresponding to the nonzero singular values are computed and returned in the leading columns of A. See more details in the description of A. The default numerical orthogonality threshold is set to approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E'). = 'C': Analogous to JOBU='U', except that the user can control the level of numerical orthogonality of the computed left singular vectors. TOL can be set to TOL = CTOL*EPS, where CTOL is given on input in the array WORK. No CTOL smaller than ONE is allowed. CTOL greater than 1 / EPS is meaningless. The option 'C' can be used if M*EPS is a satisfactory level of orthogonality of the computed left singular vectors, so CTOL=M could save a few sweeps of Jacobi rotations. See the descriptions of A and WORK(1). = 'N': The matrix U is not computed. However, see the description of A.
[in]	JOBV	JOBV is CHARACTER*1 Specifies whether to compute the right singular vectors, that is, the matrix V: = 'V' : the matrix V is computed and returned in the array V = 'A' : the Jacobi rotations are applied to the MV-by-N array V. In other words, the right singular vector matrix V is not computed explicitly; instead it is applied to an MV-by-N matrix initially stored in the first MV rows of V. = 'N' : the matrix V is not computed and the array V is not referenced
[in]	M	M is INTEGER The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0.
[in]	N	N is INTEGER The number of columns of the input matrix A. M >= N >= 0.
[in,out]	A	A is REAL array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, If JOBU .EQ. 'U' .OR. JOBU .EQ. 'C': If INFO .EQ. 0 : RANKA orthonormal columns of U are returned in the leading RANKA columns of the array A. Here RANKA <= N is the number of computed singular values of A that are above the underflow threshold SLAMCH('S'). The singular vectors corresponding to underflowed or zero singular values are not computed. The value of RANKA is returned in the array WORK as RANKA=NINT(WORK(2)). Also see the descriptions of SVA and WORK. The computed columns of U are mutually numerically orthogonal up to approximately TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU.EQ.'C'), see the description of JOBU. If INFO .GT. 0, the procedure SGESVJ did not converge in the given number of iterations (sweeps). In that case, the computed columns of U may not be orthogonal up to TOL. The output U (stored in A), SIGMA (given by the computed singular values in SVA(1:N)) and V are still a decomposition of the input matrix A in the sense that the residual ||A-SCALE*U*SIGMA*V^T||_2 / ||A||_2 is small. If JOBU .EQ. 'N': If INFO .EQ. 0 : Note that the left singular vectors are 'for free' in the one-sided Jacobi SVD algorithm. However, if only the singular values are needed, the level of numerical orthogonality of U is not an issue and iterations are stopped when the columns of the iterated matrix are numerically orthogonal up to approximately M*EPS. Thus, on exit, A contains the columns of U scaled with the corresponding singular values. If INFO .GT. 0 : the procedure SGESVJ did not converge in the given number of iterations (sweeps).
[in]	LDA	LDA is INTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]	SVA	SVA is REAL array, dimension (N) On exit, If INFO .EQ. 0 : depending on the value SCALE = WORK(1), we have: If SCALE .EQ. ONE: SVA(1:N) contains the computed singular values of A. During the computation SVA contains the Euclidean column norms of the iterated matrices in the array A. If SCALE .NE. ONE: The singular values of A are SCALE*SVA(1:N), and this factored representation is due to the fact that some of the singular values of A might underflow or overflow. If INFO .GT. 0 : the procedure SGESVJ did not converge in the given number of iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
[in]	MV	MV is INTEGER If JOBV .EQ. 'A', then the product of Jacobi rotations in SGESVJ is applied to the first MV rows of V. See the description of JOBV.
[in,out]	V	V is REAL array, dimension (LDV,N) If JOBV = 'V', then V contains on exit the N-by-N matrix of the right singular vectors; If JOBV = 'A', then V contains the product of the computed right singular vector matrix and the initial matrix in the array V. If JOBV = 'N', then V is not referenced.
[in]	LDV	LDV is INTEGER The leading dimension of the array V, LDV .GE. 1. If JOBV .EQ. 'V', then LDV .GE. max(1,N). If JOBV .EQ. 'A', then LDV .GE. max(1,MV) .
[in,out]	WORK	WORK is REAL array, dimension max(6,M+N). On entry, If JOBU .EQ. 'C' : WORK(1) = CTOL, where CTOL defines the threshold for convergence. The process stops if all columns of A are mutually orthogonal up to CTOL*EPS, EPS=SLAMCH('E'). It is required that CTOL >= ONE, i.e. it is not allowed to force the routine to obtain orthogonality below EPSILON. On exit, WORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N) are the computed singular values of A. (See description of SVA().) WORK(2) = NINT(WORK(2)) is the number of the computed nonzero singular values. WORK(3) = NINT(WORK(3)) is the number of the computed singular values that are larger than the underflow threshold. WORK(4) = NINT(WORK(4)) is the number of sweeps of Jacobi rotations needed for numerical convergence. WORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep. This is useful information in cases when SGESVJ did not converge, as it can be used to estimate whether the output is still useful and for post festum analysis. WORK(6) = the largest absolute value over all sines of the Jacobi rotation angles in the last sweep. It can be useful for a post festum analysis.
[in]	LWORK	LWORK is INTEGER The length of the array WORK. LWORK >= MAX(6,M+N).
[out]	INFO	INFO is INTEGER = 0 : successful exit. < 0 : if INFO = -i, then the i-th argument had an illegal value > 0 : SGESVJ did not converge in the maximal allowed number (30) of sweeps. The output may still be useful. See the description of WORK.
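
The output conventions in the table above (the factored representation SCALE*SVA and the counters stored in WORK) can be summarized in a short sketch. It assumes the caller has already copied SVA(1:N), WORK(1:6) and INFO into Python lists and an integer. The helpers `min_lwork` and `summarize_sgesvj_output` and the dictionary keys are hypothetical names for this example, not part of LAPACK.

```python
def min_lwork(m, n):
    """Smallest LWORK accepted by the argument check: max(6, M+N)."""
    return max(6, m + n)

def summarize_sgesvj_output(sva, work, info):
    """Interpret SVA and WORK(1:6) as described in the parameter table.

    `work[0]` corresponds to WORK(1), `work[5]` to WORK(6).
    """
    if info < 0:
        # Argument number -info was illegal; the outputs are not defined.
        raise ValueError("argument %d had an illegal value" % -info)
    scale = work[0]
    return {
        "converged": info == 0,                          # INFO > 0: sweep limit reached
        "singular_values": [scale * s for s in sva],     # SCALE * SVA(1:N)
        "nonzero_count": round(work[1]),                 # NINT(WORK(2))
        "above_underflow_count": round(work[2]),         # NINT(WORK(3))
        "sweeps": round(work[3]),                        # NINT(WORK(4))
        "max_abs_cosine": work[4],                       # WORK(5)
        "max_abs_sine": work[5],                         # WORK(6)
    }
```

When `converged` is false, `max_abs_cosine` indicates how far the final columns are from orthogonality, which helps in judging whether the output is still usable.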

Author: Univ. of Tennessee; Univ. of California Berkeley; Univ. of Colorado Denver; NAG Ltd.

Date: November 2015

Further Details: The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane rotations. The rotations are implemented as fast scaled rotations of Anda and Park [1]. In the case of underflow of the Jacobi angle, a modified Jacobi transformation of Drmac [4] is used. The pivot strategy uses the column interchanges of de Rijk [2]. The relative accuracy of the computed singular values and the accuracy of the computed singular vectors (in angle metric) is as guaranteed by the theory of Demmel and Veselic [3]. The condition number that determines the accuracy in the full rank case is essentially min_{D=diag} kappa(A*D), where kappa(.) is the spectral condition number. The best performance of this Jacobi SVD procedure is achieved if used in an accelerated version of Drmac and Veselic [5,6], and it is the kernel routine in the SIGMA library [7]. Some tuning parameters (marked with [TP]) are available for the implementer.
The computational range for the nonzero singular values is the machine number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even denormalized singular values can be computed with the corresponding gradual loss of accurate digits.
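
The role of min_{D=diag} kappa(A*D) can be seen on a 2-by-2 example: a matrix whose columns differ greatly in size has a large spectral condition number, yet rescaling the columns to unit norm (one standard, nearly optimal choice of D) gives a well-conditioned matrix. The sketch below assumes a nonsingular 2-by-2 input and uses the closed-form singular values; `kappa_2x2` and `equilibrate_columns` are hypothetical helper names for this example.

```python
import math

def kappa_2x2(a):
    """Spectral condition number of a nonsingular 2-by-2 matrix."""
    (p, q), (r, s) = a
    frob2 = p * p + q * q + r * r + s * s      # sigma_max^2 + sigma_min^2
    det = abs(p * s - q * r)                   # sigma_max * sigma_min
    disc = math.sqrt(max(0.0, frob2 * frob2 - 4.0 * det * det))
    sigma_max = math.sqrt(0.5 * (frob2 + disc))
    sigma_min = det / sigma_max                # avoids cancellation in frob2 - disc
    return sigma_max / sigma_min

def equilibrate_columns(a):
    """Return A*D with D chosen so that every column has unit 2-norm."""
    norms = [math.hypot(a[0][j], a[1][j]) for j in range(2)]
    return [[a[i][j] / norms[j] for j in range(2)] for i in range(2)]

a = [[1.0, 1.0e-6], [1.0, 2.0e-6]]
kappa_raw = kappa_2x2(a)                           # dominated by the column scaling
kappa_scaled = kappa_2x2(equilibrate_columns(a))   # scaling removed
```

Here `kappa_raw` is on the order of 10^6 while `kappa_scaled` is below 10. This gap is the situation in which the theory cited above predicts that a Jacobi-type method can resolve the small singular value to higher relative accuracy.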

Contributors: Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany)

References: [1] A. A. Anda and H. Park: Fast plane rotations with dynamic scaling.
SIAM J. Matrix Anal. Appl., Vol. 15 (1994), pp. 162-174.

[2] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the singular value decomposition on a vector computer.
SIAM J. Sci. Stat. Comp., Vol. 10 (1989), pp. 359-371.

[3] J. Demmel and K. Veselic: Jacobi method is more accurate than QR.
SIAM J. Matrix Anal. Appl., Vol. 13 (1992), pp. 1204-1245.

[4] Z. Drmac: Implementation of Jacobi rotations for accurate singular value computation in floating point arithmetic.
SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222.

[5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
SIAM J. Matrix Anal. Appl., Vol. 29, No. 4 (2008), pp. 1322-1342.
LAPACK Working note 169.

[6] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
SIAM J. Matrix Anal. Appl., Vol. 29, No. 4 (2008), pp. 1343-1362.
LAPACK Working note 170.

[7] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV, QSVD, (H,K)-SVD computations.
Department of Mathematics, University of Zagreb, 2008.

Bugs, Examples and Comments: Please report all bugs and send interesting test examples and comments to drmac@math.hr. Thank you.

Definition at line 325 of file sgesvj.f.

 *
 *  -- LAPACK computational routine (version 3.6.0) --
 *  -- LAPACK is a software package provided by Univ. of Tennessee,    --
 *  -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
 *     November 2015
 *
 *     .. Scalar Arguments ..
       INTEGER            info, lda, ldv, lwork, m, mv, n
       CHARACTER*1        joba, jobu, jobv
 *     ..
 *     .. Array Arguments ..
       REAL               a( lda, * ), sva( n ), v( ldv, * ),
      $                   work( lwork )
 *     ..
 *
 *  =====================================================================
 *
 *     .. Local Parameters ..
       REAL               zero, half, one
       parameter                ( zero = 0.0e0, half = 0.5e0, one = 1.0e0)
       INTEGER            nsweep
       parameter                ( nsweep = 30 )
 *     ..
 *     .. Local Scalars ..
       REAL               aapp, aapp0, aapq, aaqq, apoaq, aqoap, big,
      $                   bigtheta, cs, ctol, epsln, large, mxaapq,
      $                   mxsinj, rootbig, rooteps, rootsfmin, roottol,
      $                   skl, sfmin, small, sn, t, temp1, theta,
      $                   thsign, tol
       INTEGER            blskip, emptsw, i, ibr, ierr, igl, ijblsk, ir1,
      $                   iswrot, jbc, jgl, kbl, lkahead, mvl, n2, n34,
      $                   n4, nbl, notrot, p, pskipped, q, rowskip,
      $                   swband
       LOGICAL            applv, goscale, lower, lsvec, noscale, rotok,
      $                   rsvec, uctol, upper
 *     ..
 *     .. Local Arrays ..
       REAL               fastr( 5 )
 *     ..
 *     .. Intrinsic Functions ..
       INTRINSIC          abs, max, min, float, sign, sqrt
 *     ..
 *     .. External Functions ..
 *     ..
 *     from BLAS
       REAL               sdot, snrm2
       EXTERNAL           sdot, snrm2
       INTEGER            isamax
       EXTERNAL           isamax
 *     from LAPACK
       REAL               slamch
       EXTERNAL           slamch
       LOGICAL            lsame
       EXTERNAL           lsame
 *     ..
 *     .. External Subroutines ..
 *     ..
 *     from BLAS
       EXTERNAL           saxpy, scopy, srotm, sscal, sswap
 *     from LAPACK
       EXTERNAL           slascl, slaset, slassq, xerbla
 *
       EXTERNAL           sgsvj0, sgsvj1
 *     ..
 *     .. Executable Statements ..
 *
 *     Test the input arguments
 *
       lsvec = lsame( jobu, 'U' )
       uctol = lsame( jobu, 'C' )
       rsvec = lsame( jobv, 'V' )
       applv = lsame( jobv, 'A' )
       upper = lsame( joba, 'U' )
       lower = lsame( joba, 'L' )
 *
       IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
          info = -1
       ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
          info = -2
       ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
          info = -3
       ELSE IF( m.LT.0 ) THEN
          info = -4
       ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
          info = -5
       ELSE IF( lda.LT.m ) THEN
          info = -7
       ELSE IF( mv.LT.0 ) THEN
          info = -9
       ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
      $         ( applv .AND. ( ldv.LT.mv ) ) ) THEN
          info = -11
       ELSE IF( uctol .AND. ( work( 1 ).LE.one ) ) THEN
          info = -12
       ELSE IF( lwork.LT.max( m+n, 6 ) ) THEN
          info = -13
       ELSE
          info = 0
       END IF
 *
 *     #:(
       IF( info.NE.0 ) THEN
          CALL xerbla( 'SGESVJ', -info )
          RETURN
       END IF
 *
 * #:) Quick return for void matrix
 *
       IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
 *
 *     Set numerical parameters
 *     The stopping criterion for Jacobi rotations is
 *
 *     max_{i<>j}|A(:,i)^T * A(:,j)|/(||A(:,i)||*||A(:,j)||) < CTOL*EPS
 *
 *     where EPS is the round-off and CTOL is defined as follows:
 *
       IF( uctol ) THEN
 *        ... user controlled
          ctol = work( 1 )
       ELSE
 *        ... default
          IF( lsvec .OR. rsvec .OR. applv ) THEN
             ctol = sqrt( float( m ) )
          ELSE
             ctol = float( m )
          END IF
       END IF
 *     ... and the machine dependent parameters are
 *[!]  (Make sure that SLAMCH() works properly on the target machine.)
 *
       epsln = slamch( 'Epsilon' )
       rooteps = sqrt( epsln )
       sfmin = slamch( 'SafeMinimum' )
       rootsfmin = sqrt( sfmin )
       small = sfmin / epsln
       big = slamch( 'Overflow' )
 *     BIG         = ONE    / SFMIN
       rootbig = one / rootsfmin
       large = big / sqrt( float( m*n ) )
       bigtheta = one / rooteps
 *
       tol = ctol*epsln
       roottol = sqrt( tol )
 *
       IF( float( m )*epsln.GE.one ) THEN
          info = -4
          CALL xerbla( 'SGESVJ', -info )
          RETURN
       END IF
 *
 *     Initialize the right singular vector matrix.
 *
       IF( rsvec ) THEN
          mvl = n
          CALL slaset( 'A', mvl, n, zero, one, v, ldv )
       ELSE IF( applv ) THEN
          mvl = mv
       END IF
       rsvec = rsvec .OR. applv
 *
 *     Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
 *(!)  If necessary, scale A to protect the largest singular value
 *     from overflow. It is possible that saving the largest singular
 *     value destroys the information about the small ones.
 *     This initial scaling is almost minimal in the sense that the
 *     goal is to make sure that no column norm overflows, and that
 *     SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
 *     in A are detected, the procedure returns with INFO=-6.
 *
       skl = one / sqrt( float( m )*float( n ) )
       noscale = .true.
       goscale = .true.
 *
       IF( lower ) THEN
 *        the input matrix is M-by-N lower triangular (trapezoidal)
          DO 1874 p = 1, n
             aapp = zero
             aaqq = one
             CALL slassq( m-p+1, a( p, p ), 1, aapp, aaqq )
             IF( aapp.GT.big ) THEN
                info = -6
                CALL xerbla( 'SGESVJ', -info )
                RETURN
             END IF
             aaqq = sqrt( aaqq )
             IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
                sva( p ) = aapp*aaqq
             ELSE
                noscale = .false.
                sva( p ) = aapp*( aaqq*skl )
                IF( goscale ) THEN
                   goscale = .false.
                   DO 1873 q = 1, p - 1
                      sva( q ) = sva( q )*skl
  1873             CONTINUE
                END IF
             END IF
  1874    CONTINUE
       ELSE IF( upper ) THEN
 *        the input matrix is M-by-N upper triangular (trapezoidal)
          DO 2874 p = 1, n
             aapp = zero
             aaqq = one
             CALL slassq( p, a( 1, p ), 1, aapp, aaqq )
             IF( aapp.GT.big ) THEN
                info = -6
                CALL xerbla( 'SGESVJ', -info )
                RETURN
             END IF
             aaqq = sqrt( aaqq )
             IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
                sva( p ) = aapp*aaqq
             ELSE
                noscale = .false.
                sva( p ) = aapp*( aaqq*skl )
                IF( goscale ) THEN
                   goscale = .false.
                   DO 2873 q = 1, p - 1
                      sva( q ) = sva( q )*skl
  2873             CONTINUE
                END IF
             END IF
  2874    CONTINUE
       ELSE
 *        the input matrix is M-by-N general dense
          DO 3874 p = 1, n
             aapp = zero
             aaqq = one
             CALL slassq( m, a( 1, p ), 1, aapp, aaqq )
             IF( aapp.GT.big ) THEN
                info = -6
                CALL xerbla( 'SGESVJ', -info )
                RETURN
             END IF
             aaqq = sqrt( aaqq )
             IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
                sva( p ) = aapp*aaqq
             ELSE
                noscale = .false.
                sva( p ) = aapp*( aaqq*skl )
                IF( goscale ) THEN
                   goscale = .false.
                   DO 3873 q = 1, p - 1
                      sva( q ) = sva( q )*skl
  3873             CONTINUE
                END IF
             END IF
  3874    CONTINUE
       END IF
 *
       IF( noscale )skl = one
 *
 *     Move the smaller part of the spectrum from the underflow threshold
 *(!)  Start by determining the position of the nonzero entries of the
 *     array SVA() relative to ( SFMIN, BIG ).
 *
       aapp = zero
       aaqq = big
       DO 4781 p = 1, n
          IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
          aapp = max( aapp, sva( p ) )
  4781 CONTINUE
 *
 * #:) Quick return for zero matrix
 *
       IF( aapp.EQ.zero ) THEN
          IF( lsvec )CALL slaset( 'G', m, n, zero, one, a, lda )
          work( 1 ) = one
          work( 2 ) = zero
          work( 3 ) = zero
          work( 4 ) = zero
          work( 5 ) = zero
          work( 6 ) = zero
          RETURN
       END IF
 *
 * #:) Quick return for one-column matrix
 *
       IF( n.EQ.1 ) THEN
          IF( lsvec )CALL slascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
      $                           a( 1, 1 ), lda, ierr )
          work( 1 ) = one / skl
          IF( sva( 1 ).GE.sfmin ) THEN
             work( 2 ) = one
          ELSE
             work( 2 ) = zero
          END IF
          work( 3 ) = zero
          work( 4 ) = zero
          work( 5 ) = zero
          work( 6 ) = zero
          RETURN
       END IF
 *
 *     Protect small singular values from underflow, and try to
 *     avoid underflows/overflows in computing Jacobi rotations.
 *
       sn = sqrt( sfmin / epsln )
       temp1 = sqrt( big / float( n ) )
       IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
      $    ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
          temp1 = min( big, temp1 / aapp )
 *         AAQQ  = AAQQ*TEMP1
 *         AAPP  = AAPP*TEMP1
       ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
          temp1 = min( sn / aaqq, big / ( aapp*sqrt( float( n ) ) ) )
 *         AAQQ  = AAQQ*TEMP1
 *         AAPP  = AAPP*TEMP1
       ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
          temp1 = max( sn / aaqq, temp1 / aapp )
 *         AAQQ  = AAQQ*TEMP1
 *         AAPP  = AAPP*TEMP1
       ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
          temp1 = min( sn / aaqq, big / ( sqrt( float( n ) )*aapp ) )
 *         AAQQ  = AAQQ*TEMP1
 *         AAPP  = AAPP*TEMP1
       ELSE
          temp1 = one
       END IF
 *
 *     Scale, if necessary
 *
       IF( temp1.NE.one ) THEN
          CALL slascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
       END IF
       skl = temp1*skl
       IF( skl.NE.one ) THEN
          CALL slascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
          skl = one / skl
       END IF
 *
 *     Row-cyclic Jacobi SVD algorithm with column pivoting
 *
       emptsw = ( n*( n-1 ) ) / 2
       notrot = 0
       fastr( 1 ) = zero
 *
 *     A is represented in factored form A = A * diag(WORK), where diag(WORK)
 *     is initialized to identity. WORK is updated during fast scaled
 *     rotations.
 *
       DO 1868 q = 1, n
          work( q ) = one
  1868 CONTINUE
 *
 *
       swband = 3
 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
 *     if SGESVJ is used as a computational routine in the preconditioned
 *     Jacobi SVD algorithm SGEJSV. For sweeps i=1:SWBAND the procedure
 *     works on pivots inside a band-like region around the diagonal.
 *     The boundaries are determined dynamically, based on the number of
 *     pivots above a threshold.
 *
       kbl = min( 8, n )
 *[TP] KBL is a tuning parameter that defines the tile size in the
 *     tiling of the p-q loops of pivot pairs. In general, an optimal
 *     value of KBL depends on the matrix dimensions and on the
 *     parameters of the computer's memory.
 *
       nbl = n / kbl
       IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
 *
       blskip = kbl**2
 *[TP] BLSKIP is a tuning parameter that depends on SWBAND and KBL.
 *
       rowskip = min( 5, kbl )
 *[TP] ROWSKIP is a tuning parameter.
 *
       lkahead = 1
 *[TP] LKAHEAD is a tuning parameter.
 *
 *     Quasi block transformations, using the lower (upper) triangular
 *     structure of the input matrix. The quasi-block-cycling usually
 *     invokes cubic convergence. A large part of this cycle is done
 *     inside canonical subspaces of dimensions less than M.
 *
       IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
 *[TP] The number of partition levels and the actual partition are
 *     tuning parameters.
          n4 = n / 4
          n2 = n / 2
          n34 = 3*n4
          IF( applv ) THEN
             q = 0
          ELSE
             q = 1
          END IF
 *
          IF( lower ) THEN
 *
 *     This works very well on lower triangular matrices, in particular
 *     in the framework of the preconditioned Jacobi SVD (xGEJSV).
 *     The idea is simple:
 *     [+ 0 0 0]   Note that Jacobi transformations of [0 0]
 *     [+ + 0 0]                                       [0 0]
 *     [+ + x 0]   actually work on [x 0]              [x 0]
 *     [+ + x x]                    [x x].             [x x]
 *
             CALL sgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
      $                   work( n34+1 ), sva( n34+1 ), mvl,
      $                   v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
      $                   2, work( n+1 ), lwork-n, ierr )
 *
             CALL sgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
      $                   work( n2+1 ), sva( n2+1 ), mvl,
      $                   v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
      $                   work( n+1 ), lwork-n, ierr )
 *
             CALL sgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
      $                   work( n2+1 ), sva( n2+1 ), mvl,
      $                   v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
      $                   work( n+1 ), lwork-n, ierr )
 *
             CALL sgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
      $                   work( n4+1 ), sva( n4+1 ), mvl,
      $                   v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
      $                   work( n+1 ), lwork-n, ierr )
 *
             CALL sgsvj0( jobv, m, n4, a, lda, work, sva, mvl, v, ldv,
      $                   epsln, sfmin, tol, 1, work( n+1 ), lwork-n,
      $                   ierr )
 *
             CALL sgsvj1( jobv, m, n2, n4, a, lda, work, sva, mvl, v,
      $                   ldv, epsln, sfmin, tol, 1, work( n+1 ),
      $                   lwork-n, ierr )
 *
 *
          ELSE IF( upper ) THEN
 *
 *
             CALL sgsvj0( jobv, n4, n4, a, lda, work, sva, mvl, v, ldv,
      $                   epsln, sfmin, tol, 2, work( n+1 ), lwork-n,
      $                   ierr )
 *
             CALL sgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, work( n4+1 ),
      $                   sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
      $                   epsln, sfmin, tol, 1, work( n+1 ), lwork-n,
      $                   ierr )
 *
             CALL sgsvj1( jobv, n2, n2, n4, a, lda, work, sva, mvl, v,
      $                   ldv, epsln, sfmin, tol, 1, work( n+1 ),
      $                   lwork-n, ierr )
 *
             CALL sgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
      $                   work( n2+1 ), sva( n2+1 ), mvl,
      $                   v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
      $                   work( n+1 ), lwork-n, ierr )
 
          END IF
 *
       END IF
 *
 *     .. Row-cyclic pivot strategy with de Rijk's pivoting ..
 *
       DO 1993 i = 1, nsweep
 *
 *     .. go go go ...
 *
          mxaapq = zero
          mxsinj = zero
          iswrot = 0
 *
          notrot = 0
          pskipped = 0
 *
 *     Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
 *     1 <= p < q <= N. This is the first step toward a blocked implementation
 *     of the rotations. New implementation, based on block transformations,
 *     is under development.
 *
          DO 2000 ibr = 1, nbl
 *
             igl = ( ibr-1 )*kbl + 1
 *
             DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
 *
                igl = igl + ir1*kbl
 *
                DO 2001 p = igl, min( igl+kbl-1, n-1 )
 *
 *     .. de Rijk's pivoting
 *
                   q = isamax( n-p+1, sva( p ), 1 ) + p - 1
                   IF( p.NE.q ) THEN
                      CALL sswap( m, a( 1, p ), 1, a( 1, q ), 1 )
                      IF( rsvec )CALL sswap( mvl, v( 1, p ), 1,
      $                                      v( 1, q ), 1 )
                      temp1 = sva( p )
                      sva( p ) = sva( q )
                      sva( q ) = temp1
                      temp1 = work( p )
                      work( p ) = work( q )
                      work( q ) = temp1
                   END IF
 *
                   IF( ir1.EQ.0 ) THEN
 *
 *        Column norms are periodically updated by explicit
 *        norm computation.
 *        Caveat:
 *        Unfortunately, some BLAS implementations compute SNRM2(M,A(1,p),1)
 *        as SQRT(SDOT(M,A(1,p),1,A(1,p),1)), which may cause the result to
 *        overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
 *        underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
 *        Hence, SNRM2 cannot be trusted, not even in the case when
 *        the true norm is far from the under(over)flow boundaries.
 *        If properly implemented SNRM2 is available, the IF-THEN-ELSE
 *        below should read "AAPP = SNRM2( M, A(1,p), 1 ) * WORK(p)".
 *
                      IF( ( sva( p ).LT.rootbig ) .AND.
      $                   ( sva( p ).GT.rootsfmin ) ) THEN
                         sva( p ) = snrm2( m, a( 1, p ), 1 )*work( p )
                      ELSE
                         temp1 = zero
                         aapp = one
                         CALL slassq( m, a( 1, p ), 1, temp1, aapp )
                         sva( p ) = temp1*sqrt( aapp )*work( p )
                      END IF
                      aapp = sva( p )
                   ELSE
                      aapp = sva( p )
                   END IF
 *
                   IF( aapp.GT.zero ) THEN
 *
                      pskipped = 0
 *
                      DO 2002 q = p + 1, min( igl+kbl-1, n )
 *
                         aaqq = sva( q )
 *
                         IF( aaqq.GT.zero ) THEN
 *
                            aapp0 = aapp
                            IF( aaqq.GE.one ) THEN
                               rotok = ( small*aapp ).LE.aaqq
                               IF( aapp.LT.( big / aaqq ) ) THEN
                                  aapq = ( sdot( m, a( 1, p ), 1, a( 1,
      $                                  q ), 1 )*work( p )*work( q ) /
      $                                  aaqq ) / aapp
                               ELSE
                                  CALL scopy( m, a( 1, p ), 1,
      $                                       work( n+1 ), 1 )
                                  CALL slascl( 'G', 0, 0, aapp,
      $                                        work( p ), m, 1,
      $                                        work( n+1 ), lda, ierr )
                                  aapq = sdot( m, work( n+1 ), 1,
      $                                  a( 1, q ), 1 )*work( q ) / aaqq
                               END IF
                            ELSE
                               rotok = aapp.LE.( aaqq / small )
                               IF( aapp.GT.( small / aaqq ) ) THEN
                                  aapq = ( sdot( m, a( 1, p ), 1, a( 1,
      $                                  q ), 1 )*work( p )*work( q ) /
      $                                  aaqq ) / aapp
                               ELSE
                                  CALL scopy( m, a( 1, q ), 1,
      $                                       work( n+1 ), 1 )
                                  CALL slascl( 'G', 0, 0, aaqq,
      $                                        work( q ), m, 1,
      $                                        work( n+1 ), lda, ierr )
                                  aapq = sdot( m, work( n+1 ), 1,
      $                                  a( 1, p ), 1 )*work( p ) / aapp
                               END IF
                            END IF
 *
                            mxaapq = max( mxaapq, abs( aapq ) )
 *
 *        TO rotate or NOT to rotate, THAT is the question ...
 *
                            IF( abs( aapq ).GT.tol ) THEN
 *
 *           .. rotate
 *[RTD]      ROTATED = ROTATED + ONE
 *
                               IF( ir1.EQ.0 ) THEN
                                  notrot = 0
                                  pskipped = 0
                                  iswrot = iswrot + 1
                               END IF
 *
                               IF( rotok ) THEN
 *
                                  aqoap = aaqq / aapp
                                  apoaq = aapp / aaqq
                                  theta = -half*abs( aqoap-apoaq ) / aapq
 *
                                  IF( abs( theta ).GT.bigtheta ) THEN
 *
                                     t = half / theta
                                     fastr( 3 ) = t*work( p ) / work( q )
                                     fastr( 4 ) = -t*work( q ) /
      $                                           work( p )
                                     CALL srotm( m, a( 1, p ), 1,
      $                                          a( 1, q ), 1, fastr )
                                     IF( rsvec )CALL srotm( mvl,
      $                                              v( 1, p ), 1,
      $                                              v( 1, q ), 1,
      $                                              fastr )
                                     sva( q ) = aaqq*sqrt( max( zero,
      $                                         one+t*apoaq*aapq ) )
                                     aapp = aapp*sqrt( max( zero, 
      $                                         one-t*aqoap*aapq ) )
                                     mxsinj = max( mxsinj, abs( t ) )
 *
                                  ELSE
 *
 *                 .. choose correct signum for THETA and rotate
 *
                                     thsign = -sign( one, aapq )
                                     t = one / ( theta+thsign*
      $                                  sqrt( one+theta*theta ) )
                                     cs = sqrt( one / ( one+t*t ) )
                                     sn = t*cs
 *
                                     mxsinj = max( mxsinj, abs( sn ) )
                                     sva( q ) = aaqq*sqrt( max( zero,
      $                                         one+t*apoaq*aapq ) )
                                     aapp = aapp*sqrt( max( zero,
      $                                     one-t*aqoap*aapq ) )
 *
                                     apoaq = work( p ) / work( q )
                                     aqoap = work( q ) / work( p )
                                     IF( work( p ).GE.one ) THEN
                                        IF( work( q ).GE.one ) THEN
                                           fastr( 3 ) = t*apoaq
                                           fastr( 4 ) = -t*aqoap
                                           work( p ) = work( p )*cs
                                           work( q ) = work( q )*cs
                                           CALL srotm( m, a( 1, p ), 1,
      $                                                a( 1, q ), 1,
      $                                                fastr )
                                           IF( rsvec )CALL srotm( mvl,
      $                                        v( 1, p ), 1, v( 1, q ),
      $                                        1, fastr )
                                        ELSE
                                           CALL saxpy( m, -t*aqoap,
      $                                                a( 1, q ), 1,
      $                                                a( 1, p ), 1 )
                                           CALL saxpy( m, cs*sn*apoaq,
      $                                                a( 1, p ), 1,
      $                                                a( 1, q ), 1 )
                                           work( p ) = work( p )*cs
                                           work( q ) = work( q ) / cs
                                           IF( rsvec ) THEN
                                              CALL saxpy( mvl, -t*aqoap,
      $                                                   v( 1, q ), 1,
      $                                                   v( 1, p ), 1 )
                                              CALL saxpy( mvl,
      $                                                   cs*sn*apoaq,
      $                                                   v( 1, p ), 1,
      $                                                   v( 1, q ), 1 )
                                           END IF
                                        END IF
                                     ELSE
                                        IF( work( q ).GE.one ) THEN
                                           CALL saxpy( m, t*apoaq,
      $                                                a( 1, p ), 1,
      $                                                a( 1, q ), 1 )
                                           CALL saxpy( m, -cs*sn*aqoap,
      $                                                a( 1, q ), 1,
      $                                                a( 1, p ), 1 )
                                           work( p ) = work( p ) / cs
                                           work( q ) = work( q )*cs
                                           IF( rsvec ) THEN
                                              CALL saxpy( mvl, t*apoaq,
      $                                                   v( 1, p ), 1,
      $                                                   v( 1, q ), 1 )
                                              CALL saxpy( mvl,
      $                                                   -cs*sn*aqoap,
      $                                                   v( 1, q ), 1,
      $                                                   v( 1, p ), 1 )
                                           END IF
                                        ELSE
                                           IF( work( p ).GE.work( q ) )
      $                                        THEN
                                              CALL saxpy( m, -t*aqoap,
      $                                                   a( 1, q ), 1,
      $                                                   a( 1, p ), 1 )
                                              CALL saxpy( m, cs*sn*apoaq,
      $                                                   a( 1, p ), 1,
      $                                                   a( 1, q ), 1 )
                                              work( p ) = work( p )*cs
                                              work( q ) = work( q ) / cs
                                              IF( rsvec ) THEN
                                                 CALL saxpy( mvl,
      $                                               -t*aqoap,
      $                                               v( 1, q ), 1,
      $                                               v( 1, p ), 1 )
                                                 CALL saxpy( mvl,
      $                                               cs*sn*apoaq,
      $                                               v( 1, p ), 1,
      $                                               v( 1, q ), 1 )
                                              END IF
                                           ELSE
                                              CALL saxpy( m, t*apoaq,
      $                                                   a( 1, p ), 1,
      $                                                   a( 1, q ), 1 )
                                              CALL saxpy( m,
      $                                                   -cs*sn*aqoap,
      $                                                   a( 1, q ), 1,
      $                                                   a( 1, p ), 1 )
                                              work( p ) = work( p ) / cs
                                              work( q ) = work( q )*cs
                                              IF( rsvec ) THEN
                                                 CALL saxpy( mvl,
      $                                               t*apoaq, v( 1, p ),
      $                                               1, v( 1, q ), 1 )
                                                 CALL saxpy( mvl,
      $                                               -cs*sn*aqoap,
      $                                               v( 1, q ), 1,
      $                                               v( 1, p ), 1 )
                                              END IF
                                           END IF
                                        END IF
                                     END IF
                                  END IF
 *
                               ELSE
 *              .. have to use modified Gram-Schmidt like transformation
                                  CALL scopy( m, a( 1, p ), 1,
      $                                       work( n+1 ), 1 )
                                  CALL slascl( 'G', 0, 0, aapp, one, m,
      $                                        1, work( n+1 ), lda,
      $                                        ierr )
                                  CALL slascl( 'G', 0, 0, aaqq, one, m,
      $                                        1, a( 1, q ), lda, ierr )
                                  temp1 = -aapq*work( p ) / work( q )
                                  CALL saxpy( m, temp1, work( n+1 ), 1,
      $                                       a( 1, q ), 1 )
                                  CALL slascl( 'G', 0, 0, one, aaqq, m,
      $                                        1, a( 1, q ), lda, ierr )
                                  sva( q ) = aaqq*sqrt( max( zero,
      $                                      one-aapq*aapq ) )
                                  mxsinj = max( mxsinj, sfmin )
                               END IF
 *           END IF ROTOK THEN ... ELSE
 *
 *           In the case of cancellation in updating SVA(q), SVA(p)
 *           recompute SVA(q), SVA(p).
 *
                               IF( ( sva( q ) / aaqq )**2.LE.rooteps )
      $                            THEN
                                  IF( ( aaqq.LT.rootbig ) .AND.
      $                               ( aaqq.GT.rootsfmin ) ) THEN
                                     sva( q ) = snrm2( m, a( 1, q ), 1 )*
      $                                         work( q )
                                  ELSE
                                     t = zero
                                     aaqq = one
                                     CALL slassq( m, a( 1, q ), 1, t,
      $                                           aaqq )
                                     sva( q ) = t*sqrt( aaqq )*work( q )
                                  END IF
                               END IF
                               IF( ( aapp / aapp0 ).LE.rooteps ) THEN
                                  IF( ( aapp.LT.rootbig ) .AND.
      $                               ( aapp.GT.rootsfmin ) ) THEN
                                     aapp = snrm2( m, a( 1, p ), 1 )*
      $                                     work( p )
                                  ELSE
                                     t = zero
                                     aapp = one
                                     CALL slassq( m, a( 1, p ), 1, t,
      $                                           aapp )
                                     aapp = t*sqrt( aapp )*work( p )
                                  END IF
                                  sva( p ) = aapp
                               END IF
 *
                            ELSE
 *        A(:,p) and A(:,q) already numerically orthogonal
                               IF( ir1.EQ.0 )notrot = notrot + 1
 *[RTD]      SKIPPED  = SKIPPED  + 1
                               pskipped = pskipped + 1
                            END IF
                         ELSE
 *        A(:,q) is a zero column
                            IF( ir1.EQ.0 )notrot = notrot + 1
                            pskipped = pskipped + 1
                         END IF
 *
                         IF( ( i.LE.swband ) .AND.
      $                      ( pskipped.GT.rowskip ) ) THEN
                            IF( ir1.EQ.0 )aapp = -aapp
                            notrot = 0
                            GO TO 2103
                         END IF
 *
  2002                CONTINUE
 *     END q-LOOP
 *
  2103                CONTINUE
 *     bailed out of q-loop
 *
                      sva( p ) = aapp
 *
                   ELSE
                      sva( p ) = aapp
                      IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
      $                   notrot = notrot + min( igl+kbl-1, n ) - p
                   END IF
 *
  2001          CONTINUE
 *     end of the p-loop
 *     end of doing the block ( ibr, ibr )
  1002       CONTINUE
 *     end of ir1-loop
 *
 * ... go to the off diagonal blocks
 *
             igl = ( ibr-1 )*kbl + 1
 *
             DO 2010 jbc = ibr + 1, nbl
 *
                jgl = ( jbc-1 )*kbl + 1
 *
 *        doing the block at ( ibr, jbc )
 *
                ijblsk = 0
                DO 2100 p = igl, min( igl+kbl-1, n )
 *
                   aapp = sva( p )
                   IF( aapp.GT.zero ) THEN
 *
                      pskipped = 0
 *
                      DO 2200 q = jgl, min( jgl+kbl-1, n )
 *
                         aaqq = sva( q )
                         IF( aaqq.GT.zero ) THEN
                            aapp0 = aapp
 *
 *     .. M x 2 Jacobi SVD ..
 *
 *        Safe Gram matrix computation
 *
                            IF( aaqq.GE.one ) THEN
                               IF( aapp.GE.aaqq ) THEN
                                  rotok = ( small*aapp ).LE.aaqq
                               ELSE
                                  rotok = ( small*aaqq ).LE.aapp
                               END IF
                               IF( aapp.LT.( big / aaqq ) ) THEN
                                  aapq = ( sdot( m, a( 1, p ), 1, a( 1,
      $                                  q ), 1 )*work( p )*work( q ) /
      $                                  aaqq ) / aapp
                               ELSE
                                  CALL scopy( m, a( 1, p ), 1,
      $                                       work( n+1 ), 1 )
                                  CALL slascl( 'G', 0, 0, aapp,
      $                                        work( p ), m, 1,
      $                                        work( n+1 ), lda, ierr )
                                  aapq = sdot( m, work( n+1 ), 1,
      $                                  a( 1, q ), 1 )*work( q ) / aaqq
                               END IF
                            ELSE
                               IF( aapp.GE.aaqq ) THEN
                                  rotok = aapp.LE.( aaqq / small )
                               ELSE
                                  rotok = aaqq.LE.( aapp / small )
                               END IF
                               IF( aapp.GT.( small / aaqq ) ) THEN
                                  aapq = ( sdot( m, a( 1, p ), 1, a( 1,
      $                                  q ), 1 )*work( p )*work( q ) /
      $                                  aaqq ) / aapp
                               ELSE
                                  CALL scopy( m, a( 1, q ), 1,
      $                                       work( n+1 ), 1 )
                                  CALL slascl( 'G', 0, 0, aaqq,
      $                                        work( q ), m, 1,
      $                                        work( n+1 ), lda, ierr )
                                  aapq = sdot( m, work( n+1 ), 1,
      $                                  a( 1, p ), 1 )*work( p ) / aapp
                               END IF
                            END IF
 *
                            mxaapq = max( mxaapq, abs( aapq ) )
 *
 *        TO rotate or NOT to rotate, THAT is the question ...
 *
                            IF( abs( aapq ).GT.tol ) THEN
                               notrot = 0
 *[RTD]      ROTATED  = ROTATED + 1
                               pskipped = 0
                               iswrot = iswrot + 1
 *
                               IF( rotok ) THEN
 *
                                  aqoap = aaqq / aapp
                                  apoaq = aapp / aaqq
                                  theta = -half*abs( aqoap-apoaq ) / aapq
                                  IF( aaqq.GT.aapp0 )theta = -theta
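 *
 *           A sketch of the algebra, assuming the standard Jacobi
 *           convention: with the 2 x 2 Gram matrix of columns p, q
 *              G = [ aapp**2          aapp*aaqq*aapq ]
 *                  [ aapp*aaqq*aapq   aaqq**2        ]
 *           the rotation angle phi satisfies
 *              cot( 2*phi ) = ( aaqq**2-aapp**2 ) /
 *                             ( 2*aapp*aaqq*aapq )
 *                           = ( aqoap-apoaq ) / ( 2*aapq ).
 *           After the sign correction above, THETA equals this
 *           value whichever of AAPP, AAQQ is larger.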
 *
                                  IF( abs( theta ).GT.bigtheta ) THEN
                                     t = half / theta
                                     fastr( 3 ) = t*work( p ) / work( q )
                                     fastr( 4 ) = -t*work( q ) /
      $                                           work( p )
                                     CALL srotm( m, a( 1, p ), 1,
      $                                          a( 1, q ), 1, fastr )
                                     IF( rsvec )CALL srotm( mvl,
      $                                              v( 1, p ), 1,
      $                                              v( 1, q ), 1,
      $                                              fastr )
                                     sva( q ) = aaqq*sqrt( max( zero,
      $                                         one+t*apoaq*aapq ) )
                                     aapp = aapp*sqrt( max( zero,
      $                                     one-t*aqoap*aapq ) )
                                     mxsinj = max( mxsinj, abs( t ) )
                                  ELSE
 *
 *                 .. choose correct signum for THETA and rotate
 *
                                     thsign = -sign( one, aapq )
                                     IF( aaqq.GT.aapp0 )thsign = -thsign
                                     t = one / ( theta+thsign*
      $                                  sqrt( one+theta*theta ) )
                                     cs = sqrt( one / ( one+t*t ) )
                                     sn = t*cs
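 *
 *           A sketch of the reasoning, assuming THSIGN carries the
 *           sign of THETA: t is then the root of smaller magnitude
 *           of t**2 + 2*theta*t - 1 = 0, i.e. t = tan( phi ) with
 *           abs( phi ) <= pi/4, and cs = cos( phi ), sn = sin( phi ).
 *           For abs( theta ) > bigtheta the branch above uses the
 *           approximation t = 1 / ( 2*theta ) instead.
 *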
                                     mxsinj = max( mxsinj, abs( sn ) )
                                     sva( q ) = aaqq*sqrt( max( zero,
      $                                         one+t*apoaq*aapq ) )
                                     aapp = aapp*sqrt( max( zero,
      $                                         one-t*aqoap*aapq ) )
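 *
 *           A sketch of why these updates hold: with Gram entries
 *           g_pp = aapp**2, g_qq = aaqq**2, g_pq = aapp*aaqq*aapq,
 *           a Jacobi rotation with tangent t changes the diagonal as
 *              g_qq <- g_qq + t*g_pq,   g_pp <- g_pp - t*g_pq,
 *           and dividing by g_qq and g_pp gives the factors
 *           one+t*apoaq*aapq and one-t*aqoap*aapq used above.
 *           max( zero, . ) presumably guards the square root
 *           against a slightly negative rounded argument.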
 *
                                     apoaq = work( p ) / work( q )
                                     aqoap = work( q ) / work( p )
                                     IF( work( p ).GE.one ) THEN
 *
                                        IF( work( q ).GE.one ) THEN
                                           fastr( 3 ) = t*apoaq
                                           fastr( 4 ) = -t*aqoap
                                           work( p ) = work( p )*cs
                                           work( q ) = work( q )*cs
                                           CALL srotm( m, a( 1, p ), 1,
      $                                                a( 1, q ), 1,
      $                                                fastr )
                                           IF( rsvec )CALL srotm( mvl,
      $                                        v( 1, p ), 1, v( 1, q ),
      $                                        1, fastr )
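 *
 *           A sketch of the fast scaled rotation, assuming that
 *           fastr( 1 ) was set to zero before this excerpt (SROTM
 *           form H = [ 1 h12 ; h21 1 ], with fastr( 3 ) = h21 and
 *           fastr( 4 ) = h12). The actual columns are
 *           work( p )*a( :, p ) and work( q )*a( :, q ), so the
 *           rotation x <- cs*( x-t*y ), y <- cs*( y+t*x ) becomes
 *              a( :, p ) <- a( :, p ) - t*aqoap*a( :, q )
 *              a( :, q ) <- a( :, q ) + t*apoaq*a( :, p )
 *           (both right-hand sides use the old columns), with the
 *           factor cs absorbed into work( p ) and work( q ).
 *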
                                        ELSE
                                           CALL saxpy( m, -t*aqoap,
      $                                                a( 1, q ), 1,
      $                                                a( 1, p ), 1 )
                                           CALL saxpy( m, cs*sn*apoaq,
      $                                                a( 1, p ), 1,
      $                                                a( 1, q ), 1 )
                                           IF( rsvec ) THEN
                                              CALL saxpy( mvl, -t*aqoap,
      $                                                   v( 1, q ), 1,
      $                                                   v( 1, p ), 1 )
                                              CALL saxpy( mvl,
      $                                                   cs*sn*apoaq,
      $                                                   v( 1, p ), 1,
      $                                                   v( 1, q ), 1 )
                                           END IF
                                           work( p ) = work( p )*cs
                                           work( q ) = work( q ) / cs
                                        END IF
                                     ELSE
                                        IF( work( q ).GE.one ) THEN
                                           CALL saxpy( m, t*apoaq,
      $                                                a( 1, p ), 1,
      $                                                a( 1, q ), 1 )
                                           CALL saxpy( m, -cs*sn*aqoap,
      $                                                a( 1, q ), 1,
      $                                                a( 1, p ), 1 )
                                           IF( rsvec ) THEN
                                              CALL saxpy( mvl, t*apoaq,
      $                                                   v( 1, p ), 1,
      $                                                   v( 1, q ), 1 )
                                              CALL saxpy( mvl,
      $                                                   -cs*sn*aqoap,
      $                                                   v( 1, q ), 1,
      $                                                   v( 1, p ), 1 )
                                           END IF
                                           work( p ) = work( p ) / cs
                                           work( q ) = work( q )*cs
                                        ELSE
                                           IF( work( p ).GE.work( q ) )
      $                                        THEN
                                              CALL saxpy( m, -t*aqoap,
      $                                                   a( 1, q ), 1,
      $                                                   a( 1, p ), 1 )
                                              CALL saxpy( m, cs*sn*apoaq,
      $                                                   a( 1, p ), 1,
      $                                                   a( 1, q ), 1 )
                                              work( p ) = work( p )*cs
                                              work( q ) = work( q ) / cs
                                              IF( rsvec ) THEN
                                                 CALL saxpy( mvl,
      $                                               -t*aqoap,
      $                                               v( 1, q ), 1,
      $                                               v( 1, p ), 1 )
                                                 CALL saxpy( mvl,
      $                                               cs*sn*apoaq,
      $                                               v( 1, p ), 1,
      $                                               v( 1, q ), 1 )
                                              END IF
                                           ELSE
                                              CALL saxpy( m, t*apoaq,
      $                                                   a( 1, p ), 1,
      $                                                   a( 1, q ), 1 )
                                              CALL saxpy( m,
      $                                                   -cs*sn*aqoap,
      $                                                   a( 1, q ), 1,
      $                                                   a( 1, p ), 1 )
                                              work( p ) = work( p ) / cs
                                              work( q ) = work( q )*cs
                                              IF( rsvec ) THEN
                                                 CALL saxpy( mvl,
      $                                               t*apoaq, v( 1, p ),
      $                                               1, v( 1, q ), 1 )
                                                 CALL saxpy( mvl,
      $                                               -cs*sn*aqoap,
      $                                               v( 1, q ), 1,
      $                                               v( 1, p ), 1 )
                                              END IF
                                           END IF
                                        END IF
                                     END IF
                                  END IF
 *
                               ELSE
                                  IF( aapp.GT.aaqq ) THEN
                                     CALL scopy( m, a( 1, p ), 1,
      $                                          work( n+1 ), 1 )
                                     CALL slascl( 'G', 0, 0, aapp, one,
      $                                           m, 1, work( n+1 ), lda,
      $                                           ierr )
                                     CALL slascl( 'G', 0, 0, aaqq, one,
      $                                           m, 1, a( 1, q ), lda,
      $                                           ierr )
                                     temp1 = -aapq*work( p ) / work( q )
                                     CALL saxpy( m, temp1, work( n+1 ),
      $                                          1, a( 1, q ), 1 )
                                     CALL slascl( 'G', 0, 0, one, aaqq,
      $                                           m, 1, a( 1, q ), lda,
      $                                           ierr )
                                     sva( q ) = aaqq*sqrt( max( zero,
      $                                         one-aapq*aapq ) )
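 *
 *           A sketch of the result, with u_p, u_q denoting unit
 *           vectors along the actual columns work( p )*a( :, p )
 *           and work( q )*a( :, q ): the steps above replace
 *           column q by aaqq*( u_q-aapq*u_p ), which is orthogonal
 *           to u_p and has norm aaqq*sqrt( one-aapq**2 ); column p
 *           is left unchanged.
 *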
                                     mxsinj = max( mxsinj, sfmin )
                                  ELSE
                                     CALL scopy( m, a( 1, q ), 1,
      $                                          work( n+1 ), 1 )
                                     CALL slascl( 'G', 0, 0, aaqq, one,
      $                                           m, 1, work( n+1 ), lda,
      $                                           ierr )
                                     CALL slascl( 'G', 0, 0, aapp, one,
      $                                           m, 1, a( 1, p ), lda,
      $                                           ierr )
                                     temp1 = -aapq*work( q ) / work( p )
                                     CALL saxpy( m, temp1, work( n+1 ),
      $                                          1, a( 1, p ), 1 )
                                     CALL slascl( 'G', 0, 0, one, aapp,
      $                                           m, 1, a( 1, p ), lda,
      $                                           ierr )
                                     sva( p ) = aapp*sqrt( max( zero,
      $                                         one-aapq*aapq ) )
                                     mxsinj = max( mxsinj, sfmin )
                                  END IF
                               END IF
 *           END IF ROTOK THEN ... ELSE
 *
 *           In the case of cancellation in updating SVA(q), SVA(p)
 *           .. recompute SVA(q), SVA(p)
                               IF( ( sva( q ) / aaqq )**2.LE.rooteps )
      $                            THEN
                                  IF( ( aaqq.LT.rootbig ) .AND.
      $                               ( aaqq.GT.rootsfmin ) ) THEN
                                     sva( q ) = snrm2( m, a( 1, q ), 1 )*
      $                                         work( q )
                                  ELSE
                                     t = zero
                                     aaqq = one
                                     CALL slassq( m, a( 1, q ), 1, t,
      $                                           aaqq )
                                     sva( q ) = t*sqrt( aaqq )*work( q )
                                  END IF
                               END IF
                               IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
                                  IF( ( aapp.LT.rootbig ) .AND.
      $                               ( aapp.GT.rootsfmin ) ) THEN
                                     aapp = snrm2( m, a( 1, p ), 1 )*
      $                                     work( p )
                                  ELSE
                                     t = zero
                                     aapp = one
                                     CALL slassq( m, a( 1, p ), 1, t,
      $                                           aapp )
                                     aapp = t*sqrt( aapp )*work( p )
                                  END IF
                                  sva( p ) = aapp
                               END IF
 *              end of OK rotation
                            ELSE
                               notrot = notrot + 1
 *[RTD]      SKIPPED  = SKIPPED  + 1
                               pskipped = pskipped + 1
                               ijblsk = ijblsk + 1
                            END IF
                         ELSE
                            notrot = notrot + 1
                            pskipped = pskipped + 1
                            ijblsk = ijblsk + 1
                         END IF
 *
                         IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
      $                      THEN
                            sva( p ) = aapp
                            notrot = 0
                            GO TO 2011
                         END IF
                         IF( ( i.LE.swband ) .AND.
      $                      ( pskipped.GT.rowskip ) ) THEN
                            aapp = -aapp
                            notrot = 0
                            GO TO 2203
                         END IF
 *
  2200                CONTINUE
 *        end of the q-loop
  2203                CONTINUE
 *
                      sva( p ) = aapp
 *
                   ELSE
 *
                      IF( aapp.EQ.zero )notrot = notrot +
      $                   min( jgl+kbl-1, n ) - jgl + 1
                      IF( aapp.LT.zero )notrot = 0
 *
                   END IF
 *
  2100          CONTINUE
 *     end of the p-loop
  2010       CONTINUE
 *     end of the jbc-loop
  2011       CONTINUE
 *2011 bailed out of the jbc-loop
             DO 2012 p = igl, min( igl+kbl-1, n )
                sva( p ) = abs( sva( p ) )
  2012       CONTINUE
 ***
  2000    CONTINUE
 *2000 :: end of the ibr-loop
 *
 *     .. update SVA(N)
          IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
      $       THEN
             sva( n ) = snrm2( m, a( 1, n ), 1 )*work( n )
          ELSE
             t = zero
             aapp = one
             CALL slassq( m, a( 1, n ), 1, t, aapp )
             sva( n ) = t*sqrt( aapp )*work( n )
          END IF
 *
 *     Additional steering devices
 *
          IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
      $       ( iswrot.LE.n ) ) )swband = i
 *
          IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( float( n ) )*
      $       tol ) .AND. ( float( n )*mxaapq*mxsinj.LT.tol ) ) THEN
             GO TO 1994
          END IF
 *
          IF( notrot.GE.emptsw )GO TO 1994
 *
  1993 CONTINUE
 *     end i=1:NSWEEP loop
 *
 * #:( Reaching this point means that the procedure has not converged.
       info = nsweep - 1
       GO TO 1995
 *
  1994 CONTINUE
 * #:) Reaching this point means numerical convergence after the i-th
 *     sweep.
 *
       info = 0
 * #:) INFO = 0 confirms successful iterations.
  1995 CONTINUE
 *
 *     Sort the singular values and find how many are above
 *     the underflow threshold.
 *
       n2 = 0
       n4 = 0
       DO 5991 p = 1, n - 1
          q = isamax( n-p+1, sva( p ), 1 ) + p - 1
          IF( p.NE.q ) THEN
             temp1 = sva( p )
             sva( p ) = sva( q )
             sva( q ) = temp1
             temp1 = work( p )
             work( p ) = work( q )
             work( q ) = temp1
             CALL sswap( m, a( 1, p ), 1, a( 1, q ), 1 )
             IF( rsvec )CALL sswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
          END IF
          IF( sva( p ).NE.zero ) THEN
             n4 = n4 + 1
             IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
          END IF
  5991 CONTINUE
       IF( sva( n ).NE.zero ) THEN
          n4 = n4 + 1
          IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
       END IF
 *
 *     Normalize the left singular vectors.
 *
       IF( lsvec .OR. uctol ) THEN
          DO 1998 p = 1, n2
             CALL sscal( m, work( p ) / sva( p ), a( 1, p ), 1 )
  1998    CONTINUE
       END IF
 *
 *     Scale the product of Jacobi rotations (assemble the fast rotations).
 *
       IF( rsvec ) THEN
          IF( applv ) THEN
             DO 2398 p = 1, n
                CALL sscal( mvl, work( p ), v( 1, p ), 1 )
  2398       CONTINUE
          ELSE
             DO 2399 p = 1, n
                temp1 = one / snrm2( mvl, v( 1, p ), 1 )
                CALL sscal( mvl, temp1, v( 1, p ), 1 )
  2399       CONTINUE
          END IF
       END IF
 *
 *     Undo scaling, if necessary (and possible).
       IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
      $    .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
      $    ( sfmin / skl ) ) ) ) THEN
          DO 2400 p = 1, n
             sva( p ) = skl*sva( p )
  2400    CONTINUE
          skl = one
       END IF
 *
       work( 1 ) = skl
 *     The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
 *     then some of the singular values may overflow or underflow and
 *     the spectrum is given in this factored representation.
 *
       work( 2 ) = float( n4 )
 *     N4 is the number of computed nonzero singular values of A.
 *
       work( 3 ) = float( n2 )
 *     N2 is the number of singular values of A greater than SFMIN.
 *     If N2<N, SVA(N2+1:N) contains ZEROS and/or denormalized numbers
 *     that may carry some information.
 *
       work( 4 ) = float( i )
 *     i is the index of the last sweep before declaring convergence.
 *
       work( 5 ) = mxaapq
 *     MXAAPQ is the largest absolute value of scaled pivots in the
 *     last sweep.
 *
       work( 6 ) = mxsinj
 *     MXSINJ is the largest absolute value of the sines of Jacobi angles
 *     in the last sweep.
 *
       RETURN
 *     ..
 *     .. END OF SGESVJ
 *     ..
