sgesvj()

subroutine sgesvj	(	character*1	joba,
		character*1	jobu,
		character*1	jobv,
		integer	m,
		integer	n,
		real, dimension( lda, * )	a,
		integer	lda,
		real, dimension( n )	sva,
		integer	mv,
		real, dimension( ldv, * )	v,
		integer	ldv,
		real, dimension( lwork )	work,
		integer	lwork,
		integer	info )

SGESVJ


Purpose:

!>
!> SGESVJ computes the singular value decomposition (SVD) of a real
!> M-by-N matrix A, where M >= N. The SVD of A is written as
!>                                    [++]   [xx]   [x0]   [xx]
!>              A = U * SIGMA * V^t,  [++] = [xx] * [0x] * [xx]
!>                                    [++]   [xx]
!> where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
!> matrix, and V is an N-by-N orthogonal matrix. The diagonal elements
!> of SIGMA are the singular values of A. The columns of U and V are the
!> left and the right singular vectors of A, respectively.
!> SGESVJ can sometimes compute tiny singular values and their singular vectors much
!> more accurately than other SVD routines, see below under Further Details.
!>

Parameters

[in]	JOBA
!> JOBA is CHARACTER*1
!> Specifies the structure of A.
!> = 'L': The input matrix A is lower triangular;
!> = 'U': The input matrix A is upper triangular;
!> = 'G': The input matrix A is a general M-by-N matrix, M >= N.
!>
[in]	JOBU
!> JOBU is CHARACTER*1
!> Specifies whether to compute the left singular vectors
!> (columns of U):
!> = 'U': The left singular vectors corresponding to the nonzero
!>        singular values are computed and returned in the leading
!>        columns of A. See more details in the description of A.
!>        The default numerical orthogonality threshold is set to
!>        approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E').
!> = 'C': Analogous to JOBU='U', except that the user can control the
!>        level of numerical orthogonality of the computed left
!>        singular vectors. TOL can be set to TOL = CTOL*EPS, where
!>        CTOL is given on input in the array WORK.
!>        No CTOL smaller than ONE is allowed. CTOL greater
!>        than 1 / EPS is meaningless. The option 'C'
!>        can be used if M*EPS is satisfactory orthogonality
!>        of the computed left singular vectors, so CTOL=M could
!>        save a few sweeps of Jacobi rotations.
!>        See the descriptions of A and WORK(1).
!> = 'N': The matrix U is not computed. However, see the
!>        description of A.
!>
[in]	JOBV
!> JOBV is CHARACTER*1
!> Specifies whether to compute the right singular vectors, that
!> is, the matrix V:
!> = 'V': The matrix V is computed and returned in the array V.
!> = 'A': The Jacobi rotations are applied to the MV-by-N
!>        array V. In other words, the right singular vector
!>        matrix V is not computed explicitly; instead it is
!>        applied to an MV-by-N matrix initially stored in the
!>        first MV rows of V.
!> = 'N': The matrix V is not computed and the array V is not
!>        referenced.
!>
[in]	M
!> M is INTEGER
!> The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0.
!>
[in]	N
!> N is INTEGER
!> The number of columns of the input matrix A.
!> M >= N >= 0.
!>
[in,out]	A
!> A is REAL array, dimension (LDA,N)
!> On entry, the M-by-N matrix A.
!> On exit,
!> If JOBU = 'U' .OR. JOBU = 'C':
!>     If INFO = 0:
!>        RANKA orthonormal columns of U are returned in the
!>        leading RANKA columns of the array A. Here RANKA <= N
!>        is the number of computed singular values of A that are
!>        above the underflow threshold SLAMCH('S'). The singular
!>        vectors corresponding to underflowed or zero singular
!>        values are not computed. The value of RANKA is returned
!>        in the array WORK as RANKA=NINT(WORK(2)). Also see the
!>        descriptions of SVA and WORK. The computed columns of U
!>        are mutually numerically orthogonal up to approximately
!>        TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU = 'C'),
!>        see the description of JOBU.
!>     If INFO > 0:
!>        The procedure SGESVJ did not converge in the given number
!>        of iterations (sweeps). In that case, the computed
!>        columns of U may not be orthogonal up to TOL. The output
!>        U (stored in A), SIGMA (given by the computed singular
!>        values in SVA(1:N)) and V is still a decomposition of the
!>        input matrix A in the sense that the residual
!>        ||A-SCALE*U*SIGMA*V^T||_2 / ||A||_2 is small.
!> If JOBU = 'N':
!>     If INFO = 0:
!>        Note that the left singular vectors are 'for free' in the
!>        one-sided Jacobi SVD algorithm. However, if only the
!>        singular values are needed, the level of numerical
!>        orthogonality of U is not an issue and iterations are
!>        stopped when the columns of the iterated matrix are
!>        numerically orthogonal up to approximately M*EPS. Thus,
!>        on exit, A contains the columns of U scaled with the
!>        corresponding singular values.
!>     If INFO > 0:
!>        The procedure SGESVJ did not converge in the given number
!>        of iterations (sweeps).
!>
[in]	LDA
!> LDA is INTEGER
!> The leading dimension of the array A. LDA >= max(1,M).
!>
[out]	SVA
!> SVA is REAL array, dimension (N)
!> On exit,
!> If INFO = 0:
!>     Depending on the value SCALE = WORK(1), we have:
!>     If SCALE = ONE:
!>        SVA(1:N) contains the computed singular values of A.
!>        During the computation SVA contains the Euclidean column
!>        norms of the iterated matrices in the array A.
!>     If SCALE .NE. ONE:
!>        The singular values of A are SCALE*SVA(1:N), and this
!>        factored representation is due to the fact that some of the
!>        singular values of A might underflow or overflow.
!> If INFO > 0:
!>     The procedure SGESVJ did not converge in the given number of
!>     iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
!>
[in]	MV
!> MV is INTEGER
!> If JOBV = 'A', then the product of Jacobi rotations in SGESVJ
!> is applied to the first MV rows of V. See the description of JOBV.
!>
[in,out]	V
!> V is REAL array, dimension (LDV,N)
!> If JOBV = 'V', then V contains on exit the N-by-N matrix of
!>                the right singular vectors;
!> If JOBV = 'A', then V contains the product of the computed right
!>                singular vector matrix and the initial matrix in
!>                the array V.
!> If JOBV = 'N', then V is not referenced.
!>
[in]	LDV
!> LDV is INTEGER
!> The leading dimension of the array V, LDV >= 1.
!> If JOBV = 'V', then LDV >= max(1,N).
!> If JOBV = 'A', then LDV >= max(1,MV).
!>
[in,out]	WORK
!> WORK is REAL array, dimension (MAX(1,LWORK))
!> On entry,
!> If JOBU = 'C':
!>     WORK(1) = CTOL, where CTOL defines the threshold for convergence.
!>     The process stops if all columns of A are mutually
!>     orthogonal up to CTOL*EPS, EPS=SLAMCH('E').
!>     It is required that CTOL >= ONE, i.e. it is not
!>     allowed to force the routine to obtain orthogonality
!>     below EPSILON. Note that the argument check in the code
!>     rejects WORK(1) <= ONE with INFO = -12.
!> On exit,
!> WORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N)
!>     are the computed singular values of A.
!>     (See the description of SVA.)
!> WORK(2): NINT(WORK(2)) is the number of the computed nonzero
!>     singular values.
!> WORK(3): NINT(WORK(3)) is the number of the computed singular
!>     values that are larger than the underflow threshold.
!> WORK(4): NINT(WORK(4)) is the number of sweeps of Jacobi
!>     rotations needed for numerical convergence.
!> WORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep.
!>     This is useful information in cases when SGESVJ did
!>     not converge, as it can be used to estimate whether
!>     the output is still useful and for post festum analysis.
!> WORK(6) = the largest absolute value over all sines of the
!>     Jacobi rotation angles in the last sweep. It can be
!>     useful for a post festum analysis.
!>
[in]	LWORK
!> LWORK is INTEGER
!> Length of WORK.
!> LWORK >= 1, if MIN(M,N) = 0, and LWORK >= MAX(6,M+N), otherwise.
!>
!> If on entry LWORK = -1, then a workspace query is assumed and
!> no computation is done; WORK(1) is set to the minimal (and optimal)
!> length of WORK.
!>
[out]	INFO
!> INFO is INTEGER
!> = 0: successful exit.
!> < 0: if INFO = -i, then the i-th argument had an illegal value
!> > 0: SGESVJ did not converge in the maximal allowed number (30)
!>      of sweeps. The output may still be useful. See the
!>      description of WORK.
!>
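
Example (illustrative only, not part of the LAPACK distribution): the parameter descriptions above can be exercised with a minimal calling sketch. It assumes a Fortran compiler and a linked LAPACK library that provides SGESVJ. The program name sgesvj_demo, the variable names sfac, ranka and nsweeps, and the 4-by-3 test matrix are made up for illustration. LWORK is set directly to the documented minimum MAX(6,M+N) instead of being obtained through a workspace query.

```fortran
program sgesvj_demo
  implicit none
  ! Hypothetical problem size; any M >= N >= 0 is accepted by SGESVJ.
  integer, parameter :: m = 4, n = 3
  integer, parameter :: lda = m, ldv = n, mv = 0
  ! Documented minimal workspace length for MIN(M,N) > 0.
  integer, parameter :: lwork = max(6, m + n)
  real    :: a(lda, n), sva(n), v(ldv, n), work(lwork)
  real    :: sfac
  integer :: info, ranka, nsweeps, j
  external :: sgesvj

  ! Made-up test matrix: a diagonal with entries 3, 2, 1 plus a zero row.
  a = 0.0
  a(1, 1) = 3.0
  a(2, 2) = 2.0
  a(3, 3) = 1.0

  ! JOBA = 'G': general matrix, JOBU = 'U': U overwrites A,
  ! JOBV = 'V': V is returned in the array V (MV is then not used).
  call sgesvj('G', 'U', 'V', m, n, a, lda, sva, mv, v, ldv, &
              work, lwork, info)

  if (info > 0) then
     print *, 'SGESVJ did not converge; results may be inaccurate'
  else if (info < 0) then
     print *, 'illegal value in argument number ', -info
     stop 1
  end if

  sfac    = work(1)        ! SCALE: singular values are SCALE*SVA(1:N)
  ranka   = nint(work(2))  ! number of columns of U stored in A
  nsweeps = nint(work(4))  ! number of Jacobi sweeps performed

  do j = 1, n
     print *, 'sigma(', j, ') = ', sfac*sva(j)
  end do
  print *, 'RANKA = ', ranka, '  sweeps = ', nsweeps
end program sgesvj_demo
```

The singular values are formed as WORK(1)*SVA(1:N) because SVA alone is only meaningful when SCALE = ONE. With JOBU = 'U', only the leading NINT(WORK(2)) columns of A hold left singular vectors on exit.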

Author: Univ. of Tennessee; Univ. of California Berkeley; Univ. of Colorado Denver; NAG Ltd.

Further Details: The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane rotations. The rotations are implemented as fast scaled rotations of Anda and Park [1]. In the case of underflow of the Jacobi angle, a modified Jacobi transformation of Drmac [4] is used. The pivot strategy uses column interchanges of de Rijk [2]. The relative accuracy of the computed singular values and the accuracy of the computed singular vectors (in angle metric) are as guaranteed by the theory of Demmel and Veselic [3]. The condition number that determines the accuracy in the full rank case is essentially min_{D=diag} kappa(A*D), where kappa(.) is the spectral condition number. The best performance of this Jacobi SVD procedure is achieved if it is used in an accelerated version of Drmac and Veselic [5,6], and it is the kernel routine in the SIGMA library [7]. Some tuning parameters (marked with [TP]) are available for the implementer.
The computational range for the nonzero singular values is the machine number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even denormalized singular values can be computed with the corresponding gradual loss of accurate digits.
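
Example (illustrative only): the idea of building V as a product of plane rotations can be sketched with the textbook one-sided (Hestenes) Jacobi iteration. This is a simplified stand-in and not the SGESVJ implementation: it uses plain rotations instead of the fast scaled rotations of [1], and it has no de Rijk pivoting, no scaling safeguards and no tiling. The program name jacobi_sketch, all variable names and the 3-by-2 test matrix are made up, and EPSILON(1.0) is used as an assumed stand-in for SLAMCH('E').

```fortran
program jacobi_sketch
  implicit none
  integer, parameter :: m = 3, n = 2, max_sweeps = 30
  real    :: a(m, n), v(n, n), colp(m), vcol(n)
  real    :: app, aqq, apq, zeta, t, cs, sn, tol
  integer :: sweep, p, q, nrot

  ! Made-up test matrix, stored column by column.
  a = reshape([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [m, n])

  ! V starts as the identity and accumulates the rotations.
  v = 0.0
  do p = 1, n
     v(p, p) = 1.0
  end do

  ! Threshold TOL = CTOL*EPS with CTOL = SQRT(M).
  tol = sqrt(real(m))*epsilon(1.0)

  do sweep = 1, max_sweeps
     nrot = 0
     do p = 1, n - 1
        do q = p + 1, n
           app = dot_product(a(:, p), a(:, p))
           aqq = dot_product(a(:, q), a(:, q))
           apq = dot_product(a(:, p), a(:, q))
           ! Skip zero columns and pairs that are already
           ! numerically orthogonal (|cosine| <= TOL).
           if (app == 0.0 .or. aqq == 0.0) cycle
           if (abs(apq) <= tol*sqrt(app)*sqrt(aqq)) cycle
           nrot = nrot + 1
           ! Smaller root of t**2 + 2*zeta*t - 1 = 0.
           zeta = (aqq - app)/(2.0*apq)
           t = sign(1.0, zeta)/(abs(zeta) + sqrt(1.0 + zeta*zeta))
           cs = 1.0/sqrt(1.0 + t*t)
           sn = cs*t
           ! Rotate columns p and q of A so they become orthogonal.
           colp = a(:, p)
           a(:, p) = cs*colp - sn*a(:, q)
           a(:, q) = sn*colp + cs*a(:, q)
           ! Apply the same rotation to V.
           vcol = v(:, p)
           v(:, p) = cs*vcol - sn*v(:, q)
           v(:, q) = sn*vcol + cs*v(:, q)
        end do
     end do
     if (nrot == 0) exit
  end do

  ! The column norms of the rotated A approximate the singular
  ! values (not sorted in this sketch).
  do p = 1, n
     print *, 'sigma(', p, ') = ', sqrt(dot_product(a(:, p), a(:, p)))
  end do
end program jacobi_sketch
```

After the loop, the iterated matrix equals the original A times V, so dividing each nonzero column by its norm gives the corresponding column of U. Forming zeta*zeta directly can overflow for extreme inputs, which is one of the situations the safeguards in SGESVJ are designed to handle.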

Contributors: Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany)

References: [1] A. A. Anda and H. Park: Fast plane rotations with dynamic scaling.
SIAM J. Matrix Anal. Appl., Vol. 15 (1994), pp. 162-174.

[2] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the singular value decomposition on a vector computer.
SIAM J. Sci. Stat. Comp., Vol. 10 (1989), pp. 359-371.

[3] J. Demmel and K. Veselic: Jacobi method is more accurate than QR.
SIAM J. Matrix Anal. Appl., Vol. 13 (1992), pp. 1204-1245.

[4] Z. Drmac: Implementation of Jacobi rotations for accurate singular value computation in floating point arithmetic.
SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222.

[5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
SIAM J. Matrix Anal. Appl. Vol. 29, No. 4 (2008), pp. 1322-1342.
LAPACK Working note 169.

[6] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
SIAM J. Matrix Anal. Appl. Vol. 29, No. 4 (2008), pp. 1343-1362.
LAPACK Working note 170.

[7] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV, QSVD, (H,K)-SVD computations.
Department of Mathematics, University of Zagreb, 2008.

Bugs, Examples and Comments: Please report all bugs and send interesting test examples and comments to drmac@math.hr. Thank you.

Definition at line 324 of file sgesvj.f.

*
*  -- LAPACK computational routine --
*  -- LAPACK is a software package provided by Univ. of Tennessee,    --
*  -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
*
*     .. Scalar Arguments ..
      INTEGER            INFO, LDA, LDV, LWORK, M, MV, N
      CHARACTER*1        JOBA, JOBU, JOBV
*     ..
*     .. Array Arguments ..
      REAL               A( LDA, * ), SVA( N ), V( LDV, * ),
     $                   WORK( LWORK )
*     ..
*
*  =====================================================================
*
*     .. Local Parameters ..
      REAL               ZERO, HALF, ONE
      parameter( zero = 0.0e0, half = 0.5e0, one = 1.0e0)
      INTEGER            NSWEEP
      parameter( nsweep = 30 )
*     ..
*     .. Local Scalars ..
      REAL               AAPP, AAPP0, AAPQ, AAQQ, APOAQ, AQOAP, BIG,
     $                   BIGTHETA, CS, CTOL, EPSLN, LARGE, MXAAPQ,
     $                   MXSINJ, ROOTBIG, ROOTEPS, ROOTSFMIN, ROOTTOL,
     $                   SKL, SFMIN, SMALL, SN, T, TEMP1, THETA,
     $                   THSIGN, TOL
      INTEGER            BLSKIP, EMPTSW, i, ibr, IERR, igl, IJBLSK, ir1,
     $                   ISWROT, jbc, jgl, KBL, LKAHEAD, MVL, N2, N34,
     $                   N4, NBL, NOTROT, p, PSKIPPED, q, ROWSKIP,
     $                   SWBAND, MINMN, LWMIN
      LOGICAL            APPLV, GOSCALE, LOWER, LQUERY, LSVEC, NOSCALE,
     $                   ROTOK, RSVEC, UCTOL, UPPER
*     ..
*     .. Local Arrays ..
      REAL               FASTR( 5 )
*     ..
*     .. Intrinsic Functions ..
      INTRINSIC          abs, max, min, float, sign, sqrt
*     ..
*     .. External Functions ..
*     ..
*     from BLAS
      REAL               SDOT, SNRM2
      EXTERNAL           sdot, snrm2
      INTEGER            ISAMAX
      EXTERNAL           isamax
*     from LAPACK
      REAL               SLAMCH, SROUNDUP_LWORK
      EXTERNAL           slamch, sroundup_lwork
      LOGICAL            LSAME
      EXTERNAL           lsame
*     ..
*     .. External Subroutines ..
*     ..
*     from BLAS
      EXTERNAL           saxpy, scopy, srotm, sscal, sswap
*     from LAPACK
      EXTERNAL           slascl, slaset, slassq, xerbla
*
      EXTERNAL           sgsvj0, sgsvj1
*     ..
*     .. Executable Statements ..
*
*     Test the input arguments
*
      lsvec = lsame( jobu, 'U' )
      uctol = lsame( jobu, 'C' )
      rsvec = lsame( jobv, 'V' )
      applv = lsame( jobv, 'A' )
      upper = lsame( joba, 'U' )
      lower = lsame( joba, 'L' )
*
      minmn = min( m, n )
      IF( minmn.EQ.0 ) THEN
         lwmin = 1
      ELSE
         lwmin = max( 6, m+n )
      END IF
*
      lquery = ( lwork.EQ.-1 )
      IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
         info = -1
      ELSE IF( .NOT.( lsvec .OR.
     $         uctol .OR.
     $         lsame( jobu, 'N' ) ) ) THEN
         info = -2
      ELSE IF( .NOT.( rsvec .OR.
     $         applv .OR.
     $         lsame( jobv, 'N' ) ) ) THEN
         info = -3
      ELSE IF( m.LT.0 ) THEN
         info = -4
      ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
         info = -5
      ELSE IF( lda.LT.m ) THEN
         info = -7
      ELSE IF( mv.LT.0 ) THEN
         info = -9
      ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
     $         ( applv .AND. ( ldv.LT.mv ) ) ) THEN
         info = -11
      ELSE IF( uctol .AND. ( work( 1 ).LE.one ) ) THEN
         info = -12
      ELSE IF( lwork.LT.lwmin .AND. ( .NOT.lquery ) ) THEN
         info = -13
      ELSE
         info = 0
      END IF
*
*     #:(
      IF( info.NE.0 ) THEN
         CALL xerbla( 'SGESVJ', -info )
         RETURN
      ELSE IF( lquery ) THEN
         work( 1 ) = sroundup_lwork( lwmin )
         RETURN
      END IF
*
* #:) Quick return for void matrix
*
      IF( minmn.EQ.0 ) RETURN
*
*     Set numerical parameters
*     The stopping criterion for Jacobi rotations is
*
*     max_{i<>j}|A(:,i)^T * A(:,j)|/(||A(:,i)||*||A(:,j)||) < CTOL*EPS
*
*     where EPS is the round-off and CTOL is defined as follows:
*
      IF( uctol ) THEN
*        ... user controlled
         ctol = work( 1 )
      ELSE
*        ... default
         IF( lsvec .OR. rsvec .OR. applv ) THEN
            ctol = sqrt( float( m ) )
         ELSE
            ctol = float( m )
         END IF
      END IF
*     ... and the machine dependent parameters are
*[!]  (Make sure that SLAMCH() works properly on the target machine.)
*
      epsln = slamch( 'Epsilon' )
      rooteps = sqrt( epsln )
      sfmin = slamch( 'SafeMinimum' )
      rootsfmin = sqrt( sfmin )
      small = sfmin / epsln
      big = slamch( 'Overflow' )
*     BIG         = ONE    / SFMIN
      rootbig = one / rootsfmin
      large = big / sqrt( float( m*n ) )
      bigtheta = one / rooteps
*
      tol = ctol*epsln
      roottol = sqrt( tol )
*
      IF( float( m )*epsln.GE.one ) THEN
         info = -4
         CALL xerbla( 'SGESVJ', -info )
         RETURN
      END IF
*
*     Initialize the right singular vector matrix.
*
      IF( rsvec ) THEN
         mvl = n
         CALL slaset( 'A', mvl, n, zero, one, v, ldv )
      ELSE IF( applv ) THEN
         mvl = mv
      END IF
      rsvec = rsvec .OR. applv
*
*     Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
*(!)  If necessary, scale A to protect the largest singular value
*     from overflow. It is possible that saving the largest singular
*     value destroys the information about the small ones.
*     This initial scaling is almost minimal in the sense that the
*     goal is to make sure that no column norm overflows, and that
*     SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
*     in A are detected, the procedure returns with INFO=-6.
*
      skl = one / sqrt( float( m )*float( n ) )
      noscale = .true.
      goscale = .true.
*
      IF( lower ) THEN
*        the input matrix is M-by-N lower triangular (trapezoidal)
         DO 1874 p = 1, n
            aapp = zero
            aaqq = one
            CALL slassq( m-p+1, a( p, p ), 1, aapp, aaqq )
            IF( aapp.GT.big ) THEN
               info = -6
               CALL xerbla( 'SGESVJ', -info )
               RETURN
            END IF
            aaqq = sqrt( aaqq )
            IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
               sva( p ) = aapp*aaqq
            ELSE
               noscale = .false.
               sva( p ) = aapp*( aaqq*skl )
               IF( goscale ) THEN
                  goscale = .false.
                  DO 1873 q = 1, p - 1
                     sva( q ) = sva( q )*skl
 1873             CONTINUE
               END IF
            END IF
 1874    CONTINUE
      ELSE IF( upper ) THEN
*        the input matrix is M-by-N upper triangular (trapezoidal)
         DO 2874 p = 1, n
            aapp = zero
            aaqq = one
            CALL slassq( p, a( 1, p ), 1, aapp, aaqq )
            IF( aapp.GT.big ) THEN
               info = -6
               CALL xerbla( 'SGESVJ', -info )
               RETURN
            END IF
            aaqq = sqrt( aaqq )
            IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
               sva( p ) = aapp*aaqq
            ELSE
               noscale = .false.
               sva( p ) = aapp*( aaqq*skl )
               IF( goscale ) THEN
                  goscale = .false.
                  DO 2873 q = 1, p - 1
                     sva( q ) = sva( q )*skl
 2873             CONTINUE
               END IF
            END IF
 2874    CONTINUE
      ELSE
*        the input matrix is M-by-N general dense
         DO 3874 p = 1, n
            aapp = zero
            aaqq = one
            CALL slassq( m, a( 1, p ), 1, aapp, aaqq )
            IF( aapp.GT.big ) THEN
               info = -6
               CALL xerbla( 'SGESVJ', -info )
               RETURN
            END IF
            aaqq = sqrt( aaqq )
            IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
               sva( p ) = aapp*aaqq
            ELSE
               noscale = .false.
               sva( p ) = aapp*( aaqq*skl )
               IF( goscale ) THEN
                  goscale = .false.
                  DO 3873 q = 1, p - 1
                     sva( q ) = sva( q )*skl
 3873             CONTINUE
               END IF
            END IF
 3874    CONTINUE
      END IF
*
      IF( noscale )skl = one
*
*     Move the smaller part of the spectrum from the underflow threshold
*(!)  Start by determining the position of the nonzero entries of the
*     array SVA() relative to ( SFMIN, BIG ).
*
      aapp = zero
      aaqq = big
      DO 4781 p = 1, n
         IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
         aapp = max( aapp, sva( p ) )
 4781 CONTINUE
*
* #:) Quick return for zero matrix
*
      IF( aapp.EQ.zero ) THEN
         IF( lsvec )CALL slaset( 'G', m, n, zero, one, a, lda )
         work( 1 ) = one
         work( 2 ) = zero
         work( 3 ) = zero
         work( 4 ) = zero
         work( 5 ) = zero
         work( 6 ) = zero
         RETURN
      END IF
*
* #:) Quick return for one-column matrix
*
      IF( n.EQ.1 ) THEN
         IF( lsvec )CALL slascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
     $                           a( 1, 1 ), lda, ierr )
         work( 1 ) = one / skl
         IF( sva( 1 ).GE.sfmin ) THEN
            work( 2 ) = one
         ELSE
            work( 2 ) = zero
         END IF
         work( 3 ) = zero
         work( 4 ) = zero
         work( 5 ) = zero
         work( 6 ) = zero
         RETURN
      END IF
*
*     Protect small singular values from underflow, and try to
*     avoid underflows/overflows in computing Jacobi rotations.
*
      sn = sqrt( sfmin / epsln )
      temp1 = sqrt( big / float( n ) )
      IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
     $    ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
         temp1 = min( big, temp1 / aapp )
*         AAQQ  = AAQQ*TEMP1
*         AAPP  = AAPP*TEMP1
      ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
         temp1 = min( sn / aaqq, big / ( aapp*sqrt( float( n ) ) ) )
*         AAQQ  = AAQQ*TEMP1
*         AAPP  = AAPP*TEMP1
      ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
         temp1 = max( sn / aaqq, temp1 / aapp )
*         AAQQ  = AAQQ*TEMP1
*         AAPP  = AAPP*TEMP1
      ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
         temp1 = min( sn / aaqq, big / ( sqrt( float( n ) )*aapp ) )
*         AAQQ  = AAQQ*TEMP1
*         AAPP  = AAPP*TEMP1
      ELSE
         temp1 = one
      END IF
*
*     Scale, if necessary
*
      IF( temp1.NE.one ) THEN
         CALL slascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
      END IF
      skl = temp1*skl
      IF( skl.NE.one ) THEN
         CALL slascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
         skl = one / skl
      END IF
*
*     Row-cyclic Jacobi SVD algorithm with column pivoting
*
      emptsw = ( n*( n-1 ) ) / 2
      notrot = 0
      fastr( 1 ) = zero
*
*     A is represented in factored form A = A * diag(WORK), where diag(WORK)
*     is initialized to identity. WORK is updated during fast scaled
*     rotations.
*
      DO 1868 q = 1, n
         work( q ) = one
 1868 CONTINUE
*
*
      swband = 3
*[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
*     if SGESVJ is used as a computational routine in the preconditioned
*     Jacobi SVD algorithm SGEJSV. For sweeps i=1:SWBAND the procedure
*     works on pivots inside a band-like region around the diagonal.
*     The boundaries are determined dynamically, based on the number of
*     pivots above a threshold.
*
      kbl = min( 8, n )
*[TP] KBL is a tuning parameter that defines the tile size in the
*     tiling of the p-q loops of pivot pairs. In general, an optimal
*     value of KBL depends on the matrix dimensions and on the
*     parameters of the computer's memory.
*
      nbl = n / kbl
      IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
*
      blskip = kbl**2
*[TP] BLSKIP is a tuning parameter that depends on SWBAND and KBL.
*
      rowskip = min( 5, kbl )
*[TP] ROWSKIP is a tuning parameter.
*
      lkahead = 1
*[TP] LKAHEAD is a tuning parameter.
*
*     Quasi block transformations, using the lower (upper) triangular
*     structure of the input matrix. The quasi-block-cycling usually
*     invokes cubic convergence. A big part of this cycle is done inside
*     canonical subspaces of dimensions less than M.
*
      IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
*[TP] The number of partition levels and the actual partition are
*     tuning parameters.
         n4 = n / 4
         n2 = n / 2
         n34 = 3*n4
         IF( applv ) THEN
            q = 0
         ELSE
            q = 1
         END IF
*
         IF( lower ) THEN
*
*     This works very well on lower triangular matrices, in particular
*     in the framework of the preconditioned Jacobi SVD (xGEJSV).
*     The idea is simple:
*     [+ 0 0 0]   Note that Jacobi transformations of [0 0]
*     [+ + 0 0]                                       [0 0]
*     [+ + x 0]   actually work on [x 0]              [x 0]
*     [+ + x x]                    [x x].             [x x]
*
            CALL sgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
     $                   work( n34+1 ), sva( n34+1 ), mvl,
     $                   v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
     $                   2, work( n+1 ), lwork-n, ierr )
*
            CALL sgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
     $                   work( n2+1 ), sva( n2+1 ), mvl,
     $                   v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
     $                   work( n+1 ), lwork-n, ierr )
*
            CALL sgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
     $                   work( n2+1 ), sva( n2+1 ), mvl,
     $                   v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
     $                   work( n+1 ), lwork-n, ierr )
*
            CALL sgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
     $                   work( n4+1 ), sva( n4+1 ), mvl,
     $                   v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
     $                   work( n+1 ), lwork-n, ierr )
*
            CALL sgsvj0( jobv, m, n4, a, lda, work, sva, mvl, v, ldv,
     $                   epsln, sfmin, tol, 1, work( n+1 ), lwork-n,
     $                   ierr )
*
            CALL sgsvj1( jobv, m, n2, n4, a, lda, work, sva, mvl, v,
     $                   ldv, epsln, sfmin, tol, 1, work( n+1 ),
     $                   lwork-n, ierr )
*
*
         ELSE IF( upper ) THEN
*
*
            CALL sgsvj0( jobv, n4, n4, a, lda, work, sva, mvl, v,
     $                   ldv,
     $                   epsln, sfmin, tol, 2, work( n+1 ), lwork-n,
     $                   ierr )
*
            CALL sgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda,
     $                   work( n4+1 ),
     $                   sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
     $                   epsln, sfmin, tol, 1, work( n+1 ), lwork-n,
     $                   ierr )
*
            CALL sgsvj1( jobv, n2, n2, n4, a, lda, work, sva, mvl, v,
     $                   ldv, epsln, sfmin, tol, 1, work( n+1 ),
     $                   lwork-n, ierr )
*
            CALL sgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
     $                   work( n2+1 ), sva( n2+1 ), mvl,
     $                   v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
     $                   work( n+1 ), lwork-n, ierr )
 
         END IF
*
      END IF
*
*     .. Row-cyclic pivot strategy with de Rijk's pivoting ..
*
      DO 1993 i = 1, nsweep
*
*     .. go go go ...
*
         mxaapq = zero
         mxsinj = zero
         iswrot = 0
*
         notrot = 0
         pskipped = 0
*
*     Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
*     1 <= p < q <= N. This is the first step toward a blocked implementation
*     of the rotations. A new implementation, based on block transformations,
*     is under development.
*
         DO 2000 ibr = 1, nbl
*
            igl = ( ibr-1 )*kbl + 1
*
            DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
*
               igl = igl + ir1*kbl
*
               DO 2001 p = igl, min( igl+kbl-1, n-1 )
*
*     .. de Rijk's pivoting
*
                  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
                  IF( p.NE.q ) THEN
                     CALL sswap( m, a( 1, p ), 1, a( 1, q ), 1 )
                     IF( rsvec )CALL sswap( mvl, v( 1, p ), 1,
     $                                      v( 1, q ), 1 )
                     temp1 = sva( p )
                     sva( p ) = sva( q )
                     sva( q ) = temp1
                     temp1 = work( p )
                     work( p ) = work( q )
                     work( q ) = temp1
                  END IF
*
                  IF( ir1.EQ.0 ) THEN
*
*        Column norms are periodically updated by explicit
*        norm computation.
*        Caveat:
*        Unfortunately, some BLAS implementations compute SNRM2(M,A(1,p),1)
*        as SQRT(SDOT(M,A(1,p),1,A(1,p),1)), which may cause the result to
*        overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
*        underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
*        Hence, SNRM2 cannot be trusted, not even in the case when
*        the true norm is far from the under(over)flow boundaries.
*        If properly implemented SNRM2 is available, the IF-THEN-ELSE
*        below should read "AAPP = SNRM2( M, A(1,p), 1 ) * WORK(p)".
*
                     IF( ( sva( p ).LT.rootbig ) .AND.
     $                   ( sva( p ).GT.rootsfmin ) ) THEN
                        sva( p ) = snrm2( m, a( 1, p ), 1 )*work( p )
                     ELSE
                        temp1 = zero
                        aapp = one
                        CALL slassq( m, a( 1, p ), 1, temp1, aapp )
                        sva( p ) = temp1*sqrt( aapp )*work( p )
                     END IF
                     aapp = sva( p )
                  ELSE
                     aapp = sva( p )
                  END IF
*
                  IF( aapp.GT.zero ) THEN
*
                     pskipped = 0
*
                     DO 2002 q = p + 1, min( igl+kbl-1, n )
*
                        aaqq = sva( q )
*
                        IF( aaqq.GT.zero ) THEN
*
                           aapp0 = aapp
                           IF( aaqq.GE.one ) THEN
                              rotok = ( small*aapp ).LE.aaqq
                              IF( aapp.LT.( big / aaqq ) ) THEN
                                 aapq = ( sdot( m, a( 1, p ), 1,
     $                                    a( 1,
     $                                  q ), 1 )*work( p )*work( q ) /
     $                                  aaqq ) / aapp
                              ELSE
                                 CALL scopy( m, a( 1, p ), 1,
     $                                       work( n+1 ), 1 )
                                 CALL slascl( 'G', 0, 0, aapp,
     $                                        work( p ), m, 1,
     $                                        work( n+1 ), lda, ierr )
                                 aapq = sdot( m, work( n+1 ), 1,
     $                                  a( 1, q ), 1 )*work( q ) / aaqq
                              END IF
                           ELSE
                              rotok = aapp.LE.( aaqq / small )
                              IF( aapp.GT.( small / aaqq ) ) THEN
                                 aapq = ( sdot( m, a( 1, p ), 1,
     $                                    a( 1,
     $                                  q ), 1 )*work( p )*work( q ) /
     $                                  aaqq ) / aapp
                              ELSE
                                 CALL scopy( m, a( 1, q ), 1,
     $                                       work( n+1 ), 1 )
                                 CALL slascl( 'G', 0, 0, aaqq,
     $                                        work( q ), m, 1,
     $                                        work( n+1 ), lda, ierr )
                                 aapq = sdot( m, work( n+1 ), 1,
     $                                  a( 1, p ), 1 )*work( p ) / aapp
                              END IF
                           END IF
*
                           mxaapq = max( mxaapq, abs( aapq ) )
*
*        TO rotate or NOT to rotate, THAT is the question ...
*
                           IF( abs( aapq ).GT.tol ) THEN
*
*           .. rotate
*[RTD]      ROTATED = ROTATED + ONE
*
                              IF( ir1.EQ.0 ) THEN
                                 notrot = 0
                                 pskipped = 0
                                 iswrot = iswrot + 1
                              END IF
*
                              IF( rotok ) THEN
*
                                 aqoap = aaqq / aapp
                                 apoaq = aapp / aaqq
                                 theta = -half*abs( aqoap-apoaq ) / aapq
*
                                 IF( abs( theta ).GT.bigtheta ) THEN
*
                                    t = half / theta
                                    fastr( 3 ) = t*work( p ) / work( q )
                                    fastr( 4 ) = -t*work( q ) /
     $                                           work( p )
                                    CALL srotm( m, a( 1, p ), 1,
     $                                          a( 1, q ), 1, fastr )
                                    IF( rsvec )CALL srotm( mvl,
     $                                              v( 1, p ), 1,
     $                                              v( 1, q ), 1,
     $                                              fastr )
                                    sva( q ) = aaqq*sqrt( max( zero,
     $                                         one+t*apoaq*aapq ) )
                                    aapp = aapp*sqrt( max( zero,
     $                                         one-t*aqoap*aapq ) )
                                    mxsinj = max( mxsinj, abs( t ) )
*
                                 ELSE
*
*                 .. choose correct signum for THETA and rotate
*
                                    thsign = -sign( one, aapq )
                                    t = one / ( theta+thsign*
     $                                  sqrt( one+theta*theta ) )
                                    cs = sqrt( one / ( one+t*t ) )
                                    sn = t*cs
*
                                    mxsinj = max( mxsinj, abs( sn ) )
                                    sva( q ) = aaqq*sqrt( max( zero,
     $                                         one+t*apoaq*aapq ) )
                                    aapp = aapp*sqrt( max( zero,
     $                                     one-t*aqoap*aapq ) )
*
                                    apoaq = work( p ) / work( q )
                                    aqoap = work( q ) / work( p )
                                    IF( work( p ).GE.one ) THEN
                                       IF( work( q ).GE.one ) THEN
                                          fastr( 3 ) = t*apoaq
                                          fastr( 4 ) = -t*aqoap
                                          work( p ) = work( p )*cs
                                          work( q ) = work( q )*cs
                                          CALL srotm( m, a( 1, p ),
     $                                                1,
     $                                                a( 1, q ), 1,
     $                                                fastr )
                                          IF( rsvec )CALL srotm( mvl,
     $                                        v( 1, p ), 1, v( 1, q ),
     $                                        1, fastr )
                                       ELSE
                                          CALL saxpy( m, -t*aqoap,
     $                                                a( 1, q ), 1,
     $                                                a( 1, p ), 1 )
                                          CALL saxpy( m, cs*sn*apoaq,
     $                                                a( 1, p ), 1,
     $                                                a( 1, q ), 1 )
                                          work( p ) = work( p )*cs
                                          work( q ) = work( q ) / cs
                                          IF( rsvec ) THEN
                                             CALL saxpy( mvl,
     $                                                   -t*aqoap,
     $                                                   v( 1, q ), 1,
     $                                                   v( 1, p ), 1 )
                                             CALL saxpy( mvl,
     $                                                   cs*sn*apoaq,
     $                                                   v( 1, p ), 1,
     $                                                   v( 1, q ), 1 )
                                          END IF
                                       END IF
                                    ELSE
                                       IF( work( q ).GE.one ) THEN
                                          CALL saxpy( m, t*apoaq,
     $                                                a( 1, p ), 1,
     $                                                a( 1, q ), 1 )
                                          CALL saxpy( m,
     $                                                -cs*sn*aqoap,
     $                                                a( 1, q ), 1,
     $                                                a( 1, p ), 1 )
                                          work( p ) = work( p ) / cs
                                          work( q ) = work( q )*cs
                                          IF( rsvec ) THEN
                                             CALL saxpy( mvl,
     $                                                   t*apoaq,
     $                                                   v( 1, p ), 1,
     $                                                   v( 1, q ), 1 )
                                             CALL saxpy( mvl,
     $                                                   -cs*sn*aqoap,
     $                                                   v( 1, q ), 1,
     $                                                   v( 1, p ), 1 )
                                          END IF
                                       ELSE
                                          IF( work( p ).GE.work( q ) )
     $                                        THEN
                                             CALL saxpy( m, -t*aqoap,
     $                                                   a( 1, q ), 1,
     $                                                   a( 1, p ), 1 )
                                             CALL saxpy( m,
     $                                                   cs*sn*apoaq,
     $                                                   a( 1, p ), 1,
     $                                                   a( 1, q ), 1 )
                                             work( p ) = work( p )*cs
                                             work( q ) = work( q ) / cs
                                             IF( rsvec ) THEN
                                                CALL saxpy( mvl,
     $                                               -t*aqoap,
     $                                               v( 1, q ), 1,
     $                                               v( 1, p ), 1 )
                                                CALL saxpy( mvl,
     $                                               cs*sn*apoaq,
     $                                               v( 1, p ), 1,
     $                                               v( 1, q ), 1 )
                                             END IF
                                          ELSE
                                             CALL saxpy( m, t*apoaq,
     $                                                   a( 1, p ), 1,
     $                                                   a( 1, q ), 1 )
                                             CALL saxpy( m,
     $                                                   -cs*sn*aqoap,
     $                                                   a( 1, q ), 1,
     $                                                   a( 1, p ), 1 )
                                             work( p ) = work( p ) / cs
                                             work( q ) = work( q )*cs
                                             IF( rsvec ) THEN
                                                CALL saxpy( mvl,
     $                                               t*apoaq, v( 1, p ),
     $                                               1, v( 1, q ), 1 )
                                                CALL saxpy( mvl,
     $                                               -cs*sn*aqoap,
     $                                               v( 1, q ), 1,
     $                                               v( 1, p ), 1 )
                                             END IF
                                          END IF
                                       END IF
                                    END IF
                                 END IF
*
                              ELSE
*              .. have to use modified Gram-Schmidt like transformation
                                 CALL scopy( m, a( 1, p ), 1,
     $                                       work( n+1 ), 1 )
                                 CALL slascl( 'G', 0, 0, aapp, one,
     $                                        m,
     $                                        1, work( n+1 ), lda,
     $                                        ierr )
                                 CALL slascl( 'G', 0, 0, aaqq, one,
     $                                        m,
     $                                        1, a( 1, q ), lda, ierr )
                                 temp1 = -aapq*work( p ) / work( q )
                                 CALL saxpy( m, temp1, work( n+1 ),
     $                                       1,
     $                                       a( 1, q ), 1 )
                                 CALL slascl( 'G', 0, 0, one, aaqq,
     $                                        m,
     $                                        1, a( 1, q ), lda, ierr )
                                 sva( q ) = aaqq*sqrt( max( zero,
     $                                      one-aapq*aapq ) )
                                 mxsinj = max( mxsinj, sfmin )
                              END IF
*           END IF ROTOK THEN ... ELSE
*
*           In the case of cancellation in updating SVA(q), SVA(p)
*           recompute SVA(q), SVA(p).
*
                              IF( ( sva( q ) / aaqq )**2.LE.rooteps )
     $                            THEN
                                 IF( ( aaqq.LT.rootbig ) .AND.
     $                               ( aaqq.GT.rootsfmin ) ) THEN
                                    sva( q ) = snrm2( m, a( 1, q ),
     $                                   1 )*
     $                                         work( q )
                                 ELSE
                                    t = zero
                                    aaqq = one
                                    CALL slassq( m, a( 1, q ), 1, t,
     $                                           aaqq )
                                    sva( q ) = t*sqrt( aaqq )*work( q )
                                 END IF
                              END IF
                              IF( ( aapp / aapp0 ).LE.rooteps ) THEN
                                 IF( ( aapp.LT.rootbig ) .AND.
     $                               ( aapp.GT.rootsfmin ) ) THEN
                                    aapp = snrm2( m, a( 1, p ), 1 )*
     $                                     work( p )
                                 ELSE
                                    t = zero
                                    aapp = one
                                    CALL slassq( m, a( 1, p ), 1, t,
     $                                           aapp )
                                    aapp = t*sqrt( aapp )*work( p )
                                 END IF
                                 sva( p ) = aapp
                              END IF
*
                           ELSE
*        A(:,p) and A(:,q) already numerically orthogonal
                              IF( ir1.EQ.0 )notrot = notrot + 1
*[RTD]      SKIPPED  = SKIPPED  + 1
                              pskipped = pskipped + 1
                           END IF
                        ELSE
*        A(:,q) is a zero column
                           IF( ir1.EQ.0 )notrot = notrot + 1
                           pskipped = pskipped + 1
                        END IF
*
                        IF( ( i.LE.swband ) .AND.
     $                      ( pskipped.GT.rowskip ) ) THEN
                           IF( ir1.EQ.0 )aapp = -aapp
                           notrot = 0
                           GO TO 2103
                        END IF
*
 2002                CONTINUE
*     END q-LOOP
*
 2103                CONTINUE
*     bailed out of q-loop
*
                     sva( p ) = aapp
*
                  ELSE
                     sva( p ) = aapp
                     IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
     $                   notrot = notrot + min( igl+kbl-1, n ) - p
                  END IF
*
 2001          CONTINUE
*     end of the p-loop
*     end of doing the block ( ibr, ibr )
 1002       CONTINUE
*     end of ir1-loop
*
* ... go to the off diagonal blocks
*
            igl = ( ibr-1 )*kbl + 1
*
            DO 2010 jbc = ibr + 1, nbl
*
               jgl = ( jbc-1 )*kbl + 1
*
*        doing the block at ( ibr, jbc )
*
               ijblsk = 0
               DO 2100 p = igl, min( igl+kbl-1, n )
*
                  aapp = sva( p )
                  IF( aapp.GT.zero ) THEN
*
                     pskipped = 0
*
                     DO 2200 q = jgl, min( jgl+kbl-1, n )
*
                        aaqq = sva( q )
                        IF( aaqq.GT.zero ) THEN
                           aapp0 = aapp
*
*     .. M x 2 Jacobi SVD ..
*
*        Safe Gram matrix computation
*
                           IF( aaqq.GE.one ) THEN
                              IF( aapp.GE.aaqq ) THEN
                                 rotok = ( small*aapp ).LE.aaqq
                              ELSE
                                 rotok = ( small*aaqq ).LE.aapp
                              END IF
                              IF( aapp.LT.( big / aaqq ) ) THEN
                                 aapq = ( sdot( m, a( 1, p ), 1,
     $                                    a( 1,
     $                                  q ), 1 )*work( p )*work( q ) /
     $                                  aaqq ) / aapp
                              ELSE
                                 CALL scopy( m, a( 1, p ), 1,
     $                                       work( n+1 ), 1 )
                                 CALL slascl( 'G', 0, 0, aapp,
     $                                        work( p ), m, 1,
     $                                        work( n+1 ), lda, ierr )
                                 aapq = sdot( m, work( n+1 ), 1,
     $                                  a( 1, q ), 1 )*work( q ) / aaqq
                              END IF
                           ELSE
                              IF( aapp.GE.aaqq ) THEN
                                 rotok = aapp.LE.( aaqq / small )
                              ELSE
                                 rotok = aaqq.LE.( aapp / small )
                              END IF
                              IF( aapp.GT.( small / aaqq ) ) THEN
                                 aapq = ( sdot( m, a( 1, p ), 1,
     $                                    a( 1,
     $                                  q ), 1 )*work( p )*work( q ) /
     $                                  aaqq ) / aapp
                              ELSE
                                 CALL scopy( m, a( 1, q ), 1,
     $                                       work( n+1 ), 1 )
                                 CALL slascl( 'G', 0, 0, aaqq,
     $                                        work( q ), m, 1,
     $                                        work( n+1 ), lda, ierr )
                                 aapq = sdot( m, work( n+1 ), 1,
     $                                  a( 1, p ), 1 )*work( p ) / aapp
                              END IF
                           END IF
*
                           mxaapq = max( mxaapq, abs( aapq ) )
*
*        TO rotate or NOT to rotate, THAT is the question ...
*
                           IF( abs( aapq ).GT.tol ) THEN
                              notrot = 0
*[RTD]      ROTATED  = ROTATED + 1
                              pskipped = 0
                              iswrot = iswrot + 1
*
                              IF( rotok ) THEN
*
                                 aqoap = aaqq / aapp
                                 apoaq = aapp / aaqq
                                 theta = -half*abs( aqoap-apoaq ) / aapq
                                 IF( aaqq.GT.aapp0 )theta = -theta
*
                                 IF( abs( theta ).GT.bigtheta ) THEN
                                    t = half / theta
                                    fastr( 3 ) = t*work( p ) / work( q )
                                    fastr( 4 ) = -t*work( q ) /
     $                                           work( p )
                                    CALL srotm( m, a( 1, p ), 1,
     $                                          a( 1, q ), 1, fastr )
                                    IF( rsvec )CALL srotm( mvl,
     $                                              v( 1, p ), 1,
     $                                              v( 1, q ), 1,
     $                                              fastr )
                                    sva( q ) = aaqq*sqrt( max( zero,
     $                                         one+t*apoaq*aapq ) )
                                    aapp = aapp*sqrt( max( zero,
     $                                     one-t*aqoap*aapq ) )
                                    mxsinj = max( mxsinj, abs( t ) )
                                 ELSE
*
*                 .. choose correct signum for THETA and rotate
*
                                    thsign = -sign( one, aapq )
                                    IF( aaqq.GT.aapp0 )thsign = -thsign
                                    t = one / ( theta+thsign*
     $                                  sqrt( one+theta*theta ) )
                                    cs = sqrt( one / ( one+t*t ) )
                                    sn = t*cs
                                    mxsinj = max( mxsinj, abs( sn ) )
                                    sva( q ) = aaqq*sqrt( max( zero,
     $                                         one+t*apoaq*aapq ) )
                                    aapp = aapp*sqrt( max( zero,
     $                                         one-t*aqoap*aapq ) )
*
                                    apoaq = work( p ) / work( q )
                                    aqoap = work( q ) / work( p )
                                    IF( work( p ).GE.one ) THEN
*
                                       IF( work( q ).GE.one ) THEN
                                          fastr( 3 ) = t*apoaq
                                          fastr( 4 ) = -t*aqoap
                                          work( p ) = work( p )*cs
                                          work( q ) = work( q )*cs
                                          CALL srotm( m, a( 1, p ),
     $                                                1,
     $                                                a( 1, q ), 1,
     $                                                fastr )
                                          IF( rsvec )CALL srotm( mvl,
     $                                        v( 1, p ), 1, v( 1, q ),
     $                                        1, fastr )
                                       ELSE
                                          CALL saxpy( m, -t*aqoap,
     $                                                a( 1, q ), 1,
     $                                                a( 1, p ), 1 )
                                          CALL saxpy( m, cs*sn*apoaq,
     $                                                a( 1, p ), 1,
     $                                                a( 1, q ), 1 )
                                          IF( rsvec ) THEN
                                             CALL saxpy( mvl,
     $                                                   -t*aqoap,
     $                                                   v( 1, q ), 1,
     $                                                   v( 1, p ), 1 )
                                             CALL saxpy( mvl,
     $                                                   cs*sn*apoaq,
     $                                                   v( 1, p ), 1,
     $                                                   v( 1, q ), 1 )
                                          END IF
                                          work( p ) = work( p )*cs
                                          work( q ) = work( q ) / cs
                                       END IF
                                    ELSE
                                       IF( work( q ).GE.one ) THEN
                                          CALL saxpy( m, t*apoaq,
     $                                                a( 1, p ), 1,
     $                                                a( 1, q ), 1 )
                                          CALL saxpy( m,
     $                                                -cs*sn*aqoap,
     $                                                a( 1, q ), 1,
     $                                                a( 1, p ), 1 )
                                          IF( rsvec ) THEN
                                             CALL saxpy( mvl,
     $                                                   t*apoaq,
     $                                                   v( 1, p ), 1,
     $                                                   v( 1, q ), 1 )
                                             CALL saxpy( mvl,
     $                                                   -cs*sn*aqoap,
     $                                                   v( 1, q ), 1,
     $                                                   v( 1, p ), 1 )
                                          END IF
                                          work( p ) = work( p ) / cs
                                          work( q ) = work( q )*cs
                                       ELSE
                                          IF( work( p ).GE.work( q ) )
     $                                        THEN
                                             CALL saxpy( m, -t*aqoap,
     $                                                   a( 1, q ), 1,
     $                                                   a( 1, p ), 1 )
                                             CALL saxpy( m,
     $                                                   cs*sn*apoaq,
     $                                                   a( 1, p ), 1,
     $                                                   a( 1, q ), 1 )
                                             work( p ) = work( p )*cs
                                             work( q ) = work( q ) / cs
                                             IF( rsvec ) THEN
                                                CALL saxpy( mvl,
     $                                               -t*aqoap,
     $                                               v( 1, q ), 1,
     $                                               v( 1, p ), 1 )
                                                CALL saxpy( mvl,
     $                                               cs*sn*apoaq,
     $                                               v( 1, p ), 1,
     $                                               v( 1, q ), 1 )
                                             END IF
                                          ELSE
                                             CALL saxpy( m, t*apoaq,
     $                                                   a( 1, p ), 1,
     $                                                   a( 1, q ), 1 )
                                             CALL saxpy( m,
     $                                                   -cs*sn*aqoap,
     $                                                   a( 1, q ), 1,
     $                                                   a( 1, p ), 1 )
                                             work( p ) = work( p ) / cs
                                             work( q ) = work( q )*cs
                                             IF( rsvec ) THEN
                                                CALL saxpy( mvl,
     $                                               t*apoaq, v( 1, p ),
     $                                               1, v( 1, q ), 1 )
                                                CALL saxpy( mvl,
     $                                               -cs*sn*aqoap,
     $                                               v( 1, q ), 1,
     $                                               v( 1, p ), 1 )
                                             END IF
                                          END IF
                                       END IF
                                    END IF
                                 END IF
*
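                              ELSE
*              .. rotation not safe: use a modified Gram-Schmidt like
*              transformation on the column with the smaller norm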
                                 IF( aapp.GT.aaqq ) THEN
                                    CALL scopy( m, a( 1, p ), 1,
     $                                          work( n+1 ), 1 )
                                    CALL slascl( 'G', 0, 0, aapp,
     $                                           one,
     $                                           m, 1, work( n+1 ), lda,
     $                                           ierr )
                                    CALL slascl( 'G', 0, 0, aaqq,
     $                                           one,
     $                                           m, 1, a( 1, q ), lda,
     $                                           ierr )
                                    temp1 = -aapq*work( p ) / work( q )
                                    CALL saxpy( m, temp1,
     $                                          work( n+1 ),
     $                                          1, a( 1, q ), 1 )
                                    CALL slascl( 'G', 0, 0, one,
     $                                           aaqq,
     $                                           m, 1, a( 1, q ), lda,
     $                                           ierr )
                                    sva( q ) = aaqq*sqrt( max( zero,
     $                                         one-aapq*aapq ) )
                                    mxsinj = max( mxsinj, sfmin )
                                 ELSE
                                    CALL scopy( m, a( 1, q ), 1,
     $                                          work( n+1 ), 1 )
                                    CALL slascl( 'G', 0, 0, aaqq,
     $                                           one,
     $                                           m, 1, work( n+1 ), lda,
     $                                           ierr )
                                    CALL slascl( 'G', 0, 0, aapp,
     $                                           one,
     $                                           m, 1, a( 1, p ), lda,
     $                                           ierr )
                                    temp1 = -aapq*work( q ) / work( p )
                                    CALL saxpy( m, temp1,
     $                                          work( n+1 ),
     $                                          1, a( 1, p ), 1 )
                                    CALL slascl( 'G', 0, 0, one,
     $                                           aapp,
     $                                           m, 1, a( 1, p ), lda,
     $                                           ierr )
                                    sva( p ) = aapp*sqrt( max( zero,
     $                                         one-aapq*aapq ) )
                                    mxsinj = max( mxsinj, sfmin )
                                 END IF
                              END IF
*           END IF ROTOK THEN ... ELSE
*
*           In the case of cancellation in updating SVA(q), SVA(p)
*           .. recompute SVA(q), SVA(p)
                              IF( ( sva( q ) / aaqq )**2.LE.rooteps )
     $                            THEN
                                 IF( ( aaqq.LT.rootbig ) .AND.
     $                               ( aaqq.GT.rootsfmin ) ) THEN
                                    sva( q ) = snrm2( m, a( 1, q ),
     $                                   1 )*
     $                                         work( q )
                                 ELSE
                                    t = zero
                                    aaqq = one
                                    CALL slassq( m, a( 1, q ), 1, t,
     $                                           aaqq )
                                    sva( q ) = t*sqrt( aaqq )*work( q )
                                 END IF
                              END IF
                              IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
                                 IF( ( aapp.LT.rootbig ) .AND.
     $                               ( aapp.GT.rootsfmin ) ) THEN
                                    aapp = snrm2( m, a( 1, p ), 1 )*
     $                                     work( p )
                                 ELSE
                                    t = zero
                                    aapp = one
                                    CALL slassq( m, a( 1, p ), 1, t,
     $                                           aapp )
                                    aapp = t*sqrt( aapp )*work( p )
                                 END IF
                                 sva( p ) = aapp
                              END IF
*              end of OK rotation
                           ELSE
                              notrot = notrot + 1
*[RTD]      SKIPPED  = SKIPPED  + 1
                              pskipped = pskipped + 1
                              ijblsk = ijblsk + 1
                           END IF
                        ELSE
                           notrot = notrot + 1
                           pskipped = pskipped + 1
                           ijblsk = ijblsk + 1
                        END IF
*
                        IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
     $                      THEN
                           sva( p ) = aapp
                           notrot = 0
                           GO TO 2011
                        END IF
                        IF( ( i.LE.swband ) .AND.
     $                      ( pskipped.GT.rowskip ) ) THEN
                           aapp = -aapp
                           notrot = 0
                           GO TO 2203
                        END IF
*
 2200                CONTINUE
*        end of the q-loop
 2203                CONTINUE
*
                     sva( p ) = aapp
*
                  ELSE
*
                     IF( aapp.EQ.zero )notrot = notrot +
     $                   min( jgl+kbl-1, n ) - jgl + 1
                     IF( aapp.LT.zero )notrot = 0
*
                  END IF
*
 2100          CONTINUE
*     end of the p-loop
 2010       CONTINUE
*     end of the jbc-loop
 2011       CONTINUE
*2011 bailed out of the jbc-loop
            DO 2012 p = igl, min( igl+kbl-1, n )
               sva( p ) = abs( sva( p ) )
 2012       CONTINUE
***
 2000    CONTINUE
*2000 :: end of the ibr-loop
*
*     .. update SVA(N)
         IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
     $       THEN
            sva( n ) = snrm2( m, a( 1, n ), 1 )*work( n )
         ELSE
            t = zero
            aapp = one
            CALL slassq( m, a( 1, n ), 1, t, aapp )
            sva( n ) = t*sqrt( aapp )*work( n )
         END IF
*
*     Additional steering devices
*
         IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
     $       ( iswrot.LE.n ) ) )swband = i
*
         IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( float( n ) )*
     $       tol ) .AND. ( float( n )*mxaapq*mxsinj.LT.tol ) ) THEN
            GO TO 1994
         END IF
*
         IF( notrot.GE.emptsw )GO TO 1994
*
 1993 CONTINUE
*     end i=1:NSWEEP loop
*
* #:( Reaching this point means that the procedure has not converged.
      info = nsweep - 1
      GO TO 1995
*
 1994 CONTINUE
* #:) Reaching this point means numerical convergence after the i-th
*     sweep.
*
      info = 0
* #:) INFO = 0 confirms successful iterations.
 1995 CONTINUE
*
*     Sort the singular values and find how many are above
*     the underflow threshold.
*
      n2 = 0
      n4 = 0
      DO 5991 p = 1, n - 1
         q = isamax( n-p+1, sva( p ), 1 ) + p - 1
         IF( p.NE.q ) THEN
            temp1 = sva( p )
            sva( p ) = sva( q )
            sva( q ) = temp1
            temp1 = work( p )
            work( p ) = work( q )
            work( q ) = temp1
            CALL sswap( m, a( 1, p ), 1, a( 1, q ), 1 )
            IF( rsvec )CALL sswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
         END IF
         IF( sva( p ).NE.zero ) THEN
            n4 = n4 + 1
            IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
         END IF
 5991 CONTINUE
      IF( sva( n ).NE.zero ) THEN
         n4 = n4 + 1
         IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
      END IF
*
*     Normalize the left singular vectors.
*
      IF( lsvec .OR. uctol ) THEN
         DO 1998 p = 1, n2
            CALL sscal( m, work( p ) / sva( p ), a( 1, p ), 1 )
 1998    CONTINUE
      END IF
*
*     Scale the product of Jacobi rotations (assemble the fast rotations).
*
      IF( rsvec ) THEN
         IF( applv ) THEN
            DO 2398 p = 1, n
               CALL sscal( mvl, work( p ), v( 1, p ), 1 )
 2398       CONTINUE
         ELSE
            DO 2399 p = 1, n
               temp1 = one / snrm2( mvl, v( 1, p ), 1 )
               CALL sscal( mvl, temp1, v( 1, p ), 1 )
 2399       CONTINUE
         END IF
      END IF
*
*     Undo scaling, if necessary (and possible).
      IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
     $    .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
     $    ( sfmin / skl ) ) ) ) THEN
         DO 2400 p = 1, n
            sva( p ) = skl*sva( p )
 2400    CONTINUE
         skl = one
      END IF
*
      work( 1 ) = skl
*     The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
*     then some of the singular values may overflow or underflow and
*     the spectrum is given in this factored representation.
*
      work( 2 ) = float( n4 )
*     N4 is the number of computed nonzero singular values of A.
*
      work( 3 ) = float( n2 )
*     N2 is the number of singular values of A greater than SFMIN.
*     If N2<N, SVA(N2+1:N) contains ZEROS and/or denormalized numbers
*     that may carry some information.
*
      work( 4 ) = float( i )
*     i is the index of the last sweep before declaring convergence.
*
      work( 5 ) = mxaapq
*     MXAAPQ is the largest absolute value of scaled pivots in the
*     last sweep
*
      work( 6 ) = mxsinj
*     MXSINJ is the largest absolute value of the sines of Jacobi angles
*     in the last sweep
*
      RETURN
*     ..
*     .. END OF SGESVJ
*     ..
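
The closing block of the listing stores its diagnostics in WORK(1:6) and leaves the singular values in the factored form SKL*SVA(1:N). A caller could read these back roughly as in the sketch below. The program name and the test matrix are hypothetical, the code is free-form Fortran rather than the fixed form used by the library, and it must be linked against a LAPACK library; the argument order follows the SGESVJ signature at the top of this page.

```fortran
program sgesvj_work_demo
   ! Sketch only: link against LAPACK (for example with -llapack).
   implicit none
   integer, parameter :: m = 4, n = 3, lda = m, ldv = n
   integer, parameter :: lwork = max(6, m + n)
   real :: a(lda, n), sva(n), v(ldv, n), work(lwork)
   integer :: info, i, j
   external :: sgesvj

   ! Hypothetical test matrix, chosen only for illustration.
   do j = 1, n
      do i = 1, m
         a(i, j) = 1.0 / real(i + j - 1)
      end do
   end do

   ! 'G': general matrix, 'U': compute U in A, 'V': compute V.
   ! MV is not referenced for JOBV = 'V', so 0 is passed.
   call sgesvj('G', 'U', 'V', m, n, a, lda, sva, 0, v, ldv, &
               work, lwork, info)

   if (info < 0) then
      print *, 'argument number', -info, 'had an illegal value'
      stop 1
   else if (info > 0) then
      print *, 'no convergence within the sweep limit'
   end if

   ! WORK(1) is the scale factor SKL. The product below is safe only
   ! when SKL = 1; otherwise some values may overflow or underflow,
   ! which is why the routine returns the factored form.
   print *, 'scale factor SKL        :', work(1)
   print *, 'singular values         :', work(1) * sva
   print *, 'nonzero values (N4)     :', nint(work(2))
   print *, 'values above SFMIN (N2) :', nint(work(3))
   print *, 'sweeps performed        :', nint(work(4))
   print *, 'largest scaled pivot    :', work(5)
   print *, 'largest |sine|          :', work(6)
end program sgesvj_work_demo
```

The counts are stored as reals, so the sketch converts them back with `nint` before printing.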

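The sorting step of the listing (loop 5991) is a selection sort: for each position p it locates the largest remaining entry of SVA with ISAMAX and swaps both the value and the matching columns into place. The self-contained sketch below mirrors that idea in free-form Fortran without calling BLAS. The module and subroutine names are hypothetical, `maxloc` stands in for ISAMAX, and the companion swaps of WORK and V are omitted for brevity.

```fortran
module sort_sketch
   implicit none
contains
   ! Hypothetical helper: orders sva by non-increasing absolute value,
   ! applies the same permutation to the columns of a, and counts the
   ! nonzero entries (the role of N4 in the listing).
   subroutine sort_descending(sva, a, n_nonzero)
      real, intent(inout) :: sva(:), a(:, :)
      integer, intent(out) :: n_nonzero
      real :: tmp, col(size(a, 1))
      integer :: p, q, n

      n = size(sva)
      do p = 1, n - 1
         ! Index of the largest |sva(k)| for k = p..n (first one on ties).
         q = maxloc(abs(sva(p:n)), dim=1) + p - 1
         if (p /= q) then
            tmp = sva(p); sva(p) = sva(q); sva(q) = tmp
            col = a(:, p); a(:, p) = a(:, q); a(:, q) = col
         end if
      end do
      n_nonzero = count(sva /= 0.0)
   end subroutine sort_descending
end module sort_sketch

program sort_demo
   use sort_sketch
   implicit none
   real :: sva(4), a(2, 4)
   integer :: j, n4

   ! Illustrative values only.
   sva = [0.5, 3.0, 0.0, 2.0]
   do j = 1, 4
      a(:, j) = real(j)   ! tag each column with its original index
   end do

   call sort_descending(sva, a, n4)

   ! Self-checks: values in order 3, 2, 0.5, 0; columns follow as 2, 4, 1, 3.
   if (any(sva /= [3.0, 2.0, 0.5, 0.0])) error stop 'values not sorted'
   if (any(a(1, :) /= [2.0, 4.0, 1.0, 3.0])) error stop 'columns not permuted'
   if (n4 /= 3) error stop 'wrong nonzero count'
   print *, 'sorted values:', sva
   print *, 'nonzero count:', n4
end program sort_demo
```

Swapping the columns together with the values keeps each singular vector paired with its singular value, which is why the listing calls SSWAP on A (and on V when right vectors are requested) inside the same branch.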