LAPACK 3.3.0
|
00001 SUBROUTINE SGEJSV( JOBA, JOBU, JOBV, JOBR, JOBT, JOBP, 00002 & M, N, A, LDA, SVA, U, LDU, V, LDV, 00003 & WORK, LWORK, IWORK, INFO ) 00004 * 00005 * -- LAPACK routine (version 3.3.0) -- 00006 * 00007 * -- Contributed by Zlatko Drmac of the University of Zagreb and -- 00008 * -- Kresimir Veselic of the Fernuniversitaet Hagen -- 00009 * November 2010 00010 * 00011 * -- LAPACK is a software package provided by Univ. of Tennessee, -- 00012 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..-- 00013 * 00014 * This routine is also part of SIGMA (version 1.23, October 23. 2008.) 00015 * SIGMA is a library of algorithms for highly accurate algorithms for 00016 * computation of SVD, PSVD, QSVD, (H,K)-SVD, and for solution of the 00017 * eigenvalue problems Hx = lambda M x, H M x = lambda x with H, M > 0. 00018 * 00019 * .. Scalar Arguments .. 00020 * 00021 IMPLICIT NONE 00022 INTEGER INFO, LDA, LDU, LDV, LWORK, M, N 00023 * 00024 * .. Array Arguments .. 00025 * 00026 REAL A( LDA, * ), SVA( N ), U( LDU, * ), V( LDV, * ), 00027 & WORK( LWORK ) 00028 INTEGER IWORK( * ) 00029 CHARACTER*1 JOBA, JOBP, JOBR, JOBT, JOBU, JOBV 00030 * .. 00031 * 00032 * Purpose 00033 * ======= 00034 * SGEJSV computes the singular value decomposition (SVD) of a real M-by-N 00035 * matrix [A], where M >= N. The SVD of [A] is written as 00036 * 00037 * [A] = [U] * [SIGMA] * [V]^t, 00038 * 00039 * where [SIGMA] is an N-by-N (M-by-N) matrix which is zero except for its N 00040 * diagonal elements, [U] is an M-by-N (or M-by-M) orthonormal matrix, and 00041 * [V] is an N-by-N orthogonal matrix. The diagonal elements of [SIGMA] are 00042 * the singular values of [A]. The columns of [U] and [V] are the left and 00043 * the right singular vectors of [A], respectively. The matrices [U] and [V] 00044 * are computed and stored in the arrays U and V, respectively. The diagonal 00045 * of [SIGMA] is computed and stored in the array SVA. 00046 * 00047 * Arguments 00048 * ========= 00049 * 00050 * JOBA (input) CHARACTER*1 00051 * Specifies the level of accuracy: 00052 * = 'C': This option works well (high relative accuracy) if A = B * D, 00053 * with well-conditioned B and arbitrary diagonal matrix D. 00054 * The accuracy cannot be spoiled by COLUMN scaling. The 00055 * accuracy of the computed output depends on the condition of 00056 * B, and the procedure aims at the best theoretical accuracy. 00057 * The relative error max_{i=1:N}|d sigma_i| / sigma_i is 00058 * bounded by f(M,N)*epsilon* cond(B), independent of D. 00059 * The input matrix is preprocessed with the QRF with column 00060 * pivoting. This initial preprocessing and preconditioning by 00061 * a rank revealing QR factorization is common for all values of 00062 * JOBA. Additional actions are specified as follows: 00063 * = 'E': Computation as with 'C' with an additional estimate of the 00064 * condition number of B. It provides a realistic error bound. 00065 * = 'F': If A = D1 * C * D2 with ill-conditioned diagonal scalings 00066 * D1, D2, and well-conditioned matrix C, this option gives 00067 * higher accuracy than the 'C' option. If the structure of the 00068 * input matrix is not known, and relative accuracy is 00069 * desirable, then this option is advisable. The input matrix A 00070 * is preprocessed with QR factorization with FULL (row and 00071 * column) pivoting. 00072 * = 'G' Computation as with 'F' with an additional estimate of the 00073 * condition number of B, where A=D*B. If A has heavily weighted 00074 * rows, then using this condition number gives too pessimistic 00075 * error bound. 00076 * = 'A': Small singular values are the noise and the matrix is treated 00077 * as numerically rank defficient. The error in the computed 00078 * singular values is bounded by f(m,n)*epsilon*||A||. 00079 * The computed SVD A = U * S * V^t restores A up to 00080 * f(m,n)*epsilon*||A||. 00081 * This gives the procedure the licence to discard (set to zero) 00082 * all singular values below N*epsilon*||A||. 00083 * = 'R': Similar as in 'A'. Rank revealing property of the initial 00084 * QR factorization is used do reveal (using triangular factor) 00085 * a gap sigma_{r+1} < epsilon * sigma_r in which case the 00086 * numerical RANK is declared to be r. The SVD is computed with 00087 * absolute error bounds, but more accurately than with 'A'. 00088 * 00089 * JOBU (input) CHARACTER*1 00090 * Specifies whether to compute the columns of U: 00091 * = 'U': N columns of U are returned in the array U. 00092 * = 'F': full set of M left sing. vectors is returned in the array U. 00093 * = 'W': U may be used as workspace of length M*N. See the description 00094 * of U. 00095 * = 'N': U is not computed. 00096 * 00097 * JOBV (input) CHARACTER*1 00098 * Specifies whether to compute the matrix V: 00099 * = 'V': N columns of V are returned in the array V; Jacobi rotations 00100 * are not explicitly accumulated. 00101 * = 'J': N columns of V are returned in the array V, but they are 00102 * computed as the product of Jacobi rotations. This option is 00103 * allowed only if JOBU .NE. 'N', i.e. in computing the full SVD. 00104 * = 'W': V may be used as workspace of length N*N. See the description 00105 * of V. 00106 * = 'N': V is not computed. 00107 * 00108 * JOBR (input) CHARACTER*1 00109 * Specifies the RANGE for the singular values. Issues the licence to 00110 * set to zero small positive singular values if they are outside 00111 * specified range. If A .NE. 0 is scaled so that the largest singular 00112 * value of c*A is around SQRT(BIG), BIG=SLAMCH('O'), then JOBR issues 00113 * the licence to kill columns of A whose norm in c*A is less than 00114 * SQRT(SFMIN) (for JOBR.EQ.'R'), or less than SMALL=SFMIN/EPSLN, 00115 * where SFMIN=SLAMCH('S'), EPSLN=SLAMCH('E'). 00116 * = 'N': Do not kill small columns of c*A. This option assumes that 00117 * BLAS and QR factorizations and triangular solvers are 00118 * implemented to work in that range. If the condition of A 00119 * is greater than BIG, use SGESVJ. 00120 * = 'R': RESTRICTED range for sigma(c*A) is [SQRT(SFMIN), SQRT(BIG)] 00121 * (roughly, as described above). This option is recommended. 00122 * =========================== 00123 * For computing the singular values in the FULL range [SFMIN,BIG] 00124 * use SGESVJ. 00125 * 00126 * JOBT (input) CHARACTER*1 00127 * If the matrix is square then the procedure may determine to use 00128 * transposed A if A^t seems to be better with respect to convergence. 00129 * If the matrix is not square, JOBT is ignored. This is subject to 00130 * changes in the future. 00131 * The decision is based on two values of entropy over the adjoint 00132 * orbit of A^t * A. See the descriptions of WORK(6) and WORK(7). 00133 * = 'T': transpose if entropy test indicates possibly faster 00134 * convergence of Jacobi process if A^t is taken as input. If A is 00135 * replaced with A^t, then the row pivoting is included automatically. 00136 * = 'N': do not speculate. 00137 * This option can be used to compute only the singular values, or the 00138 * full SVD (U, SIGMA and V). For only one set of singular vectors 00139 * (U or V), the caller should provide both U and V, as one of the 00140 * matrices is used as workspace if the matrix A is transposed. 00141 * The implementer can easily remove this constraint and make the 00142 * code more complicated. See the descriptions of U and V. 00143 * 00144 * JOBP (input) CHARACTER*1 00145 * Issues the licence to introduce structured perturbations to drown 00146 * denormalized numbers. This licence should be active if the 00147 * denormals are poorly implemented, causing slow computation, 00148 * especially in cases of fast convergence (!). For details see [1,2]. 00149 * For the sake of simplicity, this perturbations are included only 00150 * when the full SVD or only the singular values are requested. The 00151 * implementer/user can easily add the perturbation for the cases of 00152 * computing one set of singular vectors. 00153 * = 'P': introduce perturbation 00154 * = 'N': do not perturb 00155 * 00156 * M (input) INTEGER 00157 * The number of rows of the input matrix A. M >= 0. 00158 * 00159 * N (input) INTEGER 00160 * The number of columns of the input matrix A. M >= N >= 0. 00161 * 00162 * A (input/workspace) REAL array, dimension (LDA,N) 00163 * On entry, the M-by-N matrix A. 00164 * 00165 * LDA (input) INTEGER 00166 * The leading dimension of the array A. LDA >= max(1,M). 00167 * 00168 * SVA (workspace/output) REAL array, dimension (N) 00169 * On exit, 00170 * - For WORK(1)/WORK(2) = ONE: The singular values of A. During the 00171 * computation SVA contains Euclidean column norms of the 00172 * iterated matrices in the array A. 00173 * - For WORK(1) .NE. WORK(2): The singular values of A are 00174 * (WORK(1)/WORK(2)) * SVA(1:N). This factored form is used if 00175 * sigma_max(A) overflows or if small singular values have been 00176 * saved from underflow by scaling the input matrix A. 00177 * - If JOBR='R' then some of the singular values may be returned 00178 * as exact zeros obtained by "set to zero" because they are 00179 * below the numerical rank threshold or are denormalized numbers. 00180 * 00181 * U (workspace/output) REAL array, dimension ( LDU, N ) 00182 * If JOBU = 'U', then U contains on exit the M-by-N matrix of 00183 * the left singular vectors. 00184 * If JOBU = 'F', then U contains on exit the M-by-M matrix of 00185 * the left singular vectors, including an ONB 00186 * of the orthogonal complement of the Range(A). 00187 * If JOBU = 'W' .AND. (JOBV.EQ.'V' .AND. JOBT.EQ.'T' .AND. M.EQ.N), 00188 * then U is used as workspace if the procedure 00189 * replaces A with A^t. In that case, [V] is computed 00190 * in U as left singular vectors of A^t and then 00191 * copied back to the V array. This 'W' option is just 00192 * a reminder to the caller that in this case U is 00193 * reserved as workspace of length N*N. 00194 * If JOBU = 'N' U is not referenced. 00195 * 00196 * LDU (input) INTEGER 00197 * The leading dimension of the array U, LDU >= 1. 00198 * IF JOBU = 'U' or 'F' or 'W', then LDU >= M. 00199 * 00200 * V (workspace/output) REAL array, dimension ( LDV, N ) 00201 * If JOBV = 'V', 'J' then V contains on exit the N-by-N matrix of 00202 * the right singular vectors; 00203 * If JOBV = 'W', AND (JOBU.EQ.'U' AND JOBT.EQ.'T' AND M.EQ.N), 00204 * then V is used as workspace if the pprocedure 00205 * replaces A with A^t. In that case, [U] is computed 00206 * in V as right singular vectors of A^t and then 00207 * copied back to the U array. This 'W' option is just 00208 * a reminder to the caller that in this case V is 00209 * reserved as workspace of length N*N. 00210 * If JOBV = 'N' V is not referenced. 00211 * 00212 * LDV (input) INTEGER 00213 * The leading dimension of the array V, LDV >= 1. 00214 * If JOBV = 'V' or 'J' or 'W', then LDV >= N. 00215 * 00216 * WORK (workspace/output) REAL array, dimension at least LWORK. 00217 * On exit, 00218 * WORK(1) = SCALE = WORK(2) / WORK(1) is the scaling factor such 00219 * that SCALE*SVA(1:N) are the computed singular values 00220 * of A. (See the description of SVA().) 00221 * WORK(2) = See the description of WORK(1). 00222 * WORK(3) = SCONDA is an estimate for the condition number of 00223 * column equilibrated A. (If JOBA .EQ. 'E' or 'G') 00224 * SCONDA is an estimate of SQRT(||(R^t * R)^(-1)||_1). 00225 * It is computed using SPOCON. It holds 00226 * N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA 00227 * where R is the triangular factor from the QRF of A. 00228 * However, if R is truncated and the numerical rank is 00229 * determined to be strictly smaller than N, SCONDA is 00230 * returned as -1, thus indicating that the smallest 00231 * singular values might be lost. 00232 * 00233 * If full SVD is needed, the following two condition numbers are 00234 * useful for the analysis of the algorithm. They are provied for 00235 * a developer/implementer who is familiar with the details of 00236 * the method. 00237 * 00238 * WORK(4) = an estimate of the scaled condition number of the 00239 * triangular factor in the first QR factorization. 00240 * WORK(5) = an estimate of the scaled condition number of the 00241 * triangular factor in the second QR factorization. 00242 * The following two parameters are computed if JOBT .EQ. 'T'. 00243 * They are provided for a developer/implementer who is familiar 00244 * with the details of the method. 00245 * 00246 * WORK(6) = the entropy of A^t*A :: this is the Shannon entropy 00247 * of diag(A^t*A) / Trace(A^t*A) taken as point in the 00248 * probability simplex. 00249 * WORK(7) = the entropy of A*A^t. 00250 * 00251 * LWORK (input) INTEGER 00252 * Length of WORK to confirm proper allocation of work space. 00253 * LWORK depends on the job: 00254 * 00255 * If only SIGMA is needed ( JOBU.EQ.'N', JOBV.EQ.'N' ) and 00256 * -> .. no scaled condition estimate required ( JOBE.EQ.'N'): 00257 * LWORK >= max(2*M+N,4*N+1,7). This is the minimal requirement. 00258 * For optimal performance (blocked code) the optimal value 00259 * is LWORK >= max(2*M+N,3*N+(N+1)*NB,7). Here NB is the optimal 00260 * block size for xGEQP3/xGEQRF. 00261 * -> .. an estimate of the scaled condition number of A is 00262 * required (JOBA='E', 'G'). In this case, LWORK is the maximum 00263 * of the above and N*N+4*N, i.e. LWORK >= max(2*M+N,N*N+4N,7). 00264 * 00265 * If SIGMA and the right singular vectors are needed (JOBV.EQ.'V'), 00266 * -> the minimal requirement is LWORK >= max(2*N+M,7). 00267 * -> For optimal performance, LWORK >= max(2*N+M,2*N+N*NB,7), 00268 * where NB is the optimal block size. 00269 * 00270 * If SIGMA and the left singular vectors are needed 00271 * -> the minimal requirement is LWORK >= max(2*N+M,7). 00272 * -> For optimal performance, LWORK >= max(2*N+M,2*N+N*NB,7), 00273 * where NB is the optimal block size. 00274 * 00275 * If full SVD is needed ( JOBU.EQ.'U' or 'F', JOBV.EQ.'V' ) and 00276 * -> .. the singular vectors are computed without explicit 00277 * accumulation of the Jacobi rotations, LWORK >= 6*N+2*N*N 00278 * -> .. in the iterative part, the Jacobi rotations are 00279 * explicitly accumulated (option, see the description of JOBV), 00280 * then the minimal requirement is LWORK >= max(M+3*N+N*N,7). 00281 * For better performance, if NB is the optimal block size, 00282 * LWORK >= max(3*N+N*N+M,3*N+N*N+N*NB,7). 00283 * 00284 * IWORK (workspace/output) INTEGER array, dimension M+3*N. 00285 * On exit, 00286 * IWORK(1) = the numerical rank determined after the initial 00287 * QR factorization with pivoting. See the descriptions 00288 * of JOBA and JOBR. 00289 * IWORK(2) = the number of the computed nonzero singular values 00290 * IWORK(3) = if nonzero, a warning message: 00291 * If IWORK(3).EQ.1 then some of the column norms of A 00292 * were denormalized floats. The requested high accuracy 00293 * is not warranted by the data. 00294 * 00295 * INFO (output) INTEGER 00296 * < 0 : if INFO = -i, then the i-th argument had an illegal value. 00297 * = 0 : successfull exit; 00298 * > 0 : SGEJSV did not converge in the maximal allowed number 00299 * of sweeps. The computed values may be inaccurate. 00300 * 00301 * Further Details 00302 * =============== 00303 * 00304 * SGEJSV implements a preconditioned Jacobi SVD algorithm. It uses SGEQP3, 00305 * SGEQRF, and SGELQF as preprocessors and preconditioners. Optionally, an 00306 * additional row pivoting can be used as a preprocessor, which in some 00307 * cases results in much higher accuracy. An example is matrix A with the 00308 * structure A = D1 * C * D2, where D1, D2 are arbitrarily ill-conditioned 00309 * diagonal matrices and C is well-conditioned matrix. In that case, complete 00310 * pivoting in the first QR factorizations provides accuracy dependent on the 00311 * condition number of C, and independent of D1, D2. Such higher accuracy is 00312 * not completely understood theoretically, but it works well in practice. 00313 * Further, if A can be written as A = B*D, with well-conditioned B and some 00314 * diagonal D, then the high accuracy is guaranteed, both theoretically and 00315 * in software, independent of D. For more details see [1], [2]. 00316 * The computational range for the singular values can be the full range 00317 * ( UNDERFLOW,OVERFLOW ), provided that the machine arithmetic and the BLAS 00318 * & LAPACK routines called by SGEJSV are implemented to work in that range. 00319 * If that is not the case, then the restriction for safe computation with 00320 * the singular values in the range of normalized IEEE numbers is that the 00321 * spectral condition number kappa(A)=sigma_max(A)/sigma_min(A) does not 00322 * overflow. This code (SGEJSV) is best used in this restricted range, 00323 * meaning that singular values of magnitude below ||A||_2 / SLAMCH('O') are 00324 * returned as zeros. See JOBR for details on this. 00325 * Further, this implementation is somewhat slower than the one described 00326 * in [1,2] due to replacement of some non-LAPACK components, and because 00327 * the choice of some tuning parameters in the iterative part (SGESVJ) is 00328 * left to the implementer on a particular machine. 00329 * The rank revealing QR factorization (in this code: SGEQP3) should be 00330 * implemented as in [3]. We have a new version of SGEQP3 under development 00331 * that is more robust than the current one in LAPACK, with a cleaner cut in 00332 * rank defficient cases. It will be available in the SIGMA library [4]. 00333 * If M is much larger than N, it is obvious that the inital QRF with 00334 * column pivoting can be preprocessed by the QRF without pivoting. That 00335 * well known trick is not used in SGEJSV because in some cases heavy row 00336 * weighting can be treated with complete pivoting. The overhead in cases 00337 * M much larger than N is then only due to pivoting, but the benefits in 00338 * terms of accuracy have prevailed. The implementer/user can incorporate 00339 * this extra QRF step easily. The implementer can also improve data movement 00340 * (matrix transpose, matrix copy, matrix transposed copy) - this 00341 * implementation of SGEJSV uses only the simplest, naive data movement. 00342 * 00343 * Contributors 00344 * 00345 * Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany) 00346 * 00347 * References 00348 * 00349 * [1] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I. 00350 * SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342. 00351 * LAPACK Working note 169. 00352 * [2] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II. 00353 * SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362. 00354 * LAPACK Working note 170. 00355 * [3] Z. Drmac and Z. Bujanovic: On the failure of rank-revealing QR 00356 * factorization software - a case study. 00357 * ACM Trans. math. Softw. Vol. 35, No 2 (2008), pp. 1-28. 00358 * LAPACK Working note 176. 00359 * [4] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV, 00360 * QSVD, (H,K)-SVD computations. 00361 * Department of Mathematics, University of Zagreb, 2008. 00362 * 00363 * Bugs, examples and comments 00364 * 00365 * Please report all bugs and send interesting examples and/or comments to 00366 * drmac@math.hr. Thank you. 00367 * 00368 * =========================================================================== 00369 * 00370 * .. Local Parameters .. 00371 REAL ZERO, ONE 00372 PARAMETER ( ZERO = 0.0E0, ONE = 1.0E0 ) 00373 * .. 00374 * .. Local Scalars .. 00375 REAL AAPP, AAQQ, AATMAX, AATMIN, BIG, BIG1, COND_OK, 00376 & CONDR1, CONDR2, ENTRA, ENTRAT, EPSLN, MAXPRJ, SCALEM, 00377 & SCONDA, SFMIN, SMALL, TEMP1, USCAL1, USCAL2, XSC 00378 INTEGER IERR, N1, NR, NUMRANK, p, q, WARNING 00379 LOGICAL ALMORT, DEFR, ERREST, GOSCAL, JRACC, KILL, LSVEC, 00380 & L2ABER, L2KILL, L2PERT, L2RANK, L2TRAN, 00381 & NOSCAL, ROWPIV, RSVEC, TRANSP 00382 * .. 00383 * .. Intrinsic Functions .. 00384 INTRINSIC ABS, ALOG, AMAX1, AMIN1, FLOAT, 00385 & MAX0, MIN0, NINT, SIGN, SQRT 00386 * .. 00387 * .. External Functions .. 00388 REAL SLAMCH, SNRM2 00389 INTEGER ISAMAX 00390 LOGICAL LSAME 00391 EXTERNAL ISAMAX, LSAME, SLAMCH, SNRM2 00392 * .. 00393 * .. External Subroutines .. 00394 EXTERNAL SCOPY, SGELQF, SGEQP3, SGEQRF, SLACPY, SLASCL, 00395 & SLASET, SLASSQ, SLASWP, SORGQR, SORMLQ, 00396 & SORMQR, SPOCON, SSCAL, SSWAP, STRSM, XERBLA 00397 * 00398 EXTERNAL SGESVJ 00399 * .. 00400 * 00401 * Test the input arguments 00402 * 00403 LSVEC = LSAME( JOBU, 'U' ) .OR. LSAME( JOBU, 'F' ) 00404 JRACC = LSAME( JOBV, 'J' ) 00405 RSVEC = LSAME( JOBV, 'V' ) .OR. JRACC 00406 ROWPIV = LSAME( JOBA, 'F' ) .OR. LSAME( JOBA, 'G' ) 00407 L2RANK = LSAME( JOBA, 'R' ) 00408 L2ABER = LSAME( JOBA, 'A' ) 00409 ERREST = LSAME( JOBA, 'E' ) .OR. LSAME( JOBA, 'G' ) 00410 L2TRAN = LSAME( JOBT, 'T' ) 00411 L2KILL = LSAME( JOBR, 'R' ) 00412 DEFR = LSAME( JOBR, 'N' ) 00413 L2PERT = LSAME( JOBP, 'P' ) 00414 * 00415 IF ( .NOT.(ROWPIV .OR. L2RANK .OR. L2ABER .OR. 00416 & ERREST .OR. LSAME( JOBA, 'C' ) )) THEN 00417 INFO = - 1 00418 ELSE IF ( .NOT.( LSVEC .OR. LSAME( JOBU, 'N' ) .OR. 00419 & LSAME( JOBU, 'W' )) ) THEN 00420 INFO = - 2 00421 ELSE IF ( .NOT.( RSVEC .OR. LSAME( JOBV, 'N' ) .OR. 00422 & LSAME( JOBV, 'W' )) .OR. ( JRACC .AND. (.NOT.LSVEC) ) ) THEN 00423 INFO = - 3 00424 ELSE IF ( .NOT. ( L2KILL .OR. DEFR ) ) THEN 00425 INFO = - 4 00426 ELSE IF ( .NOT. ( L2TRAN .OR. LSAME( JOBT, 'N' ) ) ) THEN 00427 INFO = - 5 00428 ELSE IF ( .NOT. ( L2PERT .OR. LSAME( JOBP, 'N' ) ) ) THEN 00429 INFO = - 6 00430 ELSE IF ( M .LT. 0 ) THEN 00431 INFO = - 7 00432 ELSE IF ( ( N .LT. 0 ) .OR. ( N .GT. M ) ) THEN 00433 INFO = - 8 00434 ELSE IF ( LDA .LT. M ) THEN 00435 INFO = - 10 00436 ELSE IF ( LSVEC .AND. ( LDU .LT. M ) ) THEN 00437 INFO = - 13 00438 ELSE IF ( RSVEC .AND. ( LDV .LT. N ) ) THEN 00439 INFO = - 14 00440 ELSE IF ( (.NOT.(LSVEC .OR. RSVEC .OR. ERREST).AND. 00441 & (LWORK .LT. MAX0(7,4*N+1,2*M+N))) .OR. 00442 & (.NOT.(LSVEC .OR. LSVEC) .AND. ERREST .AND. 00443 & (LWORK .LT. MAX0(7,4*N+N*N,2*M+N))) .OR. 00444 & (LSVEC .AND. (.NOT.RSVEC) .AND. (LWORK .LT. MAX0(7,2*N+M))) .OR. 00445 & (RSVEC .AND. (.NOT.LSVEC) .AND. (LWORK .LT. MAX0(7,2*N+M))) .OR. 00446 & (LSVEC .AND. RSVEC .AND. .NOT.JRACC .AND. (LWORK.LT.6*N+2*N*N)) 00447 & .OR. (LSVEC.AND.RSVEC.AND.JRACC.AND.LWORK.LT.MAX0(7,M+3*N+N*N))) 00448 & THEN 00449 INFO = - 17 00450 ELSE 00451 * #:) 00452 INFO = 0 00453 END IF 00454 * 00455 IF ( INFO .NE. 0 ) THEN 00456 * #:( 00457 CALL XERBLA( 'SGEJSV', - INFO ) 00458 END IF 00459 * 00460 * Quick return for void matrix (Y3K safe) 00461 * #:) 00462 IF ( ( M .EQ. 0 ) .OR. ( N .EQ. 0 ) ) RETURN 00463 * 00464 * Determine whether the matrix U should be M x N or M x M 00465 * 00466 IF ( LSVEC ) THEN 00467 N1 = N 00468 IF ( LSAME( JOBU, 'F' ) ) N1 = M 00469 END IF 00470 * 00471 * Set numerical parameters 00472 * 00473 *! NOTE: Make sure SLAMCH() does not fail on the target architecture. 00474 * 00475 EPSLN = SLAMCH('Epsilon') 00476 SFMIN = SLAMCH('SafeMinimum') 00477 SMALL = SFMIN / EPSLN 00478 BIG = SLAMCH('O') 00479 * 00480 * Initialize SVA(1:N) = diag( ||A e_i||_2 )_1^N 00481 * 00482 *(!) If necessary, scale SVA() to protect the largest norm from 00483 * overflow. It is possible that this scaling pushes the smallest 00484 * column norm left from the underflow threshold (extreme case). 00485 * 00486 SCALEM = ONE / SQRT(FLOAT(M)*FLOAT(N)) 00487 NOSCAL = .TRUE. 00488 GOSCAL = .TRUE. 00489 DO 1874 p = 1, N 00490 AAPP = ZERO 00491 AAQQ = ONE 00492 CALL SLASSQ( M, A(1,p), 1, AAPP, AAQQ ) 00493 IF ( AAPP .GT. BIG ) THEN 00494 INFO = - 9 00495 CALL XERBLA( 'SGEJSV', -INFO ) 00496 RETURN 00497 END IF 00498 AAQQ = SQRT(AAQQ) 00499 IF ( ( AAPP .LT. (BIG / AAQQ) ) .AND. NOSCAL ) THEN 00500 SVA(p) = AAPP * AAQQ 00501 ELSE 00502 NOSCAL = .FALSE. 00503 SVA(p) = AAPP * ( AAQQ * SCALEM ) 00504 IF ( GOSCAL ) THEN 00505 GOSCAL = .FALSE. 00506 CALL SSCAL( p-1, SCALEM, SVA, 1 ) 00507 END IF 00508 END IF 00509 1874 CONTINUE 00510 * 00511 IF ( NOSCAL ) SCALEM = ONE 00512 * 00513 AAPP = ZERO 00514 AAQQ = BIG 00515 DO 4781 p = 1, N 00516 AAPP = AMAX1( AAPP, SVA(p) ) 00517 IF ( SVA(p) .NE. ZERO ) AAQQ = AMIN1( AAQQ, SVA(p) ) 00518 4781 CONTINUE 00519 * 00520 * Quick return for zero M x N matrix 00521 * #:) 00522 IF ( AAPP .EQ. ZERO ) THEN 00523 IF ( LSVEC ) CALL SLASET( 'G', M, N1, ZERO, ONE, U, LDU ) 00524 IF ( RSVEC ) CALL SLASET( 'G', N, N, ZERO, ONE, V, LDV ) 00525 WORK(1) = ONE 00526 WORK(2) = ONE 00527 IF ( ERREST ) WORK(3) = ONE 00528 IF ( LSVEC .AND. RSVEC ) THEN 00529 WORK(4) = ONE 00530 WORK(5) = ONE 00531 END IF 00532 IF ( L2TRAN ) THEN 00533 WORK(6) = ZERO 00534 WORK(7) = ZERO 00535 END IF 00536 IWORK(1) = 0 00537 IWORK(2) = 0 00538 RETURN 00539 END IF 00540 * 00541 * Issue warning if denormalized column norms detected. Override the 00542 * high relative accuracy request. Issue licence to kill columns 00543 * (set them to zero) whose norm is less than sigma_max / BIG (roughly). 00544 * #:( 00545 WARNING = 0 00546 IF ( AAQQ .LE. SFMIN ) THEN 00547 L2RANK = .TRUE. 00548 L2KILL = .TRUE. 00549 WARNING = 1 00550 END IF 00551 * 00552 * Quick return for one-column matrix 00553 * #:) 00554 IF ( N .EQ. 1 ) THEN 00555 * 00556 IF ( LSVEC ) THEN 00557 CALL SLASCL( 'G',0,0,SVA(1),SCALEM, M,1,A(1,1),LDA,IERR ) 00558 CALL SLACPY( 'A', M, 1, A, LDA, U, LDU ) 00559 * computing all M left singular vectors of the M x 1 matrix 00560 IF ( N1 .NE. N ) THEN 00561 CALL SGEQRF( M, N, U,LDU, WORK, WORK(N+1),LWORK-N,IERR ) 00562 CALL SORGQR( M,N1,1, U,LDU,WORK,WORK(N+1),LWORK-N,IERR ) 00563 CALL SCOPY( M, A(1,1), 1, U(1,1), 1 ) 00564 END IF 00565 END IF 00566 IF ( RSVEC ) THEN 00567 V(1,1) = ONE 00568 END IF 00569 IF ( SVA(1) .LT. (BIG*SCALEM) ) THEN 00570 SVA(1) = SVA(1) / SCALEM 00571 SCALEM = ONE 00572 END IF 00573 WORK(1) = ONE / SCALEM 00574 WORK(2) = ONE 00575 IF ( SVA(1) .NE. ZERO ) THEN 00576 IWORK(1) = 1 00577 IF ( ( SVA(1) / SCALEM) .GE. SFMIN ) THEN 00578 IWORK(2) = 1 00579 ELSE 00580 IWORK(2) = 0 00581 END IF 00582 ELSE 00583 IWORK(1) = 0 00584 IWORK(2) = 0 00585 END IF 00586 IF ( ERREST ) WORK(3) = ONE 00587 IF ( LSVEC .AND. RSVEC ) THEN 00588 WORK(4) = ONE 00589 WORK(5) = ONE 00590 END IF 00591 IF ( L2TRAN ) THEN 00592 WORK(6) = ZERO 00593 WORK(7) = ZERO 00594 END IF 00595 RETURN 00596 * 00597 END IF 00598 * 00599 TRANSP = .FALSE. 00600 L2TRAN = L2TRAN .AND. ( M .EQ. N ) 00601 * 00602 AATMAX = -ONE 00603 AATMIN = BIG 00604 IF ( ROWPIV .OR. L2TRAN ) THEN 00605 * 00606 * Compute the row norms, needed to determine row pivoting sequence 00607 * (in the case of heavily row weighted A, row pivoting is strongly 00608 * advised) and to collect information needed to compare the 00609 * structures of A * A^t and A^t * A (in the case L2TRAN.EQ..TRUE.). 00610 * 00611 IF ( L2TRAN ) THEN 00612 DO 1950 p = 1, M 00613 XSC = ZERO 00614 TEMP1 = ONE 00615 CALL SLASSQ( N, A(p,1), LDA, XSC, TEMP1 ) 00616 * SLASSQ gets both the ell_2 and the ell_infinity norm 00617 * in one pass through the vector 00618 WORK(M+N+p) = XSC * SCALEM 00619 WORK(N+p) = XSC * (SCALEM*SQRT(TEMP1)) 00620 AATMAX = AMAX1( AATMAX, WORK(N+p) ) 00621 IF (WORK(N+p) .NE. ZERO) AATMIN = AMIN1(AATMIN,WORK(N+p)) 00622 1950 CONTINUE 00623 ELSE 00624 DO 1904 p = 1, M 00625 WORK(M+N+p) = SCALEM*ABS( A(p,ISAMAX(N,A(p,1),LDA)) ) 00626 AATMAX = AMAX1( AATMAX, WORK(M+N+p) ) 00627 AATMIN = AMIN1( AATMIN, WORK(M+N+p) ) 00628 1904 CONTINUE 00629 END IF 00630 * 00631 END IF 00632 * 00633 * For square matrix A try to determine whether A^t would be better 00634 * input for the preconditioned Jacobi SVD, with faster convergence. 00635 * The decision is based on an O(N) function of the vector of column 00636 * and row norms of A, based on the Shannon entropy. This should give 00637 * the right choice in most cases when the difference actually matters. 00638 * It may fail and pick the slower converging side. 00639 * 00640 ENTRA = ZERO 00641 ENTRAT = ZERO 00642 IF ( L2TRAN ) THEN 00643 * 00644 XSC = ZERO 00645 TEMP1 = ONE 00646 CALL SLASSQ( N, SVA, 1, XSC, TEMP1 ) 00647 TEMP1 = ONE / TEMP1 00648 * 00649 ENTRA = ZERO 00650 DO 1113 p = 1, N 00651 BIG1 = ( ( SVA(p) / XSC )**2 ) * TEMP1 00652 IF ( BIG1 .NE. ZERO ) ENTRA = ENTRA + BIG1 * ALOG(BIG1) 00653 1113 CONTINUE 00654 ENTRA = - ENTRA / ALOG(FLOAT(N)) 00655 * 00656 * Now, SVA().^2/Trace(A^t * A) is a point in the probability simplex. 00657 * It is derived from the diagonal of A^t * A. Do the same with the 00658 * diagonal of A * A^t, compute the entropy of the corresponding 00659 * probability distribution. Note that A * A^t and A^t * A have the 00660 * same trace. 00661 * 00662 ENTRAT = ZERO 00663 DO 1114 p = N+1, N+M 00664 BIG1 = ( ( WORK(p) / XSC )**2 ) * TEMP1 00665 IF ( BIG1 .NE. ZERO ) ENTRAT = ENTRAT + BIG1 * ALOG(BIG1) 00666 1114 CONTINUE 00667 ENTRAT = - ENTRAT / ALOG(FLOAT(M)) 00668 * 00669 * Analyze the entropies and decide A or A^t. Smaller entropy 00670 * usually means better input for the algorithm. 00671 * 00672 TRANSP = ( ENTRAT .LT. ENTRA ) 00673 * 00674 * If A^t is better than A, transpose A. 00675 * 00676 IF ( TRANSP ) THEN 00677 * In an optimal implementation, this trivial transpose 00678 * should be replaced with faster transpose. 00679 DO 1115 p = 1, N - 1 00680 DO 1116 q = p + 1, N 00681 TEMP1 = A(q,p) 00682 A(q,p) = A(p,q) 00683 A(p,q) = TEMP1 00684 1116 CONTINUE 00685 1115 CONTINUE 00686 DO 1117 p = 1, N 00687 WORK(M+N+p) = SVA(p) 00688 SVA(p) = WORK(N+p) 00689 1117 CONTINUE 00690 TEMP1 = AAPP 00691 AAPP = AATMAX 00692 AATMAX = TEMP1 00693 TEMP1 = AAQQ 00694 AAQQ = AATMIN 00695 AATMIN = TEMP1 00696 KILL = LSVEC 00697 LSVEC = RSVEC 00698 RSVEC = KILL 00699 IF ( LSVEC ) N1 = N 00700 * 00701 ROWPIV = .TRUE. 00702 END IF 00703 * 00704 END IF 00705 * END IF L2TRAN 00706 * 00707 * Scale the matrix so that its maximal singular value remains less 00708 * than SQRT(BIG) -- the matrix is scaled so that its maximal column 00709 * has Euclidean norm equal to SQRT(BIG/N). The only reason to keep 00710 * SQRT(BIG) instead of BIG is the fact that SGEJSV uses LAPACK and 00711 * BLAS routines that, in some implementations, are not capable of 00712 * working in the full interval [SFMIN,BIG] and that they may provoke 00713 * overflows in the intermediate results. If the singular values spread 00714 * from SFMIN to BIG, then SGESVJ will compute them. So, in that case, 00715 * one should use SGESVJ instead of SGEJSV. 00716 * 00717 BIG1 = SQRT( BIG ) 00718 TEMP1 = SQRT( BIG / FLOAT(N) ) 00719 * 00720 CALL SLASCL( 'G', 0, 0, AAPP, TEMP1, N, 1, SVA, N, IERR ) 00721 IF ( AAQQ .GT. (AAPP * SFMIN) ) THEN 00722 AAQQ = ( AAQQ / AAPP ) * TEMP1 00723 ELSE 00724 AAQQ = ( AAQQ * TEMP1 ) / AAPP 00725 END IF 00726 TEMP1 = TEMP1 * SCALEM 00727 CALL SLASCL( 'G', 0, 0, AAPP, TEMP1, M, N, A, LDA, IERR ) 00728 * 00729 * To undo scaling at the end of this procedure, multiply the 00730 * computed singular values with USCAL2 / USCAL1. 00731 * 00732 USCAL1 = TEMP1 00733 USCAL2 = AAPP 00734 * 00735 IF ( L2KILL ) THEN 00736 * L2KILL enforces computation of nonzero singular values in 00737 * the restricted range of condition number of the initial A, 00738 * sigma_max(A) / sigma_min(A) approx. SQRT(BIG)/SQRT(SFMIN). 00739 XSC = SQRT( SFMIN ) 00740 ELSE 00741 XSC = SMALL 00742 * 00743 * Now, if the condition number of A is too big, 00744 * sigma_max(A) / sigma_min(A) .GT. SQRT(BIG/N) * EPSLN / SFMIN, 00745 * as a precaution measure, the full SVD is computed using SGESVJ 00746 * with accumulated Jacobi rotations. This provides numerically 00747 * more robust computation, at the cost of slightly increased run 00748 * time. Depending on the concrete implementation of BLAS and LAPACK 00749 * (i.e. how they behave in presence of extreme ill-conditioning) the 00750 * implementor may decide to remove this switch. 00751 IF ( ( AAQQ.LT.SQRT(SFMIN) ) .AND. LSVEC .AND. RSVEC ) THEN 00752 JRACC = .TRUE. 00753 END IF 00754 * 00755 END IF 00756 IF ( AAQQ .LT. XSC ) THEN 00757 DO 700 p = 1, N 00758 IF ( SVA(p) .LT. XSC ) THEN 00759 CALL SLASET( 'A', M, 1, ZERO, ZERO, A(1,p), LDA ) 00760 SVA(p) = ZERO 00761 END IF 00762 700 CONTINUE 00763 END IF 00764 * 00765 * Preconditioning using QR factorization with pivoting 00766 * 00767 IF ( ROWPIV ) THEN 00768 * Optional row permutation (Bjoerck row pivoting): 00769 * A result by Cox and Higham shows that the Bjoerck's 00770 * row pivoting combined with standard column pivoting 00771 * has similar effect as Powell-Reid complete pivoting. 00772 * The ell-infinity norms of A are made nonincreasing. 00773 DO 1952 p = 1, M - 1 00774 q = ISAMAX( M-p+1, WORK(M+N+p), 1 ) + p - 1 00775 IWORK(2*N+p) = q 00776 IF ( p .NE. q ) THEN 00777 TEMP1 = WORK(M+N+p) 00778 WORK(M+N+p) = WORK(M+N+q) 00779 WORK(M+N+q) = TEMP1 00780 END IF 00781 1952 CONTINUE 00782 CALL SLASWP( N, A, LDA, 1, M-1, IWORK(2*N+1), 1 ) 00783 END IF 00784 * 00785 * End of the preparation phase (scaling, optional sorting and 00786 * transposing, optional flushing of small columns). 00787 * 00788 * Preconditioning 00789 * 00790 * If the full SVD is needed, the right singular vectors are computed 00791 * from a matrix equation, and for that we need theoretical analysis 00792 * of the Businger-Golub pivoting. So we use SGEQP3 as the first RR QRF. 00793 * In all other cases the first RR QRF can be chosen by other criteria 00794 * (eg speed by replacing global with restricted window pivoting, such 00795 * as in SGEQPX from TOMS # 782). Good results will be obtained using 00796 * SGEQPX with properly (!) chosen numerical parameters. 00797 * Any improvement of SGEQP3 improves overal performance of SGEJSV. 00798 * 00799 * A * P1 = Q1 * [ R1^t 0]^t: 00800 DO 1963 p = 1, N 00801 * .. all columns are free columns 00802 IWORK(p) = 0 00803 1963 CONTINUE 00804 CALL SGEQP3( M,N,A,LDA, IWORK,WORK, WORK(N+1),LWORK-N, IERR ) 00805 * 00806 * The upper triangular matrix R1 from the first QRF is inspected for 00807 * rank deficiency and possibilities for deflation, or possible 00808 * ill-conditioning. Depending on the user specified flag L2RANK, 00809 * the procedure explores possibilities to reduce the numerical 00810 * rank by inspecting the computed upper triangular factor. If 00811 * L2RANK or L2ABER are up, then SGEJSV will compute the SVD of 00812 * A + dA, where ||dA|| <= f(M,N)*EPSLN. 00813 * 00814 NR = 1 00815 IF ( L2ABER ) THEN 00816 * Standard absolute error bound suffices. All sigma_i with 00817 * sigma_i < N*EPSLN*||A|| are flushed to zero. This is an 00818 * agressive enforcement of lower numerical rank by introducing a 00819 * backward error of the order of N*EPSLN*||A||. 00820 TEMP1 = SQRT(FLOAT(N))*EPSLN 00821 DO 3001 p = 2, N 00822 IF ( ABS(A(p,p)) .GE. (TEMP1*ABS(A(1,1))) ) THEN 00823 NR = NR + 1 00824 ELSE 00825 GO TO 3002 00826 END IF 00827 3001 CONTINUE 00828 3002 CONTINUE 00829 ELSE IF ( L2RANK ) THEN 00830 * .. similarly as above, only slightly more gentle (less agressive). 00831 * Sudden drop on the diagonal of R1 is used as the criterion for 00832 * close-to-rank-defficient. 00833 TEMP1 = SQRT(SFMIN) 00834 DO 3401 p = 2, N 00835 IF ( ( ABS(A(p,p)) .LT. (EPSLN*ABS(A(p-1,p-1))) ) .OR. 00836 & ( ABS(A(p,p)) .LT. SMALL ) .OR. 00837 & ( L2KILL .AND. (ABS(A(p,p)) .LT. TEMP1) ) ) GO TO 3402 00838 NR = NR + 1 00839 3401 CONTINUE 00840 3402 CONTINUE 00841 * 00842 ELSE 00843 * The goal is high relative accuracy. However, if the matrix 00844 * has high scaled condition number the relative accuracy is in 00845 * general not feasible. Later on, a condition number estimator 00846 * will be deployed to estimate the scaled condition number. 00847 * Here we just remove the underflowed part of the triangular 00848 * factor. This prevents the situation in which the code is 00849 * working hard to get the accuracy not warranted by the data. 00850 TEMP1 = SQRT(SFMIN) 00851 DO 3301 p = 2, N 00852 IF ( ( ABS(A(p,p)) .LT. SMALL ) .OR. 00853 & ( L2KILL .AND. (ABS(A(p,p)) .LT. TEMP1) ) ) GO TO 3302 00854 NR = NR + 1 00855 3301 CONTINUE 00856 3302 CONTINUE 00857 * 00858 END IF 00859 * 00860 ALMORT = .FALSE. 00861 IF ( NR .EQ. N ) THEN 00862 MAXPRJ = ONE 00863 DO 3051 p = 2, N 00864 TEMP1 = ABS(A(p,p)) / SVA(IWORK(p)) 00865 MAXPRJ = AMIN1( MAXPRJ, TEMP1 ) 00866 3051 CONTINUE 00867 IF ( MAXPRJ**2 .GE. ONE - FLOAT(N)*EPSLN ) ALMORT = .TRUE. 00868 END IF 00869 * 00870 * 00871 SCONDA = - ONE 00872 CONDR1 = - ONE 00873 CONDR2 = - ONE 00874 * 00875 IF ( ERREST ) THEN 00876 IF ( N .EQ. NR ) THEN 00877 IF ( RSVEC ) THEN 00878 * .. V is available as workspace 00879 CALL SLACPY( 'U', N, N, A, LDA, V, LDV ) 00880 DO 3053 p = 1, N 00881 TEMP1 = SVA(IWORK(p)) 00882 CALL SSCAL( p, ONE/TEMP1, V(1,p), 1 ) 00883 3053 CONTINUE 00884 CALL SPOCON( 'U', N, V, LDV, ONE, TEMP1, 00885 & WORK(N+1), IWORK(2*N+M+1), IERR ) 00886 ELSE IF ( LSVEC ) THEN 00887 * .. U is available as workspace 00888 CALL SLACPY( 'U', N, N, A, LDA, U, LDU ) 00889 DO 3054 p = 1, N 00890 TEMP1 = SVA(IWORK(p)) 00891 CALL SSCAL( p, ONE/TEMP1, U(1,p), 1 ) 00892 3054 CONTINUE 00893 CALL SPOCON( 'U', N, U, LDU, ONE, TEMP1, 00894 & WORK(N+1), IWORK(2*N+M+1), IERR ) 00895 ELSE 00896 CALL SLACPY( 'U', N, N, A, LDA, WORK(N+1), N ) 00897 DO 3052 p = 1, N 00898 TEMP1 = SVA(IWORK(p)) 00899 CALL SSCAL( p, ONE/TEMP1, WORK(N+(p-1)*N+1), 1 ) 00900 3052 CONTINUE 00901 * .. the columns of R are scaled to have unit Euclidean lengths. 00902 CALL SPOCON( 'U', N, WORK(N+1), N, ONE, TEMP1, 00903 & WORK(N+N*N+1), IWORK(2*N+M+1), IERR ) 00904 END IF 00905 SCONDA = ONE / SQRT(TEMP1) 00906 * SCONDA is an estimate of SQRT(||(R^t * R)^(-1)||_1). 00907 * N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA 00908 ELSE 00909 SCONDA = - ONE 00910 END IF 00911 END IF 00912 * 00913 L2PERT = L2PERT .AND. ( ABS( A(1,1)/A(NR,NR) ) .GT. SQRT(BIG1) ) 00914 * If there is no violent scaling, artificial perturbation is not needed. 00915 * 00916 * Phase 3: 00917 * 00918 IF ( .NOT. ( RSVEC .OR. LSVEC ) ) THEN 00919 * 00920 * Singular Values only 00921 * 00922 * .. transpose A(1:NR,1:N) 00923 DO 1946 p = 1, MIN0( N-1, NR ) 00924 CALL SCOPY( N-p, A(p,p+1), LDA, A(p+1,p), 1 ) 00925 1946 CONTINUE 00926 * 00927 * The following two DO-loops introduce small relative perturbation 00928 * into the strict upper triangle of the lower triangular matrix. 00929 * Small entries below the main diagonal are also changed. 00930 * This modification is useful if the computing environment does not 00931 * provide/allow FLUSH TO ZERO underflow, for it prevents many 00932 * annoying denormalized numbers in case of strongly scaled matrices. 00933 * The perturbation is structured so that it does not introduce any 00934 * new perturbation of the singular values, and it does not destroy 00935 * the job done by the preconditioner. 00936 * The licence for this perturbation is in the variable L2PERT, which 00937 * should be .FALSE. if FLUSH TO ZERO underflow is active. 00938 * 00939 IF ( .NOT. ALMORT ) THEN 00940 * 00941 IF ( L2PERT ) THEN 00942 * XSC = SQRT(SMALL) 00943 XSC = EPSLN / FLOAT(N) 00944 DO 4947 q = 1, NR 00945 TEMP1 = XSC*ABS(A(q,q)) 00946 DO 4949 p = 1, N 00947 IF ( ( (p.GT.q) .AND. (ABS(A(p,q)).LE.TEMP1) ) 00948 & .OR. ( p .LT. q ) ) 00949 & A(p,q) = SIGN( TEMP1, A(p,q) ) 00950 4949 CONTINUE 00951 4947 CONTINUE 00952 ELSE 00953 CALL SLASET( 'U', NR-1,NR-1, ZERO,ZERO, A(1,2),LDA ) 00954 END IF 00955 * 00956 * .. second preconditioning using the QR factorization 00957 * 00958 CALL SGEQRF( N,NR, A,LDA, WORK, WORK(N+1),LWORK-N, IERR ) 00959 * 00960 * .. and transpose upper to lower triangular 00961 DO 1948 p = 1, NR - 1 00962 CALL SCOPY( NR-p, A(p,p+1), LDA, A(p+1,p), 1 ) 00963 1948 CONTINUE 00964 * 00965 END IF 00966 * 00967 * Row-cyclic Jacobi SVD algorithm with column pivoting 00968 * 00969 * .. again some perturbation (a "background noise") is added 00970 * to drown denormals 00971 IF ( L2PERT ) THEN 00972 * XSC = SQRT(SMALL) 00973 XSC = EPSLN / FLOAT(N) 00974 DO 1947 q = 1, NR 00975 TEMP1 = XSC*ABS(A(q,q)) 00976 DO 1949 p = 1, NR 00977 IF ( ( (p.GT.q) .AND. (ABS(A(p,q)).LE.TEMP1) ) 00978 & .OR. ( p .LT. q ) ) 00979 & A(p,q) = SIGN( TEMP1, A(p,q) ) 00980 1949 CONTINUE 00981 1947 CONTINUE 00982 ELSE 00983 CALL SLASET( 'U', NR-1, NR-1, ZERO, ZERO, A(1,2), LDA ) 00984 END IF 00985 * 00986 * .. and one-sided Jacobi rotations are started on a lower 00987 * triangular matrix (plus perturbation which is ignored in 00988 * the part which destroys triangular form (confusing?!)) 00989 * 00990 CALL SGESVJ( 'L', 'NoU', 'NoV', NR, NR, A, LDA, SVA, 00991 & N, V, LDV, WORK, LWORK, INFO ) 00992 * 00993 SCALEM = WORK(1) 00994 NUMRANK = NINT(WORK(2)) 00995 * 00996 * 00997 ELSE IF ( RSVEC .AND. ( .NOT. LSVEC ) ) THEN 00998 * 00999 * -> Singular Values and Right Singular Vectors <- 01000 * 01001 IF ( ALMORT ) THEN 01002 * 01003 * .. in this case NR equals N 01004 DO 1998 p = 1, NR 01005 CALL SCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 ) 01006 1998 CONTINUE 01007 CALL SLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 01008 * 01009 CALL SGESVJ( 'L','U','N', N, NR, V,LDV, SVA, NR, A,LDA, 01010 & WORK, LWORK, INFO ) 01011 SCALEM = WORK(1) 01012 NUMRANK = NINT(WORK(2)) 01013 01014 ELSE 01015 * 01016 * .. two more QR factorizations ( one QRF is not enough, two require 01017 * accumulated product of Jacobi rotations, three are perfect ) 01018 * 01019 CALL SLASET( 'Lower', NR-1, NR-1, ZERO, ZERO, A(2,1), LDA ) 01020 CALL SGELQF( NR, N, A, LDA, WORK, WORK(N+1), LWORK-N, IERR) 01021 CALL SLACPY( 'Lower', NR, NR, A, LDA, V, LDV ) 01022 CALL SLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 01023 CALL SGEQRF( NR, NR, V, LDV, WORK(N+1), WORK(2*N+1), 01024 & LWORK-2*N, IERR ) 01025 DO 8998 p = 1, NR 01026 CALL SCOPY( NR-p+1, V(p,p), LDV, V(p,p), 1 ) 01027 8998 CONTINUE 01028 CALL SLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 01029 * 01030 CALL SGESVJ( 'Lower', 'U','N', NR, NR, V,LDV, SVA, NR, U, 01031 & LDU, WORK(N+1), LWORK, INFO ) 01032 SCALEM = WORK(N+1) 01033 NUMRANK = NINT(WORK(N+2)) 01034 IF ( NR .LT. N ) THEN 01035 CALL SLASET( 'A',N-NR, NR, ZERO,ZERO, V(NR+1,1), LDV ) 01036 CALL SLASET( 'A',NR, N-NR, ZERO,ZERO, V(1,NR+1), LDV ) 01037 CALL SLASET( 'A',N-NR,N-NR,ZERO,ONE, V(NR+1,NR+1), LDV ) 01038 END IF 01039 * 01040 CALL SORMLQ( 'Left', 'Transpose', N, N, NR, A, LDA, WORK, 01041 & V, LDV, WORK(N+1), LWORK-N, IERR ) 01042 * 01043 END IF 01044 * 01045 DO 8991 p = 1, N 01046 CALL SCOPY( N, V(p,1), LDV, A(IWORK(p),1), LDA ) 01047 8991 CONTINUE 01048 CALL SLACPY( 'All', N, N, A, LDA, V, LDV ) 01049 * 01050 IF ( TRANSP ) THEN 01051 CALL SLACPY( 'All', N, N, V, LDV, U, LDU ) 01052 END IF 01053 * 01054 ELSE IF ( LSVEC .AND. ( .NOT. RSVEC ) ) THEN 01055 * 01056 * .. Singular Values and Left Singular Vectors .. 01057 * 01058 * .. second preconditioning step to avoid need to accumulate 01059 * Jacobi rotations in the Jacobi iterations. 01060 DO 1965 p = 1, NR 01061 CALL SCOPY( N-p+1, A(p,p), LDA, U(p,p), 1 ) 01062 1965 CONTINUE 01063 CALL SLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, U(1,2), LDU ) 01064 * 01065 CALL SGEQRF( N, NR, U, LDU, WORK(N+1), WORK(2*N+1), 01066 & LWORK-2*N, IERR ) 01067 * 01068 DO 1967 p = 1, NR - 1 01069 CALL SCOPY( NR-p, U(p,p+1), LDU, U(p+1,p), 1 ) 01070 1967 CONTINUE 01071 CALL SLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, U(1,2), LDU ) 01072 * 01073 CALL SGESVJ( 'Lower', 'U', 'N', NR,NR, U, LDU, SVA, NR, A, 01074 & LDA, WORK(N+1), LWORK-N, INFO ) 01075 SCALEM = WORK(N+1) 01076 NUMRANK = NINT(WORK(N+2)) 01077 * 01078 IF ( NR .LT. M ) THEN 01079 CALL SLASET( 'A', M-NR, NR,ZERO, ZERO, U(NR+1,1), LDU ) 01080 IF ( NR .LT. N1 ) THEN 01081 CALL SLASET( 'A',NR, N1-NR, ZERO, ZERO, U(1,NR+1), LDU ) 01082 CALL SLASET( 'A',M-NR,N1-NR,ZERO,ONE,U(NR+1,NR+1), LDU ) 01083 END IF 01084 END IF 01085 * 01086 CALL SORMQR( 'Left', 'No Tr', M, N1, N, A, LDA, WORK, U, 01087 & LDU, WORK(N+1), LWORK-N, IERR ) 01088 * 01089 IF ( ROWPIV ) 01090 & CALL SLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 ) 01091 * 01092 DO 1974 p = 1, N1 01093 XSC = ONE / SNRM2( M, U(1,p), 1 ) 01094 CALL SSCAL( M, XSC, U(1,p), 1 ) 01095 1974 CONTINUE 01096 * 01097 IF ( TRANSP ) THEN 01098 CALL SLACPY( 'All', N, N, U, LDU, V, LDV ) 01099 END IF 01100 * 01101 ELSE 01102 * 01103 * .. Full SVD .. 01104 * 01105 IF ( .NOT. JRACC ) THEN 01106 * 01107 IF ( .NOT. ALMORT ) THEN 01108 * 01109 * Second Preconditioning Step (QRF [with pivoting]) 01110 * Note that the composition of TRANSPOSE, QRF and TRANSPOSE is 01111 * equivalent to an LQF CALL. Since in many libraries the QRF 01112 * seems to be better optimized than the LQF, we do explicit 01113 * transpose and use the QRF. This is subject to changes in an 01114 * optimized implementation of SGEJSV. 01115 * 01116 DO 1968 p = 1, NR 01117 CALL SCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 ) 01118 1968 CONTINUE 01119 * 01120 * .. the following two loops perturb small entries to avoid 01121 * denormals in the second QR factorization, where they are 01122 * as good as zeros. This is done to avoid painfully slow 01123 * computation with denormals. The relative size of the perturbation 01124 * is a parameter that can be changed by the implementer. 01125 * This perturbation device will be obsolete on machines with 01126 * properly implemented arithmetic. 01127 * To switch it off, set L2PERT=.FALSE. To remove it from the 01128 * code, remove the action under L2PERT=.TRUE., leave the ELSE part. 01129 * The following two loops should be blocked and fused with the 01130 * transposed copy above. 01131 * 01132 IF ( L2PERT ) THEN 01133 XSC = SQRT(SMALL) 01134 DO 2969 q = 1, NR 01135 TEMP1 = XSC*ABS( V(q,q) ) 01136 DO 2968 p = 1, N 01137 IF ( ( p .GT. q ) .AND. ( ABS(V(p,q)) .LE. TEMP1 ) 01138 & .OR. ( p .LT. q ) ) 01139 & V(p,q) = SIGN( TEMP1, V(p,q) ) 01140 IF ( p. LT. q ) V(p,q) = - V(p,q) 01141 2968 CONTINUE 01142 2969 CONTINUE 01143 ELSE 01144 CALL SLASET( 'U', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 01145 END IF 01146 * 01147 * Estimate the row scaled condition number of R1 01148 * (If R1 is rectangular, N > NR, then the condition number 01149 * of the leading NR x NR submatrix is estimated.) 01150 * 01151 CALL SLACPY( 'L', NR, NR, V, LDV, WORK(2*N+1), NR ) 01152 DO 3950 p = 1, NR 01153 TEMP1 = SNRM2(NR-p+1,WORK(2*N+(p-1)*NR+p),1) 01154 CALL SSCAL(NR-p+1,ONE/TEMP1,WORK(2*N+(p-1)*NR+p),1) 01155 3950 CONTINUE 01156 CALL SPOCON('Lower',NR,WORK(2*N+1),NR,ONE,TEMP1, 01157 & WORK(2*N+NR*NR+1),IWORK(M+2*N+1),IERR) 01158 CONDR1 = ONE / SQRT(TEMP1) 01159 * .. here need a second oppinion on the condition number 01160 * .. then assume worst case scenario 01161 * R1 is OK for inverse <=> CONDR1 .LT. FLOAT(N) 01162 * more conservative <=> CONDR1 .LT. SQRT(FLOAT(N)) 01163 * 01164 COND_OK = SQRT(FLOAT(NR)) 01165 *[TP] COND_OK is a tuning parameter. 01166 01167 IF ( CONDR1 .LT. COND_OK ) THEN 01168 * .. the second QRF without pivoting. Note: in an optimized 01169 * implementation, this QRF should be implemented as the QRF 01170 * of a lower triangular matrix. 01171 * R1^t = Q2 * R2 01172 CALL SGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1), 01173 & LWORK-2*N, IERR ) 01174 * 01175 IF ( L2PERT ) THEN 01176 XSC = SQRT(SMALL)/EPSLN 01177 DO 3959 p = 2, NR 01178 DO 3958 q = 1, p - 1 01179 TEMP1 = XSC * AMIN1(ABS(V(p,p)),ABS(V(q,q))) 01180 IF ( ABS(V(q,p)) .LE. TEMP1 ) 01181 & V(q,p) = SIGN( TEMP1, V(q,p) ) 01182 3958 CONTINUE 01183 3959 CONTINUE 01184 END IF 01185 * 01186 IF ( NR .NE. N ) 01187 * .. save ... 01188 & CALL SLACPY( 'A', N, NR, V, LDV, WORK(2*N+1), N ) 01189 * 01190 * .. this transposed copy should be better than naive 01191 DO 1969 p = 1, NR - 1 01192 CALL SCOPY( NR-p, V(p,p+1), LDV, V(p+1,p), 1 ) 01193 1969 CONTINUE 01194 * 01195 CONDR2 = CONDR1 01196 * 01197 ELSE 01198 * 01199 * .. ill-conditioned case: second QRF with pivoting 01200 * Note that windowed pivoting would be equaly good 01201 * numerically, and more run-time efficient. So, in 01202 * an optimal implementation, the next call to SGEQP3 01203 * should be replaced with eg. CALL SGEQPX (ACM TOMS #782) 01204 * with properly (carefully) chosen parameters. 01205 * 01206 * R1^t * P2 = Q2 * R2 01207 DO 3003 p = 1, NR 01208 IWORK(N+p) = 0 01209 3003 CONTINUE 01210 CALL SGEQP3( N, NR, V, LDV, IWORK(N+1), WORK(N+1), 01211 & WORK(2*N+1), LWORK-2*N, IERR ) 01212 ** CALL SGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1), 01213 ** & LWORK-2*N, IERR ) 01214 IF ( L2PERT ) THEN 01215 XSC = SQRT(SMALL) 01216 DO 3969 p = 2, NR 01217 DO 3968 q = 1, p - 1 01218 TEMP1 = XSC * AMIN1(ABS(V(p,p)),ABS(V(q,q))) 01219 IF ( ABS(V(q,p)) .LE. TEMP1 ) 01220 & V(q,p) = SIGN( TEMP1, V(q,p) ) 01221 3968 CONTINUE 01222 3969 CONTINUE 01223 END IF 01224 * 01225 CALL SLACPY( 'A', N, NR, V, LDV, WORK(2*N+1), N ) 01226 * 01227 IF ( L2PERT ) THEN 01228 XSC = SQRT(SMALL) 01229 DO 8970 p = 2, NR 01230 DO 8971 q = 1, p - 1 01231 TEMP1 = XSC * AMIN1(ABS(V(p,p)),ABS(V(q,q))) 01232 V(p,q) = - SIGN( TEMP1, V(q,p) ) 01233 8971 CONTINUE 01234 8970 CONTINUE 01235 ELSE 01236 CALL SLASET( 'L',NR-1,NR-1,ZERO,ZERO,V(2,1),LDV ) 01237 END IF 01238 * Now, compute R2 = L3 * Q3, the LQ factorization. 01239 CALL SGELQF( NR, NR, V, LDV, WORK(2*N+N*NR+1), 01240 & WORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, IERR ) 01241 * .. and estimate the condition number 01242 CALL SLACPY( 'L',NR,NR,V,LDV,WORK(2*N+N*NR+NR+1),NR ) 01243 DO 4950 p = 1, NR 01244 TEMP1 = SNRM2( p, WORK(2*N+N*NR+NR+p), NR ) 01245 CALL SSCAL( p, ONE/TEMP1, WORK(2*N+N*NR+NR+p), NR ) 01246 4950 CONTINUE 01247 CALL SPOCON( 'L',NR,WORK(2*N+N*NR+NR+1),NR,ONE,TEMP1, 01248 & WORK(2*N+N*NR+NR+NR*NR+1),IWORK(M+2*N+1),IERR ) 01249 CONDR2 = ONE / SQRT(TEMP1) 01250 * 01251 IF ( CONDR2 .GE. COND_OK ) THEN 01252 * .. save the Householder vectors used for Q3 01253 * (this overwrittes the copy of R2, as it will not be 01254 * needed in this branch, but it does not overwritte the 01255 * Huseholder vectors of Q2.). 01256 CALL SLACPY( 'U', NR, NR, V, LDV, WORK(2*N+1), N ) 01257 * .. and the rest of the information on Q3 is in 01258 * WORK(2*N+N*NR+1:2*N+N*NR+N) 01259 END IF 01260 * 01261 END IF 01262 * 01263 IF ( L2PERT ) THEN 01264 XSC = SQRT(SMALL) 01265 DO 4968 q = 2, NR 01266 TEMP1 = XSC * V(q,q) 01267 DO 4969 p = 1, q - 1 01268 * V(p,q) = - SIGN( TEMP1, V(q,p) ) 01269 V(p,q) = - SIGN( TEMP1, V(p,q) ) 01270 4969 CONTINUE 01271 4968 CONTINUE 01272 ELSE 01273 CALL SLASET( 'U', NR-1,NR-1, ZERO,ZERO, V(1,2), LDV ) 01274 END IF 01275 * 01276 * Second preconditioning finished; continue with Jacobi SVD 01277 * The input matrix is lower trinagular. 01278 * 01279 * Recover the right singular vectors as solution of a well 01280 * conditioned triangular matrix equation. 01281 * 01282 IF ( CONDR1 .LT. COND_OK ) THEN 01283 * 01284 CALL SGESVJ( 'L','U','N',NR,NR,V,LDV,SVA,NR,U, 01285 & LDU,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,INFO ) 01286 SCALEM = WORK(2*N+N*NR+NR+1) 01287 NUMRANK = NINT(WORK(2*N+N*NR+NR+2)) 01288 DO 3970 p = 1, NR 01289 CALL SCOPY( NR, V(1,p), 1, U(1,p), 1 ) 01290 CALL SSCAL( NR, SVA(p), V(1,p), 1 ) 01291 3970 CONTINUE 01292 01293 * .. pick the right matrix equation and solve it 01294 * 01295 IF ( NR. EQ. N ) THEN 01296 * :)) .. best case, R1 is inverted. The solution of this matrix 01297 * equation is Q2*V2 = the product of the Jacobi rotations 01298 * used in SGESVJ, premultiplied with the orthogonal matrix 01299 * from the second QR factorization. 01300 CALL STRSM( 'L','U','N','N', NR,NR,ONE, A,LDA, V,LDV ) 01301 ELSE 01302 * .. R1 is well conditioned, but non-square. Transpose(R2) 01303 * is inverted to get the product of the Jacobi rotations 01304 * used in SGESVJ. The Q-factor from the second QR 01305 * factorization is then built in explicitly. 01306 CALL STRSM('L','U','T','N',NR,NR,ONE,WORK(2*N+1), 01307 & N,V,LDV) 01308 IF ( NR .LT. N ) THEN 01309 CALL SLASET('A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV) 01310 CALL SLASET('A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV) 01311 CALL SLASET('A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV) 01312 END IF 01313 CALL SORMQR('L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1), 01314 & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR) 01315 END IF 01316 * 01317 ELSE IF ( CONDR2 .LT. COND_OK ) THEN 01318 * 01319 * :) .. the input matrix A is very likely a relative of 01320 * the Kahan matrix :) 01321 * The matrix R2 is inverted. The solution of the matrix equation 01322 * is Q3^T*V3 = the product of the Jacobi rotations (appplied to 01323 * the lower triangular L3 from the LQ factorization of 01324 * R2=L3*Q3), pre-multiplied with the transposed Q3. 01325 CALL SGESVJ( 'L', 'U', 'N', NR, NR, V, LDV, SVA, NR, U, 01326 & LDU, WORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, INFO ) 01327 SCALEM = WORK(2*N+N*NR+NR+1) 01328 NUMRANK = NINT(WORK(2*N+N*NR+NR+2)) 01329 DO 3870 p = 1, NR 01330 CALL SCOPY( NR, V(1,p), 1, U(1,p), 1 ) 01331 CALL SSCAL( NR, SVA(p), U(1,p), 1 ) 01332 3870 CONTINUE 01333 CALL STRSM('L','U','N','N',NR,NR,ONE,WORK(2*N+1),N,U,LDU) 01334 * .. apply the permutation from the second QR factorization 01335 DO 873 q = 1, NR 01336 DO 872 p = 1, NR 01337 WORK(2*N+N*NR+NR+IWORK(N+p)) = U(p,q) 01338 872 CONTINUE 01339 DO 874 p = 1, NR 01340 U(p,q) = WORK(2*N+N*NR+NR+p) 01341 874 CONTINUE 01342 873 CONTINUE 01343 IF ( NR .LT. N ) THEN 01344 CALL SLASET( 'A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV ) 01345 CALL SLASET( 'A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV ) 01346 CALL SLASET( 'A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV ) 01347 END IF 01348 CALL SORMQR( 'L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1), 01349 & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR ) 01350 ELSE 01351 * Last line of defense. 01352 * #:( This is a rather pathological case: no scaled condition 01353 * improvement after two pivoted QR factorizations. Other 01354 * possibility is that the rank revealing QR factorization 01355 * or the condition estimator has failed, or the COND_OK 01356 * is set very close to ONE (which is unnecessary). Normally, 01357 * this branch should never be executed, but in rare cases of 01358 * failure of the RRQR or condition estimator, the last line of 01359 * defense ensures that SGEJSV completes the task. 01360 * Compute the full SVD of L3 using SGESVJ with explicit 01361 * accumulation of Jacobi rotations. 01362 CALL SGESVJ( 'L', 'U', 'V', NR, NR, V, LDV, SVA, NR, U, 01363 & LDU, WORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, INFO ) 01364 SCALEM = WORK(2*N+N*NR+NR+1) 01365 NUMRANK = NINT(WORK(2*N+N*NR+NR+2)) 01366 IF ( NR .LT. N ) THEN 01367 CALL SLASET( 'A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV ) 01368 CALL SLASET( 'A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV ) 01369 CALL SLASET( 'A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV ) 01370 END IF 01371 CALL SORMQR( 'L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1), 01372 & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR ) 01373 * 01374 CALL SORMLQ( 'L', 'T', NR, NR, NR, WORK(2*N+1), N, 01375 & WORK(2*N+N*NR+1), U, LDU, WORK(2*N+N*NR+NR+1), 01376 & LWORK-2*N-N*NR-NR, IERR ) 01377 DO 773 q = 1, NR 01378 DO 772 p = 1, NR 01379 WORK(2*N+N*NR+NR+IWORK(N+p)) = U(p,q) 01380 772 CONTINUE 01381 DO 774 p = 1, NR 01382 U(p,q) = WORK(2*N+N*NR+NR+p) 01383 774 CONTINUE 01384 773 CONTINUE 01385 * 01386 END IF 01387 * 01388 * Permute the rows of V using the (column) permutation from the 01389 * first QRF. Also, scale the columns to make them unit in 01390 * Euclidean norm. This applies to all cases. 01391 * 01392 TEMP1 = SQRT(FLOAT(N)) * EPSLN 01393 DO 1972 q = 1, N 01394 DO 972 p = 1, N 01395 WORK(2*N+N*NR+NR+IWORK(p)) = V(p,q) 01396 972 CONTINUE 01397 DO 973 p = 1, N 01398 V(p,q) = WORK(2*N+N*NR+NR+p) 01399 973 CONTINUE 01400 XSC = ONE / SNRM2( N, V(1,q), 1 ) 01401 IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 01402 & CALL SSCAL( N, XSC, V(1,q), 1 ) 01403 1972 CONTINUE 01404 * At this moment, V contains the right singular vectors of A. 01405 * Next, assemble the left singular vector matrix U (M x N). 01406 IF ( NR .LT. M ) THEN 01407 CALL SLASET( 'A', M-NR, NR, ZERO, ZERO, U(NR+1,1), LDU ) 01408 IF ( NR .LT. N1 ) THEN 01409 CALL SLASET('A',NR,N1-NR,ZERO,ZERO,U(1,NR+1),LDU) 01410 CALL SLASET('A',M-NR,N1-NR,ZERO,ONE,U(NR+1,NR+1),LDU) 01411 END IF 01412 END IF 01413 * 01414 * The Q matrix from the first QRF is built into the left singular 01415 * matrix U. This applies to all cases. 01416 * 01417 CALL SORMQR( 'Left', 'No_Tr', M, N1, N, A, LDA, WORK, U, 01418 & LDU, WORK(N+1), LWORK-N, IERR ) 01419 01420 * The columns of U are normalized. The cost is O(M*N) flops. 01421 TEMP1 = SQRT(FLOAT(M)) * EPSLN 01422 DO 1973 p = 1, NR 01423 XSC = ONE / SNRM2( M, U(1,p), 1 ) 01424 IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 01425 & CALL SSCAL( M, XSC, U(1,p), 1 ) 01426 1973 CONTINUE 01427 * 01428 * If the initial QRF is computed with row pivoting, the left 01429 * singular vectors must be adjusted. 01430 * 01431 IF ( ROWPIV ) 01432 & CALL SLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 ) 01433 * 01434 ELSE 01435 * 01436 * .. the initial matrix A has almost orthogonal columns and 01437 * the second QRF is not needed 01438 * 01439 CALL SLACPY( 'Upper', N, N, A, LDA, WORK(N+1), N ) 01440 IF ( L2PERT ) THEN 01441 XSC = SQRT(SMALL) 01442 DO 5970 p = 2, N 01443 TEMP1 = XSC * WORK( N + (p-1)*N + p ) 01444 DO 5971 q = 1, p - 1 01445 WORK(N+(q-1)*N+p)=-SIGN(TEMP1,WORK(N+(p-1)*N+q)) 01446 5971 CONTINUE 01447 5970 CONTINUE 01448 ELSE 01449 CALL SLASET( 'Lower',N-1,N-1,ZERO,ZERO,WORK(N+2),N ) 01450 END IF 01451 * 01452 CALL SGESVJ( 'Upper', 'U', 'N', N, N, WORK(N+1), N, SVA, 01453 & N, U, LDU, WORK(N+N*N+1), LWORK-N-N*N, INFO ) 01454 * 01455 SCALEM = WORK(N+N*N+1) 01456 NUMRANK = NINT(WORK(N+N*N+2)) 01457 DO 6970 p = 1, N 01458 CALL SCOPY( N, WORK(N+(p-1)*N+1), 1, U(1,p), 1 ) 01459 CALL SSCAL( N, SVA(p), WORK(N+(p-1)*N+1), 1 ) 01460 6970 CONTINUE 01461 * 01462 CALL STRSM( 'Left', 'Upper', 'NoTrans', 'No UD', N, N, 01463 & ONE, A, LDA, WORK(N+1), N ) 01464 DO 6972 p = 1, N 01465 CALL SCOPY( N, WORK(N+p), N, V(IWORK(p),1), LDV ) 01466 6972 CONTINUE 01467 TEMP1 = SQRT(FLOAT(N))*EPSLN 01468 DO 6971 p = 1, N 01469 XSC = ONE / SNRM2( N, V(1,p), 1 ) 01470 IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 01471 & CALL SSCAL( N, XSC, V(1,p), 1 ) 01472 6971 CONTINUE 01473 * 01474 * Assemble the left singular vector matrix U (M x N). 01475 * 01476 IF ( N .LT. M ) THEN 01477 CALL SLASET( 'A', M-N, N, ZERO, ZERO, U(N+1,1), LDU ) 01478 IF ( N .LT. N1 ) THEN 01479 CALL SLASET( 'A',N, N1-N, ZERO, ZERO, U(1,N+1),LDU ) 01480 CALL SLASET( 'A',M-N,N1-N, ZERO, ONE,U(N+1,N+1),LDU ) 01481 END IF 01482 END IF 01483 CALL SORMQR( 'Left', 'No Tr', M, N1, N, A, LDA, WORK, U, 01484 & LDU, WORK(N+1), LWORK-N, IERR ) 01485 TEMP1 = SQRT(FLOAT(M))*EPSLN 01486 DO 6973 p = 1, N1 01487 XSC = ONE / SNRM2( M, U(1,p), 1 ) 01488 IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 01489 & CALL SSCAL( M, XSC, U(1,p), 1 ) 01490 6973 CONTINUE 01491 * 01492 IF ( ROWPIV ) 01493 & CALL SLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 ) 01494 * 01495 END IF 01496 * 01497 * end of the >> almost orthogonal case << in the full SVD 01498 * 01499 ELSE 01500 * 01501 * This branch deploys a preconditioned Jacobi SVD with explicitly 01502 * accumulated rotations. It is included as optional, mainly for 01503 * experimental purposes. It does perfom well, and can also be used. 01504 * In this implementation, this branch will be automatically activated 01505 * if the condition number sigma_max(A) / sigma_min(A) is predicted 01506 * to be greater than the overflow threshold. This is because the 01507 * a posteriori computation of the singular vectors assumes robust 01508 * implementation of BLAS and some LAPACK procedures, capable of working 01509 * in presence of extreme values. Since that is not always the case, ... 01510 * 01511 DO 7968 p = 1, NR 01512 CALL SCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 ) 01513 7968 CONTINUE 01514 * 01515 IF ( L2PERT ) THEN 01516 XSC = SQRT(SMALL/EPSLN) 01517 DO 5969 q = 1, NR 01518 TEMP1 = XSC*ABS( V(q,q) ) 01519 DO 5968 p = 1, N 01520 IF ( ( p .GT. q ) .AND. ( ABS(V(p,q)) .LE. TEMP1 ) 01521 & .OR. ( p .LT. q ) ) 01522 & V(p,q) = SIGN( TEMP1, V(p,q) ) 01523 IF ( p. LT. q ) V(p,q) = - V(p,q) 01524 5968 CONTINUE 01525 5969 CONTINUE 01526 ELSE 01527 CALL SLASET( 'U', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 01528 END IF 01529 01530 CALL SGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1), 01531 & LWORK-2*N, IERR ) 01532 CALL SLACPY( 'L', N, NR, V, LDV, WORK(2*N+1), N ) 01533 * 01534 DO 7969 p = 1, NR 01535 CALL SCOPY( NR-p+1, V(p,p), LDV, U(p,p), 1 ) 01536 7969 CONTINUE 01537 01538 IF ( L2PERT ) THEN 01539 XSC = SQRT(SMALL/EPSLN) 01540 DO 9970 q = 2, NR 01541 DO 9971 p = 1, q - 1 01542 TEMP1 = XSC * AMIN1(ABS(U(p,p)),ABS(U(q,q))) 01543 U(p,q) = - SIGN( TEMP1, U(q,p) ) 01544 9971 CONTINUE 01545 9970 CONTINUE 01546 ELSE 01547 CALL SLASET('U', NR-1, NR-1, ZERO, ZERO, U(1,2), LDU ) 01548 END IF 01549 01550 CALL SGESVJ( 'L', 'U', 'V', NR, NR, U, LDU, SVA, 01551 & N, V, LDV, WORK(2*N+N*NR+1), LWORK-2*N-N*NR, INFO ) 01552 SCALEM = WORK(2*N+N*NR+1) 01553 NUMRANK = NINT(WORK(2*N+N*NR+2)) 01554 01555 IF ( NR .LT. N ) THEN 01556 CALL SLASET( 'A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV ) 01557 CALL SLASET( 'A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV ) 01558 CALL SLASET( 'A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV ) 01559 END IF 01560 01561 CALL SORMQR( 'L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1), 01562 & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR ) 01563 * 01564 * Permute the rows of V using the (column) permutation from the 01565 * first QRF. Also, scale the columns to make them unit in 01566 * Euclidean norm. This applies to all cases. 01567 * 01568 TEMP1 = SQRT(FLOAT(N)) * EPSLN 01569 DO 7972 q = 1, N 01570 DO 8972 p = 1, N 01571 WORK(2*N+N*NR+NR+IWORK(p)) = V(p,q) 01572 8972 CONTINUE 01573 DO 8973 p = 1, N 01574 V(p,q) = WORK(2*N+N*NR+NR+p) 01575 8973 CONTINUE 01576 XSC = ONE / SNRM2( N, V(1,q), 1 ) 01577 IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 01578 & CALL SSCAL( N, XSC, V(1,q), 1 ) 01579 7972 CONTINUE 01580 * 01581 * At this moment, V contains the right singular vectors of A. 01582 * Next, assemble the left singular vector matrix U (M x N). 01583 * 01584 IF ( NR .LT. M ) THEN 01585 CALL SLASET( 'A', M-NR, NR, ZERO, ZERO, U(NR+1,1), LDU ) 01586 IF ( NR .LT. N1 ) THEN 01587 CALL SLASET( 'A',NR, N1-NR, ZERO, ZERO, U(1,NR+1),LDU ) 01588 CALL SLASET( 'A',M-NR,N1-NR, ZERO, ONE,U(NR+1,NR+1),LDU ) 01589 END IF 01590 END IF 01591 * 01592 CALL SORMQR( 'Left', 'No Tr', M, N1, N, A, LDA, WORK, U, 01593 & LDU, WORK(N+1), LWORK-N, IERR ) 01594 * 01595 IF ( ROWPIV ) 01596 & CALL SLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 ) 01597 * 01598 * 01599 END IF 01600 IF ( TRANSP ) THEN 01601 * .. swap U and V because the procedure worked on A^t 01602 DO 6974 p = 1, N 01603 CALL SSWAP( N, U(1,p), 1, V(1,p), 1 ) 01604 6974 CONTINUE 01605 END IF 01606 * 01607 END IF 01608 * end of the full SVD 01609 * 01610 * Undo scaling, if necessary (and possible) 01611 * 01612 IF ( USCAL2 .LE. (BIG/SVA(1))*USCAL1 ) THEN 01613 CALL SLASCL( 'G', 0, 0, USCAL1, USCAL2, NR, 1, SVA, N, IERR ) 01614 USCAL1 = ONE 01615 USCAL2 = ONE 01616 END IF 01617 * 01618 IF ( NR .LT. N ) THEN 01619 DO 3004 p = NR+1, N 01620 SVA(p) = ZERO 01621 3004 CONTINUE 01622 END IF 01623 * 01624 WORK(1) = USCAL2 * SCALEM 01625 WORK(2) = USCAL1 01626 IF ( ERREST ) WORK(3) = SCONDA 01627 IF ( LSVEC .AND. RSVEC ) THEN 01628 WORK(4) = CONDR1 01629 WORK(5) = CONDR2 01630 END IF 01631 IF ( L2TRAN ) THEN 01632 WORK(6) = ENTRA 01633 WORK(7) = ENTRAT 01634 END IF 01635 * 01636 IWORK(1) = NR 01637 IWORK(2) = NUMRANK 01638 IWORK(3) = WARNING 01639 * 01640 RETURN 01641 * .. 01642 * .. END OF SGEJSV 01643 * .. 01644 END 01645 *