LAPACK 3.12.1
LAPACK: Linear Algebra PACKage
sgeqp3rk.f
1*> \brief \b SGEQP3RK computes a truncated Householder QR factorization with column pivoting of a real m-by-n matrix A by using Level 3 BLAS and overwrites a real m-by-nrhs matrix B with Q**T * B.
2*
3* =========== DOCUMENTATION ===========
4*
5* Online html documentation available at
6* http://www.netlib.org/lapack/explore-html/
7*
8*> Download SGEQP3RK + dependencies
9*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/sgeqp3rk.f">
10*> [TGZ]</a>
11*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/sgeqp3rk.f">
12*> [ZIP]</a>
13*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/sgeqp3rk.f">
14*> [TXT]</a>
15*
16* Definition:
17* ===========
18*
19* SUBROUTINE SGEQP3RK( M, N, NRHS, KMAX, ABSTOL, RELTOL, A, LDA,
20* $ K, MAXC2NRMK, RELMAXC2NRMK, JPIV, TAU,
21* $ WORK, LWORK, IWORK, INFO )
22* IMPLICIT NONE
23*
24* .. Scalar Arguments ..
25* INTEGER INFO, K, KMAX, LDA, LWORK, M, N, NRHS
26* REAL ABSTOL, MAXC2NRMK, RELMAXC2NRMK, RELTOL
27* ..
28* .. Array Arguments ..
29* INTEGER IWORK( * ), JPIV( * )
30* REAL A( LDA, * ), TAU( * ), WORK( * )
31* ..
32*
33*
34*> \par Purpose:
35* =============
36*>
37*> \verbatim
38*>
39*> SGEQP3RK performs two tasks simultaneously:
40*>
41*> Task 1: The routine computes a truncated (rank K) or full rank
42*> Householder QR factorization with column pivoting of a real
43*> M-by-N matrix A using Level 3 BLAS. K is the number of columns
44*> that were factorized, i.e. factorization rank of the
45*> factor R, K <= min(M,N).
46*>
47*> A * P(K) = Q(K) * R(K) =
48*>
49*> = Q(K) * ( R11(K) R12(K) ) = Q(K) * ( R(K)_approx )
50*> ( 0 R22(K) ) ( 0 R(K)_residual ),
51*>
52*> where:
53*>
54*> P(K) is an N-by-N permutation matrix;
55*> Q(K) is an M-by-M orthogonal matrix;
56*> R(K)_approx = ( R11(K), R12(K) ) is a rank K approximation of the
57*> full rank factor R with K-by-K upper-triangular
58*> R11(K) and K-by-(N-K) rectangular R12(K). The diagonal
59*> entries of R11(K) appear in non-increasing order
60*> of absolute value, and absolute values of all of
61*> them exceed the maximum column 2-norm of R22(K)
62*> up to roundoff error.
63*> R(K)_residual = R22(K) is the residual of a rank K approximation
64*> of the full rank factor R. It is
65*> an (M-K)-by-(N-K) rectangular matrix;
66*> 0 is an (M-K)-by-K zero matrix.
67*>
68*> Task 2: At the same time, the routine overwrites a real M-by-NRHS
69*> matrix B with Q(K)**T * B using Level 3 BLAS.
70*>
71*> =====================================================================
72*>
73*> The matrices A and B are stored on input in the array A as
74*> the left and right blocks A(1:M,1:N) and A(1:M, N+1:N+NRHS)
75*> respectively.
76*>
77*> N NRHS
78*> array_A = M [ mat_A, mat_B ]
79*>
80*> The truncation criteria (i.e. when to stop the factorization)
81*> can be any of the following:
82*>
83*> 1) The input parameter KMAX, the maximum number of columns
84*> to factorize, i.e. the factorization rank is limited
85*> to KMAX. If KMAX >= min(M,N), the criterion is not used.
86*>
87*> 2) The input parameter ABSTOL, the absolute tolerance for
88*> the maximum column 2-norm of the residual matrix R22(K). This
89*> means that the factorization stops if this norm is less than or
90*> equal to ABSTOL. If ABSTOL < 0.0, the criterion is not used.
91*>
92*> 3) The input parameter RELTOL, the tolerance for the maximum
93*> column 2-norm of the residual matrix R22(K) divided
94*> by the maximum column 2-norm of the original matrix A, which
95*> is equal to abs(R(1,1)). This means that the factorization stops
96*> when the ratio of the maximum column 2-norm of R22(K) to
97*> the maximum column 2-norm of A is less than or equal to RELTOL.
98*> If RELTOL < 0.0, the criterion is not used.
99*>
100*> 4) If neither ABSTOL nor RELTOL is used, the factorization
101*> still stops when the residual matrix R22(K) is a zero matrix in some
102*> factorization step K. ( This stopping criterion is implicit. )
103*>
104*> The algorithm stops when any of these conditions is first
105*> satisfied, otherwise the whole matrix A is factorized.
106*>
107*> To factorize the whole matrix A, use the values
108*> KMAX >= min(M,N), ABSTOL < 0.0 and RELTOL < 0.0.
109*>
110*> The routine returns:
111*> a) Q(K), R(K)_approx = ( R11(K), R12(K) ),
112*> R(K)_residual = R22(K), P(K), i.e. the resulting matrices
113*> of the factorization; P(K) is represented by JPIV,
114*> ( if K = min(M,N), R(K)_approx is the full factor R,
115*> and there is no residual matrix R(K)_residual);
116*> b) K, the number of columns that were factorized,
117*> i.e. factorization rank;
118*> c) MAXC2NRMK, the maximum column 2-norm of the residual
119*> matrix R(K)_residual = R22(K),
120*> ( if K = min(M,N), MAXC2NRMK = 0.0 );
121*> d) RELMAXC2NRMK equals MAXC2NRMK divided by MAXC2NRM, the maximum
122*> column 2-norm of the original matrix A, which is equal
123*> to abs(R(1,1)), ( if K = min(M,N), RELMAXC2NRMK = 0.0 );
124*> e) Q(K)**T * B, the matrix B with the orthogonal
125*> transformation Q(K)**T applied on the left.
126*>
127*> The N-by-N permutation matrix P(K) is stored in a compact form in
128*> the integer array JPIV. For 1 <= j <= N, column j
129*> of the matrix A was interchanged with column JPIV(j).
130*>
131*> The M-by-M orthogonal matrix Q is represented as a product
132*> of elementary Householder reflectors
133*>
134*> Q(K) = H(1) * H(2) * . . . * H(K),
135*>
136*> where K is the number of columns that were factorized.
137*>
138*> Each H(j) has the form
139*>
140*> H(j) = I - tau * v * v**T,
141*>
142*> where 1 <= j <= K and
143*> I is an M-by-M identity matrix,
144*> tau is a real scalar,
145*> v is a real vector with v(1:j-1) = 0 and v(j) = 1.
146*>
147*> v(j+1:M) is stored on exit in A(j+1:M,j) and tau in TAU(j).
148*>
149*> See the Further Details section for more information.
150*> \endverbatim
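The two tasks and the four stopping criteria described in the Purpose section can be illustrated with a minimal unblocked sketch in pure Python. This is not the LAPACK implementation: it recomputes the column norms at every step instead of downdating them, keeps the reduced entries of R explicitly rather than storing reflector vectors and TAU, and does no NaN/Inf handling or tolerance clamping. The name `truncated_pivoted_qr` and the 0-based `jpiv` list are hypothetical, and the sketch assumes M, N >= 1 and that `b` has M rows (possibly empty ones).

```python
import math

def truncated_pivoted_qr(a, b, kmax, abstol, reltol):
    """Unblocked truncated Householder QR with column pivoting (sketch).

    a is M-by-N and b is M-by-NRHS, both lists of rows; inputs are copied.
    A negative abstol or reltol disables that criterion.
    Returns (k, r, qtb, jpiv, maxc2nrmk, relmaxc2nrmk), where jpiv[j] is
    the 0-based original index of the column that ended up in position j.
    """
    m, n = len(a), len(a[0])
    r = [list(row) for row in a]
    qtb = [list(row) for row in b]
    jpiv = list(range(n))

    def colnorm(c, start):
        # 2-norm of rows start..m-1 of column c of the working matrix.
        return math.sqrt(sum(r[i][c] ** 2 for i in range(start, m)))

    def reflect(mat, v, k, first_col):
        # Apply H = I - 2*v*v^T/(v^T v) to rows k.. and columns first_col..
        vtv = sum(t * t for t in v)
        for c in range(first_col, len(mat[0])):
            s = 2.0 * sum(v[i] * mat[k + i][c] for i in range(len(v))) / vtv
            for i in range(len(v)):
                mat[k + i][c] -= s * v[i]

    maxc2nrm = max(colnorm(c, 0) for c in range(n))
    k = 0
    while k < min(m, n):
        norms = [colnorm(c, k) for c in range(k, n)]
        maxc2nrmk = max(norms)
        rel = maxc2nrmk / maxc2nrm if maxc2nrm > 0.0 else 0.0
        if (k >= kmax                                       # criterion 1
                or (abstol >= 0.0 and maxc2nrmk <= abstol)  # criterion 2
                or (reltol >= 0.0 and rel <= reltol)        # criterion 3
                or maxc2nrmk == 0.0):                       # criterion 4
            return k, r, qtb, jpiv, maxc2nrmk, rel
        # Bring the residual column of largest 2-norm to position k.
        p = k + norms.index(maxc2nrmk)
        for row in r:
            row[k], row[p] = row[p], row[k]
        jpiv[k], jpiv[p] = jpiv[p], jpiv[k]
        # Build the Householder vector for the pivot column and apply it
        # to the trailing part of A (Task 1) and to all of B (Task 2).
        x = [r[i][k] for i in range(k, m)]
        v = list(x)
        v[0] += math.copysign(maxc2nrmk, x[0])
        reflect(r, v, k, k)
        reflect(qtb, v, k, 0)
        k += 1
    # Whole matrix factorized: there is no residual block.
    return k, r, qtb, jpiv, 0.0, 0.0
```

Calling the sketch with `kmax >= min(M,N)` and negative `abstol` and `reltol` corresponds to the "factorize the whole matrix" setting described above; the implicit zero-residual criterion still applies.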
151*
152* Arguments:
153* ==========
154*
155*> \param[in] M
156*> \verbatim
157*> M is INTEGER
158*> The number of rows of the matrix A. M >= 0.
159*> \endverbatim
160*>
161*> \param[in] N
162*> \verbatim
163*> N is INTEGER
164*> The number of columns of the matrix A. N >= 0.
165*> \endverbatim
166*>
167*> \param[in] NRHS
168*> \verbatim
169*> NRHS is INTEGER
170*> The number of right hand sides, i.e. the number of
171*> columns of the matrix B. NRHS >= 0.
172*> \endverbatim
173*>
174*> \param[in] KMAX
175*> \verbatim
176*> KMAX is INTEGER
177*>
178*> The first factorization stopping criterion. KMAX >= 0.
179*>
180*> The maximum number of columns of the matrix A to factorize,
181*> i.e. the maximum factorization rank.
182*>
183*> a) If KMAX >= min(M,N), then this stopping criterion
184*> is not used, the routine factorizes columns
185*> depending on ABSTOL and RELTOL.
186*>
187*> b) If KMAX = 0, then this stopping criterion is
188*> satisfied on input and the routine exits immediately.
189*> This means that the factorization is not performed,
190*> the matrices A and B are not modified, and
191*> the matrix A is itself the residual.
192*> \endverbatim
193*>
194*> \param[in] ABSTOL
195*> \verbatim
196*> ABSTOL is REAL
197*>
198*> The second factorization stopping criterion, cannot be NaN.
199*>
200*> The absolute tolerance (stopping threshold) for
201*> maximum column 2-norm of the residual matrix R22(K).
202*> The algorithm converges (stops the factorization) when
203*> the maximum column 2-norm of the residual matrix R22(K)
204*> is less than or equal to ABSTOL. Let SAFMIN = SLAMCH('S').
205*>
206*> a) If ABSTOL is NaN, then no computation is performed
207*> and an error message ( INFO = -5 ) is issued
208*> by XERBLA.
209*>
210*> b) If ABSTOL < 0.0, then this stopping criterion is not
211*> used, the routine factorizes columns depending
212*> on KMAX and RELTOL.
213*> This includes the case ABSTOL = -Inf.
214*>
215*> c) If 0.0 <= ABSTOL < 2*SAFMIN, then ABSTOL = 2*SAFMIN
216*> is used. This includes the case ABSTOL = -0.0.
217*>
218*> d) If 2*SAFMIN <= ABSTOL then the input value
219*> of ABSTOL is used.
220*>
221*> Let MAXC2NRM be the maximum column 2-norm of the
222*> whole original matrix A.
223*> If ABSTOL chosen above is >= MAXC2NRM, then this
224*> stopping criterion is satisfied on input and the routine exits
225*> immediately after MAXC2NRM is computed. The routine
226*> returns MAXC2NRM in MAXC2NRMK,
227*> and 1.0 in RELMAXC2NRMK.
228*> This includes the case ABSTOL = +Inf. This means that the
229*> factorization is not performed, the matrices A and B are not
230*> modified, and the matrix A is itself the residual.
231*> \endverbatim
232*>
233*> \param[in] RELTOL
234*> \verbatim
235*> RELTOL is REAL
236*>
237*> The third factorization stopping criterion, cannot be NaN.
238*>
239*> The tolerance (stopping threshold) for the ratio
240*> abs(R(K+1,K+1))/abs(R(1,1)) of the maximum column 2-norm of
241*> the residual matrix R22(K) to the maximum column 2-norm of
242*> the original matrix A. The algorithm converges (stops the
243*> factorization) when abs(R(K+1,K+1))/abs(R(1,1)) is less
244*> than or equal to RELTOL. Let EPS = SLAMCH('E').
245*>
246*> a) If RELTOL is NaN, then no computation is performed
247*> and an error message ( INFO = -6 ) is issued
248*> by XERBLA.
249*>
250*> b) If RELTOL < 0.0, then this stopping criterion is not
251*> used, the routine factorizes columns depending
252*> on KMAX and ABSTOL.
253*> This includes the case RELTOL = -Inf.
254*>
255*> c) If 0.0 <= RELTOL < EPS, then RELTOL = EPS is used.
256*> This includes the case RELTOL = -0.0.
257*>
258*> d) If EPS <= RELTOL then the input value of RELTOL
259*> is used.
260*>
261*> Let MAXC2NRM be the maximum column 2-norm of the
262*> whole original matrix A.
263*> If RELTOL chosen above is >= 1.0, then this stopping
264*> criterion is satisfied on input and the routine exits
265*> immediately after MAXC2NRM is computed.
266*> The routine returns MAXC2NRM in MAXC2NRMK,
267*> and 1.0 in RELMAXC2NRMK.
268*> This includes the case RELTOL = +Inf. This means that the
269*> factorization is not performed, the matrices A and B are not
270*> modified, and the matrix A is itself the residual.
271*>
272*> NOTE: We recommend that RELTOL satisfy
273*> min( max(M,N)*EPS, sqrt(EPS) ) <= RELTOL
274*> \endverbatim
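The ABSTOL and RELTOL rules (cases a to d, the exit-on-input test, and the recommended lower bound for RELTOL) can be summarized in a short Python sketch. The function names are hypothetical, and the constants are assumed IEEE-754 single-precision values; the routine itself obtains them from SLAMCH('S') and SLAMCH('E') at run time.

```python
import math

# Assumed single-precision values (labelled assumptions, not read from SLAMCH).
SAFMIN = 2.0 ** -126
EPS = 2.0 ** -24

def effective_tolerances(abstol, reltol, safmin=SAFMIN, eps=EPS):
    """Return the (abstol, reltol) pair actually used by the criteria."""
    if math.isnan(abstol) or math.isnan(reltol):
        # Case a): the routine reports INFO = -5 or -6 instead.
        raise ValueError("ABSTOL and RELTOL cannot be NaN")
    if abstol >= 0.0:            # also true for -0.0, as in case c)
        abstol = max(abstol, 2.0 * safmin)
    if reltol >= 0.0:
        reltol = max(reltol, eps)
    return abstol, reltol        # negative values mean "criterion not used"

def stops_on_input(maxc2nrm, abstol, reltol):
    """True if the second or third criterion already holds for the whole A."""
    return maxc2nrm <= abstol or 1.0 <= reltol

def recommended_min_reltol(m, n, eps=EPS):
    """Lower bound from the NOTE: min(max(M,N)*EPS, sqrt(EPS))."""
    return min(max(m, n) * eps, math.sqrt(eps))
```

Because a disabled tolerance stays negative, the comparisons in `stops_on_input` are automatically false for it whenever the matrix is nonzero.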
275*>
276*> \param[in,out] A
277*> \verbatim
278*> A is REAL array, dimension (LDA,N+NRHS)
279*>
280*> On entry:
281*>
282*> a) The subarray A(1:M,1:N) contains the M-by-N matrix A.
283*> b) The subarray A(1:M,N+1:N+NRHS) contains the M-by-NRHS
284*> matrix B.
285*>
286*> N NRHS
287*> array_A = M [ mat_A, mat_B ]
288*>
289*> On exit:
290*>
291*> a) The subarray A(1:M,1:N) contains parts of the factors
292*> of the matrix A:
293*>
294*> 1) If K = 0, A(1:M,1:N) contains the original matrix A.
295*> 2) If K > 0, A(1:M,1:N) contains parts of the
296*> factors:
297*>
298*> 1. The elements below the diagonal of the subarray
299*> A(1:M,1:K) together with TAU(1:K) represent the
300*> orthogonal matrix Q(K) as a product of K Householder
301*> elementary reflectors.
302*>
303*> 2. The elements on and above the diagonal of
304*> the subarray A(1:K,1:N) contain the K-by-N
305*> upper-trapezoidal matrix
306*> R(K)_approx = ( R11(K), R12(K) ).
307*> NOTE: If K=min(M,N), i.e. full rank factorization,
308*> then R(K)_approx is the full factor R which
309*> is upper-trapezoidal. If, in addition, M>=N,
310*> then R is upper-triangular.
311*>
312*> 3. The subarray A(K+1:M,K+1:N) contains the (M-K)-by-(N-K)
313*> rectangular matrix R(K)_residual = R22(K).
314*>
315*> b) If NRHS > 0, the subarray A(1:M,N+1:N+NRHS) contains
316*> the M-by-NRHS product Q(K)**T * B.
317*> \endverbatim
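The input layout `array_A = [ mat_A, mat_B ]` with a shared leading dimension can be sketched as follows. The helper `pack_a_and_b` is hypothetical; it only illustrates how a caller might fill a flat column-major buffer so that A occupies columns 1..N and B occupies columns N+1..N+NRHS, using 0-based Python indices.

```python
def pack_a_and_b(a_rows, b_rows, lda):
    """Pack M-by-N A and M-by-NRHS B into one column-major buffer.

    Element (i, j) of the combined array lives at flat[i + j * lda];
    rows m..lda-1 of each column are padding and stay 0.0.
    """
    m = len(a_rows)
    n = len(a_rows[0]) if m else 0
    nrhs = len(b_rows[0]) if b_rows else 0
    if lda < max(1, m):
        raise ValueError("LDA must be >= max(1, M)")
    flat = [0.0] * (lda * (n + nrhs))
    for i in range(m):
        for j in range(n):
            flat[i + j * lda] = a_rows[i][j]           # left block: A
        for j in range(nrhs):
            flat[i + (n + j) * lda] = b_rows[i][j]     # right block: B
    return flat
```

For a 2-by-2 A, a 2-by-1 B, and `lda = 3`, each of the three columns takes three slots, the last of which is padding.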
318*>
319*> \param[in] LDA
320*> \verbatim
321*> LDA is INTEGER
322*> The leading dimension of the array A. LDA >= max(1,M).
323*> This is the leading dimension for both matrices, A and B.
324*> \endverbatim
325*>
326*> \param[out] K
327*> \verbatim
328*> K is INTEGER
329*> Factorization rank of the matrix A, i.e. the rank of
330*> the factor R, which is the same as the number of non-zero
331*> rows of the factor R. 0 <= K <= min(M,KMAX,N).
332*>
333*> K also represents the number of non-zero Householder
334*> vectors.
335*>
336*> NOTE: If K = 0, a) the arrays A and B are not modified;
337*> b) the array TAU(1:min(M,N)) is set to ZERO,
338*> if the matrix A does not contain NaN,
339*> otherwise the elements TAU(1:min(M,N))
340*> are undefined;
341*> c) the elements of the array JPIV are set
342*> as follows: for j = 1:N, JPIV(j) = j.
343*> \endverbatim
344*>
345*> \param[out] MAXC2NRMK
346*> \verbatim
347*> MAXC2NRMK is REAL
348*> The maximum column 2-norm of the residual matrix R22(K),
349*> when the factorization stopped at rank K. MAXC2NRMK >= 0.
350*>
351*> a) If K = 0, i.e. the factorization was not performed,
352*> the matrix A was not modified and is itself a residual
353*> matrix, then MAXC2NRMK equals the maximum column 2-norm
354*> of the original matrix A.
355*>
356*> b) If 0 < K < min(M,N), then MAXC2NRMK is returned.
357*>
358*> c) If K = min(M,N), i.e. the whole matrix A was
359*> factorized and there is no residual matrix,
360*> then MAXC2NRMK = 0.0.
361*>
362*> NOTE: MAXC2NRMK in the factorization step K would equal
363*> abs(R(K+1,K+1)) in the next factorization step K+1.
364*> \endverbatim
365*>
366*> \param[out] RELMAXC2NRMK
367*> \verbatim
368*> RELMAXC2NRMK is REAL
369*> The ratio MAXC2NRMK / MAXC2NRM of the maximum column
370*> 2-norm of the residual matrix R22(K) (when the factorization
371*> stopped at rank K) to the maximum column 2-norm of the
372*> whole original matrix A. RELMAXC2NRMK >= 0.
373*>
374*> a) If K = 0, i.e. the factorization was not performed,
375*> the matrix A was not modified and is itself a residual
376*> matrix, then RELMAXC2NRMK = 1.0.
377*>
378*> b) If 0 < K < min(M,N), then
379*> RELMAXC2NRMK = MAXC2NRMK / MAXC2NRM is returned.
380*>
381*> c) If K = min(M,N), i.e. the whole matrix A was
382*> factorized and there is no residual matrix,
383*> then RELMAXC2NRMK = 0.0.
384*>
385*> NOTE: RELMAXC2NRMK in the factorization step K would equal
386*> abs(R(K+1,K+1))/abs(R(1,1)) in the next factorization
387*> step K+1.
388*> \endverbatim
389*>
390*> \param[out] JPIV
391*> \verbatim
392*> JPIV is INTEGER array, dimension (N)
393*> Column pivot indices. For 1 <= j <= N, column j
394*> of the matrix A was interchanged with column JPIV(j).
395*>
396*> The elements of the array JPIV(1:N) are always set
397*> by the routine, for example, even when no columns
398*> were factorized, i.e. when K = 0, the elements are
399*> set as JPIV(j) = j for j = 1:N.
400*> \endverbatim
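One way to use JPIV is to rebuild the permuted matrix A * P(K) from the original A. The sketch below assumes that JPIV is read as a permutation vector, i.e. column j of A * P(K) is column JPIV(j) of the original A, which is how the JPVT array of SGEQP3 is documented; this reading is an assumption and should be checked against the routine's output. The helper name is hypothetical, and `jpiv` holds 1-based indices as returned by Fortran.

```python
def apply_column_permutation(a_rows, jpiv):
    """Return A * P as a list of rows, given 1-based pivot indices jpiv."""
    n = len(jpiv)
    if sorted(jpiv) != list(range(1, n + 1)):
        raise ValueError("jpiv must be a permutation of 1..N")
    return [[row[jpiv[j] - 1] for j in range(n)] for row in a_rows]
```

With the identity vector `JPIV(j) = j`, which the routine sets when K = 0, the helper returns a copy of A.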
401*>
402*> \param[out] TAU
403*> \verbatim
404*> TAU is REAL array, dimension (min(M,N))
405*> The scalar factors of the elementary reflectors.
406*>
407*> If 0 < K <= min(M,N), only the elements TAU(1:K) of
408*> the array TAU are modified by the factorization.
409*> After the factorization is computed, if no NaN was found
410*> during the factorization, the remaining elements
411*> TAU(K+1:min(M,N)) are set to zero, otherwise the
412*> elements TAU(K+1:min(M,N)) are not set and therefore
413*> undefined.
414*> ( If K = 0, all elements of TAU are set to zero, if
415*> the matrix A does not contain NaN. )
416*> \endverbatim
417*>
418*> \param[out] WORK
419*> \verbatim
420*> WORK is REAL array, dimension (MAX(1,LWORK))
421*> On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
422*> \endverbatim
423*>
424*> \param[in] LWORK
425*> \verbatim
426*> LWORK is INTEGER
427*> The dimension of the array WORK.
428*> LWORK >= 1, if MIN(M,N) = 0, and
429*> LWORK >= (3*N+NRHS-1), otherwise.
430*> For optimal performance LWORK >= (2*N + NB*( N+NRHS+1 )),
431*> where NB is the optimal block size for SGEQP3RK returned
432*> by ILAENV. Minimal block size MINNB=2.
433*>
434*> NOTE: The decision, whether to use unblocked BLAS 2
435*> or blocked BLAS 3 code is based not only on the dimension
436*> LWORK of the available workspace WORK, but also on the
437*> matrix A dimension N via crossover point NX returned
438*> by ILAENV. (For N less than NX, unblocked code should be
439*> used.)
440*>
441*> If LWORK = -1, then a workspace query is assumed;
442*> the routine only calculates the optimal size of the WORK
443*> array, returns this value as the first entry of the WORK
444*> array, and no error message related to LWORK is issued
445*> by XERBLA.
446*> \endverbatim
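The two workspace formulas stated for LWORK can be written out as a small Python sketch. The function names are hypothetical, and the block size `nb` is taken as an input because the routine obtains it from ILAENV; the value 32 used in the usage note is only an illustrative assumption. In practice a workspace query with LWORK = -1 is the way to get the optimal size from the routine itself.

```python
def min_lwork(m, n, nrhs):
    """Minimum LWORK: 1 if min(M,N) = 0, otherwise 3*N + NRHS - 1."""
    if min(m, n) == 0:
        return 1
    return 3 * n + nrhs - 1

def optimal_lwork(m, n, nrhs, nb):
    """LWORK for optimal performance: 2*N + NB*(N + NRHS + 1)."""
    if min(m, n) == 0:
        return 1
    return 2 * n + nb * (n + nrhs + 1)
```

For example, with M = 100, N = 50, NRHS = 2 and an assumed NB = 32, the minimum is 151 and the optimal size is 1796.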
447*>
448*> \param[out] IWORK
449*> \verbatim
450*> IWORK is INTEGER array, dimension (N-1).
451*> Is a work array. ( IWORK is used to store indices
452*> of "bad" columns for norm downdating in the residual
453*> matrix in the blocked step auxiliary subroutine SLAQP3RK ).
454*> \endverbatim
455*>
456*> \param[out] INFO
457*> \verbatim
458*> INFO is INTEGER
459*> 1) INFO = 0: successful exit.
460*> 2) INFO < 0: if INFO = -i, the i-th argument had an
461*> illegal value.
462*> 3) If INFO = j_1, where 1 <= j_1 <= N, then NaN was
463*> detected and the routine stops the computation.
464*> The j_1-th column of the matrix A or the j_1-th
465*> element of array TAU contains the first occurrence
466*> of NaN in the factorization step K+1 ( when K columns
467*> have been factorized ).
468*>
469*> On exit:
470*> K is set to the number of
471*> factorized columns without
472*> exception.
473*> MAXC2NRMK is set to NaN.
474*> RELMAXC2NRMK is set to NaN.
475*> TAU(K+1:min(M,N)) is not set and contains undefined
476*> elements. If j_1=K+1, TAU(K+1)
477*> may contain NaN.
478*> 4) If INFO = j_2, where N+1 <= j_2 <= 2*N, then no NaN
479*> was detected, but +Inf (or -Inf) was detected and
480*> the routine continues the computation until completion.
481*> The (j_2-N)-th column of the matrix A contains the first
482*> occurrence of +Inf (or -Inf) in the factorization
483*> step K+1 ( when K columns have been factorized ).
484*> \endverbatim
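The four INFO ranges can be decoded mechanically. The following Python sketch uses a hypothetical helper `describe_info` that maps an INFO value to a label and the 1-based argument or column index it refers to.

```python
def describe_info(info, n):
    """Classify INFO for a matrix with N columns.

    Returns a (kind, index) pair:
      ("ok", None)                 successful exit
      ("illegal_argument", i)      the i-th argument had an illegal value
      ("nan", j)                   first NaN at column / TAU element j
      ("inf", j)                   first +Inf or -Inf in column j of A
    """
    if info == 0:
        return ("ok", None)
    if info < 0:
        return ("illegal_argument", -info)
    if 1 <= info <= n:
        return ("nan", info)
    if n + 1 <= info <= 2 * n:
        return ("inf", info - n)
    raise ValueError("INFO outside the documented range")
```

Only the NaN case stops the computation early; in the Inf case the routine runs to completion, so the outputs are still populated.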
485*
486* Authors:
487* ========
488*
489*> \author Univ. of Tennessee
490*> \author Univ. of California Berkeley
491*> \author Univ. of Colorado Denver
492*> \author NAG Ltd.
493*
494*> \ingroup geqp3rk
495*
496*> \par Further Details:
497* =====================
498*
499*> \verbatim
500*> SGEQP3RK is based on the same BLAS3 Householder QR factorization
501*> algorithm with column pivoting as in SGEQP3 routine which uses
502*> SLARFG routine to generate Householder reflectors
503*> for QR factorization.
504*>
505*> We can also write:
506*>
507*> A = A_approx(K) + A_residual(K)
508*>
509*> The low rank approximation matrix A(K)_approx from
510*> the truncated QR factorization of rank K of the matrix A is:
511*>
512*> A(K)_approx = Q(K) * ( R(K)_approx ) * P(K)**T
513*> ( 0 0 )
514*>
515*> = Q(K) * ( R11(K) R12(K) ) * P(K)**T
516*> ( 0 0 )
517*>
518*> The residual A_residual(K) of the matrix A is:
519*>
520*> A_residual(K) = Q(K) * ( 0 0 ) * P(K)**T =
521*> ( 0 R(K)_residual )
522*>
523*> = Q(K) * ( 0 0 ) * P(K)**T
524*> ( 0 R22(K) )
525*>
526*> The truncated (rank K) factorization guarantees that
527*> the maximum column 2-norm of A_residual(K) is less than
528*> or equal to MAXC2NRMK up to roundoff error.
529*>
530*> NOTE: An approximation of the null vectors
531*> of A can be easily computed from R11(K)
532*> and R12(K):
533*>
534*> Null( A(K) )_approx = P * ( inv(R11(K)) * R12(K) )
535*> ( -I )
536*>
537*> \endverbatim
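The note on approximate null vectors can be checked directly from the factorization. The derivation below is a sketch that assumes R11(K) is nonsingular and drops the (K) arguments for brevity; I denotes the (N-K)-by-(N-K) identity.

```latex
\begin{aligned}
A P \begin{pmatrix} R_{11}^{-1} R_{12} \\ -I \end{pmatrix}
  &= Q \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}
       \begin{pmatrix} R_{11}^{-1} R_{12} \\ -I \end{pmatrix} \\
  &= Q \begin{pmatrix} R_{12} - R_{12} \\ -R_{22} \end{pmatrix}
   = -\,Q \begin{pmatrix} 0 \\ R_{22} \end{pmatrix}.
\end{aligned}
```

Since Q is orthogonal, each column of the product has the same 2-norm as the corresponding column of R22, which is at most MAXC2NRMK. The columns of P times the block vector are therefore mapped by A to vectors of norm at most MAXC2NRMK, and they are exact null vectors when R22 = 0.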
538*
539*> \par References:
540* ================
541*> [1] A Level 3 BLAS QR factorization algorithm with column pivoting developed in 1996.
542*> G. Quintana-Orti, Depto. de Informatica, Universidad Jaime I, Spain.
543*> X. Sun, Computer Science Dept., Duke University, USA.
544*> C. H. Bischof, Math. and Comp. Sci. Div., Argonne National Lab, USA.
545*> A BLAS-3 version of the QR factorization with column pivoting.
546*> LAPACK Working Note 114
547*> <a href="https://www.netlib.org/lapack/lawnspdf/lawn114.pdf">https://www.netlib.org/lapack/lawnspdf/lawn114.pdf</a>
548*> and in
549*> SIAM J. Sci. Comput., 19(5):1486-1494, Sept. 1998.
550*> <a href="https://doi.org/10.1137/S1064827595296732">https://doi.org/10.1137/S1064827595296732</a>
551*>
552*> [2] A partial column norm updating strategy developed in 2006.
553*> Z. Drmac and Z. Bujanovic, Dept. of Math., University of Zagreb, Croatia.
554*> On the failure of rank revealing QR factorization software – a case study.
555*> LAPACK Working Note 176.
556*> <a href="http://www.netlib.org/lapack/lawnspdf/lawn176.pdf">http://www.netlib.org/lapack/lawnspdf/lawn176.pdf</a>
557*> and in
558*> ACM Trans. Math. Softw. 35, 2, Article 12 (July 2008), 28 pages.
559*> <a href="https://doi.org/10.1145/1377612.1377616">https://doi.org/10.1145/1377612.1377616</a>
560*
561*> \par Contributors:
562* ==================
563*>
564*> \verbatim
565*>
566*> November 2023, Igor Kozachenko, James Demmel,
567*> EECS Department,
568*> University of California, Berkeley, USA.
569*>
570*> \endverbatim
571*
572* =====================================================================
573 SUBROUTINE sgeqp3rk( M, N, NRHS, KMAX, ABSTOL, RELTOL, A, LDA,
574 $ K, MAXC2NRMK, RELMAXC2NRMK, JPIV, TAU,
575 $ WORK, LWORK, IWORK, INFO )
576 IMPLICIT NONE
577*
578* -- LAPACK computational routine --
579* -- LAPACK is a software package provided by Univ. of Tennessee, --
580* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
581*
582* .. Scalar Arguments ..
583 INTEGER INFO, K, KF, KMAX, LDA, LWORK, M, N, NRHS
584 REAL ABSTOL, MAXC2NRMK, RELMAXC2NRMK, RELTOL
585* ..
586* .. Array Arguments ..
587 INTEGER IWORK( * ), JPIV( * )
588 REAL A( LDA, * ), TAU( * ), WORK( * )
589* ..
590*
591* =====================================================================
592*
593* .. Parameters ..
594 INTEGER INB, INBMIN, IXOVER
595 PARAMETER ( INB = 1, inbmin = 2, ixover = 3 )
596 REAL ZERO, ONE, TWO
597 parameter( zero = 0.0e+0, one = 1.0e+0, two = 2.0e+0 )
598* ..
599* .. Local Scalars ..
600 LOGICAL LQUERY, DONE
601 INTEGER IINFO, IOFFSET, IWS, J, JB, JBF, JMAXB, JMAX,
602 $ jmaxc2nrm, kp1, lwkopt, minmn, n_sub, nb,
603 $ nbmin, nx
604 REAL EPS, HUGEVAL, MAXC2NRM, SAFMIN
605* ..
606* .. External Subroutines ..
607 EXTERNAL slaqp2rk, slaqp3rk, xerbla
608* ..
609* .. External Functions ..
610 LOGICAL SISNAN
611 INTEGER ISAMAX, ILAENV
612 REAL SLAMCH, SNRM2, SROUNDUP_LWORK
613 EXTERNAL sisnan, slamch, snrm2, isamax, ilaenv,
614 $ sroundup_lwork
615* ..
616* .. Intrinsic Functions ..
617 INTRINSIC real, max, min
618* ..
619* .. Executable Statements ..
620*
621* Test input arguments
622* ====================
623*
624 info = 0
625 lquery = ( lwork.EQ.-1 )
626 IF( m.LT.0 ) THEN
627 info = -1
628 ELSE IF( n.LT.0 ) THEN
629 info = -2
630 ELSE IF( nrhs.LT.0 ) THEN
631 info = -3
632 ELSE IF( kmax.LT.0 ) THEN
633 info = -4
634 ELSE IF( sisnan( abstol ) ) THEN
635 info = -5
636 ELSE IF( sisnan( reltol ) ) THEN
637 info = -6
638 ELSE IF( lda.LT.max( 1, m ) ) THEN
639 info = -8
640 END IF
641*
642* If the input parameters M, N, NRHS, KMAX, LDA are valid:
643* a) Test the input workspace size LWORK for the minimum
644* size requirement IWS.
645* b) Determine the optimal block size NB and optimal
646* workspace size LWKOPT to be returned in WORK(1)
647* in case of (1) LWORK < IWS, (2) LQUERY = .TRUE.,
648* (3) when routine exits.
649* Here, IWS is the minimum workspace required for unblocked
650* code.
651*
652 IF( info.EQ.0 ) THEN
653 minmn = min( m, n )
654 IF( minmn.EQ.0 ) THEN
655 iws = 1
656 lwkopt = 1
657 ELSE
658*
659* Minimal workspace size in case of using only unblocked
660* BLAS 2 code in SLAQP2RK.
661* 1) SGEQP3RK and SLAQP2RK: 2*N to store full and partial
662* column 2-norms.
663* 2) SLAQP2RK: N+NRHS-1 to use in WORK array that is used
664* in SLARF1F subroutine inside SLAQP2RK to apply an
665* elementary reflector from the left.
666* TOTAL_WORK_SIZE = 3*N + NRHS - 1
667*
668 iws = 3*n + nrhs - 1
669*
670* Assign to NB optimal block size.
671*
672 nb = ilaenv( inb, 'SGEQP3RK', ' ', m, n, -1, -1 )
673*
674* A formula for the optimal workspace size in case of using
675* both unblocked BLAS 2 in SLAQP2RK and blocked BLAS 3 code
676* in SLAQP3RK.
677* 1) SGEQP3RK, SLAQP2RK, SLAQP3RK: 2*N to store full and
678* partial column 2-norms.
679* 2) SLAQP2RK: N+NRHS-1 to use in WORK array that is used
680* in SLARF1F subroutine to apply an elementary reflector
681* from the left.
682* 3) SLAQP3RK: NB*(N+NRHS) to use in the work array F that
683* is used to apply a block reflector from
684* the left.
685* 4) SLAQP3RK: NB to use in the auxiliary array AUX.
686* Sizes (2) and ((3) + (4)) should intersect, therefore
687* TOTAL_WORK_SIZE = 2*N + NB*( N+NRHS+1 ), given NBMIN=2.
688*
689 lwkopt = 2*n + nb*( n+nrhs+1 )
690 END IF
691 work( 1 ) = sroundup_lwork( lwkopt )
692*
693 IF( ( lwork.LT.iws ) .AND. .NOT.lquery ) THEN
694 info = -15
695 END IF
696 END IF
697*
698* NOTE: The optimal workspace size is returned in WORK(1), if
699* the input parameters M, N, NRHS, KMAX, LDA are valid.
700*
701 IF( info.NE.0 ) THEN
702 CALL xerbla( 'SGEQP3RK', -info )
703 RETURN
704 ELSE IF( lquery ) THEN
705 RETURN
706 END IF
707*
708* Quick return if possible for M=0 or N=0.
709*
710 IF( minmn.EQ.0 ) THEN
711 k = 0
712 maxc2nrmk = zero
713 relmaxc2nrmk = zero
714 work( 1 ) = sroundup_lwork( lwkopt )
715 RETURN
716 END IF
717*
718* ==================================================================
719*
720* Initialize column pivot array JPIV.
721*
722 DO j = 1, n
723 jpiv( j ) = j
724 END DO
725*
726* ==================================================================
727*
728* Initialize storage for partial and exact column 2-norms.
729* a) The elements WORK(1:N) are used to store partial column
730* 2-norms of the matrix A, and may decrease in each computation
731* step; initialize to the values of complete columns 2-norms.
732* b) The elements WORK(N+1:2*N) are used to store complete column
733* 2-norms of the matrix A, they are not changed during the
734* computation; initialize the values of complete columns 2-norms.
735*
736 DO j = 1, n
737 work( j ) = snrm2( m, a( 1, j ), 1 )
738 work( n+j ) = work( j )
739 END DO
740*
741* ==================================================================
742*
743* Compute the pivot column index and the maximum column 2-norm
744* for the whole original matrix stored in A(1:M,1:N).
745*
746 kp1 = isamax( n, work( 1 ), 1 )
747 maxc2nrm = work( kp1 )
748*
749* ==================================================================
750*
751 IF( sisnan( maxc2nrm ) ) THEN
752*
753* Check if the matrix A contains NaN, set INFO parameter
754* to the column number where the first NaN is found and return
755* from the routine.
756*
757 k = 0
758 info = kp1
759*
760* Set MAXC2NRMK and RELMAXC2NRMK to NaN.
761*
762 maxc2nrmk = maxc2nrm
763 relmaxc2nrmk = maxc2nrm
764*
765* Array TAU is not set and contains undefined elements.
766*
767 work( 1 ) = sroundup_lwork( lwkopt )
768 RETURN
769 END IF
770*
771* ===================================================================
772*
773 IF( maxc2nrm.EQ.zero ) THEN
774*
775* Check if the matrix A is a zero matrix, set array TAU and
776* return from the routine.
777*
778 k = 0
779 maxc2nrmk = zero
780 relmaxc2nrmk = zero
781*
782 DO j = 1, minmn
783 tau( j ) = zero
784 END DO
785*
786 work( 1 ) = sroundup_lwork( lwkopt )
787 RETURN
788*
789 END IF
790*
791* ===================================================================
792*
793 hugeval = slamch( 'Overflow' )
794*
795 IF( maxc2nrm.GT.hugeval ) THEN
796*
797* Check if the matrix A contains +Inf or -Inf, set INFO parameter
798* to the column number, where the first +/-Inf is found plus N,
799* and continue the computation.
800*
801 info = n + kp1
802*
803 END IF
804*
805* ==================================================================
806*
807* Quick return if possible for the case when the first
808* stopping criterion is satisfied, i.e. KMAX = 0.
809*
810 IF( kmax.EQ.0 ) THEN
811 k = 0
812 maxc2nrmk = maxc2nrm
813 relmaxc2nrmk = one
814 DO j = 1, minmn
815 tau( j ) = zero
816 END DO
817 work( 1 ) = sroundup_lwork( lwkopt )
818 RETURN
819 END IF
820*
821* ==================================================================
822*
823 eps = slamch('Epsilon')
824*
825* Adjust ABSTOL
826*
827 IF( abstol.GE.zero ) THEN
828 safmin = slamch('Safe minimum')
829 abstol = max( abstol, two*safmin )
830 END IF
831*
832* Adjust RELTOL
833*
834 IF( reltol.GE.zero ) THEN
835 reltol = max( reltol, eps )
836 END IF
837*
838* ===================================================================
839*
840* JMAX is the maximum index of the column to be factorized,
841* which is also limited by the first stopping criterion KMAX.
842*
843 jmax = min( kmax, minmn )
844*
845* ===================================================================
846*
847* Quick return if possible for the case when the second or third
848* stopping criterion for the whole original matrix is satisfied,
849* i.e. MAXC2NRM <= ABSTOL or RELMAXC2NRM <= RELTOL
850* (which is ONE <= RELTOL).
851*
852 IF( maxc2nrm.LE.abstol .OR. one.LE.reltol ) THEN
853*
854 k = 0
855 maxc2nrmk = maxc2nrm
856 relmaxc2nrmk = one
857*
858 DO j = 1, minmn
859 tau( j ) = zero
860 END DO
861*
862 work( 1 ) = sroundup_lwork( lwkopt )
863 RETURN
864 END IF
865*
866* ==================================================================
867* Factorize columns
868* ==================================================================
869*
870* Determine the block size.
871*
872 nbmin = 2
873 nx = 0
874*
875 IF( ( nb.GT.1 ) .AND. ( nb.LT.minmn ) ) THEN
876*
877* Determine when to cross over from blocked to unblocked code.
878* (for N less than NX, unblocked code should be used).
879*
880 nx = max( 0, ilaenv( ixover, 'SGEQP3RK', ' ', m, n, -1,
881 $ -1 ))
882*
883 IF( nx.LT.minmn ) THEN
884*
885* Determine if workspace is large enough for blocked code.
886*
887 IF( lwork.LT.lwkopt ) THEN
888*
889* Not enough workspace to use optimal block size that
890* is currently stored in NB.
891* Reduce NB and determine the minimum value of NB.
892*
893 nb = ( lwork-2*n ) / ( n+1 )
894 nbmin = max( 2, ilaenv( inbmin, 'SGEQP3RK', ' ', m, n,
895 $ -1, -1 ) )
896*
897 END IF
898 END IF
899 END IF
900*
901* ==================================================================
902*
903* DONE is the boolean flag to represent the case when the
904* factorization completed in the block factorization routine,
905* before the end of the block.
906*
907 done = .false.
908*
909* J is the column index.
910*
911 j = 1
912*
913* (1) Use blocked code initially.
914*
915* JMAXB is the maximum column index of the block, when the
916* blocked code is used, is also limited by the first stopping
917* criterion KMAX.
918*
919 jmaxb = min( kmax, minmn - nx )
920*
921 IF( nb.GE.nbmin .AND. nb.LT.jmax .AND. jmaxb.GT.0 ) THEN
922*
923* Loop over the column blocks of the matrix A(1:M,1:JMAXB). Here:
924* J is the column index of a column block;
925* JB is the column block size to pass to block factorization
926* routine in a loop step;
927* JBF is the number of columns that were actually factorized
928* that was returned by the block factorization routine
929* in a loop step, JBF <= JB;
930* N_SUB is the number of columns in the submatrix;
931* IOFFSET is the number of rows that should not be factorized.
932*
933 DO WHILE( j.LE.jmaxb )
934*
935 jb = min( nb, jmaxb-j+1 )
936 n_sub = n-j+1
937 ioffset = j-1
938*
939* Factorize JB columns among the columns A(J:N).
940*
941 CALL slaqp3rk( m, n_sub, nrhs, ioffset, jb, abstol,
942 $ reltol, kp1, maxc2nrm, a( 1, j ), lda,
943 $ done, jbf, maxc2nrmk, relmaxc2nrmk,
944 $ jpiv( j ), tau( j ),
945 $ work( j ), work( n+j ),
946 $ work( 2*n+1 ), work( 2*n+jb+1 ),
947 $ n+nrhs-j+1, iwork, iinfo )
948*
949* Set INFO on the first occurrence of Inf.
950*
951 IF( iinfo.GT.n_sub .AND. info.EQ.0 ) THEN
952 info = 2*ioffset + iinfo
953 END IF
954*
955 IF( done ) THEN
956*
957* Either the submatrix is zero before the end of the
958* column block, or the ABSTOL or RELTOL criterion is
959* satisfied before the end of the column block, so we can
960* return from the routine. Perform the following before
961* returning:
962* a) Set the number of factorized columns K,
963* K = IOFFSET + JBF from the last call of blocked
964* routine.
965* NOTE: 1) MAXC2NRMK and RELMAXC2NRMK are returned
966* by the block factorization routine;
967* 2) The remaining TAUs are set to ZERO by the
968* block factorization routine.
969*
970 k = ioffset + jbf
971*
972* Set INFO on the first occurrence of NaN; NaN takes
973* precedence over Inf.
974*
975 IF( iinfo.LE.n_sub .AND. iinfo.GT.0 ) THEN
976 info = ioffset + iinfo
977 END IF
978*
979* Return from the routine.
980*
981 work( 1 ) = sroundup_lwork( lwkopt )
982*
983 RETURN
984*
985 END IF
986*
987 j = j + jbf
988*
989 END DO
990*
991 END IF
992*
993* Use unblocked code to factor the last or only block.
994* J = JMAX+1 means we factorized the maximum possible number of
995* columns; in that case, in the ELSE clause, we need to compute
996* MAXC2NRMK and RELMAXC2NRMK to return after we processed
997* the blocks.
998*
999 IF( j.LE.jmax ) THEN
1000*
1001* N_SUB is the number of columns in the submatrix;
1002* IOFFSET is the number of rows that should not be factorized.
1003*
1004 n_sub = n-j+1
1005 ioffset = j-1
1006*
1007 CALL slaqp2rk( m, n_sub, nrhs, ioffset, jmax-j+1,
1008 $ abstol, reltol, kp1, maxc2nrm, a( 1, j ), lda,
1009 $ kf, maxc2nrmk, relmaxc2nrmk, jpiv( j ),
1010 $ tau( j ), work( j ), work( n+j ),
1011 $ work( 2*n+1 ), iinfo )
1012*
1013* The ABSTOL or RELTOL criterion is satisfied when the number of
1014* factorized columns KF is smaller than the number
1015* of columns JMAX-J+1 supplied to be factorized by the
1016* unblocked routine; in that case we can return from
1017* the routine. Perform the following before returning:
1018* a) Set the number of factorized columns K,
1019* b) MAXC2NRMK and RELMAXC2NRMK are returned by the
1020* unblocked factorization routine above.
1021*
1022 k = j - 1 + kf
1023*
1024* Set INFO on the first exception occurrence.
1025*
1026* Set INFO on the first exception occurrence of Inf or NaN
1027* (NaN takes precedence over Inf).
1028*
1029 IF( iinfo.GT.n_sub .AND. info.EQ.0 ) THEN
1030 info = 2*ioffset + iinfo
1031 ELSE IF( iinfo.LE.n_sub .AND. iinfo.GT.0 ) THEN
1032 info = ioffset + iinfo
1033 END IF
1034*
1035 ELSE
1036*
1037* Compute the return values for blocked code.
1038*
1039* Set the number of factorized columns if the unblocked routine
1040* was not called.
1041*
1042 k = jmax
1043*
1044* If there exists a residual matrix after the blocked code:
1045* 1) compute the values of MAXC2NRMK, RELMAXC2NRMK of the
1046* residual matrix, otherwise set them to ZERO;
1047* 2) Set TAU(K+1:MINMN) to ZERO.
1048*
1049 IF( k.LT.minmn ) THEN
1050 jmaxc2nrm = k + isamax( n-k, work( k+1 ), 1 )
1051 maxc2nrmk = work( jmaxc2nrm )
1052 IF( k.EQ.0 ) THEN
1053 relmaxc2nrmk = one
1054 ELSE
1055 relmaxc2nrmk = maxc2nrmk / maxc2nrm
1056 END IF
1057*
1058 DO j = k + 1, minmn
1059 tau( j ) = zero
1060 END DO
1061*
1062 END IF
1063*
1064* END IF( J.LE.JMAX ) THEN
1065*
1066 END IF
1067*
1068 work( 1 ) = sroundup_lwork( lwkopt )
1069*
1070 RETURN
1071*
1072* End of SGEQP3RK
1073*
1074 END
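The INFO arithmetic used in the listing above can be checked in isolation. The following is a minimal sketch, not part of LAPACK: the program name and the helpers `global_info` and `report` are hypothetical, and the sample values of N, IOFFSET and IINFO are made up for illustration. It assumes, as the comparisons against N_SUB in the listing suggest, that the block routines return a local code IINFO <= N_SUB for a NaN and IINFO > N_SUB for an Inf. The sketch does not model the `INFO.EQ.0` guard that keeps the first Inf report, nor negative INFO values.

```fortran
program decode_info_demo
   implicit none
   integer, parameter :: n = 10      ! illustrative number of columns
   integer :: ioffset, n_sub, iinfo, info

   ! Submatrix starting at column J = 5, so IOFFSET = J-1 and
   ! N_SUB = N-J+1, as in the listing.
   ioffset = 4
   n_sub = n - ioffset

   ! Hypothetical local code: NaN in local column 3 (IINFO <= N_SUB).
   iinfo = 3
   info = global_info( ioffset, n_sub, iinfo )
   call report( n, info )

   ! Hypothetical local code: Inf in local column 3 (IINFO > N_SUB).
   iinfo = n_sub + 3
   info = global_info( ioffset, n_sub, iinfo )
   call report( n, info )

contains

   ! Map a local exception code to a global one using the same
   ! arithmetic as the driver listing.
   integer function global_info( ioffset, n_sub, iinfo )
      integer, intent(in) :: ioffset, n_sub, iinfo
      if( iinfo.gt.n_sub ) then
         ! Inf: 2*IOFFSET + N_SUB + local column = N + global column
         global_info = 2*ioffset + iinfo
      else if( iinfo.gt.0 ) then
         ! NaN: IOFFSET + local column = global column
         global_info = ioffset + iinfo
      else
         global_info = 0
      end if
   end function global_info

   ! Interpret a positive global code on the caller's side.
   subroutine report( n, info )
      integer, intent(in) :: n, info
      if( info.gt.n ) then
         print '(a,i0)', 'Inf first detected in column ', info - n
      else if( info.gt.0 ) then
         print '(a,i0)', 'NaN first detected in column ', info
      else
         print '(a)', 'no NaN or Inf reported'
      end if
   end subroutine report

end program decode_info_demo
```

With these sample values both calls report column 7: the NaN case gives INFO = 4 + 3 = 7, and the Inf case gives INFO = 2*4 + 9 = 17 = N + 7. Because IOFFSET + N_SUB = N, adding 2*IOFFSET to an Inf code always lands in the range N+1..2*N, so a caller can tell the two cases apart with a single comparison against N.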