LAPACK 3.12.0
LAPACK: Linear Algebra PACKage
Loading...
Searching...
No Matches
zgeqp3rk.f
Go to the documentation of this file.
1*> \brief \b ZGEQP3RK computes a truncated Householder QR factorization with column pivoting of a complex m-by-n matrix A by using Level 3 BLAS and overwrites m-by-nrhs matrix B with Q**H * B.
2*
3* =========== DOCUMENTATION ===========
4*
5* Online html documentation available at
6* http://www.netlib.org/lapack/explore-html/
7*
8*> \htmlonly
9*> Download ZGEQP3RK + dependencies
10*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/zgeqp3rk.f">
11*> [TGZ]</a>
12*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/zgeqp3rk.f">
13*> [ZIP]</a>
14*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/zgeqp3rk.f">
15*> [TXT]</a>
16*> \endhtmlonly
17*
18* Definition:
19* ===========
20*
21* SUBROUTINE ZGEQP3RK( M, N, NRHS, KMAX, ABSTOL, RELTOL, A, LDA,
22* $ K, MAXC2NRMK, RELMAXC2NRMK, JPIV, TAU,
23* $ WORK, LWORK, RWORK, IWORK, INFO )
24* IMPLICIT NONE
25*
26* .. Scalar Arguments ..
27* INTEGER INFO, K, KMAX, LDA, LWORK, M, N, NRHS
28* DOUBLE PRECISION ABSTOL, MAXC2NRMK, RELMAXC2NRMK, RELTOL
29* ..
30* .. Array Arguments ..
31* INTEGER IWORK( * ), JPIV( * )
32* DOUBLE PRECISION RWORK( * )
33* COMPLEX*16 A( LDA, * ), TAU( * ), WORK( * )
34* ..
35*
36*
37*> \par Purpose:
38* =============
39*>
40*> \verbatim
41*>
42*> ZGEQP3RK performs two tasks simultaneously:
43*>
44*> Task 1: The routine computes a truncated (rank K) or full rank
45*> Householder QR factorization with column pivoting of a complex
46*> M-by-N matrix A using Level 3 BLAS. K is the number of columns
47*> that were factorized, i.e. factorization rank of the
48*> factor R, K <= min(M,N).
49*>
50*> A * P(K) = Q(K) * R(K) =
51*>
52*> = Q(K) * ( R11(K) R12(K) ) = Q(K) * ( R(K)_approx )
53*> ( 0 R22(K) ) ( 0 R(K)_residual ),
54*>
55*> where:
56*>
57*> P(K) is an N-by-N permutation matrix;
58*> Q(K) is an M-by-M unitary matrix;
59*> R(K)_approx = ( R11(K), R12(K) ) is a rank K approximation of the
60*> full rank factor R with K-by-K upper-triangular
61*> R11(K) and K-by-N rectangular R12(K). The diagonal
62*> entries of R11(K) appear in non-increasing order
63*> of absolute value, and absolute values of all of
64*> them exceed the maximum column 2-norm of R22(K)
65*> up to roundoff error.
66*> R(K)_residual = R22(K) is the residual of a rank K approximation
67*> of the full rank factor R. It is a
68*> an (M-K)-by-(N-K) rectangular matrix;
69*> 0 is a an (M-K)-by-K zero matrix.
70*>
71*> Task 2: At the same time, the routine overwrites a complex M-by-NRHS
72*> matrix B with Q(K)**H * B using Level 3 BLAS.
73*>
74*> =====================================================================
75*>
76*> The matrices A and B are stored on input in the array A as
77*> the left and right blocks A(1:M,1:N) and A(1:M, N+1:N+NRHS)
78*> respectively.
79*>
80*> N NRHS
81*> array_A = M [ mat_A, mat_B ]
82*>
83*> The truncation criteria (i.e. when to stop the factorization)
84*> can be any of the following:
85*>
86*> 1) The input parameter KMAX, the maximum number of columns
87*> KMAX to factorize, i.e. the factorization rank is limited
88*> to KMAX. If KMAX >= min(M,N), the criterion is not used.
89*>
90*> 2) The input parameter ABSTOL, the absolute tolerance for
91*> the maximum column 2-norm of the residual matrix R22(K). This
92*> means that the factorization stops if this norm is less or
93*> equal to ABSTOL. If ABSTOL < 0.0, the criterion is not used.
94*>
95*> 3) The input parameter RELTOL, the tolerance for the maximum
96*> column 2-norm matrix of the residual matrix R22(K) divided
97*> by the maximum column 2-norm of the original matrix A, which
98*> is equal to abs(R(1,1)). This means that the factorization stops
99*> when the ratio of the maximum column 2-norm of R22(K) to
100*> the maximum column 2-norm of A is less than or equal to RELTOL.
101*> If RELTOL < 0.0, the criterion is not used.
102*>
103*> 4) In case both stopping criteria ABSTOL or RELTOL are not used,
104*> and when the residual matrix R22(K) is a zero matrix in some
105*> factorization step K. ( This stopping criterion is implicit. )
106*>
107*> The algorithm stops when any of these conditions is first
108*> satisfied, otherwise the whole matrix A is factorized.
109*>
110*> To factorize the whole matrix A, use the values
111*> KMAX >= min(M,N), ABSTOL < 0.0 and RELTOL < 0.0.
112*>
113*> The routine returns:
114*> a) Q(K), R(K)_approx = ( R11(K), R12(K) ),
115*> R(K)_residual = R22(K), P(K), i.e. the resulting matrices
116*> of the factorization; P(K) is represented by JPIV,
117*> ( if K = min(M,N), R(K)_approx is the full factor R,
118*> and there is no residual matrix R(K)_residual);
119*> b) K, the number of columns that were factorized,
120*> i.e. factorization rank;
121*> c) MAXC2NRMK, the maximum column 2-norm of the residual
122*> matrix R(K)_residual = R22(K),
123*> ( if K = min(M,N), MAXC2NRMK = 0.0 );
124*> d) RELMAXC2NRMK equals MAXC2NRMK divided by MAXC2NRM, the maximum
125*> column 2-norm of the original matrix A, which is equal
126*> to abs(R(1,1)), ( if K = min(M,N), RELMAXC2NRMK = 0.0 );
127*> e) Q(K)**H * B, the matrix B with the unitary
128*> transformation Q(K)**H applied on the left.
129*>
130*> The N-by-N permutation matrix P(K) is stored in a compact form in
131*> the integer array JPIV. For 1 <= j <= N, column j
132*> of the matrix A was interchanged with column JPIV(j).
133*>
134*> The M-by-M unitary matrix Q is represented as a product
135*> of elementary Householder reflectors
136*>
137*> Q(K) = H(1) * H(2) * . . . * H(K),
138*>
139*> where K is the number of columns that were factorized.
140*>
141*> Each H(j) has the form
142*>
143*> H(j) = I - tau * v * v**H,
144*>
145*> where 1 <= j <= K and
146*> I is an M-by-M identity matrix,
147*> tau is a complex scalar,
148*> v is a complex vector with v(1:j-1) = 0 and v(j) = 1.
149*>
150*> v(j+1:M) is stored on exit in A(j+1:M,j) and tau in TAU(j).
151*>
152*> See the Further Details section for more information.
153*> \endverbatim
154*
155* Arguments:
156* ==========
157*
158*> \param[in] M
159*> \verbatim
160*> M is INTEGER
161*> The number of rows of the matrix A. M >= 0.
162*> \endverbatim
163*>
164*> \param[in] N
165*> \verbatim
166*> N is INTEGER
167*> The number of columns of the matrix A. N >= 0.
168*> \endverbatim
169*>
170*> \param[in] NRHS
171*> \verbatim
172*> NRHS is INTEGER
173*> The number of right hand sides, i.e. the number of
174*> columns of the matrix B. NRHS >= 0.
175*> \endverbatim
176*>
177*> \param[in] KMAX
178*> \verbatim
179*> KMAX is INTEGER
180*>
181*> The first factorization stopping criterion. KMAX >= 0.
182*>
183*> The maximum number of columns of the matrix A to factorize,
184*> i.e. the maximum factorization rank.
185*>
186*> a) If KMAX >= min(M,N), then this stopping criterion
187*> is not used, the routine factorizes columns
188*> depending on ABSTOL and RELTOL.
189*>
190*> b) If KMAX = 0, then this stopping criterion is
191*> satisfied on input and the routine exits immediately.
192*> This means that the factorization is not performed,
193*> the matrices A and B are not modified, and
194*> the matrix A is itself the residual.
195*> \endverbatim
196*>
197*> \param[in] ABSTOL
198*> \verbatim
199*> ABSTOL is DOUBLE PRECISION
200*>
201*> The second factorization stopping criterion, cannot be NaN.
202*>
203*> The absolute tolerance (stopping threshold) for
204*> maximum column 2-norm of the residual matrix R22(K).
205*> The algorithm converges (stops the factorization) when
206*> the maximum column 2-norm of the residual matrix R22(K)
207*> is less than or equal to ABSTOL. Let SAFMIN = DLAMCH('S').
208*>
209*> a) If ABSTOL is NaN, then no computation is performed
210*> and an error message ( INFO = -5 ) is issued
211*> by XERBLA.
212*>
213*> b) If ABSTOL < 0.0, then this stopping criterion is not
214*> used, the routine factorizes columns depending
215*> on KMAX and RELTOL.
216*> This includes the case ABSTOL = -Inf.
217*>
218*> c) If 0.0 <= ABSTOL < 2*SAFMIN, then ABSTOL = 2*SAFMIN
219*> is used. This includes the case ABSTOL = -0.0.
220*>
221*> d) If 2*SAFMIN <= ABSTOL then the input value
222*> of ABSTOL is used.
223*>
224*> Let MAXC2NRM be the maximum column 2-norm of the
225*> whole original matrix A.
226*> If ABSTOL chosen above is >= MAXC2NRM, then this
227*> stopping criterion is satisfied on input and routine exits
228*> immediately after MAXC2NRM is computed. The routine
229*> returns MAXC2NRM in MAXC2NORMK,
230*> and 1.0 in RELMAXC2NORMK.
231*> This includes the case ABSTOL = +Inf. This means that the
232*> factorization is not performed, the matrices A and B are not
233*> modified, and the matrix A is itself the residual.
234*> \endverbatim
235*>
236*> \param[in] RELTOL
237*> \verbatim
238*> RELTOL is DOUBLE PRECISION
239*>
240*> The third factorization stopping criterion, cannot be NaN.
241*>
242*> The tolerance (stopping threshold) for the ratio
243*> abs(R(K+1,K+1))/abs(R(1,1)) of the maximum column 2-norm of
244*> the residual matrix R22(K) to the maximum column 2-norm of
245*> the original matrix A. The algorithm converges (stops the
246*> factorization), when abs(R(K+1,K+1))/abs(R(1,1)) A is less
247*> than or equal to RELTOL. Let EPS = DLAMCH('E').
248*>
249*> a) If RELTOL is NaN, then no computation is performed
250*> and an error message ( INFO = -6 ) is issued
251*> by XERBLA.
252*>
253*> b) If RELTOL < 0.0, then this stopping criterion is not
254*> used, the routine factorizes columns depending
255*> on KMAX and ABSTOL.
256*> This includes the case RELTOL = -Inf.
257*>
258*> c) If 0.0 <= RELTOL < EPS, then RELTOL = EPS is used.
259*> This includes the case RELTOL = -0.0.
260*>
261*> d) If EPS <= RELTOL then the input value of RELTOL
262*> is used.
263*>
264*> Let MAXC2NRM be the maximum column 2-norm of the
265*> whole original matrix A.
266*> If RELTOL chosen above is >= 1.0, then this stopping
267*> criterion is satisfied on input and routine exits
268*> immediately after MAXC2NRM is computed.
269*> The routine returns MAXC2NRM in MAXC2NORMK,
270*> and 1.0 in RELMAXC2NORMK.
271*> This includes the case RELTOL = +Inf. This means that the
272*> factorization is not performed, the matrices A and B are not
273*> modified, and the matrix A is itself the residual.
274*>
275*> NOTE: We recommend that RELTOL satisfy
276*> min( 10*max(M,N)*EPS, sqrt(EPS) ) <= RELTOL
277*> \endverbatim
278*>
279*> \param[in,out] A
280*> \verbatim
281*> A is COMPLEX*16 array, dimension (LDA,N+NRHS)
282*>
283*> On entry:
284*>
285*> a) The subarray A(1:M,1:N) contains the M-by-N matrix A.
286*> b) The subarray A(1:M,N+1:N+NRHS) contains the M-by-NRHS
287*> matrix B.
288*>
289*> N NRHS
290*> array_A = M [ mat_A, mat_B ]
291*>
292*> On exit:
293*>
294*> a) The subarray A(1:M,1:N) contains parts of the factors
295*> of the matrix A:
296*>
297*> 1) If K = 0, A(1:M,1:N) contains the original matrix A.
298*> 2) If K > 0, A(1:M,1:N) contains parts of the
299*> factors:
300*>
301*> 1. The elements below the diagonal of the subarray
302*> A(1:M,1:K) together with TAU(1:K) represent the
303*> unitary matrix Q(K) as a product of K Householder
304*> elementary reflectors.
305*>
306*> 2. The elements on and above the diagonal of
307*> the subarray A(1:K,1:N) contain K-by-N
308*> upper-trapezoidal matrix
309*> R(K)_approx = ( R11(K), R12(K) ).
310*> NOTE: If K=min(M,N), i.e. full rank factorization,
311*> then R_approx(K) is the full factor R which
312*> is upper-trapezoidal. If, in addition, M>=N,
313*> then R is upper-triangular.
314*>
315*> 3. The subarray A(K+1:M,K+1:N) contains (M-K)-by-(N-K)
316*> rectangular matrix R(K)_residual = R22(K).
317*>
318*> b) If NRHS > 0, the subarray A(1:M,N+1:N+NRHS) contains
319*> the M-by-NRHS product Q(K)**H * B.
320*> \endverbatim
321*>
322*> \param[in] LDA
323*> \verbatim
324*> LDA is INTEGER
325*> The leading dimension of the array A. LDA >= max(1,M).
326*> This is the leading dimension for both matrices, A and B.
327*> \endverbatim
328*>
329*> \param[out] K
330*> \verbatim
331*> K is INTEGER
332*> Factorization rank of the matrix A, i.e. the rank of
333*> the factor R, which is the same as the number of non-zero
334*> rows of the factor R. 0 <= K <= min(M,KMAX,N).
335*>
336*> K also represents the number of non-zero Householder
337*> vectors.
338*>
339*> NOTE: If K = 0, a) the arrays A and B are not modified;
340*> b) the array TAU(1:min(M,N)) is set to ZERO,
341*> if the matrix A does not contain NaN,
342*> otherwise the elements TAU(1:min(M,N))
343*> are undefined;
344*> c) the elements of the array JPIV are set
345*> as follows: for j = 1:N, JPIV(j) = j.
346*> \endverbatim
347*>
348*> \param[out] MAXC2NRMK
349*> \verbatim
350*> MAXC2NRMK is DOUBLE PRECISION
351*> The maximum column 2-norm of the residual matrix R22(K),
352*> when the factorization stopped at rank K. MAXC2NRMK >= 0.
353*>
354*> a) If K = 0, i.e. the factorization was not performed,
355*> the matrix A was not modified and is itself a residual
356*> matrix, then MAXC2NRMK equals the maximum column 2-norm
357*> of the original matrix A.
358*>
359*> b) If 0 < K < min(M,N), then MAXC2NRMK is returned.
360*>
361*> c) If K = min(M,N), i.e. the whole matrix A was
362*> factorized and there is no residual matrix,
363*> then MAXC2NRMK = 0.0.
364*>
365*> NOTE: MAXC2NRMK in the factorization step K would equal
366*> R(K+1,K+1) in the next factorization step K+1.
367*> \endverbatim
368*>
369*> \param[out] RELMAXC2NRMK
370*> \verbatim
371*> RELMAXC2NRMK is DOUBLE PRECISION
372*> The ratio MAXC2NRMK / MAXC2NRM of the maximum column
373*> 2-norm of the residual matrix R22(K) (when the factorization
374*> stopped at rank K) to the maximum column 2-norm of the
375*> whole original matrix A. RELMAXC2NRMK >= 0.
376*>
377*> a) If K = 0, i.e. the factorization was not performed,
378*> the matrix A was not modified and is itself a residual
379*> matrix, then RELMAXC2NRMK = 1.0.
380*>
381*> b) If 0 < K < min(M,N), then
382*> RELMAXC2NRMK = MAXC2NRMK / MAXC2NRM is returned.
383*>
384*> c) If K = min(M,N), i.e. the whole matrix A was
385*> factorized and there is no residual matrix,
386*> then RELMAXC2NRMK = 0.0.
387*>
388*> NOTE: RELMAXC2NRMK in the factorization step K would equal
389*> abs(R(K+1,K+1))/abs(R(1,1)) in the next factorization
390*> step K+1.
391*> \endverbatim
392*>
393*> \param[out] JPIV
394*> \verbatim
395*> JPIV is INTEGER array, dimension (N)
396*> Column pivot indices. For 1 <= j <= N, column j
397*> of the matrix A was interchanged with column JPIV(j).
398*>
399*> The elements of the array JPIV(1:N) are always set
400*> by the routine, for example, even when no columns
401*> were factorized, i.e. when K = 0, the elements are
402*> set as JPIV(j) = j for j = 1:N.
403*> \endverbatim
404*>
405*> \param[out] TAU
406*> \verbatim
407*> TAU is COMPLEX*16 array, dimension (min(M,N))
408*> The scalar factors of the elementary reflectors.
409*>
410*> If 0 < K <= min(M,N), only the elements TAU(1:K) of
411*> the array TAU are modified by the factorization.
412*> After the factorization computed, if no NaN was found
413*> during the factorization, the remaining elements
414*> TAU(K+1:min(M,N)) are set to zero, otherwise the
415*> elements TAU(K+1:min(M,N)) are not set and therefore
416*> undefined.
417*> ( If K = 0, all elements of TAU are set to zero, if
418*> the matrix A does not contain NaN. )
419*> \endverbatim
420*>
421*> \param[out] WORK
422*> \verbatim
423*> WORK is COMPLEX*16 array, dimension (MAX(1,LWORK))
424*> On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
425*> \endverbatim
426*>
427*> \param[in] LWORK
428*> \verbatim
429*> LWORK is INTEGER
430*> The dimension of the array WORK.
431*. LWORK >= N+NRHS-1
432*> For optimal performance LWORK >= NB*( N+NRHS+1 ),
433*> where NB is the optimal block size for ZGEQP3RK returned
434*> by ILAENV. Minimal block size MINNB=2.
435*>
436*> NOTE: The decision, whether to use unblocked BLAS 2
437*> or blocked BLAS 3 code is based not only on the dimension
438*> LWORK of the availbale workspace WORK, but also also on the
439*> matrix A dimension N via crossover point NX returned
440*> by ILAENV. (For N less than NX, unblocked code should be
441*> used.)
442*>
443*> If LWORK = -1, then a workspace query is assumed;
444*> the routine only calculates the optimal size of the WORK
445*> array, returns this value as the first entry of the WORK
446*> array, and no error message related to LWORK is issued
447*> by XERBLA.
448*> \endverbatim
449*>
450*> \param[out] RWORK
451*> \verbatim
452*> RWORK is DOUBLE PRECISION array, dimension (2*N)
453*> \endverbatim
454*>
455*> \param[out] IWORK
456*> \verbatim
457*> IWORK is INTEGER array, dimension (N-1).
458*> Is a work array. ( IWORK is used to store indices
459*> of "bad" columns for norm downdating in the residual
460*> matrix in the blocked step auxiliary subroutine ZLAQP3RK ).
461*> \endverbatim
462*>
463*> \param[out] INFO
464*> \verbatim
465*> INFO is INTEGER
466*> 1) INFO = 0: successful exit.
467*> 2) INFO < 0: if INFO = -i, the i-th argument had an
468*> illegal value.
469*> 3) If INFO = j_1, where 1 <= j_1 <= N, then NaN was
470*> detected and the routine stops the computation.
471*> The j_1-th column of the matrix A or the j_1-th
472*> element of array TAU contains the first occurrence
473*> of NaN in the factorization step K+1 ( when K columns
474*> have been factorized ).
475*>
476*> On exit:
477*> K is set to the number of
478*> factorized columns without
479*> exception.
480*> MAXC2NRMK is set to NaN.
481*> RELMAXC2NRMK is set to NaN.
482*> TAU(K+1:min(M,N)) is not set and contains undefined
483*> elements. If j_1=K+1, TAU(K+1)
484*> may contain NaN.
485*> 4) If INFO = j_2, where N+1 <= j_2 <= 2*N, then no NaN
486*> was detected, but +Inf (or -Inf) was detected and
487*> the routine continues the computation until completion.
488*> The (j_2-N)-th column of the matrix A contains the first
489*> occurrence of +Inf (or -Inf) in the factorization
490*> step K+1 ( when K columns have been factorized ).
491*> \endverbatim
492*
493* Authors:
494* ========
495*
496*> \author Univ. of Tennessee
497*> \author Univ. of California Berkeley
498*> \author Univ. of Colorado Denver
499*> \author NAG Ltd.
500*
501*> \ingroup geqp3rk
502*
503*> \par Further Details:
504* =====================
505*
506*> \verbatim
507*> ZGEQP3RK is based on the same BLAS3 Householder QR factorization
508*> algorithm with column pivoting as in ZGEQP3 routine which uses
509*> ZLARFG routine to generate Householder reflectors
510*> for QR factorization.
511*>
512*> We can also write:
513*>
514*> A = A_approx(K) + A_residual(K)
515*>
516*> The low rank approximation matrix A(K)_approx from
517*> the truncated QR factorization of rank K of the matrix A is:
518*>
519*> A(K)_approx = Q(K) * ( R(K)_approx ) * P(K)**T
520*> ( 0 0 )
521*>
522*> = Q(K) * ( R11(K) R12(K) ) * P(K)**T
523*> ( 0 0 )
524*>
525*> The residual A_residual(K) of the matrix A is:
526*>
527*> A_residual(K) = Q(K) * ( 0 0 ) * P(K)**T =
528*> ( 0 R(K)_residual )
529*>
530*> = Q(K) * ( 0 0 ) * P(K)**T
531*> ( 0 R22(K) )
532*>
533*> The truncated (rank K) factorization guarantees that
534*> the maximum column 2-norm of A_residual(K) is less than
535*> or equal to MAXC2NRMK up to roundoff error.
536*>
537*> NOTE: An approximation of the null vectors
538*> of A can be easily computed from R11(K)
539*> and R12(K):
540*>
541*> Null( A(K) )_approx = P * ( inv(R11(K)) * R12(K) )
542*> ( -I )
543*>
544*> \endverbatim
545*
546*> \par References:
547* ================
548*> [1] A Level 3 BLAS QR factorization algorithm with column pivoting developed in 1996.
549*> G. Quintana-Orti, Depto. de Informatica, Universidad Jaime I, Spain.
550*> X. Sun, Computer Science Dept., Duke University, USA.
551*> C. H. Bischof, Math. and Comp. Sci. Div., Argonne National Lab, USA.
552*> A BLAS-3 version of the QR factorization with column pivoting.
553*> LAPACK Working Note 114
554*> \htmlonly
555*> <a href="https://www.netlib.org/lapack/lawnspdf/lawn114.pdf">https://www.netlib.org/lapack/lawnspdf/lawn114.pdf</a>
556*> \endhtmlonly
557*> and in
558*> SIAM J. Sci. Comput., 19(5):1486-1494, Sept. 1998.
559*> \htmlonly
560*> <a href="https://doi.org/10.1137/S1064827595296732">https://doi.org/10.1137/S1064827595296732</a>
561*> \endhtmlonly
562*>
563*> [2] A partial column norm updating strategy developed in 2006.
564*> Z. Drmac and Z. Bujanovic, Dept. of Math., University of Zagreb, Croatia.
565*> On the failure of rank revealing QR factorization software – a case study.
566*> LAPACK Working Note 176.
567*> \htmlonly
568*> <a href="http://www.netlib.org/lapack/lawnspdf/lawn176.pdf">http://www.netlib.org/lapack/lawnspdf/lawn176.pdf</a>
569*> \endhtmlonly
570*> and in
571*> ACM Trans. Math. Softw. 35, 2, Article 12 (July 2008), 28 pages.
572*> \htmlonly
573*> <a href="https://doi.org/10.1145/1377612.1377616">https://doi.org/10.1145/1377612.1377616</a>
574*> \endhtmlonly
575*
576*> \par Contributors:
577* ==================
578*>
579*> \verbatim
580*>
581*> November 2023, Igor Kozachenko, James Demmel,
582*> EECS Department,
583*> University of California, Berkeley, USA.
584*>
585*> \endverbatim
586*
587* =====================================================================
588 SUBROUTINE zgeqp3rk( M, N, NRHS, KMAX, ABSTOL, RELTOL, A, LDA,
589 $ K, MAXC2NRMK, RELMAXC2NRMK, JPIV, TAU,
590 $ WORK, LWORK, RWORK, IWORK, INFO )
591 IMPLICIT NONE
592*
593* -- LAPACK computational routine --
594* -- LAPACK is a software package provided by Univ. of Tennessee, --
595* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
596*
597* .. Scalar Arguments ..
598 INTEGER INFO, K, KF, KMAX, LDA, LWORK, M, N, NRHS
599 DOUBLE PRECISION ABSTOL, MAXC2NRMK, RELMAXC2NRMK, RELTOL
600* ..
601* .. Array Arguments ..
602 INTEGER IWORK( * ), JPIV( * )
603 DOUBLE PRECISION RWORK( * )
604 COMPLEX*16 A( LDA, * ), TAU( * ), WORK( * )
605* ..
606*
607* =====================================================================
608*
609* .. Parameters ..
610 INTEGER INB, INBMIN, IXOVER
611 PARAMETER ( INB = 1, inbmin = 2, ixover = 3 )
612 DOUBLE PRECISION ZERO, ONE, TWO
613 parameter( zero = 0.0d+0, one = 1.0d+0, two = 2.0d+0 )
614 COMPLEX*16 CZERO
615 parameter( czero = ( 0.0d+0, 0.0d+0 ) )
616* ..
617* .. Local Scalars ..
618 LOGICAL LQUERY, DONE
619 INTEGER IINFO, IOFFSET, IWS, J, JB, JBF, JMAXB, JMAX,
620 $ jmaxc2nrm, kp1, lwkopt, minmn, n_sub, nb,
621 $ nbmin, nx
622 DOUBLE PRECISION EPS, HUGEVAL, MAXC2NRM, SAFMIN
623* ..
624* .. External Subroutines ..
625 EXTERNAL zlaqp2rk, zlaqp3rk, xerbla
626* ..
627* .. External Functions ..
628 LOGICAL DISNAN
629 INTEGER IDAMAX, ILAENV
630 DOUBLE PRECISION DLAMCH, DZNRM2
631 EXTERNAL disnan, dlamch, dznrm2, idamax, ilaenv
632* ..
633* .. Intrinsic Functions ..
634 INTRINSIC dcmplx, max, min
635* ..
636* .. Executable Statements ..
637*
638* Test input arguments
639* ====================
640*
641 info = 0
642 lquery = ( lwork.EQ.-1 )
643 IF( m.LT.0 ) THEN
644 info = -1
645 ELSE IF( n.LT.0 ) THEN
646 info = -2
647 ELSE IF( nrhs.LT.0 ) THEN
648 info = -3
649 ELSE IF( kmax.LT.0 ) THEN
650 info = -4
651 ELSE IF( disnan( abstol ) ) THEN
652 info = -5
653 ELSE IF( disnan( reltol ) ) THEN
654 info = -6
655 ELSE IF( lda.LT.max( 1, m ) ) THEN
656 info = -8
657 END IF
658*
659* If the input parameters M, N, NRHS, KMAX, LDA are valid:
660* a) Test the input workspace size LWORK for the minimum
661* size requirement IWS.
662* b) Determine the optimal block size NB and optimal
663* workspace size LWKOPT to be returned in WORK(1)
664* in case of (1) LWORK < IWS, (2) LQUERY = .TRUE.,
665* (3) when routine exits.
666* Here, IWS is the miminum workspace required for unblocked
667* code.
668*
669 IF( info.EQ.0 ) THEN
670 minmn = min( m, n )
671 IF( minmn.EQ.0 ) THEN
672 iws = 1
673 lwkopt = 1
674 ELSE
675*
676* Minimal workspace size in case of using only unblocked
677* BLAS 2 code in ZLAQP2RK.
678* 1) ZLAQP2RK: N+NRHS-1 to use in WORK array that is used
679* in ZLARF subroutine inside ZLAQP2RK to apply an
680* elementary reflector from the left.
681* TOTAL_WORK_SIZE = 3*N + NRHS - 1
682*
683 iws = n + nrhs - 1
684*
685* Assign to NB optimal block size.
686*
687 nb = ilaenv( inb, 'ZGEQP3RK', ' ', m, n, -1, -1 )
688*
689* A formula for the optimal workspace size in case of using
690* both unblocked BLAS 2 in ZLAQP2RK and blocked BLAS 3 code
691* in ZLAQP3RK.
692* 1) ZGEQP3RK, ZLAQP2RK, ZLAQP3RK: 2*N to store full and
693* partial column 2-norms.
694* 2) ZLAQP2RK: N+NRHS-1 to use in WORK array that is used
695* in ZLARF subroutine to apply an elementary reflector
696* from the left.
697* 3) ZLAQP3RK: NB*(N+NRHS) to use in the work array F that
698* is used to apply a block reflector from
699* the left.
700* 4) ZLAQP3RK: NB to use in the auxilixary array AUX.
701* Sizes (2) and ((3) + (4)) should intersect, therefore
702* TOTAL_WORK_SIZE = 2*N + NB*( N+NRHS+1 ), given NBMIN=2.
703*
704 lwkopt = 2*n + nb*( n+nrhs+1 )
705 END IF
706 work( 1 ) = dcmplx( lwkopt )
707*
708 IF( ( lwork.LT.iws ) .AND. .NOT.lquery ) THEN
709 info = -15
710 END IF
711 END IF
712*
713* NOTE: The optimal workspace size is returned in WORK(1), if
714* the input parameters M, N, NRHS, KMAX, LDA are valid.
715*
716 IF( info.NE.0 ) THEN
717 CALL xerbla( 'ZGEQP3RK', -info )
718 RETURN
719 ELSE IF( lquery ) THEN
720 RETURN
721 END IF
722*
723* Quick return if possible for M=0 or N=0.
724*
725 IF( minmn.EQ.0 ) THEN
726 k = 0
727 maxc2nrmk = zero
728 relmaxc2nrmk = zero
729 work( 1 ) = dcmplx( lwkopt )
730 RETURN
731 END IF
732*
733* ==================================================================
734*
735* Initialize column pivot array JPIV.
736*
737 DO j = 1, n
738 jpiv( j ) = j
739 END DO
740*
741* ==================================================================
742*
743* Initialize storage for partial and exact column 2-norms.
744* a) The elements WORK(1:N) are used to store partial column
745* 2-norms of the matrix A, and may decrease in each computation
746* step; initialize to the values of complete columns 2-norms.
747* b) The elements WORK(N+1:2*N) are used to store complete column
748* 2-norms of the matrix A, they are not changed during the
749* computation; initialize the values of complete columns 2-norms.
750*
751 DO j = 1, n
752 rwork( j ) = dznrm2( m, a( 1, j ), 1 )
753 rwork( n+j ) = rwork( j )
754 END DO
755*
756* ==================================================================
757*
758* Compute the pivot column index and the maximum column 2-norm
759* for the whole original matrix stored in A(1:M,1:N).
760*
761 kp1 = idamax( n, rwork( 1 ), 1 )
762*
763* ==================================================================.
764*
765 IF( disnan( maxc2nrm ) ) THEN
766*
767* Check if the matrix A contains NaN, set INFO parameter
768* to the column number where the first NaN is found and return
769* from the routine.
770*
771 k = 0
772 info = kp1
773*
774* Set MAXC2NRMK and RELMAXC2NRMK to NaN.
775*
776 maxc2nrmk = maxc2nrm
777 relmaxc2nrmk = maxc2nrm
778*
779* Array TAU is not set and contains undefined elements.
780*
781 work( 1 ) = dcmplx( lwkopt )
782 RETURN
783 END IF
784*
785* ===================================================================
786*
787 IF( maxc2nrm.EQ.zero ) THEN
788*
789* Check is the matrix A is a zero matrix, set array TAU and
790* return from the routine.
791*
792 k = 0
793 maxc2nrmk = zero
794 relmaxc2nrmk = zero
795*
796 DO j = 1, minmn
797 tau( j ) = czero
798 END DO
799*
800 work( 1 ) = dcmplx( lwkopt )
801 RETURN
802*
803 END IF
804*
805* ===================================================================
806*
807 hugeval = dlamch( 'Overflow' )
808*
809 IF( maxc2nrm.GT.hugeval ) THEN
810*
811* Check if the matrix A contains +Inf or -Inf, set INFO parameter
812* to the column number, where the first +/-Inf is found plus N,
813* and continue the computation.
814*
815 info = n + kp1
816*
817 END IF
818*
819* ==================================================================
820*
821* Quick return if possible for the case when the first
822* stopping criterion is satisfied, i.e. KMAX = 0.
823*
824 IF( kmax.EQ.0 ) THEN
825 k = 0
826 maxc2nrmk = maxc2nrm
827 relmaxc2nrmk = one
828 DO j = 1, minmn
829 tau( j ) = czero
830 END DO
831 work( 1 ) = dcmplx( lwkopt )
832 RETURN
833 END IF
834*
835* ==================================================================
836*
837 eps = dlamch('Epsilon')
838*
839* Adjust ABSTOL
840*
841 IF( abstol.GE.zero ) THEN
842 safmin = dlamch('Safe minimum')
843 abstol = max( abstol, two*safmin )
844 END IF
845*
846* Adjust RELTOL
847*
848 IF( reltol.GE.zero ) THEN
849 reltol = max( reltol, eps )
850 END IF
851*
852* ===================================================================
853*
854* JMAX is the maximum index of the column to be factorized,
855* which is also limited by the first stopping criterion KMAX.
856*
857 jmax = min( kmax, minmn )
858*
859* ===================================================================
860*
861* Quick return if possible for the case when the second or third
862* stopping criterion for the whole original matrix is satified,
863* i.e. MAXC2NRM <= ABSTOL or RELMAXC2NRM <= RELTOL
864* (which is ONE <= RELTOL).
865*
866 IF( maxc2nrm.LE.abstol .OR. one.LE.reltol ) THEN
867*
868 k = 0
869 maxc2nrmk = maxc2nrm
870 relmaxc2nrmk = one
871*
872 DO j = 1, minmn
873 tau( j ) = czero
874 END DO
875*
876 work( 1 ) = dcmplx( lwkopt )
877 RETURN
878 END IF
879*
880* ==================================================================
881* Factorize columns
882* ==================================================================
883*
884* Determine the block size.
885*
886 nbmin = 2
887 nx = 0
888*
889 IF( ( nb.GT.1 ) .AND. ( nb.LT.minmn ) ) THEN
890*
891* Determine when to cross over from blocked to unblocked code.
892* (for N less than NX, unblocked code should be used).
893*
894 nx = max( 0, ilaenv( ixover, 'ZGEQP3RK', ' ', m, n, -1, -1 ) )
895*
896 IF( nx.LT.minmn ) THEN
897*
898* Determine if workspace is large enough for blocked code.
899*
900 IF( lwork.LT.lwkopt ) THEN
901*
902* Not enough workspace to use optimal block size that
903* is currently stored in NB.
904* Reduce NB and determine the minimum value of NB.
905*
906 nb = ( lwork-2*n ) / ( n+1 )
907 nbmin = max( 2, ilaenv( inbmin, 'ZGEQP3RK', ' ', m, n,
908 $ -1, -1 ) )
909*
910 END IF
911 END IF
912 END IF
913*
914* ==================================================================
915*
916* DONE is the boolean flag to rerpresent the case when the
917* factorization completed in the block factorization routine,
918* before the end of the block.
919*
920 done = .false.
921*
922* J is the column index.
923*
924 j = 1
925*
926* (1) Use blocked code initially.
927*
928* JMAXB is the maximum column index of the block, when the
929* blocked code is used, is also limited by the first stopping
930* criterion KMAX.
931*
932 jmaxb = min( kmax, minmn - nx )
933*
934 IF( nb.GE.nbmin .AND. nb.LT.jmax .AND. jmaxb.GT.0 ) THEN
935*
936* Loop over the column blocks of the matrix A(1:M,1:JMAXB). Here:
937* J is the column index of a column block;
938* JB is the column block size to pass to block factorization
939* routine in a loop step;
940* JBF is the number of columns that were actually factorized
941* that was returned by the block factorization routine
942* in a loop step, JBF <= JB;
943* N_SUB is the number of columns in the submatrix;
944* IOFFSET is the number of rows that should not be factorized.
945*
946 DO WHILE( j.LE.jmaxb )
947*
948 jb = min( nb, jmaxb-j+1 )
949 n_sub = n-j+1
950 ioffset = j-1
951*
952* Factorize JB columns among the columns A(J:N).
953*
954 CALL zlaqp3rk( m, n_sub, nrhs, ioffset, jb, abstol,
955 $ reltol, kp1, maxc2nrm, a( 1, j ), lda,
956 $ done, jbf, maxc2nrmk, relmaxc2nrmk,
957 $ jpiv( j ), tau( j ),
958 $ rwork( j ), rwork( n+j ),
959 $ work( 1 ), work( jb+1 ),
960 $ n+nrhs-j+1, iwork, iinfo )
961*
962* Set INFO on the first occurence of Inf.
963*
964 IF( iinfo.GT.n_sub .AND. info.EQ.0 ) THEN
965 info = 2*ioffset + iinfo
966 END IF
967*
968 IF( done ) THEN
969*
970* Either the submatrix is zero before the end of the
971* column block, or ABSTOL or RELTOL criterion is
972* satisfied before the end of the column block, we can
973* return from the routine. Perform the following before
974* returning:
975* a) Set the number of factorized columns K,
976* K = IOFFSET + JBF from the last call of blocked
977* routine.
978* NOTE: 1) MAXC2NRMK and RELMAXC2NRMK are returned
979* by the block factorization routine;
980* 2) The remaining TAUs are set to ZERO by the
981* block factorization routine.
982*
983 k = ioffset + jbf
984*
985* Set INFO on the first occurrence of NaN, NaN takes
986* prcedence over Inf.
987*
988 IF( iinfo.LE.n_sub .AND. iinfo.GT.0 ) THEN
989 info = ioffset + iinfo
990 END IF
991*
992* Return from the routine.
993*
994 work( 1 ) = dcmplx( lwkopt )
995*
996 RETURN
997*
998 END IF
999*
1000 j = j + jbf
1001*
1002 END DO
1003*
1004 END IF
1005*
1006* Use unblocked code to factor the last or only block.
1007* J = JMAX+1 means we factorized the maximum possible number of
1008* columns, that is in ELSE clause we need to compute
1009* the MAXC2NORM and RELMAXC2NORM to return after we processed
1010* the blocks.
1011*
1012 IF( j.LE.jmax ) THEN
1013*
1014* N_SUB is the number of columns in the submatrix;
1015* IOFFSET is the number of rows that should not be factorized.
1016*
1017 n_sub = n-j+1
1018 ioffset = j-1
1019*
1020 CALL zlaqp2rk( m, n_sub, nrhs, ioffset, jmax-j+1,
1021 $ abstol, reltol, kp1, maxc2nrm, a( 1, j ), lda,
1022 $ kf, maxc2nrmk, relmaxc2nrmk, jpiv( j ),
1023 $ tau( j ), rwork( j ), rwork( n+j ),
1024 $ work( 1 ), iinfo )
1025*
1026* ABSTOL or RELTOL criterion is satisfied when the number of
1027* the factorized columns KF is smaller then the number
1028* of columns JMAX-J+1 supplied to be factorized by the
1029* unblocked routine, we can return from
1030* the routine. Perform the following before returning:
1031* a) Set the number of factorized columns K,
1032* b) MAXC2NRMK and RELMAXC2NRMK are returned by the
1033* unblocked factorization routine above.
1034*
1035 k = j - 1 + kf
1036*
1037* Set INFO on the first exception occurence.
1038*
1039* Set INFO on the first exception occurence of Inf or NaN,
1040* (NaN takes precedence over Inf).
1041*
1042 IF( iinfo.GT.n_sub .AND. info.EQ.0 ) THEN
1043 info = 2*ioffset + iinfo
1044 ELSE IF( iinfo.LE.n_sub .AND. iinfo.GT.0 ) THEN
1045 info = ioffset + iinfo
1046 END IF
1047*
1048 ELSE
1049*
1050* Compute the return values for blocked code.
1051*
1052* Set the number of factorized columns if the unblocked routine
1053* was not called.
1054*
1055 k = jmax
1056*
1057* If there exits a residual matrix after the blocked code:
1058* 1) compute the values of MAXC2NRMK, RELMAXC2NRMK of the
1059* residual matrix, otherwise set them to ZERO;
1060* 2) Set TAU(K+1:MINMN) to ZERO.
1061*
1062 IF( k.LT.minmn ) THEN
1063 jmaxc2nrm = k + idamax( n-k, rwork( k+1 ), 1 )
1064 maxc2nrmk = rwork( jmaxc2nrm )
1065 IF( k.EQ.0 ) THEN
1066 relmaxc2nrmk = one
1067 ELSE
1068 relmaxc2nrmk = maxc2nrmk / maxc2nrm
1069 END IF
1070*
1071 DO j = k + 1, minmn
1072 tau( j ) = czero
1073 END DO
1074*
1075 ELSE
1076 maxc2nrmk = zero
1077 relmaxc2nrmk = zero
1078*
1079 END IF
1080*
1081* END IF( J.LE.JMAX ) THEN
1082*
1083 END IF
1084*
1085 work( 1 ) = dcmplx( lwkopt )
1086*
1087 RETURN
1088*
1089* End of ZGEQP3RK
1090*
1091 END
subroutine xerbla(srname, info)
Definition cblat2.f:3285
subroutine zgeqp3rk(m, n, nrhs, kmax, abstol, reltol, a, lda, k, maxc2nrmk, relmaxc2nrmk, jpiv, tau, work, lwork, rwork, iwork, info)
ZGEQP3RK computes a truncated Householder QR factorization with column pivoting of a complex m-by-n m...
Definition zgeqp3rk.f:591
subroutine zlaqp2rk(m, n, nrhs, ioffset, kmax, abstol, reltol, kp1, maxc2nrm, a, lda, k, maxc2nrmk, relmaxc2nrmk, jpiv, tau, vn1, vn2, work, info)
ZLAQP2RK computes truncated QR factorization with column pivoting of a complex matrix block using Lev...
Definition zlaqp2rk.f:345
subroutine zlaqp3rk(m, n, nrhs, ioffset, nb, abstol, reltol, kp1, maxc2nrm, a, lda, done, kb, maxc2nrmk, relmaxc2nrmk, jpiv, tau, vn1, vn2, auxv, f, ldf, iwork, info)
ZLAQP3RK computes a step of truncated QR factorization with column pivoting of a complex m-by-n matri...
Definition zlaqp3rk.f:396