SCALAPACK 2.2.2
LAPACK: Linear Algebra PACKage
Loading...
Searching...
No Matches
psgeequ.f
Go to the documentation of this file.
1 SUBROUTINE psgeequ( M, N, A, IA, JA, DESCA, R, C, ROWCND, COLCND,
2 $ AMAX, INFO )
3*
4* -- ScaLAPACK routine (version 1.7) --
5* University of Tennessee, Knoxville, Oak Ridge National Laboratory,
6* and University of California, Berkeley.
7* May 1, 1997
8*
9* .. Scalar Arguments ..
10 INTEGER IA, INFO, JA, M, N
11 REAL AMAX, COLCND, ROWCND
12* ..
13* .. Array Arguments ..
14 INTEGER DESCA( * )
15 REAL A( * ), C( * ), R( * )
16* ..
17*
18* Purpose
19* =======
20*
21* PSGEEQU computes row and column scalings intended to equilibrate an
22* M-by-N distributed matrix sub( A ) = A(IA:IA+N-1,JA:JA:JA+N-1) and
23* reduce its condition number. R returns the row scale factors and C
24* the column scale factors, chosen to try to make the largest entry in
25* each row and column of the distributed matrix B with elements
26* B(i,j) = R(i) * A(i,j) * C(j) have absolute value 1.
27*
28* R(i) and C(j) are restricted to be between SMLNUM = smallest safe
29* number and BIGNUM = largest safe number. Use of these scaling
30* factors is not guaranteed to reduce the condition number of
31* sub( A ) but works well in practice.
32*
33* Notes
34* =====
35*
36* Each global data object is described by an associated description
37* vector. This vector stores the information required to establish
38* the mapping between an object element and its corresponding process
39* and memory location.
40*
41* Let A be a generic term for any 2D block cyclicly distributed array.
42* Such a global array has an associated description vector DESCA.
43* In the following comments, the character _ should be read as
44* "of the global array".
45*
46* NOTATION STORED IN EXPLANATION
47* --------------- -------------- --------------------------------------
48* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
49* DTYPE_A = 1.
50* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
51* the BLACS process grid A is distribu-
52* ted over. The context itself is glo-
53* bal, but the handle (the integer
54* value) may vary.
55* M_A (global) DESCA( M_ ) The number of rows in the global
56* array A.
57* N_A (global) DESCA( N_ ) The number of columns in the global
58* array A.
59* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
60* the rows of the array.
61* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
62* the columns of the array.
63* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
64* row of the array A is distributed.
65* CSRC_A (global) DESCA( CSRC_ ) The process column over which the
66* first column of the array A is
67* distributed.
68* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
69* array. LLD_A >= MAX(1,LOCr(M_A)).
70*
71* Let K be the number of rows or columns of a distributed matrix,
72* and assume that its process grid has dimension p x q.
73* LOCr( K ) denotes the number of elements of K that a process
74* would receive if K were distributed over the p processes of its
75* process column.
76* Similarly, LOCc( K ) denotes the number of elements of K that a
77* process would receive if K were distributed over the q processes of
78* its process row.
79* The values of LOCr() and LOCc() may be determined via a call to the
80* ScaLAPACK tool function, NUMROC:
81* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
82* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
83* An upper bound for these quantities may be computed by:
84* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
85* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
86*
87* Arguments
88* =========
89*
90* M (global input) INTEGER
91* The number of rows to be operated on i.e the number of rows
92* of the distributed submatrix sub( A ). M >= 0.
93*
94* N (global input) INTEGER
95* The number of columns to be operated on i.e the number of
96* columns of the distributed submatrix sub( A ). N >= 0.
97*
98* A (local input) REAL pointer into the local memory
99* to an array of dimension ( LLD_A, LOCc(JA+N-1) ), the
100* local pieces of the M-by-N distributed matrix whose
101* equilibration factors are to be computed.
102*
103* IA (global input) INTEGER
104* The row index in the global array A indicating the first
105* row of sub( A ).
106*
107* JA (global input) INTEGER
108* The column index in the global array A indicating the
109* first column of sub( A ).
110*
111* DESCA (global and local input) INTEGER array of dimension DLEN_.
112* The array descriptor for the distributed matrix A.
113*
114* R (local output) REAL array, dimension LOCr(M_A)
115* If INFO = 0 or INFO > IA+M-1, R(IA:IA+M-1) contains the row
116* scale factors for sub( A ). R is aligned with the distributed
117* matrix A, and replicated across every process column. R is
118* tied to the distributed matrix A.
119*
120* C (local output) REAL array, dimension LOCc(N_A)
121* If INFO = 0, C(JA:JA+N-1) contains the column scale factors
122* for sub( A ). C is aligned with the distributed matrix A, and
123* replicated down every process row. C is tied to the distri-
124* buted matrix A.
125*
126* ROWCND (global output) REAL
127* If INFO = 0 or INFO > IA+M-1, ROWCND contains the ratio of
128* the smallest R(i) to the largest R(i) (IA <= i <= IA+M-1).
129* If ROWCND >= 0.1 and AMAX is neither too large nor too small,
130* it is not worth scaling by R(IA:IA+M-1).
131*
132* COLCND (global output) REAL
133* If INFO = 0, COLCND contains the ratio of the smallest C(j)
134* to the largest C(j) (JA <= j <= JA+N-1). If COLCND >= 0.1, it
135* is not worth scaling by C(JA:JA+N-1).
136*
137* AMAX (global output) REAL
138* Absolute value of largest distributed matrix element. If
139* AMAX is very close to overflow or very close to underflow,
140* the matrix should be scaled.
141*
142* INFO (global output) INTEGER
143* = 0: successful exit
144* < 0: If the i-th argument is an array and the j-entry had
145* an illegal value, then INFO = -(i*100+j), if the i-th
146* argument is a scalar and had an illegal value, then
147* INFO = -i.
148* > 0: If INFO = i, and i is
149* <= M: the i-th row of the distributed matrix sub( A )
150* is exactly zero,
151* > M: the (i-M)-th column of the distributed
152* matrix sub( A ) is exactly zero.
153*
154* =====================================================================
155*
156* .. Parameters ..
157 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
158 $ lld_, mb_, m_, nb_, n_, rsrc_
159 parameter( block_cyclic_2d = 1, dlen_ = 9, dtype_ = 1,
160 $ ctxt_ = 2, m_ = 3, n_ = 4, mb_ = 5, nb_ = 6,
161 $ rsrc_ = 7, csrc_ = 8, lld_ = 9 )
162 REAL ONE, ZERO
163 parameter( one = 1.0e+0, zero = 0.0e+0 )
164* ..
165* .. Local Scalars ..
166 CHARACTER COLCTOP, ROWCTOP
167 INTEGER I, IACOL, IAROW, ICOFF, ICTXT, IDUMM, IIA,
168 $ ioffa, iroff, j, jja, lda, mp, mycol, myrow,
169 $ npcol, nprow, nq
170 REAL BIGNUM, RCMAX, RCMIN, SMLNUM
171* ..
172* .. Local Arrays ..
173 INTEGER DESCC( DLEN_ ), DESCR( DLEN_ )
174* ..
175* .. External Subroutines ..
176 EXTERNAL blacs_gridinfo, chk1mat, descset, igamx2d,
177 $ infog2l, pchk1mat, pb_topget, pxerbla, sgamn2d,
178 $ sgamx2d
179* ..
180* .. External Functions ..
181 INTEGER INDXL2G, NUMROC
182 REAL PSLAMCH
183 EXTERNAL indxl2g, numroc, pslamch
184* ..
185* .. Intrinsic Functions ..
186 INTRINSIC abs, max, min, mod
187* ..
188* .. Executable Statements ..
189*
190* Get grid parameters
191*
192 ictxt = desca( ctxt_ )
193 CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )
194*
195* Test the input parameters.
196*
197 info = 0
198 IF( nprow.EQ.-1 ) THEN
199 info = -(600+ctxt_)
200 ELSE
201 CALL chk1mat( m, 1, n, 2, ia, ja, desca, 6, info )
202 CALL pchk1mat( m, 1, n, 2, ia, ja, desca, 6, 0, idumm, idumm,
203 $ info )
204 END IF
205*
206 IF( info.NE.0 ) THEN
207 CALL pxerbla( ictxt, 'PSGEEQU', -info )
208 RETURN
209 END IF
210*
211* Quick return if possible
212*
213 IF( m.EQ.0 .OR. n.EQ.0 ) THEN
214 rowcnd = one
215 colcnd = one
216 amax = zero
217 RETURN
218 END IF
219*
220 CALL pb_topget( ictxt, 'Combine', 'Rowwise', rowctop )
221 CALL pb_topget( ictxt, 'Combine', 'Columnwise', colctop )
222*
223* Get machine constants and local indexes.
224*
225 smlnum = pslamch( ictxt, 'S' )
226 bignum = one / smlnum
227 CALL infog2l( ia, ja, desca, nprow, npcol, myrow, mycol, iia, jja,
228 $ iarow, iacol )
229 iroff = mod( ia-1, desca( mb_ ) )
230 icoff = mod( ja-1, desca( nb_ ) )
231 mp = numroc( m+iroff, desca( mb_ ), myrow, iarow, nprow )
232 nq = numroc( n+icoff, desca( nb_ ), mycol, iacol, npcol )
233 IF( myrow.EQ.iarow )
234 $ mp = mp - iroff
235 IF( mycol.EQ.iacol )
236 $ nq = nq - icoff
237 lda = desca( lld_ )
238*
239* Assign descriptors for R and C arrays
240*
241 CALL descset( descr, m, 1, desca( mb_ ), 1, 0, 0, ictxt,
242 $ max( 1, mp ) )
243 CALL descset( descc, 1, n, 1, desca( nb_ ), 0, 0, ictxt, 1 )
244*
245* Compute row scale factors.
246*
247 DO 10 i = iia, iia+mp-1
248 r( i ) = zero
249 10 CONTINUE
250*
251* Find the maximum element in each row.
252*
253 ioffa = (jja-1)*lda
254 DO 30 j = jja, jja+nq-1
255 DO 20 i = iia, iia+mp-1
256 r( i ) = max( r( i ), abs( a( ioffa + i ) ) )
257 20 CONTINUE
258 ioffa = ioffa + lda
259 30 CONTINUE
260 CALL sgamx2d( ictxt, 'Rowwise', rowctop, mp, 1, r( iia ),
261 $ max( 1, mp ), idumm, idumm, -1, -1, mycol )
262*
263* Find the maximum and minimum scale factors.
264*
265 rcmin = bignum
266 rcmax = zero
267 DO 40 i = iia, iia+mp-1
268 rcmax = max( rcmax, r( i ) )
269 rcmin = min( rcmin, r( i ) )
270 40 CONTINUE
271 CALL sgamx2d( ictxt, 'Columnwise', colctop, 1, 1, rcmax, 1, idumm,
272 $ idumm, -1, -1, mycol )
273 CALL sgamn2d( ictxt, 'Columnwise', colctop, 1, 1, rcmin, 1, idumm,
274 $ idumm, -1, -1, mycol )
275 amax = rcmax
276*
277 IF( rcmin.EQ.zero ) THEN
278*
279* Find the first zero scale factor and return an error code.
280*
281 DO 50 i = iia, iia+mp-1
282 IF( r( i ).EQ.zero .AND. info.EQ.0 )
283 $ info = indxl2g( i, desca( mb_ ), myrow, desca( rsrc_ ),
284 $ nprow ) - ia + 1
285 50 CONTINUE
286 CALL igamx2d( ictxt, 'Columnwise', colctop, 1, 1, info, 1,
287 $ idumm, idumm, -1, -1, mycol )
288 IF( info.NE.0 )
289 $ RETURN
290 ELSE
291*
292* Invert the scale factors.
293*
294 DO 60 i = iia, iia+mp-1
295 r( i ) = one / min( max( r( i ), smlnum ), bignum )
296 60 CONTINUE
297*
298* Compute ROWCND = min(R(I)) / max(R(I))
299*
300 rowcnd = max( rcmin, smlnum ) / min( rcmax, bignum )
301*
302 END IF
303*
304* Compute column scale factors
305*
306 DO 70 j = jja, jja+nq-1
307 c( j ) = zero
308 70 CONTINUE
309*
310* Find the maximum element in each column,
311* assuming the row scaling computed above.
312*
313 ioffa = (jja-1)*lda
314 DO 90 j = jja, jja+nq-1
315 DO 80 i = iia, iia+mp-1
316 c( j ) = max( c( j ), abs( a( ioffa + i ) )*r( i ) )
317 80 CONTINUE
318 ioffa = ioffa + lda
319 90 CONTINUE
320 CALL sgamx2d( ictxt, 'Columnwise', colctop, 1, nq, c( jja ),
321 $ 1, idumm, idumm, -1, -1, mycol )
322*
323* Find the maximum and minimum scale factors.
324*
325 rcmin = bignum
326 rcmax = zero
327 DO 100 j = jja, jja+nq-1
328 rcmin = min( rcmin, c( j ) )
329 rcmax = max( rcmax, c( j ) )
330 100 CONTINUE
331 CALL sgamx2d( ictxt, 'Columnwise', colctop, 1, 1, rcmax, 1, idumm,
332 $ idumm, -1, -1, mycol )
333 CALL sgamn2d( ictxt, 'Columnwise', colctop, 1, 1, rcmin, 1, idumm,
334 $ idumm, -1, -1, mycol )
335*
336 IF( rcmin.EQ.zero ) THEN
337*
338* Find the first zero scale factor and return an error code.
339*
340 DO 110 j = jja, jja+nq-1
341 IF( c( j ).EQ.zero .AND. info.EQ.0 )
342 $ info = m + indxl2g( j, desca( nb_ ), mycol,
343 $ desca( csrc_ ), npcol ) - ja + 1
344 110 CONTINUE
345 CALL igamx2d( ictxt, 'Columnwise', colctop, 1, 1, info, 1,
346 $ idumm, idumm, -1, -1, mycol )
347 IF( info.NE.0 )
348 $ RETURN
349 ELSE
350*
351* Invert the scale factors.
352*
353 DO 120 j = jja, jja+nq-1
354 c( j ) = one / min( max( c( j ), smlnum ), bignum )
355 120 CONTINUE
356*
357* Compute COLCND = min(C(J)) / max(C(J))
358*
359 colcnd = max( rcmin, smlnum ) / min( rcmax, bignum )
360*
361 END IF
362*
363 RETURN
364*
365* End of PSGEEQU
366*
367 END
subroutine chk1mat(ma, mapos0, na, napos0, ia, ja, desca, descapos0, info)
Definition chk1mat.f:3
subroutine descset(desc, m, n, mb, nb, irsrc, icsrc, ictxt, lld)
Definition descset.f:3
subroutine infog2l(grindx, gcindx, desc, nprow, npcol, myrow, mycol, lrindx, lcindx, rsrc, csrc)
Definition infog2l.f:3
#define max(A, B)
Definition pcgemr.c:180
#define min(A, B)
Definition pcgemr.c:181
subroutine pchk1mat(ma, mapos0, na, napos0, ia, ja, desca, descapos0, nextra, ex, expos, info)
Definition pchkxmat.f:3
subroutine psgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
Definition psgeequ.f:3
subroutine pxerbla(ictxt, srname, info)
Definition pxerbla.f:2