SCALAPACK 2.2.2
LAPACK: Linear Algebra PACKage
All Classes Files Functions Variables Typedefs Macros
pzmax1.f
Go to the documentation of this file.
1 SUBROUTINE pzmax1( N, AMAX, INDX, X, IX, JX, DESCX, INCX )
2*
3* -- ScaLAPACK auxiliary routine (version 1.7) --
4* University of Tennessee, Knoxville, Oak Ridge National Laboratory,
5* and University of California, Berkeley.
6* May 1, 1997
7*
8* .. Scalar Arguments ..
9 INTEGER INDX, INCX, IX, JX, N
10 COMPLEX*16 AMAX
11* ..
12* .. Array Arguments ..
13 INTEGER DESCX( * )
14 COMPLEX*16 X( * )
15* ..
16*
17* Purpose
18* =======
19*
20* PZMAX1 computes the global index of the maximum element in absolute
21* value of a distributed vector sub( X ). The global index is returned
22* in INDX and the value is returned in AMAX,
23*
24* where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1,
25* X(IX,JX:JX+N-1) if INCX = M_X.
26*
27* Notes
28* =====
29*
30* Each global data object is described by an associated description
31* vector. This vector stores the information required to establish
32* the mapping between an object element and its corresponding process
33* and memory location.
34*
35* Let A be a generic term for any 2D block cyclicly distributed array.
36* Such a global array has an associated description vector DESCA.
37* In the following comments, the character _ should be read as
38* "of the global array".
39*
40* NOTATION STORED IN EXPLANATION
41* --------------- -------------- --------------------------------------
42* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
43* DTYPE_A = 1.
44* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
45* the BLACS process grid A is distribu-
46* ted over. The context itself is glo-
47* bal, but the handle (the integer
48* value) may vary.
49* M_A (global) DESCA( M_ ) The number of rows in the global
50* array A.
51* N_A (global) DESCA( N_ ) The number of columns in the global
52* array A.
53* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
54* the rows of the array.
55* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
56* the columns of the array.
57* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
58* row of the array A is distributed.
59* CSRC_A (global) DESCA( CSRC_ ) The process column over which the
60* first column of the array A is
61* distributed.
62* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
63* array. LLD_A >= MAX(1,LOCr(M_A)).
64*
65* Let K be the number of rows or columns of a distributed matrix,
66* and assume that its process grid has dimension p x q.
67* LOCr( K ) denotes the number of elements of K that a process
68* would receive if K were distributed over the p processes of its
69* process column.
70* Similarly, LOCc( K ) denotes the number of elements of K that a
71* process would receive if K were distributed over the q processes of
72* its process row.
73* The values of LOCr() and LOCc() may be determined via a call to the
74* ScaLAPACK tool function, NUMROC:
75* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
76* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
77* An upper bound for these quantities may be computed by:
78* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
79* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
80*
81* Because vectors may be viewed as a subclass of matrices, a
82* distributed vector is considered to be a distributed matrix.
83*
84* When the result of a vector-oriented PBLAS call is a scalar, it will
85* be made available only within the scope which owns the vector(s)
86* being operated on. Let X be a generic term for the input vector(s).
87* Then, the processes which receive the answer will be (note that if
88* an operation involves more than one vector, the processes which re-
89* ceive the result will be the union of the following calculation for
90* each vector):
91*
92* If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process
93* row or process column owns the vector operand, therefore only the
94* process of coordinate {RSRC_X, CSRC_X} receives the result;
95*
96* If INCX = M_X, then sub( X ) is a vector distributed over a process
97* row. Each process part of this row receives the result;
98*
99* If INCX = 1, then sub( X ) is a vector distributed over a process
100* column. Each process part of this column receives the result;
101*
102* Based on PZAMAX from Level 1 PBLAS. The change is to use the
103* 'genuine' absolute value.
104*
105* The serial version was contributed to LAPACK by Nick Higham for use
106* with ZLACON.
107*
108* Arguments
109* =========
110*
111* N (global input) pointer to INTEGER
112* The number of components of the distributed vector sub( X ).
113* N >= 0.
114*
115* AMAX (global output) pointer to DOUBLE PRECISION
116* The absolute value of the largest entry of the distributed
117* vector sub( X ) only in the scope of sub( X ).
118*
119* INDX (global output) pointer to INTEGER
120* The global index of the element of the distributed vector
121* sub( X ) whose real part has maximum absolute value.
122*
123* X (local input) COMPLEX*16 array containing the local
124* pieces of a distributed matrix of dimension of at least
125* ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) )
126* This array contains the entries of the distributed vector
127* sub( X ).
128*
129* IX (global input) INTEGER
130* The row index in the global array X indicating the first
131* row of sub( X ).
132*
133* JX (global input) INTEGER
134* The column index in the global array X indicating the
135* first column of sub( X ).
136*
137* DESCX (global and local input) INTEGER array of dimension DLEN_.
138* The array descriptor for the distributed matrix X.
139*
140* INCX (global input) INTEGER
141* The global increment for the elements of X. Only two values
142* of INCX are supported in this version, namely 1 and M_X.
143* INCX must not be zero.
144*
145* =====================================================================
146*
147* .. Parameters ..
148 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
149 $ LLD_, MB_, M_, NB_, N_, RSRC_
150 parameter( block_cyclic_2d = 1, dlen_ = 9, dtype_ = 1,
151 $ ctxt_ = 2, m_ = 3, n_ = 4, mb_ = 5, nb_ = 6,
152 $ rsrc_ = 7, csrc_ = 8, lld_ = 9 )
153 COMPLEX*16 ZERO
154 parameter( zero = ( 0.0d+0, 0.0d+0 ) )
155* ..
156* .. Local Scalars ..
157 CHARACTER CBTOP, CCTOP, RBTOP, RCTOP
158 INTEGER ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW,
159 $ JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP,
160 $ NPCOL, NPROW, NQ
161* ..
162* .. Local Arrays ..
163 COMPLEX*16 WORK( 2 )
164* ..
165* .. External Subroutines ..
166 EXTERNAL blacs_gridinfo, igebr2d, igebs2d, infog2l,
167 $ pb_topget, pztreecomb, zcombamax1, zgamx2d
168* ..
169* .. External Functions ..
170 LOGICAL LSAME
171 INTEGER IZMAX1, INDXL2G, NUMROC
172 EXTERNAL izmax1, indxl2g, numroc
173* ..
174* .. Intrinsic Functions ..
175 INTRINSIC abs, dble, dcmplx, mod, nint
176* ..
177* .. Executable Statements ..
178*
179* Get grid parameters
180*
181 ictxt = descx( ctxt_ )
182 CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )
183*
184* Quick return if possible.
185*
186 indx = 0
187 amax = zero
188 IF( n.LE.0 )
189 $ RETURN
190*
191* Retrieve local information for vector X.
192*
193 ldx = descx( lld_ )
194 CALL infog2l( ix, jx, descx, nprow, npcol, myrow, mycol, iix, jjx,
195 $ ixrow, ixcol )
196*
197 IF( incx.EQ.1 .AND. descx( m_ ).EQ.1 .AND. n.EQ.1 ) THEN
198 indx = jx
199 amax = x( iix+(jjx-1)*ldx )
200 RETURN
201 END IF
202*
203* Find the maximum value and its index
204*
205 IF( incx.EQ.descx( m_ ) ) THEN
206*
207 IF( myrow.EQ.ixrow ) THEN
208*
209 icoff = mod( jx-1, descx( nb_ ) )
210 nq = numroc( n+icoff, descx( nb_ ), mycol, ixcol, npcol )
211 IF( mycol.EQ.ixcol )
212 $ nq = nq-icoff
213*
214 CALL pb_topget( ictxt, 'Broadcast', 'Rowwise', rbtop )
215*
216 IF( lsame( rbtop, ' ' ) ) THEN
217*
218 IF( nq.GT.0 ) THEN
219 lcindx = jjx-1+izmax1( nq, x( iix+(jjx-1)*ldx ), ldx )
220 work( 1 ) = x( iix+(lcindx-1)*ldx )
221 work( 2 ) = dcmplx( dble( indxl2g( lcindx,
222 $ descx( nb_ ), mycol, descx( csrc_ ), npcol ) ) )
223 ELSE
224 work( 1 ) = zero
225 work( 2 ) = zero
226 END IF
227*
228 CALL pztreecomb( ictxt, 'Row', 2, work, -1, mycol,
229 $ zcombamax1 )
230*
231 amax = work( 1 )
232 IF( amax.EQ.zero ) THEN
233 indx = jx
234 ELSE
235 indx = nint( dble( work( 2 ) ) )
236 END IF
237*
238 ELSE
239*
240 CALL pb_topget( ictxt, 'Combine', 'Rowwise', rctop )
241*
242 IF( nq.GT.0 ) THEN
243 lcindx = jjx-1+izmax1( nq, x( iix+(jjx-1)*ldx ), ldx )
244 amax = x( iix + (lcindx-1)*ldx )
245 ELSE
246 amax = zero
247 END IF
248*
249* Find the maximum value
250*
251 CALL zgamx2d( ictxt, 'Rowwise', rctop, 1, 1, amax, 1,
252 $ idumm, maxpos, 1, -1, myrow )
253*
254 IF( amax.NE.zero ) THEN
255*
256* Broadcast corresponding global index
257*
258 IF( mycol.EQ.maxpos ) THEN
259 indx = indxl2g( lcindx, descx( nb_ ), mycol,
260 $ descx( csrc_ ), npcol )
261 CALL igebs2d( ictxt, 'Rowwise', rbtop, 1, 1, indx,
262 $ 1 )
263 ELSE
264 CALL igebr2d( ictxt, 'Rowwise', rbtop, 1, 1, indx,
265 $ 1, myrow, maxpos )
266 END IF
267*
268 ELSE
269*
270 indx = jx
271*
272 END IF
273*
274 END IF
275*
276 END IF
277*
278 ELSE
279*
280 IF( mycol.EQ.ixcol ) THEN
281*
282 iroff = mod( ix-1, descx( mb_ ) )
283 np = numroc( n+iroff, descx( mb_ ), myrow, ixrow, nprow )
284 IF( myrow.EQ.ixrow )
285 $ np = np-iroff
286*
287 CALL pb_topget( ictxt, 'Broadcast', 'Columnwise', cbtop )
288*
289 IF( lsame( cbtop, ' ' ) ) THEN
290*
291 IF( np.GT.0 ) THEN
292 lcindx = iix-1+izmax1( np, x( iix+(jjx-1)*ldx ), 1 )
293 work( 1 ) = x( lcindx + (jjx-1)*ldx )
294 work( 2 ) = dcmplx( dble( indxl2g( lcindx,
295 $ descx( mb_ ), myrow, descx( rsrc_ ), nprow ) ) )
296 ELSE
297 work( 1 ) = zero
298 work( 2 ) = zero
299 END IF
300*
301 CALL pztreecomb( ictxt, 'Column', 2, work, -1, mycol,
302 $ zcombamax1 )
303*
304 amax = work( 1 )
305 IF( amax.EQ.zero ) THEN
306 indx = ix
307 ELSE
308 indx = nint( dble( work( 2 ) ) )
309 END IF
310*
311 ELSE
312*
313 CALL pb_topget( ictxt, 'Combine', 'Columnwise', cctop )
314*
315 IF( np.GT.0 ) THEN
316 lcindx = iix-1+izmax1( np, x( iix+(jjx-1)*ldx ), 1 )
317 amax = x( lcindx + (jjx-1)*ldx )
318 ELSE
319 amax = zero
320 END IF
321*
322* Find the maximum value
323*
324 CALL zgamx2d( ictxt, 'Columnwise', cctop, 1, 1, amax, 1,
325 $ maxpos, idumm, 1, -1, mycol )
326*
327 IF( amax.NE.zero ) THEN
328*
329* Broadcast corresponding global index
330*
331 IF( myrow.EQ.maxpos ) THEN
332 indx = indxl2g( lcindx, descx( mb_ ), myrow,
333 $ descx( rsrc_ ), nprow )
334 CALL igebs2d( ictxt, 'Columnwise', cbtop, 1, 1,
335 $ indx, 1 )
336 ELSE
337 CALL igebr2d( ictxt, 'Columnwise', cbtop, 1, 1,
338 $ indx, 1, maxpos, mycol )
339 END IF
340*
341 ELSE
342*
343 indx = ix
344*
345 END IF
346*
347 END IF
348*
349 END IF
350*
351 END IF
352*
353 RETURN
354*
355* End of PZMAX1
356*
357 END
358*
359 SUBROUTINE zcombamax1 ( V1, V2 )
360*
361* -- ScaLAPACK auxiliary routine (version 1.7) --
362* University of Tennessee, Knoxville, Oak Ridge National Laboratory,
363* and University of California, Berkeley.
364* May 1, 1997
365*
366* .. Array Arguments ..
367 COMPLEX*16 V1( 2 ), V2( 2 )
368* ..
369*
370* Purpose
371* =======
372*
373* ZCOMBAMAX1 finds the element having maximum real part absolute
374* value as well as its corresponding globl index.
375*
376* Arguments
377* =========
378*
379* V1 (local input/local output) COMPLEX*16 array of
380* dimension 2. The first maximum absolute value element and
381* its global index. V1(1) = AMAX, V1(2) = INDX.
382*
383* V2 (local input) COMPLEX*16 array of dimension 2.
384* The second maximum absolute value element and its global
385* index. V2(1) = AMAX, V2(2) = INDX.
386*
387* =====================================================================
388*
389* .. Intrinsic Functions ..
390 INTRINSIC abs, dble
391* ..
392* .. Executable Statements ..
393*
394 IF( abs( dble( v1( 1 ) ) ).LT.abs( dble( v2( 1 ) ) ) ) THEN
395 v1( 1 ) = v2( 1 )
396 v1( 2 ) = v2( 2 )
397 END IF
398*
399 RETURN
400*
401* End of ZCOMBAMAX1
402*
403 END
subroutine infog2l(grindx, gcindx, desc, nprow, npcol, myrow, mycol, lrindx, lcindx, rsrc, csrc)
Definition infog2l.f:3
subroutine zcombamax1(v1, v2)
Definition pzmax1.f:360
subroutine pzmax1(n, amax, indx, x, ix, jx, descx, incx)
Definition pzmax1.f:2
subroutine pztreecomb(ictxt, scope, n, mine, rdest0, cdest0, subptr)
Definition pztreecomb.f:3