SCALAPACK 2.2.2
LAPACK: Linear Algebra PACKage
Loading...
Searching...
No Matches
pcmax1.f
Go to the documentation of this file.
1 SUBROUTINE pcmax1( N, AMAX, INDX, X, IX, JX, DESCX, INCX )
2*
3* -- ScaLAPACK auxiliary routine (version 1.7) --
4* University of Tennessee, Knoxville, Oak Ridge National Laboratory,
5* and University of California, Berkeley.
6* May 25, 2001
7*
8* .. Scalar Arguments ..
9 INTEGER INDX, INCX, IX, JX, N
10 COMPLEX AMAX
11* ..
12* .. Array Arguments ..
13 INTEGER DESCX( * )
14 COMPLEX X( * )
15* ..
16*
17* Purpose
18* =======
19*
20* PCMAX1 computes the global index of the maximum element in absolute
21* value of a distributed vector sub( X ). The global index is returned
22* in INDX and the value is returned in AMAX,
23*
24* where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1,
25* X(IX,JX:JX+N-1) if INCX = M_X.
26*
27* Notes
28* =====
29*
30* Each global data object is described by an associated description
31* vector. This vector stores the information required to establish
32* the mapping between an object element and its corresponding process
33* and memory location.
34*
35* Let A be a generic term for any 2D block cyclicly distributed array.
36* Such a global array has an associated description vector DESCA.
37* In the following comments, the character _ should be read as
38* "of the global array".
39*
40* NOTATION STORED IN EXPLANATION
41* --------------- -------------- --------------------------------------
42* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
43* DTYPE_A = 1.
44* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
45* the BLACS process grid A is distribu-
46* ted over. The context itself is glo-
47* bal, but the handle (the integer
48* value) may vary.
49* M_A (global) DESCA( M_ ) The number of rows in the global
50* array A.
51* N_A (global) DESCA( N_ ) The number of columns in the global
52* array A.
53* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
54* the rows of the array.
55* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
56* the columns of the array.
57* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
58* row of the array A is distributed.
59* CSRC_A (global) DESCA( CSRC_ ) The process column over which the
60* first column of the array A is
61* distributed.
62* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
63* array. LLD_A >= MAX(1,LOCr(M_A)).
64*
65* Let K be the number of rows or columns of a distributed matrix,
66* and assume that its process grid has dimension p x q.
67* LOCr( K ) denotes the number of elements of K that a process
68* would receive if K were distributed over the p processes of its
69* process column.
70* Similarly, LOCc( K ) denotes the number of elements of K that a
71* process would receive if K were distributed over the q processes of
72* its process row.
73* The values of LOCr() and LOCc() may be determined via a call to the
74* ScaLAPACK tool function, NUMROC:
75* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
76* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
77* An upper bound for these quantities may be computed by:
78* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
79* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
80*
81* Because vectors may be viewed as a subclass of matrices, a
82* distributed vector is considered to be a distributed matrix.
83*
84* When the result of a vector-oriented PBLAS call is a scalar, it will
85* be made available only within the scope which owns the vector(s)
86* being operated on. Let X be a generic term for the input vector(s).
87* Then, the processes which receive the answer will be (note that if
88* an operation involves more than one vector, the processes which re-
89* ceive the result will be the union of the following calculation for
90* each vector):
91*
92* If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process
93* row or process column owns the vector operand, therefore only the
94* process of coordinate {RSRC_X, CSRC_X} receives the result;
95*
96* If INCX = M_X, then sub( X ) is a vector distributed over a process
97* row. Each process part of this row receives the result;
98*
99* If INCX = 1, then sub( X ) is a vector distributed over a process
100* column. Each process part of this column receives the result;
101*
102* Based on PCAMAX from Level 1 PBLAS. The change is to use the
103* 'genuine' absolute value.
104*
105* The serial version was contributed to LAPACK by Nick Higham for use
106* with CLACON.
107*
108* Arguments
109* =========
110*
111* N (global input) pointer to INTEGER
112* The number of components of the distributed vector sub( X ).
113* N >= 0.
114*
115* AMAX (global output) pointer to REAL
116* The absolute value of the largest entry of the distributed
117* vector sub( X ) only in the scope of sub( X ).
118*
119* INDX (global output) pointer to INTEGER
120* The global index of the element of the distributed vector
121* sub( X ) whose real part has maximum absolute value.
122*
123* X (local input) COMPLEX array containing the local
124* pieces of a distributed matrix of dimension of at least
125* ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) )
126* This array contains the entries of the distributed vector
127* sub( X ).
128*
129* IX (global input) INTEGER
130* The row index in the global array X indicating the first
131* row of sub( X ).
132*
133* JX (global input) INTEGER
134* The column index in the global array X indicating the
135* first column of sub( X ).
136*
137* DESCX (global and local input) INTEGER array of dimension DLEN_.
138* The array descriptor for the distributed matrix X.
139*
140* INCX (global input) INTEGER
141* The global increment for the elements of X. Only two values
142* of INCX are supported in this version, namely 1 and M_X.
143* INCX must not be zero.
144*
145* =====================================================================
146*
147* .. Parameters ..
148 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
149 $ LLD_, MB_, M_, NB_, N_, RSRC_
150 parameter( block_cyclic_2d = 1, dlen_ = 9, dtype_ = 1,
151 $ ctxt_ = 2, m_ = 3, n_ = 4, mb_ = 5, nb_ = 6,
152 $ rsrc_ = 7, csrc_ = 8, lld_ = 9 )
153 COMPLEX ZERO
154 parameter( zero = ( 0.0e+0, 0.0e+0 ) )
155* ..
156* .. Local Scalars ..
157 CHARACTER CBTOP, CCTOP, RBTOP, RCTOP
158 INTEGER ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW,
159 $ JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP,
160 $ NPCOL, NPROW, NQ
161* ..
162* .. Local Arrays ..
163 COMPLEX WORK( 2 )
164* ..
165* .. External Subroutines ..
166 EXTERNAL blacs_gridinfo, ccombamax1, cgamx2d,
167 $ igebr2d, igebs2d, infog2l, pctreecomb,
168 $ pb_topget
169* ..
170* .. External Functions ..
171 LOGICAL LSAME
172 INTEGER ICMAX1, INDXL2G, NUMROC
173 EXTERNAL icmax1, indxl2g, numroc
174* ..
175* .. Intrinsic Functions ..
176 INTRINSIC abs, cmplx, mod, nint, real
177* ..
178* .. Executable Statements ..
179*
180* Get grid parameters
181*
182 ictxt = descx( ctxt_ )
183 CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )
184*
185* Quick return if possible.
186*
187 indx = 0
188 amax = zero
189 IF( n.LE.0 )
190 $ RETURN
191*
192* Retrieve local information for vector X.
193*
194 ldx = descx( lld_ )
195 CALL infog2l( ix, jx, descx, nprow, npcol, myrow, mycol, iix, jjx,
196 $ ixrow, ixcol )
197*
198 IF( incx.EQ.1 .AND. descx( m_ ).EQ.1 .AND. n.EQ.1 ) THEN
199 indx = jx
200 amax = x( iix+(jjx-1)*ldx )
201 RETURN
202 END IF
203*
204* Find the maximum value and its index
205*
206 IF( incx.EQ.descx( m_ ) ) THEN
207*
208 IF( myrow.EQ.ixrow ) THEN
209*
210 icoff = mod( jx-1, descx( nb_ ) )
211 nq = numroc( n+icoff, descx( nb_ ), mycol, ixcol, npcol )
212 IF( mycol.EQ.ixcol )
213 $ nq = nq-icoff
214*
215 CALL pb_topget( ictxt, 'Broadcast', 'Rowwise', rbtop )
216*
217 IF( lsame( rbtop, ' ' ) ) THEN
218*
219 IF( nq.GT.0 ) THEN
220 lcindx = jjx-1+icmax1( nq, x( iix+(jjx-1)*ldx ), ldx )
221 work( 1 ) = x( iix+(lcindx-1)*ldx )
222 work( 2 ) = cmplx( real( indxl2g( lcindx,
223 $ descx( nb_ ), mycol, descx( csrc_ ), npcol ) ) )
224 ELSE
225 work( 1 ) = zero
226 work( 2 ) = zero
227 END IF
228*
229 CALL pctreecomb( ictxt, 'Row', 2, work, -1, mycol,
230 $ ccombamax1 )
231*
232 amax = work( 1 )
233 IF( amax.EQ.zero ) THEN
234 indx = jx
235 ELSE
236 indx = nint( real( work( 2 ) ) )
237 END IF
238*
239 ELSE
240*
241 CALL pb_topget( ictxt, 'Combine', 'Rowwise', rctop )
242*
243 IF( nq.GT.0 ) THEN
244 lcindx = jjx-1+icmax1( nq, x( iix+(jjx-1)*ldx ), ldx )
245 amax = x( iix + (lcindx-1)*ldx )
246 ELSE
247 amax = zero
248 END IF
249*
250* Find the maximum value
251*
252 CALL cgamx2d( ictxt, 'Rowwise', rctop, 1, 1, amax, 1,
253 $ idumm, maxpos, 1, -1, myrow )
254*
255 IF( amax.NE.zero ) THEN
256*
257* Broadcast corresponding global index
258*
259 IF( mycol.EQ.maxpos ) THEN
260 indx = indxl2g( lcindx, descx( nb_ ), mycol,
261 $ descx( csrc_ ), npcol )
262 CALL igebs2d( ictxt, 'Rowwise', rbtop, 1, 1, indx,
263 $ 1 )
264 ELSE
265 CALL igebr2d( ictxt, 'Rowwise', rbtop, 1, 1, indx,
266 $ 1, myrow, maxpos )
267 END IF
268*
269 ELSE
270*
271 indx = jx
272*
273 END IF
274*
275 END IF
276*
277 END IF
278*
279 ELSE
280*
281 IF( mycol.EQ.ixcol ) THEN
282*
283 iroff = mod( ix-1, descx( mb_ ) )
284 np = numroc( n+iroff, descx( mb_ ), myrow, ixrow, nprow )
285 IF( myrow.EQ.ixrow )
286 $ np = np-iroff
287*
288 CALL pb_topget( ictxt, 'Broadcast', 'Columnwise', cbtop )
289*
290 IF( lsame( cbtop, ' ' ) ) THEN
291*
292 IF( np.GT.0 ) THEN
293 lcindx = iix-1+icmax1( np, x( iix+(jjx-1)*ldx ), 1 )
294 work( 1 ) = x( lcindx + (jjx-1)*ldx )
295 work( 2 ) = cmplx( real( indxl2g( lcindx,
296 $ descx( mb_ ), myrow, descx( rsrc_ ), nprow ) ) )
297 ELSE
298 work( 1 ) = zero
299 work( 2 ) = zero
300 END IF
301*
302 CALL pctreecomb( ictxt, 'Column', 2, work, -1, mycol,
303 $ ccombamax1 )
304*
305 amax = work( 1 )
306 IF( amax.EQ.zero ) THEN
307 indx = ix
308 ELSE
309 indx = nint( real( work( 2 ) ) )
310 END IF
311*
312 ELSE
313*
314 CALL pb_topget( ictxt, 'Combine', 'Columnwise', cctop )
315*
316 IF( np.GT.0 ) THEN
317 lcindx = iix-1+icmax1( np, x( iix+(jjx-1)*ldx ), 1 )
318 amax = x( lcindx + (jjx-1)*ldx )
319 ELSE
320 amax = zero
321 END IF
322*
323* Find the maximum value
324*
325 CALL cgamx2d( ictxt, 'Columnwise', cctop, 1, 1, amax, 1,
326 $ maxpos, idumm, 1, -1, mycol )
327*
328 IF( amax.NE.zero ) THEN
329*
330* Broadcast corresponding global index
331*
332 IF( myrow.EQ.maxpos ) THEN
333 indx = indxl2g( lcindx, descx( mb_ ), myrow,
334 $ descx( rsrc_ ), nprow )
335 CALL igebs2d( ictxt, 'Columnwise', cbtop, 1, 1,
336 $ indx, 1 )
337 ELSE
338 CALL igebr2d( ictxt, 'Columnwise', cbtop, 1, 1,
339 $ indx, 1, maxpos, mycol )
340 END IF
341*
342 ELSE
343*
344 indx = ix
345*
346 END IF
347*
348 END IF
349*
350 END IF
351*
352 END IF
353*
354 RETURN
355*
356* End of PCMAX1
357*
358 END
359*
360 SUBROUTINE ccombamax1 ( V1, V2 )
361*
362* -- ScaLAPACK auxiliary routine (version 1.7) --
363* University of Tennessee, Knoxville, Oak Ridge National Laboratory,
364* and University of California, Berkeley.
365* May 1, 1997
366*
367* .. Array Arguments ..
368 COMPLEX V1( 2 ), V2( 2 )
369* ..
370*
371* Purpose
372* =======
373*
374* CCOMBAMAX1 finds the element having maximum real part absolute
375* value as well as its corresponding globl index.
376*
377* Arguments
378* =========
379*
380* V1 (local input/local output) COMPLEX array of
381* dimension 2. The first maximum absolute value element and
382* its global index. V1(1) = AMAX, V1(2) = INDX.
383*
384* V2 (local input) COMPLEX array of dimension 2.
385* The second maximum absolute value element and its global
386* index. V2(1) = AMAX, V2(2) = INDX.
387*
388* =====================================================================
389*
390* .. Intrinsic Functions ..
391 INTRINSIC abs, real
392* ..
393* .. Executable Statements ..
394*
395 IF( abs( real( v1( 1 ) ) ).LT.abs( real( v2( 1 ) ) ) ) THEN
396 v1( 1 ) = v2( 1 )
397 v1( 2 ) = v2( 2 )
398 END IF
399*
400 RETURN
401*
402* End of CCOMBAMAX1
403*
404 END
float cmplx[2]
Definition pblas.h:136
subroutine infog2l(grindx, gcindx, desc, nprow, npcol, myrow, mycol, lrindx, lcindx, rsrc, csrc)
Definition infog2l.f:3
subroutine ccombamax1(v1, v2)
Definition pcmax1.f:361
subroutine pcmax1(n, amax, indx, x, ix, jx, descx, incx)
Definition pcmax1.f:2
subroutine pctreecomb(ictxt, scope, n, mine, rdest0, cdest0, subptr)
Definition pctreecomb.f:3