ScaLAPACK 2.1  2.1
ScaLAPACK: Scalable Linear Algebra PACKage
pcmax1.f
Go to the documentation of this file.
1  SUBROUTINE pcmax1( N, AMAX, INDX, X, IX, JX, DESCX, INCX )
2 *
3 * -- ScaLAPACK auxiliary routine (version 1.7) --
4 * University of Tennessee, Knoxville, Oak Ridge National Laboratory,
5 * and University of California, Berkeley.
6 * May 25, 2001
7 *
8 * .. Scalar Arguments ..
9  INTEGER INDX, INCX, IX, JX, N
10  COMPLEX AMAX
11 * ..
12 * .. Array Arguments ..
13  INTEGER DESCX( * )
14  COMPLEX X( * )
15 * ..
16 *
17 * Purpose
18 * =======
19 *
20 * PCMAX1 computes the global index of the maximum element in absolute
21 * value of a distributed vector sub( X ). The global index is returned
22 * in INDX and the value is returned in AMAX,
23 *
24 * where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1,
25 * X(IX,JX:JX+N-1) if INCX = M_X.
26 *
27 * Notes
28 * =====
29 *
30 * Each global data object is described by an associated description
31 * vector. This vector stores the information required to establish
32 * the mapping between an object element and its corresponding process
33 * and memory location.
34 *
35 * Let A be a generic term for any 2D block cyclicly distributed array.
36 * Such a global array has an associated description vector DESCA.
37 * In the following comments, the character _ should be read as
38 * "of the global array".
39 *
40 * NOTATION STORED IN EXPLANATION
41 * --------------- -------------- --------------------------------------
42 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
43 * DTYPE_A = 1.
44 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
45 * the BLACS process grid A is distribu-
46 * ted over. The context itself is glo-
47 * bal, but the handle (the integer
48 * value) may vary.
49 * M_A (global) DESCA( M_ ) The number of rows in the global
50 * array A.
51 * N_A (global) DESCA( N_ ) The number of columns in the global
52 * array A.
53 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
54 * the rows of the array.
55 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
56 * the columns of the array.
57 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
58 * row of the array A is distributed.
59 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the
60 * first column of the array A is
61 * distributed.
62 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
63 * array. LLD_A >= MAX(1,LOCr(M_A)).
64 *
65 * Let K be the number of rows or columns of a distributed matrix,
66 * and assume that its process grid has dimension p x q.
67 * LOCr( K ) denotes the number of elements of K that a process
68 * would receive if K were distributed over the p processes of its
69 * process column.
70 * Similarly, LOCc( K ) denotes the number of elements of K that a
71 * process would receive if K were distributed over the q processes of
72 * its process row.
73 * The values of LOCr() and LOCc() may be determined via a call to the
74 * ScaLAPACK tool function, NUMROC:
75 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
76 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
77 * An upper bound for these quantities may be computed by:
78 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
79 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
80 *
81 * Because vectors may be viewed as a subclass of matrices, a
82 * distributed vector is considered to be a distributed matrix.
83 *
84 * When the result of a vector-oriented PBLAS call is a scalar, it will
85 * be made available only within the scope which owns the vector(s)
86 * being operated on. Let X be a generic term for the input vector(s).
87 * Then, the processes which receive the answer will be (note that if
88 * an operation involves more than one vector, the processes which re-
89 * ceive the result will be the union of the following calculation for
90 * each vector):
91 *
92 * If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process
93 * row or process column owns the vector operand, therefore only the
94 * process of coordinate {RSRC_X, CSRC_X} receives the result;
95 *
96 * If INCX = M_X, then sub( X ) is a vector distributed over a process
97 * row. Each process part of this row receives the result;
98 *
99 * If INCX = 1, then sub( X ) is a vector distributed over a process
100 * column. Each process part of this column receives the result;
101 *
102 * Based on PCAMAX from Level 1 PBLAS. The change is to use the
103 * 'genuine' absolute value.
104 *
105 * The serial version was contributed to LAPACK by Nick Higham for use
106 * with CLACON.
107 *
108 * Arguments
109 * =========
110 *
111 * N (global input) pointer to INTEGER
112 * The number of components of the distributed vector sub( X ).
113 * N >= 0.
114 *
115 * AMAX (global output) pointer to REAL
116 * The absolute value of the largest entry of the distributed
117 * vector sub( X ) only in the scope of sub( X ).
118 *
119 * INDX (global output) pointer to INTEGER
120 * The global index of the element of the distributed vector
121 * sub( X ) whose real part has maximum absolute value.
122 *
123 * X (local input) COMPLEX array containing the local
124 * pieces of a distributed matrix of dimension of at least
125 * ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) )
126 * This array contains the entries of the distributed vector
127 * sub( X ).
128 *
129 * IX (global input) INTEGER
130 * The row index in the global array X indicating the first
131 * row of sub( X ).
132 *
133 * JX (global input) INTEGER
134 * The column index in the global array X indicating the
135 * first column of sub( X ).
136 *
137 * DESCX (global and local input) INTEGER array of dimension DLEN_.
138 * The array descriptor for the distributed matrix X.
139 *
140 * INCX (global input) INTEGER
141 * The global increment for the elements of X. Only two values
142 * of INCX are supported in this version, namely 1 and M_X.
143 * INCX must not be zero.
144 *
145 * =====================================================================
146 *
147 * .. Parameters ..
148  INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
149  $ LLD_, MB_, M_, NB_, N_, RSRC_
150  parameter( block_cyclic_2d = 1, dlen_ = 9, dtype_ = 1,
151  $ ctxt_ = 2, m_ = 3, n_ = 4, mb_ = 5, nb_ = 6,
152  $ rsrc_ = 7, csrc_ = 8, lld_ = 9 )
153  COMPLEX ZERO
154  parameter( zero = ( 0.0e+0, 0.0e+0 ) )
155 * ..
156 * .. Local Scalars ..
157  CHARACTER CBTOP, CCTOP, RBTOP, RCTOP
158  INTEGER ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW,
159  $ JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP,
160  $ NPCOL, NPROW, NQ
161 * ..
162 * .. Local Arrays ..
163  COMPLEX WORK( 2 )
164 * ..
165 * .. External Subroutines ..
166  EXTERNAL blacs_gridinfo, ccombamax1, cgamx2d,
167  $ igebr2d, igebs2d, infog2l, pctreecomb,
168  $ pb_topget
169 * ..
170 * .. External Functions ..
171  LOGICAL LSAME
172  INTEGER ICMAX1, INDXL2G, NUMROC
173  EXTERNAL icmax1, indxl2g, numroc
174 * ..
175 * .. Intrinsic Functions ..
176  INTRINSIC abs, cmplx, mod, nint, real
177 * ..
178 * .. Executable Statements ..
179 *
180 * Get grid parameters
181 *
182  ictxt = descx( ctxt_ )
183  CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )
184 *
185 * Quick return if possible.
186 *
187  indx = 0
188  amax = zero
189  IF( n.LE.0 )
190  $ RETURN
191 *
192 * Retrieve local information for vector X.
193 *
194  ldx = descx( lld_ )
195  CALL infog2l( ix, jx, descx, nprow, npcol, myrow, mycol, iix, jjx,
196  $ ixrow, ixcol )
197 *
198  IF( incx.EQ.1 .AND. descx( m_ ).EQ.1 .AND. n.EQ.1 ) THEN
199  indx = jx
200  amax = x( iix+(jjx-1)*ldx )
201  RETURN
202  END IF
203 *
204 * Find the maximum value and its index
205 *
206  IF( incx.EQ.descx( m_ ) ) THEN
207 *
208  IF( myrow.EQ.ixrow ) THEN
209 *
210  icoff = mod( jx-1, descx( nb_ ) )
211  nq = numroc( n+icoff, descx( nb_ ), mycol, ixcol, npcol )
212  IF( mycol.EQ.ixcol )
213  $ nq = nq-icoff
214 *
215  CALL pb_topget( ictxt, 'Broadcast', 'Rowwise', rbtop )
216 *
217  IF( lsame( rbtop, ' ' ) ) THEN
218 *
219  IF( nq.GT.0 ) THEN
220  lcindx = jjx-1+icmax1( nq, x( iix+(jjx-1)*ldx ), ldx )
221  work( 1 ) = x( iix+(lcindx-1)*ldx )
222  work( 2 ) = cmplx( real( indxl2g( lcindx,
223  $ descx( nb_ ), mycol, descx( csrc_ ), npcol ) ) )
224  ELSE
225  work( 1 ) = zero
226  work( 2 ) = zero
227  END IF
228 *
229  CALL pctreecomb( ictxt, 'Row', 2, work, -1, mycol,
230  $ ccombamax1 )
231 *
232  amax = work( 1 )
233  IF( amax.EQ.zero ) THEN
234  indx = jx
235  ELSE
236  indx = nint( real( work( 2 ) ) )
237  END IF
238 *
239  ELSE
240 *
241  CALL pb_topget( ictxt, 'Combine', 'Rowwise', rctop )
242 *
243  IF( nq.GT.0 ) THEN
244  lcindx = jjx-1+icmax1( nq, x( iix+(jjx-1)*ldx ), ldx )
245  amax = x( iix + (lcindx-1)*ldx )
246  ELSE
247  amax = zero
248  END IF
249 *
250 * Find the maximum value
251 *
252  CALL cgamx2d( ictxt, 'Rowwise', rctop, 1, 1, amax, 1,
253  $ idumm, maxpos, 1, -1, myrow )
254 *
255  IF( amax.NE.zero ) THEN
256 *
257 * Broadcast corresponding global index
258 *
259  IF( mycol.EQ.maxpos ) THEN
260  indx = indxl2g( lcindx, descx( nb_ ), mycol,
261  $ descx( csrc_ ), npcol )
262  CALL igebs2d( ictxt, 'Rowwise', rbtop, 1, 1, indx,
263  $ 1 )
264  ELSE
265  CALL igebr2d( ictxt, 'Rowwise', rbtop, 1, 1, indx,
266  $ 1, myrow, maxpos )
267  END IF
268 *
269  ELSE
270 *
271  indx = jx
272 *
273  END IF
274 *
275  END IF
276 *
277  END IF
278 *
279  ELSE
280 *
281  IF( mycol.EQ.ixcol ) THEN
282 *
283  iroff = mod( ix-1, descx( mb_ ) )
284  np = numroc( n+iroff, descx( mb_ ), myrow, ixrow, nprow )
285  IF( myrow.EQ.ixrow )
286  $ np = np-iroff
287 *
288  CALL pb_topget( ictxt, 'Broadcast', 'Columnwise', cbtop )
289 *
290  IF( lsame( cbtop, ' ' ) ) THEN
291 *
292  IF( np.GT.0 ) THEN
293  lcindx = iix-1+icmax1( np, x( iix+(jjx-1)*ldx ), 1 )
294  work( 1 ) = x( lcindx + (jjx-1)*ldx )
295  work( 2 ) = cmplx( real( indxl2g( lcindx,
296  $ descx( mb_ ), myrow, descx( rsrc_ ), nprow ) ) )
297  ELSE
298  work( 1 ) = zero
299  work( 2 ) = zero
300  END IF
301 *
302  CALL pctreecomb( ictxt, 'Column', 2, work, -1, mycol,
303  $ ccombamax1 )
304 *
305  amax = work( 1 )
306  IF( amax.EQ.zero ) THEN
307  indx = ix
308  ELSE
309  indx = nint( real( work( 2 ) ) )
310  END IF
311 *
312  ELSE
313 *
314  CALL pb_topget( ictxt, 'Combine', 'Columnwise', cctop )
315 *
316  IF( np.GT.0 ) THEN
317  lcindx = iix-1+icmax1( np, x( iix+(jjx-1)*ldx ), 1 )
318  amax = x( lcindx + (jjx-1)*ldx )
319  ELSE
320  amax = zero
321  END IF
322 *
323 * Find the maximum value
324 *
325  CALL cgamx2d( ictxt, 'Columnwise', cctop, 1, 1, amax, 1,
326  $ maxpos, idumm, 1, -1, mycol )
327 *
328  IF( amax.NE.zero ) THEN
329 *
330 * Broadcast corresponding global index
331 *
332  IF( myrow.EQ.maxpos ) THEN
333  indx = indxl2g( lcindx, descx( mb_ ), myrow,
334  $ descx( rsrc_ ), nprow )
335  CALL igebs2d( ictxt, 'Columnwise', cbtop, 1, 1,
336  $ indx, 1 )
337  ELSE
338  CALL igebr2d( ictxt, 'Columnwise', cbtop, 1, 1,
339  $ indx, 1, maxpos, mycol )
340  END IF
341 *
342  ELSE
343 *
344  indx = ix
345 *
346  END IF
347 *
348  END IF
349 *
350  END IF
351 *
352  END IF
353 *
354  RETURN
355 *
356 * End of PCMAX1
357 *
358  END
359 *
360  SUBROUTINE ccombamax1 ( V1, V2 )
361 *
362 * -- ScaLAPACK auxiliary routine (version 1.7) --
363 * University of Tennessee, Knoxville, Oak Ridge National Laboratory,
364 * and University of California, Berkeley.
365 * May 1, 1997
366 *
367 * .. Array Arguments ..
368  COMPLEX V1( 2 ), V2( 2 )
369 * ..
370 *
371 * Purpose
372 * =======
373 *
374 * CCOMBAMAX1 finds the element having maximum real part absolute
375 * value as well as its corresponding globl index.
376 *
377 * Arguments
378 * =========
379 *
380 * V1 (local input/local output) COMPLEX array of
381 * dimension 2. The first maximum absolute value element and
382 * its global index. V1(1) = AMAX, V1(2) = INDX.
383 *
384 * V2 (local input) COMPLEX array of dimension 2.
385 * The second maximum absolute value element and its global
386 * index. V2(1) = AMAX, V2(2) = INDX.
387 *
388 * =====================================================================
389 *
390 * .. Intrinsic Functions ..
391  INTRINSIC abs, real
392 * ..
393 * .. Executable Statements ..
394 *
395  IF( abs( real( v1( 1 ) ) ).LT.abs( real( v2( 1 ) ) ) ) THEN
396  v1( 1 ) = v2( 1 )
397  v1( 2 ) = v2( 2 )
398  END IF
399 *
400  RETURN
401 *
402 * End of CCOMBAMAX1
403 *
404  END
cmplx
float cmplx[2]
Definition: pblas.h:132
pcmax1
subroutine pcmax1(N, AMAX, INDX, X, IX, JX, DESCX, INCX)
Definition: pcmax1.f:2
infog2l
subroutine infog2l(GRINDX, GCINDX, DESC, NPROW, NPCOL, MYROW, MYCOL, LRINDX, LCINDX, RSRC, CSRC)
Definition: infog2l.f:3
ccombamax1
subroutine ccombamax1(V1, V2)
Definition: pcmax1.f:361
pctreecomb
subroutine pctreecomb(ICTXT, SCOPE, N, MINE, RDEST0, CDEST0, SUBPTR)
Definition: pctreecomb.f:3