ScaLAPACK 2.1  2.1
ScaLAPACK: Scalable Linear Algebra PACKage
pzmax1.f
Go to the documentation of this file.
1  SUBROUTINE pzmax1( N, AMAX, INDX, X, IX, JX, DESCX, INCX )
2 *
3 * -- ScaLAPACK auxiliary routine (version 1.7) --
4 * University of Tennessee, Knoxville, Oak Ridge National Laboratory,
5 * and University of California, Berkeley.
6 * May 1, 1997
7 *
8 * .. Scalar Arguments ..
9  INTEGER INDX, INCX, IX, JX, N
10  COMPLEX*16 AMAX
11 * ..
12 * .. Array Arguments ..
13  INTEGER DESCX( * )
14  COMPLEX*16 X( * )
15 * ..
16 *
17 * Purpose
18 * =======
19 *
20 * PZMAX1 computes the global index of the maximum element in absolute
21 * value of a distributed vector sub( X ). The global index is returned
22 * in INDX and the value is returned in AMAX,
23 *
24 * where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1,
25 * X(IX,JX:JX+N-1) if INCX = M_X.
26 *
27 * Notes
28 * =====
29 *
30 * Each global data object is described by an associated description
31 * vector. This vector stores the information required to establish
32 * the mapping between an object element and its corresponding process
33 * and memory location.
34 *
35 * Let A be a generic term for any 2D block cyclicly distributed array.
36 * Such a global array has an associated description vector DESCA.
37 * In the following comments, the character _ should be read as
38 * "of the global array".
39 *
40 * NOTATION STORED IN EXPLANATION
41 * --------------- -------------- --------------------------------------
42 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
43 * DTYPE_A = 1.
44 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
45 * the BLACS process grid A is distribu-
46 * ted over. The context itself is glo-
47 * bal, but the handle (the integer
48 * value) may vary.
49 * M_A (global) DESCA( M_ ) The number of rows in the global
50 * array A.
51 * N_A (global) DESCA( N_ ) The number of columns in the global
52 * array A.
53 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
54 * the rows of the array.
55 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
56 * the columns of the array.
57 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
58 * row of the array A is distributed.
59 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the
60 * first column of the array A is
61 * distributed.
62 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
63 * array. LLD_A >= MAX(1,LOCr(M_A)).
64 *
65 * Let K be the number of rows or columns of a distributed matrix,
66 * and assume that its process grid has dimension p x q.
67 * LOCr( K ) denotes the number of elements of K that a process
68 * would receive if K were distributed over the p processes of its
69 * process column.
70 * Similarly, LOCc( K ) denotes the number of elements of K that a
71 * process would receive if K were distributed over the q processes of
72 * its process row.
73 * The values of LOCr() and LOCc() may be determined via a call to the
74 * ScaLAPACK tool function, NUMROC:
75 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
76 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
77 * An upper bound for these quantities may be computed by:
78 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
79 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
80 *
81 * Because vectors may be viewed as a subclass of matrices, a
82 * distributed vector is considered to be a distributed matrix.
83 *
84 * When the result of a vector-oriented PBLAS call is a scalar, it will
85 * be made available only within the scope which owns the vector(s)
86 * being operated on. Let X be a generic term for the input vector(s).
87 * Then, the processes which receive the answer will be (note that if
88 * an operation involves more than one vector, the processes which re-
89 * ceive the result will be the union of the following calculation for
90 * each vector):
91 *
92 * If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process
93 * row or process column owns the vector operand, therefore only the
94 * process of coordinate {RSRC_X, CSRC_X} receives the result;
95 *
96 * If INCX = M_X, then sub( X ) is a vector distributed over a process
97 * row. Each process part of this row receives the result;
98 *
99 * If INCX = 1, then sub( X ) is a vector distributed over a process
100 * column. Each process part of this column receives the result;
101 *
102 * Based on PZAMAX from Level 1 PBLAS. The change is to use the
103 * 'genuine' absolute value.
104 *
105 * The serial version was contributed to LAPACK by Nick Higham for use
106 * with ZLACON.
107 *
108 * Arguments
109 * =========
110 *
111 * N (global input) pointer to INTEGER
112 * The number of components of the distributed vector sub( X ).
113 * N >= 0.
114 *
115 * AMAX (global output) pointer to DOUBLE PRECISION
116 * The absolute value of the largest entry of the distributed
117 * vector sub( X ) only in the scope of sub( X ).
118 *
119 * INDX (global output) pointer to INTEGER
120 * The global index of the element of the distributed vector
121 * sub( X ) whose real part has maximum absolute value.
122 *
123 * X (local input) COMPLEX*16 array containing the local
124 * pieces of a distributed matrix of dimension of at least
125 * ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) )
126 * This array contains the entries of the distributed vector
127 * sub( X ).
128 *
129 * IX (global input) INTEGER
130 * The row index in the global array X indicating the first
131 * row of sub( X ).
132 *
133 * JX (global input) INTEGER
134 * The column index in the global array X indicating the
135 * first column of sub( X ).
136 *
137 * DESCX (global and local input) INTEGER array of dimension DLEN_.
138 * The array descriptor for the distributed matrix X.
139 *
140 * INCX (global input) INTEGER
141 * The global increment for the elements of X. Only two values
142 * of INCX are supported in this version, namely 1 and M_X.
143 * INCX must not be zero.
144 *
145 * =====================================================================
146 *
147 * .. Parameters ..
148  INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
149  $ LLD_, MB_, M_, NB_, N_, RSRC_
150  parameter( block_cyclic_2d = 1, dlen_ = 9, dtype_ = 1,
151  $ ctxt_ = 2, m_ = 3, n_ = 4, mb_ = 5, nb_ = 6,
152  $ rsrc_ = 7, csrc_ = 8, lld_ = 9 )
153  COMPLEX*16 ZERO
154  parameter( zero = ( 0.0d+0, 0.0d+0 ) )
155 * ..
156 * .. Local Scalars ..
157  CHARACTER CBTOP, CCTOP, RBTOP, RCTOP
158  INTEGER ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW,
159  $ JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP,
160  $ NPCOL, NPROW, NQ
161 * ..
162 * .. Local Arrays ..
163  COMPLEX*16 WORK( 2 )
164 * ..
165 * .. External Subroutines ..
166  EXTERNAL blacs_gridinfo, igebr2d, igebs2d, infog2l,
167  $ pb_topget, pztreecomb, zcombamax1, zgamx2d
168 * ..
169 * .. External Functions ..
170  LOGICAL LSAME
171  INTEGER IZMAX1, INDXL2G, NUMROC
172  EXTERNAL izmax1, indxl2g, numroc
173 * ..
174 * .. Intrinsic Functions ..
175  INTRINSIC abs, dble, dcmplx, mod, nint
176 * ..
177 * .. Executable Statements ..
178 *
179 * Get grid parameters
180 *
181  ictxt = descx( ctxt_ )
182  CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )
183 *
184 * Quick return if possible.
185 *
186  indx = 0
187  amax = zero
188  IF( n.LE.0 )
189  $ RETURN
190 *
191 * Retrieve local information for vector X.
192 *
193  ldx = descx( lld_ )
194  CALL infog2l( ix, jx, descx, nprow, npcol, myrow, mycol, iix, jjx,
195  $ ixrow, ixcol )
196 *
197  IF( incx.EQ.1 .AND. descx( m_ ).EQ.1 .AND. n.EQ.1 ) THEN
198  indx = jx
199  amax = x( iix+(jjx-1)*ldx )
200  RETURN
201  END IF
202 *
203 * Find the maximum value and its index
204 *
205  IF( incx.EQ.descx( m_ ) ) THEN
206 *
207  IF( myrow.EQ.ixrow ) THEN
208 *
209  icoff = mod( jx-1, descx( nb_ ) )
210  nq = numroc( n+icoff, descx( nb_ ), mycol, ixcol, npcol )
211  IF( mycol.EQ.ixcol )
212  $ nq = nq-icoff
213 *
214  CALL pb_topget( ictxt, 'Broadcast', 'Rowwise', rbtop )
215 *
216  IF( lsame( rbtop, ' ' ) ) THEN
217 *
218  IF( nq.GT.0 ) THEN
219  lcindx = jjx-1+izmax1( nq, x( iix+(jjx-1)*ldx ), ldx )
220  work( 1 ) = x( iix+(lcindx-1)*ldx )
221  work( 2 ) = dcmplx( dble( indxl2g( lcindx,
222  $ descx( nb_ ), mycol, descx( csrc_ ), npcol ) ) )
223  ELSE
224  work( 1 ) = zero
225  work( 2 ) = zero
226  END IF
227 *
228  CALL pztreecomb( ictxt, 'Row', 2, work, -1, mycol,
229  $ zcombamax1 )
230 *
231  amax = work( 1 )
232  IF( amax.EQ.zero ) THEN
233  indx = jx
234  ELSE
235  indx = nint( dble( work( 2 ) ) )
236  END IF
237 *
238  ELSE
239 *
240  CALL pb_topget( ictxt, 'Combine', 'Rowwise', rctop )
241 *
242  IF( nq.GT.0 ) THEN
243  lcindx = jjx-1+izmax1( nq, x( iix+(jjx-1)*ldx ), ldx )
244  amax = x( iix + (lcindx-1)*ldx )
245  ELSE
246  amax = zero
247  END IF
248 *
249 * Find the maximum value
250 *
251  CALL zgamx2d( ictxt, 'Rowwise', rctop, 1, 1, amax, 1,
252  $ idumm, maxpos, 1, -1, myrow )
253 *
254  IF( amax.NE.zero ) THEN
255 *
256 * Broadcast corresponding global index
257 *
258  IF( mycol.EQ.maxpos ) THEN
259  indx = indxl2g( lcindx, descx( nb_ ), mycol,
260  $ descx( csrc_ ), npcol )
261  CALL igebs2d( ictxt, 'Rowwise', rbtop, 1, 1, indx,
262  $ 1 )
263  ELSE
264  CALL igebr2d( ictxt, 'Rowwise', rbtop, 1, 1, indx,
265  $ 1, myrow, maxpos )
266  END IF
267 *
268  ELSE
269 *
270  indx = jx
271 *
272  END IF
273 *
274  END IF
275 *
276  END IF
277 *
278  ELSE
279 *
280  IF( mycol.EQ.ixcol ) THEN
281 *
282  iroff = mod( ix-1, descx( mb_ ) )
283  np = numroc( n+iroff, descx( mb_ ), myrow, ixrow, nprow )
284  IF( myrow.EQ.ixrow )
285  $ np = np-iroff
286 *
287  CALL pb_topget( ictxt, 'Broadcast', 'Columnwise', cbtop )
288 *
289  IF( lsame( cbtop, ' ' ) ) THEN
290 *
291  IF( np.GT.0 ) THEN
292  lcindx = iix-1+izmax1( np, x( iix+(jjx-1)*ldx ), 1 )
293  work( 1 ) = x( lcindx + (jjx-1)*ldx )
294  work( 2 ) = dcmplx( dble( indxl2g( lcindx,
295  $ descx( mb_ ), myrow, descx( rsrc_ ), nprow ) ) )
296  ELSE
297  work( 1 ) = zero
298  work( 2 ) = zero
299  END IF
300 *
301  CALL pztreecomb( ictxt, 'Column', 2, work, -1, mycol,
302  $ zcombamax1 )
303 *
304  amax = work( 1 )
305  IF( amax.EQ.zero ) THEN
306  indx = ix
307  ELSE
308  indx = nint( dble( work( 2 ) ) )
309  END IF
310 *
311  ELSE
312 *
313  CALL pb_topget( ictxt, 'Combine', 'Columnwise', cctop )
314 *
315  IF( np.GT.0 ) THEN
316  lcindx = iix-1+izmax1( np, x( iix+(jjx-1)*ldx ), 1 )
317  amax = x( lcindx + (jjx-1)*ldx )
318  ELSE
319  amax = zero
320  END IF
321 *
322 * Find the maximum value
323 *
324  CALL zgamx2d( ictxt, 'Columnwise', cctop, 1, 1, amax, 1,
325  $ maxpos, idumm, 1, -1, mycol )
326 *
327  IF( amax.NE.zero ) THEN
328 *
329 * Broadcast corresponding global index
330 *
331  IF( myrow.EQ.maxpos ) THEN
332  indx = indxl2g( lcindx, descx( mb_ ), myrow,
333  $ descx( rsrc_ ), nprow )
334  CALL igebs2d( ictxt, 'Columnwise', cbtop, 1, 1,
335  $ indx, 1 )
336  ELSE
337  CALL igebr2d( ictxt, 'Columnwise', cbtop, 1, 1,
338  $ indx, 1, maxpos, mycol )
339  END IF
340 *
341  ELSE
342 *
343  indx = ix
344 *
345  END IF
346 *
347  END IF
348 *
349  END IF
350 *
351  END IF
352 *
353  RETURN
354 *
355 * End of PZMAX1
356 *
357  END
358 *
359  SUBROUTINE zcombamax1 ( V1, V2 )
360 *
361 * -- ScaLAPACK auxiliary routine (version 1.7) --
362 * University of Tennessee, Knoxville, Oak Ridge National Laboratory,
363 * and University of California, Berkeley.
364 * May 1, 1997
365 *
366 * .. Array Arguments ..
367  COMPLEX*16 V1( 2 ), V2( 2 )
368 * ..
369 *
370 * Purpose
371 * =======
372 *
373 * ZCOMBAMAX1 finds the element having maximum real part absolute
374 * value as well as its corresponding globl index.
375 *
376 * Arguments
377 * =========
378 *
379 * V1 (local input/local output) COMPLEX*16 array of
380 * dimension 2. The first maximum absolute value element and
381 * its global index. V1(1) = AMAX, V1(2) = INDX.
382 *
383 * V2 (local input) COMPLEX*16 array of dimension 2.
384 * The second maximum absolute value element and its global
385 * index. V2(1) = AMAX, V2(2) = INDX.
386 *
387 * =====================================================================
388 *
389 * .. Intrinsic Functions ..
390  INTRINSIC abs, dble
391 * ..
392 * .. Executable Statements ..
393 *
394  IF( abs( dble( v1( 1 ) ) ).LT.abs( dble( v2( 1 ) ) ) ) THEN
395  v1( 1 ) = v2( 1 )
396  v1( 2 ) = v2( 2 )
397  END IF
398 *
399  RETURN
400 *
401 * End of ZCOMBAMAX1
402 *
403  END
infog2l
subroutine infog2l(GRINDX, GCINDX, DESC, NPROW, NPCOL, MYROW, MYCOL, LRINDX, LCINDX, RSRC, CSRC)
Definition: infog2l.f:3
zcombamax1
subroutine zcombamax1(V1, V2)
Definition: pzmax1.f:360
pztreecomb
subroutine pztreecomb(ICTXT, SCOPE, N, MINE, RDEST0, CDEST0, SUBPTR)
Definition: pztreecomb.f:3
pzmax1
subroutine pzmax1(N, AMAX, INDX, X, IX, JX, DESCX, INCX)
Definition: pzmax1.f:2