SCALAPACK 2.2.2
LAPACK: Linear Algebra PACKage
Loading...
Searching...
No Matches
pclassq.f
Go to the documentation of this file.
1 SUBROUTINE pclassq( N, X, IX, JX, DESCX, INCX, SCALE, SUMSQ )
2*
3* -- ScaLAPACK auxiliary routine (version 1.7) --
4* University of Tennessee, Knoxville, Oak Ridge National Laboratory,
5* and University of California, Berkeley.
6* May 1, 1997
7*
8* .. Scalar Arguments ..
9 INTEGER IX, INCX, JX, N
10 REAL SCALE, SUMSQ
11* ..
12* .. Array Arguments ..
13 INTEGER DESCX( * )
14 COMPLEX X( * )
15* ..
16*
17* Purpose
18* =======
19*
20* PCLASSQ returns the values scl and smsq such that
21*
22* ( scl**2 )*smsq = x( 1 )**2 +...+ x( n )**2 + ( scale**2 )*sumsq,
23*
24* where x( i ) = sub( X ) = abs( X( IX+(JX-1)*DESCX(M_)+(i-1)*INCX ) ).
25* The value of sumsq is assumed to be at least unity and the value of
26* ssq will then satisfy
27*
28* 1.0 .le. ssq .le. ( sumsq + 2*n ).
29*
30* scale is assumed to be non-negative and scl returns the value
31*
32* scl = max( scale, abs( real( x( i ) ) ), abs( aimag( x( i ) ) ) ),
33* i
34*
35* scale and sumsq must be supplied in SCALE and SUMSQ respectively.
36* SCALE and SUMSQ are overwritten by scl and ssq respectively.
37*
38* The routine makes only one pass through the vector sub( X ).
39*
40* Notes
41* =====
42*
43* Each global data object is described by an associated description
44* vector. This vector stores the information required to establish
45* the mapping between an object element and its corresponding process
46* and memory location.
47*
48* Let A be a generic term for any 2D block cyclicly distributed array.
49* Such a global array has an associated description vector DESCA.
50* In the following comments, the character _ should be read as
51* "of the global array".
52*
53* NOTATION STORED IN EXPLANATION
54* --------------- -------------- --------------------------------------
55* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
56* DTYPE_A = 1.
57* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
58* the BLACS process grid A is distribu-
59* ted over. The context itself is glo-
60* bal, but the handle (the integer
61* value) may vary.
62* M_A (global) DESCA( M_ ) The number of rows in the global
63* array A.
64* N_A (global) DESCA( N_ ) The number of columns in the global
65* array A.
66* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
67* the rows of the array.
68* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
69* the columns of the array.
70* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
71* row of the array A is distributed.
72* CSRC_A (global) DESCA( CSRC_ ) The process column over which the
73* first column of the array A is
74* distributed.
75* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
76* array. LLD_A >= MAX(1,LOCr(M_A)).
77*
78* Let K be the number of rows or columns of a distributed matrix,
79* and assume that its process grid has dimension p x q.
80* LOCr( K ) denotes the number of elements of K that a process
81* would receive if K were distributed over the p processes of its
82* process column.
83* Similarly, LOCc( K ) denotes the number of elements of K that a
84* process would receive if K were distributed over the q processes of
85* its process row.
86* The values of LOCr() and LOCc() may be determined via a call to the
87* ScaLAPACK tool function, NUMROC:
88* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
89* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
90* An upper bound for these quantities may be computed by:
91* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
92* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
93*
94* Because vectors may be viewed as a subclass of matrices, a
95* distributed vector is considered to be a distributed matrix.
96*
97* The result are only available in the scope of sub( X ), i.e if
98* sub( X ) is distributed along a process row, the correct results are
99* only available in this process row of the grid. Similarly if sub( X )
100* is distributed along a process column, the correct results are only
101* available in this process column of the grid.
102*
103* Arguments
104* =========
105*
106* N (global input) INTEGER
107* The length of the distributed vector sub( X ).
108*
109* X (input) COMPLEX
110* The vector for which a scaled sum of squares is computed.
111* x( i ) = X(IX+(JX-1)*M_X +(i-1)*INCX ), 1 <= i <= n.
112*
113* IX (global input) INTEGER
114* The row index in the global array X indicating the first
115* row of sub( X ).
116*
117* JX (global input) INTEGER
118* The column index in the global array X indicating the
119* first column of sub( X ).
120*
121* DESCX (global and local input) INTEGER array of dimension DLEN_.
122* The array descriptor for the distributed matrix X.
123*
124* INCX (global input) INTEGER
125* The global increment for the elements of X. Only two values
126* of INCX are supported in this version, namely 1 and M_X.
127* INCX must not be zero.
128*
129* SCALE (local input/local output) REAL
130* On entry, the value scale in the equation above.
131* On exit, SCALE is overwritten with scl , the scaling factor
132* for the sum of squares.
133*
134* SUMSQ (local input/local output) REAL
135* On entry, the value sumsq in the equation above.
136* On exit, SUMSQ is overwritten with smsq , the basic sum of
137* squares from which scl has been factored out.
138*
139* =====================================================================
140*
141* .. Parameters ..
142 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
143 $ LLD_, MB_, M_, NB_, N_, RSRC_
144 parameter( block_cyclic_2d = 1, dlen_ = 9, dtype_ = 1,
145 $ ctxt_ = 2, m_ = 3, n_ = 4, mb_ = 5, nb_ = 6,
146 $ rsrc_ = 7, csrc_ = 8, lld_ = 9 )
147 REAL ZERO
148 parameter( zero = 0.0e+0 )
149* ..
150* .. Local Scalars ..
151 INTEGER I, ICOFF, ICTXT, IIX, IOFF, IROFF, IXCOL,
152 $ IXROW, JJX, LDX, MYCOL, MYROW, NP, NPCOL,
153 $ NPROW, NQ
154 REAL TEMP1
155* ..
156* .. Local Arrays ..
157 REAL WORK( 2 )
158* ..
159* .. External Subroutines ..
160 EXTERNAL blacs_gridinfo, infog2l, pstreecomb, scombssq
161* ..
162* .. External Functions ..
163 INTEGER NUMROC
164 EXTERNAL numroc
165* ..
166* .. Intrinsic Functions ..
167 INTRINSIC abs, aimag, mod, real
168* ..
169* .. Executable Statements ..
170*
171* Get grid parameters.
172*
173 ictxt = descx( ctxt_ )
174 CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )
175*
176* Figure local indexes
177*
178 CALL infog2l( ix, jx, descx, nprow, npcol, myrow, mycol, iix, jjx,
179 $ ixrow, ixcol )
180*
181 ldx = descx( lld_ )
182 IF( incx.EQ.descx( m_ ) ) THEN
183*
184* X is rowwise distributed.
185*
186 IF( myrow.NE.ixrow )
187 $ RETURN
188 icoff = mod( jx, descx( nb_ ) )
189 nq = numroc( n+icoff, descx( nb_ ), mycol, ixcol, npcol )
190 IF( mycol.EQ.ixcol )
191 $ nq = nq - icoff
192*
193* Code direct from LAPACK's CLASSQ, (save subroutine call)
194*
195 IF( nq.GT.0 ) THEN
196 ioff = iix + ( jjx - 1 ) * ldx
197 DO 10 i = 1, nq
198 IF( real( x( ioff ) ).NE.zero ) THEN
199 temp1 = abs( real( x( ioff ) ) )
200 IF( scale.LT.temp1 ) THEN
201 sumsq = 1 + sumsq * ( scale / temp1 )**2
202 scale = temp1
203 ELSE
204 sumsq = sumsq + ( temp1 / scale )**2
205 END IF
206 END IF
207 IF( aimag( x( ioff ) ).NE.zero ) THEN
208 temp1 = abs( aimag( x( ioff ) ) )
209 IF( scale.LT.temp1 ) THEN
210 sumsq = 1 + sumsq*( scale / temp1 )**2
211 scale = temp1
212 ELSE
213 sumsq = sumsq + ( temp1 / scale )**2
214 END IF
215 END IF
216 ioff = ioff + ldx
217 10 CONTINUE
218 END IF
219*
220* Take local result and find global
221*
222 work( 1 ) = scale
223 work( 2 ) = sumsq
224*
225 CALL pstreecomb( ictxt, 'Rowwise', 2, work, -1, ixcol,
226 $ scombssq )
227*
228 scale = work( 1 )
229 sumsq = work( 2 )
230*
231 ELSE IF( incx.EQ.1 ) THEN
232*
233* X is columnwise distributed.
234*
235 IF( mycol.NE.ixcol )
236 $ RETURN
237 iroff = mod( ix, descx( mb_ ) )
238 np = numroc( n+iroff, descx( mb_ ), myrow, ixrow, nprow )
239 IF( myrow.EQ.ixrow )
240 $ np = np - iroff
241*
242* Code direct from LAPACK's CLASSQ, (save subroutine call)
243*
244 IF( np.GT.0 ) THEN
245 ioff = iix + ( jjx - 1 ) * ldx
246 DO 20 i = 1, np
247 IF( real( x( ioff ) ).NE.zero ) THEN
248 temp1 = abs( real( x( ioff ) ) )
249 IF( scale.LT.temp1 ) THEN
250 sumsq = 1 + sumsq*( scale / temp1 )**2
251 scale = temp1
252 ELSE
253 sumsq = sumsq + ( temp1 / scale )**2
254 END IF
255 END IF
256 IF( aimag( x( ioff ) ).NE.zero ) THEN
257 temp1 = abs( aimag( x( ioff ) ) )
258 IF( scale.LT.temp1 ) THEN
259 sumsq = 1 + sumsq*( scale / temp1 )**2
260 scale = temp1
261 ELSE
262 sumsq = sumsq + ( temp1 / scale )**2
263 END IF
264 END IF
265 ioff = ioff + 1
266 20 CONTINUE
267 END IF
268*
269* Take local result and find global
270*
271 work( 1 ) = scale
272 work( 2 ) = sumsq
273*
274 CALL pstreecomb( ictxt, 'Columnwise', 2, work, -1, ixcol,
275 $ scombssq )
276*
277 scale = work( 1 )
278 sumsq = work( 2 )
279*
280 END IF
281*
282 RETURN
283*
284* End of PCLASSQ
285*
286 END
subroutine infog2l(grindx, gcindx, desc, nprow, npcol, myrow, mycol, lrindx, lcindx, rsrc, csrc)
Definition infog2l.f:3
subroutine pclassq(n, x, ix, jx, descx, incx, scale, sumsq)
Definition pclassq.f:2
subroutine pstreecomb(ictxt, scope, n, mine, rdest0, cdest0, subptr)
Definition pstreecomb.f:3
subroutine scombssq(v1, v2)
Definition pstreecomb.f:258