Date: Wed, 29 Jul 2009 11:36:51 -0600 From: Pat Quillen To: "lapack@cs.utk.edu" Cc: Bobby Cheng, Duncan Po, Penny Anderson, Cleve Moler, lapack Subject: [Lapack] xLARFP and scaling. Dear LAPACKers: Also while attempting to upgrade to LAPACK 3.2, more eigs tests began to fail, but this time due to differences from DGEQR2 used by the ARPACK routine DSEUPD. It may be that there is something peculiar going on in ARPACK (which I'm discussing with Lehoucq and Sorensen) but this encouraged me to look closely at xLARFP. We're somewhat troubled by the fact that xLARFP can produce reflectors I - tau * v * v' with tau being arbitrarily small, and the elements in v being arbitrarily large. The reflectors produced by xLARFG have 1 <= tau <= 2 and each element in v bounded above by 1 in absolute value, both of which seem to be desirable properties. Here is a very specific example: Consider x = [1e100; 5e-62]. Calling DLARFP to construct a reflector gives tau \approx 1.48e-323! Applying that reflector to y = [1; 1] gives Hy \approx [1; -1.3175] although the reflector should reflect over something quite nearly parallel to the x-axis. By contrast, DLARFG constructs a reflector with tau \approx 2 and then maps y to Hy = [-1; 1]. Also, making x(2) smaller (but nonzero) will cause DLARFG to continue to produce a reflector H such that Hy (as applied by DLARF) will be [-1;1], whereas DLARFP will stop producing such a reflector once tau underflows. Note that the attached DLARFP2 (described below) will produce reflector that maps y to [1; -1] when constructed with x as above, and will continue to do so as x(2) -> 0. We don't particularly want to patch LAPACK for our usage, but again, we are a bit concerned about the use of xLARFP underneath xGEQR2. Our customers very frequently have poorly scaled data and will probably run into examples like that shown above. At this point, we are inclined to replace usages of xLARFP with xLARFG. As for general usage within LAPACK, there seem to be a few ways to deal with this behavior: 1. Don't use xLARFP underneath xGEQR2 and introduce a new suite of routines which form QR using xLARFP. 2. Change xLARFP to construct tau and v with certain bounds. One possible approach that occurred to me is the following: When alpha > 0, DLARFP computes tau = sin^2(t)/(1 + cos(t)) where t is the angle between the vector to be reflected and the target. Here, t \in [0, pi/2) (as alpha > 0) and it is apparent that tau may become arbitrarily small. One could factor sin(t) out of tau and instead set tau = 1/(1+cos(t)) which is now bounded (0.5 <= tau <= 1) and one could scale v such that v(1) = sin(t) instead of 1. This scaling then makes v such that |v_k| <= 2. Applying the reflectors then necessitates setting v(1) to sin(t) instead of 1 prior to calling DLARF. This requires either storing sin(t) and carrying it around, or attempting to recover it from tau as v(1) = sin(t) = sqrt(2*tau - 1)/tau. While the latter does not require extra memory, it looks sort of dodgy thanks to a subtraction and a square root. I'm attaching a version of DLARFP which has been modified to take the above approach and which stores v(1) in workspace. I confess that I haven't thought deeply about how this generalizes to complex. 3. Do nothing except please correct the comments in xLARFP which state that 1 <= tau <= 2. Any advice you can provide is welcome. Thanks. Pat.