Date: Wed, 29 Jul 2009 11:36:51 -0600
From: Pat Quillen
To: "lapack@cs.utk.edu" <lapack@cs.utk.edu>
Cc: Bobby Cheng, Duncan Po, Penny Anderson, Cleve Moler, lapack
Subject: [Lapack] xLARFP and scaling.

Dear LAPACKers:

Also while attempting to upgrade to LAPACK 3.2, more eigs tests began to fail,
but this time due to differences from DGEQR2 used by the ARPACK routine DSEUPD.
It may be that there is something peculiar going on in ARPACK (which I'm
discussing with Lehoucq and Sorensen) but this encouraged me to look closely at
xLARFP.

We're somewhat troubled by the fact that xLARFP can produce reflectors
 I - tau * v * v'
with tau being arbitrarily small, and the elements in v being arbitrarily
large.  The reflectors produced by xLARFG have 1 <= tau <= 2 and each element
in v bounded above by 1 in absolute value, both of which seem to be desirable
properties.

Here is a very specific example:

Consider x = [1e100; 5e-62].  Calling DLARFP to construct a reflector gives tau
\approx 1.48e-323!  Applying that reflector to y = [1; 1] gives Hy \approx [1;
-1.3175] although the reflector should reflect over something quite nearly
parallel to the x-axis.  By contrast, DLARFG constructs a reflector with tau
\approx 2 and then maps y to Hy = [-1; 1].  Also, making x(2) smaller (but
nonzero) will cause DLARFG to continue to produce a reflector H such that Hy
(as applied by DLARF) will be [-1;1], whereas DLARFP will stop producing such a
reflector once tau underflows.  Note that the attached DLARFP2 (described
below) will produce reflector that maps y to [1; -1] when constructed with x as
above, and will continue to do so as x(2) -> 0.

We don't particularly want to patch LAPACK for our usage, but again, we are a
bit concerned about the use of xLARFP underneath xGEQR2.  Our customers very
frequently have poorly scaled data and will probably run into examples like
that shown above.  At this point, we are inclined to replace usages of xLARFP
with xLARFG.

As for general usage within LAPACK, there seem to be a few ways to deal with
this behavior:

1.  Don't use xLARFP underneath xGEQR2 and introduce a new suite of routines
which form QR using xLARFP.

2.  Change xLARFP to construct tau and v with certain bounds.  One possible
approach that occurred to me is the following: When alpha > 0, DLARFP computes
tau = sin^2(t)/(1 + cos(t)) where t is the angle between the vector to be
reflected and the target.  Here, t \in [0, pi/2) (as alpha > 0) and it is
apparent that tau may become arbitrarily small.  One could factor sin(t) out of
tau and instead set tau = 1/(1+cos(t)) which is now bounded (0.5 <= tau <= 1)
and one could scale v such that v(1) = sin(t) instead of 1.  This scaling then
makes v such that |v_k| <= 2.  Applying the reflectors then necessitates
setting v(1) to sin(t) instead of 1 prior to calling DLARF.  This requires
either storing sin(t) and carrying it around, or attempting to recover it from
tau as v(1) = sin(t) = sqrt(2*tau - 1)/tau.  While the latter does not require
extra memory, it looks sort of dodgy thanks to a subtraction and a square root.
I'm attaching a version of DLARFP which has been modified to take the above
approach and which stores v(1) in workspace.  I confess that I haven't thought
deeply about how this generalizes to complex.

3.  Do nothing except please correct the comments in xLARFP which state that 1
<= tau <= 2.

Any advice you can provide is welcome.

Thanks.
Pat.