This represents known issue with the LAPACK software package that have not been yet solved. (As of Sunday October 26th 2008.)
- LAPACK does not have any exceptional shift strategy for QZ
- DQDS algorithm (useful to compute the singular values of a matrix)
1 -- LAPACK does not have any exceptional shift strategy for QZ
[1.1-- The problems] LAPACK current shift strategy is known to fail on various pencils. Here are three pencils, (A1,B1), (A2,B2), (A3,B3) where LAPACK fails A1 = [ 0 0 1 ; 0 1 0 ; 1 0 0 ]; B1 = [ 2 0 0 ; 0 0 1 ; 0 1 0 ]; Pencil (A1,B1) has been sent to us by Zbigniew Leyk, Senior Lecturer, Department of Computer Science, Texas A & M University on Jan 30 2007. See: http://icl.cs.utk.edu/lapack-forum/viewtopic.php?t=317 A2 = [ 0 0 0 1 ; 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ]; B2 = [ 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 ]; Pencil (A2,B2) has been sent to us by Daniel Kressner in 2005 who quote Vasile Sima. A3 = [ 1 1 0; 0 0 -1; 1 0 0 ]; B3 = [ -1 0 -1; 0 -1 0; -1 0 -1 ]; Pencil (A3,B3) has been sent to us by Patrick Alken from University of Colorado at Boulder on Apr 24 2007. See: http://icl.cs.utk.edu/lapack-forum/viewtopic.php?t=382 A4 = [ 1 1e-9 0; 1e-9 1 0; 0 0 1 ]; B4 = [ 1 0 0; 0 1 0; 0 0 1 ]; Pencil (A4,B4) has been sent to us by Micheal Baudin from the Scilab project on Tu Oct 28th 2008. This one pencil make ZGGEV fail. DGGEV handles it correctly. http://bugzilla.scilab.org/show_bug.cgi?id=3652 [1.2-- The fixes] [1.2.a-- Mathworks' patch] Bobby Cheng sent the lapackers the Mathworks patch to fix LAPACK QZ. This patch works fine on the pencils (A1,B1) and (A2,B2). But fails on pencil (A3,B3). As observed by Jim Demmel, matlab qz works fine with pencil (A3,B3). Date: Thu, 15 Sep 2005 20:35:00 +0200 (CEST) From: Daniel Kressner Subject: Re: [Lapackers] Mathworks modif on LAPACK Hi, below are some comments on Mathworks' modifications to DGEBAL and DHGEQZ. > 1)Change SCLFAC from 8 to 2 to be the same as LINPACK. > > File: > cgebal.f > zgebal.f > sgebal.f > dgebal.f I support the point of view taken by Mathworks. Changing SCLFAC from 8 to 2 may result in a few extra iterations of the balancing algorithm, but it may also give a balanced matrix of slightly smaller norm and consequently one or two more accurate digits in the computed eigenvalues. I am not aware that the execution time needed by balancing could be a major concern when solving nonsymmetric eigenvalues. > 3)Change to implicit shift and exceptional shift calculations. > Some problem were not converging. > > Files: > zhgeqz.f (lines 602-672) > chgeqz.f (lines 602-672) > shgeqz.f (lines 657-667, 819-821) > dhgeqz.f (lines 657-667, 819-821) As far as I can see, Mathworks applied four changes to *HGEQZ (not all of them are marked in the code): (1) Exceptional shift strategy As also pointed out by Vasile Sima, the exceptional shift strategy for the QZ algorithm currently implemented in DHGEQZ seems to be less effective than the one for the QR algorithm. For example, it fails for the matrix pair [ 0 0 0 1 ] [ 1 0 0 0 ] A = [ 0 1 0 0 ], B = identity. [ 0 0 1 0 ] LAPACK's exceptional shift is zero (this is the same as the standard shift -> no convergence). Mathworks' exceptional shift is one (much better). (2) Choice of single real shift DHGEQZ applies a single shift QZ algorithm if the computed shifts are real. This seems to be a relict of Ward's combination shift QZ algorithm, which has no meaningful purpose anymore (applying two single shift QZ iterations is more expensive than applying one double shift QZ iteration, even in terms of flops). If two real shifts are computed, Mathworks' change has the effect that always the shift closer to the last diagonal element of A\B is used. If I remember correctly, this is in the spirit of what Wilkinson proposed for symmetric matrices. I support this change and expect that it can lead to slightly faster convergence. (3) Mathworks included B22 = -B22 on line 820 (in DHGEQZ) This is a bug fix and I highly recommend to include it. It seems that DHGEQZ computes wrong complex conjugate eigenvalue pairs if DLASV2 computes negative diagonal elements (I am not sure whether this is an unlikely event in the context of the QZ algorithm). (4) Mathworks uses the workspace to store the number of iterations needed for each eigenvalue. This is probably used for debugging and should not be included in LAPACK. As announced at the Householder meeting, Bo Kagstrom and I are currently working on an extension of Ralph Byers' multishift aggressive QR code to QZ. This code will take the modifs of Mathworks into account. However, I suggest to include these modifs (particularly (1) and (3)) already in the next update of LAPACK. With best regards, Daniel [1.2.b-- David Day's patch] I found in the lapackers archive a patch from David Day. From: David Day (Sandia) Date: Tue Sep 7 09:49:40 2004 Subject: [Lapackers] Re: [Fwd: Convergence failure of LAPACK's zggevx] Dear Jim Demmel: The reported problem does involve the Exceptional shifts in QZ. Changing the QZ exceptional shift to something similar to what I have advixed for real QR and real QZ does fix this particular bug report. In my version of zhgeqz.f lines 603 read ELSE * * Exceptional shift. Chosen for no particularly good reason. * ESHIFT = ESHIFT + DCONJG( ( ASCALE*A( ILAST-1, ILAST ) ) / $ ( BSCALE*B( ILAST-1, ILAST-1 ) ) ) After these lines, I added TEMP = $ ABS(ASCALE*A(ILAST-1,ILAST-2)/(BSCALE*B(ILAST-1,ILAST-1)))+ $ ABS(ASCALE*A(ILAST ,ILAST-1)/(BSCALE*B(ILAST-2,ILAST-2))) TEMP2 = $ ABS(ASCALE*A(ILAST-1,ILAST-2)/(BSCALE*B(ILAST-2,ILAST-2)))+ $ ABS(ASCALE*A(ILAST ,ILAST-1)/(BSCALE*B(ILAST-1,ILAST-1))) ESHIFT = DCMPLX(MIN(TEMP, TEMP2),0) This changes the value of the shift, now set on line 616 SHIFT = ESHIFT Thank you for including me in this discussion. --David M. Day [1.2.c-- Patrick Alken's patch] Date: Wed, 2 May 2007 16:14:55 -0600 From: Patrick AlkenSubject: Re: [Lapack] dggev fails on non-singular system Hello, I have done some testing and I have found that the following patch fixes all the 3x3 and 4x4 matrix systems that I found failures on: Replace (in dhgeqz.f): 629 IF( ( DBLE( MAXIT )*SAFMIN )*ABS( H( ILAST-1, ILAST) ).LT. 630 $ ABS( T( ILAST-1, ILAST-1 ) ) ) THEN 631 ESHIFT = ESHIFT + H( ILAST-1, ILAST ) / 632 $ T( ILAST-1, ILAST-1 ) 633 ELSE 634 ESHIFT = ESHIFT + ONE / ( SAFMIN*DBLE( MAXIT ) ) 635 END IF with: 629 IF( ( DBLE( MAXIT )*SAFMIN )*ABS( H( ILAST, ILAST-1) ).LT. 630 $ ABS( T( ILAST-1, ILAST-1 ) ) ) THEN 631 ESHIFT = ESHIFT + 0.736 * H( ILAST, ILAST-1 ) / 632 $ T( ILAST-1, ILAST-1 ) 633 ELSE 634 ESHIFT = ESHIFT + ONE / ( SAFMIN*DBLE( MAXIT ) ) 635 END IF The idea here is the following: in the previous code, its certainly possible for H(ILAST-1,ILAST) to equal 0, in which case the test will pass but ESHIFT will not be modified at all. I have just replaced it with H(ILAST,ILAST-1) which is guaranteed not to be zero, since its a subdiagonal element and the previous tests will search for zeros on the subdiagonal. The 0.736 coefficient is arbitrary - without it, I still found a small number of convergence failures on 4x4 integer systems. Patrick Alken [1.3-- Fix] Wait for a new QZ with early agressive deflation and multishift.2 -- DQDS algorithm (useful to compute the singular values of a matrix)
Cleve Moler reported that LAPACK DLASQ1 routine (called by DBDSQR when only the singular values are needed) fails on the bidiagonal matrix: set_Moler_200.c. Osni Marques provided us with DLASQ1-6 routines that passes the matrix. I have done some investigation to find out why the 200 by 200 matrix that Cleve Moler sent to us weeks ago leads to a failure in Matlab's "norm". My understanding is that "norm" resorts to LAPACK's dbdsqr, which in turn calls dqds, i.e. dlasq1 (which calls dlasq2 etc), if only singular vectors are desired. Indeed, LAPACK 3.1.1's dlasq1 returns INFO=2 (Matlab prints "Warning: SVD did not converge at index = 2") for Cleve's matrix, meaning "current block of Z not diagonalized after 30*N iterations (in inner while loop)". This is related to the NBIG = 30*( N0-I0+1 ) limit that is set in dlasq2, where I0 and N0 indicate the active (sub)matrix dlasq2 is dealing with. It starts with I0=1 and N0=N if there is no initial splitting; I0 and N0 may change later if splittings occur. These are my findings: 1) Contrary to what I had originally thought, LAPACK 3.1.1's dlasq2 does not contain our last modifications, in particular a mechanism to reverse (flip) the qd-array in some adverse situations. The lack of this mechanism may greatly delay convergence. In fact, LAPACK 3.1.1's dqds works if we allow more than 30*N iterations: without flipping, convergence is achieved in 8073 iterations (to achieve this I changed 30*N to 300*N). In contrast, the latest dqds that we have produced works just fine on Cleve's matrix: convergence is achieved in 969 iterations. Interestingly, LAPACK 3.1.1 converges in 970 iterations if we first flip Cleve's matrix (similarly, "norm" also works fine). 2) In the computation of shifts (by dlasq4), we have TTYPE.EQ.-6, i.e. "Case 6, no information to guide us." This case is based on a variable G that is used for the computation of the shift (S = G*DMIN). Modifications were performed in the dqds routines to make them thread safe (by removing SAVE statements) but in LAPACK 3.1.1 G seems to be set to zero too often. I am afraid that for a sequence of TTYPE.EQ.-6 this may lead to shifts that are not so efficient. My records indicate that I submitted our latest version of dqds for inclusion in LAPACK on 07/11/06 but the changes have not been incorporated (LAPACK 3.1 is dated 11/12/06, and LAPACK 3.1.1 is dated 02/26/07). In any case, I am sending it attached so it can be compared with LAPACK 3.1.1's dqds. I am also attaching Cleve's 200 by 200 matrix as well as its flipped form in case someone (in the cc list) would like to experiment with them. Cheers, Osni The reason for which Osni code is not in LAPACK is that Osni code fail on some of the matrices in the LAPACK testing, for example: set_PB1_30.c. Indeed the double precision routines of Osni are fine, there are only a few problems with the single precision routines. Current status: Hello lapackers, I did not understand upfront what Bobby Cheng was saying with his: "MATLAB switched back to the old algorithm and lost some speed." So I asked Bobby Cheng for their files. MathWorks simply have hacked xBDSQR so that they use the QR algorithm (they simply have commented the lines that call DLASQ1). So they use QR even if they want only the singular values. Matlab does not use the dqds algorithm when computing a norm for example. Julien.