Release date: Fr 11/11/11.

This material is based upon work supported by the National Science Foundation and the Department of Energy under Grant No. NSF-OCI-1032861, NSF-CCF-00444486, NSF-CNS 0325873, NSF-EIA 0122599, NSF-ACI-0090127, DOE-DE-FC02-01ER25478, DOE-DE-FC02-06ER25768.

LAPACK is a software package provided by Univ. of Tennessee, Univ. of California, Berkeley, Univ. of Colorado Denver and NAG Ltd..

1. Support and questions:

2. LAPACK 3.4.0: What’s new

  1. xGEQRT: QR factorization (improved interface). Contribution by Rodney James (UC Denver). xGEQRT is analogous to xGEQRF with a modified interface which enables better performance when the blocked reflectors need to be reused. The companion subroutines xGEMQRT apply the reflectors.

  2. xGEQRT3: recursive QR factorization. Contribution by Rodney James (UC Denver). The recursive QR factorization enables cache-oblivious and enables high performance on tall and skinny matrices. See reference [1] below.

  3. xTPQRT: Communication-Avoiding QR sequential kernels. Contribution by Rodney James (UC Denver). These subroutines are useful for updating a QR factorization and are used in sequential and parallel Communication Avoiding QR. These subroutines support the general case Triangle on top of Pentagon which includes as special cases so-called Triangle on top of Triangle and Triangle on top of Square. This is the right-looking version of the subroutines and the subroutines are blocked. The T matrices and the block size are part of the interface. The companion subroutines xTPMQRT apply the reflectors.

  4. CMAKE build system. We are striving to help our users install our libraries seamlessly on their machines. The CMAKE team contributed to our effort to port LAPACK and ScaLAPACK under the CMAKE build system. Building under Windows has never been easier. This also allows us to release dll for Windows, so users no longer need a Fortran compiler to use LAPACK under Windows.

  5. Doxygen documentation. LAPACK routine documentation has never been more accessible. See http://www.netlib.org/lapack/explore-html/.

  6. New website allowing for easier navigation.

  7. LAPACKE - Standard C language APIs for LAPACK. Since LAPACK 3.3.0, LAPACK includes new C interfaces. With the LAPACK 3.4.0 release, LAPACKE is directly integrated within the LAPACK library and has been enriched by the full set of LAPACK subroutines. See here.

3. References

[1]

E. Elmroth and F.G. Gustavson. “Applying recursion to serial and parallel QR factorization leads to better performance.” IBM Journal of Research and Development 44(4): 605-624 (2000)

4. External Contributors

  • CMAKE team

  • Intel MKL team

  • Craig Lucas (University of Manchester and NAG)

5. Thanks

Thanks for bug-report/patches/suggestions to: Ming Gu (UC Berkeley), Hatem Ltaif (KAUST, Saudi Arabia), omitrofa (intel team), Tiago Requeijo, Edward Smyth (NAG), and Clint Whaley (University of Texas at San Antonio, USA).

6. Developer list

Principal Investigators
  • Jim Demmel (University of California, Berkeley, USA)

  • Jack Dongarra (University of Tennessee and ORNL, USA)

  • Julien Langou (University of Colorado Denver, USA)

LAPACK developers involved in this release
  • Sven Hammarling (NAG Ltd. and University of Manchester, UK)

  • Igor Kozachenko (University of California, Berkeley, USA)

  • Julie Langou (University of Tennessee, USA)

  • Benjamin Lipshitz (University of California, Berkeley, USA)

  • Rodney James (University of Colorado Denver, USA)

7. More details

7.1. xGEQRT: QR factorization (improved interface)

contributors

Rodney James (UC Denver)

contribution

xGEQRT is analogous to xGEQRF with a modified interface which enables better performance when the blocked reflectors need to be reused. The companion subroutines xGEMQRT apply the reflectors.

public interfaces
A SRC/sgeqrt.f    A SRC/sgemqrt.f   A SRC/sgeqrt2.f
A SRC/cgeqrt.f    A SRC/cgemqrt.f   A SRC/cgeqrt2.f
A SRC/dgeqrt.f    A SRC/dgemqrt.f   A SRC/dgeqrt2.f
A SRC/zgeqrt.f    A SRC/zgemqrt.f   A SRC/zgeqrt2.f

7.2. xGEQRT3: recursive QR factorization

contributors

Rodney James (UC Denver)

reference

Elmroth, E., and Gustavson, F.G. “Applying recursion to serial and parallel QR factorization leads to better performance.” IBM Journal of Research and Development 44(4): 605-624 (2000)

contribution

The recursive QR factorization enables cache-oblivious and enables high performance on tall and skinny matrices.

public interfaces
A SRC/sgeqrt3.f
A SRC/cgeqrt3.f
A SRC/dgeqrt3.f
A SRC/zgeqrt3.f

7.3. xTPQRT: Communication-Avoiding QR sequential kernels

contributors

Rodney James (UC Denver)

contribution

These subroutines are useful for updating a QR factorization and are used in sequential and parallel Communication Avoiding QR. These subroutines support the general case Triangle on top of Pentagon which includes as special cases so-called Triangle on top of Triangle and Triangle on top of Square. This is the right-looking version of the subroutines and the subroutines are blocked. The T matrices and the block size are part of the interface. The companion subroutines xTPMQRT apply the reflectors.

public interfaces
A SRC/stpmqrt.f   A SRC/stpqrt.f    A SRC/stpqrt2.f   A SRC/stprfb.f
A SRC/ctpmqrt.f   A SRC/ctpqrt.f    A SRC/ctpqrt2.f   A SRC/ctprfb.f
A SRC/dtpmqrt.f   A SRC/dtpqrt.f    A SRC/dtpqrt2.f   A SRC/dtprfb.f
A SRC/ztpmqrt.f   A SRC/ztpqrt.f    A SRC/ztpqrt2.f   A SRC/ztprfb.f

7.4. CMAKE build system

contributors

CMAKE team

lapacker

Julie Langou (University of Tennessee)

contribution

We are striving to help our users install our libraries seamlessly on their machines. The CMAKE team contributed to our effort to port LAPACK and ScaLAPACK under the CMAKE build system. Building under Windows has never been easier. This also allows us to release dll for Windows, so users no longer need a Fortran compiler to use LAPACK under Windows.

7.5. Doxygen documentation

lapacker

Julie Langou (University of Tennessee)

contribution

LAPACK routine documentation has never been more accessible. See http://www.netlib.org/lapack/explore-html/.

7.6. New website allowing for easier navigation

lapacker

Julie Langou (University of Tennessee)

7.7. LAPACKE - Standard C language APIs for LAPACK

contributors

Intel MKL team

lapacker

Julie Langou (University of Tennessee)

contribution

Since LAPACK 3.3.0, LAPACK includes new C interfaces. With the LAPACK 3.4.0 release, LAPACKE is directly integrated within the LAPACK library and has been enriched with the new set of LAPACK subroutines. See here.

8. Bug Fix