Release date: Fr 11/11/11.
This material is based upon work supported by the National Science Foundation and the Department of Energy under Grant No. NSF-OCI-1032861, NSF-CCF-00444486, NSF-CNS 0325873, NSF-EIA 0122599, NSF-ACI-0090127, DOE-DE-FC02-01ER25478, DOE-DE-FC02-06ER25768.
LAPACK is a software package provided by Univ. of Tennessee, Univ. of California, Berkeley, Univ. of Colorado Denver and NAG Ltd.
1. Support and questions:
2. LAPACK 3.4.0: What’s new
-
xGEQRT: QR factorization (improved interface). Contribution by Rodney James (UC Denver). xGEQRT is analogous to xGEQRF with a modified interface which enables better performance when the blocked reflectors need to be reused. The companion subroutines xGEMQRT apply the reflectors.
-
xGEQRT3: recursive QR factorization. Contribution by Rodney James (UC Denver). The recursive QR factorization is cache-oblivious and enables high performance on tall and skinny matrices. See reference [1] below.
-
xTPQRT: Communication-Avoiding QR sequential kernels. Contribution by Rodney James (UC Denver). These subroutines are useful for updating a QR factorization and are used in sequential and parallel Communication-Avoiding QR. They support the general case of a triangle on top of a pentagon, which includes as special cases the so-called triangle on top of triangle and triangle on top of square. These are the right-looking, blocked versions of the subroutines. The T matrices and the block size are part of the interface. The companion subroutines xTPMQRT apply the reflectors.
-
CMAKE build system. We are striving to help our users install our libraries seamlessly on their machines. The CMAKE team contributed to our effort to port LAPACK and ScaLAPACK to the CMAKE build system. Building under Windows has never been easier. This also allows us to release DLLs for Windows, so users no longer need a Fortran compiler to use LAPACK under Windows.
-
Doxygen documentation. LAPACK routine documentation has never been more accessible. See http://www.netlib.org/lapack/explore-html/.
-
New website allowing for easier navigation.
-
LAPACKE - Standard C language APIs for LAPACK. Since LAPACK 3.3.0, LAPACK includes new C interfaces. With the LAPACK 3.4.0 release, LAPACKE is directly integrated within the LAPACK library and has been enriched by the full set of LAPACK subroutines. See here.
3. References
4. External Contributors
-
CMAKE team
-
Intel MKL team
-
Craig Lucas (University of Manchester and NAG)
5. Thanks
Thanks for bug reports, patches, and suggestions to: Ming Gu (UC Berkeley), Hatem Ltaif (KAUST, Saudi Arabia), omitrofa (Intel team), Tiago Requeijo, Edward Smyth (NAG), and Clint Whaley (University of Texas at San Antonio, USA).
6. Developer list
-
Jim Demmel (University of California, Berkeley, USA)
-
Jack Dongarra (University of Tennessee and ORNL, USA)
-
Julien Langou (University of Colorado Denver, USA)
-
Sven Hammarling (NAG Ltd. and University of Manchester, UK)
-
Igor Kozachenko (University of California, Berkeley, USA)
-
Julie Langou (University of Tennessee, USA)
-
Benjamin Lipshitz (University of California, Berkeley, USA)
-
Rodney James (University of Colorado Denver, USA)
7. More details
7.1. xGEQRT: QR factorization (improved interface)
Rodney James (UC Denver)
xGEQRT is analogous to xGEQRF with a modified interface which enables better performance when the blocked reflectors need to be reused. The companion subroutines xGEMQRT apply the reflectors.
A SRC/sgeqrt.f A SRC/sgemqrt.f A SRC/sgeqrt2.f
A SRC/cgeqrt.f A SRC/cgemqrt.f A SRC/cgeqrt2.f
A SRC/dgeqrt.f A SRC/dgemqrt.f A SRC/dgeqrt2.f
A SRC/zgeqrt.f A SRC/zgemqrt.f A SRC/zgeqrt2.f
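The "blocked reflectors" that xGEQRT exposes are Householder vectors V together with an upper triangular factor T, so that Q = I - V T V^T (the compact WY representation). The following is a minimal pure-Python sketch of that idea for a real m-by-n matrix with m >= n. It is unblocked and uses dense lists, so it is not LAPACK's implementation; the function names `householder_qr_with_t` and `form_q` are hypothetical and chosen only for illustration.

```python
import math


def householder_qr_with_t(a):
    """Return (r, v, t) such that Q = I - V T V^T and Q^T A = R.

    Illustrative sketch only: a is a list of m rows of length n, m >= n.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    v = [[0.0] * n for _ in range(m)]   # unit lower trapezoidal reflectors
    t = [[0.0] * n for _ in range(n)]   # upper triangular factor
    for j in range(n):
        # Reflector H_j = I - tau * u u^T with u[j] = 1 zeroes r[j+1:, j].
        u = [0.0] * m
        u[j] = 1.0
        alpha = r[j][j]
        norm = math.sqrt(sum(r[i][j] ** 2 for i in range(j, m)))
        tau = 0.0
        if norm != 0.0:
            beta = -math.copysign(norm, alpha)
            tau = (beta - alpha) / beta
            for i in range(j + 1, m):
                u[i] = r[i][j] / (alpha - beta)
        # Apply H_j to the remaining columns of R.
        for c in range(j, n):
            dot = sum(u[i] * r[i][c] for i in range(j, m))
            for i in range(j, m):
                r[i][c] -= tau * dot * u[i]
        # Extend T by one column: T[:j, j] = -tau * T[:j, :j] (V[:, :j]^T u).
        w = [sum(v[i][k] * u[i] for i in range(m)) for k in range(j)]
        for p in range(j):
            t[p][j] = -tau * sum(t[p][k] * w[k] for k in range(p, j))
        t[j][j] = tau
        for i in range(m):
            v[i][j] = u[i]
    return r, v, t


def form_q(v, t):
    """Form Q = I - V T V^T explicitly (for checking only)."""
    m, n = len(v), len(t)
    vt = [[sum(v[i][k] * t[k][j] for k in range(n)) for j in range(n)]
          for i in range(m)]
    return [[(1.0 if i == j else 0.0) - sum(vt[i][k] * v[j][k] for k in range(n))
             for j in range(m)] for i in range(m)]
```

Because T is returned alongside V, a later application of Q or Q^T needs only matrix products with V and T and no recomputation, which is the kind of reuse the modified interface is aimed at.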
7.2. xGEQRT3: recursive QR factorization
Rodney James (UC Denver)
Elmroth, E., and Gustavson, F.G. “Applying recursion to serial and parallel QR factorization leads to better performance.” IBM Journal of Research and Development 44(4): 605-624 (2000)
The recursive QR factorization is cache-oblivious and enables high performance on tall and skinny matrices.
A SRC/sgeqrt3.f
A SRC/cgeqrt3.f
A SRC/dgeqrt3.f
A SRC/zgeqrt3.f
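The recursion described by Elmroth and Gustavson can be sketched as follows: split the columns in half, factor the left half, apply its Q^T to the right half, factor the lower part of the right half, and merge the two T factors. This is a hedged pure-Python illustration for a real m-by-n matrix with m >= n, not the xGEQRT3 source; the helper names `matmul`, `transpose` and `recursive_qr` are hypothetical.

```python
import math


def matmul(x, y):
    """Dense matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def recursive_qr(a):
    """Return (v, t, r) with A = (I - V T V^T)[:, :n] R, for m-by-n A, m >= n."""
    m, n = len(a), len(a[0])
    if n == 1:
        # Base case: a single Householder reflector.
        col = [row[0] for row in a]
        norm = math.sqrt(sum(x * x for x in col))
        if norm == 0.0:
            return [[1.0]] + [[0.0] for _ in range(m - 1)], [[0.0]], [[0.0]]
        beta = -math.copysign(norm, col[0])
        tau = (beta - col[0]) / beta
        v = [[1.0]] + [[x / (col[0] - beta)] for x in col[1:]]
        return v, [[tau]], [[beta]]
    n1 = n // 2
    n2 = n - n1
    # Factor the left panel.
    v1, t1, r11 = recursive_qr([row[:n1] for row in a])
    # Apply Q1^T = I - V1 T1^T V1^T to the right panel.
    a2 = [row[n1:] for row in a]
    w = matmul(transpose(t1), matmul(transpose(v1), a2))
    vw = matmul(v1, w)
    a2 = [[x - y for x, y in zip(ra, rv)] for ra, rv in zip(a2, vw)]
    r12 = a2[:n1]
    # Factor the lower part of the updated right panel.
    v2b, t2, r22 = recursive_qr(a2[n1:])
    v2 = [[0.0] * n2 for _ in range(n1)] + v2b
    # Merge: T12 = -T1 (V1^T V2) T2.
    t12 = matmul(matmul(t1, matmul(transpose(v1), v2)), t2)
    t12 = [[-x for x in row] for row in t12]
    v = [p + q for p, q in zip(v1, v2)]
    t = [p + q for p, q in zip(t1, t12)] + [[0.0] * n1 + row for row in t2]
    r = [p + q for p, q in zip(r11, r12)] + [[0.0] * n1 + row for row in r22]
    return v, t, r
```

The sketch has no block-size parameter: the recursion itself produces subproblems of every size, which is why such a scheme can be called cache-oblivious.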
7.3. xTPQRT: Communication-Avoiding QR sequential kernels
Rodney James (UC Denver)
These subroutines are useful for updating a QR factorization and are used in sequential and parallel Communication-Avoiding QR. They support the general case of a triangle on top of a pentagon, which includes as special cases the so-called triangle on top of triangle and triangle on top of square. These are the right-looking, blocked versions of the subroutines. The T matrices and the block size are part of the interface. The companion subroutines xTPMQRT apply the reflectors.
A SRC/stpmqrt.f A SRC/stpqrt.f A SRC/stpqrt2.f A SRC/stprfb.f
A SRC/ctpmqrt.f A SRC/ctpqrt.f A SRC/ctpqrt2.f A SRC/ctprfb.f
A SRC/dtpmqrt.f A SRC/dtpqrt.f A SRC/dtpqrt2.f A SRC/dtprfb.f
A SRC/ztpmqrt.f A SRC/ztpqrt.f A SRC/ztpqrt2.f A SRC/ztprfb.f
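To illustrate what "updating a QR factorization" means here, the sketch below factors an existing upper triangular R stacked on top of a block of new rows B, touching only row j of R and the rows of B at step j. It is a simplified pure-Python illustration of the case where B is a full rectangular block: it is unblocked, works on real data, and does not form the T matrices that the actual interface provides. The name `tpqrt_update` is hypothetical.

```python
import math


def tpqrt_update(r, b):
    """Factor the stacked matrix [R; B], with R n-by-n upper triangular, B m-by-n.

    Returns (r_new, v, taus): the updated triangle, the reflector tails
    (one column per reflector, stored where B was), and the scalars tau.
    The inputs are left unmodified. Illustrative sketch only.
    """
    n, m = len(r), len(b)
    r = [row[:] for row in r]
    b = [row[:] for row in b]
    taus = []
    for j in range(n):
        # The reflector has the form [e_j; b[:, j] / scale]: its top part is a
        # unit vector, because R is already zero below the diagonal.
        alpha = r[j][j]
        norm = math.sqrt(alpha * alpha + sum(b[i][j] ** 2 for i in range(m)))
        if norm == 0.0:
            taus.append(0.0)
            continue
        beta = -math.copysign(norm, alpha)
        tau = (beta - alpha) / beta
        scale = alpha - beta
        for i in range(m):
            b[i][j] /= scale          # column j of B now holds the reflector tail
        r[j][j] = beta
        # Apply the reflector to the trailing columns (row j of R and all of B).
        for c in range(j + 1, n):
            dot = r[j][c] + sum(b[i][j] * b[i][c] for i in range(m))
            r[j][c] -= tau * dot
            for i in range(m):
                b[i][c] -= tau * dot * b[i][j]
        taus.append(tau)
    return r, v_from(b), taus


def v_from(b):
    """Return a copy of the reflector tails (kept separate for clarity)."""
    return [row[:] for row in b]
```

Since the transformation is orthogonal, the updated triangle satisfies R_new^T R_new = R^T R + B^T B, which gives a simple way to check the result without forming Q.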
7.4. CMAKE build system
CMAKE team
Julie Langou (University of Tennessee)
We are striving to help our users install our libraries seamlessly on their machines. The CMAKE team contributed to our effort to port LAPACK and ScaLAPACK to the CMAKE build system. Building under Windows has never been easier. This also allows us to release DLLs for Windows, so users no longer need a Fortran compiler to use LAPACK under Windows.
7.5. Doxygen documentation
Julie Langou (University of Tennessee)
LAPACK routine documentation has never been more accessible. See http://www.netlib.org/lapack/explore-html/.
7.6. New website allowing for easier navigation
Julie Langou (University of Tennessee)
7.7. LAPACKE - Standard C language APIs for LAPACK
Intel MKL team
Julie Langou (University of Tennessee)
Since LAPACK 3.3.0, LAPACK includes new C interfaces. With the LAPACK 3.4.0 release, LAPACKE is directly integrated within the LAPACK library and has been enriched with the new set of LAPACK subroutines. See here.