In this section, we briefly discuss methods for the computation
of singular values and vectors of dense matrices.
These methods are sometimes called transformation methods.
Some or all of them are implemented in LAPACK,
ScaLAPACK, and in the svd command in
MATLAB.
While it is possible to apply the transformation methods
of §4.2 to one of the Hermitian
matrices $A^*A$, $AA^*$, or
$\begin{bmatrix} 0 & A \\ A^* & 0 \end{bmatrix}$, the methods used in
practice are specialized for the SVD and so are more
efficient and accurate.
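This equivalence is easy to verify numerically. The NumPy sketch below (an illustration, not part of LAPACK) checks that the eigenvalues of $A^*A$ are the squares of the singular values of $A$, and that the eigenvalues of the Hermitian matrix $\begin{bmatrix} 0 & A \\ A^* & 0 \end{bmatrix}$ come in $\pm\sigma_i$ pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

sigma = np.linalg.svd(A, compute_uv=False)    # singular values, descending
evals = np.linalg.eigvalsh(A.T @ A)[::-1]     # eigenvalues of A^*A, descending

# the eigenvalues of A^*A are the squares of the singular values of A
assert np.allclose(evals, sigma**2)

# the Hermitian matrix [[0, A], [A^*, 0]] has eigenvalues +/- sigma_i,
# plus |m - n| extra zeros
H = np.block([[np.zeros((5, 5)), A], [A.T, np.zeros((3, 3))]])
hevals = np.linalg.eigvalsh(H)                # ascending
assert np.allclose(hevals[-3:][::-1], sigma)
```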
The most common transformation methods compute the thin SVD of an $m \times n$ matrix $A$ (with $m \ge n$) in the three phases below. (They can be easily modified to compute the full SVD, or just selected singular values and/or singular vectors, but we present just the thin SVD for simplicity.)

1. Reduce $A$ to bidiagonal form: compute unitary matrices $U_1$ and $V_1$ such that $A = U_1 B V_1^*$, where $B$ is an $n \times n$ upper bidiagonal matrix.
2. Compute the SVD of the bidiagonal matrix: $B = U_2 \Sigma V_2^*$.
3. Multiply: $U = U_1 U_2$ and $V = V_1 V_2$, so that $A = U \Sigma V^*$ is the thin SVD.
Phase 1, reduction to bidiagonal form, is performed by
LAPACK routine xGEBRD and
ScaLAPACK routine PxGEBRD
using a sequence of $n$ unitary Householder
reflections applied on the left and
$n-2$ unitary Householder
reflections applied on the right, for a cost of
$O(mn^2)$ floating point operations for an
$m \times n$ matrix with $m \ge n$.
If only singular values are desired,
phase 2 costs just
$O(n^2)$ operations,
which is much less than the cost of phase 1, and phase 3 is
omitted.
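From Python, this values-only path is reached through scipy.linalg.svdvals (or numpy.linalg.svd with compute_uv=False), which skips the accumulation of singular vectors:

```python
import numpy as np
from scipy.linalg import svdvals

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 5))

# singular values only: no singular vectors are formed (phase 3 is skipped)
s = svdvals(A)

assert np.all(np.diff(s) <= 0)   # returned in descending order
assert np.allclose(s, np.linalg.svd(A, compute_uv=False))
```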
If all left and right singular vectors are desired,
the cost depends strongly on which algorithm is selected,
as described below.
The algorithms for phase 2 are analogous to those available for the symmetric tridiagonal eigenproblem as discussed in §4.2, with some differences outlined below.
Currently, the fastest available sequential or shared-memory parallel dense SVD routine is LAPACK's xGESDD, but a faster one is on the way. The fastest available distributed-memory parallel dense SVD routine is ScaLAPACK's PxGESVD, but again a faster one is on the way.
Divide-and-conquer is implemented by LAPACK computational routine xBDSDC, which is used by LAPACK driver routine xGESDD to compute the SVD of a dense matrix. xGESDD is currently the LAPACK algorithm of choice for the SVD of dense matrices.
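SciPy exposes both drivers through the lapack_driver argument of scipy.linalg.svd: 'gesdd' selects divide-and-conquer xGESDD (SciPy's default), while 'gesvd' selects the QR-iteration-based xGESVD. A quick sketch comparing the two:

```python
import numpy as np
from scipy.linalg import svd

rng = np.random.default_rng(3)
A = rng.standard_normal((200, 120))

# 'gesdd' -> divide-and-conquer xGESDD; 'gesvd' -> QR-iteration xGESVD
U, s, Vh = svd(A, full_matrices=False, lapack_driver='gesdd')
U2, s2, Vh2 = svd(A, full_matrices=False, lapack_driver='gesvd')

# both drivers compute the same thin SVD (singular vectors may differ
# in sign, so we compare singular values and the reconstruction)
assert np.allclose(s, s2)
assert np.allclose(U @ np.diag(s) @ Vh, A)
```

On larger matrices with singular vectors requested, 'gesdd' is typically several times faster, which is why it is the default.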
Recent advances in [168,128,358,356]
promise an algorithm that costs $O(n)$ work per singular
triplet while guaranteeing orthogonality, but it is not
yet complete. When done, both LAPACK and ScaLAPACK will
provide this algorithm, which should be the fastest in
all cases.
If $A$ is banded, phase 1 can be performed
more cheaply by LAPACK computational
routine xGBBRD, but no LAPACK driver routines are available.
Finally, we mention Jacobi's method [114,198]
for the SVD.
This transformation method repeatedly multiplies
$A$ on the right by elementary orthogonal matrices (Jacobi rotations)
until $A$ converges to $U\Sigma$; the product of the Jacobi rotations
is $V$. Jacobi is slower than any of the above transformation methods
(it can be made to run within about twice the time of QR [136])
but has the useful property that for certain $A$
it can deliver the tiny singular values, and their singular vectors,
much more accurately than any of the above methods,
provided that it is properly implemented [118,115].
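A minimal one-sided (Hestenes) Jacobi SVD can be sketched in a few dozen lines of NumPy. This is an illustrative toy, not the carefully implemented version of [118,115]: rotations applied on the right gradually orthogonalize the columns of $A$, so at convergence the column norms are the singular values and the accumulated rotations form $V$.

```python
import numpy as np

def jacobi_svd(A, tol=1e-12, max_sweeps=30):
    """Toy one-sided Jacobi SVD: rotate pairs of columns of A until they
    are mutually orthogonal, so that A V = U Sigma."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    V = np.eye(n)
    for _ in range(max_sweeps):
        off = 0.0
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = A[:, p] @ A[:, p]
                beta = A[:, q] @ A[:, q]
                gamma = A[:, p] @ A[:, q]
                off = max(off, abs(gamma) / np.sqrt(alpha * beta))
                if gamma == 0.0:
                    continue
                # rotation angle chosen to zero the (p,q) inner product
                zeta = (beta - alpha) / (2.0 * gamma)
                t = (1.0 if zeta >= 0 else -1.0) / (abs(zeta) + np.sqrt(1.0 + zeta**2))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                # apply the Jacobi rotation to columns p, q of A and of V
                for M in (A, V):
                    Mp = M[:, p].copy()
                    M[:, p] = c * Mp - s * M[:, q]
                    M[:, q] = s * Mp + c * M[:, q]
        if off < tol:
            break
    sigma = np.linalg.norm(A, axis=0)   # column norms are singular values
    U = A / sigma                        # normalize columns to get U
    return U, sigma, V

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 4))
U, s, V = jacobi_svd(A)
assert np.allclose(U * s @ V.T, A)      # A = U Sigma V^*
assert np.allclose(U.T @ U, np.eye(4))  # U has orthonormal columns
assert np.allclose(V.T @ V, np.eye(4))  # V is orthogonal
```

Note that the singular values emerge unsorted; a production implementation would sort them and handle rank-deficient columns, among many other refinements.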