next up previous contents index
Next: Further Details: How Error Up: Accuracy and Stability Previous: New Sources of Error

How to Measure Errors


  ScaLAPACK routines return four types of floating-point output arguments:

This section provides measures for errors in these quantities, which we need in order to express error bounds.

  First, consider scalars. Let the scalar tex2html_wrap_inline17436 be an approximation of the true answer tex2html_wrap_inline16235. We can measure the difference between tex2html_wrap_inline16235 and tex2html_wrap_inline17436 either by the absolute error tex2html_wrap_inline17444, or, if tex2html_wrap_inline16235 is nonzero, by the relative error tex2html_wrap_inline17448. Alternatively, it is sometimes more convenient to use tex2html_wrap_inline17450 instead of the standard expression for relative error. If the relative error of tex2html_wrap_inline17436 is, say, tex2html_wrap_inline17454, we say that tex2html_wrap_inline17436 is accurate to 5 decimal digits.      

  To measure the error in vectors, we need to measure the size or norm of a vector x . A popular norm is the magnitude of the largest component, tex2html_wrap_inline17460, which we denote by tex2html_wrap_inline17462. This is read the infinity norm of x. See Table 6.2 for a summary of norms.

Table 6.2: Vector and matrix norms

If tex2html_wrap_inline17482 is an approximation to the exact vector x, we will refer to tex2html_wrap_inline17486 as the absolute error in tex2html_wrap_inline17482 (where p is one of the values in Table 6.2)       and refer to tex2html_wrap_inline17492 as the relative error in tex2html_wrap_inline17482 (assuming tex2html_wrap_inline17496). As with scalars, we will sometimes use tex2html_wrap_inline17498 for the relative error. As above, if the relative error of tex2html_wrap_inline17482 is, say tex2html_wrap_inline17454, we say that tex2html_wrap_inline17482 is accurate to 5 decimal digits. The following example illustrates these ideas.


Thus, we would say that tex2html_wrap_inline17482 approximates x to 2 decimal digits.

  Errors in matrices may also be measured with norms . The most obvious generalization of tex2html_wrap_inline17462 to matrices would appear to be tex2html_wrap_inline17512, but this does not have certain important mathematical properties that make deriving error bounds convenient. Instead, we will use tex2html_wrap_inline17514, where A is an m-by-n matrix, or tex2html_wrap_inline17522; see Table 6.2 for other matrix norms. As before, tex2html_wrap_inline17524 is the absolute error   in tex2html_wrap_inline17526, tex2html_wrap_inline17528 is the relative error   in tex2html_wrap_inline17526, and a relative error in tex2html_wrap_inline17526 of tex2html_wrap_inline17454 means tex2html_wrap_inline17526 is accurate to 5 decimal digits. The following example illustrates these ideas.

so tex2html_wrap_inline17526 is accurate to 1 decimal digit.

We now introduce some related notation we will use in our error bounds. The condition number of a matrix A is defined as   tex2html_wrap_inline17542, where A is square and invertible, and p is tex2html_wrap_inline17548 or one of the other possibilities in Table 6.2. The condition number measures how sensitive tex2html_wrap_inline13063 is to changes in A; the larger the condition number, the more sensitive is tex2html_wrap_inline13063. For example, for the same A as in the last example,
ScaLAPACK error estimation routines typically compute a variable called RCOND , which is the reciprocal of the condition number (or an approximation of the reciprocal). The reciprocal of the condition number is used instead of the condition number itself in order to avoid the possibility of overflow when the condition number is very large.     Also, some of our error bounds will use the vector of absolute values of x, |x| (tex2html_wrap_inline17562), or similarly |A| (tex2html_wrap_inline17566).

  Now we consider errors in subspaces. Subspaces are the outputs of routines that compute eigenvectors and invariant subspaces of matrices. We need a careful definition of error in these cases for the following reason. The nonzero vector x is called a (right) eigenvector of the matrix A with eigenvalue tex2html_wrap_inline12778 if tex2html_wrap_inline17574. From this definition, we see that -x, 2x, or any other nonzero multiple tex2html_wrap_inline17580 of x is also an eigenvector. In other words, eigenvectors are not unique. This means we cannot measure the difference between two supposed eigenvectors tex2html_wrap_inline17482 and x by computing tex2html_wrap_inline17588, because this may be large while tex2html_wrap_inline17590 is small or even zero for some tex2html_wrap_inline17592. This is true even if we normalize x so that tex2html_wrap_inline17596, since both x and -x can be normalized simultaneously. Hence, to define error in a useful way, we need instead to consider the set tex2html_wrap_inline17602 of all scalar multiples tex2html_wrap_inline17604 of x. The set tex2html_wrap_inline17602 is called the subspace spanned by x and is uniquely determined by any nonzero member of tex2html_wrap_inline17602. We will measure the difference between two such sets by the acute angle between them. Suppose tex2html_wrap_inline17614 is spanned by tex2html_wrap_inline17616 and tex2html_wrap_inline17602 is spanned by tex2html_wrap_inline17620. Then the acute angle between tex2html_wrap_inline17614 and tex2html_wrap_inline17602 is defined as    
One can show that tex2html_wrap_inline17626 does not change when either tex2html_wrap_inline17482 or x is multiplied by any nonzero scalar. For example, if
as above, then tex2html_wrap_inline17632 for any nonzero scalars tex2html_wrap_inline14473 and tex2html_wrap_inline17636.

Let us consider another way to interpret the angle tex2html_wrap_inline17638 between tex2html_wrap_inline17614 and tex2html_wrap_inline17602.     Suppose tex2html_wrap_inline17482 is a unit vector (tex2html_wrap_inline17646). Then there is a scalar tex2html_wrap_inline14473 such that
The approximation tex2html_wrap_inline17650 holds when tex2html_wrap_inline17638 is much less than 1 (less than .1 will do nicely). If tex2html_wrap_inline17482 is an approximate eigenvector with error bound tex2html_wrap_inline17656, where x is a true eigenvector, there is another true eigenvector tex2html_wrap_inline17580 satisfying tex2html_wrap_inline17662. For example, if
then tex2html_wrap_inline17664 for tex2html_wrap_inline17666.

Finally, many of our error bounds will contain a factor p(n) (or p(m,n)), which grows as a function of matrix dimension n (or dimensions m and n). It represents a potentially different function for each problem. In practice, the true errors usually grow at most linearly; using p(n)=1 in the error bound formulas will often give a reasonable estimate; p(n)=n is more conservative. Therefore, we will refer to p(n) as a ``modestly growing'' function of n; however. it can occasionally be much larger. For simplicity, the error bounds computed by the code fragments in the following sections will use p(n)=1. This means these computed error bounds may occasionally slightly underestimate the true error. For this reason we refer to these computed error bounds as ``approximate error bounds.''

Further Details: How to Measure Errors 

   The relative error tex2html_wrap_inline17448 in the approximation tex2html_wrap_inline17436 of the true solution tex2html_wrap_inline16235 has a drawback: it often cannot be computed directly, because it depends on the unknown quantity tex2html_wrap_inline17694. However, we can often instead estimate tex2html_wrap_inline17450, since tex2html_wrap_inline17436 is known (it is the output of our algorithm). Fortunately, these two quantities are necessarily close together, provided either one is small, which is the only time they provide a useful bound anyway. For example, tex2html_wrap_inline17700 implies
so they can be used interchangeably.

Table 6.2 contains a variety of norms we will use to measure errors. These norms have the properties that tex2html_wrap_inline17702, and tex2html_wrap_inline17704, where p is one of 1, 2, tex2html_wrap_inline17548, and F. These properties are useful for deriving error bounds.

An error bound that uses a given norm may be changed into an error bound that uses another norm. This is accomplished by multiplying the first error bound by an appropriate function of the problem dimension. Table 6.3 gives the factors tex2html_wrap_inline17716 such that tex2html_wrap_inline17718, where n is the dimension of x.

Table 6.3: Bounding one vector norm in terms of another

Table 6.4 gives the factors tex2html_wrap_inline17746 such that tex2html_wrap_inline17748, where A is m-by-n.

Table 6.4: Bounding one matrix norm in terms of another

The two-norm of A, tex2html_wrap_inline17802, is also called the spectral norm of A and is equal to the largest singular value tex2html_wrap_inline17806 of A. We shall also need to refer to the smallest singular value tex2html_wrap_inline17810 of A; its value can be defined in a similar way to the definition of the two-norm in Table 6.2, namely, as tex2html_wrap_inline17814 when A has at least as many rows as columns, and defined as tex2html_wrap_inline17818 when A has more columns than rows. The two-norm, Frobenius norm  , and singular values of a matrix do not change if the matrix is multiplied by a real orthogonal (or complex unitary) matrix.

Now we define subspaces spanned by more than one vector, and angles between subspaces.     Given a set of k n-dimensional vectors tex2html_wrap_inline17826, they determine a subspace tex2html_wrap_inline17602 consisting of all their possible linear combinations tex2html_wrap_inline17830, tex2html_wrap_inline17832 scalars tex2html_wrap_inline17834. We also say that tex2html_wrap_inline17826 spans tex2html_wrap_inline17602. The difficulty in measuring the difference between subspaces is that the sets of vectors spanning them are not unique. For example, tex2html_wrap_inline17620, tex2html_wrap_inline17842 and tex2html_wrap_inline17844 all determine the same subspace. Therefore, we cannot simply compare the subspaces spanned by tex2html_wrap_inline17846 and tex2html_wrap_inline17826 by comparing each tex2html_wrap_inline17850 to tex2html_wrap_inline17852. Instead, we will measure the angle between the subspaces, which is independent of the spanning set of vectors. Suppose subspace tex2html_wrap_inline17614 is spanned by tex2html_wrap_inline17846 and that subspace tex2html_wrap_inline17602 is spanned by tex2html_wrap_inline17826. If k=1, we instead write more simply tex2html_wrap_inline17616 and tex2html_wrap_inline17620. When k=1, we define the angle tex2html_wrap_inline17870 between tex2html_wrap_inline17614 and tex2html_wrap_inline17602 as the acute angle between tex2html_wrap_inline17482 and x. When k>1, we define the acute angle between tex2html_wrap_inline17614 and tex2html_wrap_inline17602 as the largest acute angle between any vector tex2html_wrap_inline17482 in tex2html_wrap_inline17614 and the closest vector x in tex2html_wrap_inline17602 to tex2html_wrap_inline17482:

ScaLAPACK routines that compute subspaces return vectors tex2html_wrap_inline17846 spanning a subspace tex2html_wrap_inline17614 that are orthonormal. This means the n-by-k matrix tex2html_wrap_inline17904 satisfies tex2html_wrap_inline17906. Suppose also that the vectors tex2html_wrap_inline17826 spanning tex2html_wrap_inline17602 are orthonormal, so tex2html_wrap_inline17912 also satisfies tex2html_wrap_inline17914. Then there is a simple expression for the angle between tex2html_wrap_inline17614 and tex2html_wrap_inline17602:    
For example, if
then tex2html_wrap_inline17920.

As stated above, all our bounds will contain a factor p(n) (or p(m,n)), which measures how roundoff errors can grow as a function of matrix dimension n (or m and m). In practice, the true error usually grows just linearly with n, or even slower, but we can generally prove only much weaker bounds of the form tex2html_wrap_inline17934. This is because we cannot rule out the extremely unlikely possibility of rounding errors all adding together instead of canceling on average. Using tex2html_wrap_inline17934 would give pessimistic and unrealistic bounds, especially for large n, so we content ourselves with describing p(n) as a ``modestly growing'' polynomial function of n. Using p(n)=1 in the error bound formulas will often give a reasonable error estimate. For detailed derivations of various p(n), see [71, 114, 84, 38].

There is also one situation where p(n) can grow as large as tex2html_wrap_inline17950: Gaussian elimination. This typically occurs only on specially constructed matrices presented in numerical analysis courses [114, p. 212,]. However, the expert driver for solving linear systems, PxGESVX    , provides error bounds incorporating p(n), and so this rare possibility can be detected.

next up previous contents index
Next: Further Details: How Error Up: Accuracy and Stability Previous: New Sources of Error

Susan Blackford
Tue May 13 09:21:01 EDT 1997