If we solve a problem using a computer algorithm and see a large error in the result, we might suspect poor conditioning in the original mathematical problem. But algorithms can also be sources of errors. When error in the result of an algorithm exceeds what conditioning can explain, we call the algorithm unstable.
In Example 1.2.4 we showed that finding the roots of a quadratic polynomial $ax^2+bx+c$ is poorly conditioned if and only if the roots are close to each other relative to their size. Hence, for the polynomial
Demo 1.4.1 suggests that the venerable quadratic formula is an unstable means of computing roots in finite precision. The roots themselves were not sensitive to the data or arithmetic—it’s the specific computational path we chose that caused the huge growth in errors.
We can confirm this conclusion by finding a different path that avoids subtractive cancellation. A little algebra using (1.4.2) confirms the additional formula $x_1 x_2 = c/a$. So given one root $r$, we compute the other root using $c/(ar)$, which involves only multiplication and division and therefore creates no numerical trouble.
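For instance, a stable evaluation might take the quadratic-formula root whose numerator avoids cancellation and then recover the other root from the product identity. Here is a minimal sketch in Python, assuming real roots and nonzero $a$ and $c$; the function name and the use of `math.copysign` are illustrative choices, not from the text.

```python
import math

def quad_roots(a, b, c):
    """Roots of a*x^2 + b*x + c, avoiding subtractive cancellation.

    Assumes real roots and nonzero a and c.  Compute the root whose
    numerator does not suffer cancellation, then obtain the other root
    from the identity x1 * x2 = c / a.
    """
    d = math.sqrt(b**2 - 4*a*c)
    # -b and -copysign(d, b) have the same sign, so they add rather
    # than nearly cancel.
    r1 = (-b - math.copysign(d, b)) / (2*a)
    r2 = c / (a * r1)        # product of the roots equals c/a
    return r1, r2
```

Choosing the sign of the square-root term to match the sign of $b$ is what keeps the two terms in the numerator from nearly cancelling.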
The algorithms in Demo 1.4.1 and Demo 1.4.2 are equivalent when using real numbers and exact arithmetic. When results are perturbed by machine representation at each step, though, the effects may depend dramatically on the specific sequence of operations, thanks to the chain rule (1.2.9).
This situation may seem hopelessly complicated. But the elementary operations we take for granted, such as those in Table 1.2.1, are well-conditioned in most circumstances. Exceptions usually occur when $|f(x)|$ is much smaller than $|x|$, although not every such case signifies trouble. The most common culprit is simple subtractive cancellation.
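As a small illustration of that culprit, consider the Python sketch below; the particular function and the value of `x` are chosen here only for illustration. Evaluating $(1-\cos x)/x^2$ directly for small $x$ subtracts two nearly equal numbers, while an algebraically equivalent form does not.

```python
import math

x = 1.2e-8
# Direct evaluation: cos(x) is computed accurately, but 1 - cos(x)
# cancels nearly all of its significant digits.
direct = (1.0 - math.cos(x)) / x**2
# Equivalent form using the identity 1 - cos(x) = 2*sin(x/2)**2,
# which involves no subtraction of nearly equal numbers.
stable = 0.5 * (math.sin(x / 2) / (x / 2)) ** 2
print(direct, stable)   # direct is off by tens of percent; stable is close to 0.5
```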
A practical characterization of instability is that results are much less accurate than the conditioning of the problem can explain. Typically one should apply an algorithm to test problems whose answers are well-known, or for which other programs are known to work well, in order to spot possible instabilities. In the rest of this book we will see some specific ways in which instability is manifested for different types of problems.
In the presence of poor conditioning for a problem $f(x)$, even just the act of rounding the data to floating point may introduce a large change in the result. It’s not realistic, then, to expect any algorithm $\tilde{f}$ to have a small error in the sense $\tilde{f}(x)\approx f(x)$. There is another way to characterize the error, though, that can be a useful alternative measurement. Instead of asking, “Did you get nearly the right answer?”, we ask, “Did you answer nearly the right question?”
Backward error measures the change to the original data that reproduces the result that was found by the algorithm. The situation is illustrated in Figure 1.4.1.
Small backward error is the best we can hope for in a poorly conditioned problem. Without getting into the formal details, know that if an algorithm always produces small backward errors, then it is stable. But the converse is not always true: some stable algorithms may produce a large backward error.
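One concrete way to estimate backward error for a root-finding computation, for example, is to form the polynomial that the computed roots satisfy exactly and compare its coefficients to the original ones. A minimal Python sketch of that idea follows; the coefficients and the use of NumPy's `roots` and `poly` are illustrative, not a prescription from the text.

```python
import numpy as np

# An illustrative quadratic a*x^2 + b*x + c with widely separated roots.
a, b, c = 1.0, -(1e6 + 1e-6), 1.0
computed = np.roots([a, b, c])      # roots as computed in floating point

# The computed roots are the exact roots of a*(x - r1)*(x - r2).
# Its coefficients define a perturbed problem; the relative difference
# from the original coefficients is a backward error estimate.
perturbed = a * np.poly(computed)   # coefficients [a, -a*(r1+r2), a*r1*r2]
original = np.array([a, b, c])
backward = np.abs(perturbed - original) / np.abs(original)
print(backward)                     # small entries mean small backward error
```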
are mathematically equivalent, but they suggest evaluation algorithms that can behave quite differently in floating point.
(a) ✍ Using (1.2.6), find the relative condition number of $f$. (Because $f$ and $g$ are equivalent, the condition number of $g$ is the same.) Show that it approaches 1 as $x\to 0$. (Hence it should be possible to compute the function accurately near zero.)
(b) ⌨ Compute $f(10^{-6})$ using a sequence of four elementary operations. Using Table 1.2.1, make a table like the one in Demo 1.4.1 that shows the result of each elementary operation and the numerical value of the condition number of that step.
(c) ⌨ Repeat part (b) for $g(10^{-6})$, which has six elementary steps.
(d) ✍ Based on parts (b) and (c), is the numerical value of $f(10^{-6})$ more accurate, or is $g(10^{-6})$ more accurate?
Let $f(x)=\dfrac{e^x-1}{x}$.
(a) ✍ Find the condition number $\kappa_f(x)$. What is the maximum of $\kappa_f(x)$ over $-1\le x\le 1$?
(b) ⌨ Use the “obvious” algorithm
(exp(x) - 1) / x
to compute $f(x)$ at $x=10^{-2},10^{-3},10^{-4},\ldots,10^{-11}$.
(c) ⌨ Create a second algorithm from the first 8 terms of the Maclaurin series, i.e.,

$$f(x) \approx 1 + \frac{x}{2!} + \frac{x^2}{3!} + \cdots + \frac{x^7}{8!},$$

and apply it to the same values of $x$ as in part (b). (A possible starting point is sketched below.)
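The sketch below assumes $f(x)=(e^x-1)/x$ as above and uses NumPy; the variable names are illustrative, and it only sets up the computations asked for in parts (b) and (c).

```python
import numpy as np
from math import factorial

x = 10.0 ** -np.arange(2, 12)      # x = 1e-2, 1e-3, ..., 1e-11

# Part (b): the "obvious" algorithm.
obvious = (np.exp(x) - 1) / x

# Part (c): first 8 terms of the Maclaurin series,
# f(x) ≈ 1 + x/2! + x^2/3! + ... + x^7/8!.
series = sum(x**k / factorial(k + 1) for k in range(8))

print(np.column_stack([x, obvious, series]))
```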
For the steps below, define $y_i=-4i$ and $x_i=\cosh(y_i)$ for $i=1,\ldots,4$. Hence $y_i=\operatorname{acosh}(x_i)$.
(a) Find the relative condition number of evaluating $f(x)=\operatorname{acosh}(x)$. (You can use (1.4.7) or look up a formula for $f'$ in a calculus book.) Evaluate $\kappa_f$ at all the $x_i$. (You will find that the problem is well-conditioned at these inputs.)
(b) Use (1.4.7) to approximate $f(x_i)$ for all $i$. Compute the relative accuracy of the results. Why are some of the results so inaccurate?
Apply (1.4.8) to approximate $f(x_i)$ for all $i$, again computing the relative accuracy of the results.
⌨ (Continuation of Exercise 1.3.2. Adapted from Higham (2002).) One drawback of the formula (1.3.3) for sample variance is that you must compute a sum for $\bar{x}$ before beginning another sum to find $s^2$. Some statistics textbooks quote a single-loop formula

$$s^2 = \frac{1}{n-1}\left(\sum_{i=1}^n x_i^2 - \frac{1}{n}\Bigl(\sum_{i=1}^n x_i\Bigr)^2\right).$$
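A minimal sketch of that single-loop accumulation in Python is given below; the function name is illustrative, and this is the formula whose floating-point behavior the exercise asks you to examine.

```python
def single_loop_variance(data):
    """Sample variance s^2 computed in a single pass over the data."""
    n = 0
    total = 0.0       # running sum of x_i
    total_sq = 0.0    # running sum of x_i**2
    for x in data:
        n += 1
        total += x
        total_sq += x * x
    # Single-loop formula: (sum of squares - (sum)^2 / n) / (n - 1)
    return (total_sq - total**2 / n) / (n - 1)
```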