We are ready to consider the conditioning of solving the square linear system Ax=b. In this problem, the data are A and b, and the solution is x. Both data and result are multidimensional, so we will use norms to measure their magnitudes.
The motivation for the definition of relative condition number in Chapter 1 was to quantify the response of the result to perturbations of the data. For simplicity, we start by allowing perturbations to b only while A remains fixed.
It is possible to show that this bound is tight, in the sense that the inequalities are in fact equalities for some choices of b and d. This result motivates a new definition.
The matrix condition number (2.8.5) is equal to the condition number of solving a linear system of equations. Although we derived this fact above only for perturbations of b, a similar statement holds when A is perturbed.
Using a traditional Δ notation for the perturbation in a quantity, we can write the following.
A condition number of 1 is the best we can hope for—in that case, the relative perturbation of the solution has the same size as that of the data. A condition number of size 10t indicates that in floating-point arithmetic, roughly t digits are lost (i.e., become incorrect) in computing the solution x. And if κ(A)>ϵmach−1, then for computational purposes the matrix is effectively singular.
Suppose that Ax=b and x~ is a computed estimate of the solution x. The most natural quantity to study is the error, x−x~. Normally we can’t compute it because we don’t know the exact solution. However, we can compute something related.
Obviously, a zero residual means that x~=x, and we have the exact solution. What happens more generally? Note that Ax~=b−r. That is, x~ solves the linear system problem for a right-hand side that is changed by −r. This is precisely what is meant by backward error.
Hence residual and backward error are the same thing for a linear system. What is the connection to the (forward) error? We can reconnect with (2.8.6) by the definition h=x~−x, in which case
⌨ Refer to Demo 2.8.1 for the definition of a Hilbert matrix. Make a table of the values of κ(Hn) in the 2-norm for n=2,3,…,16. Speculate as to why the growth of κ appears to slow down at n=13.
⌨ The purpose of this problem is to verify, like in Demo 2.8.1, the error bound
Here x~ is a numerical approximation to the exact solution x, and h is an unknown perturbation caused by machine roundoff. We will assume that ∥d∥/∥b∥ is roughly eps().
For each n=10,20,…,70 let A = matrixdepot("prolate",n,0.4) and let x have components xk=k/n for k=1,…,n. Define b=A*x and let x~ be the solution produced numerically by backslash.
Make a table including columns for n, the condition number of A, the observed relative error in x~, and the right-hand side of the inequality above. You should find that the inequality holds in every case.
⌨ Exercise 2.3.7 suggests that the solutions of linear systems
become less accurate as β increases. Using α=0.1 and β=10,100,…,1012, make a table with columns for β, ∣x1−1∣, and the condition number of the matrix.
⌨ Let An denote the (n+1)×(n+1) version of the Vandermonde matrix in Equation (2.1.3) based on the equally spaced interpolation nodes ti=i/n for i=0,…,n. Using the 1-norm, graph κ(An) as a function of n for n=4,5,6,…,20, using a log scale on the y-axis. (The graph is nearly a straight line.)
⌨ The matrix A in (2.6.2) has unpivoted LU factors given in (2.6.3) as a function of parameter ε. For ϵ=10−2,10−4,…,10−10, make a table with columns for ε, κ(A), κ(L), and κ(U). (This shows that solution via unpivoted LU factorization is arbitrarily unstable.)
✍ Define An as the n×n matrix ⎣⎡1−21−2⋱⋱1−21⎦⎤.
(a) Write out A2−1 and A3−1.
(b) Write out An−1 in the general case n>1. (If necessary, look at a few more cases in Julia until you are certain of the pattern.) Make a clear argument why it is correct.
(c) Using the ∞-norm, find κ(An).
✍ (a) Prove that for n×n nonsingular matrices A and B, κ(AB)≤κ(A)κ(B).
(b) Show by means of an example that the result of part (a) cannot be an equality in general.
✍ Let D be a diagonal n×n matrix, not necessarily invertible. Prove that in the 1-norm,