Conditioning of linear systems

We are ready to consider the conditioning of solving the square linear system \(\mathbf{A}\mathbf{x}=\mathbf{b}\). Recall that the condition number is the relative change in the solution divided by a relative change in the data. In this case the data are \(\mathbf{A}\) and \(\mathbf{b}\), and the solution is \(\mathbf{x}\).

For simplicity we start by allowing perturbations to \(\mathbf{b}\) only while \(\mathbf{A}\) remains unchanged. Let \(\mathbf{A}\mathbf{x}=\mathbf{b}\) be perturbed to

\[ \mathbf{A}(\mathbf{x}+\mathbf{h}) = \mathbf{b}+\mathbf{d}.\]

We seek to bound \(\| \mathbf{h} \|\) in terms of \(\| \mathbf{d} \|\):

\[\begin{split}\begin{split} \mathbf{A}\mathbf{x} + \mathbf{A} \mathbf{h} &= \mathbf{b} + \mathbf{d} \\ \mathbf{A} \mathbf{h} &= \mathbf{d}\\ \mathbf{h} &= \mathbf{A}^{-1} \mathbf{d}\\ \| \mathbf{h} \| &\le \| \mathbf{A}^{-1}\| \,\| \mathbf{d} \|, \end{split}\end{split}\]

where we have used \(\mathbf{A}\mathbf{x}=\mathbf{b}\) and (43). Since furthermore \(\mathbf{b}=\mathbf{A}\mathbf{x}\) and therefore \(\| \mathbf{b} \|\le \| \mathbf{A} \|\, \| \mathbf{x} \|\), we derive

\[ \frac{\quad\dfrac{\| \mathbf{h} \|}{\| \mathbf{x} \|}\quad}{\dfrac{\| \mathbf{d} \|}{\| \mathbf{b} \|}} = \frac{\| \mathbf{h} \|\; \| \mathbf{b} \|}{\| \mathbf{d} \|\; \| \mathbf{x} \|} \le \frac{\bigl(\| \mathbf{A}^{-1} \|\, \| \mathbf{d} \|\bigr) \bigl(\| \mathbf{A} \|\,\| \mathbf{x} \|\bigr)}{\| \mathbf{d} \|\,\| \mathbf{x} \|} = \| \mathbf{A}^{-1}\| \, \| \mathbf{A} \|.\]

It is possible to show that this bound is tight, in the sense that the inequalities are in fact equalities for some choices of \(\mathbf{b}\) and \(\mathbf{d}\). Motivated by the definition of the condition number as the ratio of relative changes in solution and data, we define the matrix condition number

(48)\[\kappa(\mathbf{A}) = \| \mathbf{A}^{-1}\| \, \| \mathbf{A} \|.\]

Note that \(\kappa(\mathbf{A})\) depends on the choice of norm; a subscript on \(\kappa\) such as \(1\), \(2\), or \(\infty\) is used if clarification is needed.

Main result

The matrix condition number (48) is equal to the condition number of solving a linear system of equations. Although we derived this fact only for perturbations of \(\mathbf{b}\), a similar statement holds when \(\mathbf{A}\) is perturbed.

Using a traditional \(\Delta\) notation for the perturbation in a quantity, we can write the following.

Observation

If \(\mathbf{A}(\mathbf{x} + \Delta \mathbf{x}) = \mathbf{b} + \Delta \mathbf{b}\), then

(49)\[\frac{\| \Delta \mathbf{x} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A}) \frac{\| \Delta \mathbf{b} \|}{\| \mathbf{b} \|}.\]

If \((\mathbf{A}+\Delta \mathbf{A}) (\mathbf{x} + \Delta \mathbf{x}) = \mathbf{b}\), then

(50)\[\frac{\| \Delta \mathbf{x} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A}) \frac{\| \Delta \mathbf{A} \|}{\| \mathbf{A} \|},\]

in the limit \(\| \Delta \mathbf{A} \| \to 0\).

Note that for any induced matrix norm,

\[ 1 = \| \mathbf{I} \| = \| \mathbf{A} \mathbf{A}^{-1} \| \le \| \mathbf{A} \|\, \| \mathbf{A}^{-1} \| = \kappa(\mathbf{A}).\]

A condition number of 1 is the best we can hope for—in that case, the relative perturbation of the solution has the same size as that of the data. A condition number of size \(10^t\) indicates that in floating point arithmetic, roughly \(t\) digits are lost (i.e., become incorrect) in computing the solution \(\mathbf{x}\).

If \(\kappa(\mathbf{A}) > \epsilon_\text{mach}^{-1}\), then for computational purposes the matrix is singular. If \(\mathbf{A}\) is exactly singular, it is customary to say that \(\kappa(\mathbf{A})=\infty\).

Residual and backward error

Suppose that \(\mathbf{A}\mathbf{x}=\mathbf{b}\) and \(\tilde{\mathbf{x}}\) is a computed estimate of the solution \(\mathbf{x}\). The most natural quantity to study is the error, \(\mathbf{x}-\tilde{\mathbf{x}}\). Normally we can’t compute it because we don’t know the exact solution. However, we can certainly compute the residual, defined as

(51)\[ \mathbf{r} = \mathbf{b} - \mathbf{A}\tilde{\mathbf{x}}.\]

Obviously a zero residual means that \(\tilde{\mathbf{x}}=\mathbf{x}\) and we get the exact solution. What does a “small” residual mean? Note that \(\mathbf{A}\tilde{\mathbf{x}}=\mathbf{b}-\mathbf{r}\), so \(\tilde{\mathbf{x}}\) solves the linear system problem for a right-hand side that is changed by \(-\mathbf{r}\). This is precisely what is meant by backward error: the perturbation from the original problem to the one that is solved exactly.

But does a small residual mean that the error is also small? We can reconnect with (49) by the definition \(\mathbf{h} = \tilde{\mathbf{x}}-\mathbf{x}\), in which case \(\mathbf{d} = \mathbf{A}(\mathbf{x}+\mathbf{h})-\mathbf{b}=\mathbf{A}\mathbf{h} = -\mathbf{r}\). Hence (49) is equivalent to

(52)\[ \frac{\| \mathbf{x}-\tilde{\mathbf{x} \|}}{\| \mathbf{x} \|} \le \kappa(\mathbf{A}) \frac{\| \mathbf{r} \|}{\| \mathbf{b} \|}.\]

Equation (52) says that the relative error can be much larger than the relative residual when the matrix condition number is large. To put it another way: When solving a linear system, all that can be expected is that the backward error, not the error, be small.

Exercises

  1. ⌨ A Hilbert matrix is a square matrix whose \((i,j)\) entry is \(1/(i+j-1)\). The \(n\times n\) version \(\mathbf{H}_n\) can be generated in Julia using

    H = [ 1/(i+j-1) for i in 1:n, j in 1:n ]
    

    The condition number of a Hilbert matrix grows very rapidly as a function of \(n\), showing that even simple, small linear systems can be badly conditioned.

    Make a table of the values of \(\kappa(\mathbf{H}_n)\) in the 2-norm for \(n=2,3,\ldots,16\). Why does the growth of \(\kappa\) appear to slow down at \(n=13\)?

  2. ⌨ The purpose of this problem is to verify, like in Condition number bounds on error, the error bound

    \[\frac{\| \mathbf{x}-\tilde{\mathbf{x} \|}}{\| \mathbf{x} \|} \le \kappa(\mathbf{A}) \frac{\| \mathbf{h} \|}{\| \mathbf{b} \|}.\]

    Here \(\tilde{\mathbf{x}}\) is a numerical approximation to the exact solution \(\mathbf{x}\), and \(\mathbf{h}\) is an unknown perturbation caused by machine roundoff. We will assume that \(\| \mathbf{d} \|/\| \mathbf{b} \|\) is roughly of size eps().

    For this problem you will need the MatrixDepot package, which can be installed and loaded via

    import Pkg; Pkg.add("MatrixDepot")
    using MatrixDepot
    

    For each \(n=10,20,\ldots,70\) let A = matrixdepot("prolate",n,0.4) and let \(\mathbf{x}\) have components \(x_k=k/n\) for \(k=1,\ldots,n\). Define b=A*x and let \(\tilde{\mathbf{x}}\) be the solution produced numerically by backslash.

    Make a table including columns for \(n\), the relative error in \(\tilde{\mathbf{x}}\), the condition number of \(\mathbf{A}\), and the right-hand side of the inequality above. You should find that the inequality holds in every case.

  3. ⌨ An earlier problem asked you to solve systems

    \[\begin{split}\mathbf{A} = \begin{bmatrix} 1 & -1 & 0 & \alpha-\beta & \beta \\ 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} \alpha \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}\end{split}\]

    with \(\alpha=0.1\) and \(\beta=10,100,\ldots,10^{12}\). Again make a table of \(\beta\) and \(|x_1-1|\), and add a column for the condition numbers of these matrices.

  4. ⌨ The condition number of the formulation of polynomial interpolation suggested in Polynomial interpolation grows rapidly as the degree of the polynomial increases. Let \(\mathbf{A}_n\) denote the \((n+1)\times(n+1)\) version of the Vandermonde matrix in equation (26) based on the equally spaced interpolation nodes \(t_i=i/n\) for \(i=0,\ldots,n\).

    (a) Using the 1-norm, graph \(\kappa(\mathbf{A}_n)\) as a function of \(n\) for \(n=4,5,6,\ldots,20\), using a log scale on the \(y\)-axis. (The graph is nearly a straight line.)

    (b) Show that if \(\log f(n)\) is a linear function of \(n\) with positive slope, then \(f(n) = C\alpha^n\) for constants \(C>0\) and \(\alpha>1\).

  5. ✍ Define \(\mathbf{A}_n\) as the \(n\times n\) matrix \(\displaystyle \begin{bmatrix} 1 & -2 & & &\\ & 1 & -2 & & \\ & & \ddots & \ddots & \\ & & & 1 & -2 \\ & & & & 1 \end{bmatrix}.\)

    (a) Write out \(\mathbf{A}_2^{-1}\) and \(\mathbf{A}_3^{-1}\).

    (b) Write out \(\mathbf{A}_n^{-1}\) in the general case \(n>1\). (If necessary, look at a few more cases in Julia until you are certain of the pattern.) Make a clear argument why it is correct.

    (c) Using the \(\infty\)-norm, find \(\kappa(\mathbf{A}_n)\).

  6. (a) Prove that for \(n\times n\) nonsingular matrices \(\mathbf{A}\) and \(\mathbf{B}\), \(\kappa(\mathbf{A}\mathbf{B})\le \kappa(\mathbf{A})\kappa(\mathbf{B})\).

    (b) Show by means of an example that the result of part (a) cannot be an equality in general.

  7. ✍ Let \(\mathbf{D}\) be a diagonal \(n\times n\) matrix, not necessarily invertible. Prove that in the 2-norm,

    \[\kappa(\mathbf{D}) = \frac{\max_i |D_{ii}|}{\min_i |D_{ii}|}.\]

    (Hint: See this previous problem.)