Skip to article frontmatterSkip to article content

Linear systems

We now attend to the central problem of this chapter: Given a square, n×nn\times n matrix A\mathbf{A} and an nn-vector b\mathbf{b}, find an nn-vector x\mathbf{x} such that Ax=b\mathbf{A}\mathbf{x}=\mathbf{b}. Writing out these equations, we obtain

A11x1+A12x2++A1nxn=b1,A21x1+A22x2++A2nxn=b2,An1x1+An2x2++Annxn=bn.\begin{split} A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n &= b_1, \\ A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n &= b_2, \\ \vdots \\ A_{n1}x_1 + A_{n2}x_2 + \cdots + A_{nn}x_n &= b_n. \end{split}

If A\mathbf{A} is invertible, then the mathematical expression of the solution is x=A1b\mathbf{x}=\mathbf{A}^{-1}\mathbf{b} because

A1b=A1(Ax)=(A1A)x=Ix=x.\begin{split} \mathbf{A}^{-1}\mathbf{b} = \mathbf{A}^{-1} (\mathbf{A} \mathbf{x}) = (\mathbf{A}^{-1}\mathbf{A}) \mathbf{x} = \mathbf{I} \mathbf{x} = \mathbf{x}. \end{split}

When A\mathbf{A} is singular, then Ax=b\mathbf{A}\mathbf{x}=\mathbf{b} may have no solution or infinitely many solutions.

2.3.1Don’t use the inverse

Matrix inverses are indispensable for mathematical discussion and derivations. However, as you may remember from a linear algebra course, they are not trivial to compute from the entries of the original matrix. You might be surprised to learn that matrix inverses play almost no role in scientific computing.

In fact, when we encounter an expression such as x=A1b\mathbf{x} = \mathbf{A}^{-1} \mathbf{b} in computing, we interpret it as “solve the linear system Ax=b\mathbf{A} \mathbf{x} = \mathbf{b}” and apply whatever algorithm is most expedient based on what we know about A\mathbf{A}.

Julia
MATLAB
Python

As demonstrated in Demo 2.1.1, the backslash (the \ symbol, not to be confused with the slash / used in web addresses) invokes a linear system solution.

2.3.2Triangular systems

The solution process is especially easy to demonstrate for a system with a triangular matrix. For example, consider the lower triangular system

[4000310010301112]x=[8501]. \begin{bmatrix} 4 & 0 & 0 & 0 \\ 3 & -1 & 0 & 0 \\ -1 & 0 & 3 & 0 \\ 1 & -1 & -1 & 2 \end{bmatrix} \mathbf{x} = \begin{bmatrix} 8 \\ 5 \\ 0 \\ 1 \end{bmatrix}.

The first row of this system states simply that 4x1=84x_1=8, which is easily solved as x1=8/4=2x_1=8/4=2. Now, the second row states that 3x1x2=53x_1-x_2=5. As x1x_1 is already known, it can be replaced to find that x2=(532)=1x_2 = -(5-3\cdot 2)=1. Similarly, the third row gives x3=(0+12)/3=2/3x_3=(0+1\cdot 2)/3 = 2/3, and the last row yields x4=(112+11+12/3)/2=1/3x_4=(1-1\cdot 2 + 1\cdot 1 + 1\cdot 2/3)/2 = 1/3. Hence the solution is

x=[212/31/3]. \mathbf{x} = \begin{bmatrix} 2 \\ 1 \\ 2/3 \\ 1/3 \end{bmatrix}.

The process just described is called forward substitution. In the 4×44\times 4 lower triangular case of Lx=b\mathbf{L}\mathbf{x}=\mathbf{b} it leads to the formulas

x1=b1L11,x2=b2L21x1L22,x3=b3L31x1L32x2L33,x4=b4L41x1L42x2L43x3L44.\begin{split} x_1 &= \frac{b_1}{L_{11}}, \\ x_2 &= \frac{b_2 - L_{21}x_1}{L_{22}}, \\ x_3 &= \frac{b_3 - L_{31}x_1 - L_{32}x_2}{L_{33}}, \\ x_4 &= \frac{b_4 - L_{41}x_1 - L_{42}x_2 - L_{43}x_3}{L_{44}}. \end{split}

For upper triangular systems Ux=b\mathbf{U}\mathbf{x}=\mathbf{b} an analogous process of backward substitution begins by solving for the last component xn=bn/Unnx_n=b_n/U_{nn} and working backward. For the 4×44\times 4 case we have

[U11U12U13U140U22U23U2400U33U34000U44]x=[b1b2b3b4]. \begin{bmatrix} U_{11} & U_{12} & U_{13} & U_{14} \\ 0 & U_{22} & U_{23} & U_{24} \\ 0 & 0 & U_{33} & U_{34} \\ 0 & 0 & 0 & U_{44} \end{bmatrix} \mathbf{x} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}.

Solving the system backward, starting with x4x_4 first and then proceeding in descending order, gives

x4=b4U44,x3=b3U34x4U33,x2=b2U23x3U24x4U22,x1=b1U12x2U13x3U14x4U11.\begin{split} x_4 &= \frac{b_4}{U_{44}}, \\ x_3 &= \frac{b_3 - U_{34}x_4}{U_{33}}, \\ x_2 &= \frac{b_2 - U_{23}x_3 - U_{24}x_4}{U_{22}}, \\ x_1 &= \frac{b_1 - U_{12}x_2 - U_{13}x_3 - U_{14}x_4}{U_{11}}. \end{split}

It should be clear that forward or backward substitution fails if and only if one of the diagonal entries of the system matrix is zero. We have essentially proved the following theorem.

2.3.3Implementation

Consider how to implement the sequential process implied by Equation (2.3.7). It seems clear that we want to loop through the elements of x\mathbf{x} in order. Within each iteration of that loop, we have an expression whose length depends on the iteration number. This leads to a nested loop structure.

The implementation of backward substitution is much like forward substitution and is given in Function 2.3.2.

The example in Demo 2.3.3 is our first clue that linear system problems may have large condition numbers, making inaccurate solutions inevitable in floating-point arithmetic. We will learn how to spot such problems in Conditioning of linear systems. Before reaching that point, however, we need to discuss how to solve general linear systems, not just triangular ones.

2.3.4Exercises

  1. ✍ Find a vector b\mathbf{b} such that the system [0100]x=b\begin{bmatrix} 0&1\\0&0 \end{bmatrix} \mathbf{x}=\mathbf{b} has no solution.

  2. ✍ Solve the following triangular systems by hand.

    (a) 2x1=4x1x2=23x1+2x2+x3=1\displaystyle \begin{aligned} -2x_1 &= -4 \\ x_1 - x_2 &= 2 \\ 3x_1 + 2x_2 + x_3 &= 1 \end{aligned} \quad (b) [4000120014402551]x=[4135]\displaystyle \begin{bmatrix} 4 & 0 & 0 & 0 \\ 1 & -2 & 0 & 0 \\ -1 & 4 & 4 & 0 \\ 2 & -5 & 5 & 1 \end{bmatrix} \mathbf{x} = \begin{bmatrix} -4 \\ 1 \\ -3 \\ 5 \end{bmatrix}\quad (c) 3x1+2x2+x3=1x2x3=22x3=4\displaystyle \begin{aligned} 3x_1 + 2x_2 + x_3 &= 1 \\ x_2 - x_3 &= 2 \\ 2 x_3 &= -4 \end{aligned}

  3. ⌨ Use Function 2.3.1 or Function 2.3.2 to solve each system from the preceding exercise. Verify that the solution is correct by computing Lx\mathbf{L}\mathbf{x} and subtracting b\mathbf{b}.

  4. ⌨ Use Function 2.3.2 to solve the following systems. Verify that the solution is correct by computing Ux\mathbf{U}\mathbf{x} and subtracting b\mathbf{b}.

    (a)   [310012003]x=[116]\;\begin{bmatrix} 3 & 1 & 0 \\ 0 & -1 & -2 \\ 0 & 0 & 3 \\ \end{bmatrix} \mathbf{x} = \begin{bmatrix} 1 \\ 1 \\ 6 \end{bmatrix}\qquad (b)   [3106012700340005]x=[4115]\;\begin{bmatrix} 3 & 1 & 0 & 6 \\ 0 & -1 & -2 & 7 \\ 0 & 0 & 3 & 4 \\ 0 & 0 & 0 & 5 \end{bmatrix} \mathbf{x} = \begin{bmatrix} 4 \\ 1 \\ 1 \\ 5 \end{bmatrix}

  5. Suppose a string is stretched with tension τ horizontally between two anchors at x=0x=0 and x=1x=1. At each of the n1n-1 equally spaced positions xk=k/nx_k=k/n, k=1,,n1k=1,\ldots,n-1, we attach a little mass mim_i and allow the string to come to equilibrium. This causes vertical displacement of the string. Let qkq_k be the amount of displacement at xkx_k. If the displacements are not too large, then an approximate force balance equation is

    nτ(qkqk1)+nτ(qkqk+1)=mkg,k=1,,n1,n \tau (q_k - q_{k-1}) + n\tau (q_k - q_{k+1}) = m_k g, \qquad k=1,\ldots,n-1,

    where g=9.8g=-9.8 m/s2 is the acceleration due to gravity, and we define q0=0q_0=0 and qn=0q_n=0 due to the anchors. This defines a linear system for q1,,qn1q_1,\ldots,q_{n-1}.

    (a) ✍ Show that the force balance equations can be written as a linear system Aq=f\mathbf{A}\mathbf{q}=\mathbf{f}, where q\mathbf{q} is a vector of the unknown displacements and A\mathbf{A} is a tridiagonal matrix (i.e., Aij=0A_{ij}=0 if ij>1|i-j|>1) of size (n1)×(n1)(n-1)\times(n-1).

    (b) ⌨ Let τ=10\tau=10 N, and mk=(1/10n)m_k=(1/10n) kg for every kk. Using backslash, find the displacements when n=8n=8 and n=40n=40, and superimpose plots of q\mathbf{q} over 0x10\le x \le 1 for the two cases. (Be sure to include the zero values at x=0x=0 and x=1x=1 in your plots.)

    (c) ⌨ Repeat (b) for the case mk=(k/5n2)m_k = (k/5n^2) kg.

  6. ⌨ If BRn×p\mathbf{B}\in\mathbb{R}^{n \times p} has columns b1,,bp\mathbf{b}_1,\ldots,\mathbf{b}_p, then we can pose pp linear systems at once by writing AX=B\mathbf{A} \mathbf{X} = \mathbf{B}, where X\mathbf{X} is n×pn\times p. Specifically, this equation implies Axj=bj\mathbf{A} \mathbf{x}_j = \mathbf{b}_j for j=1,,pj=1,\ldots,p.

    (a) Modify Function 2.3.1 and Function 2.3.2 so that they solve the case where the second input is n×pn\times p for p1p\ge 1.

    (b) If AX=I\mathbf{A} \mathbf{X}=\mathbf{I}, then X=A1\mathbf{X}=\mathbf{A}^{-1}. Use this fact to write a function ltinverse that uses your modified forwardsub to compute the inverse of a lower triangular matrix. Test your function on at least two nontrivial matrices. (We remind you here that this is just an exercise; matrix inverses are rarely a good idea in numerical practice!)

  7. Demo 2.3.3 showed solutions of Ax=b\mathbf{A}\mathbf{x}=\mathbf{b}, where

    A=[110αββ01100001100001100001],b=[α0001].\mathbf{A} = \begin{bmatrix} 1 & -1 & 0 & \alpha-\beta & \beta \\ 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} \alpha \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}.

    Use Function 2.3.2 to solve with α=0.1\alpha=0.1 and β=10,100,103,,1012\beta=10,100,10^3,\ldots,10^{12}, tabulating the values of β and x11|x_1-1|. (This kind of behavior is explained in Conditioning of linear systems.)