
LU factorization

A major tool in numerical linear algebra is to factor a given matrix into terms that are individually easier to deal with than the original. In this section we derive a means to express a square matrix using triangular factors, which will allow us to solve a linear system using forward and backward substitution.

2.4.1 Outer products

Our derivation of the factorization hinges on an expression of matrix products in terms of vector outer products. If $\mathbf{u}\in\mathbb{R}^m$ and $\mathbf{v}\in\mathbb{R}^n$, then the outer product of these vectors is the $m\times n$ matrix

$$
\mathbf{u} \mathbf{v}^T =
\begin{bmatrix}
u_1 v_1 & u_1 v_2 & \cdots & u_1 v_n \\
u_2 v_1 & u_2 v_2 & \cdots & u_2 v_n \\
\vdots & \vdots & & \vdots \\
u_m v_1 & u_m v_2 & \cdots & u_m v_n
\end{bmatrix}.
$$

We illustrate the connection of outer products to matrix multiplication by a small example.

It is not hard to derive the following generalization of Example 2.4.1 to all matrix products: if $\mathbf{A}$ is $m\times p$ with columns $\mathbf{a}_k$ and $\mathbf{B}$ is $p\times n$ with rows $\mathbf{b}_k^T$, then

$$
\mathbf{A}\mathbf{B} = \sum_{k=1}^p \mathbf{a}_k \mathbf{b}_k^T. \tag{2.4.4}
$$
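As a quick numerical check of this identity, here is an illustrative NumPy snippet (our own, not part of the text) confirming that a matrix product equals the sum of column-row outer products:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 5))

# Sum over k of (column k of A) times (row k of B).
outer_sum = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))

print(np.allclose(outer_sum, A @ B))  # True, up to rounding error
```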

2.4.2 Triangular product

Equation (2.4.4) has some interesting structure for the product $\mathbf{L}\mathbf{U}$, where $\mathbf{L}$ is $n\times n$ and lower triangular (i.e., zero above the main diagonal) and $\mathbf{U}$ is $n\times n$ and upper triangular (zero below the diagonal).

Let the columns of $\mathbf{L}$ be written as $\boldsymbol{\ell}_k$ and the rows of $\mathbf{U}$ be written as $\mathbf{u}_k^T$. Then the first row of $\mathbf{L}\mathbf{U}$ is

$$
\mathbf{e}_1^T \sum_{k=1}^n \boldsymbol{\ell}_k \mathbf{u}_k^T = \sum_{k=1}^n (\mathbf{e}_1^T \boldsymbol{\ell}_k) \mathbf{u}_k^T = L_{11} \mathbf{u}_1^T,
$$

since $\mathbf{e}_1^T \boldsymbol{\ell}_k = L_{1k} = 0$ for all $k>1$.

Likewise, the first column of $\mathbf{L}\mathbf{U}$ is

$$
\left( \sum_{k=1}^n \boldsymbol{\ell}_k \mathbf{u}_k^T \right) \mathbf{e}_1 = \sum_{k=1}^n \boldsymbol{\ell}_k (\mathbf{u}_k^T \mathbf{e}_1) = U_{11}\boldsymbol{\ell}_1.
$$
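These identities are easy to confirm numerically; here is a small illustrative NumPy check (our own construction, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
L = np.tril(rng.standard_normal((n, n)))  # lower triangular
U = np.triu(rng.standard_normal((n, n)))  # upper triangular
P = L @ U

# First row of LU is L[0,0] times the first row of U...
assert np.allclose(P[0, :], L[0, 0] * U[0, :])
# ...and the first column is U[0,0] times the first column of L.
assert np.allclose(P[:, 0], U[0, 0] * L[:, 0])
print("identities verified")
```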

These two calculations are enough to derive one of the most important algorithms in scientific computing.

2.4.3 Triangular factorization

Our goal is to factor a given $n\times n$ matrix $\mathbf{A}$ as the triangular product $\mathbf{A}=\mathbf{L}\mathbf{U}$. The two triangular matrices together have $n^2+n$ potentially nonzero entries, which is $n$ more unknowns than the $n^2$ equations supplied by the factorization, so we normalize by setting $L_{11}=\cdots=L_{nn}=1$, making $\mathbf{L}$ a unit lower triangular matrix.

We have arrived at the linchpin of solving linear systems.

The outer product algorithm for LU factorization seen in Demo 2.4.3 is coded as Function 2.4.1.
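Function 2.4.1 itself is not reproduced here; purely as an illustration, the outer-product algorithm might be sketched in Python as follows (the function name `lufact` and all details are our own, not the book's code):

```python
import numpy as np

def lufact(A):
    """Outer-product LU factorization without pivoting (illustrative sketch)."""
    n = A.shape[0]
    Ak = np.array(A, dtype=float)   # working copy, reduced one rank-1 term at a time
    L = np.zeros((n, n))
    U = np.zeros((n, n))
    for k in range(n):
        U[k, :] = Ak[k, :]                    # next row of U
        L[:, k] = Ak[:, k] / U[k, k]          # next column of L (diagonal entry is 1)
        Ak = Ak - np.outer(L[:, k], U[k, :])  # subtract the outer product just found
    return L, U

A = np.array([[2.0, 0.0, 4.0, 3.0],
              [-4.0, 5.0, -7.0, -10.0],
              [1.0, 15.0, 2.0, -4.5],
              [-2.0, 0.0, 2.0, -13.0]])
L, U = lufact(A)
print(np.allclose(L @ U, A))  # True
```

Note that each pass zeros out one more row and column of the working copy, so the divisions stay well defined as long as no zero pivot is encountered.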

2.4.4 Gaussian elimination and linear systems

In your first matrix algebra course, you probably learned a triangularization technique called Gaussian elimination or row elimination to solve a linear system $\mathbf{A}\mathbf{x}=\mathbf{b}$. In most presentations, you form an augmented matrix $[\mathbf{A}\;\mathbf{b}]$ and do row operations until the system reaches an upper triangular form, followed by backward substitution. LU factorization is equivalent to Gaussian elimination in which no row swaps are performed, and the elimination procedure produces the factors if you keep track of the row multipliers appropriately.

Like Gaussian elimination, the primary use of LU factorization is to solve a linear system, which it reduces to two triangular systems. Solving $\mathbf{A}\mathbf{x}=\mathbf{b}$ then follows immediately from associativity:

$$
\mathbf{b} = \mathbf{A} \mathbf{x} = (\mathbf{L} \mathbf{U}) \mathbf{x} = \mathbf{L} (\mathbf{U} \mathbf{x}).
$$

Defining $\mathbf{z} = \mathbf{U} \mathbf{x}$ leads to a two-step procedure: first solve $\mathbf{L}\mathbf{z}=\mathbf{b}$ for $\mathbf{z}$ by forward substitution, then solve $\mathbf{U}\mathbf{x}=\mathbf{z}$ for $\mathbf{x}$ by backward substitution.
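For illustration, the two-step solve can be sketched in Python; the substitution routines below are our own stand-ins, not the book's functions:

```python
import numpy as np

def forwardsub(L, b):
    """Solve Lz = b for lower triangular L."""
    n = len(b)
    z = np.zeros(n)
    for i in range(n):
        z[i] = (b[i] - L[i, :i] @ z[:i]) / L[i, i]
    return z

def backsub(U, z):
    """Solve Ux = z for upper triangular U."""
    n = len(z)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (z[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

rng = np.random.default_rng(3)
L = np.tril(rng.standard_normal((4, 4)), -1) + np.eye(4)  # unit lower triangular
U = np.triu(rng.standard_normal((4, 4))) + 4 * np.eye(4)  # nonsingular upper triangular
A, b = L @ U, rng.standard_normal(4)

z = forwardsub(L, b)   # step 1: L z = b
x = backsub(U, z)      # step 2: U x = z
print(np.allclose(A @ x, b))
```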

A key advantage of the factorization point of view is that it depends only on the matrix $\mathbf{A}$. If systems are to be solved for a single $\mathbf{A}$ but multiple right-hand sides $\mathbf{b}$, then the factorization approach is more efficient, as we’ll see in Efficiency of matrix computations.

As noted in the descriptions of Function 2.4.1 and Algorithm 2.4.2, the LU factorization as we have seen it so far is not stable for all matrices. In fact, it does not always even exist. The missing element is the row swapping allowed in Gaussian elimination. We will address these issues in Row pivoting.
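As an illustration of the instability (a constructed example, not from the text): when the $(1,1)$ pivot is tiny, the exact factors exist but involve a huge multiplier, and in double precision the computed factors no longer reproduce the original matrix.

```python
import numpy as np

# Unpivoted elimination on a matrix with a tiny (1,1) pivot.
A = np.array([[1e-20, 1.0], [1.0, 1.0]])
L = np.array([[1.0, 0.0],
              [A[1, 0] / A[0, 0], 1.0]])            # multiplier is about 1e20
U = np.array([[A[0, 0], A[0, 1]],
              [0.0, A[1, 1] - L[1, 0] * A[0, 1]]])  # 1 - 1e20 rounds to -1e20

print(L @ U)  # the (2,2) entry comes back as 0, not 1: A is not recovered
```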

2.4.5 Exercises

  1. ✍ For each matrix, produce an LU factorization by hand.

    (a) $\quad \displaystyle \begin{bmatrix} 2 & 3 & 4 \\ 4 & 5 & 10 \\ 4 & 8 & 2 \end{bmatrix}$\qquad (b) $\quad \displaystyle \begin{bmatrix} 6 & -2 & -4 & 4 \\ 3 & -3 & -6 & 1 \\ -12 & 8 & 21 & -8 \\ -6 & 0 & -10 & 7 \end{bmatrix}$

  2. ⌨ The matrices

    $$\mathbf{T}(x,y) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ x & y & 1 \end{bmatrix},\qquad \mathbf{R}(\theta) = \begin{bmatrix} \cos\theta & \sin \theta & 0 \\ -\sin\theta & \cos \theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

    are used to represent translations and rotations of plane points in computer graphics. For the following, let

    $$\mathbf{A} = \mathbf{T}(3,-1)\,\mathbf{R}(\pi/5)\,\mathbf{T}(-3,1), \qquad \mathbf{z} = \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}.$$

    (a) Find $\mathbf{b} = \mathbf{A}\mathbf{z}$.

    (b) Use Function 2.4.1 to find the LU factorization of $\mathbf{A}$.

    (c) Use the factors with triangular substitutions in order to solve $\mathbf{A}\mathbf{x}=\mathbf{b}$, and find $\mathbf{x}-\mathbf{z}$.

  3. ⌨ Define

    $$\mathbf{A}= \begin{bmatrix} 1 & 0 & 0 & 0 & 10^{12} \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}, \quad \hat{\mathbf{x}} = \begin{bmatrix} 0 \\ 1/3 \\ 2/3 \\ 1 \\ 4/3 \end{bmatrix}, \quad \mathbf{b} = \mathbf{A}\hat{\mathbf{x}}.$$

    (a) Using Function 2.4.1 and triangular substitutions, solve the linear system $\mathbf{A}\mathbf{x}=\mathbf{b}$, showing the result. To the nearest integer, how many accurate digits are in the result? (The answer is much less than the full 16 of double precision.)

    (b) Repeat part (a) with $10^{20}$ as the element in the upper right corner. (The result is even less accurate. We will study the causes of such low accuracy in Conditioning of linear systems.)

  4. ⌨ Let

    $$\mathbf{A} = \begin{bmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}.$$

    Verify computationally that if $\mathbf{A}=\mathbf{L}\mathbf{U}$ is the LU factorization, then the elements of $\mathbf{L}$, $\mathbf{U}$, $\mathbf{L}^{-1}$, and $\mathbf{U}^{-1}$ are all integers. Do not rely just on visual inspection of the numbers; perform a more definitive test.

  5. Function 2.4.1 factors $\mathbf{A}=\mathbf{L}\mathbf{U}$ in such a way that $\mathbf{L}$ is a unit lower triangular matrix—that is, has all ones on the diagonal. It is also possible to define the factorization so that $\mathbf{U}$ is a unit upper triangular matrix instead. Write a function lufact2 that uses Function 2.4.1 without modification to produce this version of the factorization. (Hint: Begin with the standard LU factorization of $\mathbf{A}^T$.) Demonstrate on a nontrivial $4\times 4$ example.

  6. When computing the determinant of a matrix by hand, it’s common to use cofactor expansion and apply the definition recursively. But this is terribly inefficient as a function of the matrix size.

    (a) ✍ Explain using determinant properties why, if $\mathbf{A}=\mathbf{L}\mathbf{U}$ is an LU factorization,

    $$\det(\mathbf{A}) = U_{11}U_{22}\cdots U_{nn}=\prod_{i=1}^n U_{ii}.$$

    (b) ⌨ Using the result of part (a), write a function determinant(A) that computes the determinant using Function 2.4.1. Test your function on at least two nontriangular $5\times 5$ matrices, comparing your result to the result of the standard det function.