As mentioned in LU factorization, the $\mathbf{A}=\mathbf{L}\mathbf{U}$ factorization is not stable for every nonsingular $\mathbf{A}$. Indeed, the factorization does not always even exist.
Failure of naive LU factorization
Here is a previously encountered matrix that factors well.
A = [2 0 4 3 ; -4 5 -7 -10 ; 1 15 2 -4.5 ; -2 0 2 -13];
L, U = FNC.lufact(A)
L
If we swap the second and fourth rows of $\mathbf{A}$, the result is still nonsingular. However, the factorization now fails.
A[[2, 4], :] = A[[4, 2], :]
L, U = FNC.lufact(A)
L
The presence of NaN in the result indicates that some impossible operation was required. The source of the problem is easy to locate. We can find the first outer product in the factorization just fine:
U[1, :] = A[1, :]
L[:, 1] = A[:, 1] / U[1, 1]
A -= L[:, 1] * U[1, :]'
The next step is U[2, :] = A[2, :], which is also fine. But then we are supposed to divide by U[2, 2], which is zero. The algorithm cannot continue.
Failure of naive LU factorization
Here is a previously encountered matrix that factors well.
A = [
2 0 4 3
-4 5 -7 -10
1 15 2 -4.5
-2 0 2 -13
];
[L, U] = lufact(A);
L
If we swap the second and fourth rows of $\mathbf{A}$, the result is still nonsingular. However, the factorization now fails.
A([2, 4], :) = A([4, 2], :); % swap rows 2 and 4
[L, U] = lufact(A);
L
The presence of NaN in the result indicates that some impossible operation was required. The source of the problem is easy to locate. We can find the first outer product in the factorization just fine:
U(1, :) = A(1, :);
L(:, 1) = A(:, 1) / U(1, 1)
A = A - L(:, 1) * U(1, :)
The next step is U(2, :) = A(2, :), which is also fine. But then we are supposed to divide by U(2, 2), which is zero. The algorithm cannot continue.
Failure of naive LU factorization
Here is a previously encountered matrix that factors well.
A = array([
[2, 0, 4, 3],
[-4, 5, -7, -10],
[1, 15, 2, -4.5],
[-2, 0, 2, -13]
])
L, U = FNC.lufact(A)
print(L)
If we swap the second and fourth rows of $\mathbf{A}$, the result is still nonsingular. However, the factorization now fails.
A[[1, 3], :] = A[[3, 1], :]
L, U = FNC.lufact(A)
print(L)
The presence of NaN in the result indicates that some impossible operation was required. The source of the problem is easy to locate. We can find the first outer product in the factorization just fine:
U[0, :] = A[0, :]
L[:, 0] = A[:, 0] / U[0, 0]
A -= outer(L[:, 0], U[0, :])
print(A)
The next step is U[1, :] = A[1, :], which is also fine. But then we are supposed to divide by U[1, 1], which is zero. The algorithm cannot continue.
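The failure can be reproduced with a minimal NumPy sketch of the naive outer-product algorithm. The helper `naive_lufact` below is illustrative only, not the book's `FNC.lufact`:

```python
import numpy as np

def naive_lufact(A):
    """Naive LU by outer products, with no pivoting (illustrative sketch)."""
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    A_k = np.array(A, dtype=float)
    for k in range(n):
        U[k, :] = A_k[k, :]
        L[:, k] = A_k[:, k] / U[k, k]   # fails when the pivot U[k, k] is zero
        A_k = A_k - np.outer(L[:, k], U[k, :])
    return L, U

# The row-swapped matrix from the demo above
A = np.array([[2.0, 0, 4, 3],
              [-2, 0, 2, -13],
              [1, 15, 2, -4.5],
              [-4, 5, -7, -10]])
with np.errstate(divide="ignore", invalid="ignore"):
    L, U = naive_lufact(A)
print(np.isnan(L).any())   # True: division by the zero pivot at k = 1
```

The division by zero at the second step produces `nan` and `inf` entries, which then propagate through every later step of the factorization.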
In LU factorization we remarked that LU factorization is equivalent to Gaussian elimination with no row swaps. However, those swaps are necessary in situations like those encountered in Demo 2.6.1, in order to avoid division by zero. We will find a modification of the outer product procedure that allows us to do the same thing.
2.6.1 Choosing a pivot ¶ The diagonal element of $\mathbf{U}$ that appears in the denominator of line 17 of Function 2.4.1 is called the pivot element of its column. In order to avoid a zero pivot, we will use the largest available element in the column we are working on as the pivot. This technique is known as row pivoting.
Row pivoting in LU factorization
Here is the trouble-making matrix from Demo 2.6.1.
A₁ = [2 0 4 3 ; -2 0 2 -13; 1 15 2 -4.5 ; -4 5 -7 -10]
We now find the largest candidate pivot in the first column. We don’t care about sign, so we take absolute values before finding the max.
The argmax function returns the location of the largest element of a vector or matrix.
i = argmax( abs.(A₁[:, 1]) )
This is the row of the matrix that we extract to put into $\mathbf{U}$. That guarantees that the division used to find $\boldsymbol{\ell}_1$ will be valid.
L, U = zeros(4,4),zeros(4,4)
U[1, :] = A₁[i, :]
L[:, 1] = A₁[:, 1] / U[1, 1]
A₂ = A₁ - L[:, 1] * U[1, :]'
Observe that $\mathbf{A}_2$ has a new zero row and zero column, but the zero row is the fourth rather than the first. However, we forge on by using the largest possible pivot in column 2 for the next outer product.
@show i = argmax( abs.(A₂[:, 2]) )
U[2, :] = A₂[i, :]
L[:, 2] = A₂[:, 2] / U[2, 2]
A₃ = A₂ - L[:, 2] * U[2, :]'
Now we have zeroed out the third row as well as the second column. We can finish out the procedure.
@show i = argmax( abs.(A₃[:, 3]) )
U[3, :] = A₃[i, :]
L[:, 3] = A₃[:, 3] / U[3, 3]
A₄ = A₃ - L[:, 3] * U[3, :]'
@show i = argmax( abs.(A₄[:, 4]) )
U[4, :] = A₄[i, :]
L[:, 4] = A₄[:, 4] / U[4, 4];
We do have a factorization of the original matrix:
And $\mathbf{U}$ has the required structure:
However, the triangularity of $\mathbf{L}$ has been broken.
Row pivoting in LU factorization
Here is the trouble-making matrix from Demo 2.6.1.
A_1 = [2 0 4 3; -2 0 2 -13; 1 15 2 -4.5; -4 5 -7 -10]
We now find the largest candidate pivot in the first column. We don’t care about sign, so we take absolute values before finding the max.
The second output of max returns the location of the largest element of a vector. The ~ symbol is used to ignore the value of the first output.
[~, i] = max( abs(A_1(:, 1)) )
This is the row of the matrix that we extract to put into $\mathbf{U}$. That guarantees that the division used to find $\boldsymbol{\ell}_1$ will be valid.
L = zeros(4, 4);
U = zeros(4, 4);
U(1, :) = A_1(i, :);
L(:, 1) = A_1(:, 1) / U(1, 1);
A_2 = A_1 - L(:, 1) * U(1, :)
Observe that $\mathbf{A}_2$ has a new zero row and zero column, but the zero row is the fourth rather than the first. However, we forge on by using the largest possible pivot in column 2 for the next outer product.
[~, i] = max( abs(A_2(:, 2)) )
U(2, :) = A_2(i, :);
L(:, 2) = A_2(:, 2) / U(2, 2);
A_3 = A_2 - L(:, 2) * U(2, :)
Now we have zeroed out the third row as well as the second column. We can finish out the procedure.
[~, i] = max( abs(A_3(:, 3)) )
U(3, :) = A_3(i, :);
L(:, 3) = A_3(:, 3) / U(3, 3);
A_4 = A_3 - L(:, 3) * U(3, :)
[~, i] = max( abs(A_4(:, 4)) )
U(4, :) = A_4(i, :);
L(:, 4) = A_4(:, 4) / U(4, 4);
We do have a factorization of the original matrix:
And $\mathbf{U}$ has the required structure:
However, the triangularity of $\mathbf{L}$ has been broken.
Row pivoting in LU factorization
Here is the trouble-making matrix from Demo 2.6.1.
A_1 = array([
[2, 0, 4, 3],
[-2, 0, 2, -13],
[1, 15, 2, -4.5],
[-4, 5, -7, -10]
])
We now find the largest candidate pivot in the first column. We don’t care about sign, so we take absolute values before finding the max.
The argmax function returns the location of the largest element of a vector or matrix.
i = argmax( abs(A_1[:, 0]) )
print(i)
This is the row of the matrix that we extract to put into $\mathbf{U}$. That guarantees that the division used to find $\boldsymbol{\ell}_1$ will be valid.
L, U = eye(4), zeros((4, 4))
U[0, :] = A_1[i, :]
L[:, 0] = A_1[:, 0] / U[0, 0]
A_2 = A_1 - outer(L[:, 0], U[0, :])
print(A_2)
Observe that $\mathbf{A}_2$ has a new zero row and zero column, but the zero row is the fourth rather than the first. However, we forge on by using the largest possible pivot in column 2 for the next outer product.
i = argmax( abs(A_2[:, 1]) )
print(f"new pivot row is {i}")
U[1, :] = A_2[i, :]
L[:, 1] = A_2[:, 1] / U[1, 1]
A_3 = A_2 - outer(L[:, 1], U[1, :])
print(A_3)
Now we have zeroed out the third row as well as the second column. We can finish out the procedure.
i = argmax( abs(A_3[:, 2]) )
print(f"new pivot row is {i}")
U[2, :] = A_3[i, :]
L[:, 2] = A_3[:, 2] / U[2, 2]
A_4 = A_3 - outer(L[:, 2], U[2, :])
print(A_4)
i = argmax( abs(A_4[:, 3]) )
print(f"new pivot row is {i}")
U[3, :] = A_4[i, :]
L[:, 3] = A_4[:, 3] / U[3, 3];
We do have a factorization of the original matrix:
And $\mathbf{U}$ has the required structure:
However, the triangularity of $\mathbf{L}$ has been broken.
We will return to the loss of triangularity in $\mathbf{L}$ momentarily. First, though, there is a question left to answer: what if at some stage, all the elements of the targeted column are zero, i.e., there are no available pivots? Fortunately that loose end ties up nicely, although a proof is a bit beyond our scope here.
A linear system with a singular matrix has either no solution or infinitely many solutions. Either way, a technique other than LU factorization is needed to handle it.
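The loose end above can be observed numerically. In this NumPy sketch (illustrative, not from the text), a single elimination step on a rank-1 matrix leaves no nonzero pivot candidates in the next column:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])            # singular: row 2 is twice row 1

i = np.argmax(abs(A[:, 0]))           # pivot row for column 1 (row index 1)
U0 = A[i, :]
L0 = A[:, 0] / U0[0]
A2 = A - np.outer(L0, U0)             # residual after one outer product

print(A2[:, 1])                       # entirely zero: no pivot is available
```

Every candidate pivot in the second column is zero, which is exactly the situation that signals a singular matrix.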
2.6.2 Permutations ¶ Even though the resulting $\mathbf{L}$ in Demo 2.6.2 is no longer of unit lower triangular form, it is close. In fact, all that is needed is to reverse the order of its rows.
In principle, if the permutation of rows implied by the pivot locations is applied all at once to the original $\mathbf{A}$, no further pivoting is needed. In practice, this permutation cannot be determined immediately from the original $\mathbf{A}$; the only way to find it is to run the algorithm. Having obtained it at the end, though, we can use it to state a simple relationship.
Definition 2.6.1 (PLU factorization)
Given an $n\times n$ matrix $\mathbf{A}$, the PLU factorization consists of a unit lower triangular $\mathbf{L}$, an upper triangular $\mathbf{U}$, and a permutation $i_1,\ldots,i_n$ of the integers $1,\ldots,n$, such that

$$\tilde{\mathbf{A}} = \mathbf{L}\mathbf{U},$$

where rows $1,\ldots,n$ of $\tilde{\mathbf{A}}$ are rows $i_1,\ldots,i_n$ of $\mathbf{A}$.
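As a quick numerical check of this definition, we can use `scipy.linalg.lu`, which returns a permutation matrix `P` with `A = P @ L @ U`; equivalently, the row-permuted matrix `P.T @ A` equals `L @ U`. This is a sketch assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[2.0, 0, 4, 3],
              [-4, 5, -7, -10],
              [1, 15, 2, -4.5],
              [-2, 0, 2, -13]])
P, L, U = lu(A)

# P.T @ A is A with its rows permuted, and it factors as L @ U
assert np.allclose(P.T @ A, L @ U)
# L is unit lower triangular and U is upper triangular
assert np.allclose(np.tril(L), L) and np.allclose(np.diag(L), 1.0)
assert np.allclose(np.triu(U), U)
```

Here the permutation is represented as a matrix rather than the index vector $i_1,\ldots,i_n$ of the definition; the two forms carry the same information.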
Function 2.6.2 shows our implementation of PLU factorization.[1]
LU factorization with partial pivoting

def plufact(A):
    """
    plufact(A)

    Compute the PLU factorization of square matrix `A`, returning the
    triangular factors and a row permutation vector.
    """
    n = A.shape[0]
    L = np.zeros((n, n))
    U = np.zeros((n, n))
    p = np.zeros(n, dtype=int)
    A_k = np.array(A, dtype=float)  # copy, so the original A is not modified

    # Reduction by outer products
    for k in range(n):
        p[k] = np.argmax(abs(A_k[:, k]))
        U[k, :] = A_k[p[k], :]
        L[:, k] = A_k[:, k] / U[k, k]
        if k < n - 1:
            A_k -= np.outer(L[:, k], U[k, :])
    return L[p, :], U, p
Ideally, the PLU factorization takes $\sim \frac{2}{3}n^3$ flops asymptotically, just like LU without pivoting. The implementation in Function 2.6.2 does not achieve this optimal flop count, however. Like Function 2.4.1, it does unnecessary operations on structurally known zeros for the sake of being easier to understand.
2.6.3 Linear systems ¶ The output of Function 2.6.2 is a factorization of a row-permuted $\mathbf{A}$. Therefore, given a linear system $\mathbf{A}\mathbf{x}=\mathbf{b}$, we have to permute $\mathbf{b}$ the same way before applying forward and backward substitution. This is equivalent to changing the order of the equations in a linear system, which does not affect its solution.
PLU factorization for solving linear systems
The third output of plufact is the permutation vector we need to apply to $\mathbf{A}$.
A = rand(1:20, 4, 4)
L, U, p = FNC.plufact(A)
A[p,:] - L * U # should be ≈ 0
Given a vector $\mathbf{b}$, we solve $\mathbf{A}\mathbf{x}=\mathbf{b}$ by first permuting the entries of $\mathbf{b}$ and then proceeding as before.
b = rand(4)
z = FNC.forwardsub(L,b[p])
x = FNC.backsub(U,z)
A residual check is successful:
PLU factorization for solving linear systems
The third output of plufact is the permutation vector we need to apply to $\mathbf{A}$.
A = randi(20, 4, 4);
[L, U, p] = plufact(A);
A(p, :) - L * U % should be ≈ 0
Given a vector $\mathbf{b}$, we solve $\mathbf{A}\mathbf{x}=\mathbf{b}$ by first permuting the entries of $\mathbf{b}$ and then proceeding as before.
b = rand(4, 1);
z = forwardsub(L, b(p));
x = backsub(U, z)
A residual check is successful:
PLU factorization for solving linear systems
The third output of plufact is the permutation vector we need to apply to $\mathbf{A}$.
A = random.randn(4, 4)
L, U, p = FNC.plufact(A)
A[p, :] - L @ U # should be ≈ 0
Given a vector $\mathbf{b}$, we solve $\mathbf{A}\mathbf{x}=\mathbf{b}$ by first permuting the entries of $\mathbf{b}$ and then proceeding as before.
b = random.randn(4)
z = FNC.forwardsub(L, b[p])
x = FNC.backsub(U, z)
A residual check is successful:
The lu function from the built-in package LinearAlgebra returns the same three outputs as Function 2.6.2. If you only request one output, it will be a factorization object that can be used with a backslash. This is useful when you want to solve with multiple versions of $\mathbf{b}$ but do the factorization only once.
Built-in PLU factorization
With the syntax A \ b, the matrix A is PLU-factored, followed by two triangular solves.
A = randn(500, 500) # 500x500 with normal random entries
A \ rand(500) # force compilation
@elapsed for k=1:50; A \ rand(500); end
In Efficiency of matrix computations we showed that the factorization is by far the most costly part of the solution process. A factorization object allows us to do that costly step only once per unique matrix.
factored = lu(A) # store factorization result
factored \ rand(500) # force compilation
@elapsed for k=1:50; factored \ rand(500); end
Built-in PLU factorization
With the syntax A \ b, the matrix A is PLU-factored, followed by two triangular solves.
A = randn(500, 500); % 500x500 with normal random entries
tic; for k=1:50; A \ rand(500, 1); end; toc
In Efficiency of matrix computations we showed that the factorization is by far the most costly part of the solution process. A factorization object allows us to do that costly step only once per unique matrix.
[L, U, p] = lu(A, 'vector'); % keep factorization result
tic
for k=1:50
b = rand(500, 1);
U \ (L \ b(p));
end
toc
Built-in PLU factorization
In linalg.solve, the matrix A is PLU-factored, followed by two triangular solves. If we want to do those steps separately, we can use lu_factor and lu_solve from scipy.linalg.
from scipy.linalg import lu_factor, lu_solve
A = random.randn(500, 500)
b = ones(500)
LU, perm = lu_factor(A)
x = lu_solve((LU, perm), b)
Why would we ever bother with this? In Efficiency of matrix computations we showed that the factorization is by far the most costly part of the solution process. A factorization object allows us to do that costly step only once per matrix, but solve with multiple right-hand sides.
start = timer()
for k in range(50): linalg.solve(A, random.rand(500))
print(f"elapsed time for 50 full solves: {timer() - start}")
start = timer()
LU, perm = lu_factor(A)
for k in range(50): lu_solve((LU, perm), random.rand(500))
print(f"elapsed time for 50 shortcut solves: {timer() - start}")
2.6.4 Stability ¶ There is one detail of the row pivoting algorithm that might seem arbitrary: why choose the pivot of largest magnitude in a column, rather than, say, the uppermost nonzero in the column? The answer is numerical stability.
Let

$$\mathbf{A} = \begin{bmatrix} -\epsilon & 1 \\ 1 & -1 \end{bmatrix}.$$

If $\epsilon=0$, LU factorization without pivoting fails for $\mathbf{A}$. But if $\epsilon\neq 0$, we can go without pivoting, at least in principle.
Stability of PLU factorization
We construct a linear system for this matrix with $\epsilon=10^{-12}$ and exact solution $[1,1]$:
ϵ = 1e-12
A = [-ϵ 1; 1 -1]
b = A * [1, 1]
We can factor the matrix without pivoting and solve for $\mathbf{x}$.
L, U = FNC.lufact(A)
x = FNC.backsub( U, FNC.forwardsub(L, b) )
Note that we have obtained only about 5 accurate digits for $x_1$. We could make the result even more inaccurate by making $\epsilon$ even smaller:
ϵ = 1e-20; A = [-ϵ 1; 1 -1]
b = A * [1, 1]
L, U = FNC.lufact(A)
x = FNC.backsub( U, FNC.forwardsub(L, b) )
This effect is not due to ill conditioning of the problem—a solution with PLU factorization works perfectly:
Stability of PLU factorization
We construct a linear system for this matrix with $\epsilon=10^{-12}$ and exact solution $[1,1]$:
ep = 1e-12
A = [-ep 1; 1 -1];
b = A * [1; 1];
We can factor the matrix without pivoting and solve for $\mathbf{x}$.
[L, U] = lufact(A);
x = backsub( U, forwardsub(L, b) )
Note that we have obtained only about 5 accurate digits for $x_1$. We could make the result even more inaccurate by making $\epsilon$ even smaller:
ep = 1e-20; A = [-ep 1; 1 -1];
b = A * [1; 1];
[L, U] = lufact(A);
x = backsub( U, forwardsub(L, b) )
This effect is not due to ill conditioning of the problem—a solution with PLU factorization works perfectly:
Stability of PLU factorization
We construct a linear system for this matrix with $\epsilon=10^{-12}$ and exact solution $[1,1]$:
ep = 1e-12
A = array([[-ep, 1], [1, -1]])
b = A @ array([1, 1])
We can factor the matrix without pivoting and solve for $\mathbf{x}$.
L, U = FNC.lufact(A)
print(FNC.backsub( U, FNC.forwardsub(L, b) ))
Note that we have obtained only about 5 accurate digits for $x_1$. We could make the result even more inaccurate by making $\epsilon$ even smaller:
ep = 1e-20;
A = array([[-ep, 1], [1, -1]])
b = A @ array([1, 1])
L, U = FNC.lufact(A)
print(FNC.backsub( U, FNC.forwardsub(L, b) ))
This effect is not due to ill conditioning of the problem—a solution with PLU factorization works perfectly:
The factors of this $\mathbf{A}$ without pivoting are found to be

$$\mathbf{L} = \begin{bmatrix} 1 & 0 \\ -\epsilon^{-1} & 1 \end{bmatrix}, \qquad \mathbf{U} = \begin{bmatrix} -\epsilon & 1 \\ 0 & \epsilon^{-1}-1 \end{bmatrix}.$$

For reasons we will quantify in Conditioning of linear systems, the solution of $\mathbf{A}\mathbf{x}=\mathbf{b}$ is well-conditioned, but the problems of solving $\mathbf{L}\mathbf{z}=\mathbf{b}$ and $\mathbf{U}\mathbf{x}=\mathbf{z}$ have condition numbers essentially $1/\epsilon^2$ each. Thus, for small $\epsilon$, solution of the original linear system by unpivoted LU factorization is highly unstable.
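The claim about the factors can be checked numerically. The following NumPy sketch (illustrative, not from the text) evaluates the condition numbers of $\mathbf{A}$ and of its unpivoted factors for $\epsilon = 10^{-12}$:

```python
import numpy as np

eps = 1e-12
A = np.array([[-eps, 1.0], [1.0, -1.0]])
L = np.array([[1.0, 0.0], [-1.0 / eps, 1.0]])        # unpivoted unit lower factor
U = np.array([[-eps, 1.0], [0.0, 1.0 / eps - 1.0]])  # unpivoted upper factor

print(np.linalg.cond(A))  # modest, so solving with A is well-conditioned
print(np.linalg.cond(L))  # on the order of 1/eps**2
print(np.linalg.cond(U))  # on the order of 1/eps**2
```

Each triangular solve can amplify errors by roughly $1/\epsilon^2$, matching the instability observed in the demo even though $\mathbf{A}$ itself is benign.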
Somewhat surprisingly, solving $\mathbf{A}\mathbf{x}=\mathbf{b}$ via PLU factorization is technically also unstable. Examples of unstable solutions are known, but they are essentially never encountered in practice. While there is a lot of evidence and some reasoning about why this is the case, the situation is not completely understood. Yet PLU factorization remains the algorithm of choice for general linear systems.
2.6.5 Exercises ¶ ✍ Perform by hand the pivoted LU factorization of each matrix.

(a) $\quad \displaystyle \begin{bmatrix} 2 & 3 & 4 \\ 4 & 5 & 10 \\ 4 & 8 & 2 \end{bmatrix},$

(b) $\quad \displaystyle \begin{bmatrix} 1 & 4 & 5 & -5 \\ -1 & 0 & -1 & -5 \\ 1 & 3 & -1 & 2 \\ 1 & -1 & 5 & -1 \end{bmatrix}.$
✍ Let $\mathbf{A}$ be a square matrix and $\mathbf{b}$ be a column vector of compatible length. Here is correct Julia code to solve $\mathbf{A}\mathbf{x}=\mathbf{b}$:
L,U,p = lu(A)
x = U \ (L\b[p])
Suppose instead you replace the last line above with
Mathematically, in terms of $\mathbf{L}$, $\mathbf{U}$, $\mathbf{p}$, and $\mathbf{b}$, what vector is found?
✍ Suppose that A is a $4\times 6$ matrix in Julia and you define
Show that $\mathbf{B} = \mathbf{P}\mathbf{A}\mathbf{Q}$ for certain matrices $\mathbf{P}$ and $\mathbf{Q}$.
✍ An $n\times n$ permutation matrix $\mathbf{P}$ is a reordering of the rows of an identity matrix such that $\mathbf{P}\mathbf{A}$ has the effect of moving rows $1,2,\ldots,n$ of $\mathbf{A}$ to new positions $i_1,i_2,\ldots,i_n$. Then $\mathbf{P}$ can be expressed as

$$\mathbf{P} = \mathbf{e}_{i_1}\mathbf{e}_1^T + \mathbf{e}_{i_2}\mathbf{e}_2^T + \cdots + \mathbf{e}_{i_n}\mathbf{e}_n^T.$$

(a) For the case $n=4$ and $i_1=3$, $i_2=2$, $i_3=4$, $i_4=1$, write out separately, as matrices, all four of the terms in the sum. Then add them together to find $\mathbf{P}$.

(b) Use the formula in the general case to show that $\mathbf{P}^{-1}=\mathbf{P}^T$.