A common situation in computation is that a problem has certain properties or structure that can be used to get a faster or more accurate solution. There are many properties of a matrix that can affect LU factorization. For example, an $n \times n$ matrix $\mathbf{A}$ is diagonally dominant if
$$|A_{ii}| > \sum_{\substack{j=1\\ j \neq i}}^{n} |A_{ij}| \quad \text{for each } i=1,\ldots,n.$$
It turns out that a diagonally dominant matrix is guaranteed to be invertible, and row pivoting is not required for stability.
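For instance, a direct check of (2.9.1) could look like the following sketch (an illustration, not part of the text; the function name isdiagdominant is made up):

# Sketch: true if A is strictly diagonally dominant by rows, per (2.9.1).
function isdiagdominant(A)
    return all(abs(A[i, i]) > sum(abs.(A[i, :])) - abs(A[i, i]) for i in axes(A, 1))
end

isdiagdominant([4 1 -1; 2 -5 2; 0 1 3])   # returns true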
We next consider three important types of matrices that cause the LU factorization to be specialized in some important way.
2.9.1 Banded matrices
A matrix $\mathbf{A}$ has upper bandwidth $b_u$ if $j-i > b_u$ implies $A_{ij}=0$, and lower bandwidth $b_\ell$ if $i-j > b_\ell$ implies $A_{ij}=0$. We say the total bandwidth is $b_u+b_\ell+1$. When $b_u=b_\ell=1$, we have the important case of a tridiagonal matrix.
Example 2.9.1 (Banded matrices)
Example 2.9.1 Here is a small tridiagonal matrix. Note that there is one fewer element on the super- and subdiagonals than on the main diagonal.
Tip
Use fill to create an array of a given size, with each element equal to a provided value.
A = diagm( -1 => [4, 3, 2, 1, 0],
0 => [2, 2, 0, 2, 1, 2],
1 => fill(-1, 5) )
6×6 Matrix{Int64}:
2 -1 0 0 0 0
4 2 -1 0 0 0
0 3 0 -1 0 0
0 0 2 2 -1 0
0 0 0 1 1 -1
0 0 0 0 0 2
We can extract the elements on any diagonal using the diag function. The main or central diagonal is numbered zero, above and to the right of that is positive, and below and to the left is negative.
Tip
The diag function extracts the elements from a specified diagonal of a matrix.
@show diag_main = diag(A);
@show diag_minusone = diag(A, -1);
diag_main = diag(A) = [2, 2, 0, 2, 1, 2]
diag_minusone = diag(A, -1) = [4, 3, 2, 1, 0]
The lower and upper bandwidths of $\mathbf{A}$ are preserved in the factors from the unpivoted LU factorization.
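The factors displayed below are computed by the text’s Function 2.4.1; assuming it is available as FNC.lufact, as in the other language versions of this example, the call would be

L, U = FNC.lufact(A)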
6×6 LowerTriangular{Float64, Matrix{Float64}}:
1.0 ⋅ ⋅ ⋅ ⋅ ⋅
2.0 1.0 ⋅ ⋅ ⋅ ⋅
0.0 0.75 1.0 ⋅ ⋅ ⋅
0.0 0.0 2.66667 1.0 ⋅ ⋅
0.0 0.0 0.0 0.214286 1.0 ⋅
0.0 0.0 0.0 0.0 0.0 1.0
6×6 UpperTriangular{Float64, Matrix{Float64}}:
2.0 -1.0 0.0 0.0 0.0 0.0
⋅ 4.0 -1.0 0.0 0.0 0.0
⋅ ⋅ 0.75 -1.0 0.0 0.0
⋅ ⋅ ⋅ 4.66667 -1.0 0.0
⋅ ⋅ ⋅ ⋅ 1.21429 -1.0
⋅ ⋅ ⋅ ⋅ ⋅ 2.0
Example 2.9.1 Here is a small tridiagonal matrix. Note that there is one fewer element on the super- and subdiagonals than on the main diagonal.
A = [ 2 -1 0 0 0 0
4 2 -1 0 0 0
0 3 0 -1 0 0
0 0 2 2 -1 0
0 0 0 1 1 -1
0 0 0 0 0 2 ];
We can extract the elements on any diagonal using the diag function. The main or central diagonal is numbered zero, above and to the right of that is positive, and below and to the left is negative.
Tip
The diag function extracts the elements from a specified diagonal of a matrix.
diag_main = diag(A, 0)'
diag_plusone = diag(A, 1)'
diag_minusone = diag(A,-1)'
We can also construct matrices by specifying a diagonal with the diag function.
A = A + diag([5 8 6 7], 2)
The lower and upper bandwidths of $\mathbf{A}$ are preserved in the factors from the unpivoted LU factorization.
Example 2.9.1 Here is a matrix with both lower and upper bandwidth equal to one. Such a matrix is called tridiagonal.
A = array([
[2, -1, 0, 0, 0, 0],
[4, 2, -1, 0, 0, 0],
[0, 3, 0, -1, 0, 0],
[0, 0, 2, 2, -1, 0],
[0, 0, 0, 1, 1, -1],
[0, 0, 0, 0, 0, 2 ]
])
We can extract the elements on any diagonal using the diag command. The “main” or central diagonal is numbered zero, above and to the right of that is positive, and below and to the left is negative.
We can also construct matrices by specifying a diagonal with the diag function.
A = A + diag([pi, 8, 6, 7], 2)
print(A)
[[ 2. -1. 3.14159265 0. 0. 0. ]
[ 4. 2. -1. 8. 0. 0. ]
[ 0. 3. 0. -1. 6. 0. ]
[ 0. 0. 2. 2. -1. 7. ]
[ 0. 0. 0. 1. 1. -1. ]
[ 0. 0. 0. 0. 0. 2. ]]
L, U = FNC.lufact(A)
print(L)
[[1. 0. 0. 0. 0. 0. ]
[2. 1. 0. 0. 0. 0. ]
[0. 0.75 1. 0. 0. 0. ]
[0. 0. 0.36614016 1. 0. 0. ]
[0. 0. 0. 0.21915497 1. 0. ]
[0. 0. 0. 0. 0. 1. ]]
print(U)
[[ 2. -1. 3.14159265 0. 0. 0. ]
[ 0. 4. -7.28318531 8. 0. 0. ]
[ 0. 0. 5.46238898 -7. 6. 0. ]
[ 0. 0. 0. 4.56298115 -3.19684099 7. ]
[ 0. 0. 0. 0. 1.70060359 -2.53408479]
[ 0. 0. 0. 0. 0. 2. ]]
Observe above that the lower and upper bandwidths of $\mathbf{A}$ are preserved in the factor matrices.
If row pivoting is not used, the $\mathbf{L}$ and $\mathbf{U}$ factors preserve the lower and upper bandwidths of $\mathbf{A}$. This fact implies computational savings in both the factorization and the triangular substitutions, because the zeros appear predictably and we can skip multiplication and addition with them.
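To give a flavor of those savings, here is a sketch (not the text’s code; the function name and the bandwidth argument bℓ are ours) of forward substitution with a unit lower triangular matrix that skips all entries outside the lower bandwidth, so the work is proportional to n·bℓ rather than n²:

# Sketch: solve L*x = b for unit lower triangular L with lower bandwidth bℓ,
# never touching the entries that are known to be zero.
function forwardsub_banded(L, b, bℓ)
    n = length(b)
    x = float(copy(b))
    for i in 2:n
        for j in max(1, i - bℓ):i-1    # only the columns inside the band
            x[i] -= L[i, j] * x[j]
        end
    end
    return x
end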
In order to exploit the savings offered by sparsity, we would need to make modifications to Function 2.4.1 and the triangular substitution routines. Alternatively, we can get Julia to take advantage of the structure automatically by converting the matrix into a special type called sparse . Sparse matrices are covered in more detail in Chapters 7 and 8.
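A minimal sketch of that conversion (with a made-up banded matrix for illustration):

using LinearAlgebra, SparseArrays
A = diagm(-1 => fill(1.0, 5), 0 => fill(4.0, 6), 1 => fill(1.0, 5))   # small banded example
S = sparse(A)    # stores only the nonzero entries and their locations
F = lu(S)        # this LU factorization exploits the sparsity automatically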
2.9.2 Symmetric matrices
Symmetric matrices arise frequently in applications because many types of interactions, such as gravitation and social-network befriending, are inherently symmetric. Symmetry in linear algebra simplifies many properties and algorithms. As a rule of thumb, if your matrix has symmetry, you want to exploit and preserve it.
In $\mathbf{A}=\mathbf{L}\mathbf{U}$ we arbitrarily required the diagonal elements of $\mathbf{L}$, but not $\mathbf{U}$, to be one. That breaks symmetry, so we need to modify the goal to
$$\mathbf{A}=\mathbf{L}\mathbf{D}\mathbf{L}^T,$$
where $\mathbf{L}$ is unit lower triangular and $\mathbf{D}$ is diagonal. To find an algorithm for this factorization, we begin by generalizing (2.4.4) a bit without furnishing proof.
Theorem 2.9.1 (Linear combination of outer products)
Let $\mathbf{D}$ be an $n\times n$ diagonal matrix with diagonal elements $d_1,d_2,\ldots,d_n$, and suppose $\mathbf{A}$ and $\mathbf{B}$ are $n\times n$ as well. Write the columns of $\mathbf{A}$ as $\mathbf{a}_1,\dots,\mathbf{a}_n$ and the rows of $\mathbf{B}$ as $\mathbf{b}_1^T,\dots,\mathbf{b}_n^T$. Then
$$\mathbf{A}\mathbf{D}\mathbf{B} = \sum_{k=1}^n d_k \mathbf{a}_k \mathbf{b}_k^T.$$
Let’s derive the LDLT factorization for a small example.
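As a quick numerical sanity check of Theorem 2.9.1 (an aside, not part of the text), we can compare the two sides for small random matrices before moving on:

using LinearAlgebra
n = 4
A, B = rand(n, n), rand(n, n)
d = rand(n)
D = diagm(d)
# Sum of the rank-one outer products d_k * a_k * b_k^T:
S = sum(d[k] * A[:, k] * B[k, :]' for k in 1:n)
opnorm(A * D * B - S)    # should be on the order of machine epsilon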
Example 2.9.2 (Symmetric LDLT factorization)
Example 2.9.2 We begin with a symmetric $\mathbf{A}$.
A₁ = [ 2 4 4 2
4 5 8 -5
4 8 6 2
2 -5 2 -26 ];
We won’t use pivoting, so the pivot element is at position (1,1). This will become the first element on the diagonal of $\mathbf{D}$. Then we divide by that pivot to get the first column of $\mathbf{L}$.
L = diagm(ones(4))
d = zeros(4)
d[1] = A₁[1, 1]
L[:, 1] = A₁[:, 1] / d[1]
A₂ = A₁ - d[1] * L[:, 1] * L[:, 1]'
4×4 Matrix{Float64}:
0.0 0.0 0.0 0.0
0.0 -3.0 0.0 -9.0
0.0 0.0 -2.0 -2.0
0.0 -9.0 -2.0 -28.0
We are now set up the same way for the submatrix in rows and columns 2–4.
d[2] = A₂[2, 2]
L[:, 2] = A₂[:, 2] / d[2]
A₃ = A₂ - d[2] * L[:, 2] * L[:, 2]'
4×4 Matrix{Float64}:
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 -2.0 -2.0
0.0 0.0 -2.0 -1.0
We continue working our way down the diagonal.
d[3] = A₃[3, 3]
L[:, 3] = A₃[:, 3] / d[3]
A₄ = A₃ - d[3] * L[:, 3] * L[:, 3]'
d[4] = A₄[4, 4]
@show d;
L
d = [2.0, -3.0, -2.0, 1.0]
4×4 Matrix{Float64}:
1.0 -0.0 -0.0 0.0
2.0 1.0 -0.0 0.0
2.0 -0.0 1.0 0.0
1.0 3.0 1.0 1.0
We have arrived at the desired factorization, which we can validate:
opnorm(A₁ - (L * diagm(d) * L'))
Example 2.9.2 We begin with a symmetric $\mathbf{A}$.
A_1 = [ 2 4 4 2
4 5 8 -5
4 8 6 2
2 -5 2 -26 ];
We won’t use pivoting, so the pivot element is at position (1,1). This will become the first element on the diagonal of $\mathbf{D}$. Then we divide by that pivot to get the first column of $\mathbf{L}$.
L = eye(4);
d = zeros(4, 1);
d(1) = A_1(1, 1);
L(:, 1) = A_1(:, 1) / d(1);
A_2 = A_1 - d(1) * L(:, 1) * L(:, 1)'
We are now set up the same way for the submatrix in rows and columns 2–4.
d(2) = A_2(2, 2);
L(:, 2) = A_2(:, 2) / d(2);
A_3 = A_2 - d(2) * L(:, 2) * L(:, 2)'
We continue working our way down the diagonal.
d(3) = A_3(3, 3);
L(:, 3) = A_3(:, 3) / d(3);
A_4 = A_3 - d(3) * L(:, 3) * L(:, 3)'
d(4) = A_4(4, 4);
d
L
We have arrived at the desired factorization, which we can validate:
norm(A_1 - (L * diag(d) * L'))
Example 2.9.2 We begin with a symmetric $\mathbf{A}$.
A_1 = array([
[2, 4, 4, 2],
[4, 5, 8, -5],
[4, 8, 6, 2],
[2, -5, 2, -26]
])
We won’t use pivoting, so the pivot element is at position (1,1). This will become the first element on the diagonal of $\mathbf{D}$. Then we divide by that pivot to get the first column of $\mathbf{L}$.
L = eye(4)
d = zeros(4)
d[0] = A_1[0, 0]
L[:, 0] = A_1[:, 0] / d[0]
A_2 = A_1 - d[0] * outer(L[:, 0], L[:, 0])
print(A_2)
[[ 0. 0. 0. 0.]
[ 0. -3. 0. -9.]
[ 0. 0. -2. -2.]
[ 0. -9. -2. -28.]]
We are now set up the same way for the submatrix in rows and columns 2–4.
d[1] = A_2[1, 1]
L[:, 1] = A_2[:, 1] / d[1]
A_3 = A_2 - d[1] * outer(L[:, 1], L[:, 1])
print(A_3)
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. -2. -2.]
[ 0. 0. -2. -1.]]
We continue working our way down the diagonal.
d[2] = A_3[2, 2]
L[:, 2] = A_3[:, 2] / d[2]
A_4 = A_3 - d[2] * outer(L[:, 2], L[:, 2])
print(A_4)
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 1.]]
We have arrived at the desired factorization.
d[3] = A_4[3, 3]
print("diagonal of D:")
print(d)
print("L:")
print(L)
diagonal of D:
[ 2. -3. -2. 1.]
L:
[[ 1. -0. -0. 0.]
[ 2. 1. -0. 0.]
[ 2. -0. 1. 0.]
[ 1. 3. 1. 1.]]
This should be comparable to machine roundoff:
print(norm(A_1 - (L @ diag(d) @ L.T), 2) / norm(A_1))
In practice we don’t actually have to carry out any arithmetic in the upper triangle of $\mathbf{A}$ as we work, since the operations are always the mirror image of those in the lower triangle. As a result, it can be shown that LDLT factorization takes about half as much work as the standard LU.
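Collecting the steps of Example 2.9.2 into a function gives a sketch like the following (an illustration, not the book’s implementation; it assumes A is symmetric, uses no pivoting, and for brevity still updates the entire trailing block rather than only its lower triangle):

using LinearAlgebra
function ldlt_fact(A)
    n = size(A, 1)
    A = float(copy(A))       # work on a copy so the input is left alone
    L = diagm(ones(n))
    d = zeros(n)
    for k in 1:n
        d[k] = A[k, k]                                     # next diagonal entry of D
        L[k:n, k] = A[k:n, k] / d[k]                       # next column of L
        A[k:n, k:n] -= d[k] * (L[k:n, k] * L[k:n, k]')     # eliminate from the trailing block
    end
    return LowerTriangular(L), d
end

For the matrix of Example 2.9.2, L, d = ldlt_fact(A₁) reproduces the factors found above.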
Just as pivoting is necessary to stabilize LU factorization, the LDLT factorization without pivoting may be unstable or even fail to exist. We won’t go into the details, because our interest is in specializing the factorization to matrices that also possess another important property.
2.9.3 Symmetric positive definite matrices
Suppose that $\mathbf{A}$ is $n\times n$ and $\mathbf{x}\in\mathbb{R}^n$. Observe that $\mathbf{x}^T\mathbf{A}\mathbf{x}$ is the product of $1\times n$, $n\times n$, and $n\times 1$ matrices, so it is a scalar, sometimes referred to as a quadratic form. It can be expressed as
$$\mathbf{x}^T\mathbf{A}\mathbf{x} = \sum_{i=1}^n \sum_{j=1}^n A_{ij} x_i x_j.$$
If $\mathbf{A}$ is symmetric and $\mathbf{x}^T\mathbf{A}\mathbf{x} > 0$ for every nonzero $\mathbf{x}$, we say that $\mathbf{A}$ is symmetric positive definite (SPD).
The definiteness property is usually difficult to check directly from the definition. There are some equivalent conditions, though. For instance, a symmetric matrix is positive definite if and only if its eigenvalues are all real positive numbers. SPD matrices have important properties and appear in applications in which the definiteness is known for theoretical reasons.
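For example, here is one way (a sketch using standard Julia functions, not a method prescribed by the text) to apply the eigenvalue criterion to a small symmetric matrix:

using LinearAlgebra
A = [2.0 -1 0; -1 2 -1; 0 -1 2]      # symmetric
all(eigvals(Symmetric(A)) .> 0)      # true: every eigenvalue is positive, so A is SPD

As noted below, attempting a Cholesky factorization is usually a cheaper test in practice.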
Let us consider what definiteness means to the LDLT factorization. We compute
$$0 < \mathbf{x}^T\mathbf{A}\mathbf{x} = \mathbf{x}^T \mathbf{L} \mathbf{D} \mathbf{L}^T \mathbf{x} = \mathbf{z}^T \mathbf{D} \mathbf{z},$$
where $\mathbf{z}=\mathbf{L}^T \mathbf{x}$. Note that since $\mathbf{L}$ is unit lower triangular, it is nonsingular, so $\mathbf{x}=\mathbf{L}^{-T}\mathbf{z}$. By taking $\mathbf{z}=\mathbf{e}_k$ for $k=1,\ldots,n$, we can read the equalities from right to left and conclude that $D_{kk}>0$ for all $k$. That permits us to write a kind of square root formula:[1]
$$
\mathbf{D} =
\begin{bmatrix}
D_{11} & & & \\
& D_{22} & & \\
& & \ddots & \\
& & & D_{nn}
\end{bmatrix}
= \begin{bmatrix}
\sqrt{D_{11}} & & & \\
& \sqrt{D_{22}} & & \\
& & \ddots & \\
& & & \sqrt{D_{nn}}
\end{bmatrix}^{\,2}
= \bigl( \mathbf{D}^{1/2} \bigr)^2.
$$
Now we have $\mathbf{A}=\mathbf{L}\mathbf{D}^{1/2}\mathbf{D}^{1/2}\mathbf{L}^T= \mathbf{R}^T \mathbf{R}$, where $\mathbf{R} =\mathbf{D}^{1/2}\mathbf{L}^T$ is an upper triangular matrix whose diagonal entries are positive. The resulting factorization $\mathbf{A}=\mathbf{R}^T\mathbf{R}$ is known as the Cholesky factorization of $\mathbf{A}$.
While the unpivoted LDLT factorization is not stable and not even always possible, in the SPD case one can prove that pivoting is not necessary for either the existence or the stability of the Cholesky factorization.
The speed and stability of the Cholesky factorization make it the top choice for solving linear systems with SPD matrices. As a side benefit, the Cholesky algorithm fails (in the form of an imaginary square root or division by zero) if and only if the matrix $\mathbf{A}$ is not positive definite. This is often the best way to test the definiteness of a symmetric matrix about which nothing else is known.
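A sketch of such a test in Julia (illustrative; issymmetric and isposdef are standard-library functions, and isposdef works by attempting a Cholesky factorization):

using LinearAlgebra
# Sketch: declare A to be SPD only if it is symmetric and a Cholesky attempt succeeds.
function is_spd(A)
    issymmetric(A) || return false
    return isposdef(A)
end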
Example 2.9.3 (Cholesky factorization)
Example 2.9.3 A randomly chosen matrix is extremely unlikely to be symmetric. However, there is a simple way to symmetrize one.
A = rand(1.0:9.0, 4, 4)
B = A + A'
4×4 Matrix{Float64}:
2.0 7.0 16.0 8.0
7.0 14.0 13.0 6.0
16.0 13.0 2.0 9.0
8.0 6.0 9.0 14.0
Similarly, a random symmetric matrix is unlikely to be positive definite. The Cholesky algorithm always detects a non-PD matrix by quitting with an error.
Tip
The cholesky function computes a Cholesky factorization if possible, or throws an error for a non-positive-definite matrix. However, it does not check for symmetry.
cholesky(B) # throws an error
PosDefException: matrix is not positive definite; Factorization failed.
Stacktrace:
[1] checkpositivedefinite
@ ~/.julia/juliaup/julia-1.11.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/factorization.jl:68 [inlined]
[2] #cholesky!#163
@ ~/.julia/juliaup/julia-1.11.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/cholesky.jl:269 [inlined]
[3] cholesky!
@ ~/.julia/juliaup/julia-1.11.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/cholesky.jl:267 [inlined]
[4] #cholesky!#164
@ ~/.julia/juliaup/julia-1.11.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/cholesky.jl:301 [inlined]
[5] cholesky! (repeats 2 times)
@ ~/.julia/juliaup/julia-1.11.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/cholesky.jl:295 [inlined]
[6] _cholesky
@ ~/.julia/juliaup/julia-1.11.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/cholesky.jl:411 [inlined]
[7] cholesky(A::Matrix{Float64}, ::NoPivot; check::Bool)
@ LinearAlgebra ~/.julia/juliaup/julia-1.11.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/cholesky.jl:401
[8] cholesky
@ ~/.julia/juliaup/julia-1.11.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/cholesky.jl:401 [inlined]
[9] cholesky(A::Matrix{Float64})
@ LinearAlgebra ~/.julia/juliaup/julia-1.11.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/cholesky.jl:401
[10] top-level scope
@ In[139]:1
It’s not hard to manufacture an SPD matrix to try out the Cholesky factorization.
B = A' * A
cf = cholesky(B)
Cholesky{Float64, Matrix{Float64}}
U factor:
4×4 UpperTriangular{Float64, Matrix{Float64}}:
7.93725 8.44121 4.91354 9.5751
⋅ 6.53804 9.86898 3.39163
⋅ ⋅ 4.29656 4.53397
⋅ ⋅ ⋅ 1.50247
What’s returned is a factorization object. Another step is required to extract the factor as a matrix:
R = cf.U
4×4 UpperTriangular{Float64, Matrix{Float64}}:
7.93725 8.44121 4.91354 9.5751
⋅ 6.53804 9.86898 3.39163
⋅ ⋅ 4.29656 4.53397
⋅ ⋅ ⋅ 1.50247
Here we validate the factorization:
opnorm(R' * R - B) / opnorm(B)
Example 2.9.3 A randomly chosen matrix is extremely unlikely to be symmetric. However, there is a simple way to symmetrize one.
A = magic(4) + eye(4);
B = A + A'
Similarly, a random symmetric matrix is unlikely to be positive definite. The Cholesky algorithm always detects a non-PD matrix by quitting with an error.
Tip
The chol function computes a Cholesky factorization if possible, or throws an error for a non-positive-definite matrix.
chol(B) % throws an error
Error using chol
Matrix must be positive definite.
It’s not hard to manufacture an SPD matrix to try out the Cholesky factorization.
B = A' * A;
R = chol(B)
Here we validate the factorization:
norm(R' * R - B) / norm(B)
Example 2.9.3 A randomly chosen matrix is extremely unlikely to be symmetric. However, there is a simple way to symmetrize one.
A = 1.0 + floor(9 * random.rand(4, 4))
B = A + A.T
print(B)
[[ 8. 12. 10. 7.]
[12. 6. 12. 6.]
[10. 12. 14. 4.]
[ 7. 6. 4. 8.]]
Similarly, a random symmetric matrix is unlikely to be positive definite. The Cholesky algorithm always detects a non-PD matrix by quitting with an error.
from numpy.linalg import cholesky
cholesky(B) # raises an exception, not positive definite
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
Cell In[138], line 2
      1 from numpy.linalg import cholesky
----> 2 cholesky(B)  # raises an exception, not positive definite
File ~/mambaforge/envs/myst/lib/python3.13/site-packages/numpy/linalg/_linalg.py:848, in cholesky(a, upper)
    845 signature = 'D->D' if isComplexType(t) else 'd->d'
    846 with errstate(call=_raise_linalgerror_nonposdef, invalid='call',
    847               over='ignore', divide='ignore', under='ignore'):
--> 848     r = gufunc(a, signature=signature)
    849 return wrap(r.astype(result_t, copy=False))
File ~/mambaforge/envs/myst/lib/python3.13/site-packages/numpy/linalg/_linalg.py:107, in _raise_linalgerror_nonposdef(err, flag)
    106 def _raise_linalgerror_nonposdef(err, flag):
--> 107     raise LinAlgError("Matrix is not positive definite")
LinAlgError: Matrix is not positive definite
It’s not hard to manufacture an SPD matrix to try out the Cholesky factorization:
B = A.T @ A
R = cholesky(B)
print(R)
[[11.22497216 0. 0. 0. ]
[ 6.05792148 4.27803545 0. 0. ]
[11.58132048 -0.27085567 3.71478843 0. ]
[ 5.79066024 -0.25230391 0.77209511 3.43634484]]
print(norm(R @ R.T - B) / norm(B))
2.9.4 Exercises
✍ For each matrix, use (2.9.1) to determine whether it is diagonally dominant.
$$
\mathbf{A} =
\begin{bmatrix}
3 & 1 & 0 & 1 \\
0 & -2 & 0 & 1 \\
-1 & 0 & 4 & -1 \\
0 & 0 & 0 & 6
\end{bmatrix},
\quad
\mathbf{B} =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix},
\quad
\mathbf{C} =
\begin{bmatrix}
2 & -1 & 0 & 0 \\
-1 & 2 & -1 & 0 \\
0 & -1 & 2 & -1 \\
0 & 0 & -1 & 2
\end{bmatrix}.
$$
⌨ For each matrix, use inspection or cholesky in Julia to determine whether it is SPD.
$$
\mathbf{A} =
\begin{bmatrix}
1 & 0 & -1 \\ 0 & 4 & 5 \\ -1 & 5 & 10
\end{bmatrix},
\qquad
\mathbf{B} =
\begin{bmatrix}
1 & 0 & 1 \\ 0 & 4 & 5 \\ -1 & 5 & 10
\end{bmatrix},
\qquad
\mathbf{C} =
\begin{bmatrix}
1 & 0 & 1 \\ 0 & 4 & 5 \\ 1 & 5 & 1
\end{bmatrix}.
$$
✍ Show that the diagonal entries of a symmetric positive definite matrix are positive numbers. (Hint: Apply certain special cases of (2.9.5).)
⌨ Using Function 2.4.1 as a guide, write a function
function luband(A,upper,lower)
that accepts upper and lower bandwidth values and returns LU factors (without pivoting) in a way that avoids doing arithmetic using the locations that are known to stay zero. (Hint: Refer to the more efficient form of lufact given in Efficiency of matrix computations.)
Test your function on the matrix with elements
$$
A_{ij} = \begin{cases} \frac{1}{i+j}, & -1 \le i-j \le 2,\\
0, & \text{otherwise.} \end{cases}
$$
⌨ The Tridiagonal matrix type invokes a specialized algorithm for solving a linear system.
(a) Set n=1000 and t=0. In a loop that runs 50 times, generate a linear system via
A = triu( tril(rand(n,n),1), -1)
b = ones(n)
Using @elapsed, increment t by the time it takes to perform A\b. Print out the final value of t.
(b) Repeat the experiment of part (a), but generate the matrix via
A = Tridiagonal(rand(n,n))
What is the ratio of running times for parts (a) and (b)?
(c) Now perform the experiment of part (b) for $n=1000,1200,1400,\ldots,3000$, keeping the total time for each value of $n$ in a vector. Plot running time as a function of $n$ on a log-log scale. Is the time most like $O(n)$, $O(n^2)$, or $O(n^3)$? (If the answer is unclear, try increasing the number of solves per value of $n$ to 100 or more.)
✍ Prove that if $\mathbf{A}$ is any real invertible square matrix, then $\mathbf{A}^T\mathbf{A}$ is SPD. (Hint: First show that $\mathbf{x}^T\mathbf{A}^T\mathbf{A}\mathbf{x} \ge 0$ for all $\mathbf{x}$. Then explain why zero is ruled out if $\mathbf{x}\neq \boldsymbol{0}$.)