Functions¶
Forward substitution
1 2 3 4 5 6 7 8 9 10 11 12 13
def forwardsub(L,b): """ forwardsub(L,b) Solve the lower-triangular linear system with matrix L and right-hand side vector b. """ n = len(b) x = np.zeros(n) for i in range(n): s = L[i,:i] @ x[:i] x[i] = ( b[i] - s ) / L[i, i] return x
Backward substitution
1 2 3 4 5 6 7 8 9 10 11 12 13
def backsub(U,b): """ backsub(U,b) Solve the upper-triangular linear system with matrix U and right-hand side vector b. """ n = len(b) x = np.zeros(n) for i in range(n-1, -1, -1): s = U[i, i+1:] @ x[i+1:] x[i] = ( b[i] - s ) / U[i, i] return x
LU factorization (not stable)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
def lufact(A): """ lufact(A) Compute the LU factorization of square matrix `A`, returning the factors. """ n = A.shape[0] # detect the dimensions from the input L = np.eye(n) # ones on main diagonal, np.zeros elsewhere U = np.zeros((n, n)) A_k = np.copy(A) # make a working np.copy # Reduction by np.outer products for k in range(n-1): U[k, :] = A_k[k, :] L[:, k] = A_k[:, k] / U[k,k] A_k -= np.outer(L[:,k], U[k,:]) U[n-1, n-1] = A_k[n-1, n-1] return L, U
About the code
Line 11 of Function 2.4.1 points out a subtle issue. Array variables are really just references to blocks of memory. Such a reference is much more efficient to pass around than the complete contents of the array. However, it means that a statement A_k = A
would just clone the array reference of A
into the new variable. Any changes made to entries of A_k
would then also be made to entries of A
, because they refer to the same location in memory. In this context, we don’t want to change the original matrix, so we use copy
here to create an independent copy of the array contents and a new reference to them.
LU factorization with partial pivoting
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
def plufact(A): """ plufact(A) Compute the PLU factorization of square matrix `A`, returning the triangular factors and a row permutation vector. """ n = A.shape[0] L = np.zeros((n, n)) U = np.zeros((n, n)) p = np.zeros(n, dtype=int) A_k = np.copy(A) # Reduction by np.outer products for k in range(n): p[k] = np.argmax(abs(A_k[:, k])) U[k, :] = A_k[p[k], :] L[:, k] = A_k[:, k] / U[k, k] if k < n-1: A_k -= np.outer(L[:, k], U[k, :]) return L[p, :], U, p
Examples¶
# from scipy import *
from numpy import *
from matplotlib.pyplot import *
from numpy.linalg import solve
import scipy.sparse as sparse
from timeit import default_timer as timer
import FNC
Section 2.1¶
Linear system for polynomial interpolation
We create two vectors for data about the population of China. The first has the years of census data and the other has the population, in millions of people.
year = arange(1980, 2020, 10) # from 1980 to 2020 by 10
pop = array([984.736, 1148.364, 1263.638, 1330.141])
It’s convenient to measure time in years since 1980.
t = year - 1980
y = pop
Now we have four data points , so and we seek an interpolating cubic polynomial. We construct the associated Vandermonde matrix:
V = vander(t)
print(V)
To solve a linear system for the vector of polynomial coefficients, we use solve
(imported from numpy.linalg
):
c = solve(V, y)
print(c)
The algorithms used by solve
are the main topic of this chapter. As a check on the solution, we can compute the residual , which should be small (near machine precision).
Matrix multiplication in NumPy is done with @
or matmul
.
print(y - V @ c)
By our definitions, the coefficients in c
are given in descending order of power in . We can use the resulting polynomial to estimate the population of China in 2005:
p = poly1d(c) # construct a polynomial
print(p(2005 - 1980)) # apply the 1980 time shift
The official figure was 1303.72, so our result is rather good.
We can visualize the interpolation process. First, we plot the data as points. Then we add a plot of the interpolant, taking care to shift the variable back to actual years.
scatter(year, y, color="k", label="data");
tt = linspace(0, 30, 300) # 300 times from 1980 to 2010
plot(1980 + tt, p(tt), label="interpolant");
xlabel("year");
ylabel("population (millions)");
title("Population of China");
legend();
Section 2.2¶
Matrix operations
A vector is created using square brackets and commas to enclose and separate its entries.
x = array([3, 3, 0, 1, 0 ])
print(x.shape)
To construct a matrix, you nest the brackets to create a “vector of vectors”. The inner vectors are the rows.
A = array([
[1, 2, 3, 4, 5],
[50, 40, 30, 20, 10],
[pi, sqrt(2), exp(1), (1+sqrt(5))/2, log(3)]
])
print(A)
print(A.shape)
In this text, we treat all vectors as equivalent to matrices with a single column. That isn’t true in NumPy, because even an array has two dimensions, unlike a vector.
array([[3], [1], [2]]).shape
You can concatenate arrays with compatible dimensions using hstack
and vstack
.
print( hstack([A, A]) )
print( vstack([A, A]) )
Transposing a matrix is done by appending .T
to it.
print(A.T)
For matrices with complex values, we usually want instead the adjoint or hermitian, which is .conj().T
.
print((x + 1j).conj().T)
There are many convenient shorthand ways of building vectors and matrices other than entering all of their entries directly or in a loop. To get a vector with evenly spaced entries between two endpoints, you have two options.
print(arange(1, 7, 2)) # from 1 to 7 (not inclusive), step by 2
print(linspace(-1, 1, 5)) # from -1 to 1 (inclusive), with 5 total values
The practical difference between these is whether you want to specify the step size in arange
or the number of points in linspace
.
Accessing an element is done by giving one (for a vector) or two index values in square brackets. In Python, indexing always starts with zero, not 1.
A = array([
[1, 2, 3, 4, 5],
[50, 40, 30, 20, 10],
linspace(-5, 5, 5)
])
x = array([3, 2, 0, 1, -1 ])
print("row 2, col 3 of A:", A[1, 2])
print("first element of x:", x[0])
The indices can be ranges, in which case a slice or block of the matrix is accessed. You build these using a colon in the form start:stop
. However, the last value of this range is stop-1
, not stop
.
print(A[1:3, 0:2]) # rows 2 and 3, cols 1 and 2
If start
or stop
is omitted, the range extends to the first or last index.
print(x[1:]) # elements 2 through the end
print(A[:2, 0]) # first two rows in column 1
Notice in the last case above that even when the slice is in the shape of a column vector, the result is just a vector with one dimension and neither row nor column shape.
There are more variations on the colon ranges. A negative value means to count from the end rather than the beginning. And a colon by itself means to include everything from the relevant dimension.
print(A[:-1, :]) # all rows up to the last, all columns
Finally, start:stop:step
means to step size or stride other than one. You can mix this with the other variations.
print(x[::2]) # all the odd indexes
print(A[:, ::-1]) # reverse the columns
The matrix and vector senses of addition, subtraction, and scalar multiplication and division are all handled by the usual symbols. Two matrices of the same size (what NumPy calls shape) are operated on elementwise.
print(A - 2 * ones([3, 5])) # subtract two from each element
If one operand has a smaller number of dimensions than the other, Python tries to broadcast it in the “missing” dimension(s), and the operation proceeds if the resulting shapes are identical.
print(A - 2) # subtract two from each element
u = array([1, 2, 3, 4, 5])
print(A - u) # repeat this row for every row of A
v = array([1, 2, 3])
print(A - v) # broadcasting this would be 3x3, so it's an error
print(A - v.reshape([3, 1])) # broadcasts to each column of A
Matrix–matrix and matrix–vector products are computed using @
or matmul
.
B = diag([-1, 0, -5]) # create a diagonal 3x3
print(B @ A) # matrix product
is undefined for these matrix sizes.
print(A @ B) # incompatible sizes
The multiplication operator *
is reserved for elementwise multiplication. Both operands have to be the same size, after any potential broadcasts.
print(B * A) # not the same size, so it's an error
print((A / 2) * A) # elementwise
To raise to a power elementwise, use a double star. This will broadcast as well.
print(B)
print(B**3)
print(x)
print(2.0**x)
Most of the mathematical functions, such as cos, sin, log, exp and sqrt, expecting scalars as operands will be broadcast to arrays.
print(cos(pi * x))
Section 2.3¶
Solving linear systems
For a square matrix , the command solve(A, B)
is mathematically equivalent to .
A = array([[1, 0, -1], [2, 2, 1], [-1, -3, 0]])
b = array([1, 2, 3])
x = solve(A, b)
print(x)
One way to check the answer is to compute a quantity known as the residual. It is (ideally) close to machine precision(relative to the elements in the data).
residual = b - A @ x
print(residual)
If the matrix is singular, you may get an error.
A = array([[0, 1], [0, 0]])
b = array([1, -1])
solve(A, b) # error, singular matrix
A linear system with a singular matrix might have no solution or infinitely many solutions, but in either case, a numerical solution becomes trickier. Detecting singularity is a lot like checking whether two floating-point numbers are exactly equal: because of roundoff, it could be missed. We’re headed toward a more robust way to fully describe this situation.
Triangular systems of equations
It’s easy to get just the lower triangular part of any matrix using the tril
function.
A = 1 + floor(9 * random.rand(5, 5))
L = tril(A)
print(L)
We’ll set up and solve a linear system with this matrix.
b = ones(5)
x = FNC.forwardsub(L, b)
print(x)
It’s not clear how accurate this answer is. However, the residual should be zero or comparable to .
b - L @ x
Next we’ll engineer a problem to which we know the exact answer.
alpha = 0.3;
beta = 2.2;
U = diag(ones(5)) + diag([-1, -1, -1, -1], k=1)
U[0, 3:5] = [ alpha - beta, beta ]
print(U)
x_exact = ones(5)
b = array([alpha, 0, 0, 0, 1])
x = FNC.backsub(U, b)
print("error:", x - x_exact)
Everything seems OK here. But another example, with a different value for β, is more troubling.
alpha = 0.3;
beta = 1e12;
U = diag(ones(5)) + diag([-1, -1, -1, -1], k=1)
U[0, 3:5] = [ alpha - beta, beta ]
b = array([alpha, 0, 0, 0, 1])
x = FNC.backsub(U, b)
print("error:", x - x_exact)
It’s not so good to get 4 digits of accuracy after starting with sixteen! But the source of the error is not hard to track down. Solving for performs in the first row. Since is so much smaller than , this a recipe for losing digits to subtractive cancellation.
Section 2.4¶
Triangular outer products
We explore the outer product formula for two random triangular matrices.
from numpy.random import randint
L = tril(randint(1, 10, size=(3, 3)))
print(L)
U = triu(randint(1, 10, size=(3, 3)))
print(U)
Here are the three outer products appearing in the sum in (2.4.4):
print(outer(L[:, 0], U[0, :]))
print(outer(L[:, 1], U[1, :]))
print(outer(L[:, 2], U[2, :]))
Simply because of the triangular zero structures, only the first outer product contributes to the first row and first column of the entire product.
LU factorization
For illustration, we work on a matrix. We name it with a subscript in preparation for what comes.
A_1 = array([
[2, 0, 4, 3],
[-4, 5, -7, -10],
[1, 15, 2, -4.5],
[-2, 0, 2, -13]
])
L = eye(4)
U = zeros((4, 4));
Now we appeal to (2.4.5). Since , we see that the first row of is just the first row of .
U[0, :] = A_1[0, :]
print(U)
From (2.4.6), we see that we can find the first column of from the first column of .
L[:, 0] = A_1[:, 0] / U[0, 0]
print(L)
We have obtained the first term in the sum (2.4.4) for , and we subtract it away from .
A_2 = A_1 - outer(L[:, 0], U[0, :])
Now If we ignore the first row and first column of the matrices in this equation, then in what remains we are in the same situation as at the start. Specifically, only has any effect on the second row and column, so we can deduce them now.
U[1, :] = A_2[1, :]
L[:, 1] = A_2[:, 1] / U[1, 1]
print(L)
If we subtract off the latest outer product, we have a matrix that is zero in the first two rows and columns.
A_3 = A_2 - outer(L[:, 1], U[1, :])
Now we can deal with the lower right submatrix of the remainder in a similar fashion.
U[2, :] = A_3[2, :]
L[:, 2] = A_3[:, 2] / U[2, 2]
A_4 = A_3 - outer(L[:, 2], U[2, :])
Finally, we pick up the last unknown in the factors.
U[3, 3] = A_4[3, 3]
We now have all of ,
print(L)
and all of ,
print(U)
We can verify that we have a correct factorization of the original matrix by computing the backward error:
A_1 - L @ U
In floating point, we cannot expect the difference to be exactly zero as we found in this toy example. Instead, we would be satisfied to see that each element of the difference above is comparable in size to machine precision.
Solving a linear system by LU factors
Here are the data for a linear system .
A = array([
[2, 0, 4, 3],
[-4, 5, -7, -10],
[1, 15, 2, -4.5],
[-2, 0, 2, -13]
])
b = array([4, 9, 9, 4])
We apply Function 2.4.1 and then do two triangular solves.
L, U = FNC.lufact(A)
z = FNC.forwardsub(L, b)
x = FNC.backsub(U, z)
A check on the residual assures us that we found the solution.
b - A @ x
Section 2.5¶
Floating-point operations in matrix-vector multiplication
Here is a straightforward implementation of matrix-vector multiplication.
n = 6
A = random.rand(n, n)
x = ones(n)
y = zeros(n)
for i in range(n):
for j in range(n):
y[i] += A[i, j] * x[j] # 2 flops
Each of the loops implies a summation of flops. The total flop count for this algorithm is
Since the matrix has elements, all of which have to be involved in the product, it seems unlikely that we could get a flop count that is smaller than in general.
Let’s run an experiment with the built-in matrix-vector multiplication. We assume that flops dominate the computation time and thus measure elapsed time.
N = 400 * arange(1, 11)
t = []
print(" n t")
for i, n in enumerate(N):
A = random.randn(n, n)
x = random.randn(n)
start = timer()
for j in range(50): A @ x
t.append(timer() - start)
print(f"{n:5} {t[-1]:10.3e}")
The reason for doing multiple repetitions at each value of above is to avoid having times so short that the resolution of the timer is a factor.
Looking at the timings just for and , they have ratio:
print(t[9] / t[4])
If the run time is dominated by flops, then we expect this ratio to be
Asymptotics in log-log plots
Let’s repeat the experiment of the previous example for more, and larger, values of .
N = arange(400, 6200, 200)
t = zeros(len(N))
for i, n in enumerate(N):
A = random.randn(n,n)
x = random.randn(n)
start = timer()
for j in range(20): A@x
t[i] = timer() - start
Plotting the time as a function of on log-log scales is equivalent to plotting the logs of the variables, but is formatted more neatly.
fig, ax = subplots()
ax.loglog(N, t, "-o", label="observed")
ylabel("elapsed time (sec)");
xlabel("$n$");
title("Timing of matrix-vector multiplications");
You can see that while the full story is complicated, the graph is trending to a straight line of positive slope. For comparison, we can plot a line that represents growth exactly. (All such lines have slope equal to 2.)
ax.loglog(N, t[-1] * (N/N[-1])**2, "--", label="$O(n^2)$")
ax.legend(); fig
Floating-point operations in LU factorization
We’ll test the conclusion of flops experimentally using the lu
function imported from scipi.linalg
.
from scipy.linalg import lu
N = arange(200, 2600, 200)
t = zeros(len(N))
for i, n in enumerate(N):
A = random.randn(n,n)
start = timer()
for j in range(5): lu(A)
t[i] = timer() - start
We plot the timings on a log-log graph and compare it to . The result could vary significantly from machine to machine, but in theory the data should start to parallel the line as .
loglog(N, t, "-o", label="obseved")
loglog(N, t[-1] * (N / N[-1])**3, "--", label="$O(n^3)$")
legend();
xlabel("$n$");
ylabel("elapsed time (sec)");
title("Timing of LU factorizations");
Section 2.6¶
Failure of naive LU factorization
Here is a previously encountered matrix that factors well.
A = array([
[2, 0, 4, 3],
[-4, 5, -7, -10],
[1, 15, 2, -4.5],
[-2, 0, 2, -13]
])
L, U = FNC.lufact(A)
print(L)
If we swap the second and fourth rows of , the result is still nonsingular. However, the factorization now fails.
A[[1, 3], :] = A[[3, 1], :]
L, U = FNC.lufact(A)
print(L)
The presence of NaN
in the result indicates that some impossible operation was required. The source of the problem is easy to locate. We can find the first outer product in the factorization just fine:
U[0, :] = A[0, :]
L[:, 0] = A[:, 0] / U[0, 0]
A -= outer(L[:, 0], U[0, :])
print(A)
The next step is U[1, :] = A[1, :]
, which is also OK. But then we are supposed to divide by U[1, 1]
, which is zero. The algorithm cannot continue.
Row pivoting in LU factorization
Here is the trouble-making matrix from Demo 2.6.1.
A_1 = array([
[2, 0, 4, 3],
[-2, 0, 2, -13],
[1, 15, 2, -4.5],
[-4, 5, -7, -10]
])
We now find the largest candidate pivot in the first column. We don’t care about sign, so we take absolute values before finding the max.
The argmax
function returns the location of the largest element of a vector or matrix.
i = argmax( abs(A_1[:, 0]) )
print(i)
This is the row of the matrix that we extract to put into . That guarantees that the division used to find will be valid.
L, U = eye(4), zeros((4, 4))
U[0, :] = A_1[i, :]
L[:, 0] = A_1[:, 0] / U[0, 0]
A_2 = A_1 - outer(L[:, 0], U[0, :])
print(A_2)
Observe that has a new zero row and zero column, but the zero row is the fourth rather than the first. However, we forge on by using the largest possible pivot in column 2 for the next outer product.
i = argmax( abs(A_2[:, 1]) )
print(f"new pivot row is {i}")
U[1, :] = A_2[i, :]
L[:, 1] = A_2[:, 1] / U[1, 1]
A_3 = A_2 - outer(L[:, 1], U[1, :])
print(A_3)
Now we have zeroed out the third row as well as the second column. We can finish out the procedure.
i = argmax( abs(A_3[:, 2]) )
print(f"new pivot row is {i}")
U[2, :] = A_3[i, :]
L[:, 2] = A_3[:, 2] / U[2, 2]
A_4 = A_3 - outer(L[:, 2], U[2, :])
print(A_4)
i = argmax( abs(A_4[:, 3]) )
print(f"new pivot row is {i}")
U[3, :] = A_4[i, :]
L[:, 3] = A_4[:, 3] / U[3, 3];
We do have a factorization of the original matrix:
A_1 - L @ U
And has the required structure:
print(U)
However, the triangularity of has been broken.
print(L)
Row permutation in LU factorization
Here again is the matrix from Demo 2.6.2.
A = array([
[2, 0, 4, 3],
[-2, 0, 2, -13],
[1, 15, 2, -4.5],
[-4, 5, -7, -10]
])
As the factorization proceeded, the pivots were selected from rows 4, 3, 2, and finally 1 (with NumPy indices being one less). If we were to put the rows of into that order, then the algorithm would run exactly like the plain LU factorization from LU factorization.
B = A[[3, 2, 1, 0], :]
L, U = FNC.lufact(B);
We obtain the same as before:
print(U)
And has the same rows as before, but arranged into triangular order:
print(L)
PLU factorization for solving linear systems
The third output of plufact
is the permutation vector we need to apply to .
A = random.randn(4, 4)
L, U, p = FNC.plufact(A)
A[p, :] - L @ U # should be ≈ 0
Given a vector , we solve by first permuting the entries of and then proceeding as before.
b = random.randn(4)
z = FNC.forwardsub(L, b[p])
x = FNC.backsub(U, z)
A residual check is successful:
b - A @ x
Built-in PLU factorization
In linalg.solve
, the matrix A
is PLU-factored, followed by two triangular solves. If we want to do those steps seamlessly, we can use the lu_factor
and lu_solve
from scipy.linalg
.
from scipy.linalg import lu_factor, lu_solve
A = random.randn(500, 500)
b = ones(500)
LU, perm = lu_factor(A)
x = lu_solve((LU, perm), b)
Why would we ever bother with this? In Efficiency of matrix computations we showed that the factorization is by far the most costly part of the solution process. A factorization object allows us to do that costly step only once per matrix, but solve with multiple right-hand sides.
start = timer()
for k in range(50): linalg.solve(A, random.rand(500))
print(f"elapsed time for 50 full solves: {timer() - start}")
start = timer()
LU, perm = lu_factor(A)
for k in range(50): lu_solve((LU, perm), random.rand(500))
print(f"elapsed time for 50 shortcut solves: {timer() - start}")
Stability of PLU factorization
We construct a linear system for this matrix with and exact solution :
ep = 1e-12
A = array([[-ep, 1], [1, -1]])
b = A @ array([1, 1])
We can factor the matrix without pivoting and solve for .
L, U = FNC.lufact(A)
print(FNC.backsub( U, FNC.forwardsub(L, b) ))
Note that we have obtained only about 5 accurate digits for . We could make the result even more inaccurate by making ε even smaller:
ep = 1e-20;
A = array([[-ep, 1], [1, -1]])
b = A @ array([1, 1])
L, U = FNC.lufact(A)
print(FNC.backsub( U, FNC.forwardsub(L, b) ))
This effect is not due to ill conditioning of the problem—a solution with PLU factorization works perfectly:
print(solve(A, b))
Section 2.7¶
Vector norms
The norm
function from numpy.linalg
computes vector norms.
from numpy.linalg import norm
x = array([2, -3, 1, -1])
print(norm(x)) # 2-norm by default
print(norm(x, inf))
print(norm(x, 1))
Matrix norms
from numpy.linalg import norm
A = array([ [2, 0], [1, -1] ])
The default matrix norm is not the 2-norm. Instead, you must provide the 2 explicitly.
print(norm(A, 2))
You can get the 1-norm as well.
print(norm(A, 1))
The 1-norm is equivalent to
print(max( sum(abs(A), axis=0)) ) # sum down the rows
Similarly, we can get the ∞-norm and check our formula for it.
print(norm(A, inf))
print(max( sum(abs(A), axis=1)) ) # sum across columns
Here we illustrate the geometric interpretation of the 2-norm. First, we will sample a lot of vectors on the unit circle in .
theta = linspace(0, 2*pi, 601)
x = vstack([cos(theta), sin(theta)]) # 601 unit columns
The linear function defines a mapping from to . We can apply A
to every column of x
simply by using a matrix multiplication.
y = A @ x
We plot the unit circle on the left and the image of all mapped vectors on the right:
subplot(1,2,1)
plot(x[0, :], x[1, :])
axis("equal")
title("Unit circle")
xlabel("$x_1$")
ylabel("$x_2$")
subplot(1,2,2)
plot(y[0, :], y[1, :])
plot(norm(A, 2) * x[0, :], norm(A,2) * x[1, :],"--")
axis("equal")
title("Image under map")
xlabel("$y_1$")
ylabel("$y_2$");
As seen on the right-side plot, the image of the transformed vectors is an ellipse that just touches the circle of radius .
Section 2.8¶
Matrix condition number
The function cond
from numpy.linalg
is used to computes matrix condition numbers. By default, the 2-norm is used. As an example, the family of Hilbert matrices is famously badly conditioned. Here is the case.
A = array([
[1/(i + j + 2) for j in range(6)]
for i in range(6)
])
print(A)
from numpy.linalg import cond
kappa = cond(A)
print(f"kappa is {kappa:.3e}")
Next we engineer a linear system problem to which we know the exact answer.
x_exact = 1.0 + arange(6)
b = A @ x_exact
Now we perturb the data randomly with a vector of norm 10-12.
dA = random.randn(6, 6)
dA = 1e-12 * (dA / norm(dA, 2))
db = random.randn(6)
db = 1e-12 * (db / norm(db, 2))
We solve the perturbed problem using built-in pivoted LU and see how the solution was changed.
x = solve(A + dA, b + db)
dx = x - x_exact
Here is the relative error in the solution.
print(f"relative error is {norm(dx) / norm(x_exact):.2e}")
And here are upper bounds predicted using the condition number of the original matrix.
print(f"b_bound: {kappa * 1e-12 / norm(b):.2e}")
print(f"A_bound: {kappa * 1e-12 / norm(A, 2):.2e}")
Even if we don’t make any manual perturbations to the data, machine epsilon does when we solve the linear system numerically.
x = solve(A, b)
print(f"relative error: {norm(x - x_exact) / norm(x_exact):.2e}")
print(f"rounding bound: {kappa / 2**52:.2e}")
Because , it’s possible to lose 8 digits of accuracy in the process of passing from and to . That’s independent of the algorithm; it’s inevitable once the data are expressed in double precision.
Larger Hilbert matrices are even more poorly conditioned.
A = array([ [1/(i+j+2) for j in range(14)] for i in range(14) ])
kappa = cond(A)
print(f"kappa is {kappa:.3e}")
Before we compute the solution, note that κ exceeds 1/eps
. In principle we therefore might end up with an answer that is completely wrong (i.e., a relative error greater than 100%).
print(f"rounding bound: {kappa / 2**52:.2e}")
x_exact = 1.0 + arange(14)
b = A @ x_exact
x = solve(A, b)
We got an answer. But in fact, the error does exceed 100%:
print(f"relative error: {norm(x - x_exact) / norm(x_exact):.2e}")
Section 2.9¶
Banded matrices
Here is a matrix with both lower and upper bandwidth equal to one. Such a matrix is called tridiagonal.
A = array([
[2, -1, 0, 0, 0, 0],
[4, 2, -1, 0, 0, 0],
[0, 3, 0, -1, 0, 0],
[0, 0, 2, 2, -1, 0],
[0, 0, 0, 1, 1, -1],
[0, 0, 0, 0, 0, 2 ]
])
We can extract the elements on any diagonal using the diag
command. The “main” or central diagonal is numbered zero, above and to the right of that is positive, and below and to the left is negative.
print( diag(A) )
print( diag(A, 1) )
print( diag(A, -1) )
We can also construct matrices by specifying a diagonal with the diag
function.
A = A + diag([pi, 8, 6, 7], 2)
print(A)
L, U = FNC.lufact(A)
print(L)
print(U)
Observe above that the lower and upper bandwidths of are preserved in the factor matrices.
Timing banded LU
We’ll use a large banded matrix to observe the speedup possible in LU factorization. If we use an ordinary dense matrix, then there’s no way to exploit a banded structure:
n = 8000
main = 1 + arange(n)
plusone = linspace(n-1, 1, n-1)
minusone = ones(n-1)
A = diag(main) + diag(plusone,1) + diag(minusone,1)
from scipy.linalg import lu
start = timer()
lu(A)
print(f"time for dense banded: {timer() - start:.5f}")
If instead we construct a proper sparse matrix, the speedup can be dramatic.
from scipy.sparse import diags
from scipy.sparse.linalg import splu
A = diags([main, plusone, minusone], [0, 1, -1], format="csc")
start = timer()
splu(A)
print(f"time for sparse banded: {timer() - start:.5f}")
Symmetric LDLT factorization
We begin with a symmetric .
A_1 = array([
[2, 4, 4, 2],
[4, 5, 8, -5],
[4, 8, 6, 2],
[2, -5, 2, -26]
])
We won’t use pivoting, so the pivot element is at position (1,1). This will become the first element on the diagonal of . Then we divide by that pivot to get the first column of .
L = eye(4)
d = zeros(4)
d[0] = A_1[0, 0]
L[:, 0] = A_1[:, 0] / d[0]
A_2 = A_1 - d[0] * outer(L[:, 0], L[:, 0])
print(A_2)
We are now set up the same way for the submatrix in rows and columns 2–4.
d[1] = A_2[1, 1]
L[:, 1] = A_2[:, 1] / d[1]
A_3 = A_2 - d[1] * outer(L[:, 1], L[:, 1])
print(A_3)
We continue working our way down the diagonal.
d[2] = A_3[2, 2]
L[:, 2] = A_3[:, 2] / d[2]
A_4 = A_3 - d[2] * outer(L[:, 2], L[:, 2])
print(A_4)
We have arrived at the desired factorization.
d[3] = A_4[3, 3]
print("diagonal of D:")
print(d)
print("L:")
print(L)
This should be comparable to machine roundoff:
print(norm(A_1 - (L @ diag(d) @ L.T), 2) / norm(A_1))
Cholesky factorization
A randomly chosen matrix is extremely unlikely to be symmetric. However, there is a simple way to symmetrize one.
A = 1.0 + floor(9 * random.rand(4, 4))
B = A + A.T
print(B)
Similarly, a random symmetric matrix is unlikely to be positive definite. The Cholesky algorithm always detects a non-PD matrix by quitting with an error.
from numpy.linalg import cholesky
cholesky(B)
It’s not hard to manufacture an SPD matrix to try out the Cholesky factorization:
B = A.T @ A
R = cholesky(B)
print(R)
print(norm(R @ R.T - B) / norm(B))