
Chapter 1

MATLAB implementations

Functions

Horner’s algorithm for evaluating a polynomial
horner.m
function y = horner(c,x)
% HORNER   Evaluate a polynomial using Horner's rule. 
% Input:
%   c     Coefficients of polynomial, in descending order (vector)
%   x     Evaluation point (scalar)
% Output:
%   y     Value of the polynomial at x (scalar)

n = length(c);
y = c(1);
for k = 2:n
  y = x*y + c(k);
end

Examples

Section 1.1

Absolute and relative accuracy

Recall the grade-school approximation to the number π.

The number of digits displayed is controlled by `format`, but the underlying values are not affected by it.

format long
p = 22/7

Not all the digits displayed for p are the same as those of π.

The value of pi is predefined.

The absolute and relative accuracies of the approximation are as follows.

abs_accuracy = abs(p - pi)
rel_accuracy = abs(p - pi) / pi

Here we calculate the number of accurate digits in p.

The log function computes the natural logarithm. For other common bases, use log10 or log2.

format short
accurate_digits = -log10(rel_accuracy)
Floating-point representation

In MATLAB, values are double-precision floats unless declared otherwise.

fprintf('1 has type: %s\n', class(1))
fprintf('1.0 has type: %s\n', class(1.0))

The spacing between floating-point values in $[2^n, 2^{n+1})$ is $2^n \epsilon_\text{mach}$, where $\epsilon_\text{mach}$ is machine epsilon. Its value is predefined as eps.

While you can assign a different value to eps, doing so does not change any arithmetic. It’s generally a bad idea.

eps

Because double precision allocates 52 bits to the significand, the default value of machine epsilon is $2^{-52}$.

log2(eps)

The spacing between adjacent floating-point values is proportional to the magnitude of the value itself. This is how relative precision is kept roughly constant throughout the range of values. You can get the adjusted spacing by calling eps with a value.

eps(1.618)
eps(161.8)
x = 161.8 + 0.1*eps(161.8);
x - 161.8

A common mistake is to think that $\epsilon_\text{mach}$ is the smallest floating-point number. It’s only the smallest relative to 1. The correct perspective is that the scaling of values is limited by the exponent, not the mantissa. The actual range of positive values in double precision is

format short e
[realmin, realmax]
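Values outside this range do not simply cause errors; under IEEE 754 arithmetic they overflow to Inf or underflow toward (and eventually to) zero. A quick illustration (an added example, not part of the original demo):

2 * realmax      % overflows to Inf
realmin / 2^60   % underflows to zero, past even the subnormal range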
Floating-point arithmetic oddity

There is no double-precision number between 1 and $1+\epsilon_\text{mach}$. Thus the following difference is zero despite its appearance.

( 1 + eps / 2 ) - 1

However, the spacing between floats in $[1/2,1)$ is $\epsilon_\text{mach}/2$, so both $1-\epsilon_\text{mach}/2$ and its negative are represented exactly:

1 - (1 - eps / 2)

This is now the expected result. But we have found a rather shocking breakdown of the associative law of addition!
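To spell out the associativity failure, here is a small added comparison of the two groupings of the same three terms:

(1 + eps/2) - 1   % in this grouping, the sum rounds to zero
1 + (eps/2 - 1)   % regrouped, the exact value eps/2 survives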

Section 1.2

Conditioning of polynomial roots

The polynomial $p(x) = \frac{1}{3}(x-1)(x-1-\epsilon)$ has roots 1 and $1+\epsilon$. For small values of $\epsilon$, the roots are ill-conditioned.

ep = 1e-6;
a = 1/3;             % coefficients of p...
b = (-2 - ep) / 3;   % ...
c = (1 + ep) / 3;    % ...in descending degree order

Here are the roots as computed by the quadratic formula.

d = sqrt(b^2 - 4*a*c);
format long   % show all digits
r1 = (-b - d) / (2*a)
r2 = (-b + d) / (2*a)

The display of r2 suggests that the last five digits or so are inaccurate. The relative errors are

Putting values inside square brackets creates a vector.

format short e
err = abs([r1, r2] - [1, 1 + ep]) ./ abs([1, 1 + ep])

The condition number of each root is

$$\kappa(r_i) = \frac{|r_i|}{|r_1-r_2|} \approx \frac{1}{\epsilon}.$$

Thus, relative error in the data at the level of roundoff can grow in the result to be roughly

eps / ep

This matches the observation pretty well.
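As a sanity check on the formula above (an added computation using the roots r1 and r2 found earlier), we can evaluate the condition number directly and see how much of the roundoff it predicts will be amplified:

kappa = abs(r1) / abs(r1 - r2)   % roughly 1/ep
kappa * eps                      % predicted relative error in the roots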

Section 1.3

Using a function

Here we show how to use Function 1.3.1 to evaluate a polynomial. Let us define a vector of the coefficients of $p(x)=(x-1)^3=x^3-3x^2+3x-1$, in descending degree order.

c = [1, -3, 3, -1]

Now we evaluate $p(1.6)$ using the function horner.

horner(c, 1.6)

The result above is the value of $p(1.6)$, which equals $(0.6)^3 = 0.216$ exactly.
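As an informal cross-check, MATLAB's built-in polyval evaluates polynomials from the same descending-order coefficient vector, so it should agree with horner here:

polyval(c, 1.6)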

Section 1.4

Instability of the quadratic formula

We apply the quadratic formula to find the roots of a quadratic via (1.4.1).

A number in scientific notation is entered as 1.23e4 rather than as 1.23 * 10^4.

a = 1;  b = -(1e6 + 1e-6);  c = 1;
x1 = (-b + sqrt(b^2 - 4*a*c)) / (2*a)
x2 = (-b - sqrt(b^2 - 4*a*c)) / (2*a)

The first value is correct to all stored digits, but the second has fewer than six accurate digits:

error = abs(x2 - 1e-6) / 1e-6 
accurate_digits = -log10(error)

The instability is easily explained. Since $a=c=1$, we treat them as exact numbers. First, we compute the condition numbers with respect to $b$ for each elementary step in finding the “good” root:

| Calculation | Result | $\kappa$ |
|---|---|---|
| $u_1 = b^2$ | $1.000000000002000\times 10^{12}$ | $2$ |
| $u_2 = u_1 - 4$ | $9.999999999980000\times 10^{11}$ | $\approx 1.00$ |
| $u_3 = \sqrt{u_2}$ | $999999.9999990000$ | $1/2$ |
| $u_4 = u_3 - b$ | $2000000$ | $\approx 0.500$ |
| $u_5 = u_4/2$ | $1000000$ | $1$ |

Using (1.2.9), the chain rule for condition numbers, the conditioning of the entire chain is the product of the individual steps, so there is essentially no growth of relative error here. However, if we use the quadratic formula for the “bad” root, the next-to-last step becomes $u_4=(-u_3) - b$, and now $\kappa=|u_3|/|u_4|\approx 5\times 10^{11}$. So we can expect to lose 11 digits of accuracy, which is what we observed. The key issue is the subtractive cancellation in this one step.
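To see the cancellation in isolation, here is a small added sketch that redoes the elementary steps for the “bad” root (with the a, b, c defined above) and measures the condition number of the offending subtraction:

u1 = b^2;
u2 = u1 - 4;
u3 = sqrt(u2);
u4 = -u3 - b;                    % subtractive cancellation occurs here
kappa_bad = abs(u3) / abs(u4)    % about 5e11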

Stable alternative to the quadratic formula

We repeat the rootfinding experiment of Demo 1.4.1 with an alternative algorithm.

a = 1;  b = -(1e6 + 1e-6);  c = 1;

First, we find the “good” root using the quadratic formula.

x1 = (-b + sqrt(b^2 - 4*a*c)) / (2*a);

Then we use the identity $x_1x_2=\frac{c}{a}$ to compute the smaller root:

x2 = c / (a*x1)

This matches the exact root to the displayed digits; to be sure we have an accurate result, we compute its relative error.

abs(x2 - 1e-6) / 1e-6
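The same trick can be wrapped in a small helper. Below is a minimal sketch (a hypothetical quadroots.m, not one of the book's functions) that always forms the root free of cancellation first, assuming real coefficients and real roots:

function [x1, x2] = quadroots(a, b, c)
% QUADROOTS   Roots of a*x^2 + b*x + c, avoiding subtractive cancellation.
% Assumes real coefficients with b^2 >= 4*a*c and a, c nonzero.
d = sqrt(b^2 - 4*a*c);
if b >= 0
    x1 = (-b - d) / (2*a);   % -b and -d share a sign: no cancellation
else
    x1 = (-b + d) / (2*a);
end
x2 = c / (a*x1);             % recover the other root from x1*x2 = c/a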
Backward error

Our first step is to construct a polynomial with six known roots.

The ' operator is used for transposition. Here, we want to make r a column vector.

r = [-2, -1, 1, 1, 3, 6]';
p = poly(r)

Now we use a standard numerical method for finding those roots, pretending that we don’t know them already. This corresponds to $\tilde{y}$ in Definition 1.4.1.

rr = sort(roots(p))

Here are the relative errors in each of the computed roots.

The ./ operator is used for element-wise division.

disp("Root errors:") 
abs(r - rr) ./ abs(r)

It seems that the forward error is acceptably close to machine epsilon for double precision in all cases except the double root at $x=1$. This is not a surprise, though, given the poor conditioning at such roots.
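Quantitatively, perturbing the coefficients at relative level $\epsilon$ typically moves a double root by about $\sqrt{\epsilon}$, so an error near $\sqrt{\epsilon_\text{mach}}$ is roughly the best we can hope for there (an added estimate, not part of the original demo):

sqrt(eps)   % expected error scale at the double root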

Let’s consider the backward error. The data in the rootfinding problem is the polynomial coefficients. We can apply poly to find the coefficients of the polynomial (that is, the data) whose roots were actually computed by the numerical algorithm. This corresponds to $\tilde{x}$ in Definition 1.4.1.

pp = poly(rr)

We find that in a relative sense, these coefficients are very close to those of the original, exact polynomial:

disp("Coefficient errors:") 
abs(p - pp) ./ abs(p)

In summary, even though there are some computed roots relatively far from their correct values, they are nevertheless the roots of a polynomial that is very close to the original.
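As one more added check on the backward error, we can evaluate the original polynomial at the computed roots; the residuals are small relative to the size of the coefficients, even at the inaccurate double root:

polyval(p, rr)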