In calculus you learn that the elegant way to evaluate a definite integral is to apply the Fundamental Theorem of Calculus and find an antiderivative. The connection is so profound and pervasive that it’s easy to overlook that a definite integral is a numerical quantity existing independently of antidifferentiation. However, most conceivable integrands have no antiderivative in terms of familiar functions.
Example 5.6.1 (Numerical integration)
Example 5.6.1 The antiderivative of e x e^x e x is, of course, itself. That makes evaluation of ∫ 0 1 e x d x \int_0^1 e^x\,dx ∫ 0 1 e x d x by the Fundamental Theorem trivial.
The Julia package QuadGK
has an all-purpose numerical integrator that estimates the value without finding the antiderivative first. As you can see here, it’s often just as accurate.
using QuadGK
Q, errest = quadgk(x -> exp(x), 0, 1)
@show Q;
The numerical approach is also far more robust. For example, e sin x e^{\,\sin x} e s i n x has no useful antiderivative. But numerically, it’s no more difficult.
Q, errest = quadgk(x -> exp(sin(x)), 0, 1)
@show Q;
When you look at the graphs of these functions, what’s remarkable is that one of these areas is basic calculus while the other is almost impenetrable analytically. From a numerical standpoint, they are practically the same problem.
using Plots
plot([exp, x -> exp(sin(x))], 0, 1, fill=0, layout=(2, 1),
xlabel=L"x", ylabel=[L"e^x" L"e^{\sin(x)}"], ylim=[0, 2.7])
Example 5.6.1 The antiderivative of e x e^x e x is, of course, itself. That makes evaluation of ∫ 0 1 e x d x \int_0^1 e^x\,dx ∫ 0 1 e x d x by the Fundamental Theorem trivial.
format long
exact = exp(1) - 1
MATLAB has numerical integrator integral
that estimates the value without finding the antiderivative first. As you can see here, it can be as accurate as floating-point precision allows.
integral(@(x) exp(x), 0, 1)
The numerical approach is also far more robust. For example, e sin x e^{\,\sin x} e s i n x has no useful antiderivative. But numerically, it’s no more difficult.
integral(@(x) exp(sin(x)), 0, 1)
When you look at the graphs of these functions, what’s remarkable is that one of these areas is basic calculus while the other is almost impenetrable analytically. From a numerical standpoint, they are practically the same problem.
x = linspace(0, 1, 201)';
subplot(2,1,1), fill([x; 1; 0], [exp(x); 0;0 ], [1, 0.9, 0.9])
title('exp(x)')
ylabel('f(x)')
subplot(2, 1, 2), fill([x; 1; 0], [exp(sin(x)); 0; 0], [1, 0.9, 0.9])
title('exp(sin(x))')
xlabel('x'), ylabel(('f(x)'));
Example 5.6.1 The antiderivative of e x e^x e x is, of course, itself. That makes evaluation of ∫ 0 1 e x d x \int_0^1 e^x\,dx ∫ 0 1 e x d x by the Fundamental Theorem trivial.
The module scipy.integrate
has multiple functions that estimate the value of an integral numerically without finding the antiderivative first. As you can see here, it’s often just as accurate.
from scipy.integrate import quad
Q, errest = quad(exp, 0, 1, epsabs=1e-13, epsrel=1e-13)
print(Q)
The numerical approach is also far more robust. For example, e sin x e^{\,\sin x} e s i n x has no useful antiderivative. But numerically, it’s no more difficult.
Q, errest = quad(lambda x: exp(sin(x)), 0, 1, epsabs=1e-13, epsrel=1e-13)
print(Q)
When you look at the graphs of these functions, what’s remarkable is that one of these areas is basic calculus while the other is almost impenetrable analytically. From a numerical standpoint, they are practically the same problem.
x = linspace(0, 1, 300)
subplot(1, 2, 1)
plot(x, exp(x))
ylim([0, 2.7]), title("exp(x)")
subplot(1, 2, 2)
plot(x, exp(sin(x)))
ylim([0, 2.7]), title("exp(sin(x))");
Numerical integration, which also goes by the older name quadrature , is performed by combining values of the integrand sampled at nodes. In this section we will assume equally spaced nodes using the definitions
t i = a + i h , h = b − a n , i = 0 , … , n . t_i = a +i h, \quad h=\frac{b-a}{n}, \qquad i=0,\ldots,n. t i = a + ih , h = n b − a , i = 0 , … , n . Definition 5.6.1 (Numerical integration formula)
A numerical integration formula is a list of weights w 0 , … , w n w_0,\ldots,w_n w 0 , … , w n chosen so that for all f f f in some class of functions,
∫ a b f ( x ) d x ≈ h ∑ i = 0 n w i f ( t i ) = h [ w 0 f ( t 0 ) + w 1 f ( t 1 ) + ⋯ w n f ( t n ) ] , \begin{split}
\int_a^b f(x)\, dx \approx h \sum_{i=0}^n w_if(t_i) = h \bigl[ w_0f(t_0)+w_1f(t_1)+\cdots w_nf(t_n) \bigr],
\end{split} ∫ a b f ( x ) d x ≈ h i = 0 ∑ n w i f ( t i ) = h [ w 0 f ( t 0 ) + w 1 f ( t 1 ) + ⋯ w n f ( t n ) ] , with the t i t_i t i defined in (5.6.1) . The weights are independent of f f f and h h h .
Numerical integration formulas can be applied to sequences of data values even if no function is explicitly known to generate them. For our presentation and implementations, however, we assume that f f f is known and can be evaluated anywhere.
A straightforward way to derive integration formulas is to mimic the approach taken for finite differences: find an interpolant and operate exactly on it.
One of the most important integration formulas results from integration of the piecewise linear interpolant (see Piecewise linear interpolation ). Using the cardinal basis form of the interpolant in (5.2.3) , we have
∫ a b f ( x ) d x ≈ ∫ a b ∑ i = 0 n f ( t i ) H i ( x ) d x = ∑ i = 0 n f ( t i ) [ ∫ a b H i ( x ) ] d x . \int_a^b f(x) \, dx \approx \int_a^b \sum_{i=0}^n f(t_i) H_i(x)\, dx = \sum_{i=0}^n f(t_i) \left[ \int_a^b H_i(x)\right]\, dx. ∫ a b f ( x ) d x ≈ ∫ a b i = 0 ∑ n f ( t i ) H i ( x ) d x = i = 0 ∑ n f ( t i ) [ ∫ a b H i ( x ) ] d x . Thus we can identify the weights as w i = h − 1 ∫ a b H i ( x ) d x w_i = h^{-1} \int_a^b H_i(x)\, dx w i = h − 1 ∫ a b H i ( x ) d x . Using areas of triangles, it’s trivial to derive that
w i = { 1 , i = 1 , … , n − 1 , 1 2 , i = 0 , n . w_i = \begin{cases}
1, & i=1,\ldots,n-1,\\
\frac{1}{2}, & i=0,n.
\end{cases} w i = { 1 , 2 1 , i = 1 , … , n − 1 , i = 0 , n . Putting everything together, the resulting formula is
∫ a b f ( x ) d x ≈ T f ( n ) = h [ 1 2 f ( t 0 ) + f ( t 1 ) + f ( t 2 ) + ⋯ + f ( t n − 1 ) + 1 2 f ( t n ) ] . \begin{split}
\int_a^b f(x)\, dx \approx T_f(n) &= h\left[
\frac{1}{2}f(t_0) + f(t_1) + f(t_2) + \cdots + f(t_{n-1}) +
\frac{1}{2}f(t_n) \right].
\end{split} ∫ a b f ( x ) d x ≈ T f ( n ) = h [ 2 1 f ( t 0 ) + f ( t 1 ) + f ( t 2 ) + ⋯ + f ( t n − 1 ) + 2 1 f ( t n ) ] . Geometrically, as illustrated in Figure 5.6.1 , the trapezoid formula sums of the areas of trapezoids approximating the region under the curve y = f ( x ) y=f(x) y = f ( x ) .[1]
The trapezoid formula is the Swiss Army knife of integration formulas. A short implementation is given as Function 5.6.1 .
Figure 5.6.1: Trapezoid formula for integration. The piecewise linear interpolant defines trapezoids that approximate the region under the curve.
Trapezoid formula for numerical integration1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
"""
trapezoid(f, a, b, n)
Apply the trapezoid integration formula for integrand `f` over
interval [`a`,`b`], broken up into `n` equal pieces. Returns
the estimate, a vector of nodes, and a vector of integrand values at the
nodes.
"""
function trapezoid(f, a, b, n)
h = (b - a) / n
t = range(a, b, length = n + 1)
y = f.(t)
T = h * (sum(y[2:n]) + 0.5 * (y[1] + y[n+1]))
return T, t, y
end
Trapezoid formula for numerical integration1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
function [T,t,y] = trapezoid(f,a,b,n)
%TRAPEZOID Trapezoid formula for numerical integration.
% Input:
% f integrand (function)
% a,b interval of integration (scalars)
% n number of interval divisions
% Output:
% T approximation to the integral of f over (a,b)
% t vector of nodes used
% y vector of function values at nodes
h = (b-a)/n;
t = a + h*(0:n)';
y = f(t);
T = h * ( sum(y(2:n)) + 0.5*(y(1) + y(n+1)) );
Trapezoid formula for numerical integration1
2
3
4
5
6
7
8
9
10
11
def trapezoid(f, a, b, n):
"""
trapezoid(f, a, b, n)
Apply the trapezoid integration formula for integrand f over interval [a,b], broken up into n equal pieces. Returns estimate, vector of nodes, and vector of integrand values at the nodes.
"""
h = (b - a) / n
t = np.linspace(a, b, n + 1)
y = f(t)
T = h * (np.sum(y[1:-1]) + 0.5 * (y[0] + y[-1]))
return T, t, y
Like finite-difference formulas, numerical integration formulas have a truncation error.
Definition 5.6.3 (Truncation error of a numerical integration formula)
For the numerical integration formula (5.6.2) , the truncation error is
τ f ( h ) = ∫ a b f ( x ) d x − h ∑ i = 0 n w i f ( t i ) . \tau_f(h) = \int_a^b f(x) \, dx - h \sum_{i=0}^{n} w_i f(t_i). τ f ( h ) = ∫ a b f ( x ) d x − h i = 0 ∑ n w i f ( t i ) . The order of accuracy is as defined in Definition 5.5.2 .
In Theorem 5.2.2 we stated that the pointwise error in a piecewise linear interpolant with equal node spacing h h h is bounded by O ( h 2 ) O(h^2) O ( h 2 ) as h → 0 h\rightarrow 0 h → 0 . Using I I I to stand for the exact integral of f f f and p p p to stand for the piecewise linear interpolant, we obtain
I − T f ( n ) = I − ∫ a b p ( x ) d x = ∫ a b [ f ( x ) − p ( x ) ] d x ≤ ( b − a ) max x ∈ [ a , b ] ∣ f ( x ) − p ( x ) ∣ = O ( h 2 ) . \begin{split}
I - T_f(n) = I - \int_a^b p(x)\, dx &= \int_a^b \bigl[f(x)-p(x)\bigr] \, dx \\
&\le (b-a) \max_{x\in[a,b]} |f(x)-p(x)| = O(h^2).
\end{split} I − T f ( n ) = I − ∫ a b p ( x ) d x = ∫ a b [ f ( x ) − p ( x ) ] d x ≤ ( b − a ) x ∈ [ a , b ] max ∣ f ( x ) − p ( x ) ∣ = O ( h 2 ) . A more thorough statement of the truncation error is known as the Euler–Maclaurin formula ,
∫ a b f ( x ) d x = T f ( n ) − h 2 12 [ f ′ ( b ) − f ′ ( a ) ] + h 4 740 [ f ′ ′ ′ ( b ) − f ′ ′ ′ ( a ) ] + O ( h 6 ) = T f ( n ) − ∑ k = 1 ∞ B 2 k h 2 k ( 2 k ) ! [ f ( 2 k − 1 ) ( b ) − f ( 2 k − 1 ) ( a ) ] , \begin{split}
\int_a^b f(x)\, dx &= T_f(n) - \frac{h^2}{12} \left[ f'(b)-f'(a) \right] + \frac{h^4}{740} \left[ f'''(b)-f'''(a) \right] + O(h^6) \\
&= T_f(n) - \sum_{k=1}^\infty \frac{B_{2k}h^{2k}}{(2k)!} \left[ f^{(2k-1)}(b)-f^{(2k-1)}(a) \right],
\end{split} ∫ a b f ( x ) d x = T f ( n ) − 12 h 2 [ f ′ ( b ) − f ′ ( a ) ] + 740 h 4 [ f ′′′ ( b ) − f ′′′ ( a ) ] + O ( h 6 ) = T f ( n ) − k = 1 ∑ ∞ ( 2 k )! B 2 k h 2 k [ f ( 2 k − 1 ) ( b ) − f ( 2 k − 1 ) ( a ) ] , where the B 2 k B_{2k} B 2 k are constants known as Bernoulli numbers . Unless we happen to be fortunate enough to have a function with f ′ ( b ) = f ′ ( a ) f'(b)=f'(a) f ′ ( b ) = f ′ ( a ) , we should expect truncation error at second order and no better.
Example 5.6.2 (Trapezoid integration)
Example 5.6.2 We will approximate the integral of the function f ( x ) = e sin 7 x f(x)=e^{\sin 7x} f ( x ) = e s i n 7 x over the interval [ 0 , 2 ] [0,2] [ 0 , 2 ] .
f = x -> exp(sin(7 * x));
a = 0;
b = 2;
In lieu of the exact value, we use the QuadGK
package to find an accurate result.
Tip
If a function has multiple return values, you can use an underscore _
to indicate a return value you want to ignore.
Q, _ = quadgk(f, a, b, atol=1e-14, rtol=1e-14);
println("Integral = $Q")
Integral = 2.6632197827615394
Here is the trapezoid result at n = 40 n=40 n = 40 , and its error.
T, t, y = FNC.trapezoid(f, a, b, 40)
@show (T, Q - T);
(T, Q - T) = (2.662302935602287, 0.0009168471592522209)
In order to check the order of accuracy, we increase n n n by orders of magnitude and observe how the error decreases.
n = [10^n for n in 1:5]
err = zeros(length(n))
for (k, n) in enumerate(n)
T, t, y = FNC.trapezoid(f, a, b, n)
err[k] = Q - T
end
@pt :header=["n", "error"] [n err]
Each increase by a factor of 10 in n n n cuts the error by a factor of about 100, which is consistent with second-order convergence. Another check is that a log-log graph should give a line of slope -2 as n → ∞ n\to\infty n → ∞ .
plot(n, abs.(err);
m=:o, label="results",
xaxis=(:log10, L"n"), yaxis=(:log10, "error"),
title="Convergence of trapezoidal integration")
# Add line for perfect 2nd order.
plot!(n, 3e-3 * (n / n[1]) .^ (-2), l=:dash, label=L"O(n^{-2})")
Example 5.6.2 We will approximate the integral of the function f ( x ) = e sin 7 x f(x)=e^{\sin 7x} f ( x ) = e s i n 7 x over the interval [ 0 , 2 ] [0,2] [ 0 , 2 ] .
f = @(x) exp(sin(7 * x));
a = 0; b = 2;
In lieu of the exact value, we use the integral
function to find an accurate result.
I = integral(f, a, b, abstol=1e-14, reltol=1e-14);
fprintf("Integral = %.15f", I)
Here is the trapezoid result at n = 40 n=40 n = 40 , and its error.
T = trapezoid(f, a, b, 40);
fprintf("Trapezoid error = %.2e", I - T)
In order to check the order of accuracy, we increase n n n by orders of magnitude and observe how the error decreases.
n = 10 .^ (1:5)';
err = zeros(size(n));
for i = 1:length(n)
T = trapezoid(f, a, b, n(i));
err(i) = I - T;
end
table(n, err, variableNames=["n", "Trapezoid error"])
Each increase by a factor of 10 in n n n cuts the error by a factor of about 100, which is consistent with second-order convergence. Another check is that a log-log graph should give a line of slope -2 as n → ∞ n\to\infty n → ∞ .
clf
loglog(n, abs(err), "-o", displayname="trapezoid")
hold on
loglog(n, 0.1 * abs(err(end)) * (n / n(end)).^(-2), "k--", displayname="O(n^{-2})")
xlabel("n"); ylabel("error")
title("Convergence of trapezoidal integration")
legend();
Example 5.6.2 We will approximate the integral of the function f ( x ) = e sin 7 x f(x)=e^{\sin 7x} f ( x ) = e s i n 7 x over the interval [ 0 , 2 ] [0,2] [ 0 , 2 ] .
f = lambda x: exp(sin(7 * x))
a, b = 0, 2
In lieu of the exact value, we will use the quad
function to find an accurate result.
from scipy.integrate import quad
I, errest = quad(f, a, b, epsabs=1e-13, epsrel=1e-13)
print(f"Integral = {I:.14f}")
Integral = 2.66321978276154
Here is the trapezoid result at n = 40 n=40 n = 40 , and its error.
T, t, y = FNC.trapezoid(f, a, b, 40)
print(f"Trapezoid estimate is {T:.14f} with error {I - T:.2e}")
Trapezoid estimate is 2.66230293560229 with error 9.17e-04
In order to check the order of accuracy, we increase n n n by orders of magnitude and observe how the error decreases.
n_ = 40 * 2 ** arange(6)
err = zeros(size(n_))
print(" n error")
for k, n in enumerate(n_):
T, t, y = FNC.trapezoid(f, a, b, n)
err[k] = I - T
print(f"{n:6d} {err[k]:8.3e} ")
n error
40 9.168e-04
80 2.301e-04
160 5.757e-05
320 1.440e-05
640 3.599e-06
1280 8.998e-07
Each increase by a factor of 10 in n n n cuts the error by a factor of about 100, which is consistent with second-order convergence. Another check is that a log-log graph should give a line of slope -2 as n → ∞ n\to\infty n → ∞ .
loglog(n_, abs(err), "-o", label="results")
loglog(n_, 3e-3 * (n_ / n_[0]) ** (-2), "--", label="2nd order")
gca().invert_xaxis()
xlabel("$n$")
ylabel("error")
legend()
title("Convergence of trapezoidal integration");
If evaluations of f f f are computationally expensive, we want to get as much accuracy as possible from them by using a higher-order formula. There are many routes for doing so; for example, we could integrate a not-a-knot cubic spline interpolant. However, splines are difficult to compute by hand, and as a result different methods were developed before computers came on the scene.
Knowing the structure of the error allows the use of extrapolation to improve accuracy. Suppose a quantity A 0 A_0 A 0 is approximated by an algorithm A ( h ) A(h) A ( h ) with an
error expansion
Crucially, it is not necessary to know the values of the error constants c k c_k c k , merely that they exist and are independent of h h h .
Using I I I for the exact integral of f f f , the trapezoid formula has
I = T f ( n ) + c 2 h 2 + c 4 h 4 + ⋯ , I = T_f(n) + c_2 h^2 + c_4 h^{4} + \cdots, I = T f ( n ) + c 2 h 2 + c 4 h 4 + ⋯ , as proved by the Euler–Maclaurin formula (5.6.9) . The error constants depend on f f f and can’t be evaluated in general, but we know that this expansion holds. For convenience we recast the error expansion in terms of n = O ( h − 1 ) n=O(h^{-1}) n = O ( h − 1 ) :
I = T f ( n ) + c 2 n − 2 + c 4 n − 4 + ⋯ . I = T_f(n) + c_2 n^{-2} + c_4 n^{-4} + \cdots. I = T f ( n ) + c 2 n − 2 + c 4 n − 4 + ⋯ . We now make the simple observation that
I = T f ( 2 n ) + 1 4 c 2 n − 2 + 1 16 c 4 n − 4 + ⋯ . I = T_f(2n) + \tfrac{1}{4} c_2 n^{-2} + \tfrac{1}{16} c_4 n^{-4} + \cdots. I = T f ( 2 n ) + 4 1 c 2 n − 2 + 16 1 c 4 n − 4 + ⋯ . It follows that if we combine (5.6.12) and (5.6.13) correctly, we can cancel out the second-order term in the error. Specifically, define
S f ( 2 n ) = 1 3 [ 4 T f ( 2 n ) − T f ( n ) ] . S_f(2n) = \frac{1}{3} \Bigl[ 4 T_f(2n) - T_f(n) \Bigr]. S f ( 2 n ) = 3 1 [ 4 T f ( 2 n ) − T f ( n ) ] . (We associate 2 n 2n 2 n rather than n n n with the extrapolated result because of the total number of nodes needed.) Then
The formula (5.6.14) is called Simpson’s formula , or Simpson’s rule . A different presentation and derivation are considered in Exercise 4 .
Equation (5.6.15) is another particular error expansion in the form (5.6.10) , so we can extrapolate again! The details change only a little. Considering that
I = S f ( 4 n ) = 1 16 b 4 n − 4 + 1 64 b 6 n − 6 + ⋯ , I = S_f(4n) = \tfrac{1}{16} b_4 n^{-4} + \tfrac{1}{64} b_6 n^{-6} + \cdots, I = S f ( 4 n ) = 16 1 b 4 n − 4 + 64 1 b 6 n − 6 + ⋯ , the proper combination this time is
R f ( 4 n ) = 1 15 [ 16 S f ( 4 n ) − S f ( 2 n ) ] , R_f(4n) = \frac{1}{15} \Bigl[ 16 S_f(4n) - S_f(2n) \Bigr], R f ( 4 n ) = 15 1 [ 16 S f ( 4 n ) − S f ( 2 n ) ] , which is sixth-order accurate. Clearly the process can be repeated to get eighth-order accuracy and beyond. Doing so goes by the name of Romberg integration , which we will not present in full generality.
5.6.3 Node doubling ¶ Note in (5.6.17) that R f ( 4 n ) R_f(4n) R f ( 4 n ) depends on S f ( 2 n ) S_f(2n) S f ( 2 n ) and S f ( 4 n ) S_f(4n) S f ( 4 n ) , which in turn depend on T f ( n ) T_f(n) T f ( n ) , T f ( 2 n ) T_f(2n) T f ( 2 n ) , and T f ( 4 n ) T_f(4n) T f ( 4 n ) . There is a useful benefit realized by doubling of the nodes in each application of the trapezoid formula. As shown in Figure 5.6.2 , when doubling n n n , only about half of the nodes are new ones, and previously computed function values at the other nodes can be reused.
Figure 5.6.2: Dividing the node spacing by half introduces new nodes only at midpoints, allowing the function values at existing nodes to be reused for extrapolation.
Specifically, we have
T f ( 2 m ) = 1 2 m [ 1 2 f ( a ) + 1 2 f ( b ) + ∑ i = 1 2 m − 1 f ( a + i 2 m ) ] = 1 2 m [ 1 2 f ( a ) + 1 2 f ( b ) ] + 1 2 m ∑ k = 1 m − 1 f ( a + 2 k 2 m ) + 1 2 m ∑ k = 1 m f ( a + 2 k − 1 2 m ) = 1 2 m [ 1 2 f ( a ) + 1 2 f ( b ) + ∑ k = 1 m − 1 f ( a + k m ) ] + 1 2 m ∑ k = 1 m f ( a + 2 k − 1 2 m ) = 1 2 T f ( m ) + 1 2 m ∑ k = 1 m − 1 f ( t 2 k − 1 ) , \begin{split}
T_f(2m) & = \frac{1}{2m} \left[ \frac{1}{2} f(a) + \frac{1}{2} f(b) + \sum_{i=1}^{2m-1} f\Bigl( a + \frac{i}{2m} \Bigr) \right]\\[1mm]
& = \frac{1}{2m} \left[ \frac{1}{2} f(a) + \frac{1}{2} f(b)\right] + \frac{1}{2m} \sum_{k=1}^{m-1} f\Bigl( a+\frac{2k}{2m} \Bigr) + \frac{1}{2m} \sum_{k=1}^{m} f\Bigl( a+\frac{2k-1}{2m} \Bigr) \\[1mm]
&= \frac{1}{2m} \left[ \frac{1}{2} f(a) + \frac{1}{2} f(b) + \sum_{k=1}^{m-1} f\Bigl( a+\frac{k}{m} \Bigr) \right] + \frac{1}{2m} \sum_{k=1}^{m} f\Bigl( a+\frac{2k-1}{2m} \Bigr) \\[1mm]
&= \frac{1}{2} T_f(m) + \frac{1}{2m} \sum_{k=1}^{m-1} f\left(t_{2k-1} \right),
\end{split} T f ( 2 m ) = 2 m 1 [ 2 1 f ( a ) + 2 1 f ( b ) + i = 1 ∑ 2 m − 1 f ( a + 2 m i ) ] = 2 m 1 [ 2 1 f ( a ) + 2 1 f ( b ) ] + 2 m 1 k = 1 ∑ m − 1 f ( a + 2 m 2 k ) + 2 m 1 k = 1 ∑ m f ( a + 2 m 2 k − 1 ) = 2 m 1 [ 2 1 f ( a ) + 2 1 f ( b ) + k = 1 ∑ m − 1 f ( a + m k ) ] + 2 m 1 k = 1 ∑ m f ( a + 2 m 2 k − 1 ) = 2 1 T f ( m ) + 2 m 1 k = 1 ∑ m − 1 f ( t 2 k − 1 ) , where the nodes referenced in the last line are relative to n = 2 m n=2m n = 2 m . Hence in passing from n = m n=m n = m to n = 2 m n=2m n = 2 m , new integrand evaluations are needed only at the odd-numbered nodes of the finer grid.
Example 5.6.3 (Integration by extrapolation)
Example 5.6.3 We estimate ∫ 0 2 x 2 e − 2 x d x \displaystyle\int_0^2 x^2 e^{-2x}\, dx ∫ 0 2 x 2 e − 2 x d x using extrapolation. First we use quadgk
to get an accurate value.
f = x -> x^2 * exp(-2x);
a = 0;
b = 2;
Q, _ = quadgk(f, a, b, atol=1e-14, rtol=1e-14)
@show Q;
We start with the trapezoid formula on n = N n=N n = N nodes.
N = 20; # the coarsest formula
n = N;
h = (b - a) / n;
t = h * (0:n);
y = f.(t);
We can now apply weights to get the estimate T f ( N ) T_f(N) T f ( N ) .
T = [h * (sum(y[2:n]) + y[1] / 2 + y[n+1] / 2)]
1-element Vector{Float64}:
0.19041144993926784
Now we double to n = 2 N n=2N n = 2 N , but we only need to evaluate f f f at every other interior node and apply (5.6.18) .
n = 2n;
h = h / 2;
t = h * (0:n);
T = [T; T[end] / 2 + h * sum(f.(t[2:2:n]))]
2-element Vector{Float64}:
0.19041144993926784
0.19045880585951175
We can repeat the same code to double n n n again.
n = 2n;
h = h / 2;
t = h * (0:n);
T = [T; T[end] / 2 + h * sum(f.(t[2:2:n]))]
3-element Vector{Float64}:
0.19041144993926784
0.19045880585951175
0.1904703513046443
Let us now do the first level of extrapolation to get results from Simpson’s formula. We combine the elements T[i]
and T[i+1]
the same way for i = 1 i=1 i = 1 and i = 2 i=2 i = 2 .
S = [(4T[i+1] - T[i]) / 3 for i in 1:2]
2-element Vector{Float64}:
0.19047459116625973
0.19047419978635513
With the two Simpson values S f ( N ) S_f(N) S f ( N ) and S f ( 2 N ) S_f(2N) S f ( 2 N ) in hand, we can do one more level of extrapolation to get a sixth-order accurate result.
We can make a triangular table of the errors:
Tip
The value nothing
equals nothing except nothing
.
err = [T .- Q [nothing; S .- Q] [nothing; nothing; R - Q]]
@pt :header=["order 2", "order 4", "order 6"] err
If we consider the computational time to be dominated by evaluations of f f f , then we have obtained a result with about twice as many accurate digits as the best trapezoid result, at virtually no extra cost.
Example 5.6.3 We estimate ∫ 0 2 x 2 e − 2 x d x \displaystyle\int_0^2 x^2 e^{-2x}\, dx ∫ 0 2 x 2 e − 2 x d x using extrapolation. First we use quadgk
to get an accurate value.
f = @(x) x.^2 .* exp(-2 * x);
a = 0; b = 2;
format long
I = integral(f, a, b, abstol=1e-14, reltol=1e-14)
We start with the trapezoid formula on n = N n=N n = N nodes.
N = 20; % the coarsest formula
n = N; h = (b - a) / n;
t = h * (0:n)';
y = f(t);
We can now apply weights to get the estimate T f ( N ) T_f(N) T f ( N ) .
T = h * ( sum(y(2:n)) + y(1) / 2 + y(n+1) / 2 )
Now we double to n = 2 N n=2N n = 2 N , but we only need to evaluate f f f at every other interior node and apply (5.6.18) .
n = 2*n; h = h / 2;
t = h * (0:n)';
T(2) = T(1) / 2 + h * sum( f(t(2:2:n)) )
We can repeat the same code to double n n n again.
n = 2*n; h = h / 2;
t = h * (0:n)';
T(3) = T(2) / 2 + h * sum( f(t(2:2:n)) )
Let us now do the first level of extrapolation to get results from Simpson’s formula. We combine the elements T[i]
and T[i+1]
the same way for i = 1 i=1 i = 1 and i = 2 i=2 i = 2 .
S = (4 * T(2:3) - T(1:2)) / 3
With the two Simpson values S f ( N ) S_f(N) S f ( N ) and S f ( 2 N ) S_f(2N) S f ( 2 N ) in hand, we can do one more level of extrapolation to get a sixth-order accurate result.
R = (16*S(2) - S(1)) / 15
We can make a triangular table of the errors:
err2 = T(:) - I;
err4 = [NaN; S(:) - I];
err6 = [NaN; NaN; R - I];
format short e
table(err2, err4, err6, variablenames=["order 2", "order 4", "order 6"])
If we consider the computational time to be dominated by evaluations of f f f , then we have obtained a result with about twice as many accurate digits as the best trapezoid result, at virtually no extra cost.
Example 5.6.3 We estimate ∫ 0 2 x 2 e − 2 x d x \displaystyle\int_0^2 x^2 e^{-2x}\, dx ∫ 0 2 x 2 e − 2 x d x using extrapolation. First we use quadgk
to get an accurate value.
from scipy.integrate import quad
f = lambda x: x**2 * exp(-2 * x)
a = 0
b = 2
I, errest = quad(f, a, b, epsabs=1e-13, epsrel=1e-13)
print(f"Integral = {I:.14f}")
Integral = 0.19047417361161
We start with the trapezoid formula on n = N n=N n = N nodes.
N = 20 # the coarsest formula
n = N
h = (b - a) / n
t = h * arange(n + 1)
y = f(t)
We can now apply weights to get the estimate T f ( N ) T_f(N) T f ( N ) .
T = zeros(3)
T[0] = h * (sum(y[1:-1]) + y[0] / 2 + y[-1] / 2)
print(f"error (2nd order): {I - T[0]:.2e}")
error (2nd order): 6.27e-05
Now we double to n = 2 N n=2N n = 2 N , but we only need to evaluate f f f at every other interior node and apply (5.6.18) .
n = 2 * n
h = h / 2
t = h * arange(n + 1)
T[1] = T[0] / 2 + h * sum(f(t[1:-1:2]))
print("error (2nd order):", I - T[:2])
error (2nd order): [6.27236723e-05 1.53677521e-05]
As expected for a second-order estimate, the error went down by a factor of about 4. We can repeat the same code to double n n n again.
n = 2 * n
h = h / 2
t = h * arange(n + 1)
T[2] = T[1] / 2 + h * sum(f(t[1:-1:2]))
print("error (2nd order):", I - T[:3])
error (2nd order): [6.27236723e-05 1.53677521e-05 3.82230697e-06]
Let us now do the first level of extrapolation to get results from Simpson’s formula. We combine the elements T[i]
and T[i+1]
the same way for i = 1 i=1 i = 1 and i = 2 i=2 i = 2 .
S = array([(4 * T[i + 1] - T[i]) / 3 for i in range(2)])
print("error (4th order):", I - S)
error (4th order): [-4.17554646e-07 -2.61747412e-08]
With the two Simpson values S f ( N ) S_f(N) S f ( N ) and S f ( 2 N ) S_f(2N) S f ( 2 N ) in hand, we can do one more level of extrapolation to get a sixth-order accurate result.
R = (16 * S[1] - S[0]) / 15
print("error (6th order):", I - R)
error (6th order): -8.274758656057202e-11
We can make a triangular table of the errors:
err = nan * ones((3, 3))
err[0, :] = I - T
err[1, 1:] = I - S
err[2, 2] = I - R
results = PrettyTable(["2nd order", "4th order", "6th order"])
results.add_rows(err.T)
print(results)
+------------------------+-------------------------+------------------------+
| 2nd order | 4th order | 6th order |
+------------------------+-------------------------+------------------------+
| 6.272367234608223e-05 | nan | nan |
| 1.5367752102174448e-05 | -4.1755464580406354e-07 | nan |
| 3.822306969658573e-06 | -2.6174741207807273e-08 | -8.274758656057202e-11 |
+------------------------+-------------------------+------------------------+
If we consider the computational time to be dominated by evaluations of f f f , then we have obtained a result with about twice as many accurate digits as the best trapezoid result, at virtually no extra cost.
5.6.4 Exercises ¶ ⌨ For each integral below, use Function 5.6.1 to estimate the integral for n = 10 ⋅ 2 k n=10\cdot 2^k n = 10 ⋅ 2 k nodes for k = 1 , 2 , … , 10 k=1,2,\ldots,10 k = 1 , 2 , … , 10 . Make a log-log plot of the errors and confirm or refute second-order accuracy. (These integrals were taken from Bailey et al. (2005) .)
(a) ∫ 0 1 x log ( 1 + x ) d x = 1 4 \displaystyle \int_0^1 x\log(1+x)\, dx = \frac{1}{4} ∫ 0 1 x log ( 1 + x ) d x = 4 1
(b) ∫ 0 1 x 2 tan − 1 x d x = π − 2 + 2 log 2 12 \displaystyle \int_0^1 x^2 \tan^{-1}x\, dx = \frac{\pi-2+2\log 2}{12} ∫ 0 1 x 2 tan − 1 x d x = 12 π − 2 + 2 log 2
(c) ∫ 0 π / 2 e x cos x d x = e π / 2 − 1 2 \displaystyle \int_0^{\pi/2}e^x \cos x\, dx = \frac{e^{\pi/2}-1}{2} ∫ 0 π /2 e x cos x d x = 2 e π /2 − 1
(d) ∫ 0 1 x log ( x ) d x = − 4 9 \displaystyle \int_0^1 \sqrt{x} \log(x) \, dx = -\frac{4}{9} ∫ 0 1 x log ( x ) d x = − 9 4 (Note: Although the integrand has the limiting value zero as x → 0 x\to 0 x → 0 , it cannot be evaluated naively at x = 0 x=0 x = 0 . You can start the integral at x = ϵ mach x=\macheps x = ϵ mach instead.)
(e) ∫ 0 1 1 − x 2 d x = π 4 \displaystyle \int_0^1 \sqrt{1-x^2}\,\, dx = \frac{\pi}{4} ∫ 0 1 1 − x 2 d x = 4 π
✍ The Euler–Maclaurin error expansion (5.6.9) for the trapezoid formula implies that if we could cancel out the term due to f ′ ( b ) − f ′ ( a ) f'(b)-f'(a) f ′ ( b ) − f ′ ( a ) , we would obtain fourth-order accuracy. We should not assume that f ′ f' f ′ is available, but approximating it with finite differences can achieve the same goal. Suppose the forward difference formula (5.4.13) is used for f ′ ( a ) f'(a) f ′ ( a ) , and its reflected backward difference is used for f ′ ( b ) f'(b) f ′ ( b ) . Show that the resulting modified trapezoid formula is
G f ( h ) = T f ( h ) − h 24 [ 3 ( f ( t n ) + f ( t 0 ) ) − 4 ( f ( t n − 1 ) + f ( t 1 ) ) + ( f ( t n − 2 ) + f ( t 2 ) ) ] , G_f(h) = T_f(h) - \frac{h}{24} \left[ 3\Bigl( f(t_n)+f(t_0) \Bigr) -4\Bigr( f(t_{n-1}) + f(t_1) \Bigr) + \Bigl( f(t_{n-2})+f(t_2) \Bigr) \right], G f ( h ) = T f ( h ) − 24 h [ 3 ( f ( t n ) + f ( t 0 ) ) − 4 ( f ( t n − 1 ) + f ( t 1 ) ) + ( f ( t n − 2 ) + f ( t 2 ) ) ] , which is known as a Gregory integration formula .
⌨ Repeat each integral in Exercise 1 above using Gregory integration (5.6.19) instead of the trapezoid formula. Compare the observed errors to fourth-order convergence.
✍ Simpson’s formula can be derived without appealing to extrapolation.
(a) Show that
p ( x ) = β + γ − α 2 h x + α − 2 β + γ 2 h 2 x 2 p(x) = \beta + \frac{\gamma-\alpha}{2h}\, x + \frac{\alpha-2\beta+\gamma}{2h^2}\, x^2 p ( x ) = β + 2 h γ − α x + 2 h 2 α − 2 β + γ x 2 interpolates the three points ( − h , α ) (-h,\alpha) ( − h , α ) , ( 0 , β ) (0,\beta) ( 0 , β ) , and ( h , γ ) (h,\gamma) ( h , γ ) .
(b) Find
∫ − h h p ( s ) d s , \int_{-h}^h p(s)\, ds, ∫ − h h p ( s ) d s , where p p p is the quadratic polynomial from part (a), in terms of h h h , α, β, and γ.
(c) Assume equally spaced nodes in the form t i = a + i h t_i=a+ih t i = a + ih , for h = ( b − a ) / n h=(b-a)/n h = ( b − a ) / n and i = 0 , … , n i=0,\ldots,n i = 0 , … , n . Suppose f f f is approximated by p ( x ) p(x) p ( x ) over the subinterval [ t i − 1 , t i + 1 ] [t_{i-1},t_{i+1}] [ t i − 1 , t i + 1 ] . Apply the result from part (b) to find
∫ t i − 1 t i + 1 f ( x ) d x ≈ h 3 [ f ( t i − 1 ) + 4 f ( t i ) + f ( t i + 1 ) ] . \int_{t_{i-1}}^{t_{i+1}} f(x)\, dx \approx \frac{h}{3} \bigl[ f(t_{i-1}) + 4f(t_i) + f(t_{i+1}) \bigr]. ∫ t i − 1 t i + 1 f ( x ) d x ≈ 3 h [ f ( t i − 1 ) + 4 f ( t i ) + f ( t i + 1 ) ] . (Use the change of variable s = x − t i s=x-t_i s = x − t i .)
(d) Now also assume that n = 2 m n=2m n = 2 m for an integer m m m . Derive Simpson’s formula,
∫ a b f ( x ) d x ≈ h 3 [ f ( t 0 ) + 4 f ( t 1 ) + 2 f ( t 2 ) + 4 f ( t 3 ) + 2 f ( t 4 ) + ⋯ + 2 f ( t n − 2 ) + 4 f ( t n − 1 ) + f ( t n ) ] . \begin{split}
\int_a^b f(x)\, dx \approx \frac{h}{3}\bigl[ &f(t_0) + 4f(t_1) + 2f(t_2) + 4f(t_3) + 2f(t_4) + \cdots\\
&+ 2f(t_{n-2}) + 4f(t_{n-1}) + f(t_n) \bigr].
\end{split} ∫ a b f ( x ) d x ≈ 3 h [ f ( t 0 ) + 4 f ( t 1 ) + 2 f ( t 2 ) + 4 f ( t 3 ) + 2 f ( t 4 ) + ⋯ + 2 f ( t n − 2 ) + 4 f ( t n − 1 ) + f ( t n ) ] . ✍ Show that the Simpson formula (5.6.23) is equivalent to S f ( n / 2 ) S_f(n/2) S f ( n /2 ) , given the definition of S f S_f S f in (5.6.14) .
⌨ For each integral in Exercise 1 above, apply the Simpson formula (5.6.23) and compare the errors to fourth-order convergence.
⌨ For n = 10 , 20 , 30 , … , 200 n=10,20,30,\ldots,200 n = 10 , 20 , 30 , … , 200 , compute the trapezoidal approximation to
∫ 0 1 1 2.01 + sin ( 6 π x ) − cos ( 2 π x ) d x ≈ 0.9300357672424684. \int_{0}^1 \frac{1}{2.01+\sin (6\pi x)-\cos(2\pi x)} \,d x \approx 0.9300357672424684. ∫ 0 1 2.01 + sin ( 6 π x ) − cos ( 2 π x ) 1 d x ≈ 0.9300357672424684. Make two separate plots of the absolute error as a function of n n n , one using a log-log scale and the other using log-linear. The graphs suggest that the error asymptotically behaves as C α n C \alpha^n C α n for some C > 0 C>0 C > 0 and some 0 < α < 1 0<\alpha<1 0 < α < 1 . How does this result relate to (5.6.9) ?
⌨ For each integral in Exercise 1 above, extrapolate the trapezoidal results two levels to get sixth-order accurate results, and compute the errors for each value.
✍ Find a formula like (5.6.17) that extrapolates two values of R f R_f R f to obtain an eighth-order accurate one.