The derivation and analysis of methods for initial-value problems usually assumes a fixed step size $h$. While the error behavior $O(h^p)$ is guaranteed by Theorem 6.2.1 as $h\rightarrow 0$, this bound comes with an unknowable constant, and it is not very useful as a guide to the numerical value of the error at any particular value of $h$. Furthermore, as we saw in Adaptive integration for numerical integration, in many problems a fixed step size is far from the most efficient strategy.
In response we will employ the basic strategy of Adaptive integration: estimate the error and adapt the step size in order to reach an accuracy goal. Unlike the integration problem, though, the "integrand" of an IVP depends on the solution itself, so the details differ greatly.
6.5.1 Step size prediction

Suppose that, starting from a given value $\mathbf{u}_i$ and using a step size $h$, we run one step of two different RK methods simultaneously: one method with order $p$, producing $\mathbf{u}_{i+1}$, and the other method with order $p+1$, producing $\tilde{\mathbf{u}}_{i+1}$. In most circumstances, we can expect that $\tilde{\mathbf{u}}_{i+1}$ is a much better approximation to the solution than $\mathbf{u}_{i+1}$ is. So it seems reasonable to use

$$E_i(h) = |\tilde{\mathbf{u}}_{i+1} - \mathbf{u}_{i+1}|$$

as an estimate of the actual local error made by the $p$th-order method. For a vector IVP, we would use a norm rather than an absolute value.
Now we ask: looking back, what step size should we have taken to meet an error target of size $\epsilon$? Let's speculate, given the behavior of local truncation error as $h\rightarrow 0$, that $E_i(h)\approx C h^{p+1}$ for an unknown constant $C$. If we had used a step size $qh$ for some $q>0$, then we would expect

$$E_i(qh)\approx C q^{p+1}h^{p+1}.$$

Our best guess for $q$ would therefore be to set $E_i(qh)\approx \epsilon$, or

$$q \approx \left(\frac{\epsilon}{E_i}\right)^{1/(p+1)}. \qquad (6.5.3)$$

Perhaps, though, we should aim to control the contribution to global error, which is closer to $E_i(qh)/(qh)$. Then we end up with

$$q \le \left(\frac{\epsilon}{E_i}\right)^{1/p}. \qquad (6.5.4)$$

Experts have different recommendations about whether to use (6.5.3) or (6.5.4). Even though (6.5.4) appears to be more in keeping with our assumptions about global errors, modern practice seems to favor (6.5.3).
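To make the update concrete, here is a minimal Julia sketch of the step-factor computation; the values of E, tol, p, and h below are hypothetical stand-ins for quantities a real solver would supply.

# Step-size prediction from an error estimate (illustrative values only).
E = 2.3e-7                            # local error estimate E_i
tol = 1e-6                            # accuracy target ϵ
p = 2                                 # order of the lower-order method
q_local = (tol / E)^(1 / (p + 1))     # per-step control, as in (6.5.3)
q_global = (tol / E)^(1 / p)          # global-error control, as in (6.5.4)
h = 0.05                              # current step size
h_next = q_local * h                  # suggested next step size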
We now have an outline of an algorithm, Algorithm 6.5.1: given a solution estimate $\mathbf{u}_i$ at $t=t_i$ and a step size $h$,

1. Produce the estimates $\mathbf{u}_{i+1}$ and $\tilde{\mathbf{u}}_{i+1}$, and use them to estimate the local error.
2. If the error is acceptable, adopt $\tilde{\mathbf{u}}_{i+1}$ as the new solution value and increment $i$.
3. Replace $h$ by $qh$, where $q$ is given by (6.5.3) or (6.5.4).

Many details remain unspecified at this point, but we first address step 1.
6.5.2 Embedded formulas

Suppose, for example, we choose to use a pair of second- and third-order RK methods to get the $\mathbf{u}_{i+1}$ and $\tilde{\mathbf{u}}_{i+1}$ needed in Algorithm 6.5.1. Then we seem to need at least $2+3=5$ evaluations of $f(t,y)$ for each attempted time step. This is more than double the computational work needed by the second-order method without adaptivity.

Fortunately, the marginal cost of adaptivity can be substantially reduced by using embedded Runge–Kutta formulas: a pair of RK methods whose stages share the same internal $f$ evaluations, combined differently in order to get estimates of two different orders of accuracy.

A good example of an embedded method is the Bogacki–Shampine (BS23) formula, given by the table
$$\begin{array}{r|cccc}
0 & & & & \\
\frac{1}{2} & \frac{1}{2} & & & \\
\frac{3}{4} & 0 & \frac{3}{4} & & \\
1 & \frac{2}{9} & \frac{1}{3} & \frac{4}{9} & \\ \hline
& \frac{2}{9} & \frac{1}{3} & \frac{4}{9} & 0 \\ \hline
& \frac{7}{24} & \frac{1}{4} & \frac{1}{3} & \frac{1}{8}
\end{array} \qquad (6.5.5)$$

The top part of the table describes four stages in the usual RK fashion. The last two rows describe how to construct a third-order estimate $\tilde{\mathbf{u}}_{i+1}$ and a second-order estimate $\mathbf{u}_{i+1}$ by taking different combinations of those stages.
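To see the sharing of stages concretely, here is a small Julia sketch of a single BS23 step on a scalar problem; the function f and the starting values are arbitrary choices for illustration. The same four stages feed both estimates.

# One BS23 step for u' = f(t,u), starting from (t, u) with step size h.
f(t, u) = -u + sin(t)                        # arbitrary test problem
t, u, h = 0.0, 1.0, 0.1
s₁ = f(t, u)
s₂ = f(t + h/2, u + (h/2) * s₁)
s₃ = f(t + 3h/4, u + (3h/4) * s₂)
u3 = u + h * (2s₁/9 + s₂/3 + 4s₃/9)          # third-order estimate
s₄ = f(t + h, u3)
u2 = u + h * (7s₁/24 + s₂/4 + s₃/3 + s₄/8)   # second-order estimate
E = abs(u3 - u2)                             # local error estimate

The pair therefore costs four evaluations of f per attempted step, not five.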
6.5.3 Implementation

Our implementation of an embedded second/third-order (RK23) code is given in Function 6.5.2.
Function 6.5.2: Adaptive IVP solver based on embedded RK formulas
"""
rk23(ivp, tol)
Apply an adaptive embedded RK formula pair to solve given IVP with
estimated error `tol`. Returns a vector of times and a vector of
solution values.
"""
function rk23(ivp, tol)
# Initialize for the first time step.
a, b = ivp.tspan
t = [a]
u = [float(ivp.u0)]
i = 1
h = 0.5 * tol^(1 / 3)
s₁ = ivp.f(ivp.u0, ivp.p, a)
# Time stepping.
while t[i] < b
# Detect underflow of the step size.
if t[i] + h == t[i]
@warn "Stepsize too small near t=$(t[i])"
break # quit time stepping loop
end
# New RK stages.
s₂ = ivp.f(u[i] + (h / 2) * s₁, ivp.p, t[i] + h / 2)
s₃ = ivp.f(u[i] + (3h / 4) * s₂, ivp.p, t[i] + 3h / 4)
u_new3 = u[i] + h * (2s₁ + 3s₂ + 4s₃) / 9 # 3rd order solution
s₄ = ivp.f(u_new3, ivp.p, t[i] + h)
err = h * (-5s₁ / 72 + s₂ / 12 + s₃ / 9 - s₄ / 8) # 2nd/3rd difference
E = norm(err, Inf) # error estimate
maxerr = tol * (1 + norm(u[i], Inf)) # relative/absolute blend
# Accept the proposed step?
if E < maxerr # yes
push!(t, t[i] + h)
push!(u, u_new3)
i += 1
s₁ = s₄ # use FSAL property
end
# Adjust step size.
q = 0.8 * (maxerr / E)^(1 / 3) # conservative optimal step factor
q = min(q, 4) # limit stepsize growth
h = min(q * h, b - t[i]) # don't step past the end
end
return t, u
end
About the code

The check t[i] + h == t[i] is there to detect when $h$ has become so small that it no longer changes the floating-point value of $t_i$. This may be a sign that the underlying exact solution has a singularity near $t=t_i$, but in any case, the solver must halt, which it does here by using a break statement to exit the loop.

In computing maxerr, we use a combination of absolute and relative tolerances to judge the acceptability of a solution value, as in (5.7.6). When adjusting the step size, we underestimate the step factor $q$ a bit and prevent a huge increase in the step size, since a rejected step is expensive, and then we make sure that our final step doesn't take us past the end of the domain.
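The effect of the blend is that the tolerance acts absolutely when the solution is small and relatively when it is large. A quick sketch of the acceptance threshold at a few representative solution magnitudes:

tol = 1e-5
for unorm in (0.0, 1.0, 1e6)          # representative values of norm(u)
    maxerr = tol * (1 + unorm)        # blended acceptance threshold
    println("norm(u) = $unorm:  maxerr = $maxerr")
end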
Finally, the statement s₁ = s₄ exploits a subtle property of the BS23 formula called first same as last (FSAL). While (6.5.5) calls for four stages to find the paired second- and third-order estimates, the final stage computed in stepping from $t_i$ to $t_{i+1}$ is identical to the first stage needed to step from $t_{i+1}$ to $t_{i+2}$. By repurposing s₄ as s₁ for the next pass, one of the stage evaluations comes for free, and only three evaluations of $f$ are needed per successful step.
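The property follows from the fact that s₄ is evaluated at exactly the point where the next step begins. A quick Julia check on an arbitrary test problem:

# FSAL check: the last stage of one step is the first stage of the next.
f(t, u) = cos(t) * u                   # arbitrary test problem
t, u, h = 0.0, 1.0, 0.2
s₁ = f(t, u)
s₂ = f(t + h/2, u + (h/2) * s₁)
s₃ = f(t + 3h/4, u + (3h/4) * s₂)
u3 = u + h * (2s₁ + 3s₂ + 4s₃) / 9     # accepted third-order value
s₄ = f(t + h, u3)
s₁_next = f(t + h, u3)                 # first stage of the step from t + h
@assert s₁_next == s₄                  # identical, so no new evaluation needed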
Function 6.5.2: Adaptive IVP solver based on embedded RK formulas
function [t, u] = rk23(ivp, a, b, tol)
% RK23 Adaptive IVP solver based on embedded RK formulas.
% Input:
%   ivp    structure defining the IVP
%   a, b   endpoints of time interval (scalars)
%   tol    global error target (positive scalar)
% Output:
%   t      selected nodes (vector, length n+1)
%   u      solution values (array, m by (n+1))
du_dt = ivp.ODEFcn;
u0 = ivp.InitialValue;
p = ivp.Parameters;

% Initialize for the first time step.
t = a;
u(:, 1) = u0(:);  i = 1;
h = 0.5 * tol^(1/3);
s1 = du_dt(t(1), u(:, 1), p);

% Time stepping.
while t(i) < b
    % Detect underflow of the step size.
    if t(i) + h == t(i)
        warning('Stepsize too small near t=%.6g.', t(i))
        break    % quit time stepping loop
    end

    % New RK stages.
    s2 = du_dt(t(i) + h/2,   u(:, i) + (h/2) * s1,   p);
    s3 = du_dt(t(i) + 3*h/4, u(:, i) + (3*h/4) * s2, p);
    unew3 = u(:, i) + h * (2*s1 + 3*s2 + 4*s3) / 9;    % 3rd order solution
    s4 = du_dt(t(i) + h, unew3, p);
    err = h * (-5*s1/72 + s2/12 + s3/9 - s4/8);        % 2nd/3rd order difference
    E = norm(err, Inf);                                % error estimate
    maxerr = tol * (1 + norm(u(:, i), Inf));           % relative/absolute blend

    % Accept the proposed step?
    if E < maxerr    % yes
        t(i+1) = t(i) + h;
        u(:, i+1) = unew3;
        i = i + 1;
        s1 = s4;     % use FSAL property
    end

    % Adjust step size.
    q = 0.8 * (maxerr/E)^(1/3);    % conservative optimal step factor
    q = min(q, 4);                 % limit stepsize growth
    h = min(q*h, b - t(i));        % don't step past the end
end
About the code

The check t(i) + h == t(i) is there to detect when $h$ has become so small that it no longer changes the floating-point value of $t_i$. This may be a sign that the underlying exact solution has a singularity near $t=t_i$, but in any case, the solver must halt, which it does here by using a break statement to exit the loop.

In computing maxerr, we use a combination of absolute and relative tolerances to judge the acceptability of a solution value, as in (5.7.6). When adjusting the step size, we underestimate the step factor $q$ a bit and prevent a huge increase in the step size, since a rejected step is expensive, and then we make sure that our final step doesn't take us past the end of the domain.

Finally, the statement s1 = s4 exploits a subtle property of the BS23 formula called first same as last (FSAL). While (6.5.5) calls for four stages to find the paired second- and third-order estimates, the final stage computed in stepping from $t_i$ to $t_{i+1}$ is identical to the first stage needed to step from $t_{i+1}$ to $t_{i+2}$. By repurposing s4 as s1 for the next pass, one of the stage evaluations comes for free, and only three evaluations of $f$ are needed per successful step.
Function 6.5.2: Adaptive IVP solver based on embedded RK formulas
import numpy as np
import warnings

def rk23(du_dt, tspan, u0, tol):
    """
    rk23(du_dt, tspan, u0, tol)
    Apply an adaptive embedded RK formula pair to solve the IVP u'=du_dt(t,u)
    over the interval tspan with u(tspan[0])=u0, with estimated error tol.
    Return a vector of times and a matrix of solution values.
    """
    # Initialize for the first time step.
    a, b = tspan
    t = [a]
    u = [np.array(u0, dtype=float)]
    i = 0
    h = 0.5 * tol ** (1 / 3)
    s1 = np.array(du_dt(t[0], u[0]))
    # Time stepping.
    while t[i] < b:
        # Detect underflow of the step size.
        if t[i] + h == t[i]:
            warnings.warn(f"Stepsize too small near t={t[i]}")
            break  # quit time stepping loop
        # New RK stages.
        s2 = np.array(du_dt(t[i] + h / 2, u[i] + (h / 2) * s1))
        s3 = np.array(du_dt(t[i] + 3 * h / 4, u[i] + (3 * h / 4) * s2))
        unew3 = u[i] + h * (2 * s1 + 3 * s2 + 4 * s3) / 9  # 3rd order solution
        s4 = np.array(du_dt(t[i] + h, unew3))
        err = h * (-5 * s1 / 72 + s2 / 12 + s3 / 9 - s4 / 8)  # 2nd/3rd difference
        E = np.linalg.norm(err, np.inf)  # error estimate
        maxerr = tol * (1 + np.linalg.norm(u[i], np.inf))  # relative/absolute blend
        # Accept the proposed step?
        if E < maxerr:  # yes
            t.append(t[i] + h)
            u.append(unew3)
            i += 1
            s1 = s4  # use FSAL property
        # Adjust step size.
        q = 0.8 * (maxerr / E) ** (1 / 3)  # conservative optimal step factor
        q = min(q, 4)  # limit stepsize growth
        h = min(q * h, b - t[i])  # don't step past the end
    return np.array(t), np.stack(u, axis=-1)
About the code

The check t[i] + h == t[i] is there to detect when $h$ has become so small that it no longer changes the floating-point value of $t_i$. This may be a sign that the underlying exact solution has a singularity near $t=t_i$, but in any case, the solver must halt, which it does here by issuing a warning and using a break statement to exit the loop.

In computing maxerr, we use a combination of absolute and relative tolerances to judge the acceptability of a solution value, as in (5.7.6). When adjusting the step size, we underestimate the step factor $q$ a bit and prevent a huge increase in the step size, since a rejected step is expensive, and then we make sure that our final step doesn't take us past the end of the domain.

Finally, the statement s1 = s4 exploits a subtle property of the BS23 formula called first same as last (FSAL). While (6.5.5) calls for four stages to find the paired second- and third-order estimates, the final stage computed in stepping from $t_i$ to $t_{i+1}$ is identical to the first stage needed to step from $t_{i+1}$ to $t_{i+2}$. By repurposing s4 as s1 for the next pass, one of the stage evaluations comes for free, and only three evaluations of $f$ are needed per successful step.
Example 6.5.1 (Adaptive step size)
Example 6.5.1 Let's run adaptive RK on $u'=e^{t-u\sin u}$.
using OrdinaryDiffEq, Plots
f(u, p, t) = exp(t - u * sin(u))
ivp = ODEProblem(f, 0, (0.0, 5.0))
t, u = FNC.rk23(ivp, 1e-5)
plot(t, u, m=2,
xlabel=L"t", ylabel=L"u(t)",
title="Adaptive IVP solution")
The solution makes a very abrupt change near $t=2.4$. The resulting time steps vary over three orders of magnitude.
Δt = diff(t)
plot(t[1:end-1], Δt;
xaxis=(L"t", (0, 5)), yaxis=(:log10, "step size"),
title="Adaptive step sizes")
If we had to run with a uniform step size to get this accuracy, it would be
println("minimum step size = $(minimum(Δt))")
minimum step size = 4.6096854609878335e-5
On the other hand, the average step size that was actually taken was
println("average step size = $(sum(Δt)/(length(t)-1))")
average step size = 0.03205128205128205
We took fewer steps by a factor of roughly 700! Even accounting for the extra stage per step and the occasional rejected step, the savings are clear.
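To quantify that claim, we can compare the number of steps actually taken with the number a uniform grid at the minimum step size would need over the same interval; a quick sketch, reusing t and Δt from above:

n_adaptive = length(t) - 1                   # steps actually taken
n_uniform = ceil(Int, 5.0 / minimum(Δt))     # steps at the minimum step size
println("step ratio = $(n_uniform / n_adaptive)")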
Example 6.5.1 Let's run adaptive RK on $u'=e^{t-u\sin u}$.
ivp = ode;
ivp.ODEFcn = @(t, u, p) exp(t - u * sin(u));
ivp.InitialValue = 0;
a = 0; b = 5;
[t, u] = rk23(ivp, a, b, 1e-5);
clf, plot(t, u)
xlabel("t"); ylabel("u(t)")
title(("Adaptive IVP solution"));
The solution makes a very abrupt change near $t=2.4$. The resulting time steps vary over three orders of magnitude.
Delta_t = diff(t);
semilogy(t(1:end-1), Delta_t)
xlabel("t"); ylabel("step size")
title(("Adaptive step sizes"));
If we had to run with a uniform step size to get this accuracy, it would be
fprintf("minimum step size = %.2e", min(Delta_t))
On the other hand, the average step size that was actually taken was
fprintf("average step size = %.2e", mean(Delta_t))
We took fewer steps by a factor of roughly 700! Even accounting for the extra stage per step and the occasional rejected step, the savings are clear.
Example 6.5.1 Let's run adaptive RK on $u'=e^{t-u\sin u}$.
f = lambda t, u: exp(t - u * sin(u))
t, u = FNC.rk23(f, [0.0, 5.0], [0.0], 1e-5)
scatter(t, u[0, :])
xlabel("$t$"), ylabel("$u(t)$")
title(("Adaptive IVP solution"));
The solution makes a very abrupt change near $t=2.4$. The resulting time steps vary over three orders of magnitude.
dt = [t[i + 1] - t[i] for i in range(t.size - 1)]
semilogy(t[:-1], dt)
xlabel("$t$"), ylabel("time step")
title(("Adaptive step sizes"));
If we had to run with a uniform step size to get this accuracy, it would be
print(f"min step size was {min(dt):.2e}")
min step size was 4.61e-05
On the other hand, the average step size that was actually taken was
print(f"mean step size was {mean(dt):.2e}")
mean step size was 3.21e-02
We took fewer steps by a factor of roughly 700! Even accounting for the extra stage per step and the occasional rejected step, the savings are clear.
Example 6.5.2 (Adaptive step size near a singularity)
Example 6.5.2 In Demo 6.1.3 we saw an IVP that appears to blow up in a finite amount of time. Because the solution increases so rapidly as it approaches the blowup, adaptive stepping is required even to get close.
f(u, p, t) = (t + u)^2
ivp = ODEProblem(f, 1, (0.0, 1.0))
t, u = FNC.rk23(ivp, 1e-5);
┌ Warning: Stepsize too small near t=0.7854087204072808
└ @ FNCFunctions ~/Documents/GitHub/FNCFunctions.jl/src/chapter06.jl:102
In fact, the failure of the adaptivity gives a decent idea of when the singularity occurs.
plot(t, u;
legend=:none,
xlabel=L"t", yaxis=(:log10, L"u(t)"),
title="Finite-time blowup")
tf = t[end]
vline!([tf], l=:dash)
annotate!(tf, 1e5, latexstring(@sprintf("t = %.6f ", tf)), :right)
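In fact, this IVP can be solved in closed form, so the estimate can be checked. Substituting $v=t+u$ gives $v'=1+v^2$ with $v(0)=1$, so $v=\tan(t+\pi/4)$, which blows up at $t=\pi/4 \approx 0.785398$. The final time reached by the solver agrees with this to several digits:

println("solver stopped at t = $tf")
println("exact blowup time  = $(pi/4)")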
Example 6.5.2 In Demo 6.1.3 we saw an IVP that appears to blow up in a finite amount of time. Because the solution increases so rapidly as it approaches the blowup, adaptive stepping is required even to get close.
ivp = ode;
ivp.ODEFcn = @(t, u, p) (t + u)^2;
ivp.InitialValue = 1;
[t, u] = rk23(ivp, 0, 1, 1e-5);
Warning: Stepsize too small near t=0.785409.
In fact, the failure of the adaptivity gives a decent idea of when the singularity occurs.
clf, semilogy(t, u)
xlabel("t"); ylabel("u(t)")
title("Adaptive solution near a singularity")
tf = t(end);
xline(tf, "linestyle", "--")
text(tf, 1e5, sprintf(" t = %.6f ", tf))
Example 6.5.2 In Demo 6.1.3 we saw an IVP that appears to blow up in a finite amount of time. Because the solution increases so rapidly as it approaches the blowup, adaptive stepping is required even to get close.
du_dt = lambda t, u: (t + u)**2
tspan = (0.0, 2.0)
u0 = [1.0]
t, u = FNC.rk23(du_dt, tspan, u0, 1e-5)
/Users/driscoll/Dropbox/Mac/Documents/GitHub/fnc/python/fncbook/fncbook/chapter06.py:90: UserWarning: Stepsize too small near t=0.785408720407281
warnings.warn(f"Stepsize too small near t={t[i]}")
In fact, the failure of the adaptivity gives a decent idea of when the singularity occurs.
semilogy(t, u[0, :])
xlabel("$t$"), ylabel("$u(t)$")
title("Finite-time blowup")
tf = t[-1]
axvline(x=tf, color='k', linestyle='--', label=f"t = {tf:.6f}")
legend();
Often the adaptively chosen steps clearly correspond to identifiable features of the solution. However, there are so-called stiff problems in which the time steps seem unreasonably small in relation to the observable behavior of the solution. These problems benefit from a particular type of solver that is considered in Implementation of multistep methods .
6.5.4 Exercises

1. ⌨ Using Function 6.5.2 with an error tolerance of $10^{-8}$, solve $y'' + (1+y')^3 y = 0$ over $0 \le t \le 4\pi$ with the indicated initial conditions. Plot $y(t)$ and $y'(t)$ as functions of $t$ and separately plot the time step size as a function of $t$.

(a) $y(0) = 0.1, \quad y'(0) = 0$

(b) $y(0) = 0.5, \quad y'(0) = 0$

(c) $y(0) = 0.75, \quad y'(0) = 0$

(d) $y(0) = 0.95, \quad y'(0) = 0$
2. ⌨ Solve the FitzHugh–Nagumo system from Exercise 4.3.6 for $I=0.05740$ using Function 6.5.2 with error tolerances $10^{-2}$, $10^{-3}$, and $10^{-4}$. (This illustrates that the error tolerance is a target, not a guarantee!)
3. ✍ Derive Equation (6.5.4) using the stated assumption about controlling global rather than local error.
4. ⌨ Solve the problem $u'=100u^2-u^3$, $u(0)=0.0002$, $0\le t \le 100$, and make plots that show both the solution and the time steps taken. The solution makes a quick transition between two nearly constant states. Does the step size selection behave the same in both states?