The derivation and analysis of methods for initial-value problems usually assumes a fixed step size $h$. While the error behavior $O(h^p)$ is guaranteed by Theorem 6.2.1 as $h\rightarrow 0$, this bound comes with an unknowable constant, and it is not very useful as a guide to the numerical value of the error at any particular value of $h$. Furthermore, as we saw in Adaptive integration for numerical integration, in many problems a fixed step size is far from the most efficient strategy.
In response we will employ the basic strategy of Adaptive integration: estimate the error and adapt the step size in order to reach an accuracy goal. Unlike the integration problem, though, the "integrand" of an IVP depends on the solution itself, so the details differ greatly.
6.5.1 Step size prediction

Suppose that, starting from a given value $\mathbf{u}_i$ and using a step size $h$, we run one step of two different RK methods simultaneously: one method with order $p$, producing $\mathbf{u}_{i+1}$, and the other method with order $p+1$, producing $\tilde{\mathbf{u}}_{i+1}$. In most circumstances, we can expect that $\tilde{\mathbf{u}}_{i+1}$ is a much better approximation to the solution than $\mathbf{u}_{i+1}$ is. So it seems reasonable to use

$$E_i(h) = |\tilde{\mathbf{u}}_{i+1} - \mathbf{u}_{i+1}|$$

as an estimate of the actual local error made by the $p$th-order method. For a vector IVP, we would use a norm rather than an absolute value.
Now we ask: looking back, what step size should we have taken to meet an error target of size $\epsilon$? Let's speculate, given the behavior of local truncation error as $h\rightarrow 0$, that $E_i(h)\approx C h^{p+1}$ for an unknown constant $C$. If we had used a step size $qh$ for some $q>0$, then we would expect

$$E_i(qh)\approx C q^{p+1}h^{p+1}.$$

Our best guess for $q$ would therefore be to set $E_i(qh)\approx \epsilon$, or

$$q \approx \left(\frac{\epsilon}{E_i}\right)^{1/(p+1)}. \qquad (6.5.3)$$

Perhaps, though, we should aim to control the contribution to global error, which is closer to $E_i(qh)/(qh)$. Then we end up with

$$q \le \left(\frac{\epsilon}{E_i}\right)^{1/p}. \qquad (6.5.4)$$

Experts have different recommendations about whether to use (6.5.3) or (6.5.4). Even though (6.5.4) appears to be more in keeping with our assumptions about global errors, modern practice seems to favor (6.5.3).
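To make the update concrete, here is a minimal Julia sketch of the step-factor computation; the values of E, tol, p, and h below are hypothetical stand-ins for quantities a real solver would supply.

# Step-size prediction from an error estimate (illustrative values only).
E = 2.3e-7                            # local error estimate E_i
tol = 1e-6                            # accuracy target ϵ
p = 2                                 # order of the lower-order method
q_local = (tol / E)^(1 / (p + 1))     # per-step control, as in (6.5.3)
q_global = (tol / E)^(1 / p)          # global-error control, as in (6.5.4)
h = 0.05                              # current step size
h_next = q_local * h                  # suggested next step size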
We now have an outline of an algorithm, Algorithm 6.5.1: given a solution estimate $\mathbf{u}_i$ at $t=t_i$ and a step size $h$,

1. Produce the estimates $\mathbf{u}_{i+1}$ and $\tilde{\mathbf{u}}_{i+1}$, and use them to estimate the local error.
2. If the error is acceptable, adopt $\tilde{\mathbf{u}}_{i+1}$ as the new solution value and increment $i$.
3. Replace $h$ by $qh$, where $q$ is given by (6.5.3) or (6.5.4).

Many details remain unspecified at this point, but we first address step 1.
6.5.2 Embedded formulas

Suppose, for example, we choose to use a pair of second- and third-order RK methods to get the $\mathbf{u}_{i+1}$ and $\tilde{\mathbf{u}}_{i+1}$ needed in Algorithm 6.5.1. Then we seem to need at least $2+3=5$ evaluations of $f(t,y)$ for each attempted time step. This is more than double the computational work needed by the second-order method without adaptivity.

Fortunately, the marginal cost of adaptivity can be substantially reduced by using embedded Runge–Kutta formulas: a pair of RK methods whose stages share the same internal $f$ evaluations, combined differently in order to get estimates of two different orders of accuracy.

A good example of an embedded method is the Bogacki–Shampine (BS23) formula, given by the table
$$\begin{array}{r|cccc}
0 & & & & \\
\frac{1}{2} & \frac{1}{2} & & & \\
\frac{3}{4} & 0 & \frac{3}{4} & & \\
1 & \frac{2}{9} & \frac{1}{3} & \frac{4}{9} & \\ \hline
& \frac{2}{9} & \frac{1}{3} & \frac{4}{9} & 0 \\ \hline
& \frac{7}{24} & \frac{1}{4} & \frac{1}{3} & \frac{1}{8}
\end{array} \qquad (6.5.5)$$

The top part of the table describes four stages in the usual RK fashion. The last two rows describe how to construct a third-order estimate $\tilde{\mathbf{u}}_{i+1}$ and a second-order estimate $\mathbf{u}_{i+1}$ by taking different combinations of those stages.
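To see the sharing of stages concretely, here is a small Julia sketch of a single BS23 step on a scalar problem; the function f and the starting values are arbitrary choices for illustration. The same four stages feed both estimates.

# One BS23 step for u' = f(t,u), starting from (t, u) with step size h.
f(t, u) = -u + sin(t)                        # arbitrary test problem
t, u, h = 0.0, 1.0, 0.1
s₁ = f(t, u)
s₂ = f(t + h/2, u + (h/2) * s₁)
s₃ = f(t + 3h/4, u + (3h/4) * s₂)
u3 = u + h * (2s₁/9 + s₂/3 + 4s₃/9)          # third-order estimate
s₄ = f(t + h, u3)
u2 = u + h * (7s₁/24 + s₂/4 + s₃/3 + s₄/8)   # second-order estimate
E = abs(u3 - u2)                             # local error estimate

The pair therefore costs four evaluations of f per attempted step, not five.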
6.5.3 Implementation

Our implementation of an embedded second/third-order (RK23) code is given in Function 6.5.2.
Function 6.5.2: Adaptive IVP solver based on embedded RK formulas
"""
rk23(ivp, tol)
Apply an adaptive embedded RK formula pair to solve given IVP with
estimated error `tol`. Returns a vector of times and a vector of
solution values.
"""
function rk23(ivp, tol)
# Initialize for the first time step.
a, b = ivp.tspan
t = [a]
u = [float(ivp.u0)]
i = 1
h = 0.5 * tol^(1 / 3)
s₁ = ivp.f(ivp.u0, ivp.p, a)
# Time stepping.
while t[i] < b
# Detect underflow of the step size.
if t[i] + h == t[i]
@warn "Stepsize too small near t=$(t[i])"
break # quit time stepping loop
end
# New RK stages.
s₂ = ivp.f(u[i] + (h / 2) * s₁, ivp.p, t[i] + h / 2)
s₃ = ivp.f(u[i] + (3h / 4) * s₂, ivp.p, t[i] + 3h / 4)
u_new3 = u[i] + h * (2s₁ + 3s₂ + 4s₃) / 9 # 3rd order solution
s₄ = ivp.f(u_new3, ivp.p, t[i] + h)
err = h * (-5s₁ / 72 + s₂ / 12 + s₃ / 9 - s₄ / 8) # 2nd/3rd difference
E = norm(err, Inf) # error estimate
maxerr = tol * (1 + norm(u[i], Inf)) # relative/absolute blend
# Accept the proposed step?
if E < maxerr # yes
push!(t, t[i] + h)
push!(u, u_new3)
i += 1
s₁ = s₄ # use FSAL property
end
# Adjust step size.
q = 0.8 * (maxerr / E)^(1 / 3) # conservative optimal step factor
q = min(q, 4) # limit stepsize growth
h = min(q * h, b - t[i]) # don't step past the end
end
return t, u
end
About the code

The check t[i] + h == t[i] is there to detect when $h$ has become so small that it no longer changes the floating-point value of $t_i$. This may be a sign that the underlying exact solution has a singularity near $t=t_i$, but in any case, the solver must halt, which it does here by using a break statement to exit the loop.

In computing maxerr, we use a combination of absolute and relative tolerances to judge the acceptability of a solution value, as in (5.7.6). When adjusting the step size, we underestimate the step factor $q$ a bit and prevent a huge increase in the step size, since a rejected step is expensive, and then we make sure that our final step doesn't take us past the end of the domain.
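The effect of the blend is that the tolerance acts absolutely when the solution is small and relatively when it is large. A quick sketch of the acceptance threshold at a few representative solution magnitudes:

tol = 1e-5
for unorm in (0.0, 1.0, 1e6)          # representative values of norm(u)
    maxerr = tol * (1 + unorm)        # blended acceptance threshold
    println("norm(u) = $unorm:  maxerr = $maxerr")
end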
Finally, the statement s₁ = s₄ exploits a subtle property of the BS23 formula called first same as last (FSAL). While (6.5.5) calls for four stages to find the paired second- and third-order estimates, the final stage computed in stepping from $t_i$ to $t_{i+1}$ is identical to the first stage needed to step from $t_{i+1}$ to $t_{i+2}$. By repurposing s₄ as s₁ for the next pass, one of the stage evaluations comes for free, and only three evaluations of $f$ are needed per successful step.
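The property follows from the fact that s₄ is evaluated at exactly the point where the next step begins. A quick Julia check on an arbitrary test problem:

# FSAL check: the last stage of one step is the first stage of the next.
f(t, u) = cos(t) * u                   # arbitrary test problem
t, u, h = 0.0, 1.0, 0.2
s₁ = f(t, u)
s₂ = f(t + h/2, u + (h/2) * s₁)
s₃ = f(t + 3h/4, u + (3h/4) * s₂)
u3 = u + h * (2s₁ + 3s₂ + 4s₃) / 9     # accepted third-order value
s₄ = f(t + h, u3)
s₁_next = f(t + h, u3)                 # first stage of the step from t + h
@assert s₁_next == s₄                  # identical, so no new evaluation needed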
Function 6.5.2: Adaptive IVP solver based on embedded RK formulas
function [t, u] = rk23(ivp, a, b, tol)
% RK23 Adaptive IVP solver based on embedded RK formulas.
% Input:
%   ivp    structure defining the IVP
%   a, b   endpoints of time interval (scalars)
%   tol    global error target (positive scalar)
% Output:
%   t      selected nodes (vector, length n+1)
%   u      solution values (array, m by (n+1))
du_dt = ivp.ODEFcn;
u0 = ivp.InitialValue;
p = ivp.Parameters;

% Initialize for the first time step.
t = a;
u(:, 1) = u0(:);  i = 1;
h = 0.5 * tol^(1/3);
s1 = du_dt(t(1), u(:, 1), p);

% Time stepping.
while t(i) < b
    % Detect underflow of the step size.
    if t(i) + h == t(i)
        warning('Stepsize too small near t=%.6g.', t(i))
        break    % quit time stepping loop
    end

    % New RK stages.
    s2 = du_dt(t(i) + h/2,   u(:, i) + (h/2) * s1,   p);
    s3 = du_dt(t(i) + 3*h/4, u(:, i) + (3*h/4) * s2, p);
    unew3 = u(:, i) + h * (2*s1 + 3*s2 + 4*s3) / 9;    % 3rd order solution
    s4 = du_dt(t(i) + h, unew3, p);
    err = h * (-5*s1/72 + s2/12 + s3/9 - s4/8);        % 2nd/3rd order difference
    E = norm(err, Inf);                                % error estimate
    maxerr = tol * (1 + norm(u(:, i), Inf));           % relative/absolute blend

    % Accept the proposed step?
    if E < maxerr    % yes
        t(i+1) = t(i) + h;
        u(:, i+1) = unew3;
        i = i + 1;
        s1 = s4;     % use FSAL property
    end

    % Adjust step size.
    q = 0.8 * (maxerr/E)^(1/3);    % conservative optimal step factor
    q = min(q, 4);                 % limit stepsize growth
    h = min(q*h, b - t(i));        % don't step past the end
end
About the code

The check t(i) + h == t(i) is there to detect when $h$ has become so small that it no longer changes the floating-point value of $t_i$. This may be a sign that the underlying exact solution has a singularity near $t=t_i$, but in any case, the solver must halt, which it does here by using a break statement to exit the loop.

In computing maxerr, we use a combination of absolute and relative tolerances to judge the acceptability of a solution value, as in (5.7.6). When adjusting the step size, we underestimate the step factor $q$ a bit and prevent a huge increase in the step size, since a rejected step is expensive, and then we make sure that our final step doesn't take us past the end of the domain.

Finally, the statement s1 = s4 exploits a subtle property of the BS23 formula called first same as last (FSAL). While (6.5.5) calls for four stages to find the paired second- and third-order estimates, the final stage computed in stepping from $t_i$ to $t_{i+1}$ is identical to the first stage needed to step from $t_{i+1}$ to $t_{i+2}$. By repurposing s4 as s1 for the next pass, one of the stage evaluations comes for free, and only three evaluations of $f$ are needed per successful step.
Function 6.5.2: Adaptive IVP solver based on embedded RK formulas
import numpy as np
import warnings

def rk23(du_dt, tspan, u0, tol):
    """
    rk23(du_dt, tspan, u0, tol)
    Apply an adaptive embedded RK formula pair to solve the IVP u'=du_dt(t,u)
    over the interval tspan with u(tspan[0])=u0, with estimated error tol.
    Return a vector of times and a matrix of solution values.
    """
    # Initialize for the first time step.
    a, b = tspan
    t = [a]
    u = [np.array(u0, dtype=float)]
    i = 0
    h = 0.5 * tol ** (1 / 3)
    s1 = np.array(du_dt(t[0], u[0]))
    # Time stepping.
    while t[i] < b:
        # Detect underflow of the step size.
        if t[i] + h == t[i]:
            warnings.warn(f"Stepsize too small near t={t[i]}")
            break  # quit time stepping loop
        # New RK stages.
        s2 = np.array(du_dt(t[i] + h / 2, u[i] + (h / 2) * s1))
        s3 = np.array(du_dt(t[i] + 3 * h / 4, u[i] + (3 * h / 4) * s2))
        unew3 = u[i] + h * (2 * s1 + 3 * s2 + 4 * s3) / 9  # 3rd order solution
        s4 = np.array(du_dt(t[i] + h, unew3))
        err = h * (-5 * s1 / 72 + s2 / 12 + s3 / 9 - s4 / 8)  # 2nd/3rd difference
        E = np.linalg.norm(err, np.inf)  # error estimate
        maxerr = tol * (1 + np.linalg.norm(u[i], np.inf))  # relative/absolute blend
        # Accept the proposed step?
        if E < maxerr:  # yes
            t.append(t[i] + h)
            u.append(unew3)
            i += 1
            s1 = s4  # use FSAL property
        # Adjust step size.
        q = 0.8 * (maxerr / E) ** (1 / 3)  # conservative optimal step factor
        q = min(q, 4)  # limit stepsize growth
        h = min(q * h, b - t[i])  # don't step past the end
    return np.array(t), np.stack(u, axis=-1)
About the code

The check t[i] + h == t[i] is there to detect when $h$ has become so small that it no longer changes the floating-point value of $t_i$. This may be a sign that the underlying exact solution has a singularity near $t=t_i$, but in any case, the solver must halt, which it does here by issuing a warning and using a break statement to exit the loop.

In computing maxerr, we use a combination of absolute and relative tolerances to judge the acceptability of a solution value, as in (5.7.6). When adjusting the step size, we underestimate the step factor $q$ a bit and prevent a huge increase in the step size, since a rejected step is expensive, and then we make sure that our final step doesn't take us past the end of the domain.

Finally, the statement s1 = s4 exploits a subtle property of the BS23 formula called first same as last (FSAL). While (6.5.5) calls for four stages to find the paired second- and third-order estimates, the final stage computed in stepping from $t_i$ to $t_{i+1}$ is identical to the first stage needed to step from $t_{i+1}$ to $t_{i+2}$. By repurposing s4 as s1 for the next pass, one of the stage evaluations comes for free, and only three evaluations of $f$ are needed per successful step.
Example 6.5.1 (Adaptive step size)
Example 6.5.1 Let's run adaptive RK on $u'=e^{t-u\sin u}$.
using OrdinaryDiffEq, Plots
f(u, p, t) = exp(t - u * sin(u))
ivp = ODEProblem(f, 0, (0.0, 5.0))
t, u = FNC.rk23(ivp, 1e-5)
plot(t, u, m=2,
xlabel=L"t", ylabel=L"u(t)",
title="Adaptive IVP solution")
The solution makes a very abrupt change near $t=2.4$. The resulting time steps vary over three orders of magnitude.
Δt = diff(t)
plot(t[1:end-1], Δt;
xaxis=(L"t", (0, 5)), yaxis=(:log10, "step size"),
title="Adaptive step sizes")
If we had to run with a uniform step size to get this accuracy, it would be
println("minimum step size = $(minimum(Δt))")
minimum step size = 4.6096854609878335e-5
On the other hand, the average step size that was actually taken was
println("average step size = $(sum(Δt)/(length(t)-1))")
average step size = 0.03205128205128205
We took fewer steps by a factor of roughly 700! Even accounting for the extra stage per step and the occasional rejected step, the savings are clear.
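To quantify that claim, we can compare the number of steps actually taken with the number a uniform grid at the minimum step size would need over the same interval; a quick sketch, reusing t and Δt from above:

n_adaptive = length(t) - 1                   # steps actually taken
n_uniform = ceil(Int, 5.0 / minimum(Δt))     # steps at the minimum step size
println("step ratio = $(n_uniform / n_adaptive)")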
Example 6.5.1 Let's run adaptive RK on $u'=e^{t-u\sin u}$.
ivp = ode;
ivp.ODEFcn = @(t, u, p) exp(t - u * sin(u));
ivp.InitialValue = 0;
a = 0; b = 5;
[t, u] = rk23(ivp, a, b, 1e-5);
clf, plot(t, u)
xlabel("t"); ylabel("u(t)")
title(("Adaptive IVP solution"));
The solution makes a very abrupt change near $t=2.4$. The resulting time steps vary over three orders of magnitude.
Delta_t = diff(t);
semilogy(t(1:end-1), Delta_t)
xlabel("t"); ylabel("step size")
title(("Adaptive step sizes"));
If we had to run with a uniform step size to get this accuracy, it would be
fprintf("minimum step size = %.2e", min(Delta_t))
On the other hand, the average step size that was actually taken was
fprintf("average step size = %.2e", mean(Delta_t))
We took fewer steps by a factor of roughly 700! Even accounting for the extra stage per step and the occasional rejected step, the savings are clear.
Example 6.5.1 Let's run adaptive RK on $u'=e^{t-u\sin u}$.
f = lambda t, u: exp(t - u * sin(u))
t, u = FNC.rk23(f, [0.0, 5.0], [0.0], 1e-5)
scatter(t, u[0, :])
xlabel("$t$"), ylabel("$u(t)$")
title(("Adaptive IVP solution"));
The solution makes a very abrupt change near $t=2.4$. The resulting time steps vary over three orders of magnitude.
dt = [t[i + 1] - t[i] for i in range(t.size - 1)]
semilogy(t[:-1], dt)
xlabel("$t$"), ylabel("time step")
title(("Adaptive step sizes"));
If we had to run with a uniform step size to get this accuracy, it would be
print(f"min step size was {min(dt):.2e}")
min step size was 4.61e-05
On the other hand, the average step size that was actually taken was
print(f"mean step size was {mean(dt):.2e}")
mean step size was 3.21e-02
We took fewer steps by a factor of roughly 700! Even accounting for the extra stage per step and the occasional rejected step, the savings are clear.
Example 6.5.2 (Adaptive step size near a singularity)
Example 6.5.2 In Demo 6.1.3 we saw an IVP that appears to blow up in a finite amount of time. Because the solution increases so rapidly as it approaches the blowup, adaptive stepping is required even to get close.
f(u, p, t) = (t + u)^2
ivp = ODEProblem(f, 1, (0.0, 1.0))
t, u = FNC.rk23(ivp, 1e-5);
┌ Warning: Stepsize too small near t=0.7854087204072808
└ @ FNCFunctions ~/Documents/GitHub/FNCFunctions.jl/src/chapter06.jl:102
In fact, the failure of the adaptivity gives a decent idea of when the singularity occurs.
plot(t, u;
legend=:none,
xlabel=L"t", yaxis=(:log10, L"u(t)"),
title="Finite-time blowup")
tf = t[end]
vline!([tf], l=:dash)
annotate!(tf, 1e5, latexstring(@sprintf("t = %.6f ", tf)), :right)
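In fact, this IVP can be solved in closed form, so the estimate can be checked. Substituting $v=t+u$ gives $v'=1+v^2$ with $v(0)=1$, so $v=\tan(t+\pi/4)$, which blows up at $t=\pi/4 \approx 0.785398$. The final time reached by the solver agrees with this to several digits:

println("solver stopped at t = $tf")
println("exact blowup time  = $(pi/4)")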
Example 6.5.2 In Demo 6.1.3 we saw an IVP that appears to blow up in a finite amount of time. Because the solution increases so rapidly as it approaches the blowup, adaptive stepping is required even to get close.
ivp = ode;
ivp.ODEFcn = @(t, u, p) (t + u)^2;
ivp.InitialValue = 1;
[t, u] = rk23(ivp, 0, 1, 1e-5);
Warning: Stepsize too small near t=0.785409.
In fact, the failure of the adaptivity gives a decent idea of when the singularity occurs.
clf, semilogy(t, u)
xlabel("t"); ylabel("u(t)")
title("Adaptive solution near a singularity")
tf = t(end);
xline(tf, "linestyle", "--")
text(tf, 1e5, sprintf(" t = %.6f ", tf))
Example 6.5.2 In Demo 6.1.3 we saw an IVP that appears to blow up in a finite amount of time. Because the solution increases so rapidly as it approaches the blowup, adaptive stepping is required even to get close.
du_dt = lambda t, u: (t + u)**2
tspan = (0.0, 2.0)
u0 = [1.0]
t, u = FNC.rk23(du_dt, tspan, u0, 1e-5)
/Users/driscoll/Dropbox/Mac/Documents/GitHub/fnc/python/fncbook/fncbook/chapter06.py:90: UserWarning: Stepsize too small near t=0.785408720407281
warnings.warn(f"Stepsize too small near t={t[i]}")
In fact, the failure of the adaptivity gives a decent idea of when the singularity occurs.
semilogy(t, u[0, :])
xlabel("$t$"), ylabel("$u(t)$")
title("Finite-time blowup")
tf = t[-1]
axvline(x=tf, color='k', linestyle='--', label=f"t = {tf:.6f}")
legend();
Often the adaptively chosen steps clearly correspond to identifiable features of the solution. However, there are so-called stiff problems in which the time steps seem unreasonably small in relation to the observable behavior of the solution. These problems benefit from a particular type of solver that is considered in Implementation of multistep methods .
6.5.4 Exercises

1. ⌨ Using Function 6.5.2 with an error tolerance of $10^{-8}$, solve $y'' + (1+y')^3 y = 0$ over $0 \le t \le 4\pi$ with the indicated initial conditions. Plot $y(t)$ and $y'(t)$ as functions of $t$ and separately plot the time step size as a function of $t$.

(a) $y(0) = 0.1, \quad y'(0) = 0$

(b) $y(0) = 0.5, \quad y'(0) = 0$

(c) $y(0) = 0.75, \quad y'(0) = 0$

(d) $y(0) = 0.95, \quad y'(0) = 0$
2. ⌨ Solve the FitzHugh–Nagumo system from Exercise 4.3.6 for $I=0.05740$ using Function 6.5.2 with error tolerances $10^{-2}$, $10^{-3}$, and $10^{-4}$. (This illustrates that the error tolerance is a target, not a guarantee!)
3. ✍ Derive Equation (6.5.4) using the stated assumption about controlling global rather than local error.
4. ⌨ Solve the problem $u'=100u^2-u^3$, $u(0)=0.0002$, $0\le t \le 100$, and make plots that show both the solution and the time steps taken. The solution makes a quick transition between two nearly constant states. Does the step size selection behave the same in both states?