Adaptive Runge–Kutta

The derivation and analysis of methods for initial-value problems usually assumes a fixed step size $h$. While the error behavior $O(h^p)$ is guaranteed by the convergence theorem as $h\to 0$, this bound comes with an unknowable constant, and it is not very useful as a guide to the numerical value of the error at any particular value of $h$. Furthermore, as we saw in Adaptive integration, in many problems a fixed value of $h$ throughout $a \le t \le b$ is far from the most efficient strategy.

In response we will employ the basic strategy of Adaptive integration: adapt the time step size in order to reach an accuracy goal, as measured by an error estimate formed from computing multiple approximations. The details are quite different, however.

Error estimation

Suppose that, starting from a given value $u_i$ and using a step size $h$, we run one step of two RK methods simultaneously: one method with order $p$, producing $u_{i+1}$, and the other with order $p+1$, producing $\tilde{u}_{i+1}$. In most circumstances, we can expect that $\tilde{u}_{i+1}$ is a much better approximation to the solution than $u_{i+1}$ is. So it seems reasonable to use $E_i(h) = |\tilde{u}_{i+1} - u_{i+1}|$ (in the vector case, a norm of that difference) as an estimate of the actual local error made by the $p$th-order method.

If our goal is to keep the error less than some predetermined value $\epsilon$, we could decide to accept the new solution value if $E_i < \epsilon$ and otherwise reject it. (Even though the estimate $E_i$ is meant to go with the less accurate proposed value $u_{i+1}$, it's hard to resist the temptation to keep the more accurate value $\tilde{u}_{i+1}$ instead, and this is common in practice.)
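To make the idea concrete, here is a minimal sketch (not part of the rk23 function given later) that pairs one Euler step (order 1) with one midpoint-method step (order 2); the test problem and all names here are illustrative:

```julia
# Estimate the local error of an Euler step by comparing it with a
# midpoint-method step built from the same starting point.
f(t, u) = -u + sin(t)       # illustrative test ODE, u' = f(t,u)
t, u, h = 0.0, 1.0, 0.1

u_lo = u + h*f(t, u)                          # order-1 (Euler) proposal
u_hi = u + h*f(t + h/2, u + (h/2)*f(t, u))    # order-2 (midpoint) proposal
E = abs(u_hi - u_lo)                          # estimate of Euler's local error
accept = E < 1e-3                             # compare against a tolerance ϵ
```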

Regardless of whether $E_i < \epsilon$ and we accept the step, we now ask a question: looking back, what step size should we have taken to just meet our error target $\epsilon$? Let's speculate that $E_i(h) \approx C h^{p+1}$ for an unknown constant $C$, given the behavior of local truncation error as $h\to 0$. If we had used a step size $qh$ for some $q>0$, then we would expect $E_i(qh) \approx C q^{p+1} h^{p+1}$. Our best guess for $q$ would therefore be to set $E_i(qh) \approx \epsilon$, or

$$q \approx \left(\frac{\epsilon}{E_i}\right)^{1/(p+1)}. \tag{185}$$

Whether or not we accepted the value proposed for $t = t_{i+1}$, we will adjust the step size to $qh$ for the next attempted step.

Given what we know about the connection between local and global errors, we might instead decide that controlling the normalized contribution to global error, which is closer to $E_i(qh)/(qh)$, is more reasonable. Then we end up with

$$q \approx \left(\frac{\epsilon}{E_i}\right)^{1/p}. \tag{186}$$

Experts have different recommendations about whether to use (185) or (186). Even though (186) appears to be more in keeping with our assumptions about global errors, modern practice seems to favor (185).
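In code, the two recipes differ only in the exponent applied to the ratio $\epsilon/E_i$. A small sketch with made-up numbers, assuming an order-2 method ($p=2$):

```julia
# Candidate step-size factors from (185) and (186) for p = 2.
p, ϵ, E = 2, 1e-6, 4e-6        # illustrative values; here E exceeds ϵ
q_185 = (ϵ/E)^(1/(p+1))        # ≈ 0.63, from controlling local error
q_186 = (ϵ/E)^(1/p)            # = 0.5, from controlling error per unit step
# Both factors are less than 1, so either choice shrinks the step.
```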

Embedded formulas

We have now derived two useful pieces of information: a reasonable estimate of the actual value of the local (or global) error, and a prediction of how the step size will affect that error. Together they can be used to adapt the step size and keep errors near some target level. But there remains one more important twist to the story.

At first glance, it would seem that to use (for example) any pair of second- and third-order RK methods to get the $u_{i+1}$ and $\tilde{u}_{i+1}$ needed for adaptive error control, we need at least $2+3=5$ evaluations of $f(t,y)$ for each attempted time step. This is more than double the computational work needed by the second-order method without adaptivity. Fortunately, the marginal cost of adaptation can be substantially reduced by using embedded Runge–Kutta formulas: a pair of RK methods whose stages share the same internal $f$ evaluations, combined differently in order to get estimates of two different orders of accuracy.

A good example of an embedded method is the Bogacki–Shampine (BS23) formula, given by the table

$$
\begin{array}{r|cccc}
0 & & & & \\
\tfrac{1}{2} & \tfrac{1}{2} & & & \\
\tfrac{3}{4} & 0 & \tfrac{3}{4} & & \\
1 & \tfrac{2}{9} & \tfrac{1}{3} & \tfrac{4}{9} & \\
\hline
 & \tfrac{2}{9} & \tfrac{1}{3} & \tfrac{4}{9} & 0 \\
 & \tfrac{7}{24} & \tfrac{1}{4} & \tfrac{1}{3} & \tfrac{1}{8}
\end{array}
\tag{187}
$$

The top part of the table describes four stages in the usual RK fashion. The last two rows describe how to construct a second-order estimate $u_{i+1}$ and a third-order estimate $\tilde{u}_{i+1}$ by taking different combinations of those stages.
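Note that the error estimate requires no extra stages: it is just the difference of the two bottom rows applied to stages already computed,

$$E_i(h) = \left\| h\left[\left(\tfrac{2}{9}-\tfrac{7}{24}\right)s_1 + \left(\tfrac{1}{3}-\tfrac{1}{4}\right)s_2 + \left(\tfrac{4}{9}-\tfrac{1}{3}\right)s_3 + \left(0-\tfrac{1}{8}\right)s_4\right]\right\| = \left\| h\left(-\tfrac{5}{72}s_1 + \tfrac{1}{12}s_2 + \tfrac{1}{9}s_3 - \tfrac{1}{8}s_4\right)\right\|,$$

which is exactly the `err` combination appearing in the implementation below.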

Implementation

Function 68 (rk23)

Adaptive IVP solver based on embedded RK formulas.

"""
rk23(ivp,tol)

Apply an adaptive embedded RK formula pair to solve given IVP with
estimated error `tol`. Returns a vector of times and a vector of
solution values.
"""
function rk23(ivp,tol)
    # Initialize for the first time step.
    a,b = ivp.tspan
    t = [a]
    u = [float(ivp.u0)]
    i = 1
    h = 0.5*tol^(1/3)
    s1 = ivp.f(ivp.u0,ivp.p,a)

    # Time stepping.
    while t[i] < b
        # Detect underflow of the step size.
        if t[i]+h == t[i]
            @warn "Stepsize too small near t=$(t[i])"
            break  # quit time stepping loop
        end

        # New RK stages.
        s2 = ivp.f( u[i]+(h/2)*s1,   ivp.p, t[i]+h/2   )
        s3 = ivp.f( u[i]+(3*h/4)*s2, ivp.p, t[i]+3*h/4 )
        unew2 = u[i] + h*(2*s1 + 3*s2 + 4*s3)/9   # 2nd order solution
        s4 = ivp.f( unew2, ivp.p, t[i]+h )
        err = h*(-5*s1/72 + s2/12 + s3/9 - s4/8)  # 2nd/3rd difference
        E = norm(err,Inf)                         # error estimate
        maxerr = tol*(1 + norm(u[i],Inf))     # relative/absolute blend

        # Accept the proposed step?
        if E < maxerr     # yes
            push!(t,t[i]+h)
            push!(u,unew2)
            i += 1
            s1 = s4       # use FSAL property
        end

        # Adjust step size.
        q = 0.8*(maxerr/E)^(1/3)   # conservative optimal step factor
        q = min(q,4)               # limit stepsize growth
        h = min(q*h,b-t[i])        # don't step past the end
    end
    return t,u
end

Our implementation of an embedded second/third-order (RK23) code is given in Function 68 (rk23). It has a few details that are worth explaining.

First, as in (164), we use a combination of absolute and relative tolerances to judge the acceptability of a solution value. Second, we have a check whether $t_i+h$ equals $t_i$, which looks odd. This check is purely about roundoff error, because $h$ can become so small that it no longer changes the floating-point value of $t_i$. When this happens, it's often a sign that the underlying exact solution has a singularity near $t=t_i$. Third, some adjustments are made to the step size prediction factor $q$. We use a smaller value than (185) suggests, to be conservative about the many assumptions that were made to derive it. We also prevent a huge jump in the step size for the same reason. And we make sure that our final step doesn't take us past the requested end of the domain.
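The roundoff phenomenon behind the second check is easy to reproduce: in double precision, a small enough $h$ added to $t$ leaves $t$ unchanged.

```julia
t = 100.0
h = eps(t)/4     # smaller than half the spacing between floats near t
t + h == t       # true: such a step cannot advance the time variable
```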

Finally, there is some careful programming done to avoid redundant evaluations of $f$. As written in (187), there seem to be four stages needed to find the paired second- and third-order estimates. This is unfortunate, since there are three-stage formulas of order three. But BS23 has a special property called "first same as last" (FSAL). If the proposed step is accepted, the final stage computed in stepping from $t_i$ to $t_{i+1}$ is identical to the first stage needed to step from $t_{i+1}$ to $t_{i+2}$, so in that sense one of the stage evaluations comes at no cost. This detail is addressed in our code.
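To try the function out, here is a hedged usage sketch. The listing accesses the fields `f`, `u0`, `tspan`, and `p` of its first argument, so a NamedTuple with those fields (standing in for whatever IVP type the surrounding codebase defines) is enough to run it:

```julia
# Hypothetical driver: solve u' = sin(t) - u, u(0) = 1, on [0, 10].
ivp = (
    f = (u, p, t) -> sin(t) - u,   # note the (u, p, t) order used inside rk23
    u0 = 1.0,
    tspan = (0.0, 10.0),
    p = nothing,                   # no parameters for this problem
)
t, u = rk23(ivp, 1e-5)
println("accepted steps: ", length(t)-1)
```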

Often the steps chosen adaptively clearly correspond to identifiable features of the solution. However, there are so-called stiff problems in which the time steps seem unreasonably small in relation to the observable behavior of the solution. These problems benefit from a particular type of solver and will be taken up in Implementation of multistep methods.

Exercises

  1. ⌨ Using rk23, solve $y'' + (1+y')^3\,y = 0$ over $0 \le t \le 4\pi$ with the indicated initial conditions. Plot $y(t)$ and $y'(t)$ as functions of $t$, and separately plot the solution curve parametrically in the phase plane, that is, the $\bigl(y(t),y'(t)\bigr)$-plane.

    (a) $y(0) = 0.1, \quad y'(0) = 0$

    (b) $y(0) = 0.5, \quad y'(0) = 0$

    (c) $y(0) = 0.75, \quad y'(0) = 0$

    (d) $y(0) = 0.95, \quad y'(0) = 0$

  2. ⌨ Solve the caffeine exercise using rk23 with an error tolerance of $10^{-5}$. Plot the solution so that you can see the individual points. What is the smallest time step taken, and at what time does it occur?

  3. ⌨ Solve the FitzHugh–Nagumo exercise using rk23. Let the error tolerance be $10^{-k}$, increasing the integer $k$ until the graph of the solutions no longer changes. (This illustrates that the error tolerance is a request, not a guarantee!)

  4. ✍ Derive equation (186) using the stated assumption about controlling global rather than local error.

  5. ⌨ Solve the problem $u' = u^2 - u^3$, $u(0) = 0.001$, $0 \le t \le 2000$, and make plots as in Adaptive RK that show both the solution and the time steps taken. Does the step size selection seem to be entirely explained by the local variability of the solution?