
6.5 Adaptive Runge–Kutta

The derivation and analysis of methods for initial-value problems usually assumes a fixed step size $h$. While the error behavior $O(h^p)$ is guaranteed by Theorem 6.2.1 as $h\rightarrow 0$, this bound comes with an unknowable constant, and it is not very useful as a guide to the numerical value of the error at any particular value of $h$. Furthermore, as we saw in Adaptive integration, in many problems a fixed step size is far from the most efficient strategy.

In response we will employ the basic strategy of Adaptive integration: estimate the error and adapt the step size in order to reach an accuracy goal. Unlike the integration problem, though, the “integrand” of an IVP is dependent on the solution itself, so the details differ greatly.

6.5.1 Step size prediction

Suppose that, starting from a given value $\mathbf{u}_i$ and using a step size $h$, we run one step of two different RK methods simultaneously: one method with order $p$, producing $\mathbf{u}_{i+1}$, and the other method with order $p+1$, producing $\tilde{\mathbf{u}}_{i+1}$. In most circumstances, we can expect that $\tilde{\mathbf{u}}_{i+1}$ is a much better approximation to the solution than $\mathbf{u}_{i+1}$ is. So it seems reasonable to use

$$E_i(h) = \bigl|\tilde{\mathbf{u}}_{i+1} - \mathbf{u}_{i+1}\bigr|$$

as an estimate of the actual local error made by the $p$th-order method. For a vector IVP, we would use a norm rather than an absolute value.

Now we ask: looking back, what step size should we have taken to meet an error target of size $\epsilon$? Let’s speculate, given the behavior of local truncation error as $h\rightarrow 0$, that $E_i(h)\approx C h^{p+1}$ for an unknown constant $C$. If we had used a step size $qh$ for some $q>0$, then trivially, we would expect

$$E_i(qh)\approx C q^{p+1}h^{p+1}.$$

Our best guess for $q$ would therefore be to set $E_i(qh)\approx \epsilon$, or

$$q \approx \left(\frac{\epsilon}{E_i}\right)^{1/(p+1)}. \tag{6.5.3}$$

Perhaps, though, we should aim to control the contribution to global error, which is closer to $E_i(qh)/(qh)$. Then we end up with

$$q \le \left(\frac{\epsilon}{E_i}\right)^{1/p}. \tag{6.5.4}$$

Experts have different recommendations about whether to use (6.5.3) or (6.5.4). Even though (6.5.4) appears to be more in keeping with our assumptions about global errors, modern practice seems to favor (6.5.3).
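As a concrete (made-up) illustration: suppose the lower-order member of the pair has $p=2$, the step just attempted had $h=0.1$, and it produced $E_i = 10^{-4}$ against a target of $\epsilon = 10^{-6}$. Then (6.5.3) gives

$$ q \approx \left(\frac{10^{-6}}{10^{-4}}\right)^{1/3} = 10^{-2/3} \approx 0.215, $$

so the step would be rejected and retried with the much smaller step size $qh \approx 0.0215$.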

We now have an outline of an algorithm (Algorithm 6.5.1): at each time step, (1) compute the two estimates $\mathbf{u}_{i+1}$ and $\tilde{\mathbf{u}}_{i+1}$, (2) form the error estimate $E_i$, (3) accept the step if $E_i$ is within the error target and reject it otherwise, and (4) use (6.5.3) to choose the step size for the next attempt.

Many details remain unspecified at this point, but we first address step 1.
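For reference, here is a minimal Python sketch of the control loop. It is not the book’s implementation; the driver name `adaptive_solve`, the scalar-only error measure, the safety factor 0.8, and the fourfold cap on step growth are all illustrative choices.

```python
def adaptive_solve(pair_step, f, tspan, u0, tol, h=0.1):
    # pair_step(f, t, u, h) must return a pth-order estimate, a
    # (p+1)th-order estimate, and the order p (any embedded pair works).
    t, u = tspan[0], u0
    ts, us = [t], [u]
    while t < tspan[1]:
        h = min(h, tspan[1] - t)               # don't step past the end
        u_lo, u_hi, p = pair_step(f, t, u, h)  # step 1: the two estimates
        E = abs(u_hi - u_lo)                   # step 2: local error estimate
        if E <= tol:                           # step 3: accept or reject
            t, u = t + h, u_hi                 # advance with the better value
            ts.append(t)
            us.append(u)
        # step 4: predict the next step size from (6.5.3), with a safety
        # factor and a cap on growth (0.8 and 4 are conventional choices);
        # a production code would also guard against h underflowing.
        q = 0.8 * (tol / E) ** (1 / (p + 1)) if E > 0 else 2.0
        h *= min(4.0, q)
    return ts, us
```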

6.5.2 Embedded formulas

Suppose, for example, we choose to use a pair of second- and third-order RK methods to get the $\mathbf{u}_{i+1}$ and $\tilde{\mathbf{u}}_{i+1}$ needed in Algorithm 6.5.1. Then we seem to need at least $2+3=5$ evaluations of $f(t,y)$ for each attempted time step. This is more than double the computational work needed by the second-order method without adaptivity.

Fortunately, the marginal cost of adaptivity can be substantially reduced by using embedded Runge–Kutta formulas: a pair of RK methods whose stages share the same internal $f$ evaluations, combined differently in order to get estimates of two different orders of accuracy.

A good example of an embedded method is the Bogacki–Shampine (BS23) formula, given by the table

$$\begin{array}{r|cccc}
0 & & & & \\[2pt]
\frac{1}{2} & \frac{1}{2} & & & \\[2pt]
\frac{3}{4} & 0 & \frac{3}{4} & & \\[2pt]
1 & \frac{2}{9} & \frac{1}{3} & \frac{4}{9} & \\[2pt] \hline
& \frac{2}{9} & \frac{1}{3} & \frac{4}{9} & 0 \\[2pt] \hline
& \frac{7}{24} & \frac{1}{4} & \frac{1}{3} & \frac{1}{8}
\end{array}$$

The top part of the table describes four stages in the usual RK fashion. The last two rows describe how to construct a third-order estimate $\tilde{\mathbf{u}}_{i+1}$ and a second-order estimate $\mathbf{u}_{i+1}$ by taking different combinations of those stages.
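Reading the tableau row by row yields a one-step implementation directly. The following Python sketch (the name `bs23_step` is ours) returns values in the form the driver above expects. Notice in the table that the fourth stage’s coefficients repeat the third-order weights, so $k_4$ is just $f$ evaluated at the new third-order value; it can be reused as the $k_1$ of the following step, reducing the cost of an accepted step to three fresh evaluations.

```python
def bs23_step(f, t, u, h):
    # One step of the Bogacki-Shampine (BS23) pair, read off the tableau.
    k1 = f(t, u)
    k2 = f(t + h / 2, u + h / 2 * k1)
    k3 = f(t + 3 * h / 4, u + 3 * h / 4 * k2)
    u3 = u + h * (2 * k1 + 3 * k2 + 4 * k3) / 9             # third order
    k4 = f(t + h, u3)               # evaluated at the new point (reusable)
    u2 = u + h * (7 * k1 / 24 + k2 / 4 + k3 / 3 + k4 / 8)   # second order
    return u2, u3, 2                # pth-order, (p+1)th-order, p
```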

6.5.3 Implementation

Our implementation of an embedded second/third-order (RK23) code is given in Function 6.5.2.
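The source of Function 6.5.2 is not reproduced here, but combining the two sketches above gives a working miniature of the same idea. The test problem below is contrived so that the exact solution $u(t) = e^{4(\cos t - 1)}$ is available for checking:

```python
import numpy as np

f = lambda t, u: -4 * u * np.sin(t)   # exact solution: u = exp(4(cos t - 1))
ts, us = adaptive_solve(bs23_step, f, (0.0, 6.0), 1.0, tol=1e-6)
err = abs(us[-1] - np.exp(4 * (np.cos(6.0) - 1)))
print(f"{len(ts) - 1} accepted steps, error at t=6: {err:.2e}")
```

Unlike Function 6.5.2, this sketch does not exploit the reusable final stage or handle vector problems; it is meant only to make the control flow concrete.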

Often the adaptively chosen steps clearly correspond to identifiable features of the solution. However, there are so-called stiff problems in which the time steps seem unreasonably small in relation to the observable behavior of the solution. These problems benefit from a particular type of solver that is considered in Implementation of multistep methods.

6.5.4 Exercises

  1. ⌨ Using Function 6.5.2 with an error tolerance of $10^{-8}$, solve $y'' + (1+y')^3 y = 0$ over $0 \le t \le 4\pi$ with the indicated initial conditions. Plot $y(t)$ and $y'(t)$ as functions of $t$ and separately plot the time step size as a function of $t$.

    (a) $y(0) = 0.1, \quad y'(0) = 0$

    (b) $y(0) = 0.5, \quad y'(0) = 0$

    (c) $y(0) = 0.75, \quad y'(0) = 0$

    (d) $y(0) = 0.95, \quad y'(0) = 0$

  2. ⌨ Solve the FitzHugh–Nagumo system from Exercise 4.3.6 for $I=0.05740$ using Function 6.5.2 with error tolerances $10^{-2}$, $10^{-3}$, and $10^{-4}$. (This illustrates that the error tolerance is a target, not a guarantee!)

  3. ✍ Derive Equation (6.5.4) using the stated assumption about controlling global rather than local error.

  4. ⌨ Solve the problem $u'=100u^2-u^3$, $u(0)=0.0002$, $0\le t \le 100$, and make plots that show both the solution and the time steps taken. The solution makes a quick transition between two nearly constant states. Does the step size selection behave the same in both states?