One way to attack the TPBVP (10.1.1) is to adapt our IVP solving techniques from Chapter 6 to it. Those techniques work only when we know the entire initial state, but we can allow that state to vary in order to achieve the stated conditions.
This is the idea behind the shooting method . Imagine adjusting your aiming point and power to sink a basketball shot from the free-throw line. The way in which you miss—too long, flat trajectory, etc.—informs how you will adjust for your next attempt.
Example 10.2.1 (Naive shooting)
Example 10.2.1 Let’s first examine the shooting approach for the TPBVP from Example 10.1.2 with λ = 0.6 \lambda=0.6 λ = 0.6 .
The character ϕ
is typed as \phi
Tab .
λ = 0.6
ϕ = (r, w, dwdr) -> λ / w^2 - dwdr / r;
We convert the ODE to a first-order system in order to apply a numerical method. We also have to truncate the domain to avoid division by zero.
f = (y, p, r) -> [y[2]; ϕ(r, y[1], y[2])]
a, b = eps(), 1.0;
The BVP specifies w ′ ( 0 ) = y 2 ( 0 ) = 0 w'(0)=y_2(0)=0 w ′ ( 0 ) = y 2 ( 0 ) = 0 . We can try multiple values for the unknown w ( 0 ) = y 1 ( 0 ) w(0)=y_1(0) w ( 0 ) = y 1 ( 0 ) and plot the solutions.
plt = plot(
xaxis = (L"x"),
yaxis = (L"w(x)"),
title = "Different initial values",
leg = :bottomright,
)
for w0 in 0.4:0.1:0.9
IVP = ODEProblem(f, [w0, 0], (a, b))
y = solve(IVP, Tsit5())
plot!(y, idxs = [1], label = "w(0) = $w0")
end
plt
On the graph, it’s the curve starting at w ( 0 ) = 0.8 w(0)=0.8 w ( 0 ) = 0.8 that comes closest to the required condition w ( 1 ) = 1 w(1)=1 w ( 1 ) = 1 , but it’s a bit too large.
Example 10.2.1 Let’s first examine the shooting approach for the TPBVP from Example 10.1.2 with λ = 0.6 \lambda=0.6 λ = 0.6 .
lambda = 0.6;
phi = @(r, w, dwdr) lambda ./ w.^2 - dwdr ./ r;
We convert the ODE to a first-order system in order to apply a numerical method. We also have to truncate the domain to avoid division by zero.
f = @(r, w) [ w(2); phi(r, w(1), w(2)) ];
a = eps; b = 1;
The BVP specifies w ′ ( 0 ) = y 2 ( 0 ) = 0 w'(0)=y_2(0)=0 w ′ ( 0 ) = y 2 ( 0 ) = 0 . We can try multiple values for the unknown w ( 0 ) = y 1 ( 0 ) w(0)=y_1(0) w ( 0 ) = y 1 ( 0 ) and plot the solutions.
clf
ivp = ode;
ivp.ODEFcn = f;
ivp.InitialTime = a;
for w0 = 0.4:0.1:0.9
ivp.InitialValue = [w0; 0];
sol = solve(ivp, a, b);
plot(sol.Time, sol.Solution(1, :))
hold on
end
xlabel('r'), ylabel('w(r)')
title('Solutions for multiple choices of w(0)')
On the graph, it’s the curve starting at w ( 0 ) = 0.8 w(0)=0.8 w ( 0 ) = 0.8 that comes closest to the required condition w ( 1 ) = 1 w(1)=1 w ( 1 ) = 1 , but it’s a bit too large.
Example 10.2.1 Let’s first examine the shooting approach for the TPBVP from Example 10.1.2 with λ = 0.6 \lambda=0.6 λ = 0.6 .
lamb = 0.6
phi = lambda r, w, dw_dr: lamb / w**2 - dw_dr / r
We convert the ODE to a first-order system in order to apply a numerical method. We also have to truncate the domain to avoid division by zero.
f = lambda r, y: hstack([y[1], phi(r, y[0], y[1])])
a, b = finfo(float).eps, 1
The BVP specifies w ′ ( 0 ) = y 2 ( 0 ) = 0 w'(0)=y_2(0)=0 w ′ ( 0 ) = y 2 ( 0 ) = 0 . We can try multiple values for the unknown w ( 0 ) = y 1 ( 0 ) w(0)=y_1(0) w ( 0 ) = y 1 ( 0 ) and plot the solutions.
from scipy.integrate import solve_ivp
t = linspace(a, b, 400)
for w0 in arange(0.4, 1.0, 0.1):
sol = solve_ivp(f, [a, b], [w0, 0], t_eval=t)
plot(t, sol.y[0], label=f"$w_0$ = {w0:.1f}")
xlabel("$r$"), ylabel("$w(r)$")
legend(), grid(True)
title("Solutions for choices of w(0)");
On the graph, it’s the curve starting at w ( 0 ) = 0.8 w(0)=0.8 w ( 0 ) = 0.8 that comes closest to the required condition w ( 1 ) = 1 w(1)=1 w ( 1 ) = 1 , but it’s a bit too large.
We can do much better than trial-and-error for the unknown part of the initial state. As usual, we can rewrite the ODE u ′ ′ ( x ) = ϕ ( x , u , u ′ ) u''(x) = \phi(x,u,u') u ′′ ( x ) = ϕ ( x , u , u ′ ) in first-order form as
y 1 ′ = y 2 , y 2 ′ = ϕ ( x , y 1 , y 2 ) . \begin{split}
y_1' &= y_2,\\
y_2' &= \phi(x,y_1,y_2).
\end{split} y 1 ′ y 2 ′ = y 2 , = ϕ ( x , y 1 , y 2 ) . We turn this into an IVP by specifying y ( a ) = s 1 y(a)=s_1 y ( a ) = s 1 , y ′ ( a ) = s 2 y'(a)=s_2 y ′ ( a ) = s 2 , for a vector s \mathbf{s} s to be determined by the boundary conditions. Define the residual function v ( s ) \mathbf{v}(\mathbf{s}) v ( s ) by
v 1 ( s 1 , s 2 ) = g 1 ( y 1 ( a ) , y 2 ( a ) ) = g 1 ( s 1 , s 2 ) , v 2 ( s 1 , s 2 ) = g 2 ( y 1 ( b ) , y 2 ( b ) ) . \begin{split}
v_1(s_1,s_2) &= g_1(y_1(a),y_2(a)) = g_1(s_1,s_2),\\
v_2(s_1,s_2) &= g_2(y_1(b),y_2(b)).
\end{split} v 1 ( s 1 , s 2 ) v 2 ( s 1 , s 2 ) = g 1 ( y 1 ( a ) , y 2 ( a )) = g 1 ( s 1 , s 2 ) , = g 2 ( y 1 ( b ) , y 2 ( b )) . The dependence of v 2 v_2 v 2 on s \mathbf{s} s is indirect, through the solution of the IVP for y ( x ) \mathbf{y}(x) y ( x ) . We now have a standard rootfinding problem that can be solved via the methods of Chapter 4 .
10.2.1 Implementation ¶ Our implementation of shooting is given in Function 10.2.1 . Note the structure: we use a rootfinding method that in turn relies on an IVP solver. This sort of arrangement is what makes us concerned with minimizing the number of objective function calls when rootfinding.
Shooting method for a two-point boundary-value problemAbout the code
Because x
and y
are assigned empty values in line 24, when the function objective
runs it uses those values rather than new ones in local scope. Thus, line 19 updates them to hold the latest results of the IVP solver, saving the need to solve it again after levenberg
has finished the rootfinding.
The error tolerance in the IVP solver is kept smaller than in the rootfinder, to prevent the rootfinder from searching in a noisy landscape. Finally, note how line 28 uses destructuring of eachrow(y)
to assign the columns of y
to separate names.
Shooting method for a two-point boundary-value problem1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
function [x, u, du_dx] = shoot(phi, a, b, ga, gb, init, tol)
%SHOOT Shooting method for a two-point boundary-value problem.
% Input:
% phi defines u'' = phi(x, u, u') (function)
% a, b endpoints of the domain (scalars)
% ga residual boundary function of u(a), u'(a)
% gb residual boundary function of u(b), u'(b)
% init initial guess for u(a) and u'(a) (column vector)
% tol error tolerance (scalar)
% Output:
% x nodes in x (length n+1)
% u values of u(x) (length n+1)
% du_dx values of u'(x) (length n+1)
% To be solved by the IVP solver
function f = shootivp(x, y)
f = [ y(2); phi(x, y(1), y(2)) ];
end
ivp = ode(ODEFcn=@shootivp);
ivp.InitialTime = a;
ivp.AbsoluteTolerance = tol / 10;
ivp.RelativeTolerance = tol / 10;
% To be solved by levenberg
function residual = objective(s)
ivp.InitialValue = s;
sol = solve(ivp, a, b);
x = sol.Time; y = sol.Solution;
residual = [ga(y(1, 1), y(2, 1)); gb(y(1, end), y(2, end))];
end
y = []; % shared variable
s = levenberg(@objective, init, tol);
% Don't need to solve the IVP again. It was done within the
% objective function already.
u = y(1, :); % solution
du_dx = y(2, :); % derivative
end
Shooting method for a two-point boundary-value problem1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
def shoot(phi, a, b, ga, gb, init):
"""
shoot(phi, a, b, ga, gb, init)
Use the shooting method to solve a two-point boundary value problem.
The ODE is u'' = phi(x, u, u') for x in (a,b). The functions
ga(u(a), u'(a)) and gb(u(b), u'(b)) specify the boundary conditions.
The value init is an initial guess for [u(a), u'(a)].
Return vectors for the nodes, the values of u, and the values of u'.
"""
# Tolerances for IVP solver and rootfinder.
tol = 1e-5
# To be solved by the IVP solver
def shootivp(x, y):
return np.array([y[1], phi(x, y[0], y[1])])
# Evaluate the difference between computed and target values at x=b.
def objective(s):
nonlocal x, y # change these values in outer scope
x = np.linspace(a, b, 400) # make decent plots on return
sol = solve_ivp(shootivp, [a, b], s, atol=tol/10, rtol=tol/10, t_eval=x)
x = sol.t
y = sol.y
residual = np.array([ga(y[0, 0], y[1, 0]), gb(y[0, -1], y[0, -1])])
return residual
# Find the unknown quantity at x=a by rootfinding.
x, y = [], [] # the values will be overwritten
s = levenberg(objective, init, tol)
# Don't need to solve the IVP again. It was done within the
# objective function already.
u = y[0] # solution
du_dx = y[1] # derivative
return x, u, du_dx
Example 10.2.2 (Shooting solution of a BVP)
Example 10.2.2 We revisit Demo 10.2.1 but let Function 10.2.1 do the heavy lifting.
λ = 0.6
ϕ = (r, w, dwdr) -> λ / w^2 - dwdr / r;
a = eps();
b = 1;
We specify the given and unknown endpoint values.
g₁(w, dw) = dw # w'=0 at left
g₂(w, dw) = w - 1 # w=1 at right
r, w, dw_dx = FNC.shoot(ϕ, (a, b), g₁, g₂, [0.8, 0])
plot(r, w, title = "Shooting solution", xaxis = (L"x"), yaxis = (L"w(x)"))
The value of w w w at r = 1 r=1 r = 1 , meant to be exactly one, was computed to be
The accuracy is consistent with the error tolerance used for the IVP solution. The initial value w ( 0 ) w(0) w ( 0 ) that gave this solution is
Example 10.2.2 We revisit Demo 10.2.1 but let Function 10.2.1 do the heavy lifting.
lambda = 0.6;
phi = @(r, w, dwdr) lambda ./ w.^2 - dwdr ./ r;
a = eps; b = 1; % avoid r=0 in denominator
We specify the given and unknown endpoint values.
ga = @(u, du) du;
gb = @(u, du) u - 1;
init = [0.8; 0]; % initial guess for u(a) and u'(a)
[r, w, dwdx] = shoot(phi, a, b, ga, gb, init, 1e-5);
clf, plot(r, w)
title('Correct solution')
xlabel('r'), ylabel('w(r)')
The value of w w w at r = 1 r=1 r = 1 , meant to be exactly one, was computed to be
The accuracy is consistent with the error tolerance used for the IVP solution. The initial value w ( 0 ) w(0) w ( 0 ) that gave this solution is
Example 10.2.2 We revisit Demo 10.2.1 but let Function 10.2.1 do the heavy lifting.
lamb = 0.6
phi = lambda r, w, dwdr: lamb / w**2 - dwdr / r
a, b = finfo(float).eps, 1
We specify the given and unknown endpoint values.
ga = lambda w, dw : dw # w'=0 at left
gb = lambda w, dw : w - 1 # w=1 at right
In this setting, we need to provide initial guesses for w ( a ) w(a) w ( a ) and w ′ ( a ) w'(a) w ′ ( a ) .
init = array([0.8, 0])
r, w, dw_dx = FNC.shoot(phi, a, b, ga, gb, init)
plot(r, w)
title("Shooting solution")
xlabel("$r$"), ylabel("$w(r)$");
The value of w w w at r = 1 r=1 r = 1 , meant to be exactly one, was computed to be
print(f"w at right end is {w[-1]}")
The accuracy is consistent with the error tolerance used for the IVP solution by shoot
. The initial value w ( 0 ) w(0) w ( 0 ) that gave this solution is
print(f"w at left end is {w[0]}")
10.2.2 Instability ¶ The accuracy of the shooting method should be comparable to those of the component pieces, the rootfinder, and the IVP solver. However, the shooting method is unstable for some problems. An example illustrates the trouble.
Example 10.2.3 (Instability of shooting)
We solve the problem
u ′ ′ = λ 2 u + λ 2 , 0 ≤ x ≤ 1 , u ( 0 ) = − 1 , u ( 1 ) = 0. u'' = \lambda^2 u + \lambda^2, \quad 0\le x \le 1, \quad u(0)=-1,\; u(1)=0. u ′′ = λ 2 u + λ 2 , 0 ≤ x ≤ 1 , u ( 0 ) = − 1 , u ( 1 ) = 0. The exact solution is easily confirmed to be
u ( x ) = sinh ( λ x ) sinh ( λ ) − 1. u(x) = \frac{\sinh(\lambda x)}{\sinh(\lambda)} - 1. u ( x ) = sinh ( λ ) sinh ( λ x ) − 1. This solution satisfies − 1 ≤ u ( x ) ≤ 0 -1\le u(x) \le 0 − 1 ≤ u ( x ) ≤ 0 for all x ∈ [ 0 , 1 ] x\in[0,1] x ∈ [ 0 , 1 ] . Now we compute shooting solutions for several values of λ.
Example 10.2.3 plt = plot(
xaxis = (L"x"),
yaxis = ([-1.2, 0.5], L"u(x)"),
title = "Shooting instability",
leg = :topleft,
)
for λ in 6:4:18
g₁(u, du) = u + 1
g₂(u, du) = u
ϕ = (x, u, du_dx) -> λ^2 * (u + 1)
x, u = FNC.shoot(ϕ, (0.0, 1.0), g₁, g₂, [-1, 0])
plot!(x, u, label = "λ=$λ")
end
plt
The numerical solutions evidently don’t satisfy the right boundary condition as λ increases, which makes them invalid.
Example 10.2.3 ga = @(u, du) u + 1;
gb = @(u, du) u;
clf
warning off
for lambda = 16:4:28
phi = @(x, u, du_dx) lambda^2 * (u + 1);
[x, u, du_dx] = shoot(phi, 0.0, 1.0, ga, gb, [-1; 0], 1e-8);
plot(x, u, displayname=sprintf("lambda=%d", lambda))
hold on
xlabel('x'), ylabel('u(x)')
title('Shooting instability')
legend(location="northwest");
end
The numerical solution fails at the largest value of λ because the initial condition became infinite.
Example 10.2.3 ga = lambda u, du : u + 1 # u=-1 at left
gb = lambda u, du : u # u= 0 at right
init = array([-1, 0])
for lamb in range(6, 22, 4):
phi = lambda x, u, du_dx: lamb**2 * u + lamb**2
x, u, du_dx = FNC.shoot(phi, 0.0, 1.0, ga, gb, init)
plot(x, u, label=f"$\\lambda$ = {lamb:.1f}")
xlabel("$x$"), ylabel("$u(x)$"), ylim(-1.0, 0.25)
grid(True), legend(loc="upper left")
title("Shooting instability");
The numerical solutions evidently don’t satisfy the right boundary condition as λ increases, which makes them invalid.
The cause is readily explained. The solution to the ODE with u ( 0 ) = − 1 u(0)=-1 u ( 0 ) = − 1 and u ′ ( 0 ) = s 2 u'(0)=s_2 u ′ ( 0 ) = s 2 is
s 2 λ sinh ( λ x ) − 1. \frac{s_2}{\lambda}\sinh(\lambda x) - 1. λ s 2 sinh ( λ x ) − 1. If x x x is a fixed value in [ 0 , 1 ] [0,1] [ 0 , 1 ] , we compute that the absolute condition number of (10.2.5) with respect to s 2 s_2 s 2 is the magnitude of the partial derivative,
∣ sinh λ x λ ∣ , \left| \frac{\sinh\lambda x}{\lambda} \right|, ∣ ∣ λ sinh λ x ∣ ∣ , which grows rapidly with λ near x = 1 x=1 x = 1 . With the IVP solution so sensitive to s 2 s_2 s 2 , a numerical approach to find s 2 s_2 s 2 approximately is doomed.
The essence of the instability is that errors can grow exponentially away from the boundary at x = a x=a x = a , where the state is arbitrarily being set (see Theorem 6.1.2 ). Using shooting, acceptable accuracy near x = b x=b x = b therefore means requiring extraordinarily high accuracy near x = a x=a x = a .
The instability of shooting can be circumvented by breaking the interval into smaller pieces and thus limiting the potential for error growth. However, we do not go into these details. Instead, the methods in the rest of this chapter treat both ends of the domain symmetrically and solve over the whole domain at once.
10.2.3 Exercises ¶ ⌨ For each BVP in Exercise 10.1.2 , use Function 10.2.1 to compute the solution. Plot the solution and, separately, its error as functions of x x x .
⌨ (Continuation of Exercise 10.1.4 .) Consider the pendulum from Example 10.1.1 with g = L = 1 g=L=1 g = L = 1 . Suppose we want to release the pendulum from rest such that θ ( 5 ) = π / 2 \theta(5)=\pi/2 θ ( 5 ) = π /2 . Find one solution that passes through θ = 0 \theta=0 θ = 0 , and another solution that does not. Plot θ ( t ) \theta(t) θ ( t ) for both cases together.
⌨ (Continuation of Exercise 10.1.5 .) The stationary Allen–Cahn equation is
ϵ u ′ ′ = u 3 − u , 0 ≤ x ≤ 1 , u ( 0 ) = − 1 , u ( 1 ) = 1. \epsilon u'' = u^3-u, \qquad 0 \le x \le 1, \qquad u(0)=-1, \quad u(1)=1. ϵ u ′′ = u 3 − u , 0 ≤ x ≤ 1 , u ( 0 ) = − 1 , u ( 1 ) = 1. As ϵ → 0 \epsilon\rightarrow 0 ϵ → 0 , the solution tends toward a step function transition between -1 and 1. By symmetry, u ′ ( x ) = − u ′ ( 1 − x ) u'(x)=-u'(1-x) u ′ ( x ) = − u ′ ( 1 − x ) .
(a) Use Function 10.2.1 to solve the equation for ϵ = 0.2 \epsilon=0.2 ϵ = 0.2 . Plot the solution and compute the numerical value of u ′ ( 0 ) − u ′ ( 1 ) u'(0)-u'(1) u ′ ( 0 ) − u ′ ( 1 ) .
(b) Repeat for ϵ = 0.02 \epsilon=0.02 ϵ = 0.02 .
(c) Repeat for ϵ = 0.002 \epsilon=0.002 ϵ = 0.002 . You will receive multiple warning messages. Does the result look like a valid solution?
✍ Consider the linear TPBVP
u ′ ′ = p ( x ) u ′ + q ( x ) u + r ( x ) , u ′ ( a ) = 0 , u ( b ) = β . \begin{split}
u'' &= p(x)u' + q(x)u + r(x),\\
u'(a) &= 0, \quad u(b)=\beta.
\end{split} u ′′ u ′ ( a ) = p ( x ) u ′ + q ( x ) u + r ( x ) , = 0 , u ( b ) = β . The shooting IVP uses the same ODE with initial data u ( a ) = s 1 u(a)=s_1 u ( a ) = s 1 , u ′ ( a ) = s 2 u'(a)=s_2 u ′ ( a ) = s 2 to solve for a trial solution u ( x ) u(x) u ( x ) . Define
z ( x ) = ∂ u ∂ s 1 . z(x) = \frac{\partial u}{\partial s_1}. z ( x ) = ∂ s 1 ∂ u . By differentiating the IVP with respect to s 1 s_1 s 1 , show that z z z satisfies the IVP
z ′ ′ = p ( x ) z ′ + q ( x ) z , z ( 0 ) = 1 , z ′ ( 0 ) = 0. z'' = p(x)z' + q(x)z, \quad z(0)=1, \; z'(0)=0. z ′′ = p ( x ) z ′ + q ( x ) z , z ( 0 ) = 1 , z ′ ( 0 ) = 0. It follows that z ( x ) z(x) z ( x ) is independent of s 1 s_1 s 1 , and therefore u ( x ) u(x) u ( x ) is a linear function of s 1 s_1 s 1 at each fixed x x x . Use the same type of argument to show that u ( x ) u(x) u ( x ) is also a linear function of s 2 s_2 s 2 , and explain why the residual function v \mathbf{v} v in (10.2.2) is a linear function of s \mathbf{s} s .