For the time being we will focus on the rootfinding problem for single functions of one variable.
The formulation f(x)=0 is general enough to solve any equation. If we are trying to solve an equation g(x)=h(x), we can define f=g−h and find a root of f.
Unlike the linear problems of the earlier chapters, the usual situation here is that the root cannot be produced in a finite number of operations, even in exact arithmetic. Instead, we seek a sequence of approximations that formally converge to the root, stopping when some member of the sequence seems to be good enough, in a sense we will clarify later.
In the rootfinding problem, the data is a continuous function f and the result is a root. (This overrides our Chapter 1 notation of f as the map from data to result.) How does the result change in response to perturbations in f? We will compute an absolute condition number rather than a relative one.
You might wonder about the relevance of perturbing a function as data to a problem. If nothing else, the values of f will be represented in floating point and thus subject to rounding error. Furthermore, in many applications, f might not be a simple formula but the result of a computation that uses an inexact algorithm. While there are infinitely many possible perturbations to a function, a constant perturbation is enough to get the main idea.
We assume f has at least one continuous derivative near a particular root r. Suppose that f is perturbed to f~(x)=f(x)+ϵ. As a result, the root (if it still exists) will be perturbed to r~=r+δ such that f~(r~)=0. We now compute an absolute condition number κr, which is the ratio ∣∣ϵδ∣∣ as ϵ→0.
Since r is a root, we have f(r)=0. This lets us relate δ to ε, and their ratio is the condition number.
Equivalently, (4.1.4) is just the magnitude of the derivative of the inverse function f−1 at zero.
We must accept that when ∣f′∣ is small at the root, it may not be possible to get a small error in a computed root estimate. As always, the error is not a quantity we can compute without knowing the exact answer. There is something else we can measure, though.
It stands to reason that a small residual might be associated with a small error. To quantify the relationship, let r~ approximate root r, and define the new function g(x)=f(x)−f(r~). Trivially, g(r~)=0, meaning that r~ is a true root of g. Since the difference g(x)−f(x) is the residual value f(r~), the residual is the distance to an exactly solved rootfinding problem.
In general, it is not realistic to expect a small error in a root approximation if the condition number (4.1.4) is large. However, we can gauge the backward error from the residual.
The condition number (4.1.4) naturally leads to the question of what happens if f′(r)=0 at a root r. The following definition agrees with and extends the notion of algebraic multiplicity in a polynomial to roots of more general differentiable functions.
Another useful characterization of multiplicity m is that f(x)=(x−r)mq(x) for a differentiable q with q(r)=0.
When r is a nonsimple root, the condition number (4.1.4) is effectively infinite.[1] However, even if r is simple, we should expect difficulty in rootfinding if the condition number is very large. This occurs when ∣f′(r)∣ is very small, which means that the quotient q satisfies q(r)≈0 and another root of f is very close to r. We made the same observation about polynomial roots all the way back in Demo 1.4.3.
⌨ For each equation and given interval, do the following steps.
(a) Rewrite the equation into the standard form for rootfinding, f(x)=0. Make a plot of f over the given interval and determine how many roots lie in the interval.
(b) Use nlsolve to find each root, as shown in Demo 4.1.1.
(c) Compute the condition number of each root found in part (b).
x2=e−x, over [−2,2]
2x=tanx, over [−0.2,1.4]
ex+1=2+x, over [−2,2]
⌨ A basic safe type of investment is an annuity: one makes monthly deposits of size P for n months at a fixed annual interest rate r, and at maturity collects the amount
Say you want to create an annuity for a term of 300 months and final value of $1,000,000. Using nlsolve, make a table of the interest rate you will need to get for each of the different contribution values P=500,550,…,1000.
⌨ The most easily observed properties of the orbit of a celestial body around the sun are the period τ and the elliptical eccentricity ε. (A circle has ϵ=0.) From these, it is possible to find at any time t the angle θ(t) made between the body’s position and the major axis of the ellipse. This is done through
Equation (4.1.7) must be solved numerically to find ψ(t), and then (4.1.6) can be solved analytically to find θ(t).
The asteroid Eros has τ=1.7610 years and ϵ=0.2230. Using nlsolve for (4.1.7), make a plot of θ(t) for 100 values of t between 0 and τ, which is one full orbit. (Note: Use mod(θ,2π) to put the angle between 0 and 2π if you want the result to be a continuous function.)
⌨ Lambert’s W function is defined as the inverse of xex. That is, y=W(x) if and only if x=yey. Write a function lambertW that computes W using nlsolve. Make a plot of W(x) for 0≤x≤4.
✍ For each function, find the multiplicity of the given root. If it is a simple root, find its absolute condition number.