Definition 5.1.1 (Interpolation problem)
Given n + 1 n+1 n + 1 distinct points ( t 0 , y 0 ) (t_0,y_0) ( t 0 , y 0 ) , ( t 1 , y 1 ) , … , ( t n , y n ) (t_1,y_1),\ldots,(t_n,y_n) ( t 1 , y 1 ) , … , ( t n , y n ) , with t 0 < t 1 < … < t n t_0<t_1<\ldots <t_n t 0 < t 1 < … < t n called nodes , the interpolation problem is to find a function p ( x ) p(x) p ( x ) , called the interpolant , such that p ( t k ) = y k p(t_k)=y_k p ( t k ) = y k for k = 0 , … , n k=0,\dots,n k = 0 , … , n .
In this chapter, we use t k t_k t k for the nodes and x x x to denote the continuous independent variable.
The interpolation nodes are numbered from 0 to n n n . This is convenient for our mathematical statements, but less so in a language such as Julia in which vector indices start with 1. Our approach is that indices in a computer code have the same meaning as those identically named in the mathematical formulas , and therefore must be incremented by one whenever used in an indexing context.
5.1.1 Polynomials ¶ Polynomials are the obvious first candidate to serve as interpolating functions. They are easy to work with, and in Polynomial interpolation we saw that a linear system of equations can be used to determine the coefficients of a polynomial that passes through every member of a set of given points in the plane. However, it’s not hard to find examples for which polynomial interpolation leads to unusable results.
Trouble in polynomial interpolationHere are some points that we could consider to be observations of an unknown function on [ − 1 , 1 ] [-1,1] [ − 1 , 1 ] .
n = 5
t = range(-1, 1, length=n + 1)
y = @. t^2 + t + 0.05 * sin(20 * t)
scatter(t, y, label="data", leg=:top)
The polynomial interpolant, as computed using fit
, looks very sensible. It’s the kind of function you’d take home to meet your parents.
p = Polynomials.fit(t, y, n) # interpolating polynomial
plot!(p, -1, 1, label="interpolant")
But now consider a different set of points generated in almost exactly the same way.
n = 18
t = range(-1, 1, length=n + 1)
y = @. t^2 + t + 0.05 * sin(20 * t)
scatter(t, y, label="data", leg=:top)
The points themselves are unremarkable. But take a look at what happens to the polynomial interpolant.
p = Polynomials.fit(t, y, n)
x = range(-1, 1, length=1000) # use a lot of points
plot!(x, p.(x), label="interpolant")
Surely there must be functions that are more intuitively representative of those points!
Trouble in polynomial interpolationHere are some points that we could consider to be observations of an unknown function on [ − 1 , 1 ] [-1,1] [ − 1 , 1 ] .
n = 5;
t = linspace(-1,1,n+1)';
y = t.^2 + t + 0.05 * sin(20 * t);
clf, scatter(t,y)
The polynomial interpolant, as computed using polyfit
, looks very sensible. It’s the kind of function you’d take home to meet your parents.
c = polyfit(t, y, n); % polynomial coefficients
p = @(x) polyval(c, x);
hold on
fplot(p, [-1 1])
legend('data', 'interpolant', 'location', 'north')
But now consider a different set of points generated in almost exactly the same way.
n = 18;
t = linspace(-1, 1, n+1);
y = t.^2 + t + 0.05 * sin(20 * t);
clf, scatter(t, y)
The points themselves are unremarkable. But take a look at what happens to the polynomial interpolant.
c = polyfit(t, y, n); % polynomial coefficients
p = @(x) polyval(c, x);
hold on, fplot(p, [-1 1])
legend('data', 'interpolant', 'location', 'north')
Surely there must be functions that are more intuitively representative of those points!
Trouble in polynomial interpolationHere are some points that we could consider to be observations of an unknown function on [ − 1 , 1 ] [-1,1] [ − 1 , 1 ] .
n = 5
t = linspace(-1, 1, n + 1)
y = t**2 + t + 0.05 * sin(20 * t)
fig, ax = subplots()
plot(t, y, "o", label="data")
xlabel("$x$")
ylabel("$y$")
The polynomial interpolant, as computed using fit
, looks very sensible. It’s the kind of function you’d take home to meet your parents.
p = poly1d(polyfit(t, y, n)) # interpolating polynomial
tt = linspace(-1, 1, 400)
ax.plot(tt, p(tt), label="interpolant")
ax.legend()
fig
But now consider a different set of points generated in almost exactly the same way.
n = 18
t = linspace(-1, 1, n + 1)
y = t**2 + t + 0.05 * sin(20 * t)
fig, ax = subplots()
plot(t, y, "o", label="data")
xlabel("$x$")
ylabel("$y$")
The points themselves are unremarkable. But take a look at what happens to the polynomial interpolant.
p = poly1d(polyfit(t, y, n))
ax.plot(tt, p(tt), label="interpolant")
ax.legend()
fig
Surely there must be functions that are more intuitively representative of those points!
In Chapter 9 we explore the large oscillations in the last figure of Demo 5.1.1 ; it turns out that one must abandon either equally spaced nodes or n → ∞ n\to\infty n → ∞ for polynomials. In the rest of this chapter we will keep n n n fairly small and let the nodes be unrestricted.
5.1.2 Piecewise polynomials ¶ In order to keep polynomial degrees small while interpolating large data sets, we will choose interpolants from the piecewise polynomials . Specifically, the interpolant p p p must be a polynomial on each subinterval [ t k − 1 , t k ] [t_{k-1},t_k] [ t k − 1 , t k ] for k = 1 , … , n k=1,\ldots,n k = 1 , … , n .
Some examples of piecewise polynomials for the nodes t 0 = − 2 t_0=-2 t 0 = − 2 , t 1 = 0 t_1=0 t 1 = 0 , t 2 = 1 t_2=1 t 2 = 1 , and t 3 = 4 t_3=4 t 3 = 4 are p 1 ( x ) = x + 1 p_1(x)=x+1 p 1 ( x ) = x + 1 , p 2 ( x ) = sign ( x ) p_2(x)=\operatorname{sign}(x) p 2 ( x ) = sign ( x ) , p 3 ( x ) = ∣ x − 1 ∣ 3 p_3(x)=|x-1|^{3} p 3 ( x ) = ∣ x − 1 ∣ 3 , and p 4 ( x ) = ( max { 0 , x } ) 4 p_4(x)=(\max\{0,x\})^{4} p 4 ( x ) = ( max { 0 , x } ) 4 . Note that p 1 p_{1} p 1 , p 2 p_{2} p 2 , and p 4 p_4 p 4 would also be piecewise polynomial on the node set { t 0 , t 1 , t 3 } \{t_0,t_1,t_3\} { t 0 , t 1 , t 3 } , but p 3 p_3 p 3 would not.
Usually we designate in advance a maximum degree d d d for each polynomial piece of p ( x ) p(x) p ( x ) . An important property of the piecewise polynomials of degree d d d is that they form a vector space: that is, any linear combination of piecewise polynomials of degree d d d is another piecewise polynomial of degree d d d . If p p p and q q q share the same node set, then the combination is piecewise polynomial on that node set.
Piecewise polynomial interpolationLet us recall the data from Demo 5.1.1 .
n = 12
t = range(-1, 1, length=n + 1)
y = @. t^2 + t + 0.5 * sin(20 * t)
scatter(t, y, label="data", leg=:top)
Here is an interpolant that is linear between each consecutive pair of nodes, using plinterp
from Piecewise linear interpolation .
p = FNC.plinterp(t, y)
plot!(p, -1, 1, label="piecewise linear")
We may prefer a smoother interpolant that is piecewise cubic, generated using Spline1D
from the Dierckx
package.
p = Spline1D(t, y)
plot!(x -> p(x), -1, 1, label="piecewise cubic")
Piecewise polynomial interpolationLet us recall the data from Demo 5.1.1 .
n = 18;
t = linspace(-1, 1, n+1);
y = t.^2 + t + 0.05 * sin(20 * t);
clf, scatter(t, y)
Here is an interpolant that is linear between each consecutive pair of nodes, using interp1
from MATLAB.
x = linspace(-1, 1, 400)';
hold on, plot(x, interp1(t, y, x))
title('Piecewise linear interpolant')
We may prefer a smoother interpolant that is piecewise cubic, generated using Spline1D
from the Dierckx
package.
cla
scatter(t, y)
plot(x, interp1(t, y, x, 'spline'))
title('Piecewise cubic interpolant')
Piecewise polynomial interpolationLet us recall the data from Demo 5.1.1 .
clf
n = 12
t = linspace(-1, 1, n + 1)
y = t**2 + t + 0.5 * sin(20 * t)
fig, ax = subplots()
scatter(t, y, label="data")
xlabel("$x$")
ylabel("$y$")
Here is an interpolant that is linear between each consecutive pair of nodes, using plinterp
from Piecewise linear interpolation .
tt = linspace(-1, 1, 400)
p = interp1d(t, y, kind="linear")
ax.plot(tt, p(tt), label="piecewise linear")
ax.legend()
fig
We may prefer a smoother interpolant that is piecewise cubic:
scatter(t, y, label="data")
p = interp1d(t, y, kind="cubic")
tt = linspace(-1, 1, 400)
plot(tt, p(tt), label="cubic spline")
xlabel("$x$")
ylabel("$y$")
legend()
We will consider piecewise linear interpolation in more detail in Piecewise linear interpolation , and we look at piecewise cubic interpolation in Cubic splines .
5.1.3 Conditioning of interpolation ¶ In the interpolation problem we are given the values ( t k , y k ) (t_k,y_k) ( t k , y k ) for k = 0 , … , n k=0,\ldots,n k = 0 , … , n . Let us consider the nodes t k t_k t k of the problem to be fixed, and let a = t 0 a=t_0 a = t 0 , b = t n b=t_n b = t n . Then the data for the interpolation problem consists of a vector y \mathbf{y} y , and the result of the problem is a function on [ a , b ] [a,b] [ a , b ] .
Let I \mathcal{I} I be a prescription for producing the interpolant from a data vector. That is, I ( y ) = p \mathcal{I}(\mathbf{y})=p I ( y ) = p , where p ( t k ) = y k p(t_k)=y_k p ( t k ) = y k for all k k k . The interpolation methods we will consider are all linear , in the sense that
I ( α y + β z ) = α I ( y ) + β I ( z ) \cI(\alpha\mathbf{y} + \beta\mathbf{z}) = \alpha \cI(\mathbf{y}) + \beta \cI(\mathbf{z}) I ( α y + β z ) = α I ( y ) + β I ( z ) for all vectors y , z \mathbf{y},\mathbf{z} y , z and scalars α , β \alpha,\beta α , β .
Linearity greatly simplifies the analysis of interpolation. To begin with, for any data vector y \mathbf{y} y we have the standard expression y = ∑ y k e k \mathbf{y}=\sum y_k \mathbf{e}_k y = ∑ y k e k , where as always e k \mathbf{e}_k e k is a column of an identity matrix.[1] Hence by linearity,
I ( y ) = I ( ∑ k = 0 n y k e k ) = ∑ k = 0 n y k I ( e k ) . \cI( \mathbf{y} ) = \cI \left( \sum_{k=0}^n y_k \mathbf{e}_k \right) = \sum_{k=0}^n y_k \cI( \mathbf{e}_k ). I ( y ) = I ( k = 0 ∑ n y k e k ) = k = 0 ∑ n y k I ( e k ) . The functions appearing within the sum above have particular significance.
Definition 5.1.2 (Cardinal function)
A cardinal function ϕ k \phi_k ϕ k for a node set t 0 , … , t n t_0,\ldots,t_n t 0 , … , t n is the function that interpolates the value ( t k , 1 ) (t_k,1) ( t k , 1 ) and ( t j , 0 ) (t_j,0) ( t j , 0 ) for all j ≠ k j\neq k j = k .
For any set of n + 1 n+1 n + 1 nodes, there are n + 1 n+1 n + 1 cardinal functions ϕ 0 , … , ϕ n \phi_0,\ldots,\phi_n ϕ 0 , … , ϕ n , each singling out a different interpolation node in the set. We finish (5.1.2) by writing
I ( y ) = ∑ k = 0 n y k ϕ k . \cI( \mathbf{y} ) = \sum_{k=0}^n y_k \phi_k. I ( y ) = k = 0 ∑ n y k ϕ k . In the following result we use the function infinity-norm or max-norm defined by
∥ f ∥ ∞ = max x ∈ [ a , b ] ∣ f ( x ) ∣ . \| f\|_{\infty} = \max_{x \in [a,b]} |f(x)|. ∥ f ∥ ∞ = x ∈ [ a , b ] max ∣ f ( x ) ∣. Theorem 5.1.1 (Conditioning of interpolation)
Suppose that I \cI I is a linear interpolation method on nodes t 0 , … , t n t_0,\ldots,t_n t 0 , … , t n . Then with respect to the infinity norm, the absolute condition number of I \cI I satisfies
max 0 ≤ k ≤ n ∥ ϕ k ∥ ∞ ≤ κ ( y ) ≤ ∑ k = 0 n ∥ ϕ k ∥ ∞ , \max_{0\le k \le n}\, \bigl\| \phi_k \bigr\|_\infty \le \kappa(\mathbf{y}) \le \sum_{k=0}^n \, \bigl\| \phi_k \bigr\|_\infty, 0 ≤ k ≤ n max ∥ ∥ ϕ k ∥ ∥ ∞ ≤ κ ( y ) ≤ k = 0 ∑ n ∥ ∥ ϕ k ∥ ∥ ∞ , where the ϕ k \phi_k ϕ k are cardinal interpolating functions.
Suppose the data vector is perturbed from y \mathbf{y} y to y + d \mathbf{y}+ \mathbf{d} y + d . Then
I ( y + d ) − I ( y ) = I ( d ) = ∑ k = 0 n d k ϕ k . \cI(\mathbf{y} + \mathbf{d}) - \cI(\mathbf{y}) = \cI(\mathbf{d}) = \sum_{k=0}^n d_k \phi_k. I ( y + d ) − I ( y ) = I ( d ) = k = 0 ∑ n d k ϕ k . Hence
∥ I ( y + d ) − I ( y ) ∥ ∞ ∥ d ∥ ∞ = ∥ ∑ k = 0 n d k ∥ d ∥ ∞ ϕ k ∥ ∞ . \frac{\bigl\|\cI(\mathbf{y} + \mathbf{d}) - \cI(\mathbf{y}) \bigr\|_{\infty}}{\| \mathbf{d} \|_{\infty}} =
\left\|\, \sum_{k=0}^{n} \frac{d_k}{\|\mathbf{d} \|_{\infty}} \phi_k \, \right\|_{\infty}. ∥ d ∥ ∞ ∥ ∥ I ( y + d ) − I ( y ) ∥ ∥ ∞ = ∥ ∥ k = 0 ∑ n ∥ d ∥ ∞ d k ϕ k ∥ ∥ ∞ . The absolute condition number maximizes this quantity over all d \mathbf{d} d . Suppose j j j is such that ∥ ϕ j ∥ ∞ \|\phi_j\|_\infty ∥ ϕ j ∥ ∞ is maximal. Then let d = e j \mathbf{d}=\mathbf{e}_j d = e j and the first inequality in (5.1.5) follows. The other inequality follows from the triangle inequality:
∥ ∑ k = 0 n d k ∥ d ∥ ∞ ϕ k ∥ ∞ ≤ ∑ k = 0 n ∣ d k ∣ ∥ d ∥ ∞ ∥ ϕ k ∥ ∞ . \left\| \, \sum_{k=0}^{n} \frac{d_k}{\|\mathbf{d} \|_{\infty}} \phi_k \, \right\|_{\infty} \le \sum_{k=0}^{n} \frac{|d_k|}{\|\mathbf{d} \|_{\infty}} \| \phi_k \|_\infty. ∥ ∥ k = 0 ∑ n ∥ d ∥ ∞ d k ϕ k ∥ ∥ ∞ ≤ k = 0 ∑ n ∥ d ∥ ∞ ∣ d k ∣ ∥ ϕ k ∥ ∞ . Since ∣ d k ∣ ≤ ∥ d ∥ ∞ |d_k|\le \|\mathbf{d}\|_\infty ∣ d k ∣ ≤ ∥ d ∥ ∞ for all k k k , this finishes (5.1.5) .
Conditioning of interpolationIn Demo 5.1.1 and Demo 5.1.3 we saw a big difference between polynomial interpolation and piecewise polynomial interpolation of some arbitrarily chosen data. The same effects can be seen clearly in the cardinal functions, which are closely tied to the condition numbers.
n = 18
t = range(-1, stop=1, length=n + 1)
y = [zeros(9); 1; zeros(n - 9)]; # data for 10th cardinal function
scatter(t, y, label="data")
ϕ = Spline1D(t, y)
plot!(x -> ϕ(x), -1, 1, label="spline",
xlabel=L"x", ylabel=L"\phi(x)",
title="Piecewise cubic cardinal function")
The piecewise cubic cardinal function is nowhere greater than one in absolute value. This happens to be true for all the cardinal functions, ensuring a good condition number for any interpolation with these functions. But the story for global polynomials is very different.
scatter(t, y, label="data")
ϕ = Polynomials.fit(t, y, n)
plot!(x -> ϕ(x), -1, 1, label="polynomial",
xlabel=L"x", ylabel=L"\phi(x)", legend=:top,
title="Polynomial cardinal function")
From the figure we can see that the condition number for polynomial interpolation on these nodes is at least 500.
Conditioning of interpolationIn Demo 5.1.1 and Demo 5.1.3 we saw a big difference between polynomial interpolation and piecewise polynomial interpolation of some arbitrarily chosen data. The same effects can be seen clearly in the cardinal functions, which are closely tied to the condition numbers.
n = 18;
t = linspace(-1, 1, n+1)';
y = [zeros(9, 1); 1; zeros(n - 9, 1)]; % 10th cardinal function
clf, scatter(t, y)
hold on
x = linspace(-1, 1, 400)';
plot(x, interp1(t, y, x, 'spline'))
title('Piecewise cubic cardinal function')
xlabel('x'), ylabel('p(x)')
The piecewise cubic cardinal function is nowhere greater than one in absolute value. This happens to be true for all the cardinal functions, ensuring a good condition number for any interpolation with these functions. But the story for global polynomials is very different.
clf, scatter(t, y)
c = polyfit(t, y, n);
hold on, plot(x, polyval(c, x))
title('Polynomial cardinal function')
xlabel('x'), ylabel('p(x)')
From the figure we can see that the condition number for polynomial interpolation on these nodes is at least 500.
Conditioning of interpolationIn Demo 5.1.1 and Demo 5.1.3 we saw a big difference between polynomial interpolation and piecewise polynomial interpolation of some arbitrarily chosen data. The same effects can be seen clearly in the cardinal functions, which are closely tied to the condition numbers.
clf
n = 18
t = linspace(-1, 1, n + 1)
y = zeros(n + 1)
y[9] = 1.0
p = interp1d(t, y, kind="cubic")
scatter(t, y, label="data")
tt = linspace(-1, 1, 400)
plot(tt, p(tt), label="cardinal function")
title("Cubic spline cardinal function")
legend()
The piecewise cubic cardinal function is nowhere greater than one in absolute value. This happens to be true for all the cardinal functions, ensuring a good condition number for any interpolation with these functions. But the story for global polynomials is very different.
p = poly1d(polyfit(t, y, n))
scatter(t, y, label="data")
plot(tt, p(tt), label="cardinal function")
xlabel("$x$")
ylabel("$y$")
title("Polynomial cardinal function")
legend()
From the figure we can see that the condition number for polynomial interpolation on these nodes is at least 500.
5.1.4 Exercises ¶ ⌨ Create data by entering
(a) Use fit
to construct and plot the polynomial interpolant of the data, superimposed on a scatter plot of the data.
(b) Use Spline1D
to construct and plot a piecewise cubic interpolant of the data, superimposed on a scatter plot of the data.
⌨ The following table gives the life expectancy in the U.S. by year of birth.
1980 1985 1990 1995 2000 2005 2010 73.7 74.7 75.4 75.8 77.0 77.8 78.7
(a) Defining “year since 1980” as the independent variable, use fit
to construct and plot the polynomial interpolant of the data.
(b) Use Spline1D
to construct and plot a piecewise cubic interpolant of the data.
(c) Use both methods to estimate the life expectancy for a person born in 2007. Which value is more believable?
⌨ The following two vectors define a flying saucer shape.
x = [ 0,0.51,0.96,1.06,1.29,1.55,1.73,2.13,2.61,
2.19,1.76,1.56,1.25,1.04,0.58,0 ]
y = [ 0,0.16,0.16,0.43,0.62,0.48,0.19,0.18,0,
-0.12,-0.12,-0.29,-0.30,-0.15,-0.16,0 ]
We can regard both x x x and y y y as functions of a parameter s s s , with the points being values given at s = 0 , 1 , … , 15 s=0,1,\ldots,15 s = 0 , 1 , … , 15 .
(a) Use Spline1D
once on each coordinate as functions of s s s , and make a picture of the flying saucer.
(b) One drawback of the result in part (a) is the noticeable corner at the left side, which corresponds to s = 0 s=0 s = 0 from above and s = 15 s=15 s = 15 from below. There is a periodic variation on cubic spline interpolation that you can invoke by adding the keyword periodic=true
to the Spline1D
call. Use this to re-plot the flying saucer.
✍ Define
q ( s ) = a s ( s − 1 ) 2 − b ( s − 1 ) ( s + 1 ) + c s ( s + 1 ) 2 . q(s) = a\frac{s(s-1)}{2} - b (s-1)(s+1) + c \frac{s(s+1)}{2}. q ( s ) = a 2 s ( s − 1 ) − b ( s − 1 ) ( s + 1 ) + c 2 s ( s + 1 ) . (a) Show that q q q is a polynomial interpolant of the points ( − 1 , a ) (-1,a) ( − 1 , a ) , ( 0 , b ) (0,b) ( 0 , b ) , ( 1 , c ) (1,c) ( 1 , c ) .
(b) Find a change of variable s = A x + B s=Ax+B s = A x + B so that the values s = − 1 , 0 , 1 s=-1,0,1 s = − 1 , 0 , 1 correspond to x = x 0 − h , x 0 , x 0 + h x=x_0-h,x_0,x_0+h x = x 0 − h , x 0 , x 0 + h .
(c) Find a quadratic polynomial interpolant q ~ ( x ) \tilde{q}(x) q ~ ( x ) for the points ( x 0 − h , a ) (x_0-h,a) ( x 0 − h , a ) , ( x 0 , b ) (x_0,b) ( x 0 , b ) , ( x 0 + h , c ) (x_0+h,c) ( x 0 + h , c ) .
✍ (continuation) Use the result of the previous exercise and Theorem 5.1.1 to derive bounds on the condition number of quadratic polynomial interpolation at the nodes x 0 − h x_0-h x 0 − h , x 0 x_0 x 0 , x 0 + h x_0+h x 0 + h .