Fitting temperature data

Here are the 5-year temperature averages again.

using FundamentalsNumericalComputation
year = 1955:5:2000
t = year .- 1955;
y = [ -0.0480, -0.0180, -0.0360, -0.0120, -0.0040,
    0.1180, 0.2100, 0.3320, 0.3340, 0.4560 ];

The standard best-fit line is the linear polynomial that satisfies the least-squares criterion: it minimizes the sum of the squared residuals at the data points.

V = [ t.^0 t ]    # Vandermonde-ish matrix
@show size(V)
c = V\y
size(V) = (10, 2)
2-element Array{Float64,1}:
 -0.12938181818181826
  0.011670303030303033
p = Polynomial(c)
-0.12938181818181826 + 0.011670303030303033∙x
f = s -> p(s-1955)    # shift the input so that f accepts calendar years
scatter(year,y,label="data",
    xlabel="year",ylabel="anomaly (degrees C)",leg=:bottomright)
plot!(f,1955,2000,label="linear fit")
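
As a check on the linear fit, the following sketch (not part of the original demo) compares the backslash solution with the solution of the normal equations, V'V c = V'y, and confirms that the residual is orthogonal to the columns of V. It uses only the standard LinearAlgebra library; the names `c_ne` and `r` are introduced here purely for illustration.

```julia
using LinearAlgebra

year = 1955:5:2000
t = year .- 1955
y = [ -0.0480, -0.0180, -0.0360, -0.0120, -0.0040,
    0.1180, 0.2100, 0.3320, 0.3340, 0.4560 ]

V = [ t.^0 t ]            # same 10×2 matrix as in the demo
c = V\y                   # least-squares solution via backslash
c_ne = (V'*V) \ (V'*y)    # normal equations (illustrative; fine here because V is well conditioned)
r = y - V*c               # residual at the data points

@show c ≈ c_ne            # the two solutions should agree
@show norm(V'*r)          # orthogonality: should be at roundoff level
@show norm(r)             # size of the misfit
```

The normal equations are shown only as a cross-check. For matrices with more columns or worse conditioning, backslash is the safer choice.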

If we use a global cubic polynomial, the points are fit more closely.

V = [ t[i]^j for i=1:length(t), j=0:3 ]   # Vandermonde-ish matrix
@show size(V)
size(V) = (10, 4)
(10, 4)
p = Polynomial( V\y )
f = s -> p(s-1955)

plot!(f,1955,2000,label="cubic fit")

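If we were to continue increasing the degree of the polynomial, the residual at the data points would keep shrinking, reaching zero at degree 9, where the polynomial interpolates all 10 points. The fit would increasingly follow noise in the data rather than the underlying trend, which is overfitting.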
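
The effect of the degree on the residual can be sketched by repeating the fit for several degrees. This is an illustration rather than part of the original demo: the rescaled variable `s = t/45` is an assumption made here to keep the higher-degree Vandermonde-type matrices reasonably well conditioned, and `fit_residual` is a hypothetical helper.

```julia
using LinearAlgebra

year = 1955:5:2000
t = year .- 1955
y = [ -0.0480, -0.0180, -0.0360, -0.0120, -0.0040,
    0.1180, 0.2100, 0.3320, 0.3340, 0.4560 ]
s = t ./ 45               # assumption: rescale to [0,1] to improve conditioning

# Hypothetical helper: 2-norm of the least-squares residual for a degree-n fit.
function fit_residual(s, y, n)
    V = [ s[i]^j for i in 1:length(s), j in 0:n ]   # Vandermonde-type matrix
    c = V\y
    return norm(y - V*c)
end

degrees = 1:2:9
resid = [ fit_residual(s, y, n) for n in degrees ]
for (n, r) in zip(degrees, resid)
    println("degree ", n, ":  residual norm = ", r)
end
```

Rescaling changes the coefficients but not the values of the fitted polynomial at the data points, so the residual norms match those of fits in `t` up to roundoff. A small residual measures only agreement at the data, not how the polynomial behaves between or beyond the data points.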