Solved – Cubic splines for interpolation through four points in R

interpolationrsplines

I am attempting to write R code for cubic splines to connect points on a graph. Specifically, I am attempting to reproduce Figure 3.3 of Wood (2006) Generalized Additive Models: An Introduction with R (page 124) where he uses seven cubic splines to connect eight points. I am starting with only three or four data points and have modified their values, so they may not closely match his data.

I have written R code based on equations provided at Wikipedia.

https://en.wikipedia.org/wiki/Spline_interpolation

I believe I have successfully programmed the equations provided in Wikipedia for the example there with three data points. However, I am unsure whether I have correctly modified that code for four data points.

What is the name of the algorithm shown in the above Wikipedia article and used in the example there? Is there a readily accessible citation for that algorithm?

I cannot see where the equations in the Wikipedia example follow from the derivation of the algorithm. Specifically, where do the equations for a11 and a22 come from and what do a11 and a22 represent?

The plot created by my R code for three data points appears to match the plot provided in the Wikipedia example. However, I cannot reproduce that plot using the spline function in R.

Wolphram-Alpha creates a plot that appears to match the plot returned by the FMM method in the spline function in R. What does FMM stand for and what would be a good citation for that method?

I suspect the reason the plot I obtain and the example plot in Wikipedia does not match the plot returned by Wolphram-Alpha or by the FMM function in R is that the example in Wikipedia uses two different cubic splines joined together at the middle data point, but that Wolphram-Alpha / FMM in R create one polynomial equation using all data points simultaneously. Is that correct?

In other words, I suspect Wikipedia and my R code use spline interpolation, but that Wolfram-Alpha and the FMM method in R use polynomial interpolation. Is that correct?

Originally I was hoping for verification that I had generated the correct equations for four data points based on the equations in Wikipedia for three data points. However, I have now shifted my focus to the questions in the preceding paragraphs and removed my R code.

If I can learn the answers to the questions above perhaps I can make progress following the algorithm shown in the above Wikipedia article and then perhaps post a programming question on Stack Overflow. Thank you for any answers to those questions.

EDIT

So far the best reference I have found is Numerical Computing with MATLAB by Cleve Moler, available here:

http://uk.mathworks.com/moler/chapters.html

If I make substantial progress I will post what I learn as an answer.

Best Answer

I have made progress toward answers to my questions. Here I post what I have learned so far and I can modify my post as I learn more.

Firstly, the best reference I have found so far is Numerical Computing with MATLAB by Cleve Moler, as I mentioned in an edit to the question and as Mark L. Stone mentioned in a comment.

I have not yet derived the algorithm shown in Wikipedia. Nevertheless, I assumed my R code was correct and expanded that code to handle the eight points Wood (2006:124) used in Figure 3.3 of Generalized Additive Models: An Introduction with R.

Those eight points, to the best of my eyesight, are:

my.data <- read.table(text = '
      x     y
    0.04   0.60
    0.18   0.01
    0.32   0.30
    0.415  0.28
    0.58   0.79
    0.68   0.27
    0.81   0.70
    0.88   0.83
', header = TRUE, stringsAsFactors = FALSE, na.strings = 'NA')

In the process I discovered an error in my R code, described below. After correcting that error I discovered that the example in Wikipedia, and my R code, does indeed reproduce Wood's (2006) Figure 3.3. I was also able to reproduce that same plot using this built-in R command:

lines(spline(x, y, method = c("natural"), n = 100), lwd = 2, col = 'blue')

More specifically:

spline(x, y, method = c("natural"), n = 100)

In other words, the Wikipedia example is using the natural cubic spline as near as I can tell. Not the FMM spline. The FMM spline seems to stand for the Forsythe, Malcolm, Moler spline described in Forsythe, G. E., Malcolm, M. A. and Moler, C. B. (1977) Computer Methods for Mathematical Computations. Wiley.

Wolfram-Alpha does not reproduce that plot. At least I have not been able to do it. When I use this command in Wolfram-Alpha:

interpolating polynomial {0.04,0.60},{0.18,0.01},{0.32,0.30},{0.415,0.28},{0.58,0.79},{0.68,0.27},{0.81,0.70},{0.88,0.83}

Wolfram-Alpha returns a single polynomial:

6.29713 - 221.199 x + 2402.99 x^2 - 12051.2 x^3 + 32012.2 x^4 - 46374.4 x^5 + 34557.4 x^6 - 10355.5 x^7

Note that this is a 7th-order polynomial, not a cubic polynomial.

I did try to create a 3rd-order polynomial in Wolfram-Alpha using:

BSplineCurve[{{0.04,0.60},{0.18,0.01},{0.32,0.30},{0.415,0.28},{0.58,0.79},{0.68,0.27},{0.81,0.70},{0.88,0.83}}, SplineDegree -> 3]

However, this returns no results. This might be Mathematica code. Regardless, the Wolfram-Alpha code I have tried does not return a natural cubic spline.

Once I derive the example equations in Wikipedia I will post my interpretation of them, in particular regarding a11 and a22. My limited understanding of the difference between the splines R offers is that it has to do with how they treat the second derivative of the first and last cubic spline.

Lastly, for now, regarding the error in my R code, the line:

a33 = 2 * ((1 / (x[3] - x[2])) + (1 / (x[3] - x[2])))

should have been.

a33 = 2 * ((1 / (x[3] - x[2])) + (1 / (x[4] - x[3])))

Best Answer

Related Solutions

Solved – Calculation of natural cubic splines in R

Splines Regression – Addressing Multicollinearity in Predictive Models

Related Question