Typically a cubic spline is given for a set of points such that for each interval a cubic function is fitted to the points, with matching slopes and curvature, as well as the endpoints, have zero curvature (2nd derivative). This is the definition of a natural spline.
In your case, you have 4 points and 3 intervals. The number of unknowns is 12, as for each interval there are 4 unknowns (the cubic part).
Your constraints are:
- Points on given nodes (4 equations)
- Interval end points match (2 equations)
- Interval end slopes match (2 equations)
- Interval end curvature match (2 equations)
- End point cuvature zero (2 equations)
In total, you have 12 equations with 12 unknowns.
See http://www.physics.utah.edu/~detar/phys6720/handouts/cubic_spline/cubic_spline/node1.html for more details. Also wikipedia has a nice write-up.
For your case the solution is
$$ y(x)=\begin{cases}
\frac{2x(7-2x^{2})}{5} & x=0\ldots1\\
\frac{5x^{3}-27x^{2}+41x-9}{5} & x=1\ldots2\\
\frac{-x^{3}+9x^{2}-31x+39}{5} & x=2\ldots3
\end{cases} $$
Details
The general equation for a cubic spline is an interpolation using the values $Y_i$ and the 2nd derivatives $Y''_i$ at each node. Once the 2nd derivatives are known, then the spline is solved for and the following equation can be used for interpolation between nodes $i$ and $i+1$.
This is done with a parameter $t=0\ldots 1$ that spans the interval
$$\begin{aligned}
x_i & = X_i + t\,h \\
y_i & = (1-t) Y_i + y\,Y_{i+1} + \tfrac{h^2}{6} \left( (1-t) (t^2-2 t) Y''_i + t (t^2-1) Y''_{i+1} \right) \end{aligned} $$
The system of equations to find a spline is formed using the following procedure:
Start Point Either the spline in a natural spline with $Y''_1 = 0$ or it has a specified slope $Y'_1$ at which point the following equation must be used
$$ Y''_2 + 2 Y''_1 = \tfrac{6}{h^2} \left( Y_2 - Y_1 - h Y'_1 \right) \tag{i = 1}$$
Internal Points Here the slope needs to match at nodes and this is ensured with the following equation for $i=2 \ldots n-1$
$$ Y''_{i-1} + Y''_{i+1} + 4 Y''_i = \tfrac{6}{h^2} \left( Y_{i-1} + Y_{i+1} - 2 Y_i \right) \tag{i = 2 to n-1}$$
End Point Again the spline is either natural with $Y''_n = 0$ or with the specified slope $Y'_n$, and the following equation must be used.
$$ Y''_{n-1} +2 Y''_n = \tfrac{6}{h^2} \left( Y_{n-1} - Y_n + h Y'_n \right) \tag{i = n}$$
For your example, I assume you want a natural spline so only the interior points are considered, resulting in these two equations
$$\begin{aligned}
4 Y''_2 + Y''_3 & = \tfrac{6}{h^2} \left( Y_3 - 2 Y_2 + Y_1 \right) \\
Y''_2 + 4 Y''_3 & = \tfrac{6}{h^2} \left( Y_4 - 2 Y_3 + Y_1 \right)
\end{aligned}$$
Using $h=1$ and $Y=[0,2,1,0]$ and solved for $Y''_2$ and $Y''_3$.
NOTE: I have assumed constant interval with $h = X_{i}-X_{i-1}$, but that above can be adapted to a varying grid also.
In general, cubic interpolation is better than linear interpolation in most aspects such as smoothness of the function and higher accuracy in approximating the original function. However, there is at least one aspect where linear interpolation is better: the linear interpolation will not produce the "overshoot" situation. For example, if you have a function that always produce positive y values, using cubic interpolation to approximate this function might give you a function that "overshoots" to negative y territory even when all the interpolated y values are positive. This will not happen with linear interpolation.
As for "cubic spline interpolation", I am not aware of any context where "cubic interpolation" and "cubic spline interpolation" are considered as different.
Best Answer
Not terribly hard to do; as a matter of fact, even if you just limit yourself to Hermite interpolation, there are a number of methods. I shall discuss the three which I have most experience with.
Recall that given points $(x_i,y_i),\quad i=1\dots n$, and assuming no two $x_i$ are the same, one can fit a piecewise cubic Hermite interpolant to the data. (I gave the form of the Hermite cubic in this previous answer.)
To use the notation of that answer, you already have $x_i$ and $y_i$ and require an estimate of the $y_i^\prime$ from the given data. There are at least three schemes for doing this: Fritsch-Carlson, Steffen, and Stineman.
(In the succeeding formulae, I use the notation $h_i=x_{i+1}-x_i$ and $d_i=\frac{y_{i+1}-y_i}{h_i}$.)
The method of Fritsch-Carlson computes a weighted harmonic mean of slopes:
$$y_i^\prime = \begin{cases}3(h_{i-1}+h_i)\left(\frac{2h_i+h_{i-1}}{d_{i-1}}+\frac{h_i+2h_{i-1}}{d_i}\right)^{-1} &\text{ if }\mathrm{sign}(d_{i-1})=\mathrm{sign}(d_i)\\ 0&\text{ if }\mathrm{sign}(d_{i-1})\neq\mathrm{sign}(d_i)\end{cases}$$
the method of Steffen is based on a weighted mean (alternatively, a parabolic fit within the interval):
$$y_i^\prime = (\mathrm{sign}(d_{i-1})+\mathrm{sign}(d_i))\min\left(|d_{i-1}|,|d_i|,\frac12 \frac{h_i d_{i-1}+h_{i-1}d_i}{h_{i-1}+h_i}\right)$$
and the method of Stineman fits to circles:
$$y_i^\prime = \frac{h_{i-1} d_{i-1}h_i^2(1+d_i^2)+h_i d_i h_{i-1}^2(1+d_{i-1}^2)}{h_{i-1} h_i^2(1+d_i^2)+h_i h_{i-1}^2(1+d_{i-1}^2)}$$
The formulae I have given are applicable only to "internal" points; you'll have to consult those papers for the slope formulae for handling the endpoints.
As a demonstration of these three methods, consider these two datasets due to Akima:
$$\begin{array}{|l|l|} \hline x&y\\ \hline 1&10\\2&10\\3&10\\5&10\\6&10\\8&10\\9&10.5\\11&15\\12&50\\14&60\\15&95\\ \hline \end{array}$$
and Fritsch and Carlson:
$$\begin{array}{|l|l|} \hline x&y\\ \hline 7.99&0\\8.09&2.7629\times 10^{-5}\\8.19&4.37498\times 10^{-3}\\8.7&0.169183\\9.2&0.469428\\10&0.94374\\12&0.998636\\15&0.999919\\20&0.999994\\ \hline \end{array}$$
Here are plots of these two datasets:
Here are plots of the cubic spline fits to these two sets:
Note the wiggliness that was not present in the original data; this is the price one pays for the second-derivative continuity the cubic spline enjoys.
Here now are plots of interpolants using the three methods mentioned earlier.
This is the Fritsch-Carlson result:
This is the Steffen result:
This is the Stineman result:
(The not-too-good result for the Fritsch-Carlson data set might be due to the use of a cubic Hermite interpolant instead of the rational interpolant Stineman recommended to be used with his derivative prescription.)
As I said, I've had good experience with these three; however, you will have to experiment in your environment on which of these is most suitable to your needs.