Since your temperature readings aren't equispaced, you can't directly apply Simpson's rule; the nearest equivalent is to construct the parabola that interpolates three consecutive points (i.e., across two panels), and integrate that. The problem with this approach, of course, is that it requires an odd number of data points (an even number of panels).
You can of course use the trapezoidal rule across each panel, but a better idea is often to construct a cubic Hermite interpolant for each panel, and then integrate that. An obvious problem is that four conditions are needed to uniquely determine a cubic for each panel (two function values and two derivative values), but estimates of the derivative values can be constructed from the data such that the piecewise interpolant is locally monotonic; briefly, a piecewise interpolant is locally monotonic if there are no spurious inflection points or extrema within a panel. One approach to estimating derivative values for monotonic interpolation, due to Fritsch and Carlson, is implemented in FORTRAN as the pchip package, and in MATLAB as the function pchip. A more modern approach, and one which may be better in some situations, is due to Steffen. You may have to experiment to see which of the trapezoidal rule, Fritsch-Carlson, or Steffen works best for integrating your data.
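If SciPy is available to you, its `PchipInterpolator` (an implementation of the Fritsch-Carlson scheme) can build and integrate such an interpolant directly; a minimal sketch with made-up data:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# non-equispaced sample points; here the readings happen to lie on y = x,
# which the shape-preserving interpolant reproduces exactly
x = np.array([0.0, 1.0, 2.5, 4.0])
y = np.array([0.0, 1.0, 2.5, 4.0])

spline = PchipInterpolator(x, y)            # monotone piecewise cubic Hermite
integral = spline.integrate(x[0], x[-1])    # integral of y = x over [0, 4] is 8
```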
Let me detail the way one uses cubic Hermite interpolation for integrating data:
Each panel is bounded by two points, $(x_i,y_i)$ and $(x_{i+1},y_{i+1})$. The trapezoidal rule consists of constructing the line joining these two points (linear interpolation):
$$f_i(x)=\frac{x_{i+1}-x}{x_{i+1}-x_i}y_i+\frac{x-x_i}{x_{i+1}-x_i}y_{i+1}$$
and integrating that:
$$\int_{x_i}^{x_{i+1}}f_i(x)\mathrm dx=\frac{x_{i+1}-x_i}{2}(y_i+y_{i+1})$$
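In code, summing this panel formula over all panels is short; a minimal NumPy sketch:

```python
import numpy as np

def trapezoid_nonuniform(x, y):
    """Composite trapezoidal rule on possibly non-equispaced nodes x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    h = np.diff(x)                          # panel widths x_{i+1} - x_i
    return np.sum(h * (y[:-1] + y[1:]) / 2.0)
```

Since each panel uses the exact line through its endpoints, the rule is exact for linear data regardless of the spacing.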
Integrating with cubic Hermite interpolation can be considered as a further "improvement" of the trapezoidal rule; briefly, in addition to points, one has derivative values $y_i^{\prime}$ and $y_{i+1}^{\prime}$, from which one constructs the cubic
$$g_i(x)=y_i+y_i^{\prime}(x-x_i)+c_i(x-x_i)^2+d_i(x-x_i)^3$$
where
$\displaystyle c_i=\frac{3\frac{y_{i+1}-y_i}{x_{i+1}-x_i}-2y_i^{\prime}-y_{i+1}^{\prime}}{x_{i+1}-x_i}$ and $\displaystyle d_i=\frac{y_i^{\prime}+y_{i+1}^{\prime}-2\frac{y_{i+1}-y_i}{x_{i+1}-x_i}}{(x_{i+1}-x_i)^2}$.
Integrating $g_i(x)$ might look slightly complicated; however, there is a nice expression for the integral:
$$\int_{x_i}^{x_{i+1}}g_i(x)\mathrm dx=\frac{x_{i+1}-x_i}{6}\left(y_i+4g_i\left(\frac{x_{i+1}+x_i}{2}\right)+y_{i+1}\right)$$
whose verification I'll leave up to you.
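Put together, the panel coefficients and the midpoint identity above give a short routine; a Python sketch that takes the derivative values as input (they could come from a Fritsch-Carlson or Steffen estimate):

```python
import numpy as np

def hermite_integrate(x, y, dy):
    """Integrate the piecewise cubic Hermite interpolant with nodes x,
    values y and derivative values dy over [x[0], x[-1]]."""
    x, y, dy = (np.asarray(a, float) for a in (x, y, dy))
    h = np.diff(x)
    s = np.diff(y) / h                         # secant slope of each panel
    c = (3*s - 2*dy[:-1] - dy[1:]) / h         # quadratic coefficient c_i
    d = (dy[:-1] + dy[1:] - 2*s) / h**2        # cubic coefficient d_i
    hm = h / 2
    g_mid = y[:-1] + dy[:-1]*hm + c*hm**2 + d*hm**3    # g_i at panel midpoint
    return np.sum(h / 6 * (y[:-1] + 4*g_mid + y[1:]))  # Simpson-type identity
```

With exact derivative values the interpolant reproduces any cubic, so e.g. $y=x^3$ sampled at arbitrary nodes is integrated exactly.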
I'll just add the note that in the case of equispaced data, integration with a piecewise cubic Hermite interpolant is equivalent to integration with the trapezoidal rule plus corrections at the beginning and end.
> But I'm not convinced we should always apply this rule any time we cut into $2n$ intervals. Why not just throw out the uneven weighting and use a few more sample points? If the weighting is so helpful, why not use a more complicated weighting (like the various $n$-point rules (Newton-Cotes formulas) ... )?
The problem with Newton-Cotes methods of high order is that they inherit the same sort of problems you see with high-order interpolating polynomials. Remember that the Newton-Cotes quadrature rules are based on integrating interpolating polynomial approximations to your function over equally spaced points.
In particular, there is the Runge phenomenon: high-order interpolating polynomials over equally spaced points are in general quite oscillatory. This oscillation manifests itself in the weights of the Newton-Cotes rules: in particular, the weights of the Newton-Cotes quadrature rules for 2 to 8 points and for 10 points (Simpson's is the three-point rule) are all positive, but in all other cases negative weights are present. The reason for insisting on weights of the same sign for a quadrature rule is the phenomenon of subtractive cancellation, where two nearly equal quantities are subtracted, giving a result with fewer significant digits. By ensuring that all the weights have the same sign, any cancellation that may occur in the computation is due to the function being integrated itself (e.g. the function has a simple zero within the integration interval) and not due to the quadrature rule.
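This is easy to check; a sketch using SciPy's `newton_cotes`, which returns the weights of the closed rule on a given number of equal panels:

```python
from scipy.integrate import newton_cotes

weights_5pt, _ = newton_cotes(4)   # 5-point rule (Boole): all weights positive
weights_9pt, _ = newton_cotes(8)   # 9-point rule: negative weights appear
print(weights_5pt.min(), weights_9pt.min())
```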
The approach of breaking a function up into smaller intervals and applying a low-order quadrature rule like Simpson's is effectively the integration of a piecewise polynomial approximation. Since piecewise polynomials are known to have better approximation properties than high-degree interpolating polynomials on equispaced nodes, this good behavior is inherited by the quadrature method.
On the other hand, one can still salvage the interpolating polynomial approach if one no longer insists on having equally-spaced sample points. This gives rise to e.g. Gaussian and Clenshaw-Curtis quadrature rules, where the sample points are taken to be the roots of Legendre polynomials in the former, and roots (or extrema in some implementations) of Chebyshev polynomials in the latter. (Discussing these would make this answer too long, so I shall say no more about them, except that these quadrature rules tend to be more accurate than the corresponding Newton-Cotes rule for the same number of function evaluations.)
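To give just a taste of how well this works, a minimal sketch of Gauss-Legendre quadrature using NumPy's `leggauss`:

```python
import numpy as np

nodes, weights = np.polynomial.legendre.leggauss(5)   # roots/weights of P_5
approx = np.sum(weights * np.exp(nodes))              # approximates the integral of e^x over [-1, 1]
exact = np.e - 1 / np.e
# only 5 function evaluations, yet the error is already around 1e-9
print(abs(approx - exact))
```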
> ...is Simpson's rule so useful that calculus students should always use it for approximations?
As with any tool, blind use can lead you to a heap of trouble. In particular, we know that a polynomial can never have horizontal asymptotes or vertical tangents. It stands to reason that a polynomial will be a poor approximation to a function with these features, and thus a quadrature rule based on interpolating polynomials will also behave poorly. The piecewise approach helps a bit, but not much. One should always consider a (clever?) change of variables to eliminate such features before applying a quadrature rule.
Increasing the precision, both in the order of the method and in the number of gridpoints used, usually leads to a more accurate estimate of the integral we are trying to compute. However, this is not always the case, as the following (artificial) example shows.
$$\bf \text{Example where higher order does not imply better accuracy}$$
Let
$$f(x) = \left\{\matrix{1 & x < \frac{1}{2}\\0 & x \geq \frac{1}{2}}\right.$$
and consider the integral $I=\int_0^1f(x){\rm d}x = \frac{1}{2}$. If we use the trapezoidal rule with $n$ gridpoints then
$$I_{n} = \frac{1}{n}\sum_{i=1}^{n}\frac{f(\frac{i-1}{n})+f(\frac{i}{n})}{2} \implies I_n = \left\{\matrix{\frac{1}{2} & n~~\text{odd}\\\frac{1}{2} - \frac{1}{2n} & n~~\text{even}}\right.$$
so for $n=3$ we have the exact answer, which is better than any even $n$ no matter how large it is. This shows that increasing the number of gridpoints does not always improve the accuracy. With Simpson's rule (which requires an even number of panels $n$) we find
$$I_n = \frac{1}{3n}\sum_{i=1}^{n/2}\left[f\left(\frac{2i-2}{n}\right)+4f\left(\frac{2i-1}{n}\right)+f\left(\frac{2i}{n}\right)\right] \implies I_n = \left\{\matrix{\frac{1}{2} - \frac{1}{3n}&n\equiv 0\mod 4\\\frac{1}{2} - \frac{2}{3n} & n\equiv 2\mod 4}\right.$$
so for $n\equiv 2 \mod 4$ the Simpson error $\frac{2}{3n}$ is larger than the trapezoidal error $\frac{1}{2n}$: even though Simpson's rule has higher order, it does not always do better than the trapezoidal rule.
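These closed forms are easy to verify numerically; a Python sketch:

```python
import numpy as np

def f(x):                        # the step function above
    return np.where(x < 0.5, 1.0, 0.0)

def trap(n):
    y = f(np.arange(n + 1) / n)
    return np.sum(y[:-1] + y[1:]) / (2 * n)

def simpson(n):                  # n must be even
    y = f(np.arange(n + 1) / n)
    return (y[0] + y[-1] + 4 * np.sum(y[1:-1:2]) + 2 * np.sum(y[2:-1:2])) / (3 * n)
```

For example `trap(7)` is exactly $\tfrac12$, while `simpson(8)` gives $\tfrac12-\tfrac1{24}$.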
$$\bf \text{What does higher degree of precision really mean?}$$
If we have a smooth function then a standard Taylor series error analysis shows that the error in estimating the integral $\int_a^bf(x){\rm d}x$ using $n$ equally spaced subintervals is bounded by (here for Simpson's and the trapezoidal rule)
$$\epsilon_{\rm Simpson} = \frac{(b-a)^5}{180n^4}\max_{\zeta\in[a,b]}|f^{(4)}(\zeta)|$$ $$\epsilon_{\rm Trapezoidal} = \frac{(b-a)^3}{12n^2}\max_{\zeta\in[a,b]}|f^{(2)}(\zeta)|$$
Note that the result we get from such an error analysis is always an upper bound (or in some cases an order-of-magnitude estimate) for the error, as opposed to the exact value of the error. What this error analysis tells us is that if $f$ is smooth on $[a,b]$, so that the derivatives are bounded, then the error of a higher-order method will tend to decrease faster as we increase the number of gridpoints, and consequently we typically need fewer gridpoints to reach the same accuracy with a higher-order method.
The order of the method only tells us about the $\frac{1}{n^k}$ fall-off of the error and says nothing about the prefactor in front, so a method with an error of $\frac{100}{n^2}$ will tend to be worse than a method with an error of $\frac{1}{n}$ as long as $n < 100$.
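The $n^{-2}$ versus $n^{-4}$ fall-off is easy to observe: doubling $n$ should shrink the trapezoidal error by a factor of about $4$ and the Simpson error by about $16$. A Python sketch for $\int_0^1 e^x\,{\rm d}x$:

```python
import numpy as np

def trap(f, a, b, n):
    y = f(np.linspace(a, b, n + 1))
    return (b - a) / n * (np.sum(y) - (y[0] + y[-1]) / 2)

def simpson(f, a, b, n):          # n must be even
    y = f(np.linspace(a, b, n + 1))
    return (b - a) / (3 * n) * (y[0] + y[-1] + 4 * np.sum(y[1:-1:2]) + 2 * np.sum(y[2:-1:2]))

exact = np.e - 1.0
err_t = [abs(trap(np.exp, 0.0, 1.0, n) - exact) for n in (8, 16)]
err_s = [abs(simpson(np.exp, 0.0, 1.0, n) - exact) for n in (8, 16)]
print(err_t[0] / err_t[1], err_s[0] / err_s[1])   # ratios close to 4 and 16
```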
$$\bf \text{Why do we need all these methods?}$$
In principle we don't need any methods beyond the simplest one. If we can compute to arbitrary precision and have enough computing power, then we can evaluate any integral with the trapezoidal rule. However, in practice there are always limitations that in some cases force us to choose a different method.
Using a low-order method requires many gridpoints to ensure good enough accuracy, which can make the computation take too long, especially when the integrand is expensive to compute. Another problem, which can occur even if we can afford as many gridpoints as we want, is that round-off error (error due to computers using a finite number of digits) can come into play, so even if we use enough points the result might not be accurate.
Other methods can alleviate these potential problems. Personally, whenever I need to integrate something and have to implement the method myself, I always start with a low-precision method like the trapezoidal rule. It is very easy to implement, it's hard to make errors when coding it up, and it's usually good enough for most purposes. If this is not fast enough, or if the integrand has properties (e.g. rapid oscillations) that make it a poor fit, I try a different method. For example, I have had to compute (multidimensional) integrals where a trapezoidal rule would need more than a year to reach good enough accuracy, but with Monte-Carlo integration the time needed was less than a minute! It's therefore good to know different numerical integration methods in case you encounter a problem where the simplest method fails.
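To illustrate why Monte-Carlo wins in higher dimensions, a minimal sketch (the integrand and dimension are made up for this example): the Monte-Carlo error shrinks like $1/\sqrt{N}$ regardless of dimension, whereas a trapezoidal grid with $m$ points per axis costs $m^5$ evaluations in five dimensions.

```python
import numpy as np

rng = np.random.default_rng(12345)
N = 200_000

# estimate the 5-dimensional integral of exp(-|x|^2) over the unit cube [0,1]^5
samples = rng.random((N, 5))
estimate = np.mean(np.exp(-np.sum(samples**2, axis=1)))

# exact value is (\int_0^1 e^{-t^2} dt)^5, about 0.23232
print(estimate)
```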