MATLAB: Numerical Integration on Discrete Data Sets

MATLAB

Hi All,
Im currently trying to run the following code:
close all; % close all programs
clc; % clears command window
clear all; % Clears memory
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%






%% Parameters
%Function and Interval
A= csvread('finaldata.csv');
x= A(:,2)
value= @sin
a= 0; % Lower limit of function
b= pi; % Upper limit of function
% Number of Points
N= 1250; % Number of points can be adjusted later
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Discrete Integration
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Positions of the points


dx = (b-a)/N;
x = [0.0010:N - 0.0010]*dx; % Centre positions of segments under the curve to be integrated
% Calculate Function


y = value(x) % Calculates function at midpoints
% Numerical Integration


I_Discrete = sum(y)*dx; % Numerical integration by summing and multiplying by dx
subplot(1,3,1)
plot(x,y)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Trapezoidal Integration
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Positions of the points
x = linspace(a,b,N); % To place N points on a and b exactly

dx = x(2) -x(1); % Extracts spacing between x values

% Calculate Function
y = value(x); % Caluclates function at midpoints

% Numerical Integration
W = [0.5 ones(1,N-2) 0.5];
I_Trap = sum(W.*y)*dx;
subplot(1,3,2)
plot(x,y)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Simpsons 1/3rd Integration
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Positions of the points
x = linspace(a,b,N); % To place N points on a and b exactly
dx = x(2) -x(1); % Extracts spacing between x values
% Calculate Function
y = value(x); % Caluclates function at midpoints
% Numerical Integration
I = 0;
for n = 1 : 2 : N-2
a = y(n) + 4*y(n+1) + y(n+2);
I = I + a;
end
I_simp = I*(dx/3);
subplot(1,3,3)
plot(x,y);
Im trying to feed a data set (it forms a sinewave) to integrate to remove the phase shift on the signal. The program works fine with the sin function, but cannot seem to get it working well with my data set. the data set has two collumns which when plotted together shows the sinewave over a number of periods.
Is it a case that i cannot apply the data in this way? Any help is appreicated.
Andy

Best Answer

I might first guess that your data is not uniformly spaced. Simpson's rule does not apply to non-uniformly spaced data (thus non-uniform in x.) Even trapezoidal rule needs to be constructed correctly for it to work. But, since we do not see your data, this is only a guess.
Most simply, you should just use trapz. It is designed to take a vector x and y, then compute the integral. Writing code to do what is already available to solve your problem is usually a bad idea, unless you know enough to do a better job. Here you don't. Sorry, but that is just a statement of fact. So use trapz.
x = [0,sort(rand(1,100)),1]*pi;
y = sin(x);
trapz(x,y)
ans =
1.99847987386531
We know the integral of sin(x) over [0,pi] is 2, so that is quite reasonable. Trapz has no problem with the unevenly spaced data, yet if I did what you did, it would fail miserably, as would Simpson's rule.
If you have the curve fitting toolbox, then a far better solution would come from an integral of a spline.
spl = spline(x,y);
diff(fnval(fnint(spl),[0,pi]))
ans =
1.99999912870377
But even lacking that toolbox, you can still use integral.
spl = spline(x,y);
integral(@(x) ppval(spl,x),0,pi)
ans =
1.99999912487824
Again though, unless you have been forbidden from using better tools as a homework assignment, writing code to replace existing code is a bad idea.
Could I have shown you how to fix the problem with trapezoidal rule? Well, yes. You constructed the vector of weights improperly, assuming unequally spaced data. A non-uniform trapezoidal rule would use the entire vector of differences in x to compute the weights on each data point. Could you fix Simpson's rule to work for unequally spaced data? That is not as easily doable, and the order of the rule would potentially suffer unless you were very careful. So for a more accurate integration of data, the best idea is to use the spline integral I did show you how to use.
Finally, is there a best way to solve the problem in the case where your data is noisy, thus with significant noise in y? In this case, it depends on how much noise there is in y then. For example...
y = sin(x) + randn(size(x))/5;
trapz(x,y)
ans =
1.98676569356795
spl = spline(x,y);
diff(fnval(fnint(spl),[0,pi]))
ans =
1.36442608167266
What you should see here is the spline integral did quite poorly on noisy data, whereas trapezoidal rule was still pretty good. An integral of a pchip interpolant is nearly as good as trapezoidal rule.