I have a time series defined by gradual changes interrupted by large jumps; the time series also contains random noise. It also has irregular time steps. With my eye, I can separate the time series into uniform segments of the gradual changes and ignore the noise. I would like to automate this as I actually have several hundred such time series. Are there existing techniques to do this kind of task? Otherwise I will trial and error criteria to break the time series into segments, smooth down the noise, then differentiate. My attempts have not matched my eye's results so I want to make sure I am not missing any known approaches.
MATLAB: Piecewise differentiation, automatic identification of pieces
differentiation
Related Solutions
Yes, the Symbolic Toolbox can handle that without difficulty. The syntax would go something like
syms theta(t) tdiff(cos(theta(t))
The "best" way? That will be completely dependent on your data, on what you know about your data, and on what you are willing to assume about the process that generated the data.
Looking at your data, since I don't have your data, nor do I know anything about the data or the process, I can only guess. Thus, can I assume that what looks like noise really is noise? I had one group of clients who were not willing to make that assumption. Any smoothing at all was a bad thing, because there was potentially information in the high frequencty stuff that they needed to know about. Luckily for them, the noise was not overwhelming.
In general though, numerical differentiation is a noise amplifying process. So you need to kill off the noise to whatever extent possible. You can do so by use of a spline fit (IF you use the right spline model. An interpolating spline would be generally a bad idea for you), or you could use a regression model of some sort, IF you have a viable model for this process.
Again, I don't have your data. So, at best, I can only make some data up for an example. So, let me see, this should look at least a little like what you have shown:
x = linspace(0.5,2.5,100);y = 19 + exp((x - 0.5)/2)/10 + randn(size(x))/100;plot(x,y,'.')
The picture you show has a fair amount of data, yet it is not that complex of a relationship. As such, a low order polynomial model might be sufficient. Then you could just differentiate the polynomial. For example...
P3 = polyfit(x,y,3)plot(x,y,'b.',x,polyval(P3,x),'r-')
P3d = polyder(P3)plot(x,polyval(P3d,x),'-')
A problem with such an approach is it does not build information that you may have about the underlying functional relationship. That is, do you know that temperature will ALWAYS increase with time? If so, then the first derivative must always be positive, yet a polynomial fit gives you no such assurance. In fact, you might notice that the true underlying function behind my data has an everywhere increasing derivative, yet a cubic polynomial model had a negative second derivative. Worse, what if you use too high an order for a polynomial? That is easy to do.
P7 = polyfit(x,y,7);plot(x,y,'b.',x,polyval(P7,x),'r-')
The same fact applies to a Savitsky-Golay filter. A Savitsky-Golay derivative estimation on noisy data will need to have a long window width, and low order implicit polynomial model. However, remember that Savitsky-Golay does not produce a differentiable or continuous function/derivative estimate. So, using my own movingslope code, that builds polynomial model on a moving window, then returns the derivative, we see:
plot(x,movingslope(y,20,1),'.')
Anyway, if we look at the bottom end of the curve, the 7th degree polynomial started to get a little squirrely. So, I might try to use a regression spline, constrained to have the expected shape. Lets see,
slm = slmengine(x,y,'plot','on','increasing','on','concaveup','on','jerk','positive');
So here, a cubic spline, constrained to have first, second, and third derivatives entirely positive over the domain.
Better yet, of course, is to use a nonlinear model, IF you know a good choice of model in advance.
But really, what matters is as I said in the beginning. What do you know? Because once you can provide information to the modeling process, you will GREATLY improve the derivative estimation process. Tools like Savitsky-Golay assume almost nothing about the system. A polynomial model assumes relatively little, but somewhat more.
Best Answer