Hi. Sorry for not getting back to your comment on my answer yesterday. Here is how I would do it:
First, some random data for my example:
data = 2100*rand(1,10000);
Next, I'll make a few sections of data repetition:
data(1,50:120) = 79.356;
data(1,200:210) = 81.220;
data(1,400:520) = 1445.201;
data(1,900:948) = 0.113;
Now do the differencing. Runs of zeros in the difference will be potential problem areas. The logical NOT operator ~ turns the result into binary data. That is, where the difference function returned zero (no change) we return "true." Everywhere else returns "false." So now we have a 9,999-element binary vector (diff returns one element fewer than data) with sections of ones and zeros, and the ones are repetitions.
datarep = ~diff(data);
Now here is where I search for the runs of ones (the places where the difference was zero). Like I said, there are definitely other ways of doing this, including using a for loop, but I find this to be the most compact and simple way I've come across. I'll split it up into steps instead of jamming it all together like I did yesterday.
First, turn your differenced vector into a string:
datarepstr = num2str(datarep);
Turning a vector into a string puts spaces between each number, so we'll use a "regular expression replace" function to get rid of them and leave us just the ones and zeros. The function finds all points of ' ' in our string and replaces them with ''.
s = regexprep(datarepstr,' ','');
Now we want to find where all the ones are in the string, as well as how long each section of ones is. regexp searches our string for all cases where there are one or more ones, or '1+'. Our expression should find four different sections of ones (because that's how many runs of repetition I added). "ids" is the start of each section and "runs" is the section pulled out from the string.
[ids runs] = regexp(s,'1+','start','match');
These values are returned in cell arrays. cellfun is a function that performs another function (in this case, length) on each cell of an array. It's like looping over each element but more compact. l should have four elements telling how long each run is.
l = cellfun('length',runs);
Now we have everything we need in order to check whether any of our potential problem runs are too long. It will all depend on the frequency of your sampling. If it's one data point every second, we'll see if any of our lengths are greater than sixty. If it's every half second, we'll look for >120. And so on.
if any(l > 60)
disp('Error')
end
Of course, you may want more info than that in your message. You may also want to stop execution of your program, in which case you would call error instead of disp. You may want to tell which elements are the problematic repetitions, and you can do that, because you have the lengths of the runs in l and the indices of where each run starts in ids.
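As a rough sketch of that last idea, here is one way the offending runs could be reported. The sample vector, the maxRun threshold, and the names bad, firstIdx and lastIdx are made up for illustration; only ids, runs and l come from the steps above. Note that a run of n ones in the differenced vector corresponds to n+1 equal samples in data, and that ids(k) is also the index of the first repeated sample in data.

```matlab
% Small made-up vector with one long repetition (illustration only).
data = [5 7 7 7 7 7 3 9 9 2];
maxRun = 2;   % hypothetical threshold on run length in the differenced vector

[ids, runs] = regexp(regexprep(num2str(~diff(data)),' ',''),'1+','start','match');
l = cellfun('length',runs);

bad = find(l > maxRun);          % which runs exceed the threshold
for k = bad
    firstIdx = ids(k);           % first repeated sample in data
    lastIdx  = ids(k) + l(k);    % last repeated sample (n zero differences span n+1 samples)
    fprintf('Value %g repeats from element %d to %d (%d samples)\n', ...
        data(firstIdx), firstIdx, lastIdx, l(k) + 1);
end
```

For this small vector, only the run of sevens (elements 2 through 6) is longer than the threshold, so that is the only one reported. Swapping fprintf for error would stop execution at the first offending run.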
Finally, here's the function in its entirety, now in a very compact form:
[ids runs] = regexp(regexprep(num2str(~diff(data)),' ',''),'1+','start','match');
l = cellfun('length',runs);
if any(l > 60)
disp('Error')
end
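For comparison, one of the "other ways" mentioned earlier can be sketched without the string conversion, using diff and find on the binary vector directly. This is not the method above, just an alternative under the assumption that data is a row vector; the name edges and the sample values are made up for illustration.

```matlab
% Small made-up row vector (illustration only).
data = [5 7 7 7 7 7 3 9 9 2];
datarep = ~diff(data);            % true where a sample equals the one before it

% Pad with zeros so runs touching either end still have a start and an end.
edges = diff([0, datarep, 0]);    % +1 where a run of ones starts, -1 just after it ends
ids = find(edges == 1);           % start index of each run
l   = find(edges == -1) - ids;    % length of each run

if any(l > 60)
    disp('Error')
end
```

Here ids and l should hold the same start indices and run lengths as the regexp version, so the same threshold check applies.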