MATLAB: Delete rows with bad data and surrounding rows

I would like to delete rows that contain ones, since ones indicate bad data (inclusion criterion 1). I would also like to remove good rows that are surrounded by rows with bad data: a row should be kept only if it belongs to a run of at least 3 consecutive good (all-zero) rows (inclusion criterion 2). I created a matrix B to illustrate my question:
B = [0 1 0 0 1 0 1;
0 0 0 0 0 0 0;
0 1 0 0 1 0 1;
0 1 0 0 0 1 0;
0 0 0 0 0 0 0;
0 1 0 1 1 0 1;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 1 0 0 0 1 0;
0 1 0 0 0 1 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 1 0 0 1 1 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
1 0 0 0 1 0 0;
1 0 1 1 1 0 1];
In this 19×7 matrix, rows 1, 3, 4, 6, 9, 10, 14, 18 and 19 would be deleted by inclusion criterion 1. So far my loop (for multiple matrices like B) handles this. Under inclusion criterion 2, rows 2, 5, 7 and 8 must be deleted as well, since they are not part of a run of 3 or more all-zero rows. For inclusion criterion 2 I have to add an if structure to my existing loop.
% find or strcmp to look for the rows
% todelete = [] to eliminate these rows
How can I delete rows that contain ones OR (||) belong to a run of fewer than 3 consecutive all-zero rows?

Best Answer

Here's another approach
% script to clean data
B = [0 1 0 0 1 0 1;
0 0 0 0 0 0 0;
0 1 0 0 1 0 1;
0 1 0 0 0 1 0;
0 0 0 0 0 0 0;
0 1 0 1 1 0 1;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 1 0 0 0 1 0;
0 1 0 0 0 1 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 1 0 0 1 1 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
1 0 0 0 1 0 0;
1 0 1 1 1 0 1];
D = rand(size(B)); % data matrix to be cleaned
% assign parameters
minRun = 3; % minimum number of adjacent rows to be considered good data
% make vector with ones for the good rows (rows with only zeros)
iGood = ~any(B,2);
% now mark where each run of ones or zeros in iGood begins and ends
% use diff to flag transitions, padding with 1's to force a jump at the
% start and end
isJump = [1; diff(iGood(:))~=0; 1];
% find location of jumps
jmpIdx = find(isJump);
% find run lengths of zeros, and ones
n = diff(jmpIdx);
% n has the lengths of runs of zeros, and runs of ones interleaved, but we
% need to find out whether it starts with the zeros, or starts with the
% ones
if iGood(1) == 1
% starts with ones
offset = 0;
else
% starts with zeros
offset = 1;
end
% in preparation for using repelem, build a vector that holds the run
% length for each run of ones (good rows) and zero for each run of zeros
goodRun = zeros(size(n)); % initialize and preallocate
iStart = 1 + offset; % element where first run of ones starts
goodRun(iStart:2:end) = n(iStart:2:end);
% assign the run lengths corresponding to each row
runLength = repelem(goodRun,n);
% only keep rows in B that belong to sufficiently long runs of good rows
idxClean = 1:size(B,1);
idxClean = idxClean(runLength >= minRun);
Bclean = B(idxClean,:);
% also probably want to clean some other matrix based upon status of B
Dclean = D(idxClean,:)
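
As a cross-check on the script above, here is a minimal sketch of the same idea written with explicit run starts and ends instead of repelem. It is not part of the original answer: the names isGood, runStart, runEnd, keep and keptRows are illustrative assumptions, and it relies only on any, diff and find.

```matlab
% Sketch only: variable names are illustrative, not from the original answer.
B = [0 1 0 0 1 0 1;
     0 0 0 0 0 0 0;
     0 1 0 0 1 0 1;
     0 1 0 0 0 1 0;
     0 0 0 0 0 0 0;
     0 1 0 1 1 0 1;
     0 0 0 0 0 0 0;
     0 0 0 0 0 0 0;
     0 1 0 0 0 1 0;
     0 1 0 0 0 1 0;
     0 0 0 0 0 0 0;
     0 0 0 0 0 0 0;
     0 0 0 0 0 0 0;
     0 1 0 0 1 1 0;
     0 0 0 0 0 0 0;
     0 0 0 0 0 0 0;
     0 0 0 0 0 0 0;
     1 0 0 0 1 0 0;
     1 0 1 1 1 0 1];
minRun = 3;                      % minimum number of adjacent good rows

isGood = ~any(B, 2);             % true for rows that contain only zeros

% Pad with zeros so a good run touching the first or last row is detected.
d = diff([0; isGood; 0]);
runStart = find(d == 1);         % first row of each good run
runEnd   = find(d == -1) - 1;    % last row of each good run

% Keep only the rows that sit inside a good run of at least minRun rows.
keep = false(size(isGood));
for k = 1:numel(runStart)
    if runEnd(k) - runStart(k) + 1 >= minRun
        keep(runStart(k):runEnd(k)) = true;
    end
end

keptRows = find(keep).';         % [11 12 13 15 16 17] for this B
Bclean = B(keep, :);
```

For this B the surviving rows are 11 to 13 and 15 to 17, which matches the rows left after deleting rows 1, 3, 4, 6, 9, 10, 14, 18, 19 (criterion 1) and rows 2, 5, 7, 8 (criterion 2). The same keep mask can be used to index a companion data matrix such as D.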