MATLAB: Remove duplicate variables depending on a second variable


Dear experts, I have a list of variables from which I need to remove duplicates. However, when a variable is duplicated, I want to keep the entry that has the value 1 in the second column. When several duplicates have a 1, only one of them should be kept, chosen at random. See the example below: for BG1028 I want to keep the entry where the third column is 1.3. For BG1030, I want to keep the entry with either 3.0 or 1.3 in the third column. I hope that is clear. I'm puzzling over how to do this. This is the code I have come up with so far.
ppn(:,1) = {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029';'BG1030';'BG1030';'BG1030';'BG1030'};
ppn(:,2) = {'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'};
ppn(:,3) = {'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'};
% find duplicates
ppn2 = ppn(:,1);
idx = find(strcmp(ppn2(1:end-1),ppn2(2:end)))+1;
%remove duplicates
ppn((idx),:) = [];

Best Answer

Hi Marty,
Try the code below.
% Defining ppn (all at once)
ppn = [ {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029';...
'BG1030';'BG1030';'BG1030';'BG1030'},... % start col 2
{'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'},... % start col 3
{'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'}];
% Storing ppn column 2 as numerical values
bPpn = str2double(ppn(:,2));
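% Deleting all duplicates with 0 in bPpn (assumes duplicates are adjacent;
% note that a name whose duplicates are all 0 is removed entirely)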
idx = strcmp(ppn(1:end-1,1),ppn(2:end,1));
delidx = ([idx;false] | [false;idx]) & ~bPpn;
ppn(delidx,:)=[];
clear bPpn idx delidx;
% Get names of remaining duplicates
chooseNames = ppn([strcmp(ppn(1:end-1,1),ppn(2:end,1));false],1);
% Loop over chooseNames and keep one at random
for j = 1:numel(chooseNames)   % loop is skipped when chooseNames is empty
    dupidx = find(strcmp(chooseNames{j}, ppn(:,1)));  % rows still carrying this name
    dupidx(randi(numel(dupidx))) = [];                % spare one row at random
    ppn(dupidx,:) = [];                               % delete the rest
end
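The same rule can also be sketched by grouping rows by name instead of comparing neighbours, which does not require the list to be sorted. This is only a minimal sketch under two assumptions: the variable names flag, grp, keep and result are made up for illustration, and a name that has no row with a 1 falls back to one of its rows at random (the question does not say what should happen in that case).

```matlab
% Sample data from the question (name, flag, value), all stored as strings
ppn = [ {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029'; ...
         'BG1030';'BG1030';'BG1030';'BG1030'}, ...
        {'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'}, ...
        {'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'} ];

flag = str2double(ppn(:,2)) == 1;              % true where column 2 is '1'
[names, ~, grp] = unique(ppn(:,1), 'stable');  % grp(i) = group number of row i

keep = zeros(numel(names), 1);                 % one row index per unique name
for g = 1:numel(names)
    rows = find(grp == g);                     % all rows carrying this name
    cand = rows(flag(rows));                   % prefer rows flagged with 1
    if isempty(cand)
        cand = rows;                           % assumption: fall back to any row
    end
    keep(g) = cand(randi(numel(cand)));        % pick one candidate at random
end

result = ppn(sort(keep), :);                   % one row per name, original order
```

With the sample data, result has five rows: BG1028 keeps the 1.3 entry, and BG1030 keeps either the 3.0 or the 1.3 entry depending on the random draw.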
Hope this helps.