MATLAB: How to delete ‘NA’ columns from a text file

natext file

Hi guys, I have a text file which has 7 rows and 499 columns and in some columns, there are 'NA'
How can i delete them?
Example.txt:
NA T2b t1c t3b
60 79 78 50
7 7 9 7
t2c t3a t2c t3b
4 5 3 NA
0.1 0.1 0.18 0.1
4 4 5 3

Best Answer

This is easy with textscan, strcmp, and some indexing:
opt = {'MultipleDelimsAsOne',true};
fmt = repmat('%s',1,4); % define how many columns!
[fid,msg] = fopen('test.txt','rt');
assert(fid>=3,msg)
C = textscan(fid,fmt,opt{:});
fclose(fid);
C = [C{:}];
idx = any(strcmp(C,'NA'),1);
D = C(:,~idx);
which generates this matrix:
D =
T2b t1c
79 78
7 9
t3a t2c
5 3
0.1 0.18
4 5
The test file is attached to this answer. It is also easy to write this to a new file:
tmp = D';
len = max(cellfun('length',D(:)));
fmt = sprintf('%%-%ds',len+1);
fmt = [repmat(fmt,1,size(D,2)),'\n'];
[fid,msg] = fopen('test_new.txt','wt');
assert(fid>=3,msg)
fprintf(fid,fmt,tmp{:});
fclose(fid);
this generates a new file which looks like this:
T2b t1c
79 78
7 9
t3a t2c
5 3
0.1 0.18
4 5