MATLAB: Importing specific rows of Data from Text file


Hi,
I have sensor data in a very large text (.dat) file. Some of the data in this file needs to be analyzed and plotted in MATLAB.
A sample of the data looks like this:
C 0 0.001 -0.02 24.09 4.64 -100.00 -100.00
C 0 1.005 0.29 24.09 4.43 -100.00 -100.00
C 0 2.009 -0.34 24.09 8.26 -100.00 -100.00
C 0 3.014 -0.18 24.06 6.06 -100.00 -100.00
C 0 4.018 0.07 24.06 5.61 -100.00 -100.00
C 0 5.022 0.02 24.09 4.92 -100.00 -100.00
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000 ……
All I want to import into MATLAB and analyze are the rows that start with the letter 'R', which stands for Result. The Result rows follow a pattern in this big text file: an 'R' row occurs every 160 rows.
How can I import only these Result rows into MATLAB, either interactively or programmatically? I would deeply appreciate a detailed answer, as I am at an intermediate level of MATLAB programming.
Thank you so much in advance! Pramit

Best Answer

If the entire file fits in memory, try this code
>> num = cssm()
num =
0 60.2750 -0.1577 -0.0069 0
0 60.2750 -0.1577 -0.0069 0
0 60.2750 -0.1577 -0.0069 0
0 60.2750 -0.1577 -0.0069 0
where
function out = cssm()
    % read the entire file into one cell array with one row per cell
    fid = fopen( 'cssm.txt', 'r' );
    assert( fid ~= -1, 'Cannot open cssm.txt' )
    cac = textscan( fid, '%s', 'Delimiter', '\n' );
    fclose( fid );
    rows = cac{1};
    % find rows which begin with 'R'
    isR = strncmp( strtrim(rows), 'R', 1 );
    % extract the rows beginning with 'R'
    rlt = rows(isR);
    % join all result rows into one long string separated by newlines
    % (strjoin interprets the escape sequence '\n' in the delimiter)
    one_str = strjoin( rlt, '\n' );
    % parse the string: one character followed by five numbers per row
    result = textscan( one_str, '%c%f%f%f%f%f', 'CollectOutput',true );
    % make sure that only results are included in the output
    assert( strcmp( unique(result{1}), 'R' ) ...
          , 'Non-result rows included in result' )
    out = result{2};
end
and where cssm.txt contains
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000
... and an alternative, which is an order of magnitude faster
function out = faster()
    % read the entire file into one string; pad it with newlines so that an
    % 'R' row on the first line or an unterminated last line is also matched
    str = [ sprintf('\n'), fileread( 'cssm.txt' ), sprintf('\n') ];
    % find start and end indices of all the "rows" beginning with 'R'
    % ([^\n\r] excludes only line breaks; the original class [^(\n|\r)]
    % also excluded the characters '(', '|' and ')')
    xpr = '(?<=\s)R[^\n\r]+[\n\r]{1,2}';
    [ix1,ix2] = regexp( str, xpr, 'start', 'end' );
    % extract the "rows" beginning with 'R'
    isi = false(1,length(str));
    for ii = 1:length(ix1)
        isi(ix1(ii):ix2(ii)) = true;
    end
    one_str = str(isi);
    % parse the string: one character followed by five numbers per row
    result = textscan( one_str, '%c%f%f%f%f%f', 'CollectOutput',true );
    % make sure that only results are included in the output
    assert( strcmp( unique(result{1}), 'R' ) ...
          , 'Non-result rows included in result' )
    out = result{2};
end
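
Both functions above assume that the entire file fits in memory. If it does not, one option is to read the file one line at a time and keep only the 'R' rows. The following is only a minimal sketch, not part of the original answer: the function name streamed and its filename argument are hypothetical, and it assumes that each 'R' row carries five numeric values after the letter, as in cssm.txt.

```matlab
function out = streamed( filename )
    % Hypothetical helper (an assumption, not from the original answer):
    % return the numeric part of every row that begins with 'R',
    % reading the file one line at a time.
    fid = fopen( filename, 'r' );
    assert( fid ~= -1, 'Cannot open %s', filename )
    closer = onCleanup( @() fclose( fid ) );  % close the file even on error
    rows = {};                                % one 1x5 numeric row per cell
    tline = fgetl( fid );
    while ischar( tline )                     % fgetl returns -1 at end of file
        tline = strtrim( tline );
        if strncmp( tline, 'R', 1 )
            % skip the leading 'R' and read the five numbers after it
            val = sscanf( tline(2:end), '%f', [1,5] );
            assert( numel( val ) == 5, 'Malformed result row: %s', tline )
            rows{end+1,1} = val; %#ok<AGROW>
        end
        tline = fgetl( fid );
    end
    out = vertcat( rows{:} );                 % N-by-5 matrix, [] if no 'R' rows
end
```

Calling num = streamed( 'cssm.txt' ) should give the same kind of N-by-5 matrix as cssm(). Only the 'R' rows are held in memory, at the cost of a slower line-by-line loop.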