MATLAB: MATLAB Coder str2num alternatives


I have this data stored in a character array. I read it from a text file with fread and removed the headers. (I can't use textscan or fileread because they are not supported by MATLAB Coder, and I find it difficult to call fscanf through coder.ceval.)
unsorted_data 1×767 char
1
-8.3033E-01 -4.2882E+00 -8.4900E+00 -4.0889E-01 -4.2372E+00 -1.3796E+00
-1.1903E+00 -3.9289E+00 -6.2813E+00 -9.2360E-01 -2.8582E+00 -1.2460E+00
2
-3.6261E+00 -4.7218E+00 1.4143E+01 1.6041E+00 -5.1505E+00 1.6737E+00
-3.9131E+00 -5.9048E+00 -2.7256E+01 2.0434E+00 -1.6630E+01 5.5229E+00
3
2.2578E+01 -1.7633E-02 2.1166E+01 2.8041E-01 1.8919E+00 2.4702E+01
6.0947E+01 5.1242E+00 4.0910E+01 -1.0404E+01 -4.8758E+00 5.0202E+01
I need to extract every third row (R1, R4, R7, R10,…) as a double vector [Nx1], and the other rows of data as a second matrix [Nx6].
So far I'm able to extract the first part (R1, R4, R7, R10,…) into the "number" variable, but I get NaNs in the "Vector" variable. This would work with str2num, but str2num is not supported by MATLAB Coder.
remain = unsorted_data;
data_str = string([]);
while (remain ~= string())
    [token,remain] = strtok(remain, char(10));
    data_str = [data_str ; token];
end
data = str2double(data_str);
len_data = length(data);
cnum = 1;
cvector = 1;
vector_rows = 2;
number = zeros(len_data/(vector_rows+1),1);
Vector = zeros(len_data*vector_rows/(vector_rows+1),1);
for i = 1:len_data
    num_loc = (vector_rows+1)*(cnum-1)+1;
    if i == num_loc
        number(cnum,1) = data(i,1);
        cnum = cnum+1;
    else
        Vector(cvector,1) = data(i,1);
        cvector = cvector+1;
    end
end
I'm looking to get two matrices of this data in the right format and, secondly, to make this more efficient by replacing the "while" loop, as it takes too long to process 5 million lines. Any help is greatly appreciated.

Best Answer

As far as I can tell from the list of functions supported by MATLAB Coder, something like this should work. The basic idea is to split the char vector into two preallocated cell arrays, then convert them to numeric. Given your 1x767 char vector:
  • identify whitespace using isstrprop.
  • use diff and find to get the start and end indices of each number.
  • use eq and find to locate the newline characters.
  • preallocate two cell arrays (perhaps transposed).
  • use a for loop over the indices and collect the number substrings into the cell arrays.
  • apply str2double to both cell arrays.
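The steps above could be sketched roughly as follows. This is a minimal, untested-under-Coder sketch, not a definitive implementation. The function name split_records and the constant ncols = 6 are assumptions for illustration. It also varies the recipe in two ways: it counts newlines with cumsum instead of find, and it skips the intermediate cell arrays, calling str2double on one token at a time. That second change is there because I am not certain str2double accepts cell arrays under code generation in every release. The real() call guards against generated code returning a complex result. The reason the original code produced NaNs is that str2double returns NaN for text holding several space-separated numbers, so each number has to be converted on its own.

```matlab
% Usage: two records taken from the sample data in the question.
txt = sprintf(['1\n' ...
    '-8.3033E-01 -4.2882E+00 -8.4900E+00 -4.0889E-01 -4.2372E+00 -1.3796E+00\n' ...
    '-1.1903E+00 -3.9289E+00 -6.2813E+00 -9.2360E-01 -2.8582E+00 -1.2460E+00\n' ...
    '2\n' ...
    '-3.6261E+00 -4.7218E+00 1.4143E+01 1.6041E+00 -5.1505E+00 1.6737E+00\n' ...
    '-3.9131E+00 -5.9048E+00 -2.7256E+01 2.0434E+00 -1.6630E+01 5.5229E+00\n']);
[number, Vector] = split_records(txt);
disp(number);          % the values that sit alone on a line
disp(size(Vector));    % one row per data line, ncols columns

function [number, Vector] = split_records(unsorted_data)
% SPLIT_RECORDS  Hypothetical helper (name and layout are assumptions).
%   number : values found alone on a line (R1, R4, R7, ...)
%   Vector : values from all other lines, ncols per row
ncols = 6;                                   % assumed numbers per data row

txt = unsorted_data(:).';                    % work on a row vector
ws  = isstrprop(txt, 'wspace');              % spaces, tabs, CR, LF

% A token starts where whitespace ends and stops where it begins again.
d      = diff([1, double(ws), 1]);
starts = find(d == -1);                      % first char of each token
stops  = find(d ==  1) - 1;                  % last char of each token
ntok   = numel(starts);
if ntok == 0
    number = zeros(0, 1);
    Vector = zeros(0, ncols);
    return
end

% Line number of each token: count the newlines seen before it.
lineNo  = cumsum(double(txt == char(10))) + 1;
tokLine = lineNo(starts);

% A token is an ID when it is the only token on its line.
firstOnLine = [true, diff(tokLine) ~= 0];
lastOnLine  = [diff(tokLine) ~= 0, true];
isID        = firstOnLine & lastOnLine;

% Preallocate the outputs, then convert one token at a time.
nID    = sum(isID);
number = zeros(nID, 1);
vals   = zeros(ntok - nID, 1);
iID  = 0;
iVal = 0;
for k = 1:ntok
    v = real(str2double(txt(starts(k):stops(k))));
    if isID(k)
        iID = iID + 1;
        number(iID) = v;
    else
        iVal = iVal + 1;
        vals(iVal) = v;
    end
end

% Tokens were read row by row, so fill columns first and transpose.
% This assumes the count of data values is a multiple of ncols.
Vector = reshape(vals, ncols, []).';
end
```

Because the token boundaries come from one vectorized pass over the char vector, the only loop left runs once per number rather than growing a string array line by line, which is where the original while loop spent its time. For code generation the function would normally live in its own file rather than at the end of a script.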