MATLAB: Importing data using textscan into one array instead of multiple arrays

Tags: data import, data import and export, text file, textscan

Dear all,
I have large .csv data files, each containing 22 columns and thousands of lines of data (a mix of string and numeric). I am trying to import these data into the MATLAB workspace for analysis.
I am using textscan to import the data. My code is:
fid = fopen('myfile.csv');
% To import the headers only
myfile_hdr = textscan(fid,repmat('%s',1,22), 1, 'delimiter', ',');
% To use 'repmat':
lineformat = [repmat('%d',1,1) repmat('%s',1,7) repmat('%d',1,1) '%s' repmat('%d',1,2) repmat('%s',1,2) '%d %s %d %s %d %f %d %f'];
% To import all data of 169490 lines:
myfile_data = textscan(fid, lineformat, 169490, 'delimiter', ',');
fclose(fid);
The output is two variables: 1) myfile_hdr, which contains the headers, and 2) myfile_data, which contains the data in 22 cells, each cell being an array of 1×169490 (169490 is the number of lines in that particular file).
I have two questions:
A) How can I import the data into one array of 22×169490? Or import/load the data into the MATLAB workspace as they are in the original file (same order and organisation)?
B) What code can tell me how many lines each file has? Instead of opening each file individually, counting the lines, and passing that number to textscan, I would like to supply it to textscan automatically.
Thanks in advance
Tariq

Best Answer

Answering in order:
A) You cannot mix data types in one array, except in a cell array, which wraps each element. However, I do not advise storing numeric data with one cell per numeric scalar.
I also suggest using %f everywhere for the numeric columns (unless you're seriously constrained on RAM).
So your lineformat would be:
lineformat = ['%f' repmat('%s',1,7) '%f %s %f %f %s %s %f %s %f %s %f %f %f %f'];
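To illustrate the trade-off in A, here is a minimal sketch using made-up values rather than the real file. The variable names ids, names, and mixed are hypothetical. It shows that a single mixed-type array is possible only as a cell array, and that each number then has to be unwrapped before any arithmetic:

```matlab
% Two small columns standing in for textscan output (values are made up)
ids   = [1; 2; 3];                 % numeric column, 3x1 double
names = {'a'; 'b'; 'c'};           % string column, 3x1 cell

% A mixed-type array has to be a cell array: num2cell wraps each number
mixed = [num2cell(ids), names];    % 3x2 cell, one row per line of data

% Every numeric entry must be unwrapped again before doing arithmetic
total = sum(cell2mat(mixed(:, 1)));   % same value as sum(ids)
```

Keeping the numeric columns as plain double arrays avoids this wrapping and unwrapping, which is why the answer recommends against one cell per scalar.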
B) You don't need to supply the number of lines if you're planning to import the whole file:
data = textscan(fid, lineformat, 'delimiter', ',', 'CollectOutput',1);
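Note the CollectOutput option. This is as far as you can go: CollectOutput merges consecutive columns of the same type into a single array, in file order. With the lineformat above, data{1} holds the first numeric column, data{2} the next seven string columns as one cell array, and so on. The number of lines read is then size(data{1},1).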
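As a rough, self-contained sketch of B, the snippet below writes a tiny made-up CSV to a temporary file, reads the header line, and then reads the rest without a line count. The file contents, the five-column format, and the names fname, hdr, and nLines are assumptions for illustration only, not the 22-column file from the question:

```matlab
% Write a tiny CSV to a temporary file (contents are made up for illustration)
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'id,name,city,value,score\n');
fprintf(fid, '1,ann,oslo,2.5,10\n');
fprintf(fid, '2,bob,rome,3.5,20\n');
fprintf(fid, '3,cat,bern,4.5,30\n');
fclose(fid);

fid = fopen(fname, 'r');
% Read the single header line first (repeat count of 1)
hdr = textscan(fid, repmat('%s', 1, 5), 1, 'Delimiter', ',');
% No repeat count here: textscan keeps reading until the end of the file
data = textscan(fid, '%f %s %s %f %f', 'Delimiter', ',', 'CollectOutput', 1);
fclose(fid);
delete(fname);

% Number of data lines actually read, taken from the first output cell
nLines = size(data{1}, 1);
```

In this sketch the output should contain three cells: the id column as a double vector, the two string columns merged into one cell array, and the last two numeric columns merged into one double matrix. The line count comes from the imported data, so it never has to be looked up by hand.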