MATLAB: Trouble using textread to ignore certain elements in a .csv file (new to matlab!)

Hi all,
I am very new to MATLAB and need some help using textread to read a .csv file while ignoring certain elements. Below is a small example of the data.
timestamp,time completed,task,set_no,x,y,time trained,stars,guess,guess (display),trials complete,trials remaining
1375988040,"Thu, 08 Aug 2013 18:54:00 GMT",aud_spatial_match_crbi,10,3,0,255,4,6,6 syls,11,0
1375988312,"Thu, 08 Aug 2013 18:58:32 GMT",aud_spatial_match_crbi,10,3,0,262,5,6,6 syls,6,5
1375989376,"Thu, 08 Aug 2013 19:16:16 GMT",digit_span_crbi,0,0,0,768,2,5,5 objects,14,0
My goal is to have textread ignore the trailing GMT" in the date/time field, since MATLAB does not seem to recognize 08 Aug 2013 18:54:00 GMT" as a date/time. There is also the preceding "Thu, which MATLAB treats as its own field because of the extra comma; it would be great to get rid of that too. I'm also trying to skip the first (header) line.
Here is what I've attempted so far:
[timestamp, thu, timecompl, task, setnum, x, y, timetrained, stars, guess, guessdisp, trialscomp, trialsrem] = ...
    textread('BrainTEST.csv', '%d %s GMT"%s %s %d %d %d %d %d %d %s %d %d', ...
    'delimiter', ',', 'headerlines', 1, 'whitespace', '')
The variable "thu" is me trying to accommodate the extra comma.
I have been confronted with the following errors:
Error using dataread
Trouble reading literal string from file (row 1, field 3) ==> 08 Aug 2013 18:54:00 GMT",aud_spatial_

Error in textread (line 175)
[varargout{1:nlhs}]=dataread('file',varargin{:}); %#ok<REMFF1>
Any suggestions would be greatly appreciated!
Cheers, Sean C.

Best Answer

You could go for
buffer = fileread('BrainTEST.csv') ;
data = textscan(buffer,'%d %s %s %s %d %d %d %d %d %d %s %d %d', ...
'delimiter', ',', 'headerlines', 1, 'whitespace', '') ;
and then instead of processing e.g.
data{3}{k}
which would be ' 08 Aug 2013 18:54:00 GMT"' in your example for k=1, you process
data{3}{k}(2:end-5)
which is '08 Aug 2013 18:54:00'.
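Rather than indexing one k at a time, the same trimming can be applied to the whole column and the result converted to date numbers. The following is only a sketch: the names raw, trimmed, dn and dnFromEpoch are made up for illustration, the sample values are copied from the question and stand in for data{3} and data{1}, and the format string is an assumption about how the dates are written. As a cross-check, the first column in the sample rows appears to hold the same instants as seconds since 1 Jan 1970 (UTC), so the two conversions should agree.

```matlab
% Sample values copied from the question; in practice this would be data{3}.
raw = {' 08 Aug 2013 18:54:00 GMT"'; ' 08 Aug 2013 18:58:32 GMT"'};

% Drop the leading space and the trailing ' GMT"' (5 characters) in each cell.
trimmed = cellfun(@(s) s(2:end-5), raw, 'UniformOutput', false);

% Convert to serial date numbers using an explicit (assumed) format.
dn = datenum(trimmed, 'dd mmm yyyy HH:MM:SS');

% Cross-check against the first column, read as seconds since 1 Jan 1970 UTC.
% With textscan and %d this column is int32, so use double(data{1}) there
% before dividing; plain doubles are used here.
timestamp = [1375988040; 1375988312];
dnFromEpoch = datenum(1970, 1, 1) + timestamp / 86400;

% Largest disagreement between the two conversions, in seconds.
maxDiffSeconds = max(abs(dn - dnFromEpoch)) * 86400;
disp(datestr(dn, 'yyyy-mm-dd HH:MM:SS'));
```

If the two conversions agree on your full file as well, the numeric timestamp column may be all you need, and the text date can simply be ignored.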
EDIT: just as an example, here is one way to tackle this with named tokens in regexp.
buffer = fileread('BrainTEST.csv') ;
data = regexp(buffer, '(?<timestamp>\d+),.{6}(?<datetime>[^G]+)GMT",(?<task>[^,]+),(?<setnum>\d+),(?<x>\d+),(?<y>\d+),(?<timeTrained>\d+),(?<stars>\d+),(?<guess>\d+),(?<guessDisplay>[^,]+),(?<trialsComplete>\d+),(?<trialsRemaining>\d+)', 'names') ;
which generates a struct array:
>> data(1)
ans =
timestamp: '1375988040'
datetime: '08 Aug 2013 18:54:00 '
task: 'aud_spatial_match_crbi'
setnum: '10'
x: '3'
y: '0'
timeTrained: '255'
stars: '4'
guess: '6'
guessDisplay: '6 syls'
trialsComplete: '11'
trialsRemaining: '0'
>> data(2)
ans =
timestamp: '1375988312'
datetime: '08 Aug 2013 18:58:32 '
task: 'aud_spatial_match_crbi'
setnum: '10'
x: '3'
y: '0'
timeTrained: '262'
stars: '5'
guess: '6'
guessDisplay: '6 syls'
trialsComplete: '6'
trialsRemaining: '5'
etc ...
That said, I would not use this approach here: for a file as regularly structured as yours it is much less efficient than the first one. It would be a good solution if your file were a mix of structured and unstructured data.
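If you do try the regexp route, note that every field comes back as text, as the quotes in the display above show. A minimal sketch of converting afterwards, using two sample rows from the question in place of the file contents (the variable names pat, stars and when are only illustrative):

```matlab
% Two sample records from the question, standing in for fileread's output.
buffer = sprintf('%s\n%s', ...
    '1375988040,"Thu, 08 Aug 2013 18:54:00 GMT",aud_spatial_match_crbi,10,3,0,255,4,6,6 syls,11,0', ...
    '1375988312,"Thu, 08 Aug 2013 18:58:32 GMT",aud_spatial_match_crbi,10,3,0,262,5,6,6 syls,6,5');

% Same pattern as above, split over several lines for readability.
pat = ['(?<timestamp>\d+),.{6}(?<datetime>[^G]+)GMT",(?<task>[^,]+),', ...
       '(?<setnum>\d+),(?<x>\d+),(?<y>\d+),(?<timeTrained>\d+),', ...
       '(?<stars>\d+),(?<guess>\d+),(?<guessDisplay>[^,]+),', ...
       '(?<trialsComplete>\d+),(?<trialsRemaining>\d+)'];
data = regexp(buffer, pat, 'names');

% Collect one field across the struct array and convert it to numbers.
stars = str2double({data.stars});

% The datetime token keeps the space that preceded GMT; strtrim removes it.
when = strtrim({data.datetime});
```

The same str2double call works for the other numeric fields; the text fields (task, guessDisplay) can be collected with {data.task} and so on.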