MATLAB: Trouble using textread to ignore certain elements in a .csv file (new to matlab!)

Hi all,

I am very new to MATLAB and need some help using textread to read a .csv file while ignoring certain elements. Below is a small sample of the data.

 timestamp,time completed,task,set_no,x,y,time trained,stars,guess,guess (display),trials complete,trials remaining
 1375988040,"Thu, 08 Aug 2013 18:54:00 GMT",aud_spatial_match_crbi,10,3,0,255,4,6,6 syls,11,0
 1375988312,"Thu, 08 Aug 2013 18:58:32 GMT",aud_spatial_match_crbi,10,3,0,262,5,6,6 syls,6,5
 1375989376,"Thu, 08 Aug 2013 19:16:16 GMT",digit_span_crbi,0,0,0,768,2,5,5 objects,14,0

My goal is to have textread ignore the GMT" in the date/time field, since MATLAB does not seem to recognize 08 Aug 2013 18:54:00 GMT" as a date/time. There is also the preceding "Thu, which MATLAB treats as its own field because of the extra comma; it would be great to get rid of that too. In addition, I'm trying to skip the first (header) line.

Here is what I've attempted so far:

 [timestamp, thu, timecompl, task, setnum, x, y, timetrained, stars, guess, guessdisp, trialscomp, trialsrem] = ...
     textread('BrainTEST.csv','%d %s GMT"%s %s %d %d %d %d %d %d %s %d %d', ...
              'delimiter',',','headerlines',1,'whitespace','')

The variable "thu" is me trying to accommodate the extra comma.

I have been confronted with the following errors:

 Error using dataread
 Trouble reading literal string from file (row 1, field 3) ==> 08 Aug 2013 18:54:00 GMT",aud_spatial_

 Error in textread (line 175)
 [varargout{1:nlhs}]=dataread('file',varargin{:}); %#ok<REMFF1>

Any suggestions would be greatly appreciated!

Cheers, Sean C.

Best Answer

You could go for

 buffer = fileread('BrainTEST.csv') ;
 data = textscan(buffer,'%d %s %s %s %d %d %d %d %d %d %s %d %d', ...
                 'delimiter', ',', 'headerlines', 1, 'whitespace', '') ;

and then, instead of processing e.g.

 data{3}{k}

which would be ' 08 Aug 2013 18:54:00 GMT"' in your example for k=1, you process

 data{3}{k}(2:end-5)

which is '08 Aug 2013 18:54:00'.

EDIT: just as an example, here is one way to tackle this with a named regexp:

 buffer = fileread('BrainTEST.csv') ;
 pattern = ['(?<timestamp>\d+),.{6}(?<datetime>[^G]+)GMT",' ...
            '(?<task>[^,]+),(?<setnum>\d+),(?<x>\d+),(?<y>\d+),' ...
            '(?<timeTrained>\d+),(?<stars>\d+),(?<guess>\d+),' ...
            '(?<guessDisplay>[^,]+),(?<trialsComplete>\d+),(?<trialsRemaining>\d+)'] ;
 data = regexp(buffer, pattern, 'names') ;

which generates a struct array:

 >> data(1)
 ans =
           timestamp: '1375988040'
            datetime: '08 Aug 2013 18:54:00 '
                task: 'aud_spatial_match_crbi'
              setnum: '10'
                   x: '3'
                   y: '0'
         timeTrained: '255'
               stars: '4'
               guess: '6'
        guessDisplay: '6 syls'
      trialsComplete: '11'
     trialsRemaining: '0'
 >> data(2)
 ans =
           timestamp: '1375988312'
            datetime: '08 Aug 2013 18:58:32 '
                task: 'aud_spatial_match_crbi'
              setnum: '10'
                   x: '3'
                   y: '0'
         timeTrained: '262'
               stars: '5'
               guess: '6'
        guessDisplay: '6 syls'
      trialsComplete: '6'
     trialsRemaining: '5'

etc ...

But don't go for this solution: it is much less efficient than the first one for a file as well structured as this one. It would be a good solution if your file were a mix of structured and unstructured data.
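To make the textscan route concrete, here is a minimal sketch that parses one row copied from the question (passed as a string, so it runs without BrainTEST.csv), trims every date field with the same (2:end-5) indexing, and converts the result with datenum. The variable names row, cleanDates, serialDates and unixDates are only illustrative, and using datenum as the target type is an assumption about what you want to do with the dates.

```matlab
% One row copied from the question, so no file is needed for this sketch.
row = '1375988040,"Thu, 08 Aug 2013 18:54:00 GMT",aud_spatial_match_crbi,10,3,0,255,4,6,6 syls,11,0' ;

% Same format and options as the textscan call in the answer, minus
% 'headerlines' because this string has no header line.
data = textscan(row, '%d %s %s %s %d %d %d %d %d %d %s %d %d', ...
                'delimiter', ',', 'whitespace', '') ;

% Strip the leading space and the trailing ' GMT"' from every date field.
cleanDates = cellfun(@(s) s(2:end-5), data{3}, 'UniformOutput', false) ;

% Convert the cleaned strings to serial date numbers.
serialDates = datenum(cleanDates, 'dd mmm yyyy HH:MM:SS') ;

% Cross-check: in the sample rows the first column looks like Unix time
% (seconds since 1970-01-01 UTC) for the same instant.
unixDates = datenum(1970, 1, 1) + double(data{1}) / 86400 ;

disp(datestr(serialDates(1), 'yyyy-mm-dd HH:MM:SS'))
disp(datestr(unixDates(1), 'yyyy-mm-dd HH:MM:SS'))
```

If the first column really is Unix time throughout your file, as it appears to be in the sample rows, you may not need to parse the text date at all and can work from data{1} directly.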
