MATLAB: Read values from a very complex txt file into matlab

read data, textread

Dear Matlab community,
I have a data file of wave heights that looks like this:
Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;1982-11-19;19:00;;361;cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;NVT,NVT,Niet van toepassing
Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;1982-11-19;22:00;;363;cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;NVT,NVT,Niet van toepassing
Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;1982-11-20;01:00;;379;cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;NVT,NVT,Niet van toepassing
Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;1982-11-20;04:00;;381;cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;NVT,NVT,Niet van toepassing
I am only interested in reading the date (column 3), the hour (column 4), and the wave height (column 6). The file has 18 columns and is semicolon-separated (;), so the commas inside the text fields can be ignored. There is an empty column between the hour and the wave height, which is why the wave height is in column 6 and not 5. The file is very long, something like (28 years * 365 days * 24 h * 60 min) in length.
I am using the command:
[data(:,1),data(:,2)…data(:,18)] = textread('wave.txt','%q %q %q %q %q %q %q %q %q %q %q %q %q %q %q %q %q %q','delimiter',';');
This method works but is very slow, and it sometimes gives me problems with 'buffersize' memory. Do you guys know a better way to do this? Maybe read only the date and wave heights and discard the rest of the text?

Best Answer

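You can make textread skip a field by placing an asterisk (*) after the percent sign in that field's format specifier. Skipped fields are parsed but not returned, so you need one output variable per field that is not skipped.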
For example, if you have 4 columns and only wish to read in the first and third columns of data:
[col1, col3] = textread('myFile.txt','%s %*f %f %*f','delimiter',';');
Also, you may wish to look at the documentation for TEXTREAD to see whether other format specifiers fit your data better than %q, which reads a double-quoted string. Some of your columns are numeric and would be well suited to %f.
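As a rough sketch of how both suggestions could be applied to the file in the question, the snippet below skips the unwanted columns with %*s, reads the wave height with %f, and discards the rest of each line. Note that it uses textscan rather than textread, since MathWorks documents textread as not recommended in favor of textscan; the skip syntax is the same. The temporary sample file, the variable names (sampleFile, dateStr, timeStr, waveHeight, t) and the conversion to serial date numbers with datenum are assumptions made for illustration, not part of the original question or answer.

```matlab
% Build a small sample file from two records shown in the question
% (a stand-in for the real 'wave.txt').
head = 'Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;';
tail = [';cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;' ...
        'Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;' ...
        'NVT,NVT,Niet van toepassing'];
sampleFile = [tempname '.txt'];
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s1982-11-19;19:00;;361%s\n', head, tail);
fprintf(fid, '%s1982-11-19;22:00;;363%s\n', head, tail);
fclose(fid);

% Columns 1-2: skipped, 3: date, 4: time, 5: skipped (empty), 6: wave height.
% '%*[^\n]' skips whatever remains on the line (columns 7 to 18).
fmt = '%*s %*s %s %s %*s %f %*[^\n]';
fid = fopen(sampleFile, 'r');
C = textscan(fid, fmt, 'Delimiter', ';');
fclose(fid);

dateStr    = C{1};   % cell array of date strings, e.g. '1982-11-19'
timeStr    = C{2};   % cell array of time strings, e.g. '19:00'
waveHeight = C{3};   % numeric column vector of wave heights in cm

% Optional: combine date and time into serial date numbers.
t = datenum(strcat(dateStr, {' '}, timeStr), 'yyyy-mm-dd HH:MM');

delete(sampleFile);  % remove the temporary sample file
```

For the real data, the fopen call would point at 'wave.txt' instead of the temporary file. Because only three of the 18 columns are stored, this should need far less memory than keeping every field as a string.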