MATLAB: Preprocessing using readtable()

I have a CSV file that I read as a table with 'readtable'. The original CSV file contains timestamps spanning a wide range of days. In most cases I do not need all of those timestamps, because my analysis is centred on a shorter period. Let me give an example to explain my problem better:

The CSV file contains the timestamps in column 1. It starts at timestamp=1573377305, and the timestamps then increase in irregular steps: the next timestamp can come after 8 seconds, the third after 10 seconds, and so on. What I know are the start and end timestamps of my analysis, but I don't know which rows correspond to that interval.

For instance: my timestamps

Imagine that my analysis window runs from 1573377326 to 1573377433. I don't want to read all the data before and after it with readtable(). In this case I could set DataLines = [4 15], but this is only an illustrative example. Imagine that you have many more than 20 timestamps, and all you know is the start and end timestamp.

If I load all the data with 'readtable()', it is inefficient because I am loading information that I am not going to use. How can I select the precise interval I need before calling readtable()? Or how can I make this process more efficient?

fmt = '%D%...'; % replace with the actual format of your file; detectImportOptions may help here
data = {};
fid = fopen(yourfile, 'rt', 'n', yourfileencoding); % possibly omit the encoding (readtable would detect the file encoding for you)
while ~feof(fid)
    line = textscan(fid, fmt, 1, 'Delimiter', ','); % read one line of the CSV
    % assuming the timestamp is in the first column
    if line{1} > endtimestamp
        break; % past the end of the range: stop reading
    end
    if line{1} >= starttimestamp
        data = [data; line]; % keep rows inside the range
    end
end
fclose(fid);

Best Answer

Assuming you don't know which rows of the text file correspond to your start and end timestamps, what you want cannot be achieved with readtable alone. There's no way to tell readtable to stop once a specific value has been encountered (even using detectImportOptions). The only way to tell readtable to stop at a specific line is with the DataLines property, and you have to know in advance at which line to stop.
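For the case where the line range is already known, the DataLines route might look like the sketch below. The file name, column names and values are made up for the demonstration; only detectImportOptions, DataLines and readtable come from the discussion.

```matlab
% Build a small demo file (made-up values) so the sketch is self-contained.
csvFile = fullfile(tempdir, 'demo_timestamps_datalines.csv');   % hypothetical name
ts  = 1573377305 + cumsum([0; 8; 10; 3; 9; 7; 12; 5; 6; 11]);    % irregular steps
val = (1:numel(ts)).';
writetable(table(ts, val, 'VariableNames', {'timestamp', 'value'}), csvFile);

% Restrict the import to a known block of file lines.
opts = detectImportOptions(csvFile);
opts.DataLines = [4 7];        % file lines 4 to 7 (line 1 is the header)
T = readtable(csvFile, opts);
```

Note that DataLines counts lines of the file, not data rows, so with a one-line header file line 4 holds the third data row.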

You could determine that line by parsing the file yourself line by line, with textscan for example, but then there'd be no point in using readtable after that.
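A possible middle ground, which is not part of the original answer, is to let readtable import only the timestamp column first, work out the matching file lines, and then import just those lines. This is a sketch under assumptions: the file and window values are invented, the timestamps are numeric and sorted, and the data block has no blank or comment lines. It still scans the whole file once, so whether it saves anything depends on how wide the file is.

```matlab
% Demo file with made-up values.
csvFile = fullfile(tempdir, 'demo_timestamps_twopass.csv');      % hypothetical name
ts  = 1573377305 + cumsum([0; 8; 10; 3; 9; 7; 12; 5; 6; 11]);
val = (1:numel(ts)).';
writetable(table(ts, val, 'VariableNames', {'timestamp', 'value'}), csvFile);

startTs = 1573377323;   % assumed analysis window
endTs   = 1573377354;

opts = detectImportOptions(csvFile);

% Pass 1: import only the first column (the timestamps).
optsCol = opts;
optsCol.SelectedVariableNames = opts.VariableNames(1);
tsOnly = readtable(csvFile, optsCol);
rows = find(tsOnly{:, 1} >= startTs & tsOnly{:, 1} <= endTs);
if isempty(rows)
    error('No timestamps fall inside the requested window.');
end

% Pass 2: convert data-row indices to file lines and import only those.
firstDataLine = opts.DataLines(1);
opts.DataLines = [firstDataLine + rows(1) - 1, firstDataLine + rows(end) - 1];
T = readtable(csvFile, opts);
```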

Unless the file is significantly larger than your range of interest, it's simpler to just read the whole file and then use isbetween to keep the required range.
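A minimal sketch of that read-everything-then-filter route follows. The demo file, column names and window are assumptions. Because the timestamps in the question look like numeric POSIX seconds, the sketch shows a plain numeric comparison and, alternatively, a conversion to datetime so that isbetween applies.

```matlab
% Demo file with made-up values.
csvFile = fullfile(tempdir, 'demo_timestamps_full.csv');         % hypothetical name
ts  = 1573377305 + cumsum([0; 8; 10; 3; 9; 7; 12; 5; 6; 11]);
val = (1:numel(ts)).';
writetable(table(ts, val, 'VariableNames', {'timestamp', 'value'}), csvFile);

startTs = 1573377323;   % assumed analysis window
endTs   = 1573377354;

T = readtable(csvFile);          % read the whole file in one go

% Option 1: numeric comparison on the raw timestamps.
keep = T.timestamp >= startTs & T.timestamp <= endTs;
Tsel = T(keep, :);

% Option 2: convert to datetime and use isbetween (bounds are inclusive).
t      = datetime(T.timestamp, 'ConvertFrom', 'posixtime');
tStart = datetime(startTs, 'ConvertFrom', 'posixtime');
tEnd   = datetime(endTs, 'ConvertFrom', 'posixtime');
keepDt = isbetween(t, tStart, tEnd);
TselDt = T(keepDt, :);
```

Both options select the same rows here; the datetime form is mainly useful if the rest of the analysis works with dates rather than raw seconds.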

Otherwise, as said, you'll have to use textscan to parse the file line by line until you encounter your end timestamp. It's significantly more work than readtable, you lose automatic format detection, nice table formatting, etc., and, because you'll be reading the file line by line, it may actually be slower than reading the whole file in one go. It would go like the textscan loop shown above, just before this answer.
