MATLAB: Handling Big Data in MATLAB

csv, data handling, large data, matfile, MATLAB

Hi everyone,
I am working with a large data set. The steps I follow are described below, and I would appreciate your advice on correcting or improving the procedure.
I receive the data as csv files (the data is of mixed types such as doubles, date-times, etc.), and the whole data set spans about 150 csv files, each holding more than 100MB of data. I combined them all into one MATLAB table, so the table contains the whole data set. I can't (or really, would rather not) make a separate MATLAB table for each csv file, because the data is continuous across the csv files, so I guess it would be harder to retrieve later. A part of the table I created looks like below:
This table is more than 2GB in size (26 columns and millions of rows). I then saved the table as a mat file so that I can 'load' it back into the workspace later and retrieve the necessary data segments as required. For example, I may need to retrieve and post-process the 'CurrentA' and 'VoltageV' data for Cycle_Index=1 & Step_Index=30.
My concerns are:
  1. This method is quite time-consuming. For example, loading the mat file takes a long time, so I wonder whether there is a better way. I saw that with 'matfile' we don't need to load the whole mat file into memory, but I am not sure it would let me retrieve data as flexibly as loading the original file into the workspace does.
  2. I am not sure about the overall methodology I followed. Is there a better, more efficient way to handle this kind of large data?
Many thanks in advance…

Best Answer

Some of the tools and techniques described in the Large Files and Big Data category in the documentation may be of interest to you.
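
As one possible illustration of those tools, here is a minimal sketch that uses a datastore and a tall table to pull out the 'CurrentA' and 'VoltageV' rows for Cycle_Index=1 and Step_Index=30 without first merging everything into one in-memory table. The column names come from the question; the temporary folder, the file names part001.csv and part002.csv, and the tiny sample values are made up so that the snippet runs on its own. It also assumes the csv files share the same column layout.

```matlab
% Hypothetical setup: write two tiny csv files so the sketch is self-contained.
% With real data, skip this part and point dataDir at the folder of csv files.
dataDir = tempname;
mkdir(dataDir);
names = {'Cycle_Index','Step_Index','CurrentA','VoltageV'};
T1 = table([1;1;2], [30;31;30], [0.5;0.6;0.7], [3.1;3.2;3.3], 'VariableNames', names);
T2 = table([1;2],   [30;30],    [0.8;0.9],     [3.4;3.5],     'VariableNames', names);
writetable(T1, fullfile(dataDir, 'part001.csv'));
writetable(T2, fullfile(dataDir, 'part002.csv'));

% One datastore covers every csv file in the folder, so the data can be
% treated as a single continuous set without concatenating it by hand.
ds = tabularTextDatastore(dataDir, 'FileExtensions', '.csv');

% Read only the columns needed for this query (the real files have 26).
ds.SelectedVariableNames = names;

% A tall table defers evaluation: nothing is read until gather is called.
tt = tall(ds);
rows = tt.Cycle_Index == 1 & tt.Step_Index == 30;

% gather runs the deferred filter and returns an ordinary in-memory table
% that contains only the matching rows and the two requested columns.
segment = gather(tt(rows, {'CurrentA','VoltageV'}));
disp(segment)
```

Each call to gather reads through the csv files again, so if the same segments are needed repeatedly it may be worth saving the gathered result, or combining several queries into a single gather call.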