I have a large structure array (500K+ items), and I wish to access certain fields of that array and concatenate the results. Below is a placeholder example.
A{1}.time = 1500;
A{1}.data.temp = 70;
A{1}.data.humidity = 20;
A{2}.time = 1501;
A{2}.data.temp = 73;
A{2}.data.humidity = 19;
And so on, until we have 500,000 of these. (I have made it a cell array since the actual entries differ in my data, and I have other code that will go through and just grab the cells we want.)
Now, I want to access e.g. all of the 'data' and concatenate it so that I have a simple vector I can plot. Currently this is done using a loop, but that is very slow. Is there a faster way to do this than some version of the below:
fieldNames = fieldnames(A{1}.data);
for ii = 1:length(fieldNames)
    % Pull one subfield out of every cell and collect the values
    out.(fieldNames{ii}) = ...
        cat(1, cellfun(@(x) getField(x, 'data', fieldNames{ii}), A));
end
where
function out = getField(in, fieldname1, fieldname2)
    out = in.(fieldname1).(fieldname2);
end
Again, this certainly works, but for extremely large datasets with lots of fields it becomes very slow. I bet that there is a much more efficient way of gathering all of the data contained in the fields and subfields of a large data set like the one above. Any help is appreciated.
Thanks, -Dan
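For comparison with the cellfun loop above, here is a minimal sketch of one alternative that is often faster: convert the selected cells to a struct array and gather each subfield with comma-separated list expansion. It assumes every selected cell holds a struct with identical field names, which may not hold for the real data; the three-element A is placeholder data in the same layout as the example.

```matlab
% Placeholder data in the same layout as the example (cell array of structs).
A = cell(1, 3);
for k = 1:numel(A)
    A{k} = struct('time', 1499 + k, ...
                  'data', struct('temp', 70 + k, 'humidity', 20 - k));
end

% Assumption: every cell holds a struct with identical field names.
S = [A{:}];      % 1x3 struct array
D = [S.data];    % 1x3 struct array of the nested 'data' structs

% Gather each subfield into a column vector via comma-separated list
% expansion instead of calling an anonymous function once per cell.
names = fieldnames(D);
out = struct();
for ii = 1:numel(names)
    out.(names{ii}) = [D.(names{ii})].';
end
```

After this runs, out.temp and out.humidity are plain numeric column vectors that can be passed straight to plot. If the cells do not share the same fields, the concatenation [A{:}] will error, so the cells would need to be filtered first.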
An additional discovery: MATLAB is somehow storing the field names for each sub-structure individually. In the above example, it has memory allocated for the field names data.temp and data.humidity TWICE (once for each copy). This is why it is so slow. A 50 MB set of data has grown to 3 GB because of this organization scheme. I am going to make a separate post about this (is the memory issue resolved if each entry is a known class, so that the field names aren't stored once for each copy?).
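Whatever the exact cause of the overhead, it can be measured directly with whos. The sketch below compares the layout from the question with a hypothetical alternative layout (one struct whose fields are vectors); the size n and the variable B are assumptions for illustration only, not part of the original data.

```matlab
n = 1000;  % small placeholder size; the real data has 500K+ entries

% Layout from the question: one nested struct per cell.
A = cell(1, n);
for k = 1:n
    A{k} = struct('time', k, 'data', struct('temp', 70, 'humidity', 20));
end

% Hypothetical alternative layout: one struct whose fields are n-by-1 vectors.
B = struct();
B.time = (1:n).';
B.data.temp = 70 * ones(n, 1);
B.data.humidity = 20 * ones(n, 1);

% whos reports the bytes used by each variable, including per-element overhead.
infoA = whos('A');
infoB = whos('B');
fprintf('cell of structs:   %d bytes\n', infoA.bytes);
fprintf('struct of vectors: %d bytes\n', infoB.bytes);
```

The two variables hold the same numbers, so any difference in the reported bytes is bookkeeping for the containers rather than the data itself.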
Best Answer