MATLAB: Why is the MAT file much larger than the variable I was trying to save when using the -v7.3 flag

I have a very large ensemble generated using the "fitensemble" command. Whenever I try to save the ensemble to a MAT file using the "save" command with the -v7.3 flag, the file size of the MAT file is significantly larger than the size of the ensemble within MATLAB. Why does this happen?

Best Answer

By default, the "save" command compresses the data it writes. The default MAT-file format (version 7) cannot store a variable larger than 2GB, and the "-v7.3" flag removes this limitation. However, the -v7.3 format is based on HDF5, which adds extra metadata to the file. The amount of metadata depends on how complicated the data structure is. For complicated data structures (such as the ensembles returned by "fitensemble"), the added metadata can increase the file size by a factor of 4 or higher.
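To see the effect on your own machine, you can save the same nested variable in both formats and compare the file sizes. This is only a rough sketch: the cell array of structs is a made-up stand-in for a complicated object, the file names are hypothetical, and the actual ratio will vary with the data and the MATLAB release.

```matlab
% Build a nested variable: many small structs inside a cell array.
% (Illustrative stand-in for a complicated object such as an ensemble.)
n = 2000;
data = cell(n, 1);
for k = 1:n
    data{k} = struct('id', k, 'values', rand(1, 5), ...
                     'label', sprintf('item%d', k));
end

% Hypothetical file names in the temporary folder.
fileV7  = fullfile(tempdir, 'demo_v7.mat');
fileV73 = fullfile(tempdir, 'demo_v73.mat');

% Save the same variable in both MAT-file formats.
save(fileV7,  'data', '-v7');
save(fileV73, 'data', '-v7.3');

% Compare the in-memory size with the size of each file on disk.
inMemory = whos('data');
infoV7   = dir(fileV7);
infoV73  = dir(fileV73);

fprintf('In memory:  %d bytes\n', inMemory.bytes);
fprintf('-v7 file:   %d bytes\n', infoV7.bytes);
fprintf('-v7.3 file: %d bytes\n', infoV73.bytes);
```

If the -v7.3 file comes out several times larger than the -v7 file, the difference is the per-object overhead described above rather than the data itself.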
To work around this issue, save the MAT file without the -v7.3 flag.
If you are working with regression tree ensembles and the data is larger than 2GB, first try to reduce its size with the "compact" command before saving.
If the "compact" command does not reduce the size below 2GB, try breaking the data set apart and saving each part to its own MAT file.
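The "compact" step could look roughly like the sketch below. It assumes the Statistics and Machine Learning Toolbox is installed. The synthetic data, the 'LSBoost' method, the number of learners, and the output file name are all placeholders chosen for illustration, not values from the original question.

```matlab
% Synthetic regression data (placeholder for the real training set).
rng(0);
X = rand(500, 4);
Y = X(:, 1) + 2 * X(:, 2) + 0.1 * randn(500, 1);

% Train a boosted regression tree ensemble, then compact it.
% compact returns an ensemble that no longer stores the training data.
ens  = fitensemble(X, Y, 'LSBoost', 50, 'Tree');
cens = compact(ens);

% Compare the in-memory sizes before and after compacting.
sEns  = whos('ens');
sCens = whos('cens');
fprintf('Full ensemble:    %d bytes\n', sEns.bytes);
fprintf('Compact ensemble: %d bytes\n', sCens.bytes);

% Use the -v7 format when the variable fits under the 2GB limit,
% and fall back to -v7.3 only when it does not.
limitBytes = 2^31;
outFile = fullfile(tempdir, 'cens_demo.mat');   % hypothetical file name
if sCens.bytes < limitBytes
    save(outFile, 'cens', '-v7');
else
    save(outFile, 'cens', '-v7.3');
end
```

The compact ensemble can still be used for prediction, but it cannot be used for operations that need the original training data, so keep the full ensemble in memory if you still need those.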