MATLAB: Does “h5read” output string data differently than “hdf5read” when reading the same file?

I am moving from "hdf5read" to "h5read" to import data from HDF5 files into MATLAB. I have a file that contains string data. When I read it with "hdf5read", the data I get is the character matrix 'ABC'. When I read the same file with "h5read", the data I get is the cell array
{'ABC ¤'}
Why are they different and how can I get "h5read" to give me the same data as "hdf5read"?

Best Answer

"h5read" returns string data as a cell array so that variable-sized datasets, such as datasets containing variable-length strings, are handled in a consistent and predictable way. If the data is not variable-sized, the cell array can be converted into a character matrix with the "cell2mat" command.

"h5read" can also show characters that "hdf5read" does not, because "h5read" returns every character stored in the string, whereas "hdf5read" ends the string at the first null character it encounters (null characters appear as spaces when displayed in MATLAB). To replicate the "hdf5read" behavior, use the command:

>> data = regexp(data,'(.*?)(?=\x0)','match','once')

where "data" is the output of "h5read", converted into a character array.
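
As a rough end-to-end illustration of the two steps described above (convert the cell array with "cell2mat", then cut the string at the first null character), here is a minimal sketch. It does not read a real file: the variable "raw" is a made-up stand-in for the "h5read" output, and the file and dataset names in the comment are hypothetical. It uses "find" instead of the regular expression so that a string with no null character passes through unchanged (the lookahead in the regular expression requires a null to be present).

```matlab
% Stand-in for the h5read output in the question: a 1x1 cell array holding
% a string with a null character followed by a leftover character.
% With a real file this would be something like
%   raw = h5read('myfile.h5', '/mydataset');   % hypothetical names
raw = {['ABC' char(0) char(164)]};

% Convert the cell array to a character array (valid when the strings
% are not variable-sized).
str = cell2mat(raw);

% Keep only the characters before the first null character, if there is one.
nullIdx = find(str == 0, 1);
if ~isempty(nullIdx)
    str = str(1:nullIdx-1);
end

disp(str)
```

This sketch assumes the dataset holds a single string, so "str" is a row vector. For a dataset with several strings, the same trimming would have to be applied to each cell separately, for example with "cellfun".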