MATLAB: Fast initialization of a cell array of strings in MEX

mex, speed

I'm looking to initialize a cell array of strings in a MEX file as quickly as possible. The input is one long array of characters, plus an additional array holding the start and stop locations of each string to grab. Any tips on ways of doing this quickly in MATLAB?
I am planning on doing roughly the following:
  • initialize the cell array with mxCreateCellArray
  • initialize each cell with an empty value using mxCreateCharArray – I think that using an empty value will avoid any initialization overhead. It would also be great if I could create a bunch of mxArray headers quickly in MATLAB without having to call a function for each one.
  • populate the cell data (string value) from string data that is 2 bytes per character, using mxSetData, mxSetM, and mxSetN – here it seems that a smart memory allocator would allow allocating one large block of memory and then breaking it into smaller chunks, rather than individually requesting memory from the OS for each string

Best Answer

So, it sounds like you have a single array of char data in your C routine, with a second int array that contains the start/stop locations. I assume you know up front, from the int array, how many strings are involved. So just use mxCreateCellArray and then loop through your strings with mxCreateString and mxSetCell. It is unclear to me why you think you need mxCreateNumericArray for anything, since it is OK to leave a cell element NULL for an empty cell (that's what MATLAB does at the m-file level). It is also unclear to me what kind of speed/resource advantage you think you will get by using mxSetData, mxSetM, and mxSetN in some way (unless perhaps the strings in your char array are not individually null terminated, which mxCreateString requires?). Can you elaborate? Are all of the strings unique, or are some of them shared among multiple cell elements?
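A minimal sketch of that loop, in case it helps. It assumes the start/stop locations are 0-based and inclusive and that the substrings are not null terminated, so each one is copied into a scratch buffer before the mxCreateString call. The names copy_substring and build_cellstr and the demo data are made up for illustration, and mxCreateCellMatrix is used here as a shorthand for a 2-D mxCreateCellArray call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy buf[start..stop] (0-based, inclusive) into tmp
   and null-terminate it. tmp must hold at least stop - start + 2 bytes.
   Returns the number of characters copied. */
static size_t copy_substring(const char *buf, size_t start, size_t stop,
                             char *tmp)
{
    size_t len = stop - start + 1;
    memcpy(tmp, buf + start, len);
    tmp[len] = '\0';
    return len;
}

#ifdef MATLAB_MEX_FILE   /* defined by the mex build command */
#include "mex.h"

/* Hypothetical helper: build an n-by-1 cell array of strings. */
static mxArray *build_cellstr(const char *buf, const size_t *start,
                              const size_t *stop, size_t n)
{
    mxArray *cell = mxCreateCellMatrix((mwSize)n, 1);
    size_t i, maxlen = 0;
    char *tmp;

    /* Size one scratch buffer for the longest substring. */
    for (i = 0; i < n; ++i) {
        size_t len = stop[i] - start[i] + 1;
        if (len > maxlen) maxlen = len;
    }
    tmp = (char *)mxMalloc(maxlen + 1);

    for (i = 0; i < n; ++i) {
        copy_substring(buf, start[i], stop[i], tmp);
        /* mxCreateString copies tmp, so the buffer can be reused. */
        mxSetCell(cell, (mwIndex)i, mxCreateString(tmp));
    }
    mxFree(tmp);
    return cell;
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    /* Demo data only; a real routine would get these from its own source. */
    static const char   buf[]   = "alphabetagamma";
    static const size_t start[] = {0, 5, 9};
    static const size_t stop[]  = {4, 8, 13};
    (void)nlhs; (void)nrhs; (void)prhs;
    plhs[0] = build_cellstr(buf, start, stop, 3);
}
#endif
```

If the substrings are already null terminated in the buffer, the scratch copy can be dropped and buf + start[i] passed to mxCreateString directly.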
I don't know how to create a bunch of mxArrays en masse using the official API functions. You have to do it one at a time.
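For what it's worth, here is one possible way to create each string one at a time without a temporary null-terminated copy: allocate a 1-by-len char array with mxCreateCharArray and widen the bytes straight into its data. This is only a sketch. It assumes the source bytes are plain single-byte characters and that mxChar is 16 bits wide (the "2 bytes per character" mentioned in the question). widen_bytes and make_char_row are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen len single-byte characters from src into
   16-bit code units at dst. src does not need to be null terminated. */
static void widen_bytes(const char *src, size_t len, uint16_t *dst)
{
    size_t i;
    for (i = 0; i < len; ++i)
        dst[i] = (uint16_t)(unsigned char)src[i];
}

#ifdef MATLAB_MEX_FILE   /* defined by the mex build command */
#include "mex.h"

/* Hypothetical helper: make a 1-by-len char mxArray from src[0..len-1].
   A zero len gives a 1-by-0 char array. Assumes mxChar is 16 bits. */
static mxArray *make_char_row(const char *src, size_t len)
{
    mwSize dims[2];
    mxArray *str;

    dims[0] = 1;
    dims[1] = (mwSize)len;
    str = mxCreateCharArray(2, dims);
    widen_bytes(src, len, (uint16_t *)mxGetData(str));
    return str;
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    /* Demo data only: take the four characters starting at offset 5. */
    static const char buf[] = "alphabetagamma";
    (void)nlhs; (void)nrhs; (void)prhs;
    plhs[0] = make_char_row(buf + 5, 4);
}
#endif
```

Each result would still be attached to the cell array with mxSetCell, one element per call. Whether this beats mxCreateString in practice is something to measure rather than assume.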