MATLAB: Changing the contents of a cell array in MEX files

MATLAB, mxsetcell, mxgetcell, mxdestroyarray

I have a mex file with a cell array that contains some structure arrays. I want to grow those structure arrays as needed while processing a large dataset, and then finally return the cell array to MATLAB.
This has been painfully difficult. After many MATLAB crashes, I finally have it working. I think the root cause of my trouble has something to do with how cell arrays manage memory. I see this mentioned in the documentation for mxGetCell: "Do not call mxDestroyArray on an mxArray returned by the mxGetCell function."
This might explain why things crashed horribly when I tried to resize my structure arrays (obtained from the mxGetCell call) using the method described in this link. However, the online documentation for mxSetCell gives this rather contradictory advice: "To free existing memory, call mxDestroyArray on the pointer returned by mxGetCell before you call mxSetCell."
So which is it? What I have working is a rather brute-force solution. When I want to grow the structure array, I allocate it at the new size with a call to mxCreateStructArray, and then I manually copy every element and field with calls to mxGetFieldByNumber and mxSetFieldByNumber. I then call mxSetCell, and I do not call mxDestroyArray on the original structure array, contrary to the advice in the mxSetCell documentation.
I guess this works, but it's inefficient, and I'd rather use the more efficient method for growing structure arrays utilizing reAlloc. It seems like this might be incompatible with how cell arrays manage memory, though. In any case, it would be nice to understand what is going on so I can avoid problems in the future, and to get some clarification on the inconsistencies in the documentation.

Best Answer

When you call mxDestroyArray on a cell array or struct array, it does a deep destroy: all of the cell array or struct array elements are deep destroyed first, then the cell array or struct array itself is destroyed. That is why you should not call mxDestroyArray on the result of a mxGetCell or mxGetField call without cleaning things up. If you do, you have invalidated memory that the cell array or struct array still points to, and when the cell array or struct array eventually gets destroyed it will try to free that invalid memory and bomb.
Regarding the documentation:
"Do not call mxDestroyArray on an mxArray returned by the mxGetCell function"
This refers to cell arrays that come from one of the prhs[ ] input variables, where doing so can screw up the workspace if it is a shared data copy of another variable. This sentence does not apply to cell arrays that you create and populate inside the mex routine where you know it is not a shared data copy of another variable, as long as you NULL out the corresponding spot in the cell array. Same thing is true for struct arrays btw.
"To free existing memory, call mxDestroyArray on the pointer returned by mxGetCell before you call mxSetCell"
This refers to cell array content that you originally created inside the mex routine, i.e. not part of a prhs[ ] variable. As long as you created it inside the mex routine, you can destroy it safely as well. E.g., start with this:
mxArray *mycell, *myvariable, *var;
mycell = mxCreateCellMatrix(1,1); /* 1x1 cell array */
myvariable = mxCreateDoubleScalar(5.0); /* Temporary, on garbage collection list */
mxSetCell( mycell, 0, myvariable ); /* myvariable is now Sub-Element of mycell */
The pointer contained in myvariable is put directly into mycell. Not a copy of the variable, but the actual pointer itself. And the type of the variable is changed from Temporary (on the garbage collection list) to Sub-Element (NOT on the garbage collection list). Its disposition is now entirely dependent on the disposition of the cell array it is part of.
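A quick way to check the "actual pointer, not a copy" behavior yourself is a tiny MEX routine like the sketch below. It is only an illustration: it assumes it is compiled with mex inside MATLAB, and it simply compares the address returned by mxGetCell with the address that was handed to mxSetCell. If the pointer is stored directly as described, the returned logical should be true.

```c
#include "mex.h"

/* Sketch: returns a logical telling whether mxGetCell hands back the very
 * same pointer that was stored with mxSetCell (no copy made). */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mxArray *mycell     = mxCreateCellMatrix(1, 1);
    mxArray *myvariable = mxCreateDoubleScalar(5.0);

    (void)nlhs; (void)nrhs; (void)prhs;   /* unused in this sketch */

    mxSetCell(mycell, 0, myvariable);     /* myvariable now belongs to mycell */

    /* Compare addresses only; nothing is dereferenced or copied here. */
    plhs[0] = mxCreateLogicalScalar(mxGetCell(mycell, 0) == myvariable);

    /* Deep destroy: this also frees myvariable, so it must NOT be
     * destroyed separately. */
    mxDestroyArray(mycell);
}
```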
What you cannot do is this:
var = mxGetCell( mycell, 0 ); /* this part is ok */
mxDestroyArray( var ); /* you have just made mycell invalid */
mxDestroyArray( mycell ); /* this will bomb MATLAB */
That last line will bomb MATLAB because mycell still contains the pointer of the variable you previously destroyed, so when MATLAB tries to subsequently destroy that element it will access invalid memory and bomb. That is, destroying var is actually OK since you originally created var in the mex routine ... but not cleaning up mycell properly will lead to a crash.
The correct way to extract and destroy a variable you originally created in a mex routine is:
var = mxGetCell( mycell, 0 ); /* this part is ok */
mxSetCell( mycell, 0, NULL ); /* NULL out the pointer that we just extracted */
/* See NOTE below */
mxDestroyArray( var ); /* OK since we originally created this inside the mex routine */
mxDestroyArray( mycell ); /* this will work OK */
NOTE: The var you extracted from mxGetCell( ) is actually not a Temporary variable anymore, it is a Sub-Element since it came from a cell array. Meaning, it is NOT on the garbage collection list and will NOT get automatically destroyed when the mex routine exits. This is true even though you originally created it inside the mex routine (as soon as you called mxSetCell( ) with this as input the type changed and it was removed from the garbage collection list). At this point you must do one of two things to avoid a memory leak: either call mxDestroyArray( var ) downstream in your code, or attach var to a cell or struct array. There are no API functions to put an mxArray back on the garbage collection list once it has been removed.
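Applied to the original question, the same extract / detach / destroy pattern can be used to grow a struct array that lives inside a cell without deep-copying its fields. The following is a minimal sketch, not a definitive implementation. The helper name grow_struct_in_cell and the field names "id" and "value" are made up for illustration, and it assumes that everything involved was created inside the mex routine (not taken from prhs[ ]) and that mxSetFieldByNumber accepts NULL to detach a field in the same way mxSetCell does above.

```c
#include "mex.h"

/* Hypothetical helper: grow the struct array stored in cell[idx] to
 * 1 x new_n by MOVING the field pointers into a larger struct array.
 * Assumes the cell and its contents were created in this MEX routine. */
static void grow_struct_in_cell(mxArray *cell, mwIndex idx, mwSize new_n)
{
    mxArray *old_s  = mxGetCell(cell, idx);
    int     nfields = mxGetNumberOfFields(old_s);
    mwSize  old_n   = (mwSize)mxGetNumberOfElements(old_s);
    const char **names;
    mxArray *new_s;
    mwIndex i;
    int f;

    if (new_n <= old_n) return;           /* nothing to grow */

    /* Build the larger struct array with the same field names. */
    names = (const char **)mxMalloc((size_t)nfields * sizeof(*names));
    for (f = 0; f < nfields; f++)
        names[f] = mxGetFieldNameByNumber(old_s, f);
    new_s = mxCreateStructMatrix(1, new_n, nfields, names);
    mxFree((void *)names);

    /* Move each field pointer: detach from the old array first, so the
     * deep destroy below does not free what new_s now owns. */
    for (i = 0; i < old_n; i++) {
        for (f = 0; f < nfields; f++) {
            mxArray *v = mxGetFieldByNumber(old_s, i, f);
            mxSetFieldByNumber(old_s, i, f, NULL);
            mxSetFieldByNumber(new_s, i, f, v);   /* v may be NULL (empty field) */
        }
    }

    mxSetCell(cell, idx, new_s);  /* overwrite the old pointer in the cell */
    mxDestroyArray(old_s);        /* old array now holds only NULL fields */
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    const char *fields[] = {"id", "value"};   /* illustrative field names */
    mxArray *cell = mxCreateCellMatrix(1, 1);
    mxArray *s    = mxCreateStructMatrix(1, 1, 2, fields);

    (void)nlhs; (void)nrhs; (void)prhs;       /* unused in this sketch */

    mxSetField(s, 0, "id",    mxCreateDoubleScalar(1.0));
    mxSetField(s, 0, "value", mxCreateString("first"));
    mxSetCell(cell, 0, s);

    grow_struct_in_cell(cell, 0, 4);          /* 1x1 struct becomes 1x4 */

    plhs[0] = cell;                           /* return the cell to MATLAB */
}
```

Because only pointers are moved, the cost of growing is proportional to the number of elements times the number of fields, not to the size of the data they hold. The new, unfilled elements are left with empty fields.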