MATLAB: Is there a way to make device memory persist between CUDA MEX calls?

Tags: api, cuda, gpu, mex, parallel, persistent

Is there a graceful way to allocate data on, or transfer data to, the GPU in one MEX file (basically MATLAB interfaces to cudaMalloc or cudaMemcpy) and then process that data on the GPU with a different MEX file? I would like to do this without the Parallel Computing Toolbox.
After the allocation or transfer, I need to keep the pointer to the device memory and have it reside in the MATLAB workspace in some form until it is passed to the data processing MEX file. What is the best way to do that? Would I just convert the pointer value (not dereferenced) to a MATLAB integer and then convert it back in the data processing MEX file when needed?

Best Answer

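Yes. You can reinterpret_cast the pointer to an integer type wide enough to hold it, e.g. uint64, and return that value to MATLAB. Then pass the integer to the second MEX file, which casts it back to a pointer to the GPU memory. The allocation survives between the calls because both MEX files run inside the same MATLAB process and so share the same CUDA context, provided nothing resets the device (e.g. cudaDeviceReset) in between.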
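The pointer round trip described above can be sketched in plain C++ as follows. The helper names pointer_to_handle and handle_to_pointer are made up for illustration, and the MEX glue is only indicated in comments; in an actual MEX file the handle would probably be stored in a 1x1 array created with mxCreateNumericMatrix(1, 1, mxUINT64_CLASS, mxREAL).

```cpp
#include <cstdint>

// The integer handle must be wide enough to hold a pointer on this platform.
static_assert(sizeof(void*) <= sizeof(std::uint64_t),
              "a uint64 handle cannot hold a pointer on this platform");

// Hypothetical helper: pack a pointer (e.g. one returned by cudaMalloc)
// into a uint64 that can be handed back to MATLAB. The pointer is never
// dereferenced on the host; only its numeric value is stored.
std::uint64_t pointer_to_handle(void* ptr) {
    return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(ptr));
}

// Hypothetical helper: recover the pointer in the second MEX file from the
// uint64 value that MATLAB passed back in.
void* handle_to_pointer(std::uint64_t handle) {
    return reinterpret_cast<void*>(static_cast<std::uintptr_t>(handle));
}
```

On the MATLAB side the handle is just a uint64 scalar. Nothing stops a caller from modifying it or passing it after the memory has been freed, so the receiving MEX file has no way to validate it with this scheme alone.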
If you take this approach and want to minimize the chance of leaking memory, wrap the memory allocation in a C++ class and use the approach outlined in this FEX (File Exchange) submission.
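As a rough illustration of wrapping the allocation in a C++ class (this is a generic RAII sketch, not the code from the FEX submission): the class name DeviceBuffer and the hooks device_alloc and device_free are hypothetical. In a real CUDA MEX file the hooks would wrap cudaMalloc and cudaFree; malloc and free stand in here so the sketch compiles without the CUDA toolkit.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator hooks. Assumption: in a CUDA build these would call
// cudaMalloc and cudaFree; host malloc/free are stand-ins for this sketch.
inline void* device_alloc(std::size_t bytes) { return std::malloc(bytes); }
inline void device_free(void* ptr) { std::free(ptr); }

// Owns one allocation and releases it exactly once, in the destructor.
class DeviceBuffer {
public:
    explicit DeviceBuffer(std::size_t bytes)
        : ptr_(device_alloc(bytes)), bytes_(bytes) {}

    ~DeviceBuffer() { device_free(ptr_); }

    // Copying is disabled so two objects can never free the same memory.
    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;

    void* data() const { return ptr_; }
    std::size_t size() const { return bytes_; }

private:
    void* ptr_;
    std::size_t bytes_;
};
```

With a class like this, the integer returned to MATLAB would be the address of a heap-allocated DeviceBuffer rather than the raw device pointer, and a matching "delete" call would destroy the object and free the memory. Keeping creation and deletion in the same MEX binary (for example by dispatching on a command string) is one way to avoid mixing allocators across modules.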