MATLAB: Are parfor loops so inefficient?

data parallel, data parallelism, efficiency, memory, Parallel Computing Toolbox, parfor

In terms of memory, that is. A parfor loop uses vastly more memory than its for-loop counterpart, apparently because it makes a copy of all of the data for each thread. But it does this even when the data are read-only, so the copies are completely unnecessary: simultaneous reads of a piece of data from multiple threads are just fine, in general. Moreover, MATLAB clearly already knows which data are read-only, through its variable 'classification'. Yet the copies are made anyway. I have lost a lot of time as my system grinds to a halt when trying to run parallelized code on large data files. Is there any way to remedy the situation? Or is it just a programming fail we have to live with (at least for now)?

Best Answer

It isn't quite the case that copies of all the data are always made. Sliced variables are not copied in their entirety (each worker gets only the slices it needs), nor are distributed variables (used in spmd). I think you're expected to partition your computation to take advantage of that.
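To make the sliced-versus-copied distinction concrete, here is a minimal sketch. The matrix `data`, its size, and the output names are made up for illustration, and how much memory is actually saved will depend on your data and pool setup. The first loop reads `data` through a computed index, so MATLAB classifies it as a broadcast variable and sends the whole matrix to every worker. The second loop indexes it only as `data(:, k)`, which makes it a sliced variable.

```matlab
% Hypothetical data: each column is an independent unit of work.
data  = rand(1000, 8);
nCols = size(data, 2);

% Broadcast use: data is indexed through a computed value rather than
% directly by the loop variable, so the whole matrix goes to every worker.
colSumsBroadcast = zeros(1, nCols);
parfor k = 1:nCols
    idx = nCols - k + 1;                 % computed index, so data is not sliced
    colSumsBroadcast(k) = sum(data(:, idx));
end

% Sliced use: data is indexed only as data(:, k), so each worker receives
% just the columns for the iterations it runs.
colSumsSliced = zeros(1, nCols);
parfor k = 1:nCols
    colSumsSliced(k) = sum(data(:, k));
end
```

Both loops compute the same column sums (the first in reverse column order), so restructuring a loop to index its large inputs by the loop variable is one way to partition the computation without changing the result.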
"simultaneous reads from multiple threads are just fine."
It might help if you elaborate on that. My understanding was that having threads try to read from the same memory location is a major problem in parallel computing, because some threads then have to wait idly for their turn at access.