MATLAB: Distributing arrays to workers for local processing


How do I access parts of distributed data on the workers/labs?
I have a large time-series dataset and want to run some functions on smaller chunks of it. I already have a working parfor implementation.
I'd like to try it with a distributed array and spmd, but I don't know how to access the local data once I have distributed the array.
matlabpool start 100;
% size(myMat) is 800000 x 2
myMatdb= distributed(myMat');
spmd
chunk_of_data = myMatdb;
[out_of_chunk] = objFun(params, chunk_of_data);
end
This works, but every lab/worker ends up with the full data rather than a small chunk of it.
I would like to explore codistributed with the codistributor1d option to have more control over the distribution. Still, how do I tell each worker to operate only on its local part rather than on the whole array?
For some strange reason, functions like getLocalPart, localPart, etc. aren't available in my MATLAB R2011b.

Best Answer

Hello. If you are able to successfully open a matlabpool with your installation of R2011b, then you must have the Parallel Computing Toolbox. In that case, the getLocalPart function should also be available to you. What is the output from typing the following at the MATLAB command line:
which getLocalPart
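If it helps, here is a small sketch of two further checks you could run alongside that command. It assumes that 'Distrib_Computing_Toolbox' is the license feature name for the Parallel Computing Toolbox in your release; the variable name hasLicense is only illustrative.

```matlab
% Check whether a Parallel Computing Toolbox license is available.
% Assumption: 'Distrib_Computing_Toolbox' is the feature name in your release.
hasLicense = license('test', 'Distrib_Computing_Toolbox');
fprintf('Parallel Computing Toolbox license available: %d\n', hasLicense);

% List every getLocalPart that MATLAB can see, including class methods.
which -all getLocalPart
```

If the license check prints 0, or which reports that getLocalPart is not found, the problem is more likely with the installation or the path than with your code.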
Assuming that you can get the issue with getLocalPart sorted out (perhaps by calling technical support), this is how you would proceed with distributed arrays/spmd:
matlabpool open 100   % this will open 100 workers using your default configuration
% I assume that myMat was already loaded as a standard MATLAB array
size(myMat)           % You've stated that myMat is 800000 x 2
% There are a lot of rows, so let's use codistributor1d to distribute the
% rows across all the workers in the pool. This must be done inside the
% spmd block because that's where codistributed arrays and codistributors live.
spmd
    % Create a scheme to distribute the first dimension of a matrix (its
    % rows) as evenly as possible across all the workers in the pool
    codist = codistributor1d(1);
    % Use the scheme to create distributed data
    myMatdb = codistributed(myMat, codist);
    % Each worker operates only on its own part of the data
    chunk_of_data = getLocalPart(myMatdb);
    [out_of_chunk] = objFun(params, chunk_of_data);
    % Create a new array from the local outputs. I assume that out_of_chunk
    % is the same size as chunk_of_data on each worker, so that the
    % distribution scheme of myMatdb can be reused. codistributed.build
    % needs a complete codistributor (one that knows the global size), so
    % take it from myMatdb rather than reusing the template codist.
    fullOutput = codistributed.build(out_of_chunk, getCodistributor(myMatdb));
end
% fullOutput and myMatdb can be used as distributed arrays outside of the spmd block
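To get a feel for what "as evenly as possible" means for your sizes, here is a rough sketch in plain MATLAB (no toolbox needed) that works out which rows each worker would hold. It assumes an even split in which any leftover rows go to the first workers, one each. The variable names (partition, firstRow, lastRow) are illustrative and not part of the toolbox.

```matlab
nRows    = 800000;   % rows in myMat, as stated in the question
nWorkers = 100;      % pool size, as stated in the question

% Assumed even split: every worker gets floor(nRows/nWorkers) rows, and the
% first mod(nRows, nWorkers) workers get one extra row each.
base      = floor(nRows / nWorkers);
extra     = mod(nRows, nWorkers);
partition = base * ones(1, nWorkers);
partition(1:extra) = partition(1:extra) + 1;

% First and last global row index held by each worker.
lastRow  = cumsum(partition);
firstRow = lastRow - partition + 1;

fprintf('Worker 1 holds rows %d to %d\n', firstRow(1), lastRow(1));
fprintf('Worker %d holds rows %d to %d\n', nWorkers, firstRow(end), lastRow(end));
```

With 800000 rows and 100 workers the split is exact, so each worker's chunk_of_data would be 8000 x 2. If you want the actual indices on a worker rather than this estimate, the toolbox's globalIndices function should report them inside the spmd block, assuming it is available in your release.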
You can find more information here:
help getLocalPart
help codistributed.build