MATLAB: How to create pool of workers from a list of hostnames

parallel computing, Parallel Computing Toolbox

Hi all. In the past I've used the Parallel Computing Toolbox on an HPC cluster to submit a job to the scheduler, which MATLAB then connects to in order to create a worker pool. The problem is that on a shared HPC cluster the MATLAB session itself must also be started from a job, and there is no guarantee that the two jobs will start at around the same time.
What I would like to do is request all compute resources in advance, start up MATLAB on a single host, and then provide it with a list of the remaining hostnames where it can start up worker pools. Is this possible?
Any advice would be appreciated. Thanks!

Best Answer

This isn't feasible at this time; however, you might consider wrapping your code with batch. For example, let's assume your code looks something like this:
function my_parallel_code
    % ...
    nworkers = ... ;

    % Check if a parallel pool is running. If so, use it. If not, start a pool with 'nworkers' workers.
    p = gcp('nocreate');
    if isempty(p)
        % No pool has been started: spawn one across multiple nodes (kicks off a new job).
        % If a pool is already running, it was started in "submit_job" by the call to batch().
        parpool('hpc',nworkers);
    end

    parfor idx = 1:N
        % parallel code
    end
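Now, write a wrapper function:
function submit_job
    nworkers = ... ;   % must be defined here too; batch uses one additional worker to run my_parallel_code itself
    c = parcluster('hpc');
    % 0 output arguments, no input arguments, pool of 'nworkers' workers
    j = c.batch(@my_parallel_code,0,{},'Pool',nworkers);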
Submit your MATLAB job (e.g. using Slurm), calling submit_job instead of calling my_parallel_code directly:
#!/bin/bash
#SBATCH -n 1
module load matlab
matlab -batch submit_job
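Because batch returns as soon as the job is queued, the Slurm job above ends while the pool job is still pending or running. A minimal sketch of how you might check on it later from another MATLAB session, assuming the same 'hpc' profile and assuming the most recent job listed for that profile is the one created by submit_job (adjust the selection if you have several jobs):

```matlab
% Sketch only: inspect the job created by submit_job from a later session.
c = parcluster('hpc');            % same cluster profile as in the answer

jobs = c.Jobs;                    % all jobs this profile knows about
if isempty(jobs)
    error('No jobs found for this cluster profile.');
end

j = jobs(end);                    % ASSUMPTION: the last job is the one we submitted
fprintf('Job %d is %s\n', j.ID, j.State);

wait(j);                          % block until the job finishes (or fails)
diary(j);                         % print the command-window output captured on the worker

% my_parallel_code was submitted with 0 outputs, so there is nothing to fetch;
% remove the job's data once you no longer need it.
delete(j);
```

If my_parallel_code needs to return data, submit it with a nonzero number of output arguments and call fetchOutputs(j) after wait(j).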