MATLAB: HPC MATLAB parpool and speed

hpc, hpcc, MATLAB, parallel computing, parfor, parpool

Hey guys! I am new to the HPCC, and I am now running my MATLAB program on it. I am using parallel computing, i.e. parpool.
Here is the code for my "submit.sh":
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
/opt/hpc/MATLAB/R2019b/bin/matlab -nojvm -nodesktop -r "main_MultiEA;exit;"
The first thing is that I found the speed is similar to my local computer. Should I specify something in the .sh file to change this? And how can I know whether I have reached the limit of the resources or not?
The second thing is that I found that the only available parpool profile is "local", using the "allNames = parallel.clusterProfiles()" command. Should it be different on the HPCC?
The third thing is that when I use "parpool(16)", "parpool('local',16)", "parpool("myPool",16)", etc. to try to improve the speed, the program seems to crash. Here is my test.m to test the parpool; I guess the program crashes because there is no a.mat in the directory afterwards.
parpool("local",16);
a=0;
parfor i = 1:10
a = a+1;
end
save a.mat;
exit;
Could you tell me why that is? And how can I improve the speed? Thanks a lot!

Best Answer

Hi Ruan,
There are two ways to speed up your code: implicitly and explicitly. You don't have much control over the implicit side; MATLAB will find the best way to use your multiple cores. Explicitly, you can vectorize, preallocate, write MEX-files, etc. You can also use parallel pools.
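As a small, made-up illustration of preallocation and vectorization (the variable names are just for this sketch):
n = 1e6;
% Growing an array inside a loop forces repeated reallocation
x = [];
for k = 1:n
    x(k) = sin(k);
end
% Preallocating avoids the reallocation
x = zeros(1, n);
for k = 1:n
    x(k) = sin(k);
end
% Vectorizing is usually the fastest of the three
x = sin(1:n);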
Looking at your Slurm job script, make the following change:
/opt/hpc/MATLAB/R2019b/bin/matlab -nojvm -nodesktop -r "main_MultiEA;exit;"
to
/opt/hpc/MATLAB/R2019b/bin/matlab -batch main_MultiEA
-batch works in place of -nodesktop, -r, and the trailing "exit". Also drop -nojvm: you'll need the JVM if you use the Parallel Computing Toolbox (PCT).
I'd also consider using module if your cluster provides it (the module name, matlab, might be slightly different):
module load matlab
matlab -batch main_MultiEA
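If you're unsure of the exact module name, listing the available MATLAB modules should help (assuming your cluster uses Environment Modules or Lmod):
module avail matlab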
Next, you're requesting 2 nodes from Slurm, with 2 cores per node (4 cores total). But MATLAB itself only runs on a single node, so the second node is of no use. That means when you start a pool of 16 workers, you're running it on 2 cores (or you should be; it might depend on whether cgroups are enabled). This is probably why MATLAB is crashing: you're running out of memory. To write this more flexibly, try
sz = str2double(getenv('SLURM_CPUS_PER_TASK'));  % the env var is a string, so convert it to a number
parpool("local", sz);
a = 0;
parfor i = 1:10
    a = a + 1;   % parfor treats a as a reduction variable
end
save a.mat
This way, regardless of how many cores you request, the pool will be the right size.
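Putting the pieces together, a revised submit.sh might look something like the sketch below. The core count and memory are placeholders to adjust for your job, and note that SLURM_CPUS_PER_TASK is only set when --cpus-per-task is used:
#!/bin/bash
#SBATCH --nodes=1                 # MATLAB runs on a single node
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16        # cores available to the parpool workers
#SBATCH --mem=32G                 # placeholder; size this for your job
module load matlab                # or the full path to the matlab binary
matlab -batch main_MultiEA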
With that said, there are two things to think about:
  1. Obviously, you'll see no speedup in your example; there has to be a reasonable amount of work for the workers to do.
  2. Using the "local" profile, the parallel pool will only run "local" to wherever MATLAB is running (on the HPC compute node). If you want to run a larger pool, across nodes, then you'll need to create a Slurm cluster profile with MATLAB Parallel Server (see the sketch below).
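For illustration only, once such a profile exists, using it could look roughly like this; the profile name "mySlurmProfile" is made up, and the real name depends on how the profile is set up on your cluster:
c = parcluster('mySlurmProfile');   % hypothetical MATLAB Parallel Server profile
pool = parpool(c, 32);              % workers can now span multiple nodes
results = zeros(1, 1000);
parfor i = 1:1000
    results(i) = i^2;               % placeholder for real work
end
delete(pool);                       % shut the pool down when finished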
Raymond