MATLAB: Does a batch job run on a single worker if the batched code doesn't contain any parfor loop?

batch, eigs, parallel

I have a code file that computes eigenvalues of a large sparse matrix, about 40000×40000, by calling the function 'eigs'.
When I first ran the code by simply typing its file name, like 'myfile.m', in the Command Window and pressing Enter, it ran as it should. I checked that the MATLAB client was running with multiple threads, which indicates that EIGS itself can run in parallel. The computing time was acceptable.
Now I have to change the input variables of myfile and run it tens of times, which is tedious to do manually. So I chose to use the function BATCH to submit the jobs in sequence.
When I batch myfile, I find that it runs on only one worker. It does not run in parallel automatically, as EIGS did before. From the tutorials I understand that BATCH transfers myfile to a worker of the cluster. Is the parallel pool not started because myfile has no parfor loops, even though EIGS is in the code? The job also takes a long time to compute. So is there any way to make the batched myfile jobs still run in parallel, with eigs but no parfor? Without parallel computing the time cost is considerable.
Thanks in advance!

Best Answer

Your local machine is making use of maxNumCompThreads, which by default is set to the number of physical cores on your machine. The 'parallelism' you see consists of threads, not processes, that eigs spawns onto those cores.
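To see this on the local machine, here is a minimal sketch that queries the thread limit and times eigs with and without multithreading. The test matrix (built with sprandsym, far smaller than the 40000×40000 one in the question) and the choice of 6 eigenvalues are assumptions for illustration only. For a matrix this small the two timings may not differ much.

```matlab
% Query the current limit on computational threads (base MATLAB, no toolbox).
nThreads = maxNumCompThreads;
fprintf('Computational threads available: %d\n', nThreads);

% Illustrative sparse symmetric test matrix (an assumption, not the asker's data).
n = 2000;
A = sprandsym(n, 0.01) + n*speye(n);

% Time eigs under the current thread limit.
tic; d1 = eigs(A, 6); tMulti = toc;

% Restrict MATLAB to one computational thread; the call returns the old limit.
prev = maxNumCompThreads(1);
tic; d2 = eigs(A, 6); tSingle = toc;

% Restore the earlier limit so the rest of the session is unaffected.
maxNumCompThreads(prev);
fprintf('multi-threaded: %.3f s, single-threaded: %.3f s\n', tMulti, tSingle);
```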
When you call batch, you can pass it a 'Pool' argument, specifying how many workers (processes) should start up. batch will request one additional worker to act as your proxy MATLAB client. For example,
j = batch(...,'Pool',3);
will request 4 workers on your cluster. As you've noticed, you don't need additional workers for your code. You just need the single worker that batch will request, which in turn needs to spawn threads (across several cores).
Your question then is: why does eigs run with multiple threads locally, but only a single thread on the cluster? By default, MATLAB will only request a single thread per worker. To increase this, set the number of threads per worker, like this (8 is only an example):
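c = parcluster('name-of-cluster-profile');  % substitute your cluster profile name
c.NumThreads = 8;                           % threads per worker (example value)
j = c.batch('myfile');                      % submit the script as a batch job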
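Since the goal is to run the same computation tens of times with different inputs, the submission can be scripted. The following is a rough sketch, not a definitive recipe: it assumes the Parallel Computing Toolbox, uses the default cluster profile, and stands in for myfile with a hypothetical anonymous function solveOne that takes one hypothetical input (shift). The values 4 and [10 20 30] are examples only.

```matlab
% Assumption: the default cluster profile is the one you want to use.
c = parcluster;
c.NumThreads = 4;   % threads per worker (example value)

% Hypothetical stand-in for myfile: build a sparse matrix and call eigs.
solveOne = @(shift) eigs(sprandsym(2000, 0.01) + shift*speye(2000), 6);

% Submit one single-worker batch job per input value.
shifts = [10 20 30];
jobs = cell(size(shifts));
for k = 1:numel(shifts)
    jobs{k} = batch(c, solveOne, 1, {shifts(k)});
end

% Collect the results and clean up each job once it has finished.
results = cell(size(jobs));
for k = 1:numel(jobs)
    wait(jobs{k});
    out = fetchOutputs(jobs{k});   % cell array with one entry per output
    results{k} = out{1};
    delete(jobs{k});
end
```

No 'Pool' argument is passed, so each job occupies a single worker and relies on threads within that worker.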
If, for some reason, that doesn't work, you can call maxNumCompThreads in myfile instead. The caveat is that setting NumThreads is what requests the additional cores needed for the additional threads. If setting NumThreads didn't work, the job will only have been allocated a single core, and calling maxNumCompThreads may then have one of the following consequences:
  1. Starting threads on cores you don't own
  2. Starting all on the same core, if cgroups are implemented
Neither is desirable.
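Before resorting to maxNumCompThreads inside myfile, it may help to check what thread limit the worker actually sees. A minimal sketch, assuming the Parallel Computing Toolbox and the default cluster profile (both assumptions; the value 4 is only an example), is to run maxNumCompThreads itself as a batch job and read back its output:

```matlab
% Assumption: the default cluster profile is the one you submit to.
c = parcluster;
c.NumThreads = 4;   % example value

% Run maxNumCompThreads on the worker, requesting one output and no inputs.
j = batch(c, @maxNumCompThreads, 1, {});
wait(j);
out = fetchOutputs(j);
workerThreads = out{1};
delete(j);

fprintf('Requested %d threads; worker reports %d\n', c.NumThreads, workerThreads);
```

If the reported value matches the requested one, the NumThreads setting most likely took effect and there should be no need to change the thread count from inside myfile.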