MATLAB: Why do I receive a “qsub: Job exceeds queue resource limits” error during parallel job submission to a Torque / PBS cluster?

Parallel Computing Toolbox

I am submitting a parallel job using a Parallel Computing Toolbox configuration for a Torque or PBS cluster. When ClusterSize is set to a value greater than the number of nodes in the cluster (though less than the total number of cores or MATLAB Parallel Server worker licenses), I receive an error similar to the following:
Error executing the PBS script command 'qsub'. The reason given is
qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
The job completes successfully if ClusterSize is set to a value equal to or less than the number of nodes in the cluster. How can my parallel jobs take advantage of the additional cores and worker licenses available?

Best Answer

The error indicates that the job request exceeds the queue resource limits. Check with your system administrator for the exact limits of the queue, the number of physical nodes, and the number of processors/cores per node. If there are multiple cores per node and the number of workers per job exceeds the number of physical nodes, you will need to modify the communicatingSubmitFcn.m file (pbsNonSharedParallelSubmitFcn.m in older releases) on the client. With one processor requested per node, each worker asks for a node of its own, so a job with more workers than nodes cannot be placed. In particular, you will need to change the line containing:
procsPerNode = 1;
Change the value assigned to procsPerNode from 1 to the number of cores per node (2, 3, 4, ... N) so that jobs can use all available cores on the cluster.
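As a rough illustration of why this value matters, the sketch below shows how a processors-per-node setting could translate into a Torque/PBS resource request. It is not the actual contents of communicatingSubmitFcn.m: the variable names numberOfWorkers, numberOfNodes and resourceArg and the example values are assumptions made for illustration, so check your own copy of the submit function for the exact code.

```matlab
% Hypothetical values for illustration only; the real submit function
% takes the worker count from the job being submitted.
numberOfWorkers = 16;   % workers requested by the parallel job (assumed)
procsPerNode    = 4;    % cores per physical node (assumed; confirm with your administrator)

% Number of nodes needed to host all workers at procsPerNode workers per node.
numberOfNodes = ceil(numberOfWorkers / procsPerNode);

% Torque/PBS resource request in the usual nodes/ppn form.
resourceArg = sprintf('-l nodes=%d:ppn=%d', numberOfNodes, procsPerNode);
disp(resourceArg)   % displays: -l nodes=4:ppn=4
```

With procsPerNode left at 1, the same 16 workers would request 16 nodes, which a cluster with fewer physical nodes could not satisfy.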
For more information, refer to the generic scheduler section of the Parallel Computing Toolbox documentation.
NOTE: Starting in R2019a, the following name changes occurred:
* MATLAB Distributed Computing Server was renamed to MATLAB Parallel Server
* mdce_def was renamed to mjs_def
* The mdce binary was renamed to mjs