MATLAB: ga does not use all the available workers

cluster, ga, parallel, workers

Hello, I have successfully configured a cluster to use a MATLAB Job Scheduler. Admin Center communicates with all the available nodes and successfully starts the requested workers (40 in total). MATLAB connects to those workers, but when I run my code (a genetic algorithm in parallel), only 20 of the 40 workers seem to do any work. Any ideas why this happens? Thanks.

Best Answer

My first suspicion is that you have only 20 (real) cores available to do the work, so the second set of 20 units of work is essentially queued: when the first 20 finish, they move on and free up the resources for the second set.
Check the specifications for your processor to see if it has hyperthreading. If it's enabled, the OS will report twice as many compute cores as you physically have.
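As a quick check, you could compare the physical and logical core counts from a MATLAB session on one of the cluster nodes. This is only a sketch under a few assumptions: feature('numcores') is an undocumented call that is commonly used to report physical cores, and the Java call needs MATLAB to be running with its JVM. The variable names are just for illustration.

```matlab
% Run this in an ordinary MATLAB session on one cluster node.

% Physical cores as detected by MATLAB (undocumented call, so treat the
% result as a hint rather than a guarantee).
physicalCores = feature('numcores');

% Logical processors as seen by the OS (requires the JVM).
logicalCpus = java.lang.Runtime.getRuntime().availableProcessors();

fprintf('Physical cores: %d, logical processors: %d\n', ...
    physicalCores, logicalCpus);

if logicalCpus > physicalCores
    disp('Hyperthreading (or similar SMT) appears to be enabled.');
else
    disp('Logical and physical counts match; hyperthreading looks disabled.');
end
```

If the logical count is double the physical count on every node, then 40 workers sized from the OS view would correspond to about 20 real cores.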
Generally the starting recommendation for PCT (Parallel Computing Toolbox) is one worker per core, but that's only a starting point. You might do better with fewer, you might do better with more. It's also possible that another resource (memory, disk I/O, network I/O) is affecting how the work gets scheduled. You'd need to monitor a cluster node while a job is running to see whether that is the case.
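Alongside OS-level monitoring, a small parfor test can show how iterations are spread across workers and hosts. The following is a rough sketch, not a diagnosis of ga itself: it assumes the default pool profile points at your MJS cluster, and the pause of 0.5 s and the factor of 4 iterations per worker are arbitrary choices for illustration.

```matlab
% Open (or reuse) a parallel pool. This uses the default profile, so make
% sure that profile is the MJS cluster and not the local machine.
pool = gcp;

n = 4 * pool.NumWorkers;      % a few iterations per worker (arbitrary)
workerId = zeros(1, n);       % ID of the task that ran each iteration
hostName = cell(1, n);        % host that ran each iteration

parfor i = 1:n
    pause(0.5);                        % stand-in for one fitness evaluation
    t = getCurrentTask();              % task object of the current worker
    workerId(i) = t.ID;
    [~, h] = system('hostname');       % ask the OS for the node name
    hostName{i} = strtrim(h);
end

fprintf('%d of %d workers ran at least one iteration.\n', ...
    numel(unique(workerId)), pool.NumWorkers);

% Count iterations per host.
[hosts, ~, idx] = unique(hostName);
counts = accumarray(idx(:), 1);
for k = 1:numel(hosts)
    fprintf('%s: %d iterations\n', hosts{k}, counts(k));
end
```

If every worker shows up here but the nodes still look half idle in the OS monitor during a real ga run, that points toward the core count or another shared resource rather than the pool setup.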