I have a function that uses a parfor loop (~100 iterations) to evaluate another function. One of the workers is twice as fast as the other two, because it uses a GPU that is twice as fast as the ones used by the other workers. At some point, worker one (the fast one) goes idle while the other workers still have many iterations left to compute (say 5-10 each). I suspect that worker one has run out of available chunks from the parfor load balancing, while the others are still busy with one of the larger chunks.
Is there a way to limit the maximum chunk size to, for instance, 2 or 3 iterations, so that the fast worker is not left idle?
Best Answer