MATLAB: How is labindex assigned to the workers on a node/machine in MDCS?

labindex, matlab distributed computing server, mdcs, node, worker, worker per node

We know that in MDCS we can run more than one worker on a node/machine, say 4 workers per node/machine. How is labindex assigned to these 4 workers? Is it always 1, 2, 3, 4 on each node, does it increase continuously node by node (such as 5-8, 9-12, …), or is it completely random, such as 1, 3, 9, 6 on a node/machine?

Best Answer

You don't specify which cluster type you're using with MDCS, but I'm going to assume MJS for now. (Not all of what follows will be scheduler-specific).
labindex within an spmd context is equal to the task index executing on the worker. So, if you have 2 nodes each running 4 workers, and you run a single communicating job of size 8 (i.e. parpool('myMjsCluster', 8)), then the task indices are 1:8, as are the corresponding values of labindex.
MJS will endeavour to schedule things such that consecutive tasks are co-located on a single node - i.e. it will attempt to put tasks 1:4 on the first node, and 5:8 on the second. (Most other scheduler types will end up doing something similar, but by a different means).
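As a rough illustration of that consecutive packing, the expected node and per-node position can be computed directly from labindex. This is only a sketch under the assumption that the scheduler really does fill nodes in consecutive blocks, which it attempts but does not guarantee; `workersPerNode` and `poolSize` are hypothetical values chosen to match the 2-node, 4-worker example.

```matlab
% Assumption: workers are packed onto nodes in consecutive blocks.
workersPerNode = 4;   % hypothetical: 4 workers per node
poolSize       = 8;   % hypothetical: 2 nodes x 4 workers

labIdx = 1:poolSize;

% Node that each labindex would land on under consecutive packing
expectedNode = ceil(labIdx / workersPerNode);            % 1 1 1 1 2 2 2 2

% Position of each labindex among the workers on its node
expectedLocalIdx = mod(labIdx - 1, workersPerNode) + 1;  % 1 2 3 4 1 2 3 4

disp([labIdx; expectedNode; expectedLocalIdx]);
```

Because the placement is best-effort, this arithmetic should be treated as an expectation to verify rather than something to rely on.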
Basically, what you need to do is come up with a mapping of labindex to hostname to work out which labs are located on which host, and then you can use that "local labindex" to pick which Java program to use. Here's one way.
spmd
    [s, hostname] = system('hostname');
    assert(s == 0, 'Failed to compute hostname');
    hostname = strtrim(hostname);
    % Get a list of all hostnames in the pool
    allHostnames = gcat({hostname}, 1);
    % Work out which labindex values are on this host
    allLabs = 1:numlabs;
    labsOnThisHost = allLabs(strcmp(hostname, allHostnames))
    % Work out this lab's position among the labs on this host
    myIndexOnThisHost = find(labindex == labsOnThisHost)
end
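To see what the mapping produces without a cluster, the same logic can be run on a made-up hostname list. This is a sketch only: `allHostnames` below is a hypothetical stand-in for the result of `gcat({hostname}, 1)`, deliberately interleaved to show that the approach does not depend on consecutive placement.

```matlab
% Hypothetical stand-in for gcat({hostname}, 1): one hostname per labindex.
allHostnames = {'nodeA'; 'nodeB'; 'nodeA'; 'nodeB'; 'nodeA'};

numWorkers = numel(allHostnames);
localIndex = zeros(1, numWorkers);

for lab = 1:numWorkers
    % labindex values that share this lab's host, in ascending order
    labsOnSameHost = find(strcmp(allHostnames{lab}, allHostnames));
    % Position of this lab within that list
    localIndex(lab) = find(labsOnSameHost == lab);
end

disp(localIndex);   % 1 1 2 2 3
```

Labs 1, 3 and 5 share nodeA and get local indices 1, 2 and 3, while labs 2 and 4 on nodeB get 1 and 2, so each worker ends up with a unique position on its own host.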