Hi,
I have a simple task to accomplish using the Parallel Computing Toolbox and MATLAB Distributed Computing Server. I need to execute an external program (via a unix() call) on multiple workers. Every worker runs the same program, which reads one data file and writes one data file. Each worker's instance takes a different input file from a directory that all workers can access. There are more data files than workers, so when a worker finishes one data file I would like it to pick up another, and I want all workers to keep doing this until every data file has been processed. After reading the documentation, it seems spmd is the way to go.
I have a cell array of file names, and I would like to create a boolean array of the same size to use for synchronization. When a worker picks a file to work on, it should set the boolean at that file's index. When a worker finishes its program run, it scans the booleans for a file that has not been claimed yet, sets the boolean at that index, and runs the program on that file.
Is there any way for the workers to see a common array and also modify this same array?
Right now I'm doing it in a parfor loop sized to the maximum number of workers. This works fine, but I have many more files than workers, so I have to wait for the parfor loop to finish before manually repeating the process for the remaining files. The program takes a variable number of hours to complete on each worker, so each parfor pass is bottlenecked by the longest run.
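For concreteness, here is a minimal sketch of what one such pass looks like. The file names, the worker count, and the ./myprogram command line are placeholders for illustration, not my actual setup.

```matlab
% Placeholder inputs: the real list is much longer than the worker count.
dataFiles  = {'run01.dat', 'run02.dat', 'run03.dat', 'run04.dat', 'run05.dat', 'run06.dat'};
numWorkers = 4;                              % assumed size of the worker pool

% One pass: take only as many files as there are workers.
currentFiles = dataFiles(1:numWorkers);
status = zeros(1, numWorkers);

parfor k = 1:numWorkers
    % Hypothetical command line: <program> <input file> <output file>
    cmd = sprintf('./myprogram %s %s.out', currentFiles{k}, currentFiles{k});
    status(k) = unix(cmd);                   % nonzero status means the run failed
end

% The remaining files (dataFiles(numWorkers+1:end)) have to wait until
% every iteration above has finished, including the slowest one.
```

With this layout, the files left over after the first pass cannot start until the whole loop returns, which is the bottleneck described above.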
Is there a way to accomplish this using tasks as well? It seems like a simple thing but I have not found any examples of how to do this.
Thank you for reading.
Best Answer