I have created a multi-threaded version of the C-Mex function of FilterM (you do not need to download it to answer the question). The speed-up scales well with the number of processors for large problems. For small problems, however, a smart method is required to choose the optimal number of threads, because starting a thread takes a considerable amount of time. E.g. starting a 2nd thread for filtering the columns of a [1000 x 2] matrix is 50% slower than performing the job in the single main thread. In my example the columns of the input matrix are distributed to the different threads.
The length of the signal, the number of channels, the order of the filter and its FIR/IIR type, the number of available cores and the current system load all affect the best choice.
Matlab uses magic limits for some multi-threaded functions:
- SUM starts one thread per core for a [1 x n] vector with n >= 89000 (as a consequence, there is a slow-down on a single-core CPU)
- FILTER starts one thread per core for matrices of >= 16 columns
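To make the idea of such a fixed limit concrete, here is a minimal sketch in C. The function name `choose_thread_count` and the constant `MIN_SAMPLES_PER_THREAD` are hypothetical, and the value 50000 is only a placeholder, not a measured limit; it would have to be calibrated on the target machine.

```c
#include <stddef.h>

/* Hypothetical tuning constant: the minimum number of samples a thread
 * should get before starting it is assumed to pay off. The value is a
 * placeholder, not a measurement. */
#define MIN_SAMPLES_PER_THREAD 50000u

/* Pick a thread count from the problem size with a fixed "magic limit".
 * nRows: signal length, nCols: number of channels, nCores: available cores. */
size_t choose_thread_count(size_t nRows, size_t nCols, size_t nCores)
{
    size_t n = (nRows * nCols) / MIN_SAMPLES_PER_THREAD;

    if (n > nCores) n = nCores;  /* never more threads than cores       */
    if (n > nCols)  n = nCols;   /* columns are the unit of distribution */
    if (n < 1)      n = 1;       /* always at least the main thread      */
    return n;
}
```

With these placeholder values a [1000 x 2] matrix stays single-threaded, while a [100000 x 16] matrix on 4 cores gets 4 threads. The filter order, the FIR/IIR type and the system load are ignored here, which is exactly the weakness of a fixed limit.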
A better strategy is to start nCore-1 threads and calculate one chunk of data in the main thread.
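The nCore-1 strategy can be sketched with POSIX threads as follows. This is an illustration under stated assumptions, not the FilterM implementation: `filter_columns`, `Chunk`, `process_columns` and `MAX_THREADS` are hypothetical names, and the per-column kernel is a stand-in (a running sum, i.e. the IIR filter y[n] = x[n] + y[n-1]) for the real filter code. The matrix is assumed to be column-major, as in Matlab.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64  /* arbitrary cap for this sketch */

typedef struct {
    double *data;      /* column-major matrix, nRows x nCols   */
    size_t  nRows;
    size_t  colBegin;  /* first column of this chunk           */
    size_t  colEnd;    /* one past the last column of the chunk */
} Chunk;

/* Stand-in kernel: in-place running sum of each column in the chunk. */
static void process_columns(const Chunk *c)
{
    for (size_t j = c->colBegin; j < c->colEnd; j++) {
        double *col = c->data + j * c->nRows;
        for (size_t i = 1; i < c->nRows; i++)
            col[i] += col[i - 1];
    }
}

static void *worker(void *arg)
{
    process_columns((const Chunk *)arg);
    return NULL;
}

/* Split the columns into nThreads chunks, start nThreads-1 worker
 * threads and process the remaining chunk in the calling thread. */
void filter_columns(double *data, size_t nRows, size_t nCols, size_t nThreads)
{
    pthread_t tid[MAX_THREADS];
    int       running[MAX_THREADS] = {0};
    Chunk     chunk[MAX_THREADS];
    size_t    t;

    if (nThreads > MAX_THREADS) nThreads = MAX_THREADS;
    if (nThreads > nCols)       nThreads = nCols;
    if (nThreads < 1)           nThreads = 1;

    for (t = 0; t < nThreads; t++) {
        chunk[t].data     = data;
        chunk[t].nRows    = nRows;
        chunk[t].colBegin = t * nCols / nThreads;
        chunk[t].colEnd   = (t + 1) * nCols / nThreads;
    }

    /* Chunks 1..nThreads-1 go to worker threads. */
    for (t = 1; t < nThreads; t++)
        running[t] = (pthread_create(&tid[t], NULL, worker, &chunk[t]) == 0);

    /* Chunk 0 is processed in the main thread while the workers run. */
    process_columns(&chunk[0]);

    /* Wait for the workers; if a thread could not be started,
     * process its chunk here instead. */
    for (t = 1; t < nThreads; t++) {
        if (running[t])
            pthread_join(tid[t], NULL);
        else
            process_columns(&chunk[t]);
    }
}
```

With `nThreads == 1` no thread is started at all, so the small-problem case pays no thread start-up cost. Depending on the platform, linking may require `-lpthread`.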
I know how to solve a multi-parameter optimization problem, but even with this I would get optimal parameters for my own processor only. And solving this on each individual client computer (and for the current processor load…) is clearly overkill.
What are standard and smart methods to choose the number of threads for a specific problem?
Best Answer