Hi there,
I have a large data matrix in which I want to calculate the average of elements satisfying certain conditions. The matrix has around 15 million rows.
The conditions to be met are stored in two column vectors:
elementIDs 15m x 1 (Containing the numbers 1 to 9664)
gridIDs 15m x 1 (Containing the numbers 1 to 3)
I want to know the average Reynolds number of each element in each grid. The Reynolds numbers are stored in:
Re 15m x 1 (Containing doubles)
The results are to be stored in a matrix with each element a row and each grid a column:
meanRe = 9664 x 3.
To illustrate the problem I wrote the following example:
% Initiate test data
n_rows = 150000 % In practice 15 000 000
elementIDs = randi(9664,n_rows,1);
gridIDs = randi(3,n_rows,1);
Re = rand(n_rows,1).*1000;
% Pre-allocate space
meanRe = zeros(9664,3);
% Timer
tic
% Loop over subsets to speed up the process
for kk = 1:3
    % Select subset using logical indexing
    Re_temp = Re(gridIDs==kk);
    elementIDs_temp = elementIDs(gridIDs==kk);
    % Loop over each element
    for ii = 1:9664
        % Calculate mean Reynolds using logical index
        meanRe(ii,kk) = mean(Re_temp(elementIDs_temp==ii));
    end
end
toc
Although the code runs fairly quickly for the amount of data, I still have to wait several minutes. Is there any way to speed this code up significantly?
Best Answer
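One common way to compute grouped means like this without loops is accumarray. The following is only a sketch under the assumptions stated in the question (IDs are positive integers in 1..9664 and 1..3), not necessarily the code of the original answer. The names subs, sz, sumRe and countRe are introduced here for illustration. The idea is to accumulate the sum and the sample count for every (element, grid) pair in one pass and then divide.

```matlab
% Test data as in the question (small n_rows for a quick run)
n_rows = 150000;
elementIDs = randi(9664, n_rows, 1);
gridIDs = randi(3, n_rows, 1);
Re = rand(n_rows, 1) .* 1000;

% Each (element, grid) pair addresses one cell of the output matrix
subs = [elementIDs, gridIDs];
sz = [9664, 3];

tic
sumRe = accumarray(subs, Re, sz);    % sum of Re per (element, grid) cell
countRe = accumarray(subs, 1, sz);   % number of samples per cell
meanRe = sumRe ./ countRe;           % 0/0 yields NaN for cells with no samples
toc
```

Cells with no samples come out as NaN, which matches what mean returns for an empty selection in the loop version. Results may differ from the loop in the last few digits because the summation order is different.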