MATLAB: Converting loop to vector operation


Hi,
I have a large dataset where I calculate the value of some result in each row based on data in preceding rows. Currently it is taking too long due to the large data. I have included a simplified example below. Can this be converted to a vector operation? Alternatively and less preferably can this be done in parallel? Thank you.
MATLAB code
dataset=table();
dataset.assetname=[{'st1'};{'st1'};{'st1'};{'st2'};{'st2'}];
dataset.time=[1;2;5;2;3];
dataset.price=[1.1;1.2;1.1;2.1;2.2];
dataset.quantity=[10;15;5;20;25];
dataset.ordertype=[{'buy'};{'buy'};{'sell'};{'buy'};{'sell'}];
dataset.last_buy_quantity_price_x=repmat(NaN,height(dataset),1);
price_x=1.1;
for row=1:height(dataset)
    filteredrows=strcmp(dataset.assetname(row),dataset.assetname) & dataset.time(row) >=dataset.time;
    lastrow=max(find(dataset.price(filteredrows)==price_x & strcmp(dataset.ordertype(filteredrows),'buy')));
    if isempty(lastrow)
        tempvar=0;
    else
        tempvar=dataset.quantity(lastrow);
    end
    dataset.last_buy_quantity_price_x(row)=tempvar;
end

Best Answer

I don't think you can vectorize this, but the loop can be sped up. First, replace max(find(...)) with find(..., 1, 'last'):
lastrow = max(find(dataset.price(filteredrows)==price_x & ...
    strcmp(dataset.ordertype(filteredrows), 'buy')));
% Replacement:
lastrow = find(dataset.price(filteredrows)==price_x & ...
    strcmp(dataset.ordertype(filteredrows), 'buy'), 1, 'last');
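To illustrate the replacement on a toy mask (the variable names and data here are made up for the example): both forms return the same index, and find returns an empty array when nothing matches, which is what an isempty test in the loop relies on.

```matlab
mask = logical([1 0 1 1 0]);              % toy logical mask, hypothetical data

viaMax  = max(find(mask));                % collects all indices, then takes the largest
viaLast = find(mask, 1, 'last');          % asks find for the last match only

noMatch = find(false(1, 5), 1, 'last');   % empty when no element is true

disp([viaMax, viaLast])
disp(isempty(noMatch))
```

Here viaMax and viaLast are both 4, and noMatch is empty.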
And pre-allocate the result with zeros, so that the else branch is not needed:
dataset.last_buy_quantity_price_x = zeros(height(dataset), 1);
...
    if ~isempty(lastrow)
        dataset.last_buy_quantity_price_x(row) = dataset.quantity(lastrow);
    end
end
Compare the ordertype with 'buy' once only before the loop:
isBuy = strcmp(dataset.ordertype, 'buy');
Then inside the loop:
lastrow = find(dataset.price(filteredrows)==price_x & isBuy(filteredrows), 1, 'last');
The same works for "dataset.price == price_x": it can be evaluated once before the loop, too.
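Putting these steps together, here is a minimal sketch of the loop with both comparisons evaluated once before it. To keep the example self-contained, the table columns are copied into plain variables, and isPriceX, isCandidate and result are names made up here. The masks are kept at full length, so that lastrow indexes the original rows when the quantity is looked up.

```matlab
% Sample columns from the question, held as plain variables (assumption:
% in the real code these come from dataset.assetname, dataset.time, ...).
assetname = {'st1'; 'st1'; 'st1'; 'st2'; 'st2'};
time      = [1; 2; 5; 2; 3];
price     = [1.1; 1.2; 1.1; 2.1; 2.2];
quantity  = [10; 15; 5; 20; 25];
ordertype = {'buy'; 'buy'; 'sell'; 'buy'; 'sell'};
price_x   = 1.1;

% Loop-invariant comparisons, evaluated once.
isBuy       = strcmp(ordertype, 'buy');
isPriceX    = (price == price_x);
isCandidate = isBuy & isPriceX;

result = zeros(numel(time), 1);   % default 0 replaces the else branch
for row = 1:numel(time)
    % Same asset, and not later than the current row.
    filteredrows = strcmp(assetname{row}, assetname) & (time(row) >= time);
    % Full-length masks, so lastrow indexes the original rows directly.
    lastrow = find(filteredrows & isCandidate, 1, 'last');
    if ~isempty(lastrow)
        result(row) = quantity(lastrow);
    end
end
disp(result.')
```

For the sample data from the question, result comes out as [10; 10; 10; 0; 0].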
What about removing all rows that can never match, i.e. those with or(dataset.price ~= price_x, ~strcmp(dataset.ordertype, 'buy')), before the loop? This might reduce the data set massively. Every row still needs a result, so the loop runs over all rows but searches only the remaining buy orders at price_x:
price_x = 1.1;
m = or(dataset.price ~= price_x, ...
    ~strcmp(dataset.ordertype, 'buy'));    % rows that can never be a match
name = dataset.assetname(~m);              % keep only buy orders at price_x
time = dataset.time(~m);
quantity = dataset.quantity(~m);
allname = dataset.assetname;
alltime = dataset.time;
last_buy = zeros(height(dataset), 1);
for row = 1:height(dataset)
    filtered = strcmp(allname(row), name) & (alltime(row) >= time);
    lastrow = find(filtered, 1, 'last');   % index into the reduced arrays
    if ~isempty(lastrow)
        last_buy(row) = quantity(lastrow);
    end
end
dataset.last_buy_quantity_price_x = last_buy;
This also saves the time spent addressing the table variables inside the loop, and I think the simpler the code looks, the easier it is to maintain.
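The question also asks about running the loop in parallel. Each row's result depends only on the input columns and not on the results of other rows, so the iterations are independent. A minimal sketch with parfor, assuming the Parallel Computing Toolbox is available and that the columns have been copied into plain variables (isCandidate, hit, q and result are names made up here); whether this pays off depends on the data size and the start-up cost of the pool:

```matlab
% Columns from the question's example as plain variables (assumed layout).
assetname = {'st1'; 'st1'; 'st1'; 'st2'; 'st2'};
time      = [1; 2; 5; 2; 3];
price     = [1.1; 1.2; 1.1; 2.1; 2.2];
quantity  = [10; 15; 5; 20; 25];
ordertype = {'buy'; 'buy'; 'sell'; 'buy'; 'sell'};
price_x   = 1.1;

% Buy orders at price_x, computed once before the loop.
isCandidate = (price == price_x) & strcmp(ordertype, 'buy');
n = numel(time);
result = zeros(n, 1);

parfor row = 1:n
    % Each iteration reads the shared columns and writes only result(row).
    hit = find(strcmp(assetname{row}, assetname) & ...
               (time(row) >= time) & isCandidate, 1, 'last');
    q = 0;                       % default when no earlier match exists
    if ~isempty(hit)
        q = quantity(hit);
    end
    result(row) = q;
end
```

It is probably worth timing the serial version on the reduced data first, since that alone may be fast enough.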