MATLAB: Find repeated expression in array of strings, return logical.

I have data of the type

looking_for = ["apple", "melon"]

my_data = ["The apple is red", "The bee was yellow", "I am eating a melon", "The melon is sweet"]

with

timing = [2.5, 5, 10, 18]

I want to find when a regular expression was repeated consecutively and then return a logical index that pertains to the first observation of the repetition.

My approach:

1) Find out whether the string contains one of the regular expressions in looking_for, e.g. melon. I solve this using

idx = cellfun(@(x)( ~isempty(x) ), regexp(my_data, "apple"));

2) Then I transpose the index and multiply it by the timing to get the relevant timings, and remove the zeros (not shown here):

apple_timing = transpose(idx).*timing;

This gives me a variable called apple_timing with a value of 2.5, which is exactly what I want.
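As a side note, the zero-removal step in 2) may be avoidable by indexing timing with the logical vector directly. This is only a minimal sketch, assuming a MATLAB release with string arrays and that idx and timing have the same length and orientation:

```matlab
% Sample data from the question.
my_data = ["The apple is red", "The bee was yellow", "I am eating a melon", "The melon is sweet"];
timing = [2.5, 5, 10, 18];

% Step 1: logical index of the entries that match the pattern.
idx = cellfun(@(x)( ~isempty(x) ), regexp(my_data, "apple"));

% Step 2: logical indexing keeps only the timings of matching entries,
% so there are no zeros to remove afterwards.
apple_timing = timing(idx);
```

With the sample data, only the first entry matches, so apple_timing holds the single value 2.5.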

I would like a bit of code that returns a variable called repeat_timing. In the case of the melon, this would return 18 – the first observed consecutive repeat of the regular expression melon.

Best Answer

Here is one solution based around cumsum:

% Data:
LF = {'apple', 'melon'};
MD = {'The apple is red','The bee was yellow','I am eating a melon','The melon is sweet'};
TV = [2.5, 5, 10, 18];
% Locate patterns:
fun = @(p)~cellfun('isempty',strfind(MD,p));
BM = cell2mat(cellfun(fun,LF(:),'uni',0));
CS = cumsum(BM,2);

You can use this to identify the first, second, third, etc. times that a pattern occurs, and find the related timing value:

>> [R1,C1] = find(CS==1 & BM); % First occurrence.
>> LF{R1}
ans = apple
ans = melon
>> TV(C1)
ans =
    2.5000   10.0000
>> [R2,C2] = find(CS==2 & BM); % Second occurrence.
>> LF{R2}
ans = melon
>> TV(C2)
ans =  18

You can easily automate this for an arbitrary number of matches. Here I locate the first, second, and third occurrences (there are no third occurrences in your sample data):

baz = @(n)find(CS==n & BM);
[row,col] = arrayfun(baz,1:3,'uni',0);
typ = cellfun(@(r)LF(r),row,'uni',0);
val = cellfun(@(c)TV(c),col,'uni',0);

giving:

>> typ{:}
ans =
  'apple'
  'melon'
ans =
  'melon'
ans = {}
>> val{:}
ans =
    2.5000   10.0000
ans =  18
ans = []
>>
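Note that the cumsum approach counts the n-th occurrence of a pattern whether or not the matching entries are adjacent. If the repeat has to occur in consecutive entries, as the question's wording suggests, one possible sketch is to compare the match matrix with a copy of itself shifted by one column. The names rep and the NaN placeholder for "no consecutive repeat" are assumptions made for illustration, not part of the original answer:

```matlab
% Sample data from the question.
LF = {'apple', 'melon'};
MD = {'The apple is red','The bee was yellow','I am eating a melon','The melon is sweet'};
TV = [2.5, 5, 10, 18];

% BM(i,j) is true when pattern LF{i} occurs in entry MD{j}.
fun = @(p)~cellfun('isempty',strfind(MD,p));
BM = cell2mat(cellfun(fun,LF(:),'UniformOutput',false));

% A match counts as a consecutive repeat when the previous entry
% matched the same pattern too. The first column can never be a repeat.
rep = [false(size(BM,1),1), BM(:,2:end) & BM(:,1:end-1)];

% Timing of the first consecutive repeat per pattern (NaN if there is none).
repeat_timing = nan(numel(LF),1);
for k = 1:numel(LF)
    c = find(rep(k,:),1);
    if ~isempty(c)
        repeat_timing(k) = TV(c);
    end
end
```

For the sample data this leaves repeat_timing(1) as NaN, because apple never repeats, and sets repeat_timing(2) to 18 for melon, which is the value asked for in the question.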
