MATLAB: Problem with regular expressions


Given the string:
str='""A_3_1"": [""choice_0"", ""choice_1"", ""choice_2"", ""choice_3""], ""A_2_1"": [""choice_1"", ""choice_2""]'
I want to use regexp to extract the numbers that follow the word choice, grouped separately for the two keys A_3_1 and A_2_1.
The output for the A_3_1 will be:
[0 1 2 3]
and the output for the A_2_1 will be:
[1 2]

Best Answer

Here is one way. It is not tremendously robust: for example, it assumes that the "choices" will always be single-digit numbers, and that the letter 'A' appears only at the start of each key. However, it should at least give you a rudimentary algorithm that works, as a starting point.
% Find the start of each key (every 'A' in the string). The extra 'A'
% appended at the end marks where the last block stops.
indices = regexp([str,'A'],'A');
numberIndices = numel(indices) - 1;
% Preallocate the cell array that collects one numeric vector per key
values = cell(1,numberIndices);
for ni = 1:numberIndices
    % Extract the substring belonging to this key
    substr = str(indices(ni):indices(ni+1)-1);
    % Find where each 'choice_' begins within the substring
    choiceIdx = regexp(substr,'choice_');
    % The digit sits 7 characters after the start of 'choice_' (the length of
    % 'choice_'). Convert those characters to numbers and store them as a row vector.
    % There is no trailing semicolon, so the result is displayed on each pass.
    values{ni} = str2num(substr(choiceIdx+7)')'
end
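If the choice numbers can have more than one digit, a token-based variant may be a sturdier starting point. The following is only a sketch, assuming every key looks like A_<digits>_<digits> and every list is enclosed in square brackets; the variable names blocks, keys, and digitsFound are illustrative and not part of the answer above.

```matlab
% Same input as in the question (the doubled double quotes are literal characters)
str = '""A_3_1"": [""choice_0"", ""choice_1"", ""choice_2"", ""choice_3""], ""A_2_1"": [""choice_1"", ""choice_2""]';

% Capture each key and the text inside the bracketed list that follows it.
% Each element of blocks is a 1x2 cell: {key, listText}.
% Assumption: keys have the form A_<digits>_<digits>.
blocks = regexp(str, '(A_\d+_\d+)[^\[]*\[([^\]]*)\]', 'tokens');

keys   = cell(1, numel(blocks));
values = cell(1, numel(blocks));
for k = 1:numel(blocks)
    keys{k} = blocks{k}{1};
    % Match every run of digits preceded by 'choice_' (lookbehind keeps only the digits)
    digitsFound = regexp(blocks{k}{2}, '(?<=choice_)\d+', 'match');
    % str2double converts a cell array of char vectors to a numeric row vector
    values{k} = str2double(digitsFound);
end
```

With the string from the question, values{1} is [0 1 2 3] and values{2} is [1 2], and keys holds the matching names, so each result can be paired with its key.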