MATLAB: Efficient identification of quoted substrings in a substring

quoted string search, regexp, regular expressions

I'm looking for help from the MATLAB string-parsing experts out there to come up with a computationally efficient way (perhaps using regular expressions) to identify the quoted parts of a string taken from arbitrary sources of text (e.g. a journal article). The method needs to work regardless of whether the quoted substrings are enclosed in single or double quotes. Further, the text may contain apostrophes either inside or outside of the quoted substrings.
For example, in this sentence:
Sally said "It's a wonderful life" when she heard Molly's sister proclaim "It's a great day".
I would like to identify "It's a wonderful life" and "It's a great day", while in this text:
The attributes of the <table> tag were 'width=80%' and 'align="center"'.
I would like to identify 'width=80%' and 'align="center"'. [Note: I purposely did not show the above example sentences as MATLAB code, but rather as free text, so as not to confuse my question with how to properly capture such sentences in a MATLAB variable.]
I recognize these examples are a bit pedantic, but since the code won't be able to control the source of the text it is searching, it needs to be robust across these cases.
I have been able to do this with a "brute force" linear search through the text, but it's pretty inefficient and complex. I am not enough of a regexp expert to figure out a way to do this with regular expressions, but I've seen such experts come up with pretty elegant and efficient solutions to such problems. Hence, I was hoping my case might be tantalizing to one of those experts in this community. Thanks for any suggestions.

Best Answer

Have you tried extractBetween?
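A minimal sketch of that suggestion on the two example sentences from the question, plus a regexp variant for comparison. The variable names (str1, str2, pat, and so on) are made up for illustration. The regexp pattern rests on an assumed heuristic that is not part of the original answer: an opening quote is not preceded by a letter, digit, or underscore, and the matching closing quote is not followed by one, which is what lets it skip apostrophes such as the one in "Molly's".

```matlab
% Example sentences from the question, stored as char vectors
% (an apostrophe inside a char literal is written as '').
str1 = 'Sally said "It''s a wonderful life" when she heard Molly''s sister proclaim "It''s a great day".';
str2 = 'The attributes of the <table> tag were ''width=80%'' and ''align="center"''.';

% extractBetween returns the text between a start and an end delimiter,
% without the delimiters themselves. The quote character has to be chosen
% in advance, so this only covers the double-quoted case.
inner1 = cellstr(extractBetween(str1, '"', '"'));

% Regexp heuristic (an assumption, see the text above):
%   (?<!\w)   the opening quote is not preceded by a word character
%   (["''])   capture the opening quote, either " or '
%   .*?       as few characters as possible
%   \1        the same quote character that opened the match
%   (?!\w)    the closing quote is not followed by a word character
pat = '(?<!\w)(["'']).*?\1(?!\w)';
quoted1 = regexp(str1, pat, 'match');
quoted2 = regexp(str2, pat, 'match');

% Print one result per line.
fprintf('%s\n', inner1{:});
fprintf('%s\n', quoted1{:});
fprintf('%s\n', quoted2{:});
```

With this pattern, quoted1 should contain "It's a wonderful life" and "It's a great day", and quoted2 should contain 'width=80%' and 'align="center"', quotes included. The heuristic is not robust in general: a single-quoted passage that itself contains a word-final apostrophe (for example a plural possessive such as dogs') would be cut short, and typographic curly quotes are not matched at all.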