MATLAB: How to extract qualifiers in a GBK file which are not parsed by “genbankread”

Tags: Bioinformatics Toolbox, gbk, genbankread, regexp

The "genbankread" function does not parse all qualifiers in my GBK file.
For example, my file includes the "/label" qualifier for the "CDS" feature key, but no corresponding field appears in the "CDS" struct returned by "genbankread".
Is it possible to parse this field into the "CDS" struct?

Best Answer

Unfortunately, "genbankread" automatically parses only the qualifiers that have a corresponding predefined field in the struct it returns.
However, the full text of each feature is preserved in the "text" field.
You can use string/text functions, such as "regexp", to extract the desired fields from there.
As an example, please see the attached script, which extracts the "/label" qualifier from each "CDS" feature in the attached file and adds a corresponding field (called "label") to the struct returned by "genbankread".
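
A minimal sketch of this approach is shown below. It is not the attached script. It assumes that the "text" field of each "CDS" entry holds the feature lines either as a padded char array or as a cell array of lines, and that a label is written either as /label=value or as /label="value". The sample feature text is made up so that the snippet runs without a GBK file; the file name 'myfile.gbk' in the comment is hypothetical.

```matlab
% With a real file you would start from something like:
%   data = genbankread('myfile.gbk');   % 'myfile.gbk' is a hypothetical name
%   cds  = data.CDS;
% Here a made-up struct array stands in for data.CDS (assumed layout).
cds = struct('text', { ...
    char('     CDS             1..720', ...
         '                     /label=GFP', ...
         '                     /codon_start=1'), ...
    char('     CDS             800..1500', ...
         '                     /label="lac repr"', ...
         '                     /note="example"'), ...
    char('     CDS             1600..1900', ...
         '                     /product="x"') });

% Add an empty "label" field to every element up front.
[cds.label] = deal('');

for k = 1:numel(cds)
    % cellstr accepts a padded char matrix or a cell array of lines.
    rows   = strtrim(cellstr(cds(k).text));
    joined = strjoin(reshape(rows, 1, []), ' ');

    % Try the quoted form first, then fall back to the unquoted form.
    tok = regexp(joined, '/label="([^"]*)"', 'tokens', 'once');
    if isempty(tok)
        tok = regexp(joined, '/label=(\S+)', 'tokens', 'once');
    end

    % Features without a /label qualifier keep the empty default.
    if ~isempty(tok)
        cds(k).label = tok{1};
    end
end
```

The same pattern should carry over to other unparsed qualifiers by replacing "label" in the two expressions and in the field name. Note that the unquoted fallback stops at the first whitespace character, so it only suits single-word values.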