I am using the DIMERCOUNT function to count dimers in a sequence that includes gaps and/or ambiguous bases. I receive an error when the sequence includes gaps (represented by a hyphen), and an incorrect plot when the sequence includes ambiguous bases. For example, when I run the following command:
dimercount('TAG-TGGCCAAGCGAGCTTG')
I expect a warning about gaps and/or ambiguous bases, as described in the documentation, and then a list of dimers, including an "Others" field. For the sequence given above, I would expect the following output:
Warning: Ambiguous symbols '-' appear in the sequence. These will be in Others.
In dimercount at 132
ans =
        AA: 1
        AC: 0
        AG: 3
        AT: 0
        CA: 1
        CC: 1
        CG: 1
        CT: 1
        GA: 1
        GC: 3
        GG: 1
        GT: 0
        TA: 1
        TC: 0
        TG: 2
        TT: 1
    Others: 2
Instead, the DIMERCOUNT function produces an error of the following form:
??? Attempted to access buckets(16,16); index out of bounds because size(buckets)=[15,15].
Error in ==> dimercount at 100
buckets(dna(count),dna(count+1)) = buckets(dna(count),dna(count+1)) + 1;
Error in ==> basecount_dimercount_error at 35
dch=dimercount(mySequence)
If I use a different symbol to represent the gaps, or work with a sequence including ambiguous bases, I do not receive this error. If I use the 'Chart' option with a sequence containing ambiguous bases, however, the "Others" field is missing from the resulting plot.
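Independently of DIMERCOUNT, the expected tally above can be cross-checked by hand. The following is only a minimal sketch, not the toolbox's implementation: it assumes core MATLAB alone, walks the sequence one overlapping pair at a time, and puts any pair containing a character other than A, C, G, or T into Others. The variable names (seq, bases, counts) are illustrative and do not come from the toolbox.

```matlab
% Illustrative cross-check only; this is NOT how dimercount is implemented.
seq   = upper('TAG-TGGCCAAGCGAGCTTG');
bases = 'ACGT';

% Start every standard dimer (AA, AC, ..., TT) at zero, then add Others.
counts = struct();
for i = 1:4
    for j = 1:4
        counts.([bases(i) bases(j)]) = 0;
    end
end
counts.Others = 0;

% Slide a two-character window over the sequence.
for k = 1:numel(seq) - 1
    pair = seq(k:k+1);
    if all(ismember(pair, bases))
        counts.(pair) = counts.(pair) + 1;      % plain A/C/G/T dimer
    else
        counts.Others = counts.Others + 1;      % gap or ambiguous symbol
    end
end

disp(counts)
```

For the sequence above, the two pairs that touch the hyphen ('G-' and '-T') are the ones that land in Others. If a chart with an Others bar is wanted, the counts could be turned into a numeric vector with cell2mat(struct2cell(counts)) and passed to bar, though that is a stopgap and not a substitute for the 'Chart' option.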
Best Answer