MATLAB: Loop through the DNA array and record all of the locations of the triplets (codons): ‘AAA’, ‘ATC’ and ‘CGG’.

My code so far runs, but I don't think it's correct. I am supposed to loop through the cell array and record the location of each codon, while skipping any triplet that contains a character from the preceding codon. For example, if part of the sequence contains [A,T,C,C,G,G], then the section with CCG should be skipped because it shares a character with the preceding ATC. I'm just not sure what the best way to do that would be.

Here is what I have so far:

fid = fopen('sequence_long.txt','r')
A = textscan(fid,'%3s');
DNA = A{1};
fclose(fid);
i = 1;
%loops through array and counts codon occurrences
%finds the index location of individual codons
while i < length(DNA)  
    i = i + 1;
    
    
    if strcmp(DNA(i),'AAA') 
        num_AAA = nnz(strcmp(DNA,'AAA'));
        loc_AAA = find(strcmp(DNA,'AAA'));
        
    elseif strcmp(DNA(i),'ATC')
        num_ATC = nnz(strcmp(DNA,'ATC'));
        loc_ATC = find(strcmp(DNA,'ATC'));
        
    elseif strcmp(DNA(i),'CGG')
        num_CGG = nnz(strcmp(DNA,'CGG'));
        loc_CGG = find(strcmp(DNA,'CGG')); 
       
    end
    
end
fprintf('The number of AAA values is: %.f',num_AAA)
fprintf('The index location of AAA values: %.f\n',loc_AAA(1:10))
fprintf('The number of ATC values is: %.f',num_ATC)
fprintf('The index location of ATC values: %.f\n',loc_ATC(1:10))
fprintf('The number of CGG values is: %.f',num_CGG)
fprintf('The index location of CGG values: %.f\n',loc_CGG(1:10))

DNA = 'AAATCATCGGCGGATC'; %Example sequence
i = 1;
loc_AAA = [];
loc_ATC = [];
loc_CGG = [];
num_AAA = 0;
num_ATC = 0;
num_CGG = 0;
while i <= length(DNA)-2
    if DNA(i)=='A' && DNA(i+1)=='A' && DNA(i+2)=='A'
        loc_AAA = [loc_AAA i];
        num_AAA = num_AAA + 1;
        i = i + 3; %Skip the next two characters
    elseif DNA(i)=='A' && DNA(i+1)=='T' && DNA(i+2)=='C'
        loc_ATC = [loc_ATC i];
        num_ATC = num_ATC + 1;
        i = i + 3;
    elseif DNA(i)=='C' && DNA(i+1)=='G' && DNA(i+2)=='G'
        loc_CGG = [loc_CGG i];
        num_CGG = num_CGG + 1;
        i = i + 3;
    else
        i = i + 1;
    end
end
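
As a possible alternative to the explicit while loop, the same non-overlapping scan can probably be done with a single regexp call, since regexp resumes searching after the end of each match. This is only a sketch under assumptions: it uses the same example sequence, assumes the sequence is a single char vector (a cell array of chunks read with textscan would first need to be joined, e.g. with [DNA{:}]), and the names seq, starts and hits are made up for illustration.

```matlab
% Sketch: non-overlapping codon search using one regexp call.
seq = 'AAATCATCGGCGGATC';   % example sequence, assumed to be a char vector

% regexp scans left to right and continues after the end of each match,
% so no two reported matches share a character.
[starts, hits] = regexp(seq, 'AAA|ATC|CGG', 'start', 'match');

% Split the start indices by which codon was matched.
loc_AAA = starts(strcmp(hits, 'AAA'));
loc_ATC = starts(strcmp(hits, 'ATC'));
loc_CGG = starts(strcmp(hits, 'CGG'));

num_AAA = numel(loc_AAA);
num_ATC = numel(loc_ATC);
num_CGG = numel(loc_CGG);

fprintf('AAA: %d found at %s\n', num_AAA, mat2str(loc_AAA));
fprintf('ATC: %d found at %s\n', num_ATC, mat2str(loc_ATC));
fprintf('CGG: %d found at %s\n', num_CGG, mat2str(loc_CGG));
```

For the example sequence this should give the same locations as the loop: AAA at 1, ATC at 6 and 14, and CGG at 11. The CGG that starts at index 8 is skipped because its first character belongs to the ATC found at index 6.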
