Thanks to the many who contributed to answering this question. I need to do a little more work on this project and now have a related question, so I thought I'd post it right here. I remember reading somewhere that it is not good form to use "break" in a program, but I don't know what else I can use in this context. Worse, "break" seems to be misbehaving! This is what is happening:
I have a number of DNA sequences in a file. For each of them (a string) I needed to find a substring that begins with "ATG" and ends with "TGA", "TAG" or "TAA" in the same frame (that is, if you count triplets of letters starting from the ATG, the final triplet needs to be one of the three: "TGA", "TAG" or "TAA"). So the length of the substring needs to be a multiple of 3. Thanks to your help, I have found such a substring for each sequence. Let us say that for the first sequence, I have found a substring that ends with TGA. Now I need to find out whether there are any TGAs, TAGs or TAAs nested within that substring in the same frame. If there is even one, the substring is of no use, and I need to move to the next ATG in the DNA sequence and find the substring that begins with this ATG and ends with one of the three termination triplets. Without going into the actual code, I have this nested loop structure:
for I := 0 to SequenceFile.Count - 1 do //SequenceFile is a TStringList
begin
  .
  .
  .
  for J := 1 to NumATGs do //NumATGs is the number of ATGs in the Ith sequence
  begin
    //here I find the sequence that meets the criteria of beginning with ATG and ending with TGA
    .
    .
    .
    //here I am looking for any nested TGAs (except for the last three letters, which are already TGA). If I find one, I need to get out of this loop and move on to the next one, where I look for any nested TAGs.
    PosTGA := Pos('TGA', TryORF) + 3; //TryORF is a substring that begins with ATG and ends with one of three termination triplets
    for K := 1 to NumTGAs_Temp do
    begin
      if (PosTGA < lenTryORF) then //if the TGA is in the middle and not the end of the ORF; lenTryORF is the length of TryORF
      begin
        if ((PosTGA - 1) mod 3 = 0) then //if this TGA is in frame (nested) then move to next ATG
          break
        else PosTGA := PosEx('TGA', TempStr, PosTGA) + 3; //see if the next TGA is nested
      end
      else PosTGA := PosEx('TGA', TempStr, PosTGA) + 3;
    end;
    //here I am looking for any nested TAGs (just like above).
    PosTAG := Pos('TAG', TryORF) + 3;
    for K := 1 to NumTAGs_Temp do
    begin
      if (PosTAG < lenTryORF) then //if the TAG is in the middle and not the end of the ORF
      begin
        if ((PosTAG - 1) mod 3 = 0) then //if this TAG is in frame (nested) then move to next ATG
          break
        else PosTAG := PosEx('TAG', TempStr, PosTAG) + 3;
      end
      else PosTAG := PosEx('TAG', TempStr, PosTAG) + 3;
    end;
    //here I am looking for any nested TAAs (just like above).
    PosTAA := Pos('TAA', TryORF) + 3;
    for K := 1 to NumTAAs_Temp do
    begin
      if (PosTAA < lenTryORF) then //if the TAA is in the middle and not the end of the ORF
      begin
        if ((PosTAA - 1) mod 3 = 0) then //if this TAA is in frame (nested) then move to next ATG
          break
        else PosTAA := PosEx('TAA', TempStr, PosTAA) + 3;
      end
      else PosTAA := PosEx('TAA', TempStr, PosTAA) + 3;
    end;
  end;
end;
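For comparison, the in-frame check described above can be sketched as one standalone function that walks the candidate triplet by triplet. This is only a sketch under stated assumptions, not the code from my project: HasNestedStop is a hypothetical helper name, and it assumes the string passed in already starts at the ATG and has a length that is a multiple of 3.

```pascal
program NestedStopDemo;

{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Hypothetical helper (not part of the original code): returns True when
// ORF contains TGA, TAG or TAA in frame before its final triplet.
// Assumes ORF starts at the ATG and Length(ORF) is a multiple of 3.
function HasNestedStop(const ORF: string): Boolean;
var
  P: Integer;
  Codon: string;
begin
  Result := False;
  P := 4; // first triplet after the leading ATG (strings are 1-based)
  while P + 2 <= Length(ORF) - 3 do // stop before the final triplet
  begin
    Codon := Copy(ORF, P, 3);
    if (Codon = 'TGA') or (Codon = 'TAG') or (Codon = 'TAA') then
    begin
      Result := True;
      Exit; // leaves only this function, so the caller's loops keep running
    end;
    Inc(P, 3); // step one whole triplet so we stay in frame
  end;
end;

begin
  WriteLn(HasNestedStop('ATGAAATGA'));    // no internal in-frame stop
  WriteLn(HasNestedStop('ATGTAGAAATGA')); // TAG in frame at position 4
  WriteLn(HasNestedStop('ATGATGAAATGA')); // TGA at position 5 is out of frame
end.
```

With a helper like this, the body of the J loop could test all three triplets in one call, for example "if HasNestedStop(TryORF) then Continue;", which moves straight to the next ATG without a break inside a K loop.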
By the way, if my program finds a nested termination codon (say, a TGA in the first K loop), it is supposed to break out of that loop and either move to the next ATG in the sequence (that is, the next iteration of the J loop) or move on to the next K loop (to look for any nested TAGs). Instead, it moves to the I loop, abandoning the current sequence and moving on to the next sequence in the file.
What am I doing wrong?