Thanks David the python script you attached is super helpful!<br><br>I'm using it to generate a list of possible amino acids given each<br>codon. I'll be posting the data some time tomorrow along with some<br>other things.<br>

<br>  mike<br><br><br><br><div class="gmail_quote">On Wed, Jun 23, 2010 at 9:01 AM, David Faden <span dir="ltr"><<a href="mailto:dfaden@gmail.com">dfaden@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

It looks like they cannot be unambiguously mapped to amino acids. I wonder if it would be sensible in this case to invent a new symbol expressing all the possibilities, eg, just glom together the names of the possible amino acids in sorted order. Can we count on them using the same symbol coding for multiple nucleotides at the same sites across sequences? -- probably not, right? Do the sequence matchers already take care of this?<div>


<br></div><div><div>>>> import dna</div><div># First PR.Seq from the training data:</div><div>>>> t = 'CCTCAAATCACTCTTTGGCAACGACCCCTCGTCCCAATAAGGATAGGGGGGCAACTAAAGGAAGCYCTATTAGATACAGGAGCAGATGATACAGTATTAGAAGACATGGAGTTGCCAGGAAGATGGAAACCAAAAATGATAGGGGGAATTGGAGGTTTTATCAAAGTAARACAGTATGATCAGRTACCCATAGAAATCTATGGACATAAAGCTGTAGGTACAGTATTAATAGGACCTACACCTGTCAACATAATTGGAAGAAATCTGTTGACTCAGCTTGGTTGCACTTTAAATTTY'</div>


<div>>>> dna.DisambiguateAmino(t)Traceback (most recent call last):  File "<stdin>", line 1, in <module></div><div>  File "dna.py", line 40, in DisambiguateAmino</div><div>    raise Exception('Wrong number: <<%s>> for %s' % (', '.join(possibilities), triple))</div>


<div>Exception: Wrong number: <<Arg, Lys>> for ARA</div></div><div><br></div><div>I hope someone can find use for the dictionaries in the attached code anyway.</div>

</blockquote></div><br>