[ml] Kaggle HIV update

David Faden dfaden at gmail.com
Tue Jun 22 15:37:37 UTC 2010


It looks like the sequences are already coded in terms of amino acids rather
than nucleotide triplets?
<http://www.biogem.org/Accelrys/Sequencing/symbols_amino_acids.html>
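
A rough way to check this, as a sketch only: test whether every symbol in a
sequence is one of A, C, G, T. The function name and the sample strings below
are made up for illustration and are not taken from the Kaggle data.

```python
NUCLEOTIDES = set("ACGT")

def looks_like_nucleotides(seq):
    """Return True if seq is non-empty and uses only A, C, G, T (case-insensitive)."""
    seq = seq.strip().upper()
    return bool(seq) and set(seq) <= NUCLEOTIDES

# Hypothetical examples, not real competition data.
print(looks_like_nucleotides("CCTCAAATCACTCTTTGG"))  # uses only A/C/G/T
print(looks_like_nucleotides("PQITLWQRPLV"))         # contains amino-acid-only letters
```

A, C, G and T are also valid amino acid codes, so a short protein fragment
could pass this test by chance; it is only a quick sanity check on longer
sequences.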

On Mon, Jun 21, 2010 at 10:29 PM, Thomas Lotze <thomas.lotze at gmail.com> wrote:

> I committed some Python for generating base pair triplet count features,
> and R code for determining frequency and doing a basic GLM including the
> most frequent triplets.
> (The Noisebridge machine learning SourceForge git repository is here:
> https://sourceforge.net/scm/?type=git&group_id=326816
> To download the files, run
> "git clone git://ml-noisebridge.git.sourceforge.net/gitroot/ml-noisebridge/ml-noisebridge"
> or, better yet, ask Mike to give you read/write access to this project so
> you can upload code as well.)
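
For anyone who has not pulled the repository yet, here is a minimal sketch of
what triplet count features could look like. It is not the committed code; the
function name, the step parameter, and the example string are assumptions for
illustration.

```python
from collections import Counter

def triplet_counts(seq, step=1):
    """Count length-3 substrings of seq.

    step=1 counts overlapping triplets; step=3 counts codon-style,
    non-overlapping triplets starting at position 0.
    """
    seq = seq.strip().upper()
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, step))

# Hypothetical input, not real competition data.
counts = triplet_counts("ACGACGT")
print(counts.most_common(3))
```

The resulting counts can be turned into one feature column per triplet, with
rare triplets dropped before fitting a model.
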
>
> This got me to 53.8462 MCE, 36th out of 49 teams.
>
> See you tomorrow night at 9 for fun with Hadoop!
> -Thomas
>

