Hey MLers,

If anyone is interested in implementing a "drama detection"
algorithm, we should have an ML meetup about it. There are
tons of training examples in the archives of the noisebridge-discuss
list.

At first glance, we could probably use tf-idf along with the network
structure of the emails (who replies to whom) to do unsupervised learning,
hand-correct the resulting labels, then train a supervised learner on the
manually improved data.
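
Here is a rough sketch of the unsupervised half of the idea above, in plain
Python with no ML libraries. Everything in it is an assumption for
illustration: the toy threads, the "reciprocity" score used as a stand-in for
network structure, the network_weight knob, and the choice of a two-cluster
spherical k-means. The real archives would need proper parsing and
tokenizing first.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document (tf * smoothed idf)."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            t: (count / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, count in tf.items()
        })
    return vectors

def reply_reciprocity(edges):
    """Fraction of (sender, recipient) replies answered in the other direction.
    Hypothetical network feature: lots of back-and-forth might hint at drama."""
    if not edges:
        return 0.0
    pairs = set(edges)
    return sum(1 for s, r in edges if (r, s) in pairs) / len(edges)

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def two_means(vectors, iterations=10):
    """Split vectors into two clusters by cosine similarity (k-means, k=2)."""
    # Deterministic init: the first vector and the one least similar to it.
    far = min(range(len(vectors)), key=lambda i: cosine(vectors[0], vectors[i]))
    centroids = [dict(vectors[0]), dict(vectors[far])]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            0 if cosine(v, centroids[0]) >= cosine(v, centroids[1]) else 1
            for v in vectors
        ]
        for k in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                total = Counter()
                for v in members:
                    total.update(v)  # adds weights term by term
                centroids[k] = {t: w / len(members) for t, w in total.items()}
    return labels

# Toy threads (made up): text plus (sender, replied_to) edges.
threads = [
    {"text": "you always do this stop attacking me you are wrong",
     "edges": [("a", "b"), ("b", "a"), ("a", "b")]},
    {"text": "laser cutter firmware update and new stepper driver",
     "edges": [("c", "d")]},
    {"text": "you are wrong again stop attacking people",
     "edges": [("e", "f"), ("f", "e")]},
    {"text": "stepper driver wiring for the laser cutter",
     "edges": [("g", "h"), ("i", "h")]},
]

network_weight = 0.2  # assumed scale so the network feature doesn't swamp tf-idf
vectors = tfidf([t["text"] for t in threads])
for vec, thread in zip(vectors, threads):
    vec["__reciprocity__"] = network_weight * reply_reciprocity(thread["edges"])

labels = two_means(vectors)
print(labels)
```

The cluster ids themselves mean nothing; a human would still have to look at
each group, decide which one is "drama", and fix the mistakes. Those corrected
labels are what the supervised learner would then train on.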



From: Jake <jake at spaz.org>
Date: Tue, May 8, 2012 at 2:48 PM
Subject: [Noisebridge-discuss] drama-o-meter
To: "noisebridge-discuss at lists.noisebridge.net"
<noisebridge-discuss at lists.noisebridge.net>

can someone code up a thing that looks at the discuss mailing list traffic
and measures what portion of it is drama?  it should be easy.

a human will have to declare threads to be drama or not, and if they are,
and the total traffic for the week exceeds a certain percentage (say 50%)
the moderation automatically chokes off any further posts on any of those
threads until the percentage cuts back down.

i view the discuss list on the web portal like this:


and as you can see, it's really hard to find the hacking-related posts.

