[Noisebridge-discuss] Spam on the wiki

Andy Isaacson adi at hexapodia.org
Wed Jul 6 07:51:52 UTC 2011


On Tue, Jul 05, 2011 at 06:59:47PM -0400, Casey Callendrello wrote:
> I've had some experience de-spamming other MediaWiki installations. It 

Awesome, I really appreciate hearing from people with experience!  It
gives me faith that we're not alone. :)

> seems that most of these spammers have someone (Mechanical Turk, Chinese 
> gold farm, porno access page) solving reCAPTCHAs as quickly as possible. 

I'm super down on reCAPTCHA because it often doesn't work for me on
other sites, and using it makes our site depend on recaptcha.net or
whatever provider we use -- I'd prefer something we can run
independently.

Visual captchas are also unfriendly to visually impaired users.  Having
a special case for audio cues isn't much better, and is likely to break
without being noticed (since most users don't use it).

> This is where their main costs come from. Remember that this is a major 
> volume business. They almost never tweak their scripts to work around 
> the individual quirks of a particular installation. In fact, as a 
> general rule, if you can make your pages the least bit out of "standard" 
> it will kill spam for a very long time.

Agreed, we've been blissfully ignorant of the problem for a long time
because none of the spammers hit https sites.

Then last fall we had a burst of anonymous page creation spam, so we
turned off anonymous page creation (it was supposed to be a short-term
solution, but I got distracted and never got back to fixing it), and now
we're seeing account creation to allow spam.  Wheee.

> On ones that I managed, we required newly created accounts to pass a 
> CAPTCHA for all spam-potential actions. More specifically, "new" 
> accounts couldn't create or edit any sort of page. That "new" threshold 
> is worth tweaking. By default, it's 3 days. We set it to 4 and that was 
> enough out of "normal" to kill spam.

That sounds like a bummer to me; we get a *lot* of great content created
by brand new accounts.
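
For reference, the "new account" gate described in the quote above could
be sketched roughly like this.  This is Python rather than MediaWiki's
PHP, and the names (NEW_ACCOUNT_AGE, needs_captcha) are made up for
illustration, not real MediaWiki settings.  The 4-day value is the one
from the quoted message.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: accounts younger than this count as "new".
# Four days is the non-default value mentioned in the quoted message.
NEW_ACCOUNT_AGE = timedelta(days=4)

def needs_captcha(account_created, now=None, threshold=NEW_ACCOUNT_AGE):
    """Return True if the account is still "new" and should be challenged.

    account_created and now are timezone-aware datetimes.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - account_created < threshold
```

The tradeoff is the one raised above: a gate like this challenges every
brand new account, including the ones writing good content.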

> I think, for the NB wiki, we should consider adding that restriction. We 
> should also try using different CAPTCHAs for different pages. 
> ReCAPTCHA's failing is its ubiquity. Writing code to work around it has 
> a high payoff. Anything we can do to affect the spammers' bottom lines 
> will be what works.

Sounds like we should code up some local hack like:

What is three plus seven? _____________

and call it a day. :)
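
A minimal sketch of that kind of question, assuming Python for
illustration (the wiki itself runs PHP, so this is not drop-in code).
The function names and the word list are hypothetical.

```python
import random

# Hypothetical word list, so the question is spelled out in words
# like the "three plus seven" example above.
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]

def make_question(rng=random):
    """Return (question_text, expected_answer) for a small addition problem."""
    a = rng.randint(0, 10)
    b = rng.randint(0, 10)
    question = f"What is {NUMBER_WORDS[a]} plus {NUMBER_WORDS[b]}?"
    return question, a + b

def check_answer(expected, submitted):
    """Accept the answer only if it parses as the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False
```

The expected answer would have to be kept server-side (for example in
the session) rather than sent along with the form.  Because the
question is plain text, it also works with screen readers.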

-andy


