[Rack] Crawling Noisebridge-discuss Archives

Andy Isaacson adi at hexapodia.org
Sun May 13 05:54:30 UTC 2012


On Sat, May 12, 2012 at 07:05:03PM -0700, Jared Dunne wrote:
> I wanted to give you a heads up that I am crawling:
> https://www.noisebridge.net/pipermail/noisebridge-discuss/
> 
> With the User-Agent of:
> Noisebridge-discuss Drama Detector Crawler
> 
> I am using Nutch 1.4 with the default fetcher.server.delay setting of 5
> seconds between requests to the same server.  I suspect I can go faster
> than that but I'll err on the side of caution.
> 
> This is for a Noisebridge-related project.  Please let me know if there are
> any problems with this.

There's an mbox if that would be easier.

-andy
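
For anyone weighing the mbox route against crawling, here is a minimal sketch of reading an mbox with Python's standard-library mailbox module. The function name and the file path are hypothetical, chosen only for illustration; the thread does not say where the mbox lives or how it is named.

```python
import mailbox


def summarize_mbox(mbox_path):
    """Return a list of (sender, subject) tuples, one per message.

    mbox_path is a hypothetical local path to a downloaded mbox file.
    """
    # create=False makes a missing file raise instead of creating an empty one.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [
            (msg.get("From", "(unknown)"), msg.get("Subject", "(no subject)"))
            for msg in box
        ]
    finally:
        box.close()


if __name__ == "__main__":
    # Hypothetical filename; substitute the real mbox location.
    for sender, subject in summarize_mbox("noisebridge-discuss.mbox"):
        print(sender, "|", subject)
```

Reading the archive this way touches one local file instead of issuing one HTTP request per message, so no per-request delay is needed.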


