[Rack] Crawling Noisebridge-discuss Archives
Jared Dunne
jareddunne at gmail.com
Sun May 13 02:05:03 UTC 2012
rack-
I wanted to give you a heads up that I am crawling:
https://www.noisebridge.net/pipermail/noisebridge-discuss/
With the User-Agent of:
Noisebridge-discuss Drama Detector Crawler
I am using Nutch 1.4 with the default fetcher.server.delay setting of 5
seconds between requests to the same server. I suspect I can go faster
than that but I'll err on the side of caution.
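
For anyone wanting to picture what a per-server delay like this does, here is a minimal Python sketch. It is not Nutch code and does not reflect how Nutch implements fetcher.server.delay internally; the class name PoliteFetcher, its methods, and the constant names are hypothetical and chosen only for illustration. It assumes the delay is counted from the end of the previous request to the same host.

```python
import time
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Values taken from the message above; the constant names are illustrative.
USER_AGENT = "Noisebridge-discuss Drama Detector Crawler"
SERVER_DELAY = 5.0  # seconds between requests to the same server


class PoliteFetcher:
    """Hypothetical helper that spaces out requests to each host."""

    def __init__(self, delay=SERVER_DELAY, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self.clock = clock    # injectable so the timing logic can be tested
        self.sleep = sleep
        self.last_request = {}  # host -> clock reading after the last request

    def wait_time(self, url):
        """Return how many seconds to pause before requesting this URL."""
        host = urlparse(url).netloc
        last = self.last_request.get(host)
        if last is None:
            return 0.0  # first request to this host, no pause needed
        return max(0.0, self.delay - (self.clock() - last))

    def fetch(self, url, timeout=30):
        """Fetch a URL, pausing first if the host was contacted recently."""
        pause = self.wait_time(url)
        if pause > 0:
            self.sleep(pause)
        request = Request(url, headers={"User-Agent": USER_AGENT})
        try:
            with urlopen(request, timeout=timeout) as response:
                return response.read()
        finally:
            # Record the time even on failure so retries stay polite.
            self.last_request[urlparse(url).netloc] = self.clock()
```

Calling PoliteFetcher().fetch(url) repeatedly for URLs on the same host would then leave at least 5 seconds between requests, while requests to different hosts are not delayed by each other.
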
This is for a Noisebridge-related project. Please let me know if there are
any problems with this.
Thanks,
Jared-