[Rack] robots.txt
Ben Kochie
ben at nerp.net
Tue Apr 30 21:56:15 UTC 2013
One additional amusing thing I found.
I blocked the bingbot IP range with a DROP rule. Within a minute of dropping
bingbot's IP range, "msnbot" came by and scraped the new robots.txt from
another IP range.
I removed the block, and it looks to be working correctly now.
-ben
On Tue, 30 Apr 2013, Ben Kochie wrote:
> The Disallow: /wiki/Special: came from the mediawiki examples.
>
> I added the additional /wiki/Special*
>
> Also, would someone who likes doing this kind of thing update our mediawiki:
>
> http://lists.wikimedia.org/pipermail/mediawiki-announce/2013-April/000127.html
> http://lists.wikimedia.org/pipermail/mediawiki-announce/2013-April/000129.html
>
> -ben
>
> On Tue, 30 Apr 2013, Jeff Tchang wrote:
>
>>
>> Googlebot (but not all search engines) respects some pattern matching.
>>
>> * To match a sequence of characters, use an asterisk (*). For instance,
>> to block access to all subdirectories that begin with private:
>>
>> User-agent: Googlebot
>> Disallow: /private*/
>>
>>
>> So in your example
>>
>> User-Agent: *
>> Disallow: /wiki/Special*
>>
>> will work for Google. I am not sure whether bingbot obeys it.
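
To make the pattern matching described above concrete, here is a minimal Python sketch of wildcard-style rule matching. It is an illustration only, not any crawler's actual code: rule_matches is a hypothetical helper, and it assumes '*' matches any run of characters, a trailing '$' anchors the end of the path, and anything else is a plain prefix match.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Hypothetical helper for illustration. Assumed semantics: '*' matches
    any run of characters, a trailing '$' anchors the end of the path,
    and everything else is a literal prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces, then join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start only, which gives prefix behavior.
    return re.match(regex, path) is not None


print(rule_matches("/wiki/Special*", "/wiki/Special:RecentChanges"))
print(rule_matches("/wiki/Special:", "/wiki/Special:RecentChanges"))
print(rule_matches("/private*/", "/private-files/index.html"))
```

Under these assumptions a trailing '*' adds nothing, because the rule is already a prefix match: "/wiki/Special*" and "/wiki/Special" cover the same paths.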
>>
>> On Tue, Apr 30, 2013 at 2:38 PM, Andy Isaacson <adi at hexapodia.org> wrote:
>> On Tue, Apr 30, 2013 at 02:31:37PM -0700, Ben Kochie wrote:
>> > I added a robots.txt to https://noisebridge.net
>> >
>> > User-agent: *
>> > Disallow: /wiki/Help
>> > Disallow: /wiki/MediaWiki
>> > Disallow: /wiki/Special:
>> > Disallow: /wiki/Template
>> > Disallow: /wiki/skins/
>> >
>> > I noticed bingbot is uselessly crawling the entire contents of
>> > Special:RecentChanges.
>>
>> Is robots.txt a prefix, or a directory based exclusion scheme? Will
>> "Disallow: /wiki/Special:" cause bingbot to skip
>> "/wiki/Special:RecentChanges"?
>>
>> -andy
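
One way to check the prefix question locally is to feed the rules quoted above to Python's standard urllib.robotparser, which treats each Disallow value as a path prefix. This is only a sketch: it shows how that one parser reads the file, not what bingbot actually does. The allowed() helper and the /wiki/Main_Page and /wiki/Helpdesk paths are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The rules from the robots.txt quoted in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Help
Disallow: /wiki/MediaWiki
Disallow: /wiki/Special:
Disallow: /wiki/Template
Disallow: /wiki/skins/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str, agent: str = "bingbot") -> bool:
    """Hypothetical helper: ask the parser about a path on the wiki host."""
    return parser.can_fetch(agent, "https://noisebridge.net" + path)


for path in ("/wiki/Special:RecentChanges", "/wiki/Main_Page", "/wiki/Helpdesk"):
    print(path, "allowed" if allowed(path) else "blocked")
```

With prefix matching, "Disallow: /wiki/Special:" covers "/wiki/Special:RecentChanges", and "Disallow: /wiki/Help" also covers any page whose path merely starts with /wiki/Help.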
>