Manual talk:Robots.txt


Excluding a namespace

I guess it would be pretty easy to exclude a namespace, right? E.g., if you wanted to establish an unsearchable trash namespace:

 Disallow: /wiki/Trash:
 Disallow: /wiki/Trash_talk:

Leucosticte (talk) 11:52, 19 October 2013 (UTC)
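A slightly fuller sketch, assuming the default /wiki/ article path and the hypothetical Trash namespace above. Page titles use underscores rather than spaces in URLs, and index.php?title= links would bypass a /wiki/ rule, so both forms are covered:

 User-agent: *
 Disallow: /wiki/Trash:
 Disallow: /wiki/Trash_talk:
 Disallow: /index.php?title=Trash:
 Disallow: /index.php?title=Trash_talk:

Note that robots.txt matching is plain prefix matching on the URL, so the index.php?title= rules only catch links where title is the first query parameter.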

/api.php

Mention if /api.php should be disallowed too. Jidanni (talk) 17:41, 6 December 2017 (UTC)
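If a wiki did want to keep crawlers out of the API, the rule would follow the same pattern as the examples above (a sketch, assuming api.php sits at the site root as in a default install):

 User-agent: *
 Disallow: /api.php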

robots.txt + site root level Short URL

For wikis with site-root-level short URLs, another user suggests:

User-agent: *
Disallow: /index.php

I'm liking this suggestion, above. My wiki has short URLs implemented. So all the page links look to robots like example.com/Some_page. And all the action links look like example.com/index.php?title=Page_name&action=edit etc. So robots.txt can Disallow: /index.php for all User-agents: * and Google spiders etc will not crawl the action pages but will crawl all the normal page links because my Short URL LocalSettings have caused those to look like this: example.com/Some_page. Am I correct? --Rogerhc (talk) 05:41, 11 January 2019 (UTC)