Manual talk:Robots.txt


Excluding a namespace

I guess it would be pretty easy to exclude a namespace, right? E.g., if you wanted to establish an unsearchable trash namespace:

 Disallow: /wiki/Trash:
 Disallow: /wiki/Trash_talk:

Leucosticte (talk) 11:52, 19 October 2013 (UTC)
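Rules like these can be sanity-checked with Python's standard-library urllib.robotparser. This is just an illustrative sketch: the hostname and page titles are made up, and the second rule uses an underscore because MediaWiki encodes spaces in titles as underscores in URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt excluding a "Trash" namespace under the
# /wiki/ article path (spaces in titles become underscores in URLs).
rules = """\
User-agent: *
Disallow: /wiki/Trash:
Disallow: /wiki/Trash_talk:
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Trash pages are blocked; ordinary pages are still crawlable.
print(rp.can_fetch("*", "https://example.com/wiki/Trash:Old_draft"))
print(rp.can_fetch("*", "https://example.com/wiki/Main_Page"))
```

Note that robots.txt matching is plain prefix matching on the URL path, so the rules only work if they match the article path your wiki actually serves.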

/api.php

Mention whether /api.php should be disallowed too. Jidanni (talk) 17:41, 6 December 2017 (UTC)
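If you do want to keep crawlers off the API endpoint, one possible rule (assuming api.php sits at the site root, as in a default install) would be the following; whether you should depends on whether anything legitimate crawls your API, since api.php responses are generally not useful search results:

 User-agent: *
 Disallow: /api.php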

robots.txt + site root level Short URL

For wikis with site-root-level short URLs, another user suggests:

User-agent: *
Disallow: /index.php

I like this suggestion. My wiki has short URLs implemented, so all page links look to robots like example.com/Some_page, and all action links look like example.com/index.php?title=Page_name&action=edit etc. So robots.txt can Disallow: /index.php for User-agent: *, and Google's spiders etc. will not crawl the action pages, but will still crawl all the normal page links, because my short-URL LocalSettings have caused those to look like example.com/Some_page. Am I correct? --Rogerhc (talk) 05:41, 11 January 2019 (UTC)
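For what it's worth, this reasoning can be checked mechanically with Python's standard-library urllib.robotparser (example.com and the titles are placeholders). The action URLs share the /index.php path prefix, so they are blocked, while the short page URLs are not:

```python
from urllib.robotparser import RobotFileParser

# The suggested rules for a wiki with root-level short URLs.
rules = """\
User-agent: *
Disallow: /index.php
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Action links go through /index.php, so the prefix rule blocks them...
print(rp.can_fetch("*", "https://example.com/index.php?title=Page_name&action=edit"))
# ...but short-URL page links don't match the prefix and stay crawlable.
print(rp.can_fetch("*", "https://example.com/Some_page"))
```

One caveat: with root-level short URLs, every page title lives directly under /, so the Disallow must be narrow enough not to shadow any real page path.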

MediaWiki:Robots.txt

Wikipedia apparently is able to customize its robots.txt via w:MediaWiki:Robots.txt. On this wiki, MediaWiki:Robots.txt does not exist, but it clearly functions in the same way, if the default content there is any indication. However, this does not seem to be a default of the MediaWiki software: checking a handful of third-party wikis, their MediaWiki:Robots.txt pages do not have this default content. Looking around, there's no obvious documentation on how to set this up, and nothing jumps out on Special:Version either, so how is this done? Can someone add some documentation on this page? ディノ千?!☎ Dinoguy1000 00:11, 30 September 2021 (UTC)