Topic on Project:Support desk

Disallow crawling of anything with a query string

79.183.173.166 (talkcontribs)

MediaWiki 1.36.1 with just one user account.

  • No core customizations
  • No non-core skins
  • No non-core extensions
  • No images
  • No videos
  • No audio files

I want crawlers not to index, or even crawl, anything with a URL parameter.

Is there any problem with disallowing crawling of anything with a query string?

Disallow: /index.php?
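
For context, the complete robots.txt I have in mind would just be this (assuming the wiki is served from the domain root, so the script is reached at /index.php):

User-agent: *
Disallow: /index.php?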

Thank you for your help.

Bawolff (talkcontribs)

This won't prevent indexing if other sites link to a page (see Google's docs).

It will prevent some crawlers (but not all, since not everyone respects robots.txt).


Assuming that your $wgArticlePath points somewhere different, I don't think it will break anything.
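
For example, with short URLs the relevant part of LocalSettings.php usually looks something like this (example values only, adjust the paths to your own setup):

$wgScriptPath = "";            # index.php stays reachable at /index.php
$wgArticlePath = "/wiki/$1";   # normal page views use /wiki/Page_title, so the Disallow rule doesn't block them
$wgUsePathInfo = true;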

79.183.173.166 (talkcontribs)

@Bawolff

Regarding what you said, "This won't prevent indexing if other sites link to a page (see Google's docs)":


Are you sure that's correct?


I have just read this page:

developers.google.com/search/docs/advanced/robots/robots_txt

>>>> How Google interprets the robots.txt specification

Under the "Disallow" section, I didn't find anything similar.

79.183.173.166 (talkcontribs)

@Bawolff I think I get the point now, but it might be wrong, because:

If Google crawled a page on my website and indexed it before accessing my robots.txt, but then accessed my robots.txt, the page would be dropped from the index and wouldn't be indexed again (as long as the directive stays in my robots.txt).

Would you assume otherwise?


Thanks,

Bawolff (talkcontribs)
Reply to "Disallow crawling of anything with a query string"