Topic on Project:Support desk

generateSitemap.php: How should search engines find the XML and GZ files?

49.230.83.59 (talkcontribs)

With generateSitemap.php I created the following files:

site_dir/sitemap/sitemap-index-example.com.xml

site_dir/sitemap/sitemap-example.com-NS_0-0.xml.gz

site_dir/sitemap/sitemap-example.com-NS_10-0.xml.gz

site_dir/sitemap/sitemap-example.com-NS_14-0.xml.gz

site_dir/sitemap/sitemap-example.com-NS_4-0.xml.gz

site_dir/sitemap/sitemap-example.com-NS_8-0.xml.gz

All these files have the permissions -rw-r--r--, so search engines should be able to read them, but I could not find clear information on the manual page, so I'd like to ask:

How should search engines find the XML and GZ files?

  • Should I link to them from the HTML body of the homepage or from the sidebar menu?
  • In Google Search Console I can add just one file at a time. Should I add all 6 or just one?

Thanks,

Ciencia Al Poder (talkcontribs)

In Google Search Console you should add site_dir/sitemap/sitemap-index-example.com.xml (the one with "index" in the name). It references all the others.
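To illustrate why submitting only the index file is enough, here is a minimal Python sketch of how a crawler might read a sitemap index and discover the per-namespace files. The XML content, the `https://example.com/sitemap/` URLs, and the `child_sitemaps` helper are assumptions made for illustration; the real index is whatever generateSitemap.php wrote on your server.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical index content (assumption); the real file lists all generated sitemaps.
INDEX_XML = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap/sitemap-example.com-NS_0-0.xml.gz</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap/sitemap-example.com-NS_4-0.xml.gz</loc></sitemap>
</sitemapindex>
"""


def child_sitemaps(index_xml: str) -> list[str]:
    """Return the URLs of the sitemap files referenced by a sitemap index."""
    # Parse from bytes so the encoding declaration in the XML header is honoured.
    root = ET.fromstring(index_xml.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)]


for url in child_sitemaps(INDEX_XML):
    print(url)
```

Each printed URL is a file the crawler can fetch next, so the individual .xml.gz files do not need to be submitted separately.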

You can also add a reference to the sitemap in the robots.txt file. See https://support.google.com/webmasters/answer/6062596
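As a rough sketch of the robots.txt option: the reference is a single `Sitemap:` line containing the full URL of the index file. The Python snippet below parses a hypothetical robots.txt with the standard library's `urllib.robotparser` to show what a crawler would extract. The `Disallow` rule and the `https://example.com/sitemap/...` URL are assumptions for illustration, not values from this thread's server.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (assumption). The Sitemap line must be a full URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap/sitemap-index-example.com.xml
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network access is needed here.
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() (Python 3.8+) returns the Sitemap URLs found, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

The `Sitemap:` line is independent of the `User-agent` groups, so it can sit alongside existing directives without affecting them.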

49.230.83.59 (talkcontribs)

Hello @Ciencia Al Poder,

I understand that it's either the first option or the second, and that there is no point in using both.

I assume I'll personally use the Search Console option, because my robots.txt already contains quite a lot of directives. Using Search Console might also be less error-prone: robots.txt can be very "dynamic", while search consoles in general are, at least in my experience, quite "static".

Yet using robots.txt seems more global to me, as it can affect a large number of search engines at once.

Ciencia Al Poder (talkcontribs)

You can use both options without any problem.

182.232.48.81 (talkcontribs)

Hello @Ciencia Al Poder, I tried to paste a long comment here to help newcomers like myself who are dealing with this. Because it included a link to Webmasters StackExchange, I couldn't paste it, so I pasted it in Project:Current issues and asked for help pasting it here, but my thread there was deleted.

It was deleted by @Clump with the blunt description "nonsense". I spent time writing and formatting this comment, and I think I should at least get a copy of it somewhere if it is not pasted here.

Please help me with this.

AhmadF.Cheema (talkcontribs)

Apologies for the confusion.😕🙁 Posting the comment at Current issues was a little unusual.

The following is 182.232.48.81's comment with some slight modifications.


Hello to newcomers like myself; I did what Ciencia Al Poder suggested and the following output is what I get when I visit example.com/sitemap-index-example.com.xml. At first I wasn't sure how crawlers would treat the .gz files, but it seems to be generally okay, as reported in the Webmasters StackExchange question: Why does my sitemap have a “.gz” extension, and how can I edit it?
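For anyone curious what a .gz sitemap actually is: it is the same XML, gzip-compressed. The following Python sketch (not the missing output above, just an illustration) compresses a hypothetical one-page sitemap and reads it back. The page URL and the `page_urls` helper are assumptions for the example.

```python
import gzip
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical uncompressed sitemap with a single page (assumption).
PLAIN_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/wiki/Main_Page</loc></url>
</urlset>
"""

# Stand-in for the bytes of a file such as sitemap-example.com-NS_0-0.xml.gz.
compressed = gzip.compress(PLAIN_XML)


def page_urls(gz_bytes: bytes) -> list[str]:
    """Decompress a gzipped sitemap and return the page URLs it lists."""
    xml_bytes = gzip.decompress(gz_bytes)
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


print(page_urls(compressed))
```

To inspect a real file by hand, decompressing it the same way (or with `gunzip` on the command line) shows the plain XML inside.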

182.232.48.81 (talkcontribs)

Thanks Ahmad,