Topic on Project:Support desk

XML import not populating categories

216.92.130.85 (talkcontribs)

I am exporting an .xml file from an existing wiki containing all pages under a category, say 100 pages in Category:TestCat. All the pages contain the [[Category:TestCat]] tag. The export is fine and the file looks good. When I import it into my new install, the import succeeds: the pages all appear in the list of all pages, all are fully accessible and correct, and they show the Category:TestCat tag properly, in blue (not red). BUT they are not listed on the category page for "TestCat". After import, if I click the blue category tag on an imported page, the category page says there are 0 items in that category.

I have tried creating the category prior to import, using every option in the import -> 'upload file' panel, and running the runJobs.php script. None of these has any effect.

However, after import, if I go to the list of all pages, select an imported page, and then Edit -> make no changes -> Save, THEN it appears on the "TestCat" category page as it should. I have too many pages to import to open and save each one individually, though. (Can I do this in the db, or script it?)

I haven't found anything in the documentation about this issue. Can anyone explain either why this is happening, or how to import pages under a category so that they actually populate that category without a manual edit/save?

Ciencia Al Poder (talkcontribs)

Category membership is updated by background jobs in the job queue (see Manual:Job queue). Depending on your setup, it may take a while to update if jobs only run on page loads, or you may need to execute runJobs.php to speed it up.
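To see whether any jobs are actually waiting in the queue, one option is to read the job count from the wiki's API. The following is only a rough sketch in Python: it assumes the API is reachable at a URL such as `https://wiki.example.org/w/api.php` (hypothetical), and the helper names are made up for illustration. The `jobs` figure in the site statistics is an estimate, not an exact count.

```python
import json
import urllib.parse
import urllib.request


def statistics_url(api_url):
    """Build the URL for a siteinfo statistics query against api.php."""
    query = urllib.parse.urlencode({
        "action": "query",
        "meta": "siteinfo",
        "siprop": "statistics",
        "format": "json",
    })
    return f"{api_url}?{query}"


def pending_jobs(response_text):
    """Extract the (approximate) job queue length from the JSON reply."""
    return json.loads(response_text)["query"]["statistics"]["jobs"]


def fetch_pending_jobs(api_url):
    """Query the wiki and return the reported number of queued jobs."""
    with urllib.request.urlopen(statistics_url(api_url)) as resp:
        return pending_jobs(resp.read().decode("utf-8"))
```

If the count stays at 0 right after an import, that would suggest the jobs are not being queued at all, rather than being queued and never run.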

216.92.130.85 (talkcontribs)

I've resolved this. Watching the db tables, the problem is clearly that when my XML is imported, the category tags go into the old_text column of the text table as plain text. They are not being parsed at all, which fits the symptoms.

Running /maintenance/refreshLinks.php fixes this. This doesn't appear to be documented in the Import documentation.
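For anyone who wants to run this step from a script after each import, here is a minimal sketch in Python. The install path `/var/www/wiki` and the helper names are hypothetical, and the script is called without options because none were mentioned in this thread.

```python
import subprocess
from pathlib import Path


def refresh_links_command(wiki_root):
    """Build the command line for MediaWiki's refreshLinks.php script."""
    script = Path(wiki_root) / "maintenance" / "refreshLinks.php"
    # No options are passed, so the script refreshes link tables for all pages.
    return ["php", str(script)]


def run_refresh_links(wiki_root):
    """Run refreshLinks.php; raises CalledProcessError on a non-zero exit."""
    subprocess.run(refresh_links_command(wiki_root), cwd=wiki_root, check=True)


if __name__ == "__main__":
    # Hypothetical install path; adjust to your own wiki directory.
    print(" ".join(refresh_links_command("/var/www/wiki")))
```

On a large wiki a full run can take a while, since every page is reparsed.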

216.92.130.85 (talkcontribs)

runJobs.php does not fix this.

Ciencia Al Poder (talkcontribs)

This is not true. Importing pages causes them to be reparsed, and parsing is what generates the category entries. This is done in the job queue. Maybe your setup causes those jobs to fail, or prevents them from being inserted at all. This is often caused by buggy extensions. However, refreshLinks.php forces the pages to be reparsed, which should "fix" this. If you ever find out the root cause of the problem, feel free to reopen this.
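Where shell access is not available, the manual "edit and save with no changes" workaround can probably be approximated through the API's `action=purge` with `forcelinkupdate`, which asks the wiki to update the links tables for the given pages. The following Python sketch is an assumption-laden illustration, not a tested recipe for this setup: the helper names are hypothetical, titles are read from the same export file that was imported, and whether your account is allowed to purge (and at what rate) depends on the wiki's configuration.

```python
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def titles_from_export(xml_text):
    """Return the <title> of every <page> in a MediaWiki XML export."""
    root = ET.fromstring(xml_text)
    titles = []
    for elem in root.iter():
        # Export files use a versioned XML namespace, so compare local names.
        if elem.tag.rsplit("}", 1)[-1] == "page":
            for child in elem:
                if child.tag.rsplit("}", 1)[-1] == "title":
                    titles.append(child.text)
    return titles


def purge_batches(titles, batch_size=50):
    """Yield POST parameters for action=purge, batch_size titles per request."""
    for i in range(0, len(titles), batch_size):
        yield {
            "action": "purge",
            "forcelinkupdate": "1",
            "titles": "|".join(titles[i:i + batch_size]),
            "format": "json",
        }


def purge_all(api_url, titles):
    """Send each batch to api.php; purge requests have to be POSTed."""
    for params in purge_batches(titles):
        data = urllib.parse.urlencode(params).encode("utf-8")
        with urllib.request.urlopen(api_url, data=data) as resp:
            json.load(resp)  # parsed only to confirm the reply is valid JSON
```

The batch size of 50 matches the usual per-request title limit for ordinary accounts. A call would look like `purge_all("https://wiki.example.org/w/api.php", titles_from_export(open("export.xml", encoding="utf-8").read()))`, with both the URL and the file name being placeholders.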