Manual talk:ImportDump.php

Problem with --uploads

I'm having absolutely no luck with this. I can import everything perfectly fine, but if I add --uploads, it fails with the following:

PHP Warning: XMLReader::open(): Unable to open source data in /www/docrootssl/wiki/includes/libs/XmlTypeCheck.php on line 137
PHP Fatal error: Call to undefined method FileRepoStatus::getXml() in /www/docrootssl/wiki/includes/Import.php on line 1602

Wildcard support for filename?

Does this script command support wildcards? For example, is there a syntax that would allow import of *.xml in a given directory? Tcrimsonk (talk) 02:41, 12 July 2016 (UTC)

No. --Nemo 05:38, 12 July 2016 (UTC)
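
Since the script itself takes no wildcard, one possible workaround is to expand the pattern outside the script and call it once per file. The following is only a minimal sketch in Python: the function name `build_import_commands`, the `dumps` directory, and the script path `maintenance/importDump.php` are assumptions for illustration, not something stated in this thread.

```python
import glob
import os


def build_import_commands(dump_dir, script="maintenance/importDump.php"):
    """Return one command line per *.xml file found in dump_dir.

    The script path is an assumption; adjust it to your installation.
    """
    pattern = os.path.join(dump_dir, "*.xml")
    # Sort so that the files are imported in a predictable order.
    return [["php", script, path] for path in sorted(glob.glob(pattern))]


if __name__ == "__main__":
    # "dumps" is a hypothetical directory name.
    for cmd in build_import_commands("dumps"):
        # Only print the commands here; pass each list to
        # subprocess.run(cmd, check=True) to actually run the import.
        print(" ".join(cmd))
```

Each file is imported by a separate run of the script, so a failure in one file does not silently affect the others.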

Import without namespace

Hi, I'm trying to migrate a wiki, but when I run importDump.php the articles don't end up in their correct namespaces. All of them go to Main. What can I do?


[EDIT] I had made a mistake in my LocalSettings that I couldn't find at first. I have now resolved it.

Mention that pages will be assigned to existing users

The manual should mention that pages will be assigned to existing users with the same name, unless an option such as "--username-prefix=Imported>" is used. Jidanni (talk) 14:16, 3 January 2020 (UTC)

Many options different nowadays

Looking at the current source file, many options appear to have changed. For example, --username-prefix is now described as 'Prefix for interwiki usernames', and there are perhaps lots of other changes. Jidanni (talk) 15:07, 3 January 2020 (UTC)

Parallelizing the import

As per request in the Note in this section:

I stumbled across this hint while already importing simplewiki-20220101-pages-articles-multistream.xml, which was taking quite some time. So I opened up to 8 terminals and ran additional instances of importDump.php using the --skipTo option with steps of 25000. Once an instance caught up to its successor's starting point, it sped up considerably, at which point I terminated that instance.

The gradual slowdown mentioned elsewhere still happens (any clues as to why?), so I repeated the above iteratively: when all but one of the initial instances were done (i.e., terminated with Ctrl-C), I partitioned what was left to process using smaller steps for --skipTo.

In the end, Special:Statistics shows 357866 Pages (but no Content pages?). Displaying random pages indicates that the import succeeded.

A second round using the newer simplewiki-20221120-pages-articles-multistream.xml indicates that terminating and restarting instances is not required. I started eight instances with steps of 50000 for the --skipTo option and just let them run. Sorry, I forgot to time the runs. After the import, Special:Statistics shows 420701 Pages.

Drjoe42 (talk) 08:50, 28 November 2022 (UTC)
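
The procedure described above could be scripted roughly as follows. This is a hedged sketch, not a tested recipe: the function name `build_parallel_commands` and the script path are assumptions, and the option is spelled `--skipTo` as in this thread, which may differ between MediaWiki versions. The first instance gets no --skipTo, and every further instance starts one step later.

```python
def build_parallel_commands(dump_file, instances, step,
                            script="maintenance/importDump.php"):
    """Build one importDump.php command line per parallel instance.

    Instance i starts at page offset i * step. The script path and the
    spelling of the --skipTo option are assumptions; check your version.
    """
    if instances < 1 or step < 1:
        raise ValueError("instances and step must be positive")
    commands = []
    for i in range(instances):
        offset = i * step
        cmd = ["php", script]
        if offset > 0:
            # The first instance starts at the beginning of the dump.
            cmd.append(f"--skipTo={offset}")
        cmd.append(dump_file)
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Eight instances with steps of 50000, as in the second round above.
    dump = "simplewiki-20221120-pages-articles-multistream.xml"
    for cmd in build_parallel_commands(dump, 8, 50000):
        # Only print here; each list could be handed to subprocess.Popen
        # to start the instances side by side.
        print(" ".join(cmd))
```

Note that nothing in this sketch stops an instance when it reaches the starting point of the next one. As reported above, the instances can either be terminated by hand at that point or simply left running.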