Topic on Project:Support desk

Import failed: Expected <mediawiki> tag, got

86.48.98.98 (talkcontribs)

Hi there

I'm in the middle of a migration from ScrewTurn to MediaWiki, but the bulk importer doesn't seem to want to play nice.


My MediaWiki is 1.34.2, PHP is 7.4.7, and IIS is 10.


I have wasted a lot of time with the import function, thinking that it was the syntax or other issues carried over from the export of Screwturn.

I used this converter btw: https://github.com/Cyberitas/ScrewturnToMediawiki


In my latest test, I exported 2 working pages from MediaWiki, which I had created myself, put them in a bz2 archive and tried importing them into the same wiki. I get the error in the topic title when doing so as well, which leaves me baffled.


These are the php errors:

[06-Jul-2020 12:50:54 Europe/Copenhagen] PHP Warning:  XMLReader::read(): uploadsource://0186662cb5a76f3d3d244da07a4070fe:1: parser error : Document is empty in C:\inetpub\Wiki\includes\import\WikiImporter.php on line 570

[06-Jul-2020 12:50:54 Europe/Copenhagen] PHP Warning:  XMLReader::read(): BZh91AY&SY�=�� in C:\inetpub\Wiki\includes\import\WikiImporter.php on line 570

[06-Jul-2020 12:50:54 Europe/Copenhagen] PHP Warning:  XMLReader::read(): ^ in C:\inetpub\Wiki\includes\import\WikiImporter.php on line 570

[06-Jul-2020 13:01:47 Europe/Copenhagen] PHP Warning:  XMLReader::read(): uploadsource://40dc1c20a94f2ca42e01a959344381ae:1: parser error : Document is empty in C:\inetpub\Wiki\includes\import\WikiImporter.php on line 570

[06-Jul-2020 13:01:47 Europe/Copenhagen] PHP Warning:  XMLReader::read(): BZh91AY&SY�=�� in C:\inetpub\Wiki\includes\import\WikiImporter.php on line 570

[06-Jul-2020 13:01:47 Europe/Copenhagen] PHP Warning:  XMLReader::read(): ^ in C:\inetpub\Wiki\includes\import\WikiImporter.php on line 570


Please let me know where to find other useful logs, thank you for your time!

Taavi (talkcontribs)

Are you using importDump.php or Special:Import? If importDump.php, what exact command are you using to import the file?

86.48.98.98 (talkcontribs)

Hi Majavah


Thanks, good point. I was blinded by getting the same error message from Special:Import, so I hadn't tested the MediaWiki export on the command line. It worked there.


Here are my tests with only a few converted xml files in the tar.bz2:

c:\Program Files\PHP\v7.4.7>php C:\inetpub\Wiki\maintenance\importDump.php --conf C:\inetpub\Wiki\LocalSettings.php C:\Users\hj_adm\Desktop\Desktop.tar.bz2

Warning: XMLReader::read(): uploadsource://0cc83551072289c1540745a8ff28e526:1: parser error : Document is empty in C:\inetpub\Wiki\includes\import\WikiImporter.php on line 570

Warning: XMLReader::read(): agents.xml in C:\inetpub\Wiki\includes\import\WikiImporter.php on line 570

Warning: XMLReader::read(): ^ in C:\inetpub\Wiki\includes\import\WikiImporter.php on line 570

MWException from line 574 of C:\inetpub\Wiki\includes\import\WikiImporter.php: Expected <mediawiki> tag, got

#0 C:\inetpub\Wiki\maintenance\importDump.php(359): WikiImporter->doImport()

#1 C:\inetpub\Wiki\maintenance\importDump.php(292): BackupReader->importFromHandle(Resource id #150)

#2 C:\inetpub\Wiki\maintenance\importDump.php(127): BackupReader->importFromFile('compress.bzip2:...')

#3 C:\inetpub\Wiki\maintenance\doMaintenance.php(99): BackupReader->execute()

#4 C:\inetpub\Wiki\maintenance\importDump.php(364): require_once('C:\\inetpub\\Wiki...')

#5 {main}


They all work individually, the ones I have checked at least.

Taavi (talkcontribs)

I don't think the script can read bzipped files. Try giving it the path to the raw XML file.

86.48.98.98 (talkcontribs)

Hi again, and thank you for your time!


It worked. So why would the official documentation point to an archive? :D

php importDump.php --conf ../LocalSettings.php /path_to/dumpfile.xml.gz


Well, since that doesn't help me, do you have any pointers on importing multiple XML files? :)

Taavi (talkcontribs)

The documentation states "If the file is compressed and that has a .gz or .bz2 file extension, it is decompressed automatically.". My guess would be that it only handles plain .bz2 files, not .tar.bz2 archives. I'll do some digging and update the documentation.
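A quick way to test that guess is to look at what the .bz2 file decompresses to. The Python sketch below is only an illustration (the function name `classify_bz2_payload` is made up here, not part of MediaWiki): it reads the first decompressed block and checks for the "ustar" magic that tar headers written by common tools carry at byte offset 257. A tar header also begins with the member's file name, which would explain why the parser error above shows "agents.xml" where the XML was expected.

```python
import bz2


def classify_bz2_payload(path):
    """Return 'tar', 'xml', or 'unknown' for the decompressed start of a .bz2 file."""
    with bz2.open(path, "rb") as fh:
        head = fh.read(512)  # one tar block is 512 bytes

    # ustar/GNU tar headers carry the magic "ustar" at byte offset 257.
    if len(head) >= 262 and head[257:262] == b"ustar":
        return "tar"

    # Raw XML starts with "<", possibly after a UTF-8 BOM or whitespace.
    if head.lstrip(b"\xef\xbb\xbf \t\r\n").startswith(b"<"):
        return "xml"

    return "unknown"
```

If this reports "tar" for a file, the decompressed stream is a tar archive rather than an XML dump, so the importer would be handed a tar header instead of a <mediawiki> element.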

86.48.98.98 (talkcontribs)

I tried both with archives made in 7-Zip (first tar, then bz2 from the tar) and with one created directly as bz2 on Ubuntu via the tar tool.
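Both of those routes produce a tar archive inside the bz2 stream, so the XML files have to be unpacked before they can be imported one by one. A minimal Python sketch of that step, assuming the archive is a regular .tar.bz2 (the helper name `extract_xml_members` is hypothetical):

```python
import os
import tarfile


def extract_xml_members(archive_path, out_dir):
    """Extract the regular .xml members of a .tar.bz2 into out_dir; return written paths."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with tarfile.open(archive_path, "r:bz2") as tar:
        for member in tar:
            # Skip directories, links, and anything that is not an XML file.
            if not member.isfile() or not member.name.lower().endswith(".xml"):
                continue
            # Keep only the base name so a member path cannot escape out_dir.
            target = os.path.join(out_dir, os.path.basename(member.name))
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                dst.write(src.read())
            written.append(target)
    return written
```

Flattening to base names means two members with the same file name in different folders would overwrite each other; that is fine for a flat archive like the one described here, but worth keeping in mind otherwise.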


Guess I'll just build a long list of single import commands and get it over with ... =)

Thank you so much for your time
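For anyone in the same spot, the list of single import commands can be generated rather than typed. The Python sketch below assumes a folder of already-extracted .xml files and reuses the `php importDump.php --conf ... file.xml` form shown earlier in this thread; the function names and the idea of collecting failures are illustrative, not something MediaWiki provides.

```python
import subprocess
from pathlib import Path


def build_import_commands(xml_dir, import_script, conf, php="php"):
    """Return one importDump.php argument list per .xml file in xml_dir, sorted by name."""
    commands = []
    for xml_file in sorted(Path(xml_dir).glob("*.xml")):
        commands.append([php, str(import_script), "--conf", str(conf), str(xml_file)])
    return commands


def run_imports(commands):
    """Run each command in turn; return the XML paths whose import exited non-zero."""
    failed = []
    for argv in commands:
        result = subprocess.run(argv)
        if result.returncode != 0:
            failed.append(argv[-1])
    return failed
```

Calling `run_imports(build_import_commands(xml_dir, r"C:\inetpub\Wiki\maintenance\importDump.php", r"C:\inetpub\Wiki\LocalSettings.php"))` with your own `xml_dir` would then walk through every file and report which ones need a second look.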

Bawolff (talkcontribs)

It probably also depends on whether your version of PHP is compiled with bz2 support (I would guess).
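One way to check this is `php -m`, which prints the modules loaded by the command-line PHP. The Python sketch below is just an illustration with made-up helper names; note that the web server's PHP may load a different php.ini than the CLI, so the two can disagree.

```python
import subprocess


def module_listed(php_m_output, module):
    """Return True if `module` appears as its own line in `php -m` output."""
    names = {line.strip().lower() for line in php_m_output.splitlines()}
    return module.lower() in names


def php_has_module(module, php="php"):
    """Run `php -m` and report whether the given module is loaded."""
    result = subprocess.run([php, "-m"], capture_output=True, text=True, check=True)
    return module_listed(result.stdout, module)
```

`php_has_module("bz2")` would then answer the question for whichever `php` binary is first on the PATH.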

86.48.98.98 (talkcontribs)

Hi Bawolff

It is an enabled extension on my PHP, so no worries there :)

I ended up importing one page/XML file at a time, which came to about 300 one-liners, but we're up and running!

Thanks for the quick support, you guys are amazing!