Manual talk:Importing XML dumps
link tables?
(regarding mwdumper import) I want to avoid the expensive rebuildall.php script. Looking at download:enwiki/20080724/, I'm wondering - should we import ALL of the SQL dump files, or are there any that should be skipped? --JaGa 00:50, 23 August 2008 (UTC)
- OK, I went through maintenance/tables.sql and compared what importDump.php populates with what mwdumper populates (only the page, revision, and text tables), so I'm thinking this is the list of SQL dumps I'll want after mwdumper finishes:
- category
- categorylinks
- externallinks
- imagelinks
- pagelinks
- redirect
- templatelinks
- Thoughts? --JaGa 07:04, 24 August 2008 (UTC)
When I try to import using this command: C:\Program Files\xampp\htdocs\mediawiki-1.13.2\maintenance>"C:\Program Files\xampp\php\php.exe" importDump.php C:\Users\Matthew\Downloads\enwiki-20080524-pages-articles.xml.bz2
It fails with this error: XML import parse failure at line 1, col 1 (byte 0; "BZh91AY&SYö┌║O☺Ä"): Empty document
What do you think is wrong?
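A possible explanation, offered as a guess: "BZh91AY&SY" is the signature at the start of a bzip2 file, so the XML parser was probably handed the still-compressed dump (see the .bz2 thread further down this page). A minimal sketch of checking for that signature and decompressing the dump before passing it to importDump.php follows; the function names are made up for illustration and nothing here is part of MediaWiki.

```python
import bz2
import gzip
import shutil

# Leading "magic" bytes that identify each compressed format.
BZ2_MAGIC = b"BZh"
GZIP_MAGIC = b"\x1f\x8b"


def detect_compression(path):
    """Return 'bz2', 'gzip', or None based on the first bytes of the file."""
    with open(path, "rb") as f:
        head = f.read(3)
    if head.startswith(BZ2_MAGIC):
        return "bz2"
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    return None


def decompress_to(src, dst):
    """Stream src into dst as plain XML, decompressing it if needed."""
    kind = detect_compression(src)
    # Fall back to the built-in open() when the file is not compressed.
    opener = {"bz2": bz2.open, "gzip": gzip.open}.get(kind, open)
    with opener(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return kind
```

The decompressed file can then be given to importDump.php in place of the .bz2 file. Note that a decompressed enwiki dump is many times larger than the compressed one, so check free disk space first.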
table prefix
I have a set of wikis with a different table prefix for each of them. How do I tell importDump.php which wiki to use?
- Set $wgDBprefix in AdminSettings.php —Emufarmers(T|C) 11:10, 25 February 2009 (UTC)
Importing multiple dumps into same database?
If we try to import multiple dumps into the same database, what happens?
Will it work this way?
For example, if there are two articles with the same title in both databases, what will happen?
Is it possible to import both of them into the same database and distinguish titles with prefixes?
Merging with an existing wiki
How do I merge the dumps with another wiki I've created without overwriting existing pages/articles?
.bz2 files decompressed automatically by importDump.php?
It seems only .gz files, not .bz2, are decompressed on the fly. --Apoc2400 22:40, 18 June 2009 (UTC)
- Filed as bug 19289. —Emufarmers(T|C) 05:15, 19 June 2009 (UTC)
Add the following line to the importFromFile function:
if( preg_match( '/\.bz2$/', $filename ) ) { $filename = 'compress.bzip2://' . $filename; }
Having trouble with importing XML dumps into database
I have been trying to upload one of the latest dumps, pages-articles.xml.bz2 from download:enwiki/20090604/. I don't want the front end and the other things that come with a MediaWiki installation, so I thought I would just create the database and upload the dump. I tried using mwdumper, but it breaks with the error described in bugzilla:18328. I also tried using mwimport, which failed with the same problem. Does anyone have any suggestions for importing the dump into the database successfully?
Thanks Srini
Error Importing XML Files
A colleague has exported Wikipedia help contents and ran into an error when attempting to import them. One of the errors had to do with Template:Seealso. The XML that is produced has a tag <redirect /> which causes the Import.php module to error out. If I remove the line from the XML, it imports just fine. We are using 1.14.0. Any thoughts?
- I am using 1.15, and I get the following errors:
- Warning: xml_parse() [function.xml-parse]: Unable to call handler in_() in /home/content/*/h/s/*hscentral/html/w/includes/Import.php on line 437
- Warning: xml_parse() [function.xml-parse]: Unable to call handler out_() in /home/content/*/h/s/*hscentral/html/w/includes/Import.php on line 437
- By analyzing which entries kill the script, I found that the cause is protected redirects: these errors appear when a page has both the <redirect /> and the <restrictions></restrictions> lines. Manually removing the restrictions line makes it work. I get these errors both from importDump.php and in my browser window on Special:Import when there is a protected redirect in the file. 76.244.158.243 02:55, 30 September 2009 (UTC)
Simply download the updated Import.php from here: http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/Import.php?view=co and replace the original file in the /includes directory. Works fine!
- xml2sql has the same problem:
xml2sql-0.5/xml2sql -mv commonswiki-latest-pages-articles.xml
unexpected element <redirect>
xml2sql-0.5/xml2sql: parsing aborted at line 10785 pos 16.
212.55.212.99 12:22, 13 February 2010 (UTC)
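For anyone who cannot replace Import.php, the manual workaround reported above (removing the restrictions line from pages that are also redirects) could be scripted. The following is only a sketch under assumptions: the function name is invented, the whole dump is parsed in memory (so it suits small exports, not a full Wikipedia dump), and it has not been tested against any particular MediaWiki version.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def drop_restrictions_on_redirects(xml_text):
    """Remove <restrictions> from every <page> that also has <redirect>.

    Returns the rewritten XML and the number of elements removed.
    """
    root = ET.fromstring(xml_text)
    # Keep the dump's default namespace unprefixed when serialising.
    # Note: register_namespace changes a process-wide setting.
    if root.tag.startswith("{"):
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])
    removed = 0
    # Collect the pages first so the tree is not modified while iterating.
    pages = [el for el in root.iter() if local_name(el.tag) == "page"]
    for page in pages:
        children = list(page)
        if not any(local_name(c.tag) == "redirect" for c in children):
            continue
        for child in children:
            if local_name(child.tag) == "restrictions":
                page.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

Be aware that the imported redirects lose their protection settings this way, so they would need to be protected again by hand after the import.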
Error message
The error message I get is "Import failed: Loss of session data. Please try again." Ikip 02:50, 27 December 2009 (UTC)
Fix: I got this error while trying to upload a 10 MB file. After cutting it down into 3.5 MB pieces, each individual file received "The file is bigger than the allowed upload size." error messages. 1.8 MB files worked though. --bhandy 19:24, 16 March 2011 (UTC)
- THANK YOU! This was driving me mad! LOL But your fix worked. ;) Zasurus 13:00, 5 September 2011 (UTC)
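Cutting a large export into pieces by hand is tedious, so here is a rough sketch of doing it with a script. It assumes the export fits in memory and that each piece only needs the root element, the non-page elements (such as siteinfo) and whole page elements; split_dump and max_bytes are names made up for this example. The actual limit depends on the server configuration (the PHP default for upload_max_filesize is 2M, which would fit the 1.8 MB result above, but that is a guess about that particular setup).

```python
import xml.etree.ElementTree as ET


def split_dump(xml_text, max_bytes):
    """Split a dump into several smaller dumps, each holding whole pages."""
    root = ET.fromstring(xml_text)
    # Keep the dump's default namespace unprefixed when serialising.
    if root.tag.startswith("{"):
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])

    pages, others = [], []
    for el in root:
        is_page = el.tag.rsplit("}", 1)[-1] == "page"
        (pages if is_page else others).append(el)

    def build(batch):
        # Every piece repeats the root attributes and the non-page elements.
        piece = ET.Element(root.tag, root.attrib)
        piece.extend(others + batch)
        return ET.tostring(piece, encoding="unicode")

    pieces, batch = [], []
    for page in pages:
        # Re-serialising on every step is slow but keeps the sketch simple.
        if batch and len(build(batch + [page]).encode("utf-8")) > max_bytes:
            pieces.append(build(batch))
            batch = []
        batch.append(page)
    if batch:
        pieces.append(build(batch))
    return pieces
```

A single page that is larger than max_bytes on its own still ends up in an oversized piece; the sketch does not try to split individual pages.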
Does NOT allow importing of modified data on my installation
If I export a dump of the current version using dumpBackup.php --current, then make changes to that dumped file, then attempt to import the changed file back into the system using importDump.php, NONE of the changes come through, even after running rebuildall.php.
Running MW 1.15.1, SemanticMediaWiki 1.4.3.
Am I doing something wrong, or is there a serious bug that I need to report? --Fungiblename 14:09, 13 April 2010 (UTC)
And for the necro-bump.... yes, I was doing something wrong.
For anyone else who has run into this problem, you need to delete revision IDs from your XML page dumps if you want to re-import the XML after modifying it. Sorry for not posting this earlier, but this issue was addressed almost instantly as invalid in response to an admittedly invalid bug report that I filed on Bugzilla in 2010: This is exactly how it's supposed to work to keep you from overwriting revisions via XML imports. --Fungiblename 07:57, 21 September 2011 (UTC)
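A minimal sketch of stripping the revision IDs from a modified dump before re-importing it. The assumptions: the dump is small enough to parse in memory, only the id element directly inside each revision needs to go (page and contributor ids are left alone), and the function name is invented for this example. Whether anything else in the XML needs adjusting for a given MediaWiki version has not been checked here.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def strip_revision_ids(xml_text):
    """Remove the <id> child of every <revision>.

    Returns the rewritten XML and the number of ids removed.
    """
    root = ET.fromstring(xml_text)
    # Keep the dump's default namespace unprefixed when serialising.
    if root.tag.startswith("{"):
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])
    removed = 0
    revisions = [el for el in root.iter() if local_name(el.tag) == "revision"]
    for rev in revisions:
        # Only direct children are checked, so <contributor><id> survives.
        for child in list(rev):
            if local_name(child.tag) == "id":
                rev.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

The returned XML can be written to a new file and passed to importDump.php in place of the edited dump.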
Error message: PHP Warning: Parameter 3 to parseForum
Two errors:
PHP Deprecated: Comments starting with '#' are deprecated in /etc/php5/cli/conf.d/imagick.ini on line 1 in Unknown on line 0
PHP Warning: Parameter 3 to parseForum() expected to be a reference, value given in /home/t/public_html/deadwiki.com/public/includes/parser/Parser.php on line 3243
100 (30.59 pages/sec 118.68 revs/sec)
Adamtheclown 05:11, 30 November 2010 (UTC)
XML that does NOT come from a wiki dump
Can this feature be used on an xml file that was not created as, or by, a wiki dump? I am looking for a way to import a lot of text documents at once, that can be wikified later. Your advice, wisdom, insight, etc, greatly appreciated.
- No. XML is a structure, not a format, so the MediaWiki XML reader only accepts MediaWiki XML dumps or similarly formatted XML.
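Following on from that answer, one way to get plain text documents in would be to wrap them in dump-style XML first. The sketch below is an assumption-laden illustration, not a description of what the importer requires: the set of elements (mediawiki, page, title, revision, text) is the bare minimum seen in real dumps, which also carry a namespace declaration, siteinfo, timestamps and contributor data, and a given MediaWiki version may insist on some of those. build_import_xml is a made-up name.

```python
from xml.sax.saxutils import escape


def build_import_xml(documents):
    """Wrap {title: text} pairs in a minimal MediaWiki-style page structure."""
    parts = ["<mediawiki>"]
    for title, text in documents.items():
        # escape() protects &, < and > so arbitrary text stays well-formed.
        parts.append(
            "  <page>\n"
            "    <title>%s</title>\n"
            "    <revision>\n"
            '      <text xml:space="preserve">%s</text>\n'
            "    </revision>\n"
            "  </page>" % (escape(title), escape(text))
        )
    parts.append("</mediawiki>")
    return "\n".join(parts)
```

It would be worth trying the result with a single test page on a scratch wiki before importing a large batch.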