Manual talk:MWDumper

GFDL
From http://mail.wikipedia.org/pipermail/wikitech-l/2006-February/033975.html:


 * I hereby declare it GFDL and RTFM-compatible. :) -- brion vibber


 * So this article, which started as the README file from MWDumper, is allowed on the wiki. This might be good, as I tend to read wikis more than I read READMEs! --Kernigh 04:53, 12 February 2006 (UTC)

Example (in)correct?
Is the -d parameter described correctly in this example?
 * java -jar mwdumper.jar --format=sql:1.5 pages_full.xml.bz2 | mysql -u -p
 * My mysql tells me
 * -p, --password[=name] Password to use when connecting to server ...
 * -D, --database=name Database to use.
 * (mysql Ver 14.7 Distrib 4.1.15, for pc-linux-gnu (i486) using readline 5.1)
 * Would this be better?
 * java -jar mwdumper.jar --format=sql:1.5 pages_full.xml.bz2 | mysql -u -D -p
 * (If the password is given on the command line, there must be no space between -p and the actual password.)
 * Or, if the password is omitted, it is requested interactively:
 * java -jar mwdumper.jar --format=sql:1.5 pages_full.xml.bz2 | mysql -u -D -p

MWDumper error
Running WinXP, XAMPP, JRE 1.5.0_08, MySQL JDBC 3.1.13

http://f.foto.radikal.ru/0610/4d1d041f3fd7.png --89.178.61.174 22:09, 9 October 2006 (UTC)

MWDumper Issues
Using MWDumper, how would I convert a Wikipedia/Wikibooks XML dump to an SQL file?

Problems with MWDumper
When I run: java -jar mwdumper.jar -–format=sql:1.5 enwiki-latest-pages-articles.xml.bz2 | c:\wamp\mysql\bin\mysql -u wikiuser -p wikidb

I get:

 Exception in thread "main" java.io.FileNotFoundException: -ûformat=sql:1.5 (The system cannot find the file specified)
     at java.io.FileInputStream.open(Native Method)
     at java.io.FileInputStream.<init>(Unknown Source)
     at java.io.FileInputStream.<init>(Unknown Source)
     at org.mediawiki.dumper.Tools.openInputFile(Unknown Source)
     at org.mediawiki.dumper.Dumper.main(Unknown Source)

Please help!

SOLUTION:
For the above problem here is the fix:

java -jar mwdumper.jar --format=sql:1.5 enwiki-latest-pages-articles.xml.bz2 | c:\wamp\mysql\bin\mysql -u wikiuser -p wikidb

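Notice "-ûformat=sql:1.5" in the error message? One of the two hyphens in front of "format" is actually a different character, an en dash ("–"), introduced by copy and paste. Because the argument no longer starts with "--", MWDumper treats it as a file name. Retype both hyphens by hand so that the option reads --format=sql:1.5.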

P.S. For a much faster run (60 min vs. 24 hrs), decompress enwiki-latest-pages-articles.xml.bz2 first so that it becomes enwiki-latest-pages-articles.xml, then use the command: java -jar mwdumper.jar --format=sql:1.5 enwiki-latest-pages-articles.xml | c:\wamp\mysql\bin\mysql -u wikiuser -p wikidb
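
A pasted command can be checked for stray non-ASCII characters such as the en dash before it is run. The following is only a minimal sketch, assuming a POSIX shell with grep is available (for example on Linux or under Cygwin); the function name has_non_ascii is hypothetical and not part of MWDumper.

```shell
# Hypothetical helper: succeeds (exit status 0) if the argument contains
# any byte outside printable ASCII, e.g. an en dash pasted in place of "-".
has_non_ascii() {
  # LC_ALL=C makes the bracket range byte-based, so each byte of a
  # multi-byte UTF-8 character falls outside the range from space to "~".
  printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}
```

For example, has_non_ascii "$option" && echo "retype the dashes by hand" would print the warning only when the option text contains such a character. Note that this sketch also flags tabs and other control characters, which is usually harmless for a single command-line option.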

Page Limitations?
I'm attempting to import a Wikipedia database dump comprised of about 4,800,000 files on a Windows XP system. I'm using the following command: java -jar mwdumper.jar --format=sql:1.5 enwiki-20070402-pages-articles.xml | mysql -u root -p wikidb

Everything appears to go smoothly, and the progress indicator goes up to the expected 4 million and something, but only 432,000 pages are actually imported into the MySQL database. Why is this? Any assistance is greatly appreciated. Uiop 02:31, 15 April 2007 (UTC)


 * MySQL experienced some error, and the error message scrolled off your screen. To aid in debugging, either save the output from mysql's stderr stream, or run mwdumper to a file first, etc. --brion 21:15, 20 April 2007 (UTC)
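
The two suggestions in the reply above (write the SQL to a file first, and keep mysql's error output) could be combined roughly as follows. This is a sketch under assumptions, not a tested recipe: it assumes a POSIX shell, and the function name import_dump and the file names dump.sql and mysql-errors.log are hypothetical.

```shell
# Hypothetical helper: convert the XML dump to SQL first, then import it,
# saving whatever mysql writes to its stderr stream in a log file.
import_dump() {
  xml=$1   # e.g. enwiki-20070402-pages-articles.xml
  db=$2    # e.g. wikidb

  # Step 1: run mwdumper to a file instead of piping straight into mysql.
  java -jar mwdumper.jar --format=sql:1.5 "$xml" > dump.sql || return 1

  # Step 2: import the file; "2>" redirects mysql's stderr to the log,
  # so an error message cannot scroll off the screen.
  mysql -u root -p "$db" < dump.sql 2> mysql-errors.log
}
```

After running import_dump enwiki-20070402-pages-articles.xml wikidb, the file mysql-errors.log should contain the first error mysql reported, if there was one. How the password prompt behaves when stderr is redirected may depend on the platform.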