Extension:DumpHTML

DumpHTML is an extension for making a simple HTML dump of a MediaWiki installation, including images and media files. It was formerly written and distributed as a maintenance script, dumpHTML.php. You cannot run DumpHTML from a browser over the network, because there is no way to pass it parameters that way. Instead you need command-line access to the server and have to run it from a console (a Linux example is given under Parameters).

In MediaWiki 1.11.x, images (and therefore also equations) were broken; see bugs 12122 and 13061 for details. 1.11.x was the last branch to include dumpHTML.php as a maintenance script. For later releases the functionality was moved to an extension, available at http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/DumpHTML/.

Parameters
Example to create a complete snapshot, including image and media files and image thumbnail files, in the directory wikidump (Linux):

/usr/bin/php /srv/www/mediawiki/extension/DumpHTML/dumpHTML.php -d /srv/www/mediawiki/wikidump -k monobook --image-snapshot --force-copy
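The invocation above can be kept in a small wrapper script so that the paths and the skin are set in one place. The following is only a sketch: the variable names and the dump_command function are hypothetical, the paths simply mirror the example and must be adjusted to your installation, and only the options already shown in the example are used. It prints the command line for review rather than running it.

```shell
#!/bin/sh
# Sketch only: paths mirror the example above and are assumptions,
# not values that DumpHTML requires.
PHP_BIN=/usr/bin/php
WIKI_ROOT=/srv/www/mediawiki
DUMP_SCRIPT="$WIKI_ROOT/extension/DumpHTML/dumpHTML.php"
DUMP_DIR="$WIKI_ROOT/wikidump"
SKIN=monobook

# Print the full command line instead of executing it,
# so it can be checked before a long dump is started.
dump_command() {
    printf '%s %s -d %s -k %s --image-snapshot --force-copy\n' \
        "$PHP_BIN" "$DUMP_SCRIPT" "$DUMP_DIR" "$SKIN"
}

dump_command
```

Once the printed line looks right, it can be pasted into the console. The sketch assumes that none of the paths contain spaces.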

If you intend to use the wikidump on a CD/DVD-ROM or on a Windows filesystem, and your wiki page or file names contain non-ASCII characters, which is likely, you will probably need to convert the directory and file names from UTF-8 (on Linux) to the character encoding used by your Windows system, for example code page 1252 for Western European systems. A useful Linux tool for this task is convmv, hosted at http://freshmeat.net/projects/convmv/. Be aware that even after the file names have been converted, older browsers might still have problems accessing pages with non-ASCII characters.

As a long-term solution for interoperability and maximum compatibility across different filesystems, DumpHTML is likely to be changed in the future to create wiki snapshots and links with ISO 9660 or hashed filenames that use only ASCII characters.

For example, to convert the names in the dump directory from UTF-8 to code page 1252 with convmv:

convmv -f utf-8 -t cp1252 --notest -r /srv/www/mediawiki/wikidump
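Before renaming anything, it can help to see which paths in the dump actually contain non-ASCII bytes. The following is a minimal sketch and not part of DumpHTML or convmv; the function names has_non_ascii and list_non_ascii are hypothetical, and paths containing newlines are not handled.

```shell
#!/bin/sh
# Succeeds if the given string contains a byte outside 7-bit ASCII.
# tr deletes every ASCII byte; anything left over is non-ASCII.
has_non_ascii() {
    [ -n "$(printf '%s' "$1" | LC_ALL=C tr -d '\001-\177')" ]
}

# Print every path below the given directory whose name contains
# non-ASCII bytes and would therefore be affected by a conversion.
list_non_ascii() {
    find "$1" | while IFS= read -r path; do
        if has_non_ascii "$path"; then
            printf '%s\n' "$path"
        fi
    done
}

# Usage (path taken from the example above):
#   list_non_ascii /srv/www/mediawiki/wikidump
```

If the list is empty, no conversion is needed. convmv itself only previews the renames unless --notest is given, so running the command above without --notest is another way to check what would change.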