Manual:Importing external content

From MediaWiki.org

Migrating an existing site to a wiki structure is always a tough task. The process of wikifying existing content from text files, HTML websites, or even office documents can be automated, but you will have to write appropriate scripts on your own. As far as we know, there are no general-purpose, ready-to-run scripts available for this. Unlike commercial content management systems such as Hyperwave or HTML editors such as Microsoft FrontPage, MediaWiki and other open source wiki software do not include import filters. There are some exceptions, which are discussed in the following sections.

Wikis

Converting content from a UseMod Wiki

Prior to MediaWiki (Wikipedia Software Phase III and Phase II), Wikipedia ran on the UseMod Wiki software written by Clifford Adams. UseModWiki is a Perl script which uses a database of text files to generate a WikiWiki site. It usually runs as a CGI script in response to web requests, but can be called directly by other Perl programs.

The storage format of UseMod Wiki is well documented.

Converting content from a PHPWiki

If you only have a few pages to convert, and the content isn't sensitive, you might want to try WebForce's online markup converter.

For larger PHPWikis, Isaac Wilcox has written a Perl script to do the conversion. It converts all the commonly used markup (still not 100% of markup, but most PHPWikis will only need minor tweaks after conversion; patches are welcome). It was written for the MediaWiki 1.4.x database schema, though updating it to handle 1.5.x should be fairly easy (again, patches welcome).

The above script works well, but the schema has changed quite a bit since it was written. One contributor found it easier to install the last stable 1.4.x version, import the data, and then upgrade MediaWiki. The script did an excellent job of preserving almost all of the formatting.

Also see PhpWiki conversion for a solution that uses "sed".

Another solution (combination of already mentioned ones): User:Atrox/Phpwiki2Mediawiki.

Converting JSPWiki format to MediaWiki format

You can use jspwiki2mediawiki.pl to convert JSPWiki pages to MediaWiki format.
The tool is based on php2mediawiki by Isaac Wilcox, with modifications added to support the JSPWiki format.

Converting TracWiki format to MediaWiki format

You can use tracwiki2mediawiki.pl to convert TracWiki pages to MediaWiki format.
The tool is based on php2mediawiki by Isaac Wilcox, with modifications added to support the TracWiki format.

Converting MoinMoin format to MediaWiki format

There are various scripts for this, none of them fully reliable. See MoinMoin.

Converting WackoWiki to MediaWiki

There is a WackoWiki converter (developed for the migration of http://freesource.info/ to http://altlinux.org/); however, it will need additional tweaking before use.

Converting TikiWiki format to MediaWiki format

You can convert TikiWiki pages to MediaWiki format using this script.

Converting content from a CSV text file

If you are using Windows, you can try csv2other. It produces an output file with a .txt extension containing the markup for a wiki table.
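On other platforms, the same kind of conversion can be scripted by hand. The following is a minimal Python sketch, not the csv2other implementation: the function name `csv_to_wikitable` and the `has_header` parameter are hypothetical, and the sketch assumes that no cell contains a pipe character, which would need escaping in real wikitext.

```python
import csv
import io


def csv_to_wikitable(csv_text, has_header=True):
    """Convert CSV text into MediaWiki table markup (illustrative sketch)."""
    # Skip completely empty lines so they do not produce empty table rows.
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    lines = ['{| class="wikitable"']
    for index, row in enumerate(rows):
        lines.append("|-")
        # Header cells start with "!", ordinary data cells with "|".
        marker = "!" if has_header and index == 0 else "|"
        for cell in row:
            lines.append(f"{marker} {cell.strip()}")
    lines.append("|}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_wikitable("Name,Age\nAda,36\n"))
```

The result can be pasted into the edit box of a wiki page or written to a text file for a later import.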

Converting content from HTML text file

If you have only an HTML excerpt or a few pages to convert, you might want to try Diberri's html2wiki converter, which uses the HTML::WikiConverter Perl module from CPAN. For a larger collection of files, it is probably better to use the module itself, either directly or with this package.

You could also try MwImporter, a PHP script for importing entire websites; it uses html2wiki and other MediaWiki maintenance scripts to import entire directories of static HTML and image files while preserving relative links, etc.

There are probably other HTML to Wiki markup converters.

See the section below for importing into MediaWiki.
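To give a rough idea of what such converters do, here is a minimal Python sketch that maps a handful of HTML tags to wikitext using only the standard library. It is not how html2wiki or HTML::WikiConverter work internally; the class name `HtmlToWikitext`, the function name `html_to_wikitext`, and the small tag table are assumptions made for illustration. Real pages with tables, nested lists, or images need one of the tools mentioned in this section.

```python
from html.parser import HTMLParser


class HtmlToWikitext(HTMLParser):
    """Translate a small subset of HTML tags into MediaWiki markup."""

    # Wikitext emitted when a tag opens and when it closes.
    OPEN = {"b": "'''", "strong": "'''", "i": "''", "em": "''",
            "h1": "= ", "h2": "== ", "h3": "=== ", "li": "* "}
    CLOSE = {"b": "'''", "strong": "'''", "i": "''", "em": "''",
             "h1": " =\n", "h2": " ==\n", "h3": " ===\n",
             "li": "\n", "p": "\n\n"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # External link syntax: [URL label]
            href = dict(attrs).get("href") or ""
            self.parts.append(f"[{href} ")
        else:
            # Unknown tags are dropped; their text content is kept.
            self.parts.append(self.OPEN.get(tag, ""))

    def handle_endtag(self, tag):
        if tag == "a":
            self.parts.append("]")
        else:
            self.parts.append(self.CLOSE.get(tag, ""))

    def handle_data(self, data):
        self.parts.append(data)


def html_to_wikitext(html):
    parser = HtmlToWikitext()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    print(html_to_wikitext("<h2>Title</h2><p>Some <b>bold</b> text.</p>"))
```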

Converting content from a MS-Word document

Word2MediaWikiPlus.

The Microsoft Office Word Add-in For MediaWiki saves documents from Microsoft Office Word straight into MediaWiki.

OpenOffice also does a good job of reading MS Word and a usable job of exporting as MediaWiki wikitext.

Converting content from other sources

If you are able and willing to do some scripting by yourself, it is possible to import almost any existing textual content with a documented file format into MediaWiki.

Example: CIA World Factbook 2002

As an example, public domain data from the CIA World Factbook 2002 was imported into Wikitravel, a MediaWiki-based site.

This is a one-time script; most paths and coding are hard-coded, and lots of the code is for parsing the CIA World Factbook print pages, but it might serve as a good example of what can be done.

Importing content in Windows PowerShell

Manual:Importing XML dumps describes various tools for importing XML dumps of wiki pages, including the Special:Import wiki page. In addition, this Windows PowerShell script creates a MediaWiki-compliant XML file from your wikitext and simulates an HTML form submission to Special:Import with the correct parameters.
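The first half of that approach, wrapping wikitext in an import file, can be sketched in a few lines of Python. This is not the PowerShell script referred to above. The function name `build_import_xml` is hypothetical, and the export schema version (0.10) is an assumption: check which version your wiki accepts. A real import may also require further elements per revision, such as a timestamp or contributor, which this sketch omits.

```python
import xml.etree.ElementTree as ET

# Assumption: schema version 0.10; adjust to what your MediaWiki version expects.
EXPORT_NS = "http://www.mediawiki.org/xml/export-0.10/"


def build_import_xml(pages):
    """Build a minimal import document from a {title: wikitext} mapping."""
    root = ET.Element(
        "mediawiki",
        {"xmlns": EXPORT_NS, "version": "0.10", "xml:lang": "en"},
    )
    for title, wikitext in pages.items():
        page = ET.SubElement(root, "page")
        ET.SubElement(page, "title").text = title
        revision = ET.SubElement(page, "revision")
        # xml:space="preserve" keeps line breaks in the wikitext intact.
        text = ET.SubElement(revision, "text", {"xml:space": "preserve"})
        text.text = wikitext  # special characters are escaped on output
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_import_xml({"Sandbox": "'''Hello''' & welcome"}))
```

The resulting string can be saved to a file and uploaded through Special:Import, or passed to one of the tools described in Manual:Importing XML dumps.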