Manual:Importing external content

From MediaWiki.org

Migrating an existing site to a wiki structure is always a tough task. The process of wikifying existing content from text files, HTML websites, or even office documents can be automated, but you will have to write the appropriate scripts yourself. As far as we know, there are no ready-to-run scripts for general users available for this purpose. Unlike commercial content management systems such as Hyperwave or HTML editors such as Microsoft FrontPage, MediaWiki and other open source WikiWiki software do not include import filters. There are some exceptions, which are discussed in the following sections.

Wikis

Converting content from a UseMod Wiki

Prior to MediaWiki (Wikipedia Software Phase III and Phase II), Wikipedia used the UseMod Wiki software written by Clifford Adams. UseModWiki is a Perl script that uses a database of text files to generate a WikiWiki site. Its primary access method is through CGI via the Web, but it can also be called directly by other Perl programs. To convert an existing UseMod Wiki site, use the script available in the /maintenance subdirectory of your MediaWiki source folder.

The storage format of UseMod Wiki is well documented: DataBase.

Converting content from a PHPWiki

If you only have a few pages to convert, and the content isn't sensitive, you might want to try WebForce's online markup converter.

For larger PHPWikis, Isaac Wilcox has written a Perl script to do the conversion. It converts all the commonly used markup, though not 100% of it; most PHPWikis will need only minor tweaks after conversion (patches are welcome). It was written for the MediaWiki 1.4.x database schema, though updating it to handle 1.5.x should be fairly easy (again, patches welcome).

The script works well, but the schema has changed quite a bit since it was written. One user found it easier to install the last stable 1.4.x version, import the data, and then upgrade MediaWiki; the script did an excellent job of preserving almost all of the formatting.

See also PhpWiki conversion for a solution that uses "sed".

Another solution (combination of already mentioned ones): User:Atrox/Phpwiki2Mediawiki.

Converting JSPWiki format to MediaWiki format

You can use jspwiki2mediawiki.pl to convert JSPWiki pages to MediaWiki format.
The tool is based on php2mediawiki by Isaac Wilcox, with modifications added to support the conversion of the JSPWiki format.

Converting TracWiki format to MediaWiki format

You can use tracwiki2mediawiki.pl to convert TracWiki pages to MediaWiki format.
The tool is based on php2mediawiki by Isaac Wilcox, with modifications added to support the conversion of the TracWiki format.

Converting MoinMoin format to MediaWiki format

Various scripts exist for this, but none of them is fully reliable. See MoinMoin.

Converting WackoWiki to MediaWiki

There is a WackoWiki converter (developed for the migration of http://freesource.info/ to http://altlinux.org/); however, it will need additional tweaking before use.

Converting TikiWiki format to MediaWiki format

You can convert TikiWiki pages to MediaWiki format using this script.

Converting content from a CSV text file

If you are using Windows, you can try csv2other. It produces an output file with a .txt extension containing the markup for a wiki table.
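On other platforms, the same kind of conversion can be sketched in a few lines of Python. This is only an illustration and does not describe how csv2other works: the function name csv_to_wikitable and its has_header parameter are assumptions made for this example, and cells that contain a pipe character are not escaped.

```python
import csv
import io


def csv_to_wikitable(csv_text, has_header=True):
    """Convert CSV text into MediaWiki table markup (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    lines = ['{| class="wikitable"']
    for index, row in enumerate(rows):
        # Every table row starts with a "|-" separator line.
        lines.append("|-")
        # Header cells start with "!", ordinary cells with "|".
        marker = "!" if has_header and index == 0 else "|"
        for cell in row:
            # Limitation: a "|" inside a cell would need escaping.
            lines.append(f"{marker} {cell.strip()}")
    lines.append("|}")
    return "\n".join(lines)


sample = "Country,Capital\nFrance,Paris\nJapan,Tokyo\n"
print(csv_to_wikitable(sample))
```

The printed markup can be pasted into the edit box of a wiki page. Writing one cell per line keeps the output easy to correct by hand afterwards.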

Converting content from an HTML text file

If you have only an HTML excerpt or a few pages to convert, you might want to try Diberri's html2wiki converter, which uses the HTML::WikiConverter Perl module from CPAN. For a larger collection of files, you should probably use the module itself.

You could also try MwImporter, a PHP script for importing entire websites; it uses html2wiki and other MediaWiki maintenance scripts to import entire directories of static HTML and image files while preserving relative links, etc.

There are probably other HTML to Wiki markup converters.
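To give an idea of what such a converter does, here is a minimal Python sketch that maps a handful of HTML tags to wiki markup. It is not the HTML::WikiConverter module or any of the tools mentioned above; the class MiniHtmlToWiki and the function html_to_wikitext are hypothetical names chosen for this example, and only headings, paragraphs, bold, italic, and links with an href attribute are handled.

```python
import re
from html.parser import HTMLParser


class MiniHtmlToWiki(HTMLParser):
    """Translate a tiny subset of HTML into MediaWiki markup (sketch only)."""

    # Inline tags use the same wiki marker when opening and closing.
    INLINE = {"b": "'''", "strong": "'''", "i": "''", "em": "''"}
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.parts = []        # output fragments, joined at the end
        self.link_stack = []   # True for each open <a> that had an href

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self.parts.append(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.parts.append("\n" + "=" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n")
        elif tag == "a":
            href = dict(attrs).get("href")
            self.link_stack.append(bool(href))
            if href:
                # Every link is written as an external link; mapping
                # relative links to wiki pages is left out of this sketch.
                self.parts.append("[" + href + " ")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.parts.append(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.parts.append(" " + "=" * int(tag[1]) + "\n")
        elif tag == "p":
            self.parts.append("\n")
        elif tag == "a":
            if self.link_stack and self.link_stack.pop():
                self.parts.append("]")

    def handle_data(self, data):
        # Collapse runs of whitespace, as a browser would when rendering.
        self.parts.append(re.sub(r"\s+", " ", data))


def html_to_wikitext(html):
    parser = MiniHtmlToWiki()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


page = ('<h2>Title</h2><p>Some <b>bold</b> and <i>italic</i> text '
        'with a <a href="http://example.org/">link</a>.</p>')
print(html_to_wikitext(page))
```

Real pages contain tables, lists, images, and nested markup that this sketch ignores, which is why a dedicated converter is the better choice for anything beyond a quick experiment.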

See the section below for importing into MediaWiki.

Converting content from an MS Word document

Try Word2MediaWikiPlus.

OpenOffice also does a good job of reading MS Word and a usable job of exporting as MediaWiki wikitext.

Converting content from other sources

If you are able and willing to do some scripting by yourself, it is possible to import almost any existing textual content with a documented file format into MediaWiki.

Importing content from PowerShell

If you also use your MediaWiki for internal documentation or as a team knowledge base, you might want to automate page creation from a shell. You can use a PowerShell script that works over HTTP, so it should work from most of your servers without firewall problems.

MediaWiki offers a special page where you can import pages. However, this is not a web service API; it is an HTML form with an XML file upload. You therefore have to create a MediaWiki-compliant XML file and simulate an HTML form submission with the correct parameters. You can find the script at slash4 powershell mediawiki script.
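The first half of that task, producing the XML file, can be sketched in Python as follows. The element names (mediawiki, page, title, revision, text) follow the layout of MediaWiki's XML export format, but this sketch makes assumptions: the function name build_import_xml is invented here, and a real import may require more than is shown (for example a namespace declaration, a timestamp, or contributor details), so compare the output with a file produced by your own wiki's export page before relying on it. The form submission itself is not covered.

```python
import xml.etree.ElementTree as ET


def build_import_xml(pages):
    """Build an import document from (title, wikitext) pairs (sketch only)."""
    root = ET.Element("mediawiki")
    for title, wikitext in pages:
        page = ET.SubElement(root, "page")
        ET.SubElement(page, "title").text = title
        # Each page carries at least one revision holding the page text.
        revision = ET.SubElement(page, "revision")
        ET.SubElement(revision, "text").text = wikitext
    # ElementTree escapes &, < and > in the text for us.
    return ET.tostring(root, encoding="unicode")


xml_text = build_import_xml([
    ("Server inventory", "== Hosts ==\n* web01\n* db01"),
    ("Backup procedure", "Run the nightly job & check the log."),
])
print(xml_text)
```

The resulting string can be saved to a file and uploaded through the import form, either by hand or from a script.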

Example: CIA World Factbook 2002

As an example, the public domain data from the CIA World Factbook 2002 was imported into Wikitravel, which runs on MediaWiki.

This is a one-time script; most paths and coding are hard-coded, and lots of the code is for parsing the CIA World Factbook print pages, but it might serve as a good example of what can be done.
