Manual:Importing external content
Migrating an existing site to a wiki structure is always a tough task. Wikifying existing content from text files, HTML websites, or even office documents can be automated, but you will have to write the appropriate scripts yourself. As far as we know, no ready-to-run scripts for general users are available for this purpose. Unlike commercial content management systems such as Hyperwave or HTML editors such as Microsoft FrontPage, MediaWiki and other open-source wiki software do not include import filters. The following sections discuss some exceptions.
- 1 Wikis
- 1.1 Converting content from a UseMod Wiki
- 1.2 Converting content from a PHPWiki
- 1.3 Converting JSPWiki format to MediaWiki format
- 1.4 Converting TracWiki format to MediaWiki format
- 1.5 Converting MoinMoin format to MediaWiki format
- 1.6 Converting WackoWiki to MediaWiki
- 1.7 Converting TikiWiki format to MediaWiki format
- 1.8 Converting GoogleCode Wiki to MediaWiki
- 2 Converting content from a CSV text file
- 3 Converting content from HTML
- 4 Converting content from a MS-Word document
- 5 Converting content from other sources
- 6 Importing content in Windows PowerShell
Wikis
Converting content from a UseMod Wiki
Before MediaWiki (Wikipedia software Phase III) and its predecessor (Phase II), Wikipedia ran on the UseModWiki software written by Clifford Adams. UseModWiki is a Perl script that uses a database of text files to generate a wiki site. It usually runs as a CGI script in response to web requests, but it can also be called directly by other Perl programs.
The storage format of UseMod Wiki is well documented.
Converting content from a PHPWiki
For larger PHPWikis, Isaac Wilcox has written a Perl script to do the conversion. It converts all the commonly used markup. Coverage is not yet 100%, but most PHPWikis will need only minor tweaks after conversion (patches are welcome). It is written for the MediaWiki 1.4.x database schema, though updating it to handle 1.5.x should be fairly easy (again, patches are welcome).
The script works well, but the schema has changed considerably since it was written. One user found it easier to install the last stable 1.4.x version, import the data, and then upgrade MediaWiki. In that case the script preserved almost all of the formatting.
Also see PhpWiki conversion for a solution that uses "sed".
Another solution (combination of already mentioned ones): User:Atrox/Phpwiki2Mediawiki.
Converting JSPWiki format to MediaWiki format
You can use jspwiki2mediawiki.pl to convert JSPWiki pages to MediaWiki format.
This tool is based on php2mediawiki by Isaac Wilcox, modified to support conversion of the JSPWiki format.
Converting TracWiki format to MediaWiki format
You can use tracwiki2mediawiki.pl to convert TracWiki pages to MediaWiki format.
This tool is based on php2mediawiki by Isaac Wilcox, modified to support conversion of the TracWiki format.
Converting MoinMoin format to MediaWiki format
Various scripts exist for this, but none of them is fully reliable. See MoinMoin.
Converting WackoWiki to MediaWiki
Converting TikiWiki format to MediaWiki format
Converting GoogleCode Wiki to MediaWiki
The tool allows pages to be stored in both formats, .gw (Google Code) and .mw (MediaWiki), and provides scripts for bidirectional transfer between SVN and MediaWiki.
Converting content from a CSV text file
If you are using Windows, you can try csv2other. It produces an output file with a .txt extension that contains the code for a wiki table.
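On other platforms, or when a custom conversion is needed, a short script can do a similar job. The following Python sketch is not csv2other and makes no claim about how that tool works; the function name `csv_to_wikitable` is hypothetical. It assumes that the first CSV row holds the column headers and that cells contain no pipe characters or line breaks, which would need escaping.

```python
import csv
import io


def csv_to_wikitable(csv_text, header=True):
    """Convert CSV text to MediaWiki table markup (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    lines = ['{| class="wikitable"']
    for index, row in enumerate(rows):
        lines.append("|-")  # start a new table row
        # Header cells start with "!", ordinary data cells with "|".
        marker = "!" if header and index == 0 else "|"
        for cell in row:
            lines.append(f"{marker} {cell.strip()}")
    lines.append("|}")
    return "\n".join(lines)


sample = "Country,Capital\nFrance,Paris\nJapan,Tokyo\n"
print(csv_to_wikitable(sample))
```

The printed text can be pasted into a wiki page. Writing one cell per line keeps the output easy to edit by hand afterwards.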
Converting content from HTML
- The Html2Wiki extension relies on Pandoc, a command-line document conversion tool. It lets users import HTML content, including images, directly into the wiki. It can import entire websites or complete web pages saved from the browser (such as Google Docs).
- Pandoc has an online demo. It is written in Haskell and is available as a command-line tool, with integrations for Python and Ruby.
- HTML2Mediawiki, a Java library with an online demo.
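For a single saved page, Pandoc can also be called directly from a script. The Python sketch below only wraps the `pandoc` command line with its `-f html -t mediawiki -o` options; the helper names `pandoc_command` and `convert_html_file` and the file names are hypothetical, and Pandoc is assumed to be installed and on the PATH. It does not show how the Html2Wiki extension itself invokes Pandoc.

```python
import shutil
import subprocess


def pandoc_command(html_path, wiki_path):
    """Build the pandoc call that converts an HTML file to MediaWiki markup."""
    return ["pandoc", "-f", "html", "-t", "mediawiki", "-o", wiki_path, html_path]


def convert_html_file(html_path, wiki_path):
    """Run pandoc; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on the PATH")
    subprocess.run(pandoc_command(html_path, wiki_path), check=True)


# Example call (file names are placeholders):
# convert_html_file("page.html", "page.wiki")
```

The resulting file contains wikitext only. Images and other linked files still have to be uploaded separately.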
Older tools
- https://tools.wmflabs.org/magnustools/html2wiki.php can convert HTML tables into MediaWiki table syntax.
- HTML-WikiConverter-0.68, a Perl module.
- MwImporter, a PHP script for importing entire websites. It uses html2wiki and other MediaWiki maintenance scripts to import entire directories of static HTML and image files while preserving relative links, etc.
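To illustrate what a table converter of this kind does, here is a minimal Python sketch built on the standard library's `html.parser`. It is not the code of any tool listed above. The class name `TableToWiki` and the function `html_table_to_wikitext` are hypothetical, and the sketch assumes a single simple table without nested tables, `colspan`/`rowspan`, or markup inside cells.

```python
from html.parser import HTMLParser


class TableToWiki(HTMLParser):
    """Collect the cells of one simple HTML table as MediaWiki table markup."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self.cell_marker = None  # "!" inside <th>, "|" inside <td>, None otherwise
        self.cell_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.lines.append('{| class="wikitable"')
        elif tag == "tr":
            self.lines.append("|-")
        elif tag in ("th", "td"):
            self.cell_marker = "!" if tag == "th" else "|"
            self.cell_text = []

    def handle_data(self, data):
        # Keep text only while inside a cell.
        if self.cell_marker is not None:
            self.cell_text.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self.cell_marker is not None:
            # Collapse runs of whitespace so each cell fits on one line.
            text = " ".join("".join(self.cell_text).split())
            self.lines.append(f"{self.cell_marker} {text}")
            self.cell_marker = None
        elif tag == "table":
            self.lines.append("|}")


def html_table_to_wikitext(html):
    parser = TableToWiki()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


html = "<table><tr><th>Name</th><th>Age</th></tr><tr><td>Ann</td><td>30</td></tr></table>"
print(html_table_to_wikitext(html))
```

Real-world HTML usually needs more care (attributes, links, nested markup), which is why a dedicated converter is preferable for anything beyond simple tables.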
Converting content from a MS-Word document
Microsoft Office Word Add-in For MediaWiki saves documents from Microsoft Office Word straight into MediaWiki.
OpenOffice also does a good job of reading MS Word and a usable job of exporting as MediaWiki wikitext.
Converting content from other sources
If you are able and willing to do some scripting yourself, you can import almost any existing textual content with a documented file format into MediaWiki.
Example: CIA World Factbook 2002
This is a one-time script; most paths and coding are hard-coded, and lots of the code is for parsing the CIA World Factbook print pages, but it might serve as a good example of what can be done.
Importing content in Windows PowerShell
Manual:Importing XML dumps describes various tools for importing XML dumps of wiki pages, including the Special:Import page. In addition, a Windows PowerShell script is available that creates a MediaWiki-compliant XML file from your wikitext and simulates an HTML form submission to Special:Import with the correct parameters.
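The first half of that approach, wrapping wikitext in an import file, can be sketched in a few lines. The following Python example is not the PowerShell script and does not submit anything to the wiki. The function name `build_import_xml` is hypothetical, the `export-0.10` namespace version is an assumption that should be checked against an export from your own wiki, and the file is deliberately minimal (title and text only), so a real import may need further elements such as timestamps or contributors.

```python
import xml.etree.ElementTree as ET

# Assumption: schema version 0.10; compare with an export from the target wiki.
EXPORT_NS = "http://www.mediawiki.org/xml/export-0.10/"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_import_xml(pages):
    """Build a minimal import document from a mapping of page title -> wikitext."""
    ET.register_namespace("", EXPORT_NS)  # serialize without a namespace prefix
    root = ET.Element(f"{{{EXPORT_NS}}}mediawiki")
    for title, wikitext in pages.items():
        page = ET.SubElement(root, f"{{{EXPORT_NS}}}page")
        ET.SubElement(page, f"{{{EXPORT_NS}}}title").text = title
        revision = ET.SubElement(page, f"{{{EXPORT_NS}}}revision")
        text = ET.SubElement(revision, f"{{{EXPORT_NS}}}text")
        text.set(f"{{{XML_NS}}}space", "preserve")  # written as xml:space="preserve"
        text.text = wikitext  # ElementTree escapes &, < and > on output
    return ET.tostring(root, encoding="unicode")


xml_out = build_import_xml({"Sandbox": "'''Hello''' & <b>world</b>"})
print(xml_out)
```

The resulting string can be saved to a file and uploaded through Special:Import, or passed to one of the tools described in Manual:Importing XML dumps.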