Project:PD help/export

Goals

 * To provide public domain help content for the MediaWiki software.
 * The help should be available in as many languages as possible.
   * Our mechanism should be able to scale to hundreds of languages.
 * To have a simple set of guidelines for people creating the help content.
   * As few rules as possible.
 * To automate the process of converting the on-wiki help content into downloadable files, ready for import.
   * A secondary aim might be to have an automated script to import the files too.
 * To make the help content available in the following forms:
   * Single language, in the main (localised) Help: namespace.
   * All languages combined (a mirror of Help: on MW.org).
   * Multiple arbitrary languages, one in the main Help: namespace, the rest as sub-pages (as per MW.org).
   * With or without images (though this could be tricky).



Dumps
The dumps will be in the standard MW export format. The following dumps will be available:
 * A single dump containing all languages. This will mirror the current Help: namespace.
 * Individual dumps for each language, ready to be imported to the main Help: namespace.
 * Individual dumps for each language, ready to be imported into appropriate sub-pages within the Help: namespace.

We also need to consider how images are handled.

Exporting the data
Exporting the data will be an automated process that creates the above dumps from the pages in the Help: namespace. The format of the exported data is already defined (it is the standard export format generated by Special:Export). The program checks all pages in the namespace and adds them to the appropriate language file, with the following modifications to the wiki text:


 * All template inclusions that do not start with Help: are removed.
 * All interwiki links are expanded to full URLs, using the data in the interwiki table.
 * All internal links that do not point to the Help: namespace are rewritten as full URLs pointing to MW.org
 * All internal links within the Help: namespace are left as they are, with the following exceptions:
   * If exporting the English pages as sub-pages, all pages are rewritten from Help:Name to Help:Name/en
   * If exporting non-English pages as main pages, all pages are rewritten from Help:Name/lang to Help:Name
 * The same translation is performed on template inclusions within the Help: namespace.
 * The log will contain warnings about help pages that link to other-language pages, but these will not be modified.
 * Category links that do not begin 'Category:Help:' are removed.
 * Links to our special template (which will have a pre-defined name) will be changed to link to an alternative template that simply displays the 'correct' name of the page.
   * These names have not been finalised, and there will be multiple versions of the templates (one for each language). Example:   might become.
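
The link rules above can be sketched as a single rewriting pass. This is only an illustration, not the export program itself: the names `rewrite_links`, `rewrite_help_title`, `MW_BASE` and `INTERWIKI` are hypothetical, the interwiki table is a one-entry stand-in, and the sketch assumes that any '/' suffix on a Help: title is a language code. Template inclusions, the special name template and leading-colon links are not handled, and image links are left alone because image handling is still undecided.

```python
import re

# Assumptions for illustration only: the base URL of MW.org and a tiny
# stand-in for the interwiki table.
MW_BASE = "https://www.mediawiki.org/wiki/"
INTERWIKI = {"wikipedia": "https://en.wikipedia.org/wiki/$1"}

# Matches [[Target]] and [[Target|label]].
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def rewrite_help_title(title, mode):
    """Apply the sub-page/main-page translation to a Help: title.

    mode is 'keep', 'en_to_sub' (Help:Name -> Help:Name/en) or
    'lang_to_main' (Help:Name/lang -> Help:Name).
    """
    if mode == "en_to_sub" and "/" not in title:
        return title + "/en"
    if mode == "lang_to_main" and "/" in title:
        return title.rsplit("/", 1)[0]
    return title


def rewrite_links(text, mode="keep"):
    """Rewrite the internal, interwiki and category links in wiki text."""
    def repl(m):
        target, label = m.group(1).strip(), m.group(2)
        shown = label if label is not None else target
        if target.startswith("Category:"):
            # Only categories beginning 'Category:Help:' survive.
            return m.group(0) if target.startswith("Category:Help:") else ""
        if target.startswith(("File:", "Image:")):
            return m.group(0)  # image handling is an open question
        if target.startswith("Help:"):
            page, sep, fragment = target.partition("#")
            new = rewrite_help_title(page, mode) + sep + fragment
            if label is None and new == target:
                return m.group(0)
            return f"[[{new}|{shown}]]"  # keep the text the reader sees
        prefix, sep, rest = target.partition(":")
        if sep and prefix.lower() in INTERWIKI:
            url = INTERWIKI[prefix.lower()].replace("$1", rest.replace(" ", "_"))
        else:
            url = MW_BASE + target.replace(" ", "_")
        return f"[{url} {shown}]"
    return LINK_RE.sub(repl, text)
```

Keeping the original title as the link label when a Help: link is retargeted is a choice made for this sketch; the rules above do not say what the displayed text should be.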

Each language is dumped twice: once as 'main' pages and once as 'sub-pages'. In addition, the English 'main' pages and all the other sub-pages are combined into the single complete dump.
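
As a rough sketch of how pages might be sorted into these dumps, the function below takes a list of Help: titles and returns the titles each dump would contain. The function name, the dump names (`all`, `de-main`, `de-sub` and so on) and the idea of passing in a set of known language codes are assumptions made for this example.

```python
from collections import defaultdict


def plan_dumps(titles, languages):
    """Map hypothetical dump names to the page titles they would contain.

    titles:    Help: page titles as they appear on MW.org
    languages: known language codes; a title whose last '/' segment is
               not one of these is treated as an English page
    """
    dumps = defaultdict(list)
    for title in titles:
        base, sep, suffix = title.rpartition("/")
        if sep and suffix in languages:
            lang = suffix
        else:
            base, lang = title, "en"
        dumps[f"{lang}-main"].append(base)              # imported as Help:Name
        dumps[f"{lang}-sub"].append(f"{base}/{lang}")   # imported as Help:Name/lang
        # Complete dump: English main pages plus all other sub-pages.
        dumps["all"].append(base if lang == "en" else f"{base}/{lang}")
    return dict(dumps)
```

For example, `plan_dumps(["Help:Editing", "Help:Editing/de"], {"de"})` puts `Help:Editing` in `en-main`, `de-main` and `all`, and `Help:Editing/de` in `de-sub` and `all`.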

The following points should also be noted:


 * is not exported.
 * Only the most recent version of a page is exported - the history is not exported.
 * Blank pages are not exported.
 * Redirects to pages outside of the Help: namespace are not exported (these are generally pages which have been moved)
 * Useful redirects within the Help: namespace will be kept, so that they can be exported too. Redirects that result from a page move and which are not useful should be deleted.
 * We should standardize soft redirects, so the scripts can recognize those, as well. (please expand on this...)
 * Page author is 'MediaWiki default' (as per default interface messages).
 * The edit summary is 'Imported from MWURL', where MWURL is a clickable link to the original page source.
 * Note that this assumes the required entry is in the interwiki table, since we need the syntax to make the link.
 * The edit date is the date of export (though ideally it will be the date of import to the target wiki).

All categories that begin 'Help:' are also exported, using the same rules as above.
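
Some of the points above (blank pages, redirects out of the namespace, and the fixed author, summary and date) could be checked with something like the following. It is a minimal sketch: `should_export` and `revision_metadata` are hypothetical names, the interwiki prefix `mw` is an assumption about the target wiki's interwiki table, and only the English `#REDIRECT` keyword is recognised.

```python
import re
from datetime import datetime, timezone

# Only the English redirect keyword is handled in this sketch.
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*:?\s*\[\[([^\]|#]+)", re.IGNORECASE)


def should_export(text):
    """Return False for blank pages and for redirects that leave Help:."""
    if not text.strip():
        return False
    m = REDIRECT_RE.match(text)
    if m and not m.group(1).strip().startswith("Help:"):
        return False
    return True


def revision_metadata(title, interwiki_prefix="mw", now=None):
    """Build the author, edit summary and date for an exported page.

    interwiki_prefix is an assumed prefix that must exist in the target
    wiki's interwiki table for the summary link to be clickable.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "contributor": "MediaWiki default",
        "comment": f"Imported from [[{interwiki_prefix}:{title}]]",
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),  # date of export
    }
```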

Foreign/RemoteHelp
Rather than dumping and importing the help as real articles, another approach would be to use the API to fetch (and then locally cache) the documentation, a bit like InstantCommons. The difference is that the pages would still be editable, a bit like MediaWiki: messages: while a page does not exist locally, it shows the default (MW.org's current content); when it is edited or created, the edit page starts with the current content from MW.org; and once saved, the page is static. A local page can be deleted at any time to serve the default again. This also makes updating a lot easier.
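
The lookup order described here (local edit first, then cached copy, then a remote fetch) might look roughly like this. The class and method names are hypothetical, and the remote fetch is passed in as a plain function so the sketch does not assume any particular API call; cache expiry is left out.

```python
class RemoteHelp:
    """Serve help pages from MW.org unless a local copy has been saved."""

    def __init__(self, fetch_remote):
        # fetch_remote: any callable taking a title and returning wiki text,
        # for example a wrapper around an API request (not shown here).
        self.fetch_remote = fetch_remote
        self.cache = {}   # title -> text fetched from MW.org
        self.local = {}   # title -> locally saved (static) text

    def view(self, title):
        """Return the local page if it exists, else the cached default."""
        if title in self.local:
            return self.local[title]
        if title not in self.cache:
            self.cache[title] = self.fetch_remote(title)
        return self.cache[title]

    def edit_form_text(self, title):
        """Text that pre-fills the edit page: whatever is currently shown."""
        return self.view(title)

    def save(self, title, text):
        """Once saved, the page is static and no longer tracks MW.org."""
        self.local[title] = text

    def delete(self, title):
        """Deleting the local page makes the default show again."""
        self.local.pop(title, None)
```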

Redirects on MW.org could solve the "Native language for page title" problem, i.e. http://mysite.org/wiki/Help:Hilfe_mir would fetch http://mediawiki.org/wiki/Help:Help_me/de if Help:Hilfe_mir is a redirect to Help:Help_me/de.
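
A small sketch of that lookup, assuming the redirects on MW.org are available as a simple title-to-target mapping (the function name, the mapping and the hop limit are all made up for this example):

```python
def resolve_remote_title(local_title, redirects, max_hops=5):
    """Follow MW.org redirects from a native-language title to the real page.

    redirects: dict mapping a redirect title to its target title
    max_hops:  guard against redirect loops and overly long chains
    """
    seen = set()
    title = local_title
    while title in redirects and title not in seen and len(seen) < max_hops:
        seen.add(title)
        title = redirects[title]
    return title


redirects = {"Help:Hilfe_mir": "Help:Help_me/de"}
remote = resolve_remote_title("Help:Hilfe_mir", redirects)
```

A title that is not a redirect is returned unchanged, so the same call works for pages requested under their MW.org name.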

Different releases also need some thought. If ForeignHelp came into existence and a 1.17 wiki were using it, what happens when 1.18 comes out and we update the help pages? A possible solution would be to use subpages for versioning (the way jQuery UI used to do it), so the English version of Help:Contents would instead be at "Help:1.17/Contents" or "Help:Contents/1.18/en".
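
If the second naming scheme were chosen, a client wiki would need some way to pick the right versioned page. One possibility, sketched below, is to use the newest documented version that is not newer than the wiki's own release. That fallback rule is an assumption of this example and not something decided here, and the function names are hypothetical.

```python
def parse_version(version):
    """Turn '1.17' into (1, 17) so that 1.9 sorts before 1.10."""
    return tuple(int(part) for part in version.split("."))


def pick_versioned_title(name, wiki_version, available_versions, lang="en"):
    """Return a title such as 'Help:Contents/1.17/en', or None.

    Chooses the newest documented version that is not newer than the
    wiki's own version (an assumed fallback rule).
    """
    wanted = parse_version(wiki_version)
    candidates = [v for v in available_versions if parse_version(v) <= wanted]
    if not candidates:
        return None
    best = max(candidates, key=parse_version)
    return f"Help:{name}/{best}/{lang}"
```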

Native language for page title
It would be good to have the main pages named in the appropriate language when imported into the new wiki. For this to be feasible, we will also need an import script that performs whatever conversions are required. Without an import script, we should stick to the above method of page naming; otherwise it will not be possible to 'add' a language to a wiki. Once the script is implemented, however, it should be possible to achieve a fully flexible naming system on target wikis without altering the layout of MW.org.

Vandalism
If we are automating the dump process, we probably need some way of flagging 'safe' (non-vandalised) copies of the help content. We should not be hosting dumps that contain vandalism - it will damage our reputation to provide downloadable help files that say "FSDF YOU'RE GAY." This could be done by installing the Flagged Revisions extension.
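
Whatever mechanism is used for flagging, the dump script would then export the most recent flagged revision rather than simply the latest one. A minimal sketch, assuming each revision is available as a dict with a boolean `reviewed` field (a made-up representation, not the Flagged Revisions data model):

```python
def latest_safe_revision(revisions):
    """Return the newest revision flagged as reviewed, or None.

    revisions: list of dicts ordered oldest to newest, each with at least
    a 'reviewed' key (hypothetical field name).
    """
    for rev in reversed(revisions):
        if rev["reviewed"]:
            return rev
    return None


history = [
    {"id": 1, "reviewed": True, "text": "Good help text"},
    {"id": 2, "reviewed": False, "text": "Unchecked edit"},
]
safe = latest_safe_revision(history)
```

A page with no flagged revision at all would return None and could be skipped, with a warning written to the log.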