Extension:MediaWikiFarm

The extension MediaWikiFarm creates farms of wikis with the following main features:
 * Multiversion: As a standard MediaWiki extension, it turns MediaWiki into a single-version farm; installed "in front of MediaWiki" it creates multiversion farms;
 * Hierarchical configuration: The configuration can be written either in YAML, PHP, or JSON, in a hierarchical manner: default value, default value for the wikis family, value for a single wiki;
 * Performance: Caching, caching, caching; there is one cache per file, caches are written as PHP arrays to benefit from the OPcache extension, and a final per-wiki LocalSettings.php (hence also in PHP) is generated; with this cache architecture, the extension adds a mean time of 85µs per request compared to a standard LocalSettings.php (measured on a personal computer with PHP 7 and OPcache).

Detailed features

 * MediaWiki farm: each wiki is identified by its host, with dedicated version, configuration, and data directories;
 * Multiversion farm: each wiki runs one of the installed MediaWiki versions, with extensions and skins selected for this version, and extensions and skins activated or not for this wiki;
 * Hierarchical configuration: each parameter can be specified at different scales: for every wiki in the farm, for a specific family, or for a single wiki;
 * Multi-file configuration: the configuration can be distributed across files with possibly different permissions, e.g. all passwords in a secret file, other parameters in a public file;
 * Configuration in YAML, JSON, or PHP: these three file formats are supported (YAML needs an external library);
 * Transparent from MediaWiki’s point of view: the configuration is a classic  (after compilation from the configuration files) and the initial bootstrapping code hides itself as much as possible;
 * Per-wiki activation of extensions, even Composer-managed extensions: each extension can be activated per wiki; Composer-managed extensions must be handled specifically during their installation, but they can then be activated per wiki (together with the libraries and extensions they depend on, obviously);
 * Caches: all configuration files and the final configuration are cached as PHP arrays, interoperable across PHP versions and cacheable in OPcache;
 * CLI compatible: a command-line script selects the right wiki just like in the Web version;
 * Multi-farms: multiple farms can be created, each one with its own configuration files, possibly shared across farms to e.g. create pre-production wikis with slightly different configurations;
 * Syntax-error-proof: syntax errors in configuration files do not crash the farm; they are silently ignored when the cache is activated;
 * Delayed version switch: during version switch, the version does not change until the  script is run;
 * Custom 404: a customised HTTP 404 page, written in PHP or HTML, can be displayed for missing wikis;
 * Classical LocalSettings.php: although it is only visible in the cache directory, a classical LocalSettings.php exists;
 * Whole range of MediaWiki versions: every MediaWiki version from 1.1 to 1.28 works in multiversion farm (small issue for 1.1 and 1.2 still pending);
 * Wide range of PHP versions: every PHP version from 5.2 to 7.1 works (YAML library requires 5.3.2 or 5.5.9, so YAML-to-PHP must be handled separately for old MediaWiki versions);
 * Documented: all functions are documented and a general documentation is available in a subdirectory;
 * Translated: the description on Special:Version is translatable on TranslateWiki although, well, there is only one string in the user interface;
 * Unit-tested, well: about 100 tests (~256 assertions) run in 10-15 seconds in the MediaWiki-PHPUnit framework with recent MediaWiki versions with PHP 5.6 to 7.1, totalling almost 100% code coverage with strict coverage activated and global variables under watch; standalone PHPUnit also works (runs in 0.5 seconds);
 * Performance-tested: a basic testing infrastructure measures the performance in order to be aware of the performance impact of various strategies during development (e.g. the per-wiki LocalSettings.php was added because it was 32% (77µs/239µs) quicker than the previous implementation).
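
The first and third features above (identification by host, hierarchical configuration) can be illustrated with a minimal sketch. It is written in Python for brevity and is not the extension's PHP code or its file format: the host pattern, the family and wiki names, and the helpers `identify_wiki` and `resolve_settings` are all assumptions made up for this example. Only the precedence order (single wiki over family over farm-wide default) comes from the description above.

```python
import re

# Hypothetical host pattern: <wiki>.<family>.example.org
HOST_PATTERN = re.compile(
    r"^(?P<wiki>[a-z0-9-]+)\.(?P<family>[a-z0-9-]+)\.example\.org$"
)

# Hypothetical settings, each given at up to three scales:
# farm-wide default, per family, per single wiki.
SETTINGS = {
    "wgLanguageCode": {
        "default": "en",
        "family": {"docs": "fr"},
        "wiki": {("docs", "intranet"): "de"},
    },
    "wgEnableUploads": {
        "default": False,
        "family": {"media": True},
    },
}


def identify_wiki(host):
    """Return (family, wiki) for a known host, or None for a missing wiki."""
    match = HOST_PATTERN.match(host.lower())
    if match is None:
        return None
    return match.group("family"), match.group("wiki")


def resolve_settings(settings, family, wiki):
    """Apply the most specific value: wiki, then family, then default."""
    resolved = {}
    for name, scales in settings.items():
        value = scales["default"]
        if family in scales.get("family", {}):
            value = scales["family"][family]
        if (family, wiki) in scales.get("wiki", {}):
            value = scales["wiki"][(family, wiki)]
        resolved[name] = value
    return resolved


if __name__ == "__main__":
    family, wiki = identify_wiki("intranet.docs.example.org")
    print(resolve_settings(SETTINGS, family, wiki))
```

In this sketch a host that does not match the pattern yields `None`, which is the situation where a farm would serve its 404 page for missing wikis.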

Installation
There are two installation modes:
 * an almost classical installation in the  subdirectory, but limited to single-version MediaWiki farms;
 * a more complex and completely unusual installation for multiversion MediaWiki farms.

These instructions are only minimal; you should read the documentation available in the subdirectory  to understand the goals, the basic concepts, and prepare your naming scheme (the URLs) and directory organisations before you start installing it.

Quick installation

 * Download the MediaWikiFarm extension and copy the directory into the usual subdirectory ;
 * 1) Optionally run Composer in the extension subdirectory (to read YAML files);
 * 2) Create a directory where you want to put your configuration, e.g.  ;
 * 3) Create and configure the main configuration file , e.g. from the sample file;
 * 4) Move your existing LocalSettings.php into this configuration directory;
 * 5) Copy the placeholder LocalSettings.php provided by MediaWikiFarm into your MediaWiki directory.

Multiversion installation
Cf. documentation in the subdirectory

Documentation
Extensive documentation is available in the subdirectory  of the extension. Some basic instructions should be transferred here (installation, basic configuration).

Be aware that it is non-trivial to set up a MediaWiki farm. This extension only improves the MediaWiki part; the farm-type configuration of DNS, webserver, MySQL, HTTPS, domain names, and external services (memcached, Parsoid, *oid…) remains to be done. This is the job of a configuration management tool (Puppet, Chef, Ansible, CFEngine, etc.).

Overview
[[File:Call graph of extension MediaWikiFarm.svg|thumb|300px|Static call graph of extension code, featuring the three classes.

An alternative view is a visualisation of the interactions between data (object properties) on this page "MediaWikiFarm static analysis"]]

This extension is the core to run MediaWiki farms, and it is particularly important that it runs: 1/ quickly, 2/ on all MediaWiki versions, 3/ on all PHP versions since 5.2. Unfortunately this puts more constraints on development, but it can be taken as a challenge :) If a feature cannot meet these requirements, the oldest MediaWiki versions can possibly run in a degraded mode (this is already the case because the YAML library does not run on PHP 5.2), but this should be avoided as much as possible. The syntax must be understood by the PHP 5.2 parser (notably no namespaces) and care must be taken with some details (no short array syntax, no __DIR__). Non-core features (e.g. creating statistics) should be created in a separate extension.

Before development, you have to install the farm on your local computer. It was only tested on GNU/Linux Debian with nginx, so please report any issues you have on other systems. The following advice concerns the development tools in use; it is better if you can follow it, but it’s fine to create code review requests without them if you don’t know these tools. You can browse the documentation in the  subdirectory and you can run PHPDoc to get a nice HTML summary of the methods (see Quality tools below for how to run it).

You can send code review requests (pull requests in GitHub language) with Gerrit (quick cheat guide: install git-review, create an account and set up your credentials in Gerrit, "git clone" the repository, "git review -s" in the repo, "git review -R" to submit a request, "git commit -a --amend" then "git review -R" to create an amended revision).

Quality tools
A PHPUnit test suite can be run with the command below. You have to run it in your farm with a valid MediaWiki version greater than 1.19 (when PHPUnit tests were introduced), and PHPUnit must be installed with Composer’s require-dev in the considered MediaWiki. It should not alter the database, but this cannot be guaranteed. To get coverage results, you have to modify the file  and add in the   section something like:. If possible, please write tests to keep the 100% code coverage.

php bin/mwscript.php --wiki=WIKI.YOUR-LOCAL-FARM.net tests/phpunit/phpunit.php --coverage-html coverage-MediaWikiFarm --strict-coverage --group MediaWikiFarm -v

A performance test can be run with a dedicated small script. Beforehand, you have to change your webserver config so that the main entry point  is  , and check that MediaWiki runs without errors when you navigate with this special entry point (if there are errors such as HTTP 500, the performance script will run quicker but will not report them). You can set the sample size on the command line (5000 is recommended before issuing a code review request).

php tests/perfs/perfs.php --wiki=WIKI.YOUR-LOCAL-FARM.net https://WIKI.YOUR-LOCAL-FARM.net 5000

It is better to run the code syntax tool PHP_CodeSniffer (you have to install Composer’s require-dev section in the farm directory).

./vendor/bin/phpcs

A JSON schema is available in the subdirectory  to describe the main config file  ; it should be updated when needed.

Developer documentation is generated with PHPDoc (you have to install Composer’s require-dev section in the farm directory). Please document new methods, parameters, etc.

composer phpdoc

Testing
The extension has been tested with a multiversion installation with MediaWiki ranging from 1.1 to 1.28alpha, with PHP 5.2 (MediaWiki 1.1-1.12), PHP 5.6 (MediaWiki 1.13-1.21), PHP 7.0 (MediaWiki >= 1.22), and PHP 7.1RC1 (MediaWiki 1.28alpha), with nginx 1.6.2 with fastcgi module. One minor issue is still pending to run MW 1.1 and 1.2.


 * See also User:Seb35/MediaWiki Archaeology.

There is a unit test suite with about 100 tests (250 assertions). It can be run with the customised MediaWiki PHPUnit (see  in MW directory). This specific test suite can be run separately with. It should run without error with strict coverage, with a total code coverage of 100%. Only raw scripts and some untestable methods are excluded from code coverage (4 methods out of 38). Note that there is a global function for which PHPUnit does not report code coverage, see.

MediaWiki deactivated the backup of global variables, but it can be re-enabled in this test suite by modifying the object property in the constructor of MediaWikiFarmTestCase (slightly slower); in this case, global variables expected to be modified are declared at the beginning of the test, and any other modified global variable marks the test as risky. If a test takes a very long time, PHPUnit is possibly computing the differences between the values of global variables at the beginning and at the end of the test (count 10 minutes for a heavily modified global state).

A performance test is implemented in subdirectory. It is recommended to use it only in a development environment. To use it, you must have the farm installed and, in the webserver config, point the  to , then launch the script on the command line. You must either delete the cache directory just before or give the script permission to do it. The first execution creates the cached LocalSettings.php, so it’s normal to see a "high" time in 'config farm', but it is then divided by the sample size. With PHP 7.0 and MediaWiki 1.28alpha with the farm cache activated, I measured the following figures (sample size=5000): 157µs for bootstrapping the farm, 219µs for executing the LS.php of the farm (most of it, ~177µs, is the initialisation of ExtensionRegistry), 221µs for executing the classical LS.php (the two LS.php are strictly the same file, so the difference is not significant), and 9.984ms for the compilation of the configuration (when the cache is invalidated). With the same sample size, after the introduction of the cache for server names, the bootstrapping time dropped to 85µs.
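
To put these figures in perspective, here is a small arithmetic sketch in Python. The numbers are the ones quoted above; the helper names are made up for the example, and treating the one-off cost of the first execution as roughly the 9.984ms compilation time is an assumption, since the text does not state it explicitly.

```python
SAMPLE_SIZE = 5000

# Figures quoted in the paragraph above, in microseconds.
BOOTSTRAP_BEFORE_US = 157.0   # before the cache for server names
BOOTSTRAP_AFTER_US = 85.0     # after the cache for server names
COMPILATION_US = 9984.0       # assumed one-off cost of the first execution


def amortised(one_off_cost, sample_size):
    """Share of a one-off cost attributed to each request of the sample."""
    return one_off_cost / sample_size


def relative_reduction(before, after):
    """Fraction of the original time that was saved."""
    return (before - after) / before


print(f"{amortised(COMPILATION_US, SAMPLE_SIZE):.2f} µs per request")
# prints "2.00 µs per request"
print(f"{relative_reduction(BOOTSTRAP_BEFORE_US, BOOTSTRAP_AFTER_US):.1%}")
# prints "45.9%"
```

Under that assumption, a single compilation spread over 5000 requests adds about 2µs to the mean, which is why the "high" first execution barely shows in the averaged result.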

Ideas
Some ideas:
 * 1) take into account the special nature of the MediaWiki parameter $wgExtensionDirectory to search for extensions (similarly for skins) in the right directory
 * 2) be able to run maintenance scripts on multiple wikis in a single command call (requires a census of the wikis, see also the previous point)
 * 3) handle constants like CACHE_NONE, or more generally PHP syntax to use e.g. references; possibly this would be another config file with list of PHP-enabled parameters (or even a restricted list of PHP syntaxes authorised)
 * 4) logging is now implemented, next step is a better handling of errors/warnings
 * 5) add tags for transversal selection of wikis: think about file format, how to store them: lists and/or dictionaries
 * 6) reorganise tests to focus more specifically on *unit* tests (=tests on small parts) and think about creating higher-level tests (mock some MW parts and check for instance that an extension is really activated, or test scenarios like updating a config file or transitioning from no-cache to cache to no-cache)
 * 7) issue a version 1.0
 * 8) define the list of features above of what should be in version 1.0
 * 9) move documentation from the /docs directory to MediaWiki.org and improve it
 * 10) last check on the architecture and method names to avoid a major change in the near-future
 * 11) request a tag on Phabricator for the issues
 * 12) BUG: mwcomposer has an issue with a composer.json containing the keys 'authors' and 'keywords': it says they should be arrays but are objects (they can be temporarily removed from the composer.json, but this should be fixed)
 * 13) Check when a skin is default skin and a Composer-installed one, but is not activated (example Chameleon)
 * 14) Verify the cache is updated when a variable has one value removed from its list, e.g. a wiki is deleted
 * 15) mwcomposer should execute in the current directory instead of searching the version of a specific wiki (typically, you want to execute composer *before* updating the version)
 * Check existence of a wiki in the config files, similarly to this issue
 * 1) Split the main class MediaWikiFarm into two classes: a main "startup" class used in normal operation when the cache is read, and an auxiliary class used when the cache must be written and the configuration must be compiled; the goals are to reduce the weight in OPcache and to return to a size that is reasonable to understand; a first raw experiment decreased the file size of src/MediaWikiFarm.php from 64KiB to 41KiB and the size in OPcache from 215KiB to 111KiB; for this task this "static analysis" table can be useful with the "loaded" profile, and the "readFile" method should be split into two methods (readCachedFile and readFile).
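
The last idea (a light read-only path and a heavier path that compiles and writes the cache) could look roughly like the following sketch. It is written in Python, not PHP, and is not the extension's implementation: the JSON cache format, the modification-time check used to detect a stale cache, and the `load` helper are assumptions for illustration; only the two names readCachedFile and readFile come from the idea above.

```python
import json
import os


def read_cached_file(source_path, cache_path):
    """Fast path: return the cached data, or None if the cache is missing,
    unreadable, or older than the source file (assumed staleness rule)."""
    try:
        if os.path.getmtime(cache_path) < os.path.getmtime(source_path):
            return None
        with open(cache_path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError):
        return None


def read_file(source_path, cache_path):
    """Slow path: parse the source file and refresh its cache."""
    with open(source_path, encoding="utf-8") as handle:
        data = json.load(handle)
    with open(cache_path, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    return data


def load(source_path, cache_path):
    """Try the fast path first, fall back to the slow path."""
    data = read_cached_file(source_path, cache_path)
    if data is None:
        data = read_file(source_path, cache_path)
    return data
```

Keeping `read_cached_file` free of any parsing or writing logic is what would allow the "startup" class to stay small, with the slow path living in the auxiliary class.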

Implemented:
 * 1) reduce the initial bootstrapping from 3 files read (farms.yml.php, versions.yml.php, config-version.php) to 2 (metadata.php, config-version.php); additionally, such a file could be used to collect all wikis defined on the farms (this will be useful to execute whole-farm operations like update.php or a census of wikis, e.g. for stats) -> done, but it currently doesn’t allow a census of wikis
 * 2) investigate whether it would be possible to handle Composer-installed extensions (see this discussion); ideally only the "activation" feature would be needed, but more probably Composer would have to be hooked to create isolated "autoloading parts for a given wiki", ideally with all other code (MW extensions and Composer libraries) lying in their respective directories -> implemented and will soon become official (gerrit I027251fabb32d6543); I also wrote a "technical report" about Composer-managed extensions in a wiki farm
 * 3) collect more errors and warnings -> implemented in 0.4.0 with syslog
 * 4) think about ways to display errors+warnings, either on-wiki, on a central wiki, or on-file -> partially implemented in 0.4.0 with syslog

Related extensions

 * Extension:Farmer
 * Extension:Simple Farm
 * Extension:WikiFarm
 * Extension:Configure
 * Extension:Shared codebase
 * Extension:CreateWiki