Library infrastructure for MediaWiki/status

From mediawiki.org

Last update on: 2014-12-monthly


2014-10-23

This project has been chosen as a Wikimedia Engineering Top Priority project for FY2014-15, Q2.

2014-10-monthly

The project kicked off mid-month with the merge of patches that enable PSR-3-based logging by MediaWiki. These changes are being tested in beta and will begin to roll out to the production cluster in early November with the 1.25wmf6 release branch.

An investigation into using a package manager for JavaScript libraries in MediaWiki closed with the consensus that we are not ready to choose a package manager at this time. The frontend standards group will revisit this decision in three to six months. It was agreed, however, that JavaScript libraries should, as far as possible, follow the guidelines for library development that are being worked on for PHP code.

Initial work has begun on a Profiler implementation that uses XHProf to collect information about the runtime costs of MediaWiki code. This approach to profiling will enable collection of information on code running on Wikimedia servers without relying on explicit wfProfileIn() and wfProfileOut() calls. This in turn will make splitting code out of MediaWiki core easier.
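The difference between explicit instrumentation and runtime-level profiling can be illustrated with a small sketch. It is an analogy only: it uses Python's standard-library cProfile as a stand-in for XHProf, and the functions parse_title, render_page and profile_call are hypothetical names invented for the example, not MediaWiki code. The point it shows is that the profiled functions contain no profiling calls of their own.

```python
import cProfile
import pstats


def parse_title(text):
    # Ordinary application code: there are no profiling calls in here.
    return text.strip().replace(" ", "_")


def render_page(titles):
    # Also uninstrumented; it simply calls parse_title once per title.
    return [parse_title(t) for t in titles]


def profile_call(func, *args):
    """Run func under the profiler; return (result, call counts by function name)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, function name) to
    # (primitive calls, total calls, own time, cumulative time, callers).
    calls = {key[2]: value[1] for key, value in stats.stats.items()}
    return result, calls


result, calls = profile_call(render_page, ["Main Page", " Help "])
print(result)
print(calls["render_page"], calls["parse_title"])
```

Because the measurement happens outside the measured code, a class written this way carries no dependency on the profiler, which is the property that makes it easier to move into a separate library.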

2014-11-18

Mid-quarter progress update

At (or at least near) the midpoint of our initial three-month project, we are making good progress on our major commitments and have picked up some interesting additional work.

Aaron Schulz and Chad Horohoe have joined the team. Both are helping update the Profiler classes used to measure the performance of MediaWiki, work that is related to the Better PHP profiling RFC. Profiling was identified early on as a common entanglement for many generally useful utility libraries found in the MediaWiki codebase. The work in progress here should enable us to remove many explicit wfProfileIn() and wfProfileOut() calls from the MediaWiki PHP code while still getting the benefit of performance measurements via the XHProf profiling library.

The cssjanus library has been removed from MediaWiki's core repository and replaced with a Composer-managed import from the official upstream. We have also extracted the CDB library originally written by Tim Starling and published it on Packagist.

Several classes have been moved from includes/utils to includes/libs (ArrayUtils, MapCacheLRU, Cookie/CookieJar), which makes them easy candidates for publication in standalone libraries in the future. Aaron is working on a list of possible libraries to create from the MediaWiki codebase that would group several useful classes together. This should produce more sustainable projects than having literally dozens of libraries made up of only a single class.

Bryan is continuing to work on structured logging changes and hopes to soon test a Monolog-based logging pipeline in Beta to replace the current system.

2014-11-monthly

Aaron Schulz and Chad Horohoe joined the team. Both are helping update the Profiler classes used to measure the performance of MediaWiki, work that is related to the Better PHP profiling RFC. Profiling was identified early on as a common entanglement for many generally useful utility libraries found in the MediaWiki codebase. The work in progress here should enable us to remove many explicit wfProfileIn() and wfProfileOut() calls from the MediaWiki PHP code while still getting the benefit of performance measurements via the XHProf profiling library. Profiling via the XHProf functionality built into HHVM is currently in use in both the beta and production clusters and is helping drive some low-hanging-fruit code improvements.

Bryan is continuing to work on structured logging changes and is testing a Monolog-based logging pipeline in Beta to replace the current system. MWFunction::newObj has been deprecated and all usage in MediaWiki core has been replaced with the new ObjectFactory class, which was introduced by the PSR-3 logging changes.

The cssjanus library has been removed from MediaWiki's core repository and replaced with a Composer-managed import from the official upstream. The lessphp CSS pre-processor, which was historically copied manually into MediaWiki's git repository, is now imported via Composer.

The CDB library originally written by Tim Starling has been extracted to its own git repository and published on Packagist. Both MediaWiki itself and the "multiversion" scripts that are used to manage the WMF wiki family are now importing CDB via Composer instead of the old practice of keeping two copies of the code updated manually in the respective repositories.

The simplei18n PHP library, which was developed for the IEG's Grant review application based on code from the Wikimania Scholarships application, was transferred from Bryan's personal GitHub account to the official Wikimedia account.

External dependencies for the BounceHandler and Elastica extensions have been removed from the extension git repositories and replaced with Composer-managed imports. For the WMF cluster, these dependent libraries have been added to the mediawiki/vendor.git repository. ExtensionDistributor has been updated to package Composer-managed dependencies in the tarballs it generates for installing extensions. The php-composer-validate test is now applied to all extensions and skins to validate the syntax of composer.json when changes are uploaded to Gerrit.

Several classes have been moved from includes/utils to includes/libs (ArrayUtils, MapCacheLRU, Cookie/CookieJar), which makes them easy candidates for publication in standalone libraries in the future. Aaron is working on a list of possible libraries to create from the MediaWiki codebase that would group several useful classes together. This should produce more sustainable projects than having literally dozens of libraries made up of only a single class.


2014-12-monthly

Work on integrating XHProf as the preferred method of profiling an individual request has largely been completed. Ori Livneh has also built a parallel infrastructure for use on the Wikimedia Foundation servers that profiles HHVM using an extension named Xenon. The flame graph output of the Xenon-based reporting can be seen at performance.wikimedia.org and the scripts that power it are available on GitHub.

Documentation on the addition of external libraries to MediaWiki core is underway. An RFC on guidelines for extracting, publishing and managing libraries has also been drafted and will be discussed by the MediaWiki developer community. A blog post describing the accomplishments and next steps for the continuation of the project is being drafted as well and is expected to be published in January 2015.

Antoine Musso and Timo Tijhof are finalizing a standard practice for testing Composer-managed projects, using CDB as a test bed for refining the techniques.

Structured logging implementation is continuing: the use of Monolog with a Redis transport is being tested in the beta cluster and in a subset of the production Wikimedia Foundation cluster. A new Monolog handler was written and upstreamed to Monolog to support the use case of randomly sampled log event streams. Chris Steipp also contributed a security-related patch to Monolog based on a security review he did for deployment on the Wikimedia Foundation cluster.
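The idea of a randomly sampled log event stream can be sketched as a handler that wraps another handler and forwards only a random fraction of the records it receives. The sketch below is an analogy written against Python's standard logging module, not the Monolog handler itself; the class names SamplingHandler and ListHandler, the factor parameter and the logger name are all assumptions made for illustration.

```python
import logging
import random


class SamplingHandler(logging.Handler):
    """Forward roughly 1 in `factor` records to a wrapped handler (hypothetical)."""

    def __init__(self, target, factor, rng=None):
        super().__init__()
        self.target = target
        self.factor = factor
        self.rng = rng or random.Random()

    def emit(self, record):
        # Keep the record only when the draw lands on 1,
        # which happens with probability 1/factor.
        if self.rng.randint(1, self.factor) == 1:
            self.target.handle(record)


class ListHandler(logging.Handler):
    """Collect messages in memory; a stand-in for a real transport."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


sink = ListHandler()
logger = logging.getLogger("sampling-demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(SamplingHandler(sink, factor=10, rng=random.Random(42)))

for i in range(1000):
    logger.info("request %d served", i)

print(len(sink.messages), "of 1000 records kept")
```

Sampling at the handler level keeps the calling code unchanged: high-volume channels can still log every event, and the decision about how much of that stream to ship is made in configuration.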

The project's "top priority" status expires with the end of the fiscal year quarter, but organic contributions and a raised awareness of the functionality offered by library extraction and integration of high-quality third-party code are expected to continue. A list of future projects has been started, and Bryan Davis and others who have participated in the project have high hopes for the transformative changes that this project has unblocked in MediaWiki.