How to become a MediaWiki hacker


 * Some of this information may be outdated or incorrect. If you are familiar with the topic, please attempt to bring this article up to date.

This guide will explain and link to pages containing information on MediaWiki's development process, and answer questions from neophyte developers. If you plan to help us code and develop the software, but don't have the necessary skillset yet, this is a good place to start.

Operating systems
The MediaWiki software is written in PHP and uses the MySQL database. Both have been ported to a variety of operating systems, including, but not limited to, most Unix variants (Linux, etc.) and Microsoft Windows. It is therefore possible to install and use MediaWiki on either kind of system. Note that if you do use Windows, certain features involving external utilities will be unavailable, or only available with special downloads and configuration. OS-dependent bugs are occasionally observed, so it is best to have some knowledge of the differences between the various platforms regardless of which OS you develop on.

The PHP programming language
If you have no knowledge of PHP (PHP stands for "PHP: Hypertext Preprocessor") but know how to program in other object-oriented programming languages, have no fear: PHP will be easy for you to learn.

If you have no knowledge of PHP or other object-oriented programming languages, you should familiarize yourself with concepts such as classes, objects, methods, events and inheritance.

If you have no knowledge of any programming language, PHP is a good language to start with, as it is reasonably similar to other modern languages, although it differs in the way it is executed.

PHP scripts can be run from the command line; a terminal is enough to call the interpreter. For example, on Linux/UNIX:

/usr/bin/php -q < phpshell.php

Usually, for websites, a PHP script is executed when you request a file with the ".php" extension (among others) from a web server. When you do that, the web server, in our case Apache, calls the PHP interpreter (which may be built into the web server). The interpreter runs the PHP file, and the web server returns the result to your browser. The PHP file can contain both regular HTML and PHP code, which makes it relatively simple to add dynamic functionality to a static webpage.
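
To illustrate how HTML and PHP can share one file, here is a minimal sketch. The function name exampleGreeting is made up for this illustration and is not part of MediaWiki; everything outside the PHP tags is passed through to the browser unchanged.

```php
<?php
// Hypothetical helper for this illustration; not part of MediaWiki.
function exampleGreeting( $name ) {
	// Escape the name so that it is safe to embed in HTML.
	return 'Hello, ' . htmlspecialchars( $name ) . '!';
}
?>
<html>
<body>
<h1>A static heading</h1>
<p><?php echo exampleGreeting( 'world' ); ?></p>
</body>
</html>
```

Saved under a name such as hello.php (an assumed file name), the file can be requested through the web server or run with the command-line interpreter; either way the output is ordinary HTML.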

Related links

 * PHP tutorial (available in many different languages)
 * The PHP manual (available in many different languages)
 * PHP at Wikibooks

Database
MediaWiki currently uses MySQL as the primary database backend. It also supports other DBMSes, such as PostgreSQL. However, almost all developers use MySQL and don't test other DBs, which consequently break on a regular basis. On the other hand, commits that break MySQL, or (even worse) don't appear to break it but then turn out to not execute efficiently on Wikipedia and slow down or crash the site, will be met with hellfire and brimstone cast down from the sysadmins.

You're therefore advised to use MySQL when testing patches, unless you're specifically trying to improve support for another DB. In the latter case, make sure you're careful not to break MySQL, because people will get very annoyed at you. "Breaking" MySQL includes adjusting a query so that it's more compatible with your database, but confuses the tiny brain of the MySQL optimizer and causes a filesort of the entire page table because that's obviously a better idea than reading ten rows in order from an index, or whatever. This kind of breakage is particularly fun, because if you're unlucky nobody will notice until the code goes live and Wikipedia dies, after which everyone will yell at you.

Although the WMF has now moved on from MySQL 4.0, it's important to not intentionally break MySQL 4.0 support. MySQL 4.0 is missing a lot of features of later MySQL versions (never mind other DBMSes): if you aren't sure, double-check in the manual first! The most commonly used feature missing from MySQL 4.0 is subqueries; don't use those outside of code specific to a non-MySQL DBMS.
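
As a rough sketch of what avoiding subqueries can look like, the snippet below builds the same query twice: once with a subquery and once as a join. The query itself (pages edited by a given user) and the user ID are assumptions chosen for illustration; the table and column names are taken from the standard page and revision tables, so check them against your schema version.

```php
<?php
// Hypothetical user ID, used only for this illustration.
$userId = 1;

// Uses a subquery, so MySQL 4.0 rejects it.
$withSubquery = "SELECT page_id, page_title FROM page " .
	"WHERE page_id IN ( SELECT rev_page FROM revision WHERE rev_user = $userId )";

// The same result set written as a join, which MySQL 4.0 accepts.
// DISTINCT collapses the duplicate rows produced by multiple revisions.
$withJoin = "SELECT DISTINCT page_id, page_title FROM page " .
	"INNER JOIN revision ON rev_page = page_id " .
	"WHERE rev_user = $userId";

echo $withJoin, "\n";
```

Whether the join form is also efficient depends on the available indexes, so it is worth running the query through EXPLAIN before relying on it.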

Installing MediaWiki
Get the latest sources from SVN before creating patches. See Download from SVN for instructions.

Follow the instructions in the INSTALL file in the source. You could also read the installation guide.

It's not necessary to download Wikipedia database dumps in order to develop MediaWiki features. In fact, in many cases it's easier to use a near-empty database with a few specially-crafted test pages. However, if for some reason you want to have a copy of Wikipedia, you can get a dump from data dumps.

You may also get an error complaining that access was denied to the wiki database. Make sure that you have created a file AdminSettings.php in your top-level MediaWiki install directory (the same place where LocalSettings.php is found). An AdminSettings.sample file is provided for you to customise; make sure your MySQL administration username and password are set correctly. See Manual:Upgrading for more details.
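
A minimal AdminSettings.php might look like the sketch below. The two variable names are the ones used in AdminSettings.sample as far as this editor is aware; treat the sample file shipped with your version as authoritative. The values are placeholders.

```php
<?php
// AdminSettings.php: credentials used by the maintenance scripts.
// Placeholder values; replace them with your own MySQL account.
$wgDBadminuser = 'wikiadmin';
$wgDBadminpassword = 'adminpass';
```

The account named here needs sufficient privileges on the wiki database for the maintenance scripts to alter tables.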

Rebuilding the link tables may take a long time, particularly if you've installed the English Wikipedia database, which is quite big. (Note also that you can skip the old table if you wish.) See Manual:Database layout for what rebuildall.php is good for.

Note that if you want to create a public mirror of Wikipedia, this probably isn't the best way to go about it. If you do set up a mirror this way, please tweak the code to note that you're looking at a mirror and include links back to the main site. See Forks and Mirrors for more info.

The MediaWiki codebase
The MediaWiki codebase is large and ugly. Don't be overwhelmed by it. When you're first starting off, aim to write features or fix bugs which are constrained to a small region of code.

You can browse the generated documentation (warning: huge page will be loaded).

One of the best ways to learn about MediaWiki is to read the code. Here are some starting points:


 * index.php is the main entry point, although where things go from there is not very obvious.
 * Article.php contains code for page view, delete, rollback, watch and unwatch. It also contains some general utilities for dealing with articles, such as fetching a revision or saving a page.
 * EditPage.php has about half of the code related to editing, the half that's close to the user interface. The rest is in Article.php and the various *Update.php files.
 * Parser.php has most of the code that converts wikitext to HTML. A few bits and pieces are in Skin.php.
 * Linker.php has functions to generate the HTML for links and images.
 * Code for most special pages is in the Special*.php files.
 * Database.php contains stacks of functions for accessing the database.
 * OutputPage.php is the home of the OutputPage class, which is an output buffer. Send your text here and it will be sent to stdout just before the script exits.
 * Title.php is all about titles -- and that includes interwiki titles and "#" fragments. There are some functions in here that will fetch information about an article from the database.
 * User.php contains the User class, which represents user preferences and permissions.
 * UserMailer.php is a collection of static functions for sending mail.
 * Setup.php does all sorts of initialisation, and seems to account for a large proportion of running time. Among other things, it initialises lots of global variables, mostly containing objects.
 * DefaultSettings.php contains defaults for lots of global variables, which may or may not be overridden in LocalSettings.php. Always add a default here for any global variable you introduce.
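
The advice about defaults in DefaultSettings.php can be sketched as follows. The setting name $wgExampleMaxResults and the function wfExampleMaxResults are hypothetical, and the three parts would normally live in separate files; they are shown together here so the snippet runs on its own.

```php
<?php
// DefaultSettings.php would hold the default (hypothetical setting name).
$wgExampleMaxResults = 50;

// LocalSettings.php is loaded afterwards and may override the default.
$wgExampleMaxResults = 10;

// Feature code can then read the global without checking whether it exists.
function wfExampleMaxResults() {
	global $wgExampleMaxResults;
	return $wgExampleMaxResults;
}

echo wfExampleMaxResults(), "\n";
```

Because the default is always defined, a wiki whose LocalSettings.php never mentions the variable still gets a sensible value.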

For getting started with debugging, see How to debug.

See also Manual:Code.

Your first feature
Here are some ideas:
 * Code something that interests you;
 * Fix an annoying little bug that nobody else could be bothered with;
 * Write a special page to provide some handy information;
 * Write a parser hook;
 * Write a simple extension.
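
As a starting point for the parser hook idea, here is a minimal sketch of a tag extension in the style of the era's extension interface. The tag name example and the function names are made up for illustration, and the callback signature varies between MediaWiki versions, so check the manual for the version you are targeting.

```php
<?php
// Register a setup function for MediaWiki to call during initialisation.
$wgExtensionFunctions[] = 'wfExampleSetup';

function wfExampleSetup() {
	global $wgParser;
	// Map the <example> tag to the rendering callback below.
	$wgParser->setHook( 'example', 'wfExampleRender' );
}

// $input is the text between the tags; $argv holds the tag's attributes.
function wfExampleRender( $input, $argv ) {
	// Escape the input so that raw HTML cannot be injected into the page.
	return '<b>' . htmlspecialchars( $input ) . '</b>';
}
```

The file would be pulled into the wiki with a require_once line in LocalSettings.php, after which wikitext such as <example>some text</example> is rendered in bold.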

For more specific suggestions, please come and talk to the developers on #mediawiki. If you already have an idea for a feature you want to implement, it's also a good idea to talk to a senior developer before you start, especially if you're not sure how your feature will affect other parts of the code.

When you have a feature ready to go, ask for Subversion write access, so that you can commit it. Alternatively, you can post a patch in Bugzilla -- this can be a slower process and at times frustrating, but by doing it once or twice you demonstrate your good faith, and your ability to write reasonably stable code. In this regard, before you commit your feature, make sure it can be disabled easily.

Don't ask for shell access to the Wikimedia servers. There is no way to restrict shell access to some sort of sandbox, so shell access is only given to people whom we really trust. It pains us to turn people down, but often we must. Wait until it is offered, or if it's taking a long time, discreetly probe for support.

Testing
Enable E_STRICT in the error_reporting setting of your php.ini so that avoidable warnings and notices are reported early.
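
If you cannot edit php.ini, the same effect can be approximated at runtime. The sketch below is this runtime alternative, not the php.ini approach itself, and it assumes a PHP 5-era interpreter; newer PHP versions already include E_STRICT in E_ALL.

```php
<?php
// Runtime equivalent of "error_reporting = E_ALL | E_STRICT" in php.ini.
error_reporting( E_ALL | E_STRICT );
// Show the messages in the output instead of only logging them.
ini_set( 'display_errors', '1' );
```

On a development wiki these two lines could go near the top of LocalSettings.php; they should not be left enabled on a public site.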

When adding features, it's vital to verify you didn't break existing functionality. The usual tools for this are automated testing frameworks. Unfortunately, MediaWiki has none. The only tests we have are parser tests (see maintenance/parserTests.php), which only test the parser. Try running php maintenance/parserTests.php --quick --quiet to see how those work. Everything should pass, in theory. You can add new tests or fix existing ones by editing maintenance/parserTests.txt.

If you want to add other automated tests, you'll have to add a test framework first. This is not recommended unless a) you already have commit access, and b) you're willing to put the time and effort into getting decent coverage and stabbing people who cause test failures until, hopefully, everyone becomes resigned to putting in the extra effort and figuring out how the thing works. Otherwise your effort will join t/ and tests/ in the graveyard of MediaWiki testing frameworks that no one uses, maintains, or cares about. You have been warned. But everyone would appreciate it if you did it anyway, because not having a decent test framework is kind of pathetic for a project this size, and horribly un-trendy.

An alternative opinion to the paragraph above: the tests under t/ seem relatively stable, and worth extending. You will get a bonus prize for every new function that you write tests for there.

Anyway, for the time being, do manual testing. If you cause breakage too often, people will get annoyed at you, especially if it isn't caught until it goes live on Wikipedia. Revocation of commit access has occasionally been threatened in the past. At the very least, expect serious indignation if you check in syntax errors; at least try loading your wiki, or run php -l on the files you changed.

Posting a patch
If you have created and tested a patch, get a diff of the modified file by using:

svn diff path/to/modified_file.php > my.patch

Then post the patch as an attachment to the appropriate bug report.