How to become a MediaWiki hacker

From MediaWiki.org

Some of this information may be outdated or incorrect. If you are familiar with the topic, please attempt to bring this article up to date.

This guide will explain and link to pages containing information on MediaWiki's development process, and answer questions from neophyte developers. If you plan to help us code and develop the software, but don't have the necessary skillset yet, this is a good place to start.

Essential reading

Operating systems

The MediaWiki software is written in PHP and uses the MySQL database. Both have been ported to a variety of operating systems, including, but not limited to, most Unix variants and Microsoft Windows. It is therefore possible to install and use MediaWiki on either kind of system. Note that if you do use Windows, certain features involving external utilities will be unavailable, or only available with special downloads and configuration. OS-dependent bugs are occasionally observed, so it is best to have some knowledge of the differences between the various platforms, regardless of which OS you develop on.

The PHP programming language

If you have no knowledge of PHP (PHP stands for "PHP: Hypertext Preprocessor") but know how to program in other object-oriented programming languages, have no fear: PHP will be easy for you to learn.

If you have no knowledge of PHP or other object-oriented programming languages, you should familiarize yourself with concepts such as classes, objects, methods, events and inheritance.

If you have no knowledge of any programming language, PHP is a good language to start with, as it is reasonably similar to other modern languages, although the way it is executed, described below, is somewhat unusual.

PHP scripts can be run from the command line; a terminal, whether on the console or under a window manager, is enough to call the interpreter. For example, on Linux/UNIX:

/usr/bin/php -q < phpshell.php

Usually, for websites, a PHP script is executed when you request a file with the ".php" extension (among others) from a web server. When you do that, the web server (in our case Apache) calls the PHP interpreter, which may be built into the web server. The interpreter runs the PHP file, and the result is returned to your browser. The PHP file can contain both regular HTML and PHP code, which makes it relatively simple to add dynamic functionality to a static webpage.
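
As a minimal sketch of that mixing, the file below is ordinary HTML with one small island of PHP. The helper greeting() and the page contents are hypothetical, made up for illustration; they are not part of MediaWiki.

```php
<?php
// Hypothetical helper for illustration only; not part of MediaWiki.
// Escapes the name so that it is safe to embed in HTML.
function greeting( $name ) {
	return 'Hello, ' . htmlspecialchars( $name ) . '!';
}
?>
<html>
<body>
<h1>A mostly static page</h1>
<!-- Everything outside the PHP tags is sent to the browser unchanged. -->
<p><?php echo greeting( 'world' ); ?></p>
</body>
</html>
```

Saved under a name such as hello.php and requested through the web server, or fed to the command-line interpreter, this should produce the same HTML with the paragraph filled in.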

Database

MediaWiki currently uses MySQL as the primary database backend. It also supports other DBMSes, such as PostgreSQL. However, almost all developers use MySQL and don't test other DBs, which consequently break on a regular basis. On the other hand, commits that break MySQL, or (even worse) don't appear to break it but then turn out to not execute efficiently on Wikipedia and slow down or crash the site, will be met with hellfire and brimstone cast down from the sysadmins.

You're therefore advised to use MySQL when testing patches, unless you're specifically trying to improve support for another DB. In the latter case, make sure you're careful not to break MySQL, because people will get very annoyed at you. "Breaking" MySQL includes adjusting a query so that it's more compatible with your database, but confuses the tiny brain of the MySQL optimizer and causes a filesort of the entire page table because that's obviously a better idea than reading ten rows in order from an index, or whatever. This kind of breakage is particularly fun, because if you're unlucky nobody will notice until the code goes live and Wikipedia dies, after which everyone will yell at you.

One particular thing to keep in mind is that Wikipedia runs MySQL 4.0, and MediaWiki therefore must maintain perfect support for MySQL 4.0. MySQL 4.0 is missing a lot of features of later MySQL versions (never mind other DBMSes): if you aren't sure, double-check in the manual first! The most commonly used feature missing from MySQL 4.0 is subqueries; don't use those outside of code specific to a non-MySQL DBMS.
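
To illustrate the subquery restriction, here is a sketch of the same question asked two ways: which pages contain at least one link? The table and column names (page, pagelinks, page_id, page_title, pl_from) are assumed from the schema described in Manual:Database layout, so check them against your own installation. Only the second form should be expected to run on MySQL 4.0.

```php
<?php
// Uses a subquery, so MySQL 4.0 will reject it.
$withSubquery =
	'SELECT page_id, page_title FROM page ' .
	'WHERE page_id IN ( SELECT pl_from FROM pagelinks )';

// The same result expressed as a join. DISTINCT is needed because a
// page with several links would otherwise appear once per link.
$withJoin =
	'SELECT DISTINCT page_id, page_title FROM page ' .
	'INNER JOIN pagelinks ON pl_from = page_id';
```

Even a rewrite like this can change what the optimizer does, so it is worth looking at the query plan with EXPLAIN before committing.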

Installing MediaWiki

Get the latest sources from SVN before creating patches. See Download from SVN for instructions.

Follow the instructions in the INSTALL file in the source. You could also read the installation guide.

It's not necessary to download Wikipedia database dumps in order to develop MediaWiki features. In fact, in many cases it's easier to use a near-empty database with a few specially-crafted test pages. However, if for some reason you want to have a copy of Wikipedia, you can get a dump from meta:data dumps.

You may find that you get an error complaining that access was denied to the wiki database. Make sure that you have created a file AdminSettings.php in your top-level MediaWiki install directory (the same place where LocalSettings.php is found). An AdminSettings.sample file is provided for you to customise; make sure your MySQL administration username and password are set correctly. See Manual:Upgrading for more details.

Rebuilding the link tables with rebuildall.php may take a long time, particularly if you've installed the English database, which is quite big. (Note also that you can skip the old table if you wish.) See Manual:Database layout for what rebuildall.php is good for.

Note that if you want to create a public mirror of Wikipedia, this probably isn't the best way to go about it. If you do set up a mirror this way, please tweak the code to note that you're looking at a mirror and include links back to the main site. See Forks and Mirrors for more info.

The MediaWiki codebase

The MediaWiki codebase is large and ugly. Don't be overwhelmed by it. When you're first starting off, aim to write features or fix bugs which are constrained to a small region of code.

You can browse the generated documentation (warning: huge page will be loaded).

One of the best ways to learn about MediaWiki is to read the code. Here are some starting points:

  • index.php is the main entry point, although where things go from there is not very obvious.
  • Article.php contains code for page view, delete, rollback, watch and unwatch. It also contains some general utilities for dealing with articles, such as fetching a revision or saving a page.
  • EditPage.php has about half of the code related to editing, the half that's close to the user interface. The rest is in Article.php and the various *Update.php files.
  • Parser.php has most of the code that converts wikitext to HTML. A few bits and pieces are in Skin.php.
  • Linker.php has functions to generate the HTML for links and images.
  • Code for most special pages is in the Special*.php files.
  • Database.php contains stacks of functions for accessing the database.
  • OutputPage.php is the home of the OutputPage class, which is an output buffer. Send your text here and it will be sent to stdout just before the script exits.
  • Title.php is all about titles -- and that includes interwiki titles and "#" fragments. There are some functions in here that will fetch information about an article from the database.
  • User.php contains the User class, which represents user preferences and permissions.
  • Setup.php does all sorts of initialisation, and seems to account for a large proportion of running time. Among other things, it initialises lots of global variables, mostly containing objects.
  • DefaultSettings.php contains defaults for lots of global variables, which may or may not be overridden in LocalSettings.php. Don't use isset(); always add a default for any global variable you introduce.
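
The last point can be sketched as follows. The setting name $wgExampleLimit and the function exampleLimit() are hypothetical, invented for this illustration. In a real patch the default would live in DefaultSettings.php and the function in whichever file uses the setting; they are combined here only to keep the sketch self-contained.

```php
<?php
// Would go in DefaultSettings.php: the new setting always has a value.
// $wgExampleLimit is a hypothetical name, not a real MediaWiki setting.
$wgExampleLimit = 50;

// Would go in the code that uses the setting. Because a default always
// exists, the global can be read directly, with no isset() check.
function exampleLimit() {
	global $wgExampleLimit;
	return $wgExampleLimit;
}
```

A site administrator could then override the value by assigning to $wgExampleLimit in LocalSettings.php.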

For getting started with debugging, see How to debug.

See also Manual:Code.

Your first feature

Here are some ideas:

  • Code something that interests you.
  • Code a simple crowd-pleasing feature, such as an aesthetic improvement.
  • Write a special page to provide some handy information. You can even make a modular special page; there are examples in the extensions directory and in the extensions module.
  • Write a parser hook; see Manual:Extending wiki markup for more information.
  • Fix an annoying little bug that nobody else could be bothered with.

For more specific ideas, please come and talk to the developers on #mediawiki. Don't be put off by a lack of ideas. We have enough ideas between us to keep half a dozen programmers busy for a year.

It's a good idea to talk to a senior developer (e.g. Brion or Tim) on #mediawiki before you start, especially if you're not sure how your feature will affect other parts of the code.

When you have a feature ready to go, press for Subversion write access, so that you can commit it (for more on this, see commit access). Posting patches can be frustrating, although you may have to do it once or twice to demonstrate good faith. Before you commit your feature, make sure it can be disabled easily.

Don't ask for shell access to the Wikimedia servers. There is no way to restrict shell access to some sort of sandbox, so shell access is only given to people whom we really trust. It pains us to turn people down, but often we must. Wait until it is offered, or if it's taking a long time, discreetly test for support.

Testing

Include E_STRICT in the error_reporting setting of your php.ini so that warnings and notices are reported early.

When adding features, it's vital to verify that you didn't break existing functionality. The usual tool for this is an automated testing framework. Unfortunately, MediaWiki has none. The only tests we have are parser tests (see maintenance/parserTests.php), which only test the parser. Try running php maintenance/parserTests.php --quick --quiet to see how those work. Everything should pass, in theory. You can add new tests or fix existing ones by editing maintenance/parserTests.txt.

If you want to add other automated tests, you'll have to add a test framework first. This is not recommended unless a) you already have commit access, and b) you're willing to put the time and effort into getting decent coverage and stabbing people who cause test failures until hopefully everyone becomes resigned to putting in the extra effort and figuring out how the thing works. Otherwise your effort will join t/ and tests/ in the graveyard of MediaWiki testing frameworks that no one uses, maintains, or cares about. You have been warned. But everyone would appreciate it if you did it anyway, because not having a decent test framework is kind of pathetic for a project this size, and horribly un-trendy.

Anyway, for the time being, do manual testing. If you cause breakage too often, people will get annoyed at you, especially if it isn't caught until it goes live on Wikipedia. Revocation of commit access has occasionally been threatened in the past. At the very least, expect serious indignation if you check in syntax errors – at least try loading your wiki, or run php -l on the files you changed.

Posting a patch

If you have created and tested a patch, get a diff of the modified file by using:

svn diff path/to/modified_file.php > my.patch

Then post the patch as an attachment to the appropriate bug report.

See also