How to become a MediaWiki hacker

This page collects information about MediaWiki's development process and answers questions from new developers. If you plan to help us code but don't yet have the necessary skills, this is a good place to start.

First, some crucial links:


 * The Wikipedia-code web page contains general information about development
 * The Wikipedia SourceForge Project page allows you to check out the code and to report bugs. Note that we do not use SourceForge's patch tracker (if you don't know what patches are, see below).
 * Mailing lists: wikitech-l for development, mediawiki-l for support, mediawiki-cvs for CVS notifications
 * MediaWiki architecture has more documentation!

I'm modifying my MediaWiki. To give the next poor fool an easier learning curve, I'm documenting what I do and how stuff works. I have brief writeups on PHP, PHPTAL, MySQL, CSS, and MediaWiki, and how they all fit together, along with step-by-step descriptions of how I learned what to change. See my programming notes.

Operating systems
The MediaWiki software is written in PHP and uses the MySQL database. Both have been ported to a variety of operating systems, including, but not limited to, most Unix variants and Microsoft Windows. It is therefore possible to install and use MediaWiki on either kind of system. Note that if you do use Windows, certain features involving external utilities will be unavailable, or only available with special downloads and configuration. OS-dependent bugs are occasionally observed, so it is best to have some knowledge of the differences between the various platforms regardless of which OS you develop on.

The PHP programming language
If you have no knowledge of PHP (PHP stands for "PHP: Hypertext Preprocessor") but know how to program in other object-oriented programming languages, have no fear: PHP will be easy for you to learn.

If you have no knowledge of PHP or other object-oriented programming languages, you should first familiarize yourself with concepts such as classes, objects, methods, events, and inheritance.

If you have no knowledge of any programming language, PHP is a good language to start with, as it is reasonably similar to other modern languages, although it differs from most of them in the way it is executed.

Unlike most programs, PHP scripts are typically not run from the command line or a window manager. Instead, a PHP script is executed when you request a file with the ".php" extension (among others) from a web server. When you do that, the web server (in our case Apache) calls the PHP interpreter, which may be built into the web server. The interpreter runs the PHP file, and the web server returns the result to your browser. The PHP file can contain both regular HTML and PHP code, which makes it relatively simple to add dynamic functionality to a static webpage.
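
As a minimal sketch of how HTML and PHP can share one file, the example below could be saved as, say, hello.php on a PHP-enabled web server. The file name, the greeting() helper, and the $visitor value are made up for illustration and are not part of MediaWiki.

```php
<?php
// Everything between the PHP tags runs on the server; the browser never sees it.

// Hypothetical helper, not a MediaWiki function: builds an HTML-safe greeting.
function greeting(string $name): string {
    return 'Hello, ' . htmlspecialchars($name) . '!';
}

$visitor = 'world'; // assumed value; a real script might take it from the request
?>
<html>
  <body>
    <h1>This heading is plain, static HTML</h1>
    <p><?php echo greeting($visitor); ?></p>
  </body>
</html>
```

Requesting the file through the web server returns ordinary HTML with the greeting filled in. Running "php hello.php" from the command line prints the same output, which can be handy for quick experiments.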

Related links:


 * PHP tutorial (available in many different languages)
 * The PHP manual (available in many different languages)
 * PHP wiki (German)
 * PHP

SQL and MySQL
Wikipedia currently uses MySQL as the database backend. Make sure MySQL support is compiled into PHP!
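
One way to check for MySQL support is to ask PHP which extensions it has loaded. The sketch below is only an illustration: mysqlExtensionStatus() is a hypothetical helper, and which of these extensions your MediaWiki version actually needs is an assumption to verify against its INSTALL file (the original mysql extension, for instance, was removed in PHP 7).

```php
<?php
// Report which MySQL-related extensions this PHP build has loaded.
// mysqlExtensionStatus() is a hypothetical helper, not part of MediaWiki.
function mysqlExtensionStatus(): array {
    $status = [];
    foreach (['mysql', 'mysqli', 'pdo_mysql'] as $extension) {
        // extension_loaded() returns true when the named extension is available.
        $status[$extension] = extension_loaded($extension);
    }
    return $status;
}

// Print one line per extension.
foreach (mysqlExtensionStatus() as $extension => $loaded) {
    echo $extension, ': ', $loaded ? 'loaded' : 'not loaded', "\n";
}
```

Run it both from the command line and through the web server if you can, since the two may use different PHP configurations.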

We're also trying to make the wiki work with other database backends, particularly PostgreSQL, for performance and portability reasons.

Installing MediaWiki
To get the sources from CVS, see MediaWiki from CVS.

You'll find instructions in the INSTALL file in the source. Try to follow them. You can also consult MediaWiki User's Guide: Installation.

It's not necessary to download Wikipedia database dumps in order to develop MediaWiki features. In fact, in many cases it's easier to use a near-empty database with a few specially crafted test pages. However, if for some reason you want to have a copy of Wikipedia, you can get a dump from SQL dumps. Import the dumps like so:

 * Linux
   * gzip -dc cur_table.sql.gz | mysql -u wikiadmin -padminpass wikidb
   * gzip -dc old_table.sql.gz | mysql -u wikiadmin -padminpass wikidb
   * cd maintenance ; php rebuildlinks.php

 * Windows (unzip the files first; you may need -u wikiadmin -padminpass wikidb as above)
   * mysql < cur_table.sql
   * mysql < old_table.sql
   * cd maintenance
   * php rebuildlinks.php

Rebuilding the link tables may take a long time, particularly if you've installed the English database, which is quite big. (Note also that you can skip the old table if you wish.) See Database layout for what rebuildlinks.php is good for.

Note that if you want to create a public mirror of Wikipedia, this probably isn't the best way to go about it. If you do set up a mirror this way, please tweak the code to note that you're looking at a mirror and include links back to the main site.

The MediaWiki codebase
The MediaWiki codebase is large and ugly. Don't be overwhelmed by it. When you're first starting off, aim to write features or fix bugs that are confined to a small region of code.

The best way to learn about MediaWiki is to read the code. Here are some starting points:


 * index.php is the main entry point, although where things go from there is not very obvious.
 * Article.php contains code for page view, delete, rollback, watch and unwatch. It also contains some general utilities for dealing with articles, such as fetching a revision or saving a page.
 * EditPage.php has about half of the code related to editing, the half that's close to the user interface. The rest is in Article.php and the various *Update.php files.
 * Parser.php has most of the code that converts wikitext to HTML. A few bits and pieces are in Skin.php.
 * Skin.php is basically a collection of functions that generate HTML for various other components, including recent changes (RC) and the parser.
 * Code for most special pages is in the Special*.php files.
 * Database.php contains stacks of functions for accessing the database.
 * OutputPage.php is the home of the OutputPage class, which is an output buffer. Send your text here and it will be sent to stdout just before the script exits.
 * Title.php is all about titles, including interwiki titles and "#" fragments. There are some functions in here that fetch information about an article from the database.
 * User.php contains the User class, which represents user preferences and permissions.
 * Setup.php does all sorts of initialisation, and seems to account for a large proportion of running time. Among other things, it initialises lots of global variables, mostly containing objects.
 * DefaultSettings.php contains defaults for lots of global variables, which may or may not be overridden in LocalSettings.php. Don't use isset to test whether a setting exists; instead, always add a default here for any global variable you introduce.
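
The advice about defaults can be sketched as follows. This is an illustration only: $wgUseHypotheticalFeature and hypotheticalFeatureEnabled() are invented names, and the three parts are shown in one file for brevity although in MediaWiki they would live in DefaultSettings.php, LocalSettings.php, and the feature's own code.

```php
<?php
// --- DefaultSettings.php (sketch) ---
// Every setting gets a default here, so other code can rely on it existing.
// $wgUseHypotheticalFeature is an invented name used only for illustration.
$wgUseHypotheticalFeature = false;

// --- LocalSettings.php (sketch) ---
// A site administrator could override the default by uncommenting this line:
// $wgUseHypotheticalFeature = true;

// --- feature code (sketch) ---
// Because a default always exists, no isset() check is needed.
function hypotheticalFeatureEnabled(): bool {
    global $wgUseHypotheticalFeature;
    return (bool)$wgUseHypotheticalFeature;
}

echo hypotheticalFeatureEnabled() ? "feature on\n" : "feature off\n";
```

A boolean default like this also gives administrators a simple off switch for a new feature.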

Your first feature
Here are some ideas:
 * Code something that interests you.
 * Code a simple crowd-pleasing feature, such as an aesthetic improvement.
 * Write a special page to provide some handy information. You can even make a modular special page; there are examples in the extensions directory and the extensions module.
 * Write a parser hook; see Write your own MediaWiki extension for more information.
 * Fix an annoying little bug that nobody else could be bothered with.

For more specific ideas, please come and talk to the developers on #mediawiki. Don't be put off by a lack of ideas. We have enough ideas between us to keep half a dozen programmers busy for a year.

It's a good idea to talk to a senior developer (e.g. Brion or Tim) on #mediawiki before you start, especially if you're not sure how your feature will affect other parts of the code.

When you have a feature ready to go, press for CVS write access, so that you can commit it. Posting patches can be frustrating, although you may have to do it once or twice to demonstrate good faith. Before you commit your feature, make sure it can be disabled easily.

Don't ask for shell access to the Wikimedia servers. There is no way to restrict shell access to some sort of sandbox, so shell access is only given to people we really trust. It pains us to turn people down, but often we must. Wait until it is offered, or if it's taking a long time, discreetly test for support.

See also: Development policy