Gerrit/Code review

Guide to reviewing MediaWiki code.

Goals

 * Encourage new developers to continue contributing to MediaWiki.
 * Be nice. Frame your comments as helpful suggestions.
 * Tolerate idiosyncrasies where possible. Developers should feel that their creative input is valued and preserved.
 * Encourage community developers to contribute to the code under review.
 * Ensure that the code is readable, and has a style which regular developers will find reasonably familiar.
 * Contributed code should work as advertised, that is, bugs should be fixed.
 * Protect MediaWiki users by ensuring an absence of security vulnerabilities.
 * Aim for interface stability and backwards compatibility.
 * Review public interfaces carefully, so that future changes can be kept to a minimum.

Style

 * Ensure the style guide is followed, such as whitespace, line length, brace placement, etc. This is easy to check. It is essential for core contributions. For extension contributions, style deviations can be tolerated initially.

Readability

 * Functions should do what their names say. Choosing the correct verb is essential, a get* function should get something, a set* function should set something.
 * Variables names should use English, abbreviations should be avoided where possible, and explained with a comment if they can't be avoided.
 * Doc comments on functions are preferred.
 * Overly-complicated code should be simplified. This usually means making it longer. For instance:
 * Ternary operators (?:) may need to be replaced with if/else.
 * Long expressions may need to be broken up into several statements.
 * Clever use of operator precedence, shortcut evaluation, assignment expressions, etc. may need to be rewritten.
 * Duplication should be avoided.
 * Duplication, whether within files or across files, is bad for readability since it's necessary to play "spot the difference" to work out what has changed. Reading many near-copies of some code necessarily takes longer than reading a single abstraction.
 * It's also bad for maintainability, since if a bug (or missing feature) is found in the copied code, all instances of it have to be fixed.
 * It's common for new developers to copy large sections of code from other extensions or from the core, and to change some minor details in it. If a developer seems to be writing code which is "too good" for their level of experience, try grepping the code base for some code fragments, to identify the source. Comparing the original and the copy allows you to get a true feeling for the developer's level of expertise. Then guide the developer towards either rewriting or refactoring, to avoid the duplication.

Security

 * The reviewer should have read and understood security guide and should be aware of the security issues discussed there.
 * There should not be the remotest possibility of arbitrary server-side code execution. This means that there should be no eval or create_function, and no /e modifier on preg_replace.
 * Check for text protocol injection issues (XSS, SQL injection, etc.) Insist on output-side escaping.
 * Check for register_globals issues, especially classic arbitrary inclusion vulnerabilities.
 * Check any write actions for CSRF.
 * Be wary of special entry points which may bypass the security provisions in WebStart.php.
 * Be wary of unnecessary duplication of security-relevant MW core functionality, such as using $_REQUEST instead of $wgRequest, or escaping SQL with addslashes instead of $dbr->addQuotes.

Architecture

 * Names which are in a shared (global) namespace should be prefixed (or otherwise specific to the extension in question) to avoid conflicts between extensions. This includes:
 * Global variables
 * Global functions
 * Class names
 * Message names
 * The aim of modularity is to separate concerns. Modules should not depend on each other in unexpected or complex ways. The interfaces between modules should be as simple as possible without resorting to code duplication.

Logic

 * Point out shortcuts and ask the developer to do a proper job.

Programmers have a model of what the code is meant to do in their heads. As long as you're a fast typist, programming is easiest when you just translate that model into code in the obvious way. But sometimes programmers think that it will be easier to write the code if it is shorter, so they figure out a new model which is almost the same as the old one, except it requires slightly less code. They think that it will do the same thing in all the cases they have in their heads, therefore it is good enough. This is a common source of bugs, since unexpected cases later come to light.

Taking shortcuts is counterproductive, since the amount of time spent figuring out the shortcut and verifying that it works could have been spent just typing out the original idea in full. The exception to this is if the developer is a very slow typist, but most of our developers can actually type pretty well.