Manual:Coding conventions/PHP

From mediawiki.org

This page describes the coding conventions used within files of the MediaWiki codebase written in PHP. See also the general conventions that apply to all programming languages, including PHP. If you would like a short checklist to help you review your commits, try using the Pre-commit checklist.

Most of the code style rules can be automatically fixed, or at least detected, by PHP_CodeSniffer (aka PHPCS), using a custom ruleset for MediaWiki. For more information, see Continuous integration/PHP CodeSniffer.

Code structure

Spaces

MediaWiki favors a heavily-spaced style for optimum readability.

Indent with tabs, not spaces. Limit lines to 120 characters (given a tab-width of 4 characters).

Put spaces on either side of binary operators, for example:

// No:
$a=$b+$c;

// Yes:
$a = $b + $c;

Put spaces next to parentheses on the inside, except where the parentheses are empty. Do not put a space following a function name.

$a = getFoo( $b );
$c = getBar();

Put a space after the : in the function return type hint, but not before:

function square( int $x ): int {
	return $x * $x;
}

Put spaces in brackets when declaring an array, except where the array is empty. Do not put spaces in brackets when accessing array elements.

// Yes
$a = [ 'foo', 'bar' ];
$c = $a[0];
$x = [];

// No
$a = ['foo', 'bar'];
$c = $a[ 0 ];
$x = [ ];

Control structures such as if, while, for, foreach, switch, as well as the catch keyword, should be followed by a space:

// Yes
if ( isFoo() ) {
	$a = 'foo';
}

// No
if( isFoo() ) {
	$a = 'foo';
}

When type casting, do not use a space within or after the cast operator:

// Yes
(int)$foo;

// No
(int) $bar;
( int )$bar;
( int ) $bar;

In comments there should be one space between the # or // character and the comment.

// Yes: Proper inline comment
//No: Missing space
/***** Do not comment like this ***/

Ternary operator

The ternary operator can be used profitably if the expressions are very short and obvious:

$title = $page ? $page->getTitle() : Title::newMainPage();

But if you're considering a multi-line expression with a ternary operator, please consider using an if () block instead. Remember, disk space is cheap, code readability is everything, "if" is English and "?:" is not. If you are using a multi-line ternary expression, the question mark and colon should go at the beginning of the second and third lines and not the end of the first and second (in contrast to MediaWiki's JavaScript convention).

Since MediaWiki requires PHP 7.4.3 or later, use of the shorthand ternary operator (?:), also known as the Elvis operator and introduced in PHP 5.3, is allowed.

Since PHP 7.0 the null coalescing operator is also available and can replace the ternary operator in some use cases.

For example, instead of

$wiki = isset( $this->mParams['wiki'] ) ? $this->mParams['wiki'] : false;

you could instead write the following:

$wiki = $this->mParams['wiki'] ?? false;
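
The two shorthand operators are not interchangeable. The following is a minimal sketch of the difference, using a hypothetical $params array that is not part of MediaWiki: ?? falls back only for missing or null values, while ?: falls back for any falsy value.

```php
<?php
// Hypothetical parameter array, used only for illustration.
$params = [ 'wiki' => '', 'limit' => 0 ];

// ?? falls back only when the key is missing or its value is null.
$wikiCoalesce = $params['wiki'] ?? 'default';

// ?: falls back whenever the left-hand side is falsy (here: an empty string).
$wikiElvis = $params['wiki'] ?: 'default';

// A missing key is handled by ?? without raising a warning.
$missing = $params['nope'] ?? false;
```

Here $wikiCoalesce keeps the empty string, whereas $wikiElvis becomes 'default', so choose the operator according to whether falsy values are meaningful.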

String literals

Single quotes are preferred in all cases where they are equivalent to double quotes. Code using single quotes is less error-prone and easier to review, as it cannot accidentally contain escape sequences or variables. For example, the regular expression "/\\n+/" requires an extra backslash, making it slightly more confusing and error-prone than '/\n+/'. Single quotes are also easier to type on US/UK QWERTY keyboards, since they do not require pressing Shift.

However, do not be afraid of using PHP's double-quoted string interpolation feature: $elementId = "myextension-$index"; This has slightly better performance characteristics than the equivalent using the concatenation (dot) operator, and it looks nicer too.

Heredoc-style strings are sometimes useful:

$s = <<<EOT
<div class="mw-some-class">
$boxContents
</div>
EOT;

Some authors like to use END as the ending token, which is also the name of a PHP function.

Functions and parameters

Avoid passing huge numbers of parameters to functions or constructors:

// Constructor for Block.php from 1.17 to 1.26. DO NOT do this!
function __construct( $address = '', $user = 0, $by = 0, $reason = '',
	$timestamp = 0, $auto = 0, $expiry = '', $anonOnly = 0, $createAccount = 0, $enableAutoblock = 0,
	$hideName = 0, $blockEmail = 0, $allowUsertalk = 0
) {
	...
}

It quickly becomes impossible to remember the order of parameters, and you will inevitably end up having to hardcode all the defaults in callers just to customise a parameter at the end of the list. If you are tempted to code a function like this, consider passing an associative array of named parameters instead.
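
As a rough sketch of the named-parameter approach, a function can accept an options array and fill in defaults with the array plus operator. The function name makeBlockSettings() and all option keys below are hypothetical and do not reflect the actual block API.

```php
<?php
/**
 * Hypothetical helper; the option names are illustrative only.
 *
 * @param string $address
 * @param array $options Named options; see $defaults for the supported keys
 * @return array The effective settings
 */
function makeBlockSettings( string $address, array $options = [] ): array {
	$defaults = [
		'reason' => '',
		'expiry' => 'infinity',
		'anonOnly' => false,
		'createAccount' => false,
	];
	// Array plus keeps the left-hand value for keys present in both arrays,
	// so caller-supplied options win over the defaults.
	return [ 'address' => $address ] + $options + $defaults;
}

// Callers name only the options they want to change.
$settings = makeBlockSettings( '192.0.2.1', [ 'anonOnly' => true ] );
```

The call site stays readable, and adding a new option later does not change the meaning of existing calls.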

In general, using boolean parameters is discouraged in functions. In $object->getSomething( $input, true, true, false ), without looking up the documentation for MyClass::getSomething(), it is impossible to know what those parameters are meant to indicate. It is much better to either use class constants and a generic flag parameter:

$myResult = MyClass::getSomething( $input, MyClass::FROM_DB | MyClass::PUBLIC_ONLY );

Or to make your function accept an array of named parameters:

$myResult = MyClass::getSomething( $input, [ 'fromDB', 'publicOnly' ] );

Try not to repurpose variables over the course of a function, and avoid modifying the parameters passed to a function (unless they're passed by reference and that's the whole point of the function, obviously).

Assignment expressions

Using assignment as an expression is surprising to the reader and looks like an error. Do not write code like this:

if ( $a = foo() ) {
	bar();
}

Space is cheap, and you're a fast typist, so instead use:

$a = foo();
if ( $a ) {
	bar();
}

Using assignment in a while() clause used to be legitimate, for iteration:

$res = $dbr->query( 'SELECT * FROM some_table' );
while ( $row = $dbr->fetchObject( $res ) ) {
	showRow( $row );
}

This is unnecessary in new code; instead use:

$res = $dbr->query( 'SELECT * FROM some_table' );
foreach ( $res as $row ) {
	showRow( $row );
}

C borrowings

The PHP language was designed by people who love C and wanted to bring souvenirs from that language into PHP. But PHP has some important differences from C.

In C, constants are implemented as preprocessor macros and are fast. In PHP, they are implemented by doing a runtime hashtable lookup for the constant name, and are slower than just using a string literal. In most places where you would use an enum or enum-like set of macros in C, you can use string literals in PHP.

PHP has three special literals for which upper-/lower-/mixed-case is insignificant in the language (since PHP 5.1.3), but for which our convention is always lowercase: true, false and null.

Use elseif not else if. They have subtly different meanings:

// This:
if ( $foo === 'bar' ) {
	echo 'Hello world';
} else if ( $foo === 'Bar' ) {
	echo 'Hello world';
} else if ( $baz === $foo ) {
	echo 'Hello baz';
} else {
	echo 'Eh?';
}

// Is actually equivalent to:
if ( $foo === 'bar' ) {
	echo 'Hello world';
} else {
	if ( $foo === 'Bar' ) {
		echo 'Hello world';
	} else {
		if ( $baz === $foo ) {
			echo 'Hello baz';
		} else {
			echo 'Eh?';
		}
	}
}

And the latter has poorer performance.

Alternative syntax for control structures

PHP offers an alternative syntax for control structures using colons and keywords such as endif, endwhile, etc.:

if ( $foo == $bar ):
	echo "<div>Hello world</div>";
endif;

This syntax should be avoided, as it prevents many text editors from automatically matching and folding braces. Braces should be used instead:

if ( $foo == $bar ) {
	echo "<div>Hello world</div>";
}

Brace placement

See Manual:Coding conventions#Indenting and alignment.

For anonymous functions, prefer arrow functions when the anonymous function consists only of one line. Arrow functions are more concise and readable than regular anonymous functions and neatly side-step formatting issues that arise with single-line anonymous functions.[1]
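
For comparison, here is a small sketch of the same callback written both ways. The variable names are illustrative only.

```php
<?php
$factor = 3;
$values = [ 1, 2, 3 ];

// Regular anonymous function: outer variables need an explicit use clause.
$tripledLong = array_map( function ( $value ) use ( $factor ) {
	return $value * $factor;
}, $values );

// Arrow function: $factor is captured by value automatically.
$tripledShort = array_map( fn ( $value ) => $value * $factor, $values );
```

Both calls produce the same array; the arrow function simply avoids the use clause and the multi-line body.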

Type declarations in function parameters

Use type declarations and return type declarations (type hinting) when applicable. (But see #Don't add type declarations for "big" legacy classes below.)

Note that before 7.4 PHP cannot handle type hint restriction / relaxation in subclasses.

Scalar typehints are allowed as of MediaWiki 1.35, following the switch to PHP 7.2 (T231710).

Use PHP 7.1 syntax for nullable parameters: choose

public function foo( ?MyClass $mc ) {}

instead of

public function foo( MyClass $mc = null ) {}

The former conveys precisely the nullability of a parameter, without risking any ambiguity with optional parameters. IDEs and static analysis tools will also recognize it as such, and will not complain if a non-nullable parameter follows a nullable one.
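
The distinction between "nullable" and "optional" can be sketched as follows. MyClass and both functions are placeholders, and combining ?MyClass with a null default is shown here as one possible way to express a parameter that is both nullable and optional, not as a rule stated on this page.

```php
<?php
// MyClass and both functions are placeholders for illustration.
class MyClass {
}

// Nullable but required: callers must pass a value, which may be null.
function describeNullable( ?MyClass $mc ): string {
	return $mc === null ? 'null' : 'object';
}

// Nullable and optional: both facts are stated explicitly.
function describeOptional( ?MyClass $mc = null ): string {
	return $mc === null ? 'null' : 'object';
}

// Omitting a required nullable argument is an error, not an implicit null.
$tooFewArguments = false;
try {
	describeNullable();
} catch ( ArgumentCountError $e ) {
	$tooFewArguments = true;
}
```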

Naming

Use lowerCamelCase when naming functions or variables. For example:

private function doSomething( $userPrefs, $editSummary )

Use UpperCamelCase when naming classes: class ImportantClass. Use uppercase with underscores for global and class constants: DB_PRIMARY, Revision::REV_DELETED_TEXT. Other variables are usually lowercase or lowerCamelCase; avoid using underscores in variable names.

There are also some prefixes used in different places:

Functions

  • wf (wiki functions) – top-level functions, e.g.
    function wfFuncname() { ... }
    
  • ef (extension functions) – global functions in extensions, although "in most cases modern style puts hook functions as static methods on a class, leaving few or no raw top-level functions to be so named." (-- brion in Manual_talk:Coding_conventions#ef_prefix_9510)

Verb phrases are preferred: use getReturnText() instead of returnText(). When exposing functions for use in testing, mark these as @internal per the Stable interface policy. Misuse or unofficial reliance on these is more problematic than most internal methods, and as such we tend to make these throw if they run outside of a test environment.

/**
 * Reset example data cache.
 *
 * @internal For testing only
 */
public static function clearCacheForTest(): void {
	if ( !defined( 'MW_PHPUNIT_TEST' ) ) {
		throw new RuntimeException( 'Not allowed outside tests' );
	}
	self::$exampleDataCache = [];
}

Variables

  • $wg – global variables, e.g. $wgTitle. Always use this for new globals, so that it's easy to spot missing "global $wgFoo" declarations. In extensions, the extension name should be used as a namespace delimiter. For example, $wgAbuseFilterConditionLimit, not $wgConditionLimit.
  • Global declarations should be at the beginning of a function so dependencies can be determined without having to read the whole function.

It is common to work with an instance of the Database class; we have a naming convention for these which helps keep track of the nature of the server to which we are connected. This is of particular importance in replicated environments, such as Wikimedia and other large wikis; in development environments, there is usually no difference between the two types, which can conceal subtle errors.

  • $dbw – a Database object for writing (a primary connection)
  • $dbr – a Database object for non-concurrency-sensitive reading (this may be a read-only replica, slightly behind primary state, so don't ever try to write to the database with it, or get an "authoritative" answer to important queries like permissions and block status)

The following may be seen in old code but are discouraged in new code:

  • $ws – Session variables, e.g. $_SESSION['wsSessionName']
  • $wc – Cookie variables, e.g. $_COOKIE['wcCookieName']
  • $wp – Post variables (submitted via form fields), e.g. $wgRequest->getText( 'wpLoginName' )
  • $m – object member variables: $this->mPage. This is discouraged in new code, but try to stay consistent within a class.

Pitfalls

empty()

The empty() function should only be used when you want to suppress errors. Otherwise just use ! (boolean conversion).

  • empty( $var ) essentially does !isset( $var ) || !$var.
    Common use case: Optional boolean configuration keys that default to false. $this->enableFoo = !empty( $options['foo'] );
  • Beware of boolean conversion pitfalls.
  • It suppresses errors about undefined properties and variables. If only intending to test for undefined, use !isset(). If only intending to test for "empty" values (e.g. false, 0, [], etc.), use !.

isset()

Do not use isset() to test for null. Using isset in this situation could introduce errors by hiding misspelled variable names. Instead, use $var === null.

Boolean conversion

if ( !$var ) {
    
}
  • Do not use ! or empty to test if a string or array is empty, because PHP considers '0' to be falsy – but '0' is a valid title and valid user name in MediaWiki. Use === '' or === [] instead.
  • Study the rules for conversion to boolean. Be careful when converting strings to boolean.
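
A brief sketch of the '0' pitfall described above. The helper isEmptyName() is hypothetical and only illustrates the strict comparison.

```php
<?php
// Hypothetical helper: returns true only for the empty string.
function isEmptyName( string $name ): bool {
	return $name === '';
}

// '0' is a valid title or user name, but PHP treats it as falsy.
$name = '0';
$looksEmptyWithNot = !$name;
$looksEmptyWithEmpty = empty( $name );
$isReallyEmpty = isEmptyName( $name );
```

Both ! and empty() wrongly report the name as empty, while the strict comparison against '' does not.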

Other

  • Array plus does not renumber the keys of numerically-indexed arrays, so [ 'a' ] + [ 'b' ] === [ 'a' ]. If you want keys to be renumbered, use array_merge(): array_merge( [ 'a' ], [ 'b' ] ) === [ 'a', 'b' ]
  • Make sure you have error_reporting() set to -1. This will notify you of undefined variables and other subtle gotchas that stock PHP will ignore. See also Manual:How to debug .
  • When working in a pure PHP file (e.g. not an HTML template), omit any trailing ?> tags. These tags often cause issues with trailing white-space and "headers already sent" error messages (cf. bugzilla:17642 and http://news.php.net/php.general/280796). It is conventional in version control for files to have a new line at end-of-file (which editors may add automatically), which would then trigger this error.
  • Do not use the goto syntax introduced in PHP 5.3. PHP may have introduced the feature, but that does not mean we should use it.
  • Do not pass by reference when traversing an array with foreach unless you have to. Even then, be aware of the consequences. (See https://web.archive.org/web/20220924191559/https://www.intracto.com/en-be/blog/php-quirks-passing-an-array-by-reference/ and the PHP manual)
  • PHP lets you declare static variables even within a non-static method of a class. This has led to subtle bugs in some cases, as the variables are shared between instances. Where you would not use a private static property, do not use a static variable either.
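
Two of the pitfalls above are easier to see in code. The following sketch shows a stale foreach reference corrupting an array, and a static variable being shared between instances; the Counter class is hypothetical.

```php
<?php
$items = [ 1, 2, 3 ];
foreach ( $items as &$item ) {
	$item *= 2;
}
// $item is still a reference to the last element here.
foreach ( $items as $item ) {
	// Each iteration overwrites the last element through the stale reference.
}
$afterStaleReference = $items;

$safe = [ 1, 2, 3 ];
foreach ( $safe as &$value ) {
	$value *= 2;
}
// Breaking the reference right after the loop avoids the problem.
unset( $value );
foreach ( $safe as $value ) {
	// No side effects on $safe.
}

// Hypothetical class with a static variable inside a non-static method.
class Counter {
	public function increment(): int {
		static $count = 0;
		return ++$count;
	}
}
// Two separate instances share the same $count.
$first = ( new Counter() )->increment();
$second = ( new Counter() )->increment();
```

After the second loop, $afterStaleReference is [ 2, 4, 4 ] rather than [ 2, 4, 6 ], and the second Counter instance returns 2 even though it was only called once.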

Equality operators

Be careful with double-equals comparison operators. Triple-equals (===) is generally more intuitive and should be preferred unless you have a reason to use double-equals (==).

  • '000' == '0' is true (!)
  • '000' === '0' is false
  • To check if two scalars that are supposed to be numeric are equal, use ==, e.g. 5 == "5" is true.
  • To check if two variables are both of type 'string' and are the same sequence of characters, use ===, e.g. "1.e6" === "1.0e6" is false.

Watch out for internal functions and constructs that use weak comparisons; for instance, provide the third parameter to in_array, and don't mix scalar types in switch constructs.
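
For instance, in_array() compares loosely unless its third parameter is true. A minimal sketch, with an illustrative $knownIds array:

```php
<?php
$knownIds = [ '0', '10' ];

// Loose comparison: numeric strings are compared as numbers, so '000' matches '0'.
$looseMatch = in_array( '000', $knownIds );

// Strict comparison: the strings must be identical.
$strictMatch = in_array( '000', $knownIds, true );
```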

Do not use Yoda conditionals.

JSON number precision

JSON uses JavaScript's type system, so all numbers are represented as 64-bit IEEE floating point numbers. This means that numbers lose precision as they get bigger, to the point where some whole numbers become indistinguishable: numbers beyond 2^52 will have a precision worse than ±0.5, so a large integer may end up changing to a different integer. To avoid this issue, represent potentially large integers as strings in JSON.
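
The loss of precision can be reproduced in PHP itself by casting to float. This sketch assumes a 64-bit PHP build, where the value below fits in a native integer; the 'id' key is illustrative.

```php
<?php
// Assumes a 64-bit PHP build, where this value fits in a native integer.
// The value is 2^53 + 1.
$bigId = 9007199254740993;

// As a 64-bit float, the value collapses onto its neighbour 2^53.
$collapses = (float)$bigId === (float)( $bigId - 1 );

// Encoding the integer as a string keeps every digit intact.
$json = json_encode( [ 'id' => (string)$bigId ] );
$decoded = json_decode( $json, true );
```

A consumer that parses numbers as floats would be unable to tell these two IDs apart, whereas the string form survives the round trip unchanged.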

Dos and don'ts

Don't use built-in serialization

PHP's built-in serialization mechanism (the serialize() and unserialize() functions) should not be used for data stored outside of (or read from outside of) the current process. Use JSON-based serialization instead (however, beware the pitfalls). This policy was established by RFC T161647.

The reason is twofold: (1) data serialized with this mechanism cannot reliably be unserialized with a later version of the same class, and (2) crafted serialized data can be used to execute malicious code, posing a serious security risk.

Sometimes, your code will not control the serialization mechanism, but will be using some library or driver that uses it internally. In such cases, steps should be taken to mitigate risk. The first issue mentioned above can be mitigated by converting any data to arrays or plain anonymous objects before serialization. The second issue can perhaps be mitigated using the whitelisting feature PHP 7 introduces for unserialization.
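
As a sketch of the second mitigation, unserialize() accepts an allowed_classes option since PHP 7.0. The Payload class below is hypothetical and stands in for whatever class an untrusted payload might name.

```php
<?php
// Payload is a hypothetical class standing in for untrusted serialized input.
class Payload {
	public $value = 'data';
}

$serialized = serialize( new Payload() );

// With allowed_classes set to false, no Payload object is instantiated;
// PHP produces a __PHP_Incomplete_Class placeholder instead.
$restored = unserialize( $serialized, [ 'allowed_classes' => false ] );

// Plain arrays round-trip without involving any class.
$plain = unserialize(
	serialize( [ 'value' => 'data' ] ),
	[ 'allowed_classes' => false ]
);
```

This limits which classes can be instantiated, but it does not address the compatibility problem, so converting data to plain arrays before serialization remains worthwhile.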

Don't add type declarations for "big" legacy classes

MediaWiki contains some big classes that are going to be split up or replaced sooner or later. This will be done in a way that keeps code compatible for a transition period, but it can break extension code that expects the legacy classes in parameter types, return types, property types, or similar. For instance, a hook handler's $title parameter may be passed some kind of MockTitleCompat class instead of a real Title.

Such big legacy classes should therefore not be used in type hints, only in PHPDoc.[2][3] The classes include:

  • Title
  • Article
  • WikiPage
  • User
  • MediaWiki
  • OutputPage
  • WebRequest
  • EditPage

Comments and documentation

It is essential that your code be well documented so that other developers and bug fixers can easily navigate the logic of your code. New classes, methods, and member variables should include comments providing brief descriptions of their functionality (unless it is obvious), even if private. In addition, all new methods should document their parameters and return values.

We use the Doxygen documentation style (it is very similar to PHPDoc for the subset that we use) to produce auto-generated documentation from code comments (see Manual:Mwdocgen.php). Begin a block of Doxygen comments with /**, instead of the Qt-style formatting /*!. Doxygen structural commands start with @tagname. (Use @ rather than \ as the escape character – both styles work in Doxygen, but for backwards and future compatibility MediaWiki has chosen the @param style.) They organize the generated documentation (using @ingroup) and identify authors (using @author tags).

They describe a function or method, the parameters it takes (using @param), and what the function returns (using @return). The format for parameters is:

@param type $paramName Description of parameter

If a parameter can be of multiple types, separate them with the pipe '|' character, for example:

@param string|Language|bool $lang Language for the ToC title, defaults to user language

Continue sentences belonging to an annotation on the next line, indented with one additional space.

For every public interface (method, class, variable, whatever) you add or change, provide a @since VERSION tag, so people extending the code via this interface know they are breaking compatibility with older versions of the code.

class Foo {

	/**
	 * @var array Description here
	 * @example [ 'foo' => Bar, 'quux' => Bar, .. ]
	 */
	protected $bar;

	/**
	 * Description here, followed by documentation of the parameters.
	 *
	 * Some example:
	 * @code
	 * ...
	 * @endcode
	 *
	 * @since 1.24
	 * @param FooContext $context context for decoding Foos
	 * @param array|string $options Optionally pass extra options. Either a string
	 *  or an array of strings.
	 * @return Foo|null New instance of Foo or null if quuxification failed.
	 */
	public function makeQuuxificatedFoo( FooContext $context = null, $options = [] ) {
		/* .. */
	}

}

FIXME usually means something is bad or broken. TODO means that improvements are needed; it does not necessarily mean that the person adding the comment is going to do it. HACK means that a quick but inelegant, awkward or otherwise suboptimal solution to an immediate problem was made, and that eventually a more thorough rewrite of the code should be done.

Source file headers

In order to be compliant with most licenses you should have something similar to the following (specific to GPLv2 applications) at the top of every source file.

<?php
/**
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 */

Doxygen tags

We use the following annotations which Doxygen recognizes. Use them in this order, for consistency:

File level:

  • @file

Class, class member, or global member:

  • @todo
  • @var
  • @stable, @newable, @deprecated, @internal, @private
  • @see
  • @since
  • @ingroup
  • @param
  • @return
  • @throws
  • @author

Test annotations

In tests, we use the following annotations among others. These aren't merely documentation, they mean something to PHPUnit and affect test execution.

  • @depends
  • @group
  • @covers
  • @dataProvider


Integration

There are a few pieces of code in the MediaWiki codebase which are intended to be standalone and easily portable to other applications. While some of these now exist as separate libraries, others remain within the MediaWiki source tree (e.g. the files in /includes/libs). Apart from these, code should be integrated into the rest of the MediaWiki environment, and should allow other areas of the codebase to integrate with it in return.

Visibility

Mark code as private unless there is a reason to make it more visible. Don't just make everything protected (= public to subclasses) or public.

Global objects

Do not access the PHP superglobals $_GET, $_POST, etc., directly; use $request->get*( 'param' ) instead; there are various functions depending on what type of value you want. You can get a WebRequest from the nearest RequestContext, or if absolutely necessary RequestContext::getMain(). Equally, do not access $_SERVER directly; use $request->getIP() if you want to get the IP address of the current user.

Static methods and properties

Static methods and properties can be used in PHP, but care should be taken when inheriting to distinguish between the self and static keywords. self will always refer to the class in which it was defined, whereas static will refer to the particular sub-class invoking it. See the PHP documentation on Late Static Bindings for more details.
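
The difference between the two keywords is easiest to see with a factory method. ParentModel and ChildModel are hypothetical classes used only for this sketch.

```php
<?php
// ParentModel and ChildModel are hypothetical classes for illustration.
class ParentModel {
	public static function createWithSelf() {
		// self always refers to ParentModel, the class where this is written.
		return new self();
	}

	public static function createWithStatic() {
		// static refers to the class the method was actually called on.
		return new static();
	}
}

class ChildModel extends ParentModel {
}

$viaSelf = ChildModel::createWithSelf();
$viaStatic = ChildModel::createWithStatic();
```

Although both methods are called on ChildModel, only createWithStatic() returns a ChildModel instance; createWithSelf() returns a ParentModel.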

Classes

Encapsulate your code in an object-oriented class, or add functionality to existing classes; do not add new global functions or variables. Try to be mindful of the distinction between 'backend' classes, which represent entities in the database (e.g. User, Block, RevisionRecord, etc.), and 'frontend' classes, which represent pages or interfaces visible to the user (SpecialPage, Article, ChangesList, etc.). Even if your code is not obviously object-oriented, you can put it in a static class (e.g. IP or Html).

As a holdover from PHP 4's lack of private class members and methods, older code will be marked with comments such as /** @private */ to indicate the intention; respect this as if it were enforced by the interpreter.

Mark new code with proper visibility modifiers, including public if appropriate, but do not add visibility to existing code without first checking, testing and refactoring as required. It's generally a good idea to avoid visibility changes unless you're making changes to the function which would break old uses of it anyway.

Error handling

In general, you should not suppress PHP errors. The proper method of handling errors is to actually handle the errors.

For example, if you are thinking of using an error suppression operator to suppress an invalid array index warning, you should instead perform an isset check on the array index before trying to access it. When possible, always catch or naturally prevent PHP errors.

You may use PHP's @ operator only in situations where you expect an unavoidable PHP warning. This is for cases where:

  1. It is impossible to anticipate the error that is about to occur; and
  2. You are planning on handling the error in an appropriate manner after it occurs.

We use PHPCS to warn against use of the at-operator. If you really need to use it, you'll also need to instruct PHPCS to make an exemption, like so:

// phpcs:ignore Generic.PHP.NoSilencedErrors.Discouraged
$content = @file_get_contents( $path );

An example use case is opening a file with fopen(). You can try to predict the error by calling file_exists() and is_readable(), but unlike isset(), such file operations add significant overhead and make for unstable code. For example, the file may be deleted or changed between the check and the actual fopen() call (see TOC/TOU).

In this case, write the code to just try the main operation you need to do. Then handle the case of the file failing to open, by using the @ operator to prevent PHP from being noisy, and then check the result afterwards. For fopen() and filemtime(), that means checking for a boolean false return, and then performing a fallback or throwing an exception.
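
A minimal sketch of this pattern follows. The helper readFirstLine() is hypothetical, and returning null is just one possible fallback.

```php
<?php
/**
 * Hypothetical helper: read the first line of a file, or null on failure.
 *
 * @param string $path
 * @return string|null
 */
function readFirstLine( string $path ): ?string {
	// Try the operation directly instead of checking file_exists() first.
	// phpcs:ignore Generic.PHP.NoSilencedErrors.Discouraged
	$handle = @fopen( $path, 'r' );
	if ( $handle === false ) {
		// Fallback: report the failure to the caller as null.
		return null;
	}
	$line = fgets( $handle );
	fclose( $handle );
	// fgets() returns false for an empty file.
	return $line === false ? '' : rtrim( $line, "\r\n" );
}

// A path that does not exist yields null without emitting a warning.
$result = readFirstLine( __DIR__ . '/does-not-exist.example' );
```

Because the check happens on the result of fopen() itself, there is no window between a separate existence check and the actual open.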

AtEase

For PHP 5 and earlier, MediaWiki developers discouraged use of the @ operator due to it causing unlogged and unexplained fatal errors (r39789). Instead, we used custom AtEase::suppressWarnings() and AtEase::restoreWarnings() methods from the at-ease library. The reason is that the at-operator caused PHP to not provide error messages or stack traces upon fatal errors. While the at-operator is mainly intended for non-fatal errors (not exceptions or fatals), if a fatal were to happen it would make for a very poor developer experience.

use Wikimedia\AtEase\AtEase;

AtEase::suppressWarnings();
$content = file_get_contents( $path );
AtEase::restoreWarnings();

In PHP 7, the exception handler was fixed (example) to always provide such errors, including a stack trace, regardless of error suppression. In 2020, a phase-out of AtEase began, reinstating the at-operator (T253461).

Exception handling

Exceptions can be checked (meaning callers are expected to catch them) or unchecked (meaning callers must not catch them).

Unchecked exceptions are commonly used for programming errors, such as invalid arguments passed to a function. These exceptions should generally use (either directly or by subclassing) the SPL[4] exception classes, and must not be documented with @throws annotations.

Checked exceptions, on the other hand, must always be documented with @throws annotations. When calling a method that can throw a checked exception, said exception must either be caught, or documented in the caller's doc comment. Checked exceptions should generally use dedicated exception classes extending Exception. It's recommended not to use SPL exceptions as base classes for checked exceptions, so that correct usage of exception classes can be enforced with static code analyzers.

The base Exception class must never be thrown directly: use more specific exception classes instead. It can be used in a catch clause if the intention is to catch all possible exceptions, but Throwable is usually more correct for that purpose.
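
The rules above might look like this in practice. ConfigNotFoundException and loadSetting() are hypothetical names chosen for this sketch.

```php
<?php
// ConfigNotFoundException and loadSetting() are hypothetical names.
// A dedicated class extending Exception marks this as a checked exception.
class ConfigNotFoundException extends Exception {
}

/**
 * @param array $config
 * @param string $name
 * @return mixed
 * @throws ConfigNotFoundException If the setting does not exist
 */
function loadSetting( array $config, string $name ) {
	if ( $name === '' ) {
		// Programming error: unchecked SPL exception, so no @throws annotation.
		throw new InvalidArgumentException( 'Setting name must not be empty' );
	}
	if ( !array_key_exists( $name, $config ) ) {
		// Expected failure that callers should handle: checked exception.
		throw new ConfigNotFoundException( "No such setting: $name" );
	}
	return $config[$name];
}

// The caller catches the checked exception and falls back to null.
try {
	$value = loadSetting( [ 'limit' => 10 ], 'missing' );
} catch ( ConfigNotFoundException $e ) {
	$value = null;
}
```

A caller that does not catch ConfigNotFoundException would instead repeat the @throws annotation in its own doc comment.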

In legacy code it is relatively common to throw or subclass the MWException class. This class must be avoided in new code, as it does not provide any advantage, and could actually be confusing (T86704).

When creating a new exception class, consider implementing INormalizedException if the exception message contains variable parts, and ILocalizedException if the exception message is shown to users.

See also

Notes