Manual:Coding conventions

Shortcut: CC
From mediawiki.org

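This page describes the coding conventions used within the MediaWiki codebase and in extensions intended for use on Wikimedia websites, including appropriate naming conventions. Code that does not follow these conventions may be given a negative vote by code reviewers; treat this as a request to fix the style issues and update the patch.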

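This page lists general conventions that apply to all MediaWiki code, whatever language it is written in. Guidelines that apply to specific components or file types in MediaWiki are on the language-specific subpages; wikitech hosts further guidelines that apply at least to operations/puppet.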

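Code structure

File formatting

Tab size

Lines should be indented with a single tab character per indenting level. Make no assumptions about the number of spaces per tab. Most MediaWiki developers find 4 spaces per tab to be the most readable, but many systems are configured to use 8 spaces per tab, and some developers might use 2 spaces per tab.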

For vim users, one way to establish these settings is to add the following to $HOME/.vimrc:

autocmd Filetype php setlocal ts=4 sw=4

with similar lines for CSS, HTML, and JavaScript.

However, for Python, follow the whitespace guidelines from PEP 8, which recommends spaces for new projects.

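Newlines

All files should use Unix-style newlines (a single LF character, not a CR+LF combination).

  • git on Windows will, by default, convert CR+LF newlines to LF when committing.

All files should have a newline at the end.

  • It makes sense, since all other lines have a newline character at the end.
  • It makes passing data around in non-binary formats (like diffs) easier.
  • Command-line tools such as cat and wc do not handle files without a trailing newline well (or at least, not in the way that one would like or expect).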
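The two newline rules above can be checked mechanically. The following is a minimal sketch, not part of any MediaWiki tooling; the function name checkNewlines and the wording of its messages are hypothetical.

```javascript
// Hypothetical helper: returns a list of newline problems found in the
// text content of a file. An empty list means both rules are satisfied.
function checkNewlines( text ) {
	const problems = [];
	// Any CR character means the file is not using LF-only newlines.
	if ( text.includes( '\r' ) ) {
		problems.push( 'contains CR characters; use LF only' );
	}
	// A non-empty file must end with a newline.
	if ( text.length > 0 && !text.endsWith( '\n' ) ) {
		problems.push( 'missing newline at end of file' );
	}
	return problems;
}
```

A pre-commit script could, for example, read each changed file as a string and reject the commit when checkNewlines returns a non-empty list.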

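Encoding

All text files must be encoded as UTF-8 without a byte order mark (BOM).

Do not use Microsoft Notepad to edit files, as it always inserts a BOM. A BOM stops PHP files from working, because it is a special character at the very top of the file and is sent to the client as page output.

In short, make sure your editor supports UTF-8 without a BOM.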
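A UTF-8 BOM is the three-byte sequence EF BB BF at the very start of a file, so it is easy to detect. The sketch below assumes the file has already been read into a byte array (a Uint8Array or a Node.js Buffer); the names hasUtf8Bom and stripUtf8Bom are hypothetical.

```javascript
// Hypothetical helper: true if the bytes start with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom( bytes ) {
	return bytes.length >= 3 &&
		bytes[ 0 ] === 0xEF &&
		bytes[ 1 ] === 0xBB &&
		bytes[ 2 ] === 0xBF;
}

// Hypothetical helper: returns a view of the bytes without a leading BOM,
// or the input unchanged when there is no BOM.
function stripUtf8Bom( bytes ) {
	return hasUtf8Bom( bytes ) ? bytes.subarray( 3 ) : bytes;
}
```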

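Trailing whitespace

When using an IDE, pressing the Home and End keys (among other keyboard shortcuts) usually ignores trailing whitespace and jumps to the end of the code, as intended. In non-IDE text editors, however, pressing End jumps to the very end of the line, which means the developer must delete the trailing whitespace to get to the spot where they actually want to type.

Removing trailing whitespace is a trivial operation in most text editors. Developers should avoid adding trailing whitespace, primarily on lines that contain other visible code.

Some tools make it easier:

  • nano: GNU nano 3.2;
  • Komodo Edit: in the Save Options under "Edit > Preferences", enable "Clean trailing whitespace and EOL markers" and "Only clean changed lines";
  • Kate: you can see trailing spaces by enabling the "Highlight trailing spaces" option, found in "Settings > Configure Kate > Appearance". You can also tell Kate to clean up trailing spaces in "Settings > Configure Kate > Open/Save";
  • vim: various automatic cleanup plugins;
  • Sublime Text: the TrailingSpaces plugin.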
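Where none of these editors is available, the cleanup itself is a one-line regular expression. This is a minimal sketch under the assumption that only spaces and tabs count as trailing whitespace; the function name stripTrailingWhitespace is hypothetical.

```javascript
// Hypothetical helper: removes spaces and tabs at the end of every line.
// The "m" flag makes "$" match at the end of each line, not only at the
// end of the whole string. Leading indentation is left untouched.
function stripTrailingWhitespace( text ) {
	return text.replace( /[ \t]+$/gm, '' );
}
```

Note that this also empties lines that consist only of whitespace, which is usually the desired result.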

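Keywords

Do not use parentheses with keywords (e.g. require_once, require) where they are not necessary.

Indenting and alignment

General style

MediaWiki's indenting style is similar to the so-called "One True Brace Style". Braces are placed on the same line as the start of the function, conditional, loop, etc. The else/elseif is placed on the same line as the previous closing brace.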

function wfTimestampOrNull( $outputtype = TS_UNIX, $ts = null ) {
	if ( $ts === null ) {
		return null;
	} else {
		return wfTimestamp( $outputtype, $ts );
	}
}

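Multi-line statements are written with the second and subsequent lines indented by one extra level.

Use indenting and line breaks to clarify the logical structure of your code. Expressions which nest multiple levels of parentheses or similar structures may begin a new indenting level with each nesting level: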

$wgAutopromote = [
	'autoconfirmed' => [ '&',
		[ APCOND_EDITCOUNT, &$wgAutoConfirmCount ],
		[ APCOND_AGE, &$wgAutoConfirmAge ],
	],
];

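Vertical alignment

Avoid vertical alignment. It tends to create diffs which are hard to interpret, since the width allowed for the left column constantly has to be increased as more items are added.

Most diff tools provide options to ignore whitespace changes.
Git: git diff -w

When needed, create mid-line vertical alignment with spaces rather than tabs. For instance, this: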

$namespaceNames = [
	NS_MEDIA            => 'Media',
	NS_SPECIAL          => 'Special',
	NS_MAIN             => '',
];

is achieved as follows, with spaces rendered as dots:

$namespaceNames·=·[
 →  NS_MEDIA············=>·'Media',
 →  NS_SPECIAL··········=>·'Special',
 →  NS_MAIN·············=>·'',
];

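(If you use the tabular vim plugin, entering :Tabularize /= will align the "=" signs.)

Line width

Lines should be broken at between 80 and 100 columns. There are some rare exceptions to this, but functions which take lots of parameters are not exceptions. The idea is that code should not overflow off the screen when word wrap is turned off.

The operator separating the two lines should be placed consistently (always at the end or always at the start of the line). Individual languages might have more specific rules.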

return strtolower( $val ) === 'on'
	|| strtolower( $val ) === 'true'
	|| strtolower( $val ) === 'yes'
	|| preg_match( '/^\s*[+-]?0*[1-9]/', $val );

$foo->dobar(
	Xml::fieldset( wfMessage( 'importinterwiki' )->text() ) .
		Xml::openElement( 'form', [ 'method' => 'post', 'action' => $action, 'id' => 'mw-import-interwiki-form' ] ) .
		wfMessage( 'import-interwiki-text' )->parse() .
		Xml::hidden( 'action', 'submit' ) .
		Xml::hidden( 'source', 'interwiki' ) .
		Xml::hidden( 'editToken', $wgUser->editToken() ),
	'secondArgument'
);

The method operator should always be put at the beginning of the next line.

$this->getMockBuilder( Message::class )->setMethods( [ 'fetchMessage' ] )
	->disableOriginalConstructor()
	->getMock();

When continuing "if" statements, switching to Allman-style braces makes the separation between the condition and the body clearer:

if ( $.inArray( mw.config.get( 'wgNamespaceNumber' ), whitelistedNamespaces ) !== -1 &&
	mw.config.get( 'wgArticleId' ) > 0 &&
	( mw.config.get( 'wgAction' ) == 'view' || mw.config.get( 'wgAction' ) == 'purge' ) &&
	mw.util.getParamValue( 'redirect' ) !== 'no' &&
	mw.util.getParamValue( 'printable' ) !== 'yes'
) {
	
}

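Opinions differ on the amount of indentation that should be used for the conditional part. Using an amount of indentation different from that used by the body makes it clearer that the conditional part is not the body, but this is not universally observed.

Continuations of conditionals and very long expressions tend to be ugly whichever way you write them, so it is sometimes best to break them up by means of temporary variables.

Braceless control structures

Do not write "blocks" as a single line. They reduce the readability of the code by moving important statements away from the left margin, where the reader is looking for them. Remember that making code shorter doesn't make it simpler. The goal of coding style is to communicate effectively with humans, not to fit computer-readable text into a small space.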

// No:
if ( $done ) return;

// No:
if ( $done ) { return; }

// Yes:
if ( $done ) {
	return;
}

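This avoids a common logic error, which is especially prevalent when the developer is using a text editor that does not have a "smart indenting" feature. The error occurs when a single-line block is later extended to two lines: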

if ( $done )
	return;

Which is later changed to:

if ( $done )
	$this->cleanup();
	return;

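This has the potential to create subtle bugs: despite the indentation, the return statement is no longer part of the conditional and always executes.

emacs style

In emacs, using php-mode.el from nXHTML mode, you can set up a MediaWiki minor mode in your .emacs file: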

(defconst mw-style
  '((indent-tabs-mode . t)
    (tab-width . 4)
    (c-basic-offset . 4)
    (c-offsets-alist . ((case-label . +)
                        (arglist-cont-nonempty . +)
                        (arglist-close . 0)
                        (cpp-macro . (lambda(x) (cdr x)))
                        (comment-intro . 0)))
    (c-hanging-braces-alist
        (defun-open after)
        (block-open after)
        (defun-close))))

(c-add-style "MediaWiki" mw-style)

(define-minor-mode mah/mw-mode
  "tweak style for mediawiki"
  nil " MW" nil
  (delete-trailing-whitespace)
  (tabify (point-min) (point-max))
  (subword-mode 1)) ;; If this gives an error, try (c-subword-mode 1)), which is the earlier name for it

;; Add other sniffers as needed
(defun mah/sniff-php-style (filename)
  "Given a filename, provide a cons cell of
   (style-name . function)
where style-name is the style to use and function
sets the minor-mode"
  (cond ((string-match "/\\(mw[^/]*\\|mediawiki\\)/"
                       filename)
         (cons "MediaWiki" 'mah/mw-mode))
        (t
         (cons "cc-mode" (lambda (n) t)))))

(add-hook 'php-mode-hook (lambda () (let ((ans (when (buffer-file-name)
                                                 (mah/sniff-php-style (buffer-file-name)))))
                                      (c-set-style (car ans))
                                      (funcall (cdr ans) 1))))

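The mah/sniff-php-style function above checks your path when php-mode is invoked to see whether it contains "mw" or "mediawiki", and if so sets the buffer to use the mw-mode minor mode for editing MediaWiki source. You will know that the buffer is using mw-mode because you'll see something like "PHP MW" or "PHP/lw MW" in the mode line.

Data manipulation

Constructing URLs

Do not construct URLs manually with string concatenation or similar means. Always use the full URL format for requests made by your code (especially POST and background requests).

You can use the appropriate Linker or Title method in PHP, the fullurl magic word in wikitext, the mw.util.getUrl() method in JavaScript, and similar methods in other languages. This avoids issues with unexpected short URL configurations.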

File naming

Files which contain server-side code should be named in UpperCamelCase. This is also our naming convention for extensions.[1] Name the file after the most important class it contains; most files will contain only one class, or a base class and a number of descendants. For example, Title.php contains only the Title class; WebRequest.php contains the WebRequest class, and also its descendants FauxRequest and DerivativeRequest.

Access point files

Name "access point" files, such as SQL files and PHP entry points (for example foobar.sql and index.php), in lowercase. Maintenance scripts are generally in lowerCamelCase, although this varies somewhat. Files intended for the site administrator, such as readmes, licenses and changelogs, are usually in UPPERCASE.

Never include spaces in file names or directories, and never use non-ASCII characters. For lowercase titles, hyphens are preferred to underscores.

JS, CSS, and media files

JavaScript, CSS and other frontend files (usually registered via ResourceLoader) should be placed in a directory named after the module bundle in which they are registered. For example, module mediawiki.foo might have files mediawiki.foo/Bar.js and mediawiki.foo/baz.css.

JavaScript files that define classes should match exactly the name of the class they define. The class TitleWidget should be in a file named as, or ending with, TitleWidget.js. This allows for rapid navigation in text editors by navigating to files named after a selected class name (such as "Goto Anything [P]" in Sublime, or "Find File [P]" in Atom).

Large projects may have classes in a hierarchy with names that would overlap or be ambiguous without some additional way of organizing files. We generally approach this with subdirectories like ext.foo/bar/TitleWidget.js (for Package files), or longer class and file names like mw.foo.bar.TitleWidget in ext.foo/bar.TitleWidget.js.

Module bundles registered by extensions should follow names like ext.myExtension, for example MyExtension/modules/ext.myExtension/index.js. This makes it easy to get started with working on a module in a text editor, by directly finding the source code files from only the public module name (T193826).
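The rule that a class file is "named as, or ending with" the class name can be expressed as a small check. This is only a sketch: the function name fileNameMatchesClass is hypothetical, and it assumes that "ending with" means a dot-separated prefix (as in bar.TitleWidget.js), so that an unrelated class such as SubTitleWidget does not match.

```javascript
// Hypothetical helper: true if the file at filePath is named after className.
// Accepts "TitleWidget.js" and dot-prefixed forms such as "bar.TitleWidget.js".
function fileNameMatchesClass( filePath, className ) {
	// Keep only the last path segment, e.g. "bar.TitleWidget.js".
	const baseName = filePath.split( '/' ).pop();
	const expected = className + '.js';
	return baseName === expected || baseName.endsWith( '.' + expected );
}
```

With the paths used above, both ext.foo/bar/TitleWidget.js and ext.foo/bar.TitleWidget.js match the class TitleWidget, while a lowercase titlewidget.js does not.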

Documentation

The language-specific subpages have more information on the exact syntax for code comments in files, e.g. comments in PHP for doxygen. Using precise syntax allows us to generate documentation from source code at doc.wikimedia.org.

High level concepts, subsystems, and data flows should be documented in the /docs folder.

Source file headers

In order to be compliant with most licenses you should have something similar to the following (specific to GPLv2 PHP applications) at the top of every source file.

<?php
/**
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 * 
 * @file
 */

Licenses

Licenses are generally referred to by their full name or acronym as per SPDX standard. See also Manual:$wgExtensionCredits#license.

Dynamic identifiers

It is generally recommended to avoid dynamically constructing identifiers such as interface message keys, CSS class names, or file names. When possible, write them out and select between them (e.g. using a conditional, ternary, or switch). This improves code stability and developer productivity through easier code review, higher confidence during debugging, and better usage discovery with tools such as git-grep and Codesearch.

If code is considered to be a better reflection of the logical structure, or if required to be fully variable, then you may concatenate the identifier with a variable instead. In that case, you must leave a comment nearby with the possible (or most common) values to demonstrate behaviour and to aid search and discovery.

For example:

// No: Avoid composing message keys
$context->msg( 'templatesused-' . ( $section ? 'section' : 'page' ) );

// Yes: Prefer full message keys
$context->msg( $section ? 'templatesused-section' : 'templatesused-page' );

// If needed, concatenate and write explicit references in a comment

// Messages:
// * myextension-connect-success
// * myextension-connect-warning
// * myextension-connect-error
var text = mw.msg( 'myextension-connect-' + status );

// The following classes are used here:
// * mw-editfont-monospace
// * mw-editfont-sans-serif
// * mw-editfont-serif
$textarea.addClass( 'mw-editfont-' + mw.user.options.get( 'editfont' ) );

// Load example/foo.json, or example/foo.php
$thing->load( "$path/foo.$ext" );

Release notes

You must document all significant changes (including all fixed bug reports) to the core software which might affect wiki users, server administrators, or extension authors in the RELEASE-NOTES-N.NN file. RELEASE-NOTES-1.43 is in development; on every release we move the past release notes into the HISTORY file and start afresh. RELEASE-NOTES-N.NN is generally divided into three sections:

  • Configuration changes is the place to put changes to accepted default behavior, backwards-incompatible changes, or other things which need a server administrator to look at and decide "is this change right for my wiki?". Try to include a brief explanation of how the previous functionality can be recovered if desired.
  • Bug fixes is the place to note changes which fix behavior which is accepted to be problematic or undesirable. These will often be issues reported in Phabricator, but needn't necessarily be.
  • New features is, unsurprisingly, to note the addition of new functionality.

There may be additional sections for specific components (e.g. the Action API) or for miscellaneous changes that don't fall into one of the above categories.

In all cases, if your change is in response to one or more issues reported in Phabricator, include the task ID(s) at the start of the entry. Add new entries in chronological order at the end of the section.

System messages

When creating a new system message, use hyphens (-) where possible instead of CamelCase or snake_case. So for example, some-new-message is a good name, while someNewMessage and some_new_message are not.
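The naming rule can be turned into a quick check for new message keys. The sketch below is deliberately strict and rests on an assumption not stated above: that a well-formed key consists only of lowercase letters and digits separated by single hyphens. The function name isHyphenatedMessageKey is hypothetical.

```javascript
// Hypothetical helper: true if the key uses only lowercase words joined by
// hyphens. Rejects CamelCase (uppercase letters) and snake_case (underscores).
// Assumption: keys contain only a-z, 0-9 and "-"; real keys may be looser.
function isHyphenatedMessageKey( key ) {
	return /^[a-z0-9]+(-[a-z0-9]+)*$/.test( key );
}
```

Applied to the examples above, some-new-message passes, while someNewMessage and some_new_message are rejected.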

If the message is going to be used as a label which can have a colon (:) after it, don't hardcode the colon; instead, put the colon inside the message text. Some languages (such as French, which requires a space before the colon) need to handle colons in a different way, which is impossible if the colon is hardcoded. The same holds for several other types of punctuation.

Try to use message keys "whole" in code, rather than building them on the fly, as this makes it easier to search for them in the codebase. For instance, the following shows how a search for templatesused-section will not find a use of the message key when the key is not written out as a whole.

// No:
return wfMessage( 'templatesused-' . ( $section ? 'section' : 'page' ) );

// Yes:
$msgKey = $section ? 'templatesused-section' : 'templatesused-page';
return wfMessage( $msgKey );

If you feel that you have to build messages on the fly, put a comment with all possible whole messages nearby:

// Messages that can be used here:
// * myextension-connection-success
// * myextension-connection-warning
// * myextension-connection-error
$text = wfMessage( 'myextension-connection-' . $status )->parse();

See Localisation for more conventions about creating, using, documenting and maintaining message keys.

Preferred spelling

It is just as important to have consistent spelling in the UI and codebase as it is to have a consistent UI. By long-standing convention, American English is the preferred spelling for English-language messages, comments, and documentation.

Abbreviations in message keys

ph
placeholder (text in input fields)
tip
tooltip text
tog-xx
toggle options in user preferences

Punctuation

Non-title error messages are considered as sentences and should have punctuation.

Improve the core

If you need some additional functionality from a MediaWiki core component (PHP class, JS module etc.), or you need a function that does something similar but slightly different, prefer to improve the core component. Avoid duplicating the code to an extension or elsewhere in core and modifying it there.

Refactoring

Refactor code as changes are made: don't let the code keep getting worse with each change.

However, use separate commits if the refactoring is large. See also Architecture guidelines (draft).

HTML

MediaWiki HTTP responses output HTML that can be generated by one of two sources. The MediaWiki PHP code is a trusted source for the user interface and can output any arbitrary HTML. The Parser converts user-generated wikitext into HTML and is an untrusted source. Complex HTML created by users via wikitext is often found in the "Template" namespace. HTML produced by the Parser is subject to sanitization before output.

Most data-* attributes are allowed to be used by users in wikitext and templates. But, the following prefixes have been restricted and are not allowed in wikitext and will be removed from the output HTML. This enables client JavaScript code to determine whether a DOM element came from a trusted source:

  • data-ooui: This attribute is present in HTML generated by OOUI widgets.
  • data-parsoid: Reserved attribute for internal use by Parsoid.
  • data-mw and data-mw-...: Reserved attributes for internal use by MediaWiki core, skins and extensions. The data-mw attribute is used by Parsoid; other core code should use data-mw-*.

When selecting elements in JavaScript, one can specify an attribute key/value to ensure only DOM elements from the intended trusted source are considered. Example: Only trigger 'wikipage.diff' hook for official diffs.
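To illustrate the restriction, the reserved prefixes listed above can be checked with a simple prefix match. This is a sketch based only on that list, not MediaWiki's actual sanitizer; the names RESERVED_DATA_PREFIXES and isReservedDataAttribute are hypothetical, and the case-insensitive comparison is an assumption.

```javascript
// Prefixes taken from the list above; data-mw also covers data-mw-*.
const RESERVED_DATA_PREFIXES = [ 'data-ooui', 'data-parsoid', 'data-mw' ];

// Hypothetical helper: true if an attribute name starts with a reserved
// prefix and would therefore not be allowed in user-written wikitext.
function isReservedDataAttribute( name ) {
	const lower = name.toLowerCase();
	return RESERVED_DATA_PREFIXES.some( ( prefix ) => lower.startsWith( prefix ) );
}
```

Under this sketch, data-mw-foo and data-ooui are reserved, while an ordinary attribute such as data-sort-value is not.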

Notes

External links