Requests for comment/Scripting

This page documents the history and current status of wikitext scripting.

Background
Initially, MediaWiki templates were pieces of wikitext that were substituted into pages as an alternative to copy-pasting. By 2005, any other use was rare and, to some extent, controversial. In 2006, ParserFunctions were enabled, allowing users to use constructions such as  and , essentially turning wikitext into a purely functional programming language (i.e., a language that has no concept of state at any level: one part of the code cannot affect any other part and can only change its own output). This eventually caused several problems, including poor performance (some pages are overloaded with templates and take 40 seconds or more to parse and render) and poor readability (just take a look at this).

Proposed solutions
In order to resolve these issues, we are looking for a scripting solution that will replace our current template system. Options:
 * 1) Lua (implemented)
 * 2) WikiScripts (implemented)
 * 3) Probably a JavaScript backend (Node.js)

Objectives
What we want from the scripting system:
 * 1) Performance. PPFrame_DOM::expand currently tops the profiling results. The scripting solution must allow us to render pages like en:Barack Obama in much less than 40 seconds.
 * 2) Portability. Parser functions are currently the most popular MediaWiki extension, and our templates are reused all around the world. If we choose a solution that requires users to have something unportable (for example, one that does not support Windows or requires a PHP extension that most hosting providers do not offer), it may have a serious negative impact on our vision.
   * This restricts dual implementations of the scripting engine. If we have solution A (in PHP) and solution B (a faster version in C used on WMF), they must implement the same features, not a subset. If implementation B has features that implementation A lacks, the users of B will soon start using those non-compliant features, and their code will become non-reusable for the users of A.
 * 3) Sandboxing. Users must not be allowed to abuse server resources using the scripting language. The solution must be protected from DoS attacks and security exploits.
 * 4) Determinism. The same script with the same engine configuration must always produce the same result. Two backends must always produce the same result (see portability note above).
   * When combined with the portability requirement, this also shows why things like a CPU limit are bad for sandboxing: CPU time differs between machines.
   * I don't think I agree with this. Using CPU limits will allow scripts to become more complex and do more useful things as the system which executes them improves. And CPU limits are more closely related to the actual server costs than any deterministic proxy. The negative effects of having a script succeed on one server and fail on another can be mitigated in other ways, such as by having a "warning" limit, where an attempt to save a page is rejected and an error message is displayed to the user, and a hard limit, where the render is aborted regardless of the calling context. -- Tim Starling 04:10, 6 September 2011 (UTC)
     * Agreed. Striking this out. vvvt 20:53, 29 January 2012 (UTC)
 * 5) Extensibility. We would need to add our own functions to the language to allow users to interact more easily with MediaWiki.
 * 6) Isolation. No template should be able to affect the execution of any other template except by transcluding another template and passing arguments into it. Because each template is stored on a separate page, such non-local effects would be difficult to debug.
   * It's not clear to me whether this is a useful goal. Extension:Lua makes a feature out of its lack of isolation, allowing variables to be defined by one template and then used by another template. It's true that this behaviour does have some negative consequences, such as exposing the empty expansion cache to the user. -- Tim Starling 04:10, 6 September 2011 (UTC)
     * Those variables are essentially globals, and it would be a pain to control how they get defined, where, as well as to detect all possible name collisions and double-transclusions of the same template. I do not think that forcing users to insert C-style header safeguards in the "library" templates is good idea. IMO we should force users to write modules on separate pages (like CommonJS), it is safer and allows to apply code editor and highlight vvvt 08:26, 9 September 2011 (UTC)
 * 7) Usability. Ease of use, learning time, and the availability of syntax highlighting and a code editor.
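
The two-tier CPU limit suggested in the Determinism discussion above (a "warning" limit that rejects a save, and a hard limit that aborts the render in any context) could look roughly like the following Python sketch. It is only an illustration: the names `Outcome`, `check_cpu_usage`, `WARNING_LIMIT` and `HARD_LIMIT`, as well as the threshold values, are assumptions made for this example and do not come from any existing implementation.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"                    # render proceeds normally
    REJECT_SAVE = "reject_save"  # save is refused and an error is shown to the user
    ABORT = "abort"              # render is aborted regardless of calling context


# Hypothetical thresholds in CPU seconds; real values would be site configuration.
WARNING_LIMIT = 2.0
HARD_LIMIT = 7.0


def check_cpu_usage(cpu_seconds, is_save,
                    warning_limit=WARNING_LIMIT, hard_limit=HARD_LIMIT):
    """Decide what to do with a script that has used `cpu_seconds` of CPU time.

    `is_save` is True when the render was triggered by a page save attempt.
    """
    # The hard limit applies in every context.
    if cpu_seconds > hard_limit:
        return Outcome.ABORT
    # The warning limit only blocks saves, so the editor gets feedback early.
    if is_save and cpu_seconds > warning_limit:
        return Outcome.REJECT_SAVE
    return Outcome.OK
```

With these assumed thresholds, a script that used 3 CPU seconds would be rejected on save but would still render for readers, while a script that used 8 CPU seconds would be aborted in both cases. The gap between the two limits is what softens the "succeeds on one server, fails on another" problem: a script has to pass the stricter limit at save time, which leaves headroom before it hits the hard limit on a slower machine.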