Manual talk:MediaWiki architecture

Feedback is welcome, and even encouraged. If you see that something's wrong, please fix it or report it below.

Factual errors
Ideally, please fix the errors directly if you notice any. If you really don't want to, you can leave a note below.


 * "The code executed from  performs security checks, loads default configuration settings from , guesses configuration with   and then applies site settings contained in  ."  I actually don't know whether this is right -- could someone check?  Does setup.php get hit every time someone hits index.php?
 * I asked Chad to look at the Configuration section. "'Global configuration variables offered better performance than other configuration methods in older versions of PHP' ... Doesn't make sense. They still offer better performance over my new config model. '...and makes it more difficult to optimize the start-up process.' I kind of get what you're saying here, but I think it's superfluous and will just confuse readers." Sumanah 16:38, 31 October 2011 (UTC)
 * Also from Chad: I would also rephrase "and hurt third-party reuse of MediaWiki's code (since most other projects cannot share the same MediaWiki global variable names)"... Basically the idea there is "MediaWiki pollutes the global namespace" but the phrasing is a bit awkward. Re: "The future: storing configuration in the database" -- Please don't promise that just yet ;-) I'm not 100% settled on the model yet.

Stuff that's missing
If you see that a major piece of information related to MediaWiki's architecture is missing from this document, please add it below. Ideally, you would add the content directly to the page, or at least provide pointers to where relevant information can be found.

Please do keep in mind, though, that we are limited to about 5000 words, and we're already way over the limit, so we can't go into too much detail, and we can't mention everything. Also, this document is specifically about MediaWiki's architecture, so it's normal that not every feature is included in it.


 * To trim the length, I'd recommend removing some more from the caching section--it's a little verbose as is ^demon 14:41, 26 October 2011 (UTC)

Content organization and major changes
If you'd like to suggest major refactoring or reorganization of content, please do so here before editing the page, to minimize disruption.



Review
If you've reviewed the whole document, or sections of it, please add your name below and say what you've reviewed. This will help identify what has been reviewed and what hasn't, so that we can make sure the whole document is accurate.

Tim's comments

 * "The object/parser cache used by Wikimedia is memcached, with dozens of servers dedicated to it"
 * We use MySQL for the parser cache now.


 * "ResourceLoader is a particularly interesting case, as it's one of the few core components of MediaWiki that benefited from proper architecting prior to development."
 * Seems unfair, potentially offensive.


 * "such as the impossibility to write native names in a language that required a different encoding"
 * The grammar is incorrect here. Also there is a technical point: we could use foreign scripts, we just had to use HTML entities. However, in page titles and usernames, they couldn't be used. I suggest "For example, foreign scripts could not be used in page titles" as its own sentence, omitting the part about HTML entities. Also "Latin-1 support was dropped in 2005": to be precise, support for character sets other than UTF-8 was dropped; Latin-1/CP1252 was just one of them.


 * "Characters not available on the editors keyboard can be customized and inserted via MediaWiki's Edittools, or its JavaScript version"
 * Suggest explaining what Edittools does.


 * "Localization of the user interface messages was implemented in many different ways in the early years of MediaWiki, especially in MediaWiki extensions. Efforts were made to standardize them; interface messages are now all stored in PHP arrays of key-values pairs. Each message is identified by a unique key, which is assigned different values across languages. This standard was established for legacy reasons, and also because other systems were deemed not to be flexible enough for MediaWiki. For example, gettext doesn't support plural forms for multiple variables."
 * I don't understand this whole paragraph. Everything in it seems to be incorrect. Messages were always stored in PHP arrays with key/value pairs, even in extensions. Messages always had a unique key which had different values in different languages. Only the registration interface has changed. How can a system be established for "legacy reasons"? I would have thought that when something is established, by definition there's no legacy. I don't know why Lee set up the i18n system the way he did, but I'm sure he didn't have plurals of multiple variables in mind. I'm not sure if he even considered gettext, but if he had, he probably would have discarded it on the basis that it's not compiled into PHP by default. I don't recall Lee using optional PHP extensions at all.


 * "MediaWiki extensions provide such features to some extent, but they are often fragile at best."
 * "often" contradicts "at best". I think "at best" is unfair and suggest removing it, since we now have fine-grained edit permission hooks which support all sorts of edit restrictions (e.g. AbuseFilter) quite robustly.

Has this page been useful to you?
If you've learned something new about MediaWiki's architecture while reading this document, please leave a message here. It's really difficult to assess the impact and usefulness of projects like writing this document, so any feedback is appreciated. It'll help determine if similar projects should be attempted in the future.


 * Yes. I especially appreciate:
 * Execution workflow of a web request: I intend to integrate this into the intro to MediaWiki hacking workshop as soon as possible.
 * The mentions of specific historical figures, like Lee Daniel Crocker, whom new developers now wouldn't run into. Yay for recognizing important legacies!
 * Explanations of how the importance of performance affected how we do database stuff.
 * Customizing and extending MediaWiki -- again, this is the high-level overview that I will surely be giving new developers as they start learning MediaWiki.
 * And learning what paucal numbers are! :-) Sumanah 05:13, 31 October 2011 (UTC)

Other comments

 * People connect with images much better than just plain text. Could you include some architectural diagrams maybe? Probably not super detailed for the entire system. You could create a more detailed diagram per section when you talk about specifics in the text.
 * Here's a suggested diagram to add: MediaWiki_database_schema_latest.png. Sumanah 20:39, 24 October 2011 (UTC)
 * Might want to update it for 1.18 (not a lot of changes), if we're going to go down that road. However, it's maybe too detailed, and too large to be easily included in a book... Reedy 02:01, 27 October 2011 (UTC)

Stuffs
In the introduction, the usage of the full stop to format numbers looks very wrong to me. I know it's a cultural/location/language thing. Should it be a comma?

Phase I: "A few weeks later, Wikipedia enabled the new version of UseModWiki" - Is "enabled" the correct term? Perhaps "upgraded to"?

Execution workflow of a web request

"and crates a Title object" - Maybe note it's called $wgTitle, as it's somewhat infamous. Similar for "to create an Article object" later on - $wgArticle

Should references be before or after the full stop?

When talking about the language not being specified, is it worth mentioning that it can't be represented as a formal grammar?

Good work! :)

Reedy 02:00, 27 October 2011 (UTC)

Cross-wiki features
It seems a bit weird to me that things like CentralAuth, which are a major architectural headache for most people who have tried to get into that area, are not mentioned. vvvt 22:36, 27 October 2011 (UTC)