User:Ilmari Karonen/Performance tuning

MediaWiki is a pretty heavyweight framework. To make things worse, many of the default settings tend to be optimized for large wiki farms like Wikimedia's, while others are fail-safe defaults that may be robust but provide suboptimal performance. As the maintainer of a small wiki running on a shared webhost, I've collected here some tweaks and tricks I've found useful for making pages load faster. Using these tricks, I've managed to get the mean loading time for a logged-out page view on my wiki down from more than a second to a few dozen milliseconds. That's about two orders of magnitude faster!

General techniques
First, install Firebug and the Google Page Speed extension in your browser and learn to use them. Firebug is invaluable for actually testing how fast your pages load, and Page Speed nicely augments it by offering suggestions on how to improve it. (Not all of the suggestions are perfect, but they at least provide a useful checklist to look through.)

Second, learn to use the MediaWiki logging and profiling features. They can tell you what's actually going on during requests and what's taking the most time. Yes, the output is basically a random barf of debug messages, but once you get familiar with it there's plenty of useful info there.
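
As a minimal sketch of the logging part, the debug log can be switched on with a single line in LocalSettings.php. The file path below is a made-up placeholder, not something from this page; the file should live outside the web root and be writable by the web server.

```php
// LocalSettings.php fragment: write MediaWiki's debug messages to a file.
// The path is a hypothetical placeholder; choose a location outside the web root.
$wgDebugLogFile = '/home/example/logs/mediawiki-debug.log';
```

The log grows quickly and writing it costs time on every request, so it is probably best to comment the line out again once you are done measuring.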

Use a PHP cache
The first thing to do, when trying to make MediaWiki run faster, is to make sure you're using a PHP opcode cache like APC. In my case, it turned out my webhost already had APC installed; all I had to do was add the line extension=apc.so to my php.ini file. If you're running your own server, you may need to install APC first. Your OS distro may already have a package for it (for example, Ubuntu calls it php-apc), or you can get it from PECL.

APC also provides a data cache, which MediaWiki can make good use of. Just set $wgMainCacheType = CACHE_ACCEL; (or CACHE_ANYTHING) in your LocalSettings.php. You should also do this if you have any other suitable cache available, such as WinCache on Windows servers.
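
In LocalSettings.php this might look roughly like the fragment below. It assumes APC (or another supported accelerator such as WinCache) is actually loaded in PHP.

```php
// LocalSettings.php fragment: use the PHP accelerator's shared memory
// (APC, WinCache, ...) as MediaWiki's main object cache.
$wgMainCacheType = CACHE_ACCEL;

// Alternative mentioned above: let MediaWiki pick whatever cache is available.
// $wgMainCacheType = CACHE_ANYTHING;
```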

Increase parser cache lifetime
The default value of $wgParserCacheExpireTime is only 86400 seconds (24 hours), which means that the parser cache rarely does anything for any pages viewed less than once a day. This may be a reasonable value for high-traffic wikis like Wikipedia, but on small wikis that may make the parser cache almost useless. On my wiki, I set $wgParserCacheExpireTime = 2592000; (30 days), but it might be even better to just set it to something like 10 years or more.

Of course, you don't want to keep the parsed pages in APC cache that long, so set $wgParserCacheType = CACHE_DB; to put them in the database instead. Unlike most caches MediaWiki uses, the parser cache can grow quite large, but is only accessed relatively infrequently (once per page load, more or less), so putting it in the DB makes sense for low traffic wikis.
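
Put together, the two parser cache settings from this section could be written as follows. The 30-day figure is the one quoted above; spelling it out as a product is only for readability.

```php
// LocalSettings.php fragment: keep parsed pages for 30 days instead of 24 hours.
$wgParserCacheExpireTime = 30 * 24 * 60 * 60; // 2592000 seconds

// Store the (potentially large) parser cache in the database rather than in APC.
$wgParserCacheType = CACHE_DB;
```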

Optimize interface messages
While the parsed pages are loaded from the cache, MediaWiki still has to rebuild the navigation interface on every page view. That means loading and parsing lots of interface messages.

The first step is to make sure these messages are efficiently cached. If you installed MediaWiki relatively recently, you should already have something like $wgCacheDirectory = "$IP/cache"; in your LocalSettings.php. If not, add it there and make sure the directory exists and has appropriate permissions (chmod 700 on Unix).

Second, you should enable the sidebar cache: just set $wgEnableSidebarCache = true;.
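
A sketch of the two settings from the steps above, as they might appear together in LocalSettings.php (the cache directory must already exist and be writable by the web server):

```php
// LocalSettings.php fragment: on-disk cache for localisation/interface messages.
$wgCacheDirectory = "$IP/cache";

// Cache the rendered sidebar instead of rebuilding it on every page view.
$wgEnableSidebarCache = true;
```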

Finally, there are a few messages which are used on every page and which contain the {{SITENAME}} variable. This means they have to be fed through the parser on every page load. Since you're not likely to rename your wiki any time soon, editing these messages to replace {{SITENAME}} with the literal name of your wiki (or {{subst:SITENAME}}, which expands to it when saved) will save a few precious cycles. These messages include at least aboutsite, opensearch-desc, pagetitle, pagetitle-view-mainpage, tagline and tooltip-search. (Also, depending on your wiki language, any other variables or parser functions in those messages should be substed as well. The important thing is that the characters "{{" should not occur anywhere in the message.)

Run jobs from cron
By default, MediaWiki uses a rather inefficient system for running background jobs. If you have access to cron (or some other system for scheduling repeating tasks) on your webserver, set it to run runJobs.php regularly and set $wgJobRunRate = 0;. (The appropriate frequency for the cron job depends on how busy your wiki is and how much latency you and your users will tolerate. I'd recommend making the script run fairly often, say every 5 minutes, and using the --maxjobs argument to make sure it will not run too long even if there are lots of jobs queued.)
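
For illustration, the LocalSettings.php side of this is a single line; the matching crontab entry is shown as a comment. The wiki path and the job limit of 100 are placeholders chosen for the example, not recommendations from this page.

```php
// LocalSettings.php fragment: never run queued jobs during web requests.
$wgJobRunRate = 0;

// Matching crontab entry (not PHP; install it with "crontab -e").
// Runs every 5 minutes; path and job limit are hypothetical placeholders:
//
//   */5 * * * * php /path/to/wiki/maintenance/runJobs.php --maxjobs 100 > /dev/null 2>&1
```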

Use the file cache
This is an aggressive optimization, but in my experience well worth it. The recommended way to set up MediaWiki is to put it behind a caching reverse proxy like Squid or Varnish, but this may not be practical on shared hosting. The file cache is a crude but effective way to get much the same effect by saving rendered pages in flat files.

To turn on the file cache, just set $wgUseFileCache = true; and point $wgFileCacheDirectory to a suitable directory (and make sure it exists). You may also try setting $wgUseGzip = true;, but be careful of the interactions documented on the manual page for that setting.

For small wikis running a recent MediaWiki version (1.17+), you may also want to consider setting $wgFileCacheDepth = 0;. This doesn't do much on its own (although it probably does save a few cycles), but if you configure your webserver right, you may be able to make it serve cached pages directly, without having to invoke MediaWiki at all. See the documentation page for the variable for details. I'm doing this on my wiki and it makes a huge difference: even with all the other optimizations, a typical MediaWiki request can take hundreds of milliseconds. Apache serves the cached pages directly about ten times faster than that.
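
The file cache settings discussed in the last two paragraphs might be combined like this. The directory name is an assumption for the example; any directory writable by the web server should do. The webserver rewrite rules for serving cached pages directly are deliberately not sketched here, since they depend on your setup; see the variable's documentation page for those.

```php
// LocalSettings.php fragment: save rendered pages for anonymous views as flat files.
$wgUseFileCache = true;

// Assumed location for this example; the directory must exist.
$wgFileCacheDirectory = "$IP/cache";

// Optional: also store gzipped copies. Left disabled here because of the
// interactions documented for this setting.
// $wgUseGzip = true;

// MediaWiki 1.17+: put all cached files directly in the cache directory
// (no hashed subdirectories), so the webserver can find them by title.
$wgFileCacheDepth = 0;
```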

Remember to run rebuildFileCache.php whenever you change your config settings, especially if you use the rewrite trick to bypass MediaWiki. If you have pages with changing content (say, your main page changes daily), you can set up a cron job to purge it as needed. (Just a simple rm cache/Main_Page.html should do.)

Upgrade MediaWiki
Recent versions of MediaWiki contain several performance improvements not available in older versions. Adventurous users may get the latest development version from Git. Actually, I've personally found the development version to be no buggier than the "stable" releases. Sure, new bugs are introduced sometimes, but they also get fixed fast and you don't need to wait for anyone to backport the fix. Just make sure you do a few tests to see that your wiki still works after every git pull.

Check your extensions
MediaWiki extension files run on every request, and so a single badly written extension can ruin a lot of careful optimization effort. Old extensions can be particularly problematic, since many of the currently recommended ways of writing efficient extensions were only introduced in recent MediaWiki versions.

One common problem I've seen is extensions needlessly unstubbing the parser just to register tags or parser functions. Generally, such things should be done in a ParserFirstCallInit hook, when it's actually known that the parser will be needed; see Manual:Tag extensions and Manual:Parser functions for examples of how to do it right. A lot of old (and even some new) extensions, however, will use a $wgExtensionFunctions callback instead, causing the parser to be unstubbed on every request whether it is needed or not.
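
To make the difference concrete, here is a rough sketch of a tag extension setup file written both ways. The tag name "example" and the wfExample* function names are invented for illustration; the manual pages mentioned above have the authoritative versions.

```php
// Extension setup file fragment (hypothetical "example" tag extension).

// Old pattern to avoid: this callback runs on every request and touches
// $wgParser, unstubbing the parser even if nothing will be parsed.
//
//   $wgExtensionFunctions[] = 'wfExampleSetup';
//   function wfExampleSetup() {
//       global $wgParser;
//       $wgParser->setHook( 'example', 'wfExampleRender' );
//   }

// Preferred pattern: register the tag only when a parser is first initialised.
$wgHooks['ParserFirstCallInit'][] = 'wfExampleParserInit';

function wfExampleParserInit( Parser $parser ) {
    // Map <example>...</example> to the render callback below.
    $parser->setHook( 'example', 'wfExampleRender' );
    return true; // let other ParserFirstCallInit handlers run too
}

function wfExampleRender( $input, array $args, Parser $parser, PPFrame $frame ) {
    // Output the tag contents as escaped plain text.
    return htmlspecialchars( $input );
}
```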

In fact, I'd consider any use of $wgExtensionFunctions to be at least a warning sign. I'm sure it does have legitimate uses, but, at least in modern MediaWiki versions, there are usually more efficient ways.

Configure your web server
A typical MediaWiki page includes a lot of images and other files that are not actually served by MediaWiki. (If you use the file cache trick mentioned above, there might not even be any files served by MediaWiki directly.) You should configure your webserver to deliver these as efficiently as possible. In particular, there are plenty of files under the skins directory that rarely if ever change; setting a sufficiently long expiration time for them avoids needless HTTP requests. On Apache with mod_expires, the following lines in a skins/.htaccess file should do the trick:

 ExpiresActive On
 ExpiresDefault "access plus 1 month"

You should also check that static HTML, CSS and JS pages are served gzipped to clients that request it (mod_deflate) and that they have charset=UTF-8 specified in their Content-Type header (AddDefaultCharset UTF-8). Use Firebug (see above) to check that these are all working as intended.