Manual:MediaWiki file usage

MediaWiki stores some data in the local filesystem. If you were to use webserver clustering to serve a big wiki, you'd want to take this into account.

Uploaded media files
These go into a directory specified in LocalSettings, usually called "upload". Files keep their original names, but since many filesystems don't handle large directories well, uploads are spread across subdirectories named after the first hex digit and the first two hex digits of the MD5 hash of the filename, e.g.:


 * upload/7/79/Wikicontestlogo598932.jpg
 * upload/a/a6/Stygian_Wiki_Logo_Proto_Grey.png
 * upload/f/fb/Wikipedia_sub.jpg
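
The layout above can be sketched in a few lines of Python. This is only an illustration, not MediaWiki's own code: the function name upload_path and the root parameter are made up for the example, and it assumes the hash is taken over the UTF-8 bytes of the filename exactly as stored (underscores rather than spaces).

```python
import hashlib


def upload_path(filename, root="upload"):
    """Build root/h/hh/filename from the MD5 hash of the filename.

    Assumption: the hash is computed over the UTF-8 bytes of the
    stored filename; `root` stands in for the configured directory.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # First hex digit, then first two hex digits, then the original name.
    return f"{root}/{digest[0]}/{digest[:2]}/{filename}"


print(upload_path("Wikipedia_sub.jpg"))
```

Two levels of hashed subdirectories give 16 x 16 = 256 buckets, which keeps any single directory reasonably small.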

Security note: Even though the upload system contains a filetype blacklist, you should ensure that your webserver is not configured to execute PHP or CGI scripts in the upload directory, just in case. Other filetypes may be unsafe downloads for some client systems.

Archived uploads
When a new version of a file is uploaded, the old one is moved into the upload/archive directory, with its timestamp and "!" prepended. Again, hash subdirs are used:
 * upload/8/85/Edit_this_page_intl.png - current revision
 * upload/archive/8/85/20030628125544!Edit_this_page_intl.png - archived revision
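
As a rough sketch of the naming rule, the archived path could be derived as below. The function name archive_path is hypothetical, and the code assumes the hash subdirectories are computed from the plain filename (without the timestamp prefix), which matches the example above where both revisions sit under 8/85.

```python
import hashlib


def archive_path(filename, timestamp, root="upload"):
    """Build root/archive/h/hh/<timestamp>!<filename>.

    Assumption: the hash subdirectories come from the MD5 of the plain
    filename, and `timestamp` is a 14-digit YYYYMMDDHHMMSS string.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return f"{root}/archive/{digest[0]}/{digest[:2]}/{timestamp}!{filename}"


print(archive_path("Edit_this_page_intl.png", "20030628125544"))
```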

Rasterized TeX
texvc generates rasterized PNG images from inline TeX code (if this option is enabled). These are dumped in a web-accessible LocalSettings-specified directory, usually called "math".

The filename is the hex MD5 hash of the normalized TeX input, plus ".png":
 * math/47e7849b634fe4487aaf981b32825e18.png
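
A minimal sketch of that naming rule follows. It assumes the TeX string has already been normalized (normalization is done by texvc and is not reproduced here) and that the hash is taken over its UTF-8 bytes; math_image_name is a name invented for this example.

```python
import hashlib


def math_image_name(normalized_tex):
    """Return '<md5 hex>.png' for an already-normalized TeX string.

    Assumption: the input is exactly the normalized form that texvc
    would hash; this sketch does no normalization of its own.
    """
    return hashlib.md5(normalized_tex.encode("utf-8")).hexdigest() + ".png"


print("math/" + math_image_name(r"\frac{a}{b}"))
```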

These files just accumulate, and are not automatically removed if the equations they represent are removed from articles (or existed only during preview rendering).

The images can be deleted manually, since the wiki can regenerate them, but if you do, you'll want to fix the database as well:
 * Clear the affected entries in the math table, or the wiki will think it has already rendered those equations.
 * TODO: double-check that the file exists before deciding that an entry in the math table is valid.
 * If using file caching, do one of the following to invalidate the cached pages; otherwise visits by anonymous users won't trigger regeneration of the images:
   * Remove all (affected) pages from the cache (consider grep).
   * Update the cur_touched fields to the present time for affected entries (check for "<math>" in cur_text).
   * Update the global $wgCacheEpoch timestamp in LocalSettings, forcing all cached pages to be regenerated without going to the bother of deleting anything.
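
For the "consider grep" option, a small script can find cached pages that mention the deleted images. This is only a sketch: the function name is made up, and it assumes the cache directory holds plain, uncompressed HTML files that embed the image filenames literally.

```python
import os


def cached_pages_mentioning(cache_dir, image_names):
    """Return sorted paths of files under cache_dir that mention any image name.

    Assumption: cached pages are uncompressed files containing the
    PNG filenames (e.g. '47e7...e18.png') as literal text.
    """
    needles = [name.encode("utf-8") for name in image_names]
    hits = []
    for dirpath, _dirnames, filenames in os.walk(cache_dir):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            with open(path, "rb") as handle:
                data = handle.read()
            if any(needle in data for needle in needles):
                hits.append(path)
    return sorted(hits)
```

The returned paths can then be reviewed and removed by hand, so that the next anonymous visit re-renders the page and regenerates the images.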

File cache
Optionally, rendered HTML pages may be kept in a cache directory and served to anonymous visitors. See more details at file cache.

Read-only lock file
The developer "Make database read-only" function writes a lock file, whose name and location may be specified in LocalSettings. If the file exists, the wiki will (hopefully) avoid operations which write to the database. The contents of the file are a message to be displayed to the poor users experiencing the editing lockout.

The lock file need not be web-shared, but it shouldn't be a security risk if it is. The default location is in the upload directory.
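
The lock-file convention is simple enough to sketch: the presence of the file means "read-only", and its contents are the message to show. The following is an illustration of that idea, not MediaWiki's actual code; read_only_message and the path passed to it are hypothetical.

```python
def read_only_message(lock_path):
    """Return the lock message if the lock file exists, otherwise None.

    Assumption: `lock_path` is whatever location LocalSettings names;
    the file's text is shown to users as the reason for the lockout.
    """
    try:
        with open(lock_path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        # No lock file: the wiki is writable.
        return None
```

A caller would check the result before any write: if it is not None, skip the database write and display the message instead.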

TeX temporary files
TeX rendering requires a temporary directory to store intermediate files. These files are used only while producing the rasterized PNG files, so they need not be kept. The directory need not be web-shared, but it shouldn't be a security risk if it is.

Debug log
One may optionally enable a debug log, which prints all kinds of annoying messages. It should not generally be left on, as this will waste time and space. The debug log includes URLs of all requests to the wiki and sometimes IP addresses, so you may not want to make it publicly available.

Session data
A little bit of data for login sessions is handled by PHP's session handling. By default this typically stores a bunch of small files in the /tmp directory; you can configure PHP to store them elsewhere with the session.save_path setting.