Compression

Data compression as it relates to MediaWiki.

What is compression
See the Wikipedia article Data_compression.

Compression of dumps
The Wikimedia databases are quite large, so Wikimedia compresses the database dumps with bzip2.
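
For illustration, here is a minimal PHP sketch of reading such a dump without unpacking it on disk first. It assumes the bz2 extension is loaded; the filename pages.xml.bz2 is hypothetical.

 <?php
 // The compress.bzip2:// stream wrapper decompresses on the fly, so the
 // multi-gigabyte dump never has to be fully inflated on disk.
 $fh = fopen('compress.bzip2://pages.xml.bz2', 'r');
 if ($fh === false) {
     die("could not open dump\n");
 }
 $lines = 0;
 while (($line = fgets($fh)) !== false) {
     $lines++; // process one line of the decompressed XML here
 }
 fclose($fh);
 echo "read $lines lines\n";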

Compression of articles
It is possible, with Apache and mod_deflate, to compress individual pages as they are served. Both the browser and the server must support this; it is normally negotiated through the Accept-Encoding header, with an uncompressed version available as a fallback. It can, however, put extra stress on certain parts of the server infrastructure.
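
The same negotiation can also be done in PHP itself rather than in Apache; as a minimal sketch, PHP's ob_gzhandler output callback compresses a page only when the request's Accept-Encoding header allows it, and sends it uncompressed otherwise:

 <?php
 // ob_gzhandler checks Accept-Encoding and picks gzip, deflate, or no
 // compression, so every browser still gets a readable page.
 ob_start('ob_gzhandler');
 echo "<html><body>Hello, possibly compressed world</body></html>";
 ob_end_flush();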

Anthony DiPierro is looking into the feasibility of using Huffman coding. A preliminary character count for the article space is available.
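
For illustration only, a minimal PHP sketch of Huffman coding over a small hypothetical frequency table (not the real article-space counts); it returns a map from each character to its bit string:

 <?php
 function huffmanCodes(array $freq): array {
     $q = new SplPriorityQueue();
     $q->setExtractFlags(SplPriorityQueue::EXTR_BOTH);
     foreach ($freq as $ch => $n) {
         // Counts are negated because SplPriorityQueue extracts the
         // highest priority first, and Huffman merges the rarest first.
         $q->insert([$ch => ''], -$n);
     }
     while ($q->count() > 1) {
         $a = $q->extract(); // rarest subtree
         $b = $q->extract(); // second rarest
         $merged = [];
         foreach ($a['data'] as $ch => $bits) $merged[$ch] = '0' . $bits;
         foreach ($b['data'] as $ch => $bits) $merged[$ch] = '1' . $bits;
         $q->insert($merged, $a['priority'] + $b['priority']);
     }
     return $q->extract()['data'];
 }
 // Frequent characters get short codes, rare ones get long codes.
 print_r(huffmanCodes([' ' => 20, 'e' => 12, 't' => 9, 'a' => 8, 'q' => 1]));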

On or about 2004-02-20 the old table and the archive table were changed to allow some revisions in the history to be stored compressed. Rows marked with old_flags="gzip" have their old_text compressed with zlib's DEFLATE algorithm, with no header bytes. PHP's gzinflate() accepts this data directly; in Perl and other languages, set the window bits to -MAX_WBITS to disable the header bytes.
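
As a minimal sketch of the round trip, with a hypothetical revision text:

 <?php
 // gzdeflate() emits exactly the headerless DEFLATE stream stored in
 // old_text, and gzinflate() reads it back with no window-size tweak.
 $text   = 'Some old revision text.';
 $stored = gzdeflate($text);
 var_dump(gzinflate($stored) === $text); // bool(true)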

History compression
It is also possible to compress the history table in a way that exploits the similarity between successive versions of a page, for example with reverse diffs as used in version control. See History compression for some actual numbers.
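
Reverse diffs are one approach; a simpler illustration of the underlying idea (and not MediaWiki's actual scheme) is that deflating near-identical revisions together beats deflating each one alone, because the compressor can reuse the shared text:

 <?php
 // Hypothetical revision texts that differ only slightly, as page
 // histories usually do.
 $revisions = [
     "The cat sat on the mat and watched the rain fall outside.",
     "The cat sat on the mat and watched the rain fall outside. It purred.",
     "The black cat sat on the mat and watched the rain fall outside. It purred.",
 ];
 $separately = 0;
 foreach ($revisions as $rev) {
     $separately += strlen(gzdeflate($rev)); // each revision on its own
 }
 $together = strlen(gzdeflate(implode("\n", $revisions))); // all at once
 printf("separately: %d bytes, together: %d bytes\n", $separately, $together);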

Cache compression
File cache discusses compression of the cached copies of pages. Now that the Wikimedia projects use Squid caches, it is unclear how much of this is obsolete.

Request for deletion
Please see the talk page for the rationale behind this request. --Elian 00:39, 23 Jul 2004 (UTC)