Requests for comment/Simplify thumbnail cache

This is a request for comment about changing the thumbnail storage and caching pipeline for Wikimedia projects.

Background
Managing scaled media files (thumbnails) in the Wikimedia projects is a significant source of complexity for both software developers and operations engineers. The current implementation tightly couples backend storage with frontend caching, to the detriment of both systems. This topic has been discussed in the past but remains unresolved.

Problem

 * Issuing HTCP purge messages from PHP in response to media file change or deletion requires enumerating all potentially cached thumbnails
 * Cleaning up the thumbnails for a single media delete may need so many HTCP purge messages that the end-user request times out
 * HTCP purges (?action=purge) are not idempotent; objects get deleted and the purges for thumbnails cannot be repeated to e.g. recover from multicast packet loss.
 * Thumbnails of all sizes are stored forever, until the original gets deleted. Unused & rarely used thumbnails are never cleaned up.
 * Swift has been configured somewhat awkwardly to support wildcard listing of stored thumbnails for enumeration
 * Swift has an extra layer of complexity that handles 404s for thumbs in a special and fragile way, by fetching from the imagescalers
 * Thumbnails take up 25-30% of the on-disk storage footprint in Swift
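
To make the purge fan-out concrete, here is a minimal Python sketch of the enumeration described above. It is an illustration only, not the MediaWiki code: the function name `thumbnail_purge_urls`, the URL layout and the width list are all assumptions chosen for the example.

```python
def thumbnail_purge_urls(original_path, file_name, widths, pages=(None,)):
    """Return one purge URL per (page, width) variant of a media file.

    Hypothetical helper: the URL layout below is an assumption for
    illustration, not a statement of the production thumbnail URL scheme.
    """
    urls = []
    for page in pages:
        for width in widths:
            prefix = "" if page is None else f"page{page}-"
            urls.append(f"{original_path}/{prefix}{width}px-{file_name}")
    return urls


# A single-page photo with three cached widths needs three purges.
photo = thumbnail_purge_urls("/thumb/a/ab/Example.jpg", "Example.jpg", [120, 220, 800])

# A 500-page scan with the same three widths needs one purge per page per width.
scan = thumbnail_purge_urls(
    "/thumb/c/cd/Book.djvu", "Book.djvu.jpg", [120, 220, 800], pages=range(1, 501)
)

print(len(photo), len(scan))  # 3 1500
```

The count grows as pages times widths, which is why multi-page files are the worst case for both the enumeration and the purge traffic.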

Treat thumbnails as a CDN only concern

 * 1) Configure Varnish so that a single purge message drops all variants of a given media file's thumbnails
 * 2) Configure Varnish to pattern-match thumbnails and switch backend from Swift to imagescalers
 * 3) Configure MediaWiki imagescalers to stop storing generated thumbnails in Swift
 * 4) Generate individual thumbs in real-time in response to cache misses
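
The routing and purge behaviour in steps 1 and 2 can be modelled in a few lines of Python. This is a sketch under stated assumptions, not Varnish configuration: the URL pattern, the `route`, `store` and `purge` helpers, and the in-memory `cache` dictionary are hypothetical stand-ins for what would be expressed in VCL.

```python
import re

# Assumed thumbnail URL shape: <site>/thumb/<x>/<xy>/<Name>/<variant>
THUMB_RE = re.compile(
    r"^(?P<site>.*)/thumb/(?P<original>[0-9a-f]/[0-9a-f]{2}/[^/]+)/(?P<variant>[^/]+)$"
)


def route(url_path):
    """Pick a backend and a purge key for a request path."""
    m = THUMB_RE.match(url_path)
    if m is None:
        # Originals and everything else are still served from Swift.
        return {"backend": "swift", "purge_key": url_path}
    # Every thumbnail shares the purge key of its original file.
    return {"backend": "imagescaler", "purge_key": f"{m['site']}/{m['original']}"}


cache = {}  # purge_key -> {url_path: body}; a toy stand-in for the CDN cache


def store(url_path, body):
    cache.setdefault(route(url_path)["purge_key"], {})[url_path] = body


def purge(original_path):
    """Drop the original and all of its variants; return how many were dropped."""
    return len(cache.pop(original_path, {}))


store("/wikipedia/commons/a/ab/Example.jpg", b"original")
store("/wikipedia/commons/thumb/a/ab/Example.jpg/120px-Example.jpg", b"small")
store("/wikipedia/commons/thumb/a/ab/Example.jpg/800px-Example.jpg", b"large")
dropped = purge("/wikipedia/commons/a/ab/Example.jpg")
print(dropped)  # 3
```

Because all variants collapse onto one key, the PHP side no longer needs to know which sizes exist; it only names the original.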

Tim, Asher, Mark and Faidon have all weighed in on this general idea in the past, but no major work has been undertaken to implement or verify it.

Some media types may still need durable storage of generated thumbnails. TimedMediaHandler, for example, uses reference thumbnails to render other thumbnails for the same point in time of a video, and should continue storing these thumbnails somewhere in Swift.

Benefits

 * Only one HTCP purge message needed
 * Simplifies PHP code by removing a list generation and traversal
 * Reduces Swift load, I/O pressure & hardware cost significantly by eliminating wildcard enumeration requests
 * No need to delete superseded thumbnail files from Swift
 * Reduces Swift I/O load significantly
 * Removes a potential point of failure for a delete/move operation on the base file
 * Lots of disk reclaimed from Swift
 * Reduces hardware cost of Swift clusters
 * Reduces maintenance cost of Swift clusters

Drawbacks

 * Use of  in this way is untested and thus carries unknown risks
 * Varnish currently tracks items mapped to the same hash key in a linked list. This could become a bottleneck for media such as multi-page TIFF, DjVu or PDF files that have page variants as well as size variants. Research would be needed to determine a reasonable upper limit for variants to collapse into a single hash and/or find a more efficient data structure to implement in Varnish itself.
 * Varnish 4 may include surrogate key/secondary hashing (Fastly & Varnish were working independently on this) but while the release is imminent, deployment at Wikimedia is probably months away.
 * Increased utilization of image scalers
   * Faidon estimates that image scaler jobs would grow from the current ~75/s (avg) & ~110/s (max) to ~500-950/s to handle request volume with the current size of the Varnish cache and current usage patterns (and assuming no Varnish caches are out of commission, see below).
 * Increased latency for CDN misses
   * The requests that are currently satisfied by Swift fetches of generated thumbnails would instead require a fetch of the original media and a scaling transformation.
 * May not be reasonable for media types that have high thumbnail generation costs or a potentially huge number of thumbnails
   * Specifically, there are: a) multiple multi-page TIFF, PDF and DjVu files that tend to have a huge number of thumbnails, b) photographs or paintings, mostly TIFF, that are hundreds of megabytes in size.
 * Reduced hardware failure tolerance
   * Swift keeps 3 copies of each thumb distributed across the storage cluster to provide HA access to stored files.
   * Varnish uses persistent URL hashing to ensure the same backend Varnish is hit every time, so a single node holds any given thumb in cache. This makes storage in Varnish more susceptible to hardware failures, which in turn means increased imagescaler load with all of the above drawbacks.
   * The Varnish cluster in eqiad has 8 boxes, which means that ~12.5% of the cache resides on each one of them.
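
The load and failure figures above can be checked with a short calculation. The rates and node count come from the text; the placement function `backend_for` is a simplified hash-modulo stand-in (an assumption, not the hashing Varnish actually uses), and the 8000 URLs are synthetic.

```python
import hashlib
from collections import Counter

CURRENT_AVG = 75        # image scaler jobs/s today (average, from the text)
PROJECTED = (500, 950)  # estimated jobs/s under the proposal (from the text)
NODES = 8               # Varnish boxes in eqiad (from the text)

growth = tuple(rate / CURRENT_AVG for rate in PROJECTED)
share_per_node = 1 / NODES


def backend_for(url, nodes=NODES):
    """Map a URL to exactly one node (simplified hash-modulo placement)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % nodes


# Synthetic URLs: each lands on a single node, so losing that node loses the thumb.
urls = [f"/thumb/a/ab/File{i}.jpg/220px-File{i}.jpg" for i in range(8000)]
placement = Counter(backend_for(u) for u in urls)
lost_if_node_0_fails = placement[0] / len(urls)

print(f"scaler load growth: {growth[0]:.1f}x to {growth[1]:.1f}x")
print(f"expected share per node: {share_per_node:.1%}")
print(f"simulated share on node 0: {lost_if_node_0_fails:.1%}")
```

The estimate works out to roughly a 6.7x to 12.7x increase over the current average, before accounting for the additional regeneration work when a node holding about one eighth of the cache fails.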

Other strategies
Increased baseline image scaler load and the potential for wasting processing power in the image scaler cluster in order to handle traffic spikes may be issues that are too large to ignore in the solution to this problem. These concerns might be addressable via slightly more complex Swift caching strategies.


 * 1) Rather than storing generated thumbnails as permanent media, add an   header that specifies a TTL for the stored file.
   * This would allow Swift to purge files after some reasonable time rather than holding them indefinitely.
   * The right TTL would be one that strikes a balance between the cost of capacity for generating new thumbnails and the cost of storage for storing previously generated ones.
   * Unlike Varnish, this would still expire popular thumbnails, increasing latency even for frequently hit pages.
   * It is still unknown whether Swift's mechanism for deleting files scales, both at the container level and at the filesystem level: past attempts at issuing DELETEs have maxed Swift out at about 150-200 DELETE/s with 80% I/O wait on the backends.
 * 2) Use TTLs in Swift, but bump the TTL on hit with a sliding window (similar to Linux's   feature) to simulate LRU cache deletion.
   * Similar benefits as the prior option, but TTL tuning may be easier.
 * 3) Only store generated thumbnails in Swift for certain thumbnail sizes that are determined to be in widespread/suggested use. This option has also been presented in the Standardized thumbnails sizes RfC.
   * Most similar to current behavior, but attempts to minimize long-term storage costs by selective caching.
   * Different projects/language wikis have in the past requested different default thumbnail sizes, so "widespread/suggested" may be hard to pinpoint.
 * 4) Store "standard" thumbnails permanently and others with a TTL (and possibly last-use updating).
   * This variant would make update calls to the Swift layer less common than other TTL-based approaches and still have LRU-ish purge characteristics.
 * 5) Add a group of Varnish servers backed by hard disk to store thumbs instead of using Swift.
   * Varnish is a purpose-built HTTP cache system with TTL and LRU eviction.
   * No special purge mechanism needed, as it could share the HTCP stream with the frontend cache.
   * Does not address the reduced failure tolerance or  drawbacks of primary proposal.
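
The TTL-based options can be compared with a small in-memory model. The sketch below combines the sliding-window TTL idea with permanent storage for "standard" sizes. It is a toy, not a Swift client: the `ThumbStore` class, its parameters and the explicit `now` timestamps are assumptions made so that the behaviour is deterministic and easy to follow.

```python
class ThumbStore:
    """Toy thumbnail store: TTL with bump-on-hit, permanent for standard widths."""

    def __init__(self, ttl, standard_widths=()):
        self.ttl = ttl
        self.standard_widths = set(standard_widths)
        self._items = {}  # key -> (body, expires_at), expires_at None = permanent

    def put(self, key, width, body, now):
        # Standard widths never expire; everything else gets a TTL.
        expires_at = None if width in self.standard_widths else now + self.ttl
        self._items[key] = (body, expires_at)

    def get(self, key, now):
        item = self._items.get(key)
        if item is None:
            return None
        body, expires_at = item
        if expires_at is None:
            return body  # permanent entry, no update call needed
        if expires_at <= now:
            del self._items[key]  # expired: the caller must regenerate the thumb
            return None
        # Sliding window: a hit pushes the expiry out by a full TTL.
        self._items[key] = (body, now + self.ttl)
        return body


store = ThumbStore(ttl=10, standard_widths={220})
store.put("Example.jpg/120px", 120, b"odd size", now=0)
store.put("Example.jpg/220px", 220, b"standard size", now=0)

hit_at_5 = store.get("Example.jpg/120px", now=5)     # hit, expiry moves to 15
hit_at_14 = store.get("Example.jpg/120px", now=14)   # still alive thanks to the bump
miss_at_30 = store.get("Example.jpg/120px", now=30)  # idle too long, expired
standard = store.get("Example.jpg/220px", now=1000)  # never expires
```

With a plain TTL (no bump), the second read at time 14 would already be a miss. Every bump is a write to the store, which is why keeping standard sizes permanent reduces update traffic.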