Extension:SwiftMedia

OpenStack (http://OpenStack.org) includes an object store system called Swift. This code allows you to use a Swift repository to store MediaWiki media files. There are two parts to this code. The first is middleware for Swift's proxy server, which converts MediaWiki image URLs into the URL format needed by Swift. The second is an extension to MediaWiki.

Swift middleware
Swift hands its files out to users via a proxy. You can access the cluster directly, but then you need to know as much about the system as the proxy knows, so unless you want to go to that effort, you should use the proxy, and we do. The proxy requires a URL with three parts: an account name, a container name, and an object name. The account name is a function of the authentication system and is a long hex string, effectively a UUID. The container name is simply an opaque string that contains no slashes. The object name may have anything in it.

Our media store URLs, on the other hand, start with the name of the host (or possibly a separate host), followed by the string "images" (by default), possibly several hashed subdirectory levels, and the name of the object. In the case of Wikipedia, the host is "upload.wikimedia.org", followed by "wikipedia/commons" instead of "images", followed by two levels of hashing, and the name of the file. Thumbnails, archived files, and deleted files have a prefix in front of the hashing. These URLs are generally published, so we have to work with them.

The middleware inserts the account name into the URL, converts the "wikipedia/commons" section into a Swift container name by replacing slash with %2F, adds "%2Fthumb" or "%2Farchived" or "%2Fdeleted" to the container name and adds the rest of the hashing and filename as the object name. Swift doesn't need the hashing since it does its own hashing; it can take or leave our hashing. For backwards compatibility and ease of finding files, we leave it there. Once the URL has been rewritten, it gets handed to the remainder of the Swift proxy, which then hands the file back.
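
As a rough illustration of the rewrite described above, the following Python sketch maps the path of a media URL to a path for the Swift proxy. It is not the middleware's actual code. The account string, the "/v1/" path prefix, the regular expression, and the function name rewrite_path are all assumptions made for the example.

```python
import re

# Hypothetical account string; real ones are issued by the auth system.
ACCOUNT = "AUTH_0123456789abcdef"

# Two leading path components (e.g. "wikipedia/commons"), an optional
# prefix named in the text above, then the hashing and the file name.
_PATH_RE = re.compile(
    r"^/(?P<site>[^/]+)/(?P<lang>[^/]+)"
    r"(?:/(?P<zone>thumb|archived|deleted))?"
    r"/(?P<obj>.+)$"
)


def rewrite_path(path, account=ACCOUNT):
    """Return a Swift-style path for a media URL path, or None if no match."""
    m = _PATH_RE.match(path)
    if m is None:
        return None
    # Container names may not contain slashes, so join with %2F.
    container = m.group("site") + "%2F" + m.group("lang")
    if m.group("zone"):
        container += "%2F" + m.group("zone")
    # The hashing is kept as part of the object name.
    return "/v1/{}/{}/{}".format(account, container, m.group("obj"))


print(rewrite_path("/wikipedia/commons/a/ab/Foo.jpg"))
print(rewrite_path("/wikipedia/commons/thumb/a/ab/Foo.jpg/120px-Foo.jpg"))
```

In this sketch the hashed directories stay in the object name untouched, matching the backwards-compatibility choice described above.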

So yes, Swift's proxy is serving up image files to our caching front-ends. Usually a token is needed to access files, but we've marked some containers as "public", meaning that no token is needed.

404 handler
The middleware intercepts the return value from Swift, and looks at the result. If it's a 404 error, the 404 handler is invoked. Currently it contacts the existing thumbnail server and fetches the file. In the future it will create a scaled version of the file.
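
The interception step can be sketched as a small WSGI wrapper. This is a minimal illustration under stated assumptions, not the actual handler: the class name NotFoundFallback and the fetch_fallback callable (which stands in for contacting the existing thumbnail server) are hypothetical, and the sketch buffers the whole response body for simplicity, which a production proxy would avoid for large files.

```python
class NotFoundFallback:
    """Wrap a WSGI app; on a 404, try a fallback source instead."""

    def __init__(self, app, fetch_fallback):
        self.app = app
        # Hypothetical callable: path -> (content_type, bytes) or None.
        self.fetch_fallback = fetch_fallback

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Record the status instead of sending it right away.
            captured["status"] = status
            captured["headers"] = headers
            return lambda chunk: None  # legacy write() callable, unused

        result = self.app(environ, capture)
        try:
            body = list(result)  # buffered for simplicity
        finally:
            if hasattr(result, "close"):
                result.close()

        if captured["status"].startswith("404"):
            fallback = self.fetch_fallback(environ.get("PATH_INFO", ""))
            if fallback is not None:
                content_type, data = fallback
                start_response("200 OK", [
                    ("Content-Type", content_type),
                    ("Content-Length", str(len(data))),
                ])
                return [data]

        # Not a 404, or no fallback available: pass the response through.
        start_response(captured["status"], captured["headers"])
        return body
```

Passing the fallback in as a callable keeps the wrapper independent of how the missing file is obtained, so fetching from the thumbnail server could later be swapped for scaling the file locally.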

MediaWiki Extension
Swift provides no access to a filesystem; it is an object server, not a file server. In order to allow our media handlers to do their work, the extension pulls files in from Swift, runs the media handler, and writes the resulting file out to the object store in the appropriate location. When a file is uploaded, rather than being stored in the filesystem, it is uploaded as a Swift object.
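
The pull, process, and write-back flow can be sketched as follows. The extension itself is MediaWiki code; this Python sketch only illustrates the pattern. The InMemoryStore class, the transform_object function, and the handler signature are hypothetical stand-ins, with a dictionary taking the place of a real Swift cluster.

```python
import os
import tempfile


class InMemoryStore:
    """Hypothetical stand-in for an object store: no paths, only objects."""

    def __init__(self):
        self.objects = {}

    def get(self, container, name):
        return self.objects[(container, name)]

    def put(self, container, name, data):
        self.objects[(container, name)] = data


def transform_object(store, src, dst, handler):
    """Fetch src, run a file-based handler on it, and store the result at dst.

    src and dst are (container, name) pairs; handler(in_path, out_path)
    reads and writes ordinary local files.
    """
    with tempfile.TemporaryDirectory() as workdir:
        in_path = os.path.join(workdir, "input")
        out_path = os.path.join(workdir, "output")
        # Pull the object into a temporary local file.
        with open(in_path, "wb") as f:
            f.write(store.get(*src))
        # The handler only ever sees local files.
        handler(in_path, out_path)
        # Write the result back to the object store.
        with open(out_path, "rb") as f:
            store.put(dst[0], dst[1], f.read())
```

The temporary directory is removed once the result has been written back, so nothing persists on the local filesystem.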