Extension:ExtensionDistributor/tardist

tardist is a proposed service to enhance the current ExtensionDistributor functionality. Parts of it build on the functionality currently available at extdist.wmflabs.org (code). The proposal is for the service to live in WMF production. It would not be exposed to the world directly, but might be exposed through api.php (I'm not sure about this yet).

Rather than updating extensions with an hourly cron job like extdist, we'll use a push system: Jenkins will notify the service whenever a new change is merged to an extension, and tell it which branch needs updating. A Flask app will receive these notifications and communicate with a continuously running worker script through a redis queue. Tarballs will be stored in a directory somewhere and served by some other web server (Yuvi said we should do that, though it could fall back to the Flask app for local debugging?).
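A minimal sketch of that hand-off between the web app and the worker, not a definitive implementation. It uses Python's standard-library queue as a stand-in for redis so that it runs anywhere. The names enqueue_update and work_once, the JSON job format, and the coalescing of duplicate notifications are all assumptions made for illustration.

```python
import json
import queue

# Stand-in for the redis queue (assumption: a real deployment would
# presumably push to and pop from a redis list instead).
jobs = queue.Queue()
# Jobs that are queued but not yet started, used to coalesce duplicates.
pending = set()


def enqueue_update(kind, name, branch):
    """Called by the web app when Jenkins reports a merge.

    Returns True if a new job was queued, False if an identical job
    was already waiting.
    """
    if kind not in ("extensions", "skins"):
        raise ValueError(f"unknown kind: {kind!r}")
    key = (kind, name, branch)
    if key in pending:
        return False
    pending.add(key)
    jobs.put(json.dumps({"kind": kind, "name": name, "branch": branch}))
    return True


def work_once(build):
    """Handle at most one queued job; returns True if a job was handled.

    `build` is a hypothetical callable that rebuilds one tarball.
    """
    try:
        raw = jobs.get_nowait()
    except queue.Empty:
        return False
    job = json.loads(raw)
    # Clear the marker before building, so a merge that lands during
    # the build queues a fresh job rather than being dropped.
    pending.discard((job["kind"], job["name"], job["branch"]))
    build(job["kind"], job["name"], job["branch"])
    return True
```

The worker script would call work_once in a loop. With redis, a blocking pop would replace get_nowait so the worker sleeps until a job arrives.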

Local clones of each extension/skin will be kept somewhere on the file system, and we'll use redis to cache responses to web requests. The server would be firewalled to only allow outgoing connections to gerrit.wikimedia.org (Phabricator in the future) and incoming connections from MediaWiki app servers.
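The caching could follow a simple cache-aside pattern, sketched below under some assumptions: an in-memory dict stands in for redis, entries expire after a fixed time, and the worker invalidates an entry after rebuilding a tarball. The class name TTLCache and its methods are hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for the redis cache (illustrative only)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl        # seconds an entry stays valid
        self.clock = clock    # injectable so tests can control time
        self._store = {}      # key -> (expiry time, value)

    def get_or_compute(self, key, compute):
        """Return the cached value, or call compute() and cache its result."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry, e.g. after the worker rebuilds a tarball."""
        self._store.pop(key, None)
```

With redis itself, the expiry would normally be handled by setting a time-to-live on the key rather than storing it alongside the value.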

Proposed endpoints

 * / - tardist version info / link to docs?
 * /update/extensions/MassMessage/master - queues a rebuild of the tarball for that branch (called automatically by Jenkins post-merge)
 * /list - returns a list of extensions+skins that exist
 * /list/extensions
 * /list/skins
 * /info/extensions/MassMessage - returns a list of branches with links to their tarballs, plus other metadata available in extension.json (license, description (localized?), etc.)
 * /info/extensions - batch return of info for all extensions, maybe?
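To make the endpoint shapes concrete, here is a rough sketch of how the paths above might be parsed and what an /info response might contain. The function names, the tuple layout, and the response keys are assumptions. The extension.json keys read here (name, license-name, description, descriptionmsg, url) are standard fields of that file.

```python
import json

KINDS = ("extensions", "skins")


def parse_path(path):
    """Map a request path to (action, kind, name, branch).

    Slots that do not apply to an action are None.
    """
    parts = [p for p in path.split("/") if p]
    if not parts:
        return ("version", None, None, None)
    action, rest = parts[0], parts[1:]
    if rest and rest[0] not in KINDS:
        raise ValueError(f"unknown kind in path: {path!r}")
    if action == "list" and len(rest) <= 1:
        return ("list", rest[0] if rest else None, None, None)
    if action == "info" and len(rest) in (1, 2):
        return ("info", rest[0], rest[1] if len(rest) == 2 else None, None)
    if action == "update" and len(rest) == 3:
        return ("update", rest[0], rest[1], rest[2])
    raise ValueError(f"unrecognised path: {path!r}")


def build_info(extension_json_text, tarballs):
    """Assemble a hypothetical /info payload.

    `tarballs` maps branch name -> tarball link, as produced by the worker.
    """
    meta = json.loads(extension_json_text)
    return {
        "name": meta.get("name"),
        "license": meta.get("license-name"),
        "description": meta.get("description"),
        # Message key; localizing it would need the extension's i18n files.
        "descriptionmsg": meta.get("descriptionmsg"),
        "url": meta.get("url"),
        "branches": dict(sorted(tarballs.items())),
    }
```

For example, parse_path("/update/extensions/MassMessage/master") yields ("update", "extensions", "MassMessage", "master"), which the web app could then hand to the queue.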

Requirements

 * web server running the Flask app
 * redis instance for the queue and caching
 * continuously running worker process
 * filesystem space: on labs, the source git clones take ~5G and the tarballs ~1.4G. That will grow as new versions of MediaWiki are released and new extensions are created.

Why?

 * The service in labs is a hack. Since production cannot talk to labs, the ED extension just talks to Gerrit and assumes that the labs instance is up to date.
 * ED can only list extension or skin names, but we have so much other metadata available (description, license, URL) that *should* be shown to users.
 * The ED extension is only meant to be deployed at mediawiki.org, so there are no other external users of the extension that this will need to support.