Extension:ExtensionDistributor/tardist

tardist is a proposed service to enhance the current ExtensionDistributor functionality. It builds partly on the functionality currently available at extdist.wmflabs.org (code). The proposal is for the service to live in WMF production. It would not be exposed to the world directly, but would probably be exposed through api.php (I'm not 100% sure about this).

We'll continue to use an hourly cron to update extensions, and at that time we'll read the extension.json file of the master branch (we should probably do it for all branches?) to extract metadata. Tarballs will be stored in a directory somewhere and served by some other webserver (Yuvi said we should do that, though it could fall back to the flask app for local debugging?). The extracted data will be stored in a small sqlite database and exposed via an API endpoint.
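As a rough illustration of the metadata step, the sketch below reads extension.json from a local clone and upserts a few fields into sqlite. The table layout, the function names (`read_metadata`, `store_metadata`) and the choice of fields are assumptions made for this example, not part of the proposal. The keys `license-name`, `description`, `descriptionmsg` and `url` are standard extension.json keys.

```python
import json
import sqlite3
from pathlib import Path
from typing import Optional

# Hypothetical schema: one row per (kind, name, branch).
SCHEMA = """
CREATE TABLE IF NOT EXISTS info (
    kind        TEXT NOT NULL,
    name        TEXT NOT NULL,
    branch      TEXT NOT NULL,
    license     TEXT,
    description TEXT,
    url         TEXT,
    PRIMARY KEY (kind, name, branch)
)
"""


def read_metadata(clone_dir: Path) -> Optional[dict]:
    """Return selected extension.json fields, or None if the file is missing or invalid."""
    try:
        data = json.loads((clone_dir / "extension.json").read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Older extensions may have no extension.json; skip them.
        return None
    if not isinstance(data, dict):
        return None
    return {
        "license": data.get("license-name"),
        # Fall back to the message key when no literal description is given.
        "description": data.get("description") or data.get("descriptionmsg"),
        "url": data.get("url"),
    }


def store_metadata(db: sqlite3.Connection, kind: str, name: str,
                   branch: str, meta: dict) -> None:
    """Insert or refresh the row for one extension/skin branch."""
    db.execute(SCHEMA)
    db.execute(
        "INSERT OR REPLACE INTO info "
        "(kind, name, branch, license, description, url) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (kind, name, branch, meta["license"], meta["description"], meta["url"]),
    )
    db.commit()
```

The hourly cron could call `read_metadata` after updating each clone and pass the result to `store_metadata`. Because of the primary key, re-running the job replaces rows instead of duplicating them.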

Local clones of each extension/skin will be kept on the file system somewhere. The server would be firewalled to only allow outgoing connections to gerrit.wikimedia.org (Phabricator in the future) and incoming connections from MediaWiki app servers.

Proposed endpoints

 * / - tardist version info / link to docs?
 * /list - returns a list of extensions+skins that exist
 * /list/extensions
 * /list/skins
 * /info/extensions/MassMessage - returns list of branches+links to tarballs, and other metadata (license, description (localized?), etc...) that is available in extension.json
 * /info/extensions - batch return of all extension info maybe?
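The endpoints listed above could be backed by a small dispatch function over the sqlite data. The sketch below is framework-neutral and uses only the standard library. In the flask app each branch would presumably become its own route. The `info` table, the `init_db` and `handle` names, and the response shapes are all assumptions for illustration. Tarball links and the batch /info/extensions endpoint are left out because their format is still open.

```python
import sqlite3

KINDS = ("extensions", "skins")


def init_db(db: sqlite3.Connection) -> None:
    """Create the hypothetical metadata table used by this sketch."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS info ("
        "kind TEXT, name TEXT, branch TEXT, "
        "license TEXT, description TEXT, url TEXT, "
        "PRIMARY KEY (kind, name, branch))"
    )


def handle(db: sqlite3.Connection, path: str):
    """Map a request path to (HTTP status, JSON-serializable body)."""
    parts = [p for p in path.split("/") if p]

    # "/" : service info
    if not parts:
        return 200, {"name": "tardist"}

    # "/list", "/list/extensions", "/list/skins"
    if parts[0] == "list" and len(parts) <= 2:
        kinds = KINDS if len(parts) == 1 else (parts[1],)
        if any(kind not in KINDS for kind in kinds):
            return 404, {"error": "unknown type"}
        result = {}
        for kind in kinds:
            rows = db.execute(
                "SELECT DISTINCT name FROM info WHERE kind = ? ORDER BY name",
                (kind,),
            )
            result[kind] = [row[0] for row in rows]
        return 200, result

    # "/info/extensions/MassMessage" : per-branch metadata for one item
    if parts[0] == "info" and len(parts) == 3 and parts[1] in KINDS:
        rows = db.execute(
            "SELECT branch, license, description, url FROM info "
            "WHERE kind = ? AND name = ? ORDER BY branch",
            (parts[1], parts[2]),
        ).fetchall()
        if not rows:
            return 404, {"error": "not found"}
        branches = {
            branch: {"license": lic, "description": desc, "url": url}
            for branch, lic, desc, url in rows
        }
        return 200, {"name": parts[2], "branches": branches}

    return 404, {"error": "unknown endpoint"}
```

Keeping the lookup logic in a plain function like this would also make it easy to test without starting a web server.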

Requirements

 * web server running flask app
 * space for a small sqlite database (could be mysql if that would be easier)
 * hourly cronjob
 * filesystem space: on labs, the src git clones take ~5G, tarballs are ~1.4G. That will grow as new versions of MediaWiki are released and new extensions are created.

Why?

 * The service in labs is a hack. Since production cannot talk to labs, the ExtensionDistributor (ED) extension just talks to gerrit and assumes that the labs instance is up to date.
 * ED can only list extension or skin names, but we have so much other metadata available like description, license, url that *should* be shown to users
 * The ED extension is only meant to be deployed at mediawiki.org, so there are no other external users of the extension that this will need to support