Wikimedia Release Engineering Team/Wishlist

This is the wishlist for the Release Management and QA team.

Code Deploy Dashboard
Problem: We don't have a simple way of communicating what code is deployed to which wikis. Right now you need a combination of Gerrit and the MediaWiki Roadmap.

Stakeholders/use-cases:
 * Product Managers: I want to see one page where I can tell if code my team has developed is on all Wikipedias.
 * Developers: I want to know when code I have written is on the test wikis and when it'll be on English Wikipedia.
 * Technical Users/Readers: I want to see whether a recent deployment is the obvious explanation for a problem I am seeing on my home wiki.
 * Release Team: All of the above.

Solution ideas:
 * Create a dashboard page, hosted in Gerrit, that displays the currently deployed/active branches on the WMF Cluster (and, for bonus points, the Beta Cluster). Each branch links to a single page in Gerrit that lists all of its changes and their short commit messages.
 * The above, but hosted on mediawiki.org (updated by a bot on every commit, so it is never out of date).
 * The above, but hosted on its own vhost and running on git.wikimedia.org.

Raw thoughts from Ryan Lane:

20:14 < Ryan_Lane> greg-g: for the code deploy dashboard...
20:14 < Ryan_Lane> trebuchet writes its deployment info into redis
20:14 < Ryan_Lane> if we change its schema some we can track each deployment separately by tag
20:15 < Ryan_Lane> as well as the deployment message that went along with it
20:15 < Ryan_Lane> then a dashboard could just read from redis
20:16 < Ryan_Lane> currently each deployment for a repo overwrites the data from the last, to make things simpler
20:16 < Ryan_Lane> hm. or does it? did I change that
20:16 < Ryan_Lane> I need to document the schema being used
20:17 < Ryan_Lane> hm, I should also put the schema/attribute mapping in a pillar so that it can be changed without needing to modify the code everywhere
... later, in a different channel
00:18 < Ryan_Lane> it [NB: a dashboard] was my original intention for the data, but a cli was quicker/easier to build
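
To make the "track each deployment separately by tag" idea concrete, here is a minimal sketch. It does not reflect Trebuchet's actual schema (which the log above says is undocumented): the `deploy:<repo>:<tag>` key layout, the `DeploymentLog` class, and its method names are all hypothetical, and a plain dict stands in for Redis so the example runs on its own.

```python
from collections import defaultdict


class DeploymentLog:
    """In-memory stand-in for the Redis store (hypothetical key layout)."""

    def __init__(self):
        self._store = {}                 # "deploy:<repo>:<tag>" -> record
        self._tags = defaultdict(list)   # repo -> tags, in deploy order

    def record(self, repo, tag, message):
        # One record per (repo, tag), so a new deployment no longer
        # overwrites the data from the previous one.
        key = f"deploy:{repo}:{tag}"
        self._store[key] = {"repo": repo, "tag": tag, "message": message}
        self._tags[repo].append(tag)

    def history(self, repo):
        """All recorded deployments for a repo, oldest first."""
        return [self._store[f"deploy:{repo}:{t}"] for t in self._tags[repo]]

    def current(self, repo):
        """The most recent deployment for a repo, or None if there is none."""
        tags = self._tags.get(repo)
        return self._store[f"deploy:{repo}:{tags[-1]}"] if tags else None
```

A dashboard would then only need `current()` for the "what is live" view and `history()` for the per-repo timeline.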

Release Notes Sanity
Problem:

13:09 <    bd808> Gah! Release notes! F U release notes
13:10 <   greg-g> :)
13:10 <     ori-l> it's one of our "you forgot to say simon says" -1 tricks

Stakeholders:
 * All MediaWiki developers - When committing a change to MW that needs a note in the RELEASE-NOTES-X.XX file, I shouldn't have to play games with Gerrit and rebase continuously.

Solution ideas:
 * rm -f RELEASE-NOTES-X.XX
 * Merge conflict resolution driver
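
As a rough illustration of what a merge conflict resolution driver could do, the sketch below keeps the additions from both sides instead of raising a conflict. Git can call a custom merge driver with the ancestor, current, and other versions of a file (the `%O`, `%A`, `%B` placeholders), and it also ships a built-in `union` driver with similar intent. The function name `merge_notes` is hypothetical, and the sketch assumes that both sides only add lines and that the line preceding each addition is unique in the file.

```python
import difflib


def merge_notes(base, ours, theirs):
    """Three-way merge of line lists that keeps both sides' additions.

    Only pure insertions made by `theirs` are carried over; edits and
    deletions are ignored in this sketch.
    """
    merged = list(ours)
    matcher = difflib.SequenceMatcher(a=base, b=theirs, autojunk=False)
    for op, i1, _i2, j1, j2 in matcher.get_opcodes():
        if op != "insert":
            continue
        # Skip lines that our side already contains.
        added = [line for line in theirs[j1:j2] if line not in merged]
        if i1 == 0:
            pos = 0  # inserted at the very top of the file
        else:
            anchor = base[i1 - 1]  # base line just before the insertion
            if anchor in merged:
                pos = merged.index(anchor) + 1
            else:
                pos = len(merged)  # anchor is gone on our side: append
        merged[pos:pos] = added
    return merged
```

When both sides add an entry under the same heading, the result contains both entries, with the incoming one placed first.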

True code pipeline
Right now code does not follow the same pipeline universally. For example:
 * Code merged to master on Friday will be on the Beta Cluster for one full week before going out on the test wikis.
 * Code merged to master on Thursday early morning will be on the Beta Cluster for all of an hour (or a minute) before going to the test wikis.

This is not ok. Code should propagate from master -> Beta Cluster -> test production -> other wikis in a consistent fashion.

Proposal
Nightly test tags


 * Every night at a specified time we tag master in core and all extensions in use on the cluster.
 * Pre/post-merge unit tests are run against the tag.
 * The tag is deployed to the Beta Cluster using multiversion.
 * Browser tests are run against the Beta Cluster, hitting the nightly tag version.
 * In the morning we review the browser test and unit test output and determine suitability for deploy.
 * On Thursday morning we deploy the tag as the new wmfXX deployed version.
 * If the latest tag is not suitable (due to failed browser tests etc.), we use the last suitable nightly (e.g. Tuesday night's).
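
The fallback rule in the last step can be sketched as follows. This is only an illustration: the `Nightly` record, its fields, and the function name `pick_deploy_tag` are hypothetical, and "suitable" is reduced here to both test suites passing, whereas the proposal leaves that call to a human review.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Nightly:
    tag: str          # hypothetical tag name for one night's snapshot
    unit_ok: bool     # unit tests passed against this tag
    browser_ok: bool  # browser tests passed on the Beta Cluster


def pick_deploy_tag(nightlies: Sequence[Nightly]) -> Optional[str]:
    """Return the newest suitable tag, given nightlies ordered oldest first."""
    for nightly in reversed(nightlies):
        if nightly.unit_ok and nightly.browser_ok:
            return nightly.tag
    return None  # no suitable nightly this week: nothing to deploy
```

For example, if Wednesday night's tag failed its browser tests but Tuesday night's passed everything, the function returns Tuesday's tag.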

Auto-populate 'Important Changes' from RELEASE-NOTES-1.XX
See the 'important changes' section in most wmfXX release pages, e.g. this one.

It would be nice to auto-create that based on the RELEASE-NOTES file.
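
A first step toward that could be a small parser that groups the notes by section. The sketch below assumes the RELEASE-NOTES file uses wikitext-style headings (`=== Title ===`) and `* ` bullets with indented continuation lines; that format, and the function name `parse_release_notes`, are assumptions for illustration and are not taken from the RFC.

```python
import re

# Matches "== Title ==", "=== Title ===", and so on.
HEADING = re.compile(r"^(={2,})\s*(.*?)\s*\1\s*$")


def parse_release_notes(text):
    """Map each section title to the list of bullet entries below it."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(2)
            sections.setdefault(current, [])
        elif current is None:
            continue  # text before the first heading is ignored
        elif line.startswith("* "):
            sections[current].append(line[2:].strip())
        elif line.startswith("  ") and line.strip() and sections[current]:
            # Indented continuation of the previous bullet.
            sections[current][-1] += " " + line.strip()
    return sections
```

The resulting dictionary could then be rendered into the 'important changes' list on a release page, with a person still choosing which entries count as important.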

See the RFC: Requests_for_comment/Release_notes_automation

block commits on warnings... almost
Let's make higher quality code. One way is to have a voting test that fails on code compile/build warnings. That's a bit extreme.

An alternative is to compare the number of warnings (and TODOs/FIXMEs?) before and after the commit and only fail if that number increases.
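
A minimal sketch of that comparison, assuming the build output is available as text for both the parent commit and the proposed commit, and that every warning appears on its own line containing the word "warning". The function names are hypothetical, and counting TODOs/FIXMEs would need a similar pass over the source files instead of the build log.

```python
import re

# Matches "warning" / "Warning" as a whole word, but not "warnings"
# (so summary lines like "2 warnings generated" are not counted).
WARNING = re.compile(r"\bwarning\b", re.IGNORECASE)


def count_warnings(build_output: str) -> int:
    """Count the lines of build output that report a warning."""
    return sum(1 for line in build_output.splitlines() if WARNING.search(line))


def commit_passes(before_output: str, after_output: str) -> bool:
    """Pass unless the commit increases the number of warnings."""
    return count_warnings(after_output) <= count_warnings(before_output)
```

This lets existing warnings stay without blocking anyone, while a commit that adds a new one gets a failing vote.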