Wikimedia Release Engineering Team/Checkin archive/20150414

= 2015-04-14 =

Team Quarterly Goals
https://phabricator.wikimedia.org/maniphest/query/O9isnUt5IGLP/#R

Scrum of Scrums

 * https://phabricator.wikimedia.org/project/board/64/
 * Blocked on us: https://phabricator.wikimedia.org/maniphest/?statuses=open%2Cstalled&allProjects=PHID-PROJ-arpazvuktn2l647rb6us#R

Beta Cluster stability / Staging

 * https://phabricator.wikimedia.org/project/board/497/?order=priority
 * Quarterly Priority: Green nightly builds on staging: https://phabricator.wikimedia.org/T88701
 * Quarterly Priority: Stable uptime metrics of the Staging cluster: https://phabricator.wikimedia.org/T88705


 * rin tin tin tin
 * Yuvi has stuff for oids, in discussion

Deployment Cabal

 * First meeting with Marco was very productive; Marco highlighted issues the Services team has had (and still has) with Trebuchet
 * Trebuchet as a building block: seems to fit the organization's needs

Information leaked from the project super secret etherpad (NOFORN):
 * Services need better visibility and hookability:
   * Better feedback and interactivity from the git deploy CLI
   * One aspect of this is non-blocking interactivity: git deploy should report thorough, real-time feedback and let the deployer take action during the deployment process, not just after
 * Needs to have:
   * rolling restarts
   * 'canary' testing
   * easy semi-automatic rollback when things go bad
   * emergency abort if the deployer notices something went wrong
   * ability to isolate and investigate failed nodes when the majority of nodes succeed but a few fail
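
As a rough illustration of how the requirements above could fit together, here is a minimal sketch in Python. It is not Trebuchet code and does not reflect its API: `rolling_deploy`, `DeployResult`, and the `deploy`, `rollback`, and `should_abort` callables are all hypothetical names invented for this example. It assumes nodes are updated one at a time, that the first `canary_count` nodes act as canaries, and that a small number of failures is tolerated before rolling back.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DeployResult:
    succeeded: List[str] = field(default_factory=list)  # nodes that took the new code
    failed: List[str] = field(default_factory=list)     # isolated for investigation
    rolled_back: bool = False
    aborted: bool = False


def rolling_deploy(
    nodes: List[str],
    deploy: Callable[[str], bool],       # hypothetical: deploy + restart one node
    rollback: Callable[[str], None],     # hypothetical: revert one node
    canary_count: int = 1,
    max_failures: int = 1,
    should_abort: Callable[[], bool] = lambda: False,
) -> DeployResult:
    """Deploy node by node, with canaries, tolerated failures, and rollback."""
    result = DeployResult()
    for index, node in enumerate(nodes):
        # Emergency abort: checked before every node, so the deployer can
        # stop mid-deployment rather than only after it finishes.
        if should_abort():
            result.aborted = True
            break

        ok = deploy(node)
        (result.succeeded if ok else result.failed).append(node)

        # A canary failure, or too many failures overall, triggers rollback
        # of every node that already received the new code.
        is_canary = index < canary_count
        if not ok and (is_canary or len(result.failed) > max_failures):
            for done in result.succeeded:
                rollback(done)
            result.rolled_back = True
            break
    return result
```

With the defaults, a single non-canary failure does not stop the run: the failed node is simply recorded in `failed` so it can be looked at later, while a failing canary stops everything before any other node is touched.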

Homework:
 * Mukunda and Chad will work further on the mediawiki/core and extensions release branching strategy
 * Mukunda is experimenting with git-subtree to create a composite release repo with all deployed extensions merged into one place
 * Tyler will attempt to document the hooks in Trebuchet that we can use to provide custom behavior for individual services or projects
   * This is already in the Trebuchet core and is configurable, just not very discoverable right now
 * Dan to sit in on deployments to get a better sense for the current practices

Test history

 * Quarterly Priority: By team test history: https://phabricator.wikimedia.org/T88706


 * prototyped with Riojs + Elasticsearch
 * need feedback from potential consumers (James F, Jon Robson, Elena, ... ?)

Isolated CI instances

 * https://phabricator.wikimedia.org/tag/continuous-integration/board/?order=priority
 * Quarterly Priority: Disposable VMs - https://phabricator.wikimedia.org/T47499


 * *Antoine whistles*
 * Nodepool packaging
 * moving Jenkins & Zuul to new servers: will cause delay but worth it
 * Puppet work to be paired by Antoine/Zeljko with reviews by Andrew/Chase

MediaWiki Releases

 * Quarterly Priority: Release MediaWiki 1.25: https://phabricator.wikimedia.org/T88709
 * List of open 1.25 bugs: https://phabricator.wikimedia.org/maniphest/query/BPlQRzYIEE31
 * Ones with patch-for-review https://phabricator.wikimedia.org/maniphest/query/tfwMAVRECGaZ/#R

Should the MediaWiki release be an actual project with a workboard? i.e. https://phabricator.wikimedia.org/tag/MW-1.25-release/

Other Work
Mukunda had a Phabricator meeting with Chase, Andre and Quim.
 * Chase will be less available soonish
 * Mukunda & Chad are getting root on Phabricator server, Andre getting admin privs to handle user/repo maintenance
 * We need to take over responsibility for Phabricator upgrades (monthly) after this month
 * There is a lot of ongoing Phabricator process work; we might take an interest in some of it (looking for links)
 * Phacility Paid prio: https://secure.phabricator.com/T7711

Hiring

 * delayed...

Vacations/Confs/etc

 * Antoine: observing French holidays: Fri May 1, Fri May 7-8, Thurs May 14
 * Dan in France the week before offsite (May 11-15)
 * Week before hackathon: Team offsite in France - https://phabricator.wikimedia.org/T89036
 * May: Hackathon in Lyon, France
 * Chad vacation after offsite/hackathon (through 5/31)
 * Elena on vacation May 26 - June 7