Wikimedia Release Engineering Team/Quarterly review, November 2013

Date: 2013-11-19 | Time: 10:00 Pacific | Slides: Google Docs | Notes: Etherpad

Who:
 * Leads: Greg G, Chris M
 * Virtual team: Greg G, Chris M, Antoine, Sam, Chris S, Chad, Zeljko, Michelle G, Andre, Ariel
 * Other review participants (invited): Robla, Sumana, Quim, Maryana, James F, Ryan Lane, Ken, Terry, Tomasz, Alolita

Topics: Deploy process/pipeline, release process, bug fixing (the how of priority), code review, code management, security deploy/release, automation prioritization

Big picture
Release Engineering and QA are where our efforts in Platform can be amplified. When we do things well, we start to see more responsive development with higher-quality code. That is our focus, all in an effort to pave the path to a more reliable continuous deployment environment. What we want to accomplish:
 * More appreciation of, response to, and creation of tests in development
 * Better monitoring and reporting out of our development and deployment processes, especially test environments and pre-deployment
 * Reduce time between code being merged and being deployed
 * Provide information about software quality in a way that informs release decisions
 * Help WMF Engineering learn and adapt from experience

Team roles
Many people outside of the virtual team play an important role in releases, but this review will focus on the work of the following people in the following roles:
 * Release engineering: Greg G, Sam, Chris S (security)
 * QA and Test Automation: Chris M, Zeljko, Michelle G*
 * Bug escalation: Andre, Greg G, Chris M, Chris S (security)
 * Beta cluster development/maintenance: Antoine, Ariel(?), Sam
 * Development tools (e.g. Gerrit, Jenkins): Antoine

Goals from last quarter
New priorities:
 * Align browser test coverage to high profile features.
 * ✅ Apply model of Language/Mobile embedded QA to a new feature team (specifically VisualEditor)
 * ✅ Include more user contributed code testing (eg: Gadgets)
 * Began, but stopping: Increase capacity through community training for browser tests
 * Browser tests reliably tracking features across WMF software development projects in beta cluster
 * Tests moved into the repositories of the code being tested
 * ❌ support for single-host extension e.g. Parsoid
 * Start comprehensive quarterly assessments of postmortems
   * but much more to do here; possibly more next quarter (see below)
 * Improve our deployment process
   * automate as much as possible
   * improve monitoring
     * see DevOps Sprint 2013
   * improve tooling (eg: atomic updates/rollbacks and cache invalidation)
     * see DevOps Sprint 2013
 * Take the Beta Cluster to the next level
   * monitoring of fatals, errors, performance
     * ✅ Ganglia
     * still needed: Graphite and Icinga (Ops support needed)
   * add more automated tests for eg the API
     * pending: Beta Cluster support for Parsoid
     * pending: implement API tests on top of VE+Parsoid
   * General improvements:
     * ✅ support for automatic deployment of git submodules e.g. VisualEditor
     * ✅ support for pre-release extensions with full i18n messages
   * feed experiences/gained knowledge of Beta Cluster automation up to production automation (ONGOING)
 * ✅ Successfully streamline the review and deployment of extensions (brought in from Q2)

Goals
Vis-à-vis the WMF Engineering 2013-14 goals.

Keep:
 * Successfully managed the first release of MediaWiki in conjunction with our outside contractor
 * Work with Antoine on Swift-backed download.wikimedia
 * Browser tests managed in feature repos with feature teams
 * More comprehensive quarterly assessments of postmortems

New Priority:
 * Automated API integration tests in important areas:
   * UploadWizard
   * Parsoid
   * ResourceLoader
 * Expose fatal errors from both unit tests and browser tests to teams
 * Create process documentation for ideal test/deployment steps (eg: ThoughtWorks exercise)

Deprioritize:
 * Successfully streamline the review and deployment of extensions (Done in Q1)
 * Manage build times, parallel execution, continuous execution of browser tests for optimum coverage (vague, ongoing goal)
 * Focus on community contributions and non-WMF

Dependencies
Ops dependency:
 * Provide true HTTPS support on the Beta Cluster
 * Icinga for Beta Cluster
 * More comprehensive quarterly assessments of postmortems

MW Core dependency:
 * automation of deployment process
 * Monitoring of deploys (performance)
 * Security patch management on cluster (especially after new wmfXX branches are cut)

Questions

 * Q: who is working on Vagrant support?
 * A: mostly distributed (e.g. Adam, Yuvi, etc)


 * Tomasz: what is being done to distribute work to other teams?
 * Greg: I think that's what our first bullet point was trying to get at
 * Chris: we have a great model working with the Language team


 * Resourcing for deployment tooling work (Erik)
 * Platform needs to determine its ability to delegate some of its long tail of responsibilities
 * begin with a sprint to kick off dev tooling with Aaron/Ori (arch) and Bryan (dev)?
 * maybe easier to do limited bursts of support with an explicit backlog of issues for a specific component (eg ResourceLoader)

Actions

 * ACTION: Need to document the feature requirements of sartoris/etc - possible task for Bryan Davis after scholarship app (GG)
 * ACTION: clarify priority of work with Antoine re Vagrant spin ups for Jenkins builds (GG)
 * ACTION: change checkins to weekly (GG)
 * ACTION: revamp meeting style and project management (GG)