Wikimedia Release Engineering Team/Quarterly review, November 2013

Date: 2013-11-19 | Time: 10:00 Pacific | Slides: TBC | Notes: etherpad

Who:
 * Leads: Greg G, Chris M
 * Virtual team: Greg G, Chris M, Antoine, Sam, Chris S, Chad, Zeljko, Michelle G, Andre, Ariel
 * Other review participants (invited): Robla, Sumana, Quim, Maryana, James F, Ryan Lane, Ken, Terry, Tomasz, Alolita

Topics: Deploy process/pipeline, release process, bug fixing (the how of priority), code review, code management, security deploy/release, automation prioritization

Big picture
Release Engineering and QA are where our efforts in Platform can be amplified. When we do things well, we start to see more responsive development with higher quality code. That is our focus, all in an effort to pave the path to a more reliable continuous deployment environment. What we want to accomplish:
 * More appreciation of, response to, and creation of tests in development
 * Better monitoring and reporting out of our development and deployment processes, especially test environments and pre-deployment
 * Reduce time between code being merged and being deployed
 * Provide information about software quality in a way that informs release decisions
 * Help WMF Engineering learn and adapt from experience

Team roles
Many people outside of the virtual team play an important role in releases, but this review will focus on the work of the following people in the following roles:
 * Release engineering: Greg G, Sam, Chris S (security)
 * QA and Test Automation: Chris M, Zeljko, Michelle G*
 * Bug escalation: Andre, Greg G, Chris M, Chris S (security)
 * Beta cluster development/maintenance: Antoine, Ariel(?), Sam
 * Development tools (e.g. Gerrit, Jenkins): Antoine

Goals from last quarter
New priorities:
 * Align browser test coverage to high profile features.
 * ✅ Apply model of Language/Mobile embedded QA to a new feature team (specifically VisualEditor)
 * ✅ Include more user-contributed code testing (e.g. Gadgets)
 * Began, but stopping: Increase capacity through community training for browser tests
 * Browser tests reliably tracking features across WMF software development projects in beta cluster
 * Tests moved into the repositories of the code being tested
 * support for single-host extension e.g. Parsoid
 * Start comprehensive quarterly assessments of postmortems
   * but much more to do here. Possibly more next quarter (see below)
 * Improve our deployment process
   * automate as much as possible
   * improve monitoring
     * see DevOps Sprint 2013
   * improve tooling (e.g. atomic updates/rollbacks and cache invalidation)
     * see DevOps Sprint 2013
 * Take the Beta Cluster to the next level
   * monitoring of fatals, errors, performance
     * ✅ ganglia
     * still needed: graphite and icinga (Ops support needed)
   * add more automated tests, e.g. for the API
     * ✅ Beta Cluster support for Parsoid
     * implement API tests on top of VE+Parsoid
   * General improvements:
     * ✅ support for automatic deployment of git submodules e.g. VisualEditor
     * ✅ support for pre-release extensions with full i18n messages
     * feed experiences/gained knowledge of Beta Cluster automation up to production automation (ONGOING)
 * ✅ Successfully streamline the review and deployment of extensions (brought in from Q2)

Goals
Vis-à-vis the WMF Engineering 2013-14 goals.

Keep:
 * Successfully managed the first release of MediaWiki in conjunction with our outside contractor
 * Browser tests managed in feature repos with feature teams
 * More comprehensive quarterly assessments of postmortems

New Priority:
 * Automated API integration tests in important areas
   * UploadWizard
   * Parsoid
   * ResourceLoader
 * Expose fatal errors from both unit tests and browser tests to teams
 * Create process documentation for ideal test/deployment steps (e.g. ThoughtWorks exercise)

Deprioritize:
 * Successfully streamline the review and deployment of extensions (Done in Q1)
 * Manage build times, parallel execution, continuous execution of browser tests for optimum coverage (vague, ongoing goal)
 * Focus on community contributions and non-WMF

Dependencies
Ops dependency:
 * Provide true HTTPS support on the Beta Cluster
 * Icinga for Beta Cluster
 * More comprehensive quarterly assessments of postmortems

MW Core dependency:
 * Automation of the deployment process
 * Monitoring of deploys (performance)
 * Security patch management on cluster (especially after new wmfXX branches are cut)