Wikimedia Release Engineering Team/Checkin archive/20160815

= 2016-08-15 =

Vacations/Important dates
How to do it: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Time_off
 * July 25 - August 15: Željko vacation. Will have laptop with me. Reachable via phone.
 * July 30 - August 21: Antoine vacation. At home 1st week.
 * Sept 05: US Holiday (Labor day)
 * Sept 16: Q2 goals draft published
 * Sept 23: Q2 goals finalized
 * Oct 01: Start of Q2
 * October 10: US Holiday (Indigenous Peoples' Day)
 * October 17-21: Offsite in Washington D.C.
 * October 31: Mukunda maybe?
 * November 24: US Holiday (Thanksgiving)
 * January 9-11: Dev Summit
 * January 12-13: All Hands

Time spent spreadsheet

 * Week 32 - https://docs.google.com/spreadsheets/d/1IrwGPdTDZ6H8x9Mf5dmCYlkK4hZ8sbUSLODEM4cFc4g/edit#gid=1998906598

Rotating positions and absences
Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/u/blockers

weeks of Aug 8 and Aug 15

 * Train: Mukunda
 * wmf.14
 * wmf.15
 * SoS: Chad
 * https://phabricator.wikimedia.org/E155/19
 * https://phabricator.wikimedia.org/E155/20
 * Out:
 * Željko: July 25 - Aug 15
 * Antoine: July 30 - Aug 21

weeks of Aug 22 and Aug 29

 * Train: Antoine
 * wmf.16
 * wmf.17
 * SoS: Tyler
 * https://phabricator.wikimedia.org/E155/21
 * https://phabricator.wikimedia.org/E155/22
 * Out:

Actions from last meeting

 * TODO: Tyler - reply to Faidon's comment, keeping the focus on getting off of gallium for now; we don't really care which network
 * TODO: Chad - write up ideas on the LongLivedBranches task, then get Timo to review (https://phabricator.wikimedia.org/T140921)
 * TODO: Dan - check status of the gem update to fix the HTTPS issue

Scrum of Scrums

 * https://phabricator.wikimedia.org/project/board/64/
 * Blocked on us: https://phabricator.wikimedia.org/maniphest/query/h7YTCBTJsepS/#R

This week

 * Blocking
 * Blocked
 * Consolidate, remove, and/or downsize Beta Cluster instances to help with Purge_2016 - https://phabricator.wikimedia.org/T142288
 * specifically: -conftool (Ops?/Joe?), -conf3 (Analytics-ops?/Ottomata?), -kafka (Analytics-ops?), OCG (-pdf1 and -pdf2, but only 1 seems to be used?)
 * Updates
 * New SWAT window schedule starting Aug 22nd
 * See: https://wikitech.wikimedia.org/wiki/Deployments#Week_of_August_22nd
 * And: https://wikitech.wikimedia.org/wiki/SWAT_deploys#The_team
 * https://phabricator.wikimedia.org/T137970

Last week ... Ooops, didn't happen

 * Blocking
 * Blocked
 * Consolidate, remove, and/or downsize Beta Cluster instances to help with Purge_2016 - https://phabricator.wikimedia.org/T142288
 * specifically: -conftool (Ops?/Joe?), -conf3 (Analytics-ops?/Ottomata?), -kafka (Analytics-ops?), OCG (-pdf1 and -pdf2, but only 1 seems to be used?)
 * Updates
 * New SWAT window schedule starting Aug 22nd
 * See: https://wikitech.wikimedia.org/wiki/Deployments#Week_of_August_22nd
 * And: https://wikitech.wikimedia.org/wiki/SWAT_deploys#The_team
 * https://phabricator.wikimedia.org/T137970

Offsite

 * Rachel is still working on venue options; some good options so far
 * What do you want to talk about? Fill this out/vote on ideas:
 * https://etherpad.wikimedia.org/p/releng-offsite201610-proposedtopics

Replace primary production Continuous Integration host

 * NEXT: https://phabricator.wikimedia.org/T139771 - "Identify metric (or metrics) that gives a useful indication of user-perceived (Wikimedia developer) service of CI"

Reduce Technical Debt
Perform a technical debt analysis of software and services maintained by WMF Release Engineering


 * Original mega sheet: https://docs.google.com/spreadsheets/d/1Kxj9p4fKVNo2h23yAQVoOGg77dZ4FLxeXuYrH-1CrPA/edit#gid=0
 * Already tracks specific 'things' that need to be addressed
 * Redux: https://docs.google.com/spreadsheets/d/1Ncbgbg-ZPSSScOaGswQSJRtreuJOlizRCFln4KyfMWI/edit#gid=0
 * Simply severity + importance
 * Redux Redux: https://docs.google.com/spreadsheets/d/1btVdLuV59GZkQax8Hk0jkWDeyF5O_M5HkylWL0WxHxo/edit#gid=0
 * Just severity


 * Actions:
 * Fill out Redux Redux (THIS WEEK - week of Aug 1)
 * Based on Redux Redux identify the one thing to focus on (NEXT WEEK - week of Aug 8)

Hot spots:
 * MW 3rd party release, l10nupdate (and probably SWAT) tooling
 * Nodepool

Streamline deployments (long-lived branches)
Key result project view: https://phabricator.wikimedia.org/project/view/2117/
 * Convert our production deployment strategy to use long-lived branches


 * added scap plugin functionality
 * can create special tooling, e.g. 'scap swat'
 * unresolved:
 * caching stuff (see also Chad's TODO)
 * build tooling to replace make-wmf-branch
 * documentation

SWAT deploy changes

 * European SWAT deploys next steps
 * NEXT: Next week :)

CI Scaling/Nodepool

 * Outage :( :( :( https://etherpad.wikimedia.org/p/ci-nodepool-outage-20160810

Browser tests

 * https://phabricator.wikimedia.org/T142600 - Gergo is asking for help figuring out next steps
 * TODO: Dan help out

Beta Cluster

 * "deployment-fluorine becomes unresponsive frequently" - https://phabricator.wikimedia.org/T140313
 * Greg added dzahn to the patch

DB Inconsistencies

 * Figure out how to help Jaime with the DB schema inconsistencies issue:
 * https://phabricator.wikimedia.org/T132416 and https://phabricator.wikimedia.org/T104459 (see also: https://www.mediawiki.org/wiki/Development_policy#Database_patches )
 * Mostly: what can we do in CI to help prevent these?
 * Question: Does Chad need any help (other than time)?

Last week

 * Vacation

This week

 * Vacation

Last week

 * DB inconsistencies ...
 * Long lived branches
 * CI outage incident report

Last week

 * Start poking at MW-Vagrant jessie base image https://phabricator.wikimedia.org/T136429
 * Migrate deployment-prep to jessie https://phabricator.wikimedia.org/T138778
 * Follow up on mw-selenium/browser tests dependency updates re https://phabricator.wikimedia.org/T129483

This week

 * Start poking at MW-Vagrant jessie base image https://phabricator.wikimedia.org/T136429
 * Migrate deployment-prep to jessie https://phabricator.wikimedia.org/T138778

Last week

 * Train
 * LLB
 * Mukunda to share branch merging prototype code and solicit feedback from the team
 * Ended up building a scap plugin framework that allows scap CLI tools to be loaded from alternative locations; this will be the basis for two new tools:
 * `scap swat` https://phabricator.wikimedia.org/T142880
 * `scap merge` https://phabricator.wikimedia.org/T140918
 * Maybe start playing with the Jenkins API for the SWAT tool
 * https://gerrit.wikimedia.org/r/#/c/304604/

This week

 * MediaWiki train: 1.28.0-wmf.15
 * LLB
 * Continue working on the `scap swat` tool; more experimentation with the Gerrit API
 * Deploy release-tools repo with scap3 (if time allows)

Last week

 * Scap update
 * train
 * analytics/refinery move

This week

 * Bugfix scap update
 * Parsoid rollout via scap3
 * Try to stay on top of gallium things
 * Incident report for CI outage!

Last week

 * Vacation