Wikimedia Release Engineering Team/Checkin archive/20160725

= 2016-07-25 =

Vacations/Important dates
How to do it: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Time_off ...
 * July 25 - August 15: Željko vacation. Will have laptop with me. Reachable via phone.
 * July 30 - August 21: Antoine vacation. At home 1st week.
 * August 1st - 5th: Mukunda - vacation: Concert & relaxation
 * January 9-11: Dev Summit
 * January 12-13: All Hands

Rotating positions and absences
Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/u/blockers

weeks of July 25 and Aug 1

 * Train: Tyler
 * wmf.12
 * wmf.13
 * SoS: Mukunda / Tyler
 * https://phabricator.wikimedia.org/E155/17 - Mukunda
 * https://phabricator.wikimedia.org/E155/18 - Tyler
 * Out:
 * Zeljko: July 25 - Aug 15
 * Antoine: July 30 - Aug 21
 * Mukunda: Aug 1-5

weeks of Aug 8 and Aug 15

 * Train: Mukunda
 * wmf.14
 * wmf.15
 * SoS: Chad
 * https://phabricator.wikimedia.org/E155/19
 * https://phabricator.wikimedia.org/E155/20
 * Out:
 * Zeljko: July 25 - Aug 15
 * Antoine: July 30 - Aug 21

Time spent spreadsheet

 * Week 29 - https://docs.google.com/spreadsheets/d/1IrwGPdTDZ6H8x9Mf5dmCYlkK4hZ8sbUSLODEM4cFc4g/edit#gid=1206189115

Actions from last meeting

 * TODO: file task re upgrading MW-Vagrant guests to Jessie
 * done by bryan :)
 * https://phabricator.wikimedia.org/T136429

Scrum of Scrums

 * https://phabricator.wikimedia.org/project/board/64/
 * Blocked on us: https://phabricator.wikimedia.org/maniphest/query/h7YTCBTJsepS/#R

This week

 * Blocking
 * Blocked
 * Updates
 * Labs update (with CI disruption)
 * Proposed 2016-08-02. What time is best?
 * Andrew still wondering about https://phabricator.wikimedia.org/T139771
 * If everything is fine w/CI response then it's all good :)

Last week
 * Blocking
 * Android to differential
 * Blocked
 * None
 * Updates
 * Zuul upgraded this week, should address a bunch of issues
 * New SWAT deploy process going ok, reminder to install https://wikitech.wikimedia.org/wiki/X-Wikimedia-Debug if you're putting things up for SWAT

Project tech leads

 * https://phabricator.wikimedia.org/T139540#2485589
 * tl;dr: how about we just add an explicit "Lead" for each quarterly goal?
 * eg: https://www.mediawiki.org/w/index.php?title=Wikimedia_Release_Engineering_Team%2FGoals%2F201617Q1&type=revision&diff=2196830&oldid=2181515

Offsite

 * agenda/purpose :)
 * https://phabricator.wikimedia.org/T138437
 * Will be meeting with Kristen Lans from TPG re potential TPG support in 2 hours :)
 * Last offsite writeup with lessons learned: https://docs.google.com/document/d/17C6x_Sys21DcEZ_HxgLA7FkCYeiTUAzZX1XbEcZTNfw/edit#

Replace primary production Continuous Integration host

 * NEXT: https://phabricator.wikimedia.org/T139771 - "Identify metric (or metrics) that gives a useful indication of user-perceived (Wikimedia developer) service of CI"
 * Tyler and Hashar reply to Faidon's comment, keeping focused on getting off of gallium for now

Upgrade Beta Cluster database servers to Maria10/Jessie

 * Waiting on Jaime to prioritize
 * Priority is "this quarter" (not "this month" or "next week")
 * up to us to schedule, should be no more than an hour for Jaime
 * NEXT: Needs an owner
 * DAN!

Reduce Technical Debt
Perform a technical debt analysis of software and services maintained by WMF Release Engineering


 * Original mega sheet: https://docs.google.com/spreadsheets/d/1Kxj9p4fKVNo2h23yAQVoOGg77dZ4FLxeXuYrH-1CrPA/edit#gid=0
 * Already tracks specific 'things' that need to be addressed
 * Redux: https://docs.google.com/spreadsheets/d/1Ncbgbg-ZPSSScOaGswQSJRtreuJOlizRCFln4KyfMWI/edit#gid=0
 * Simply severity+importance.
 * Redux Redux: https://docs.google.com/spreadsheets/d/1btVdLuV59GZkQax8Hk0jkWDeyF5O_M5HkylWL0WxHxo/edit#gid=0
 * Just severity


 * Next steps?
 * Fill out Redux Redux
 * Based on Redux Redux identify the one thing to focus on
 * then plan accordingly in Phabricator

Streamline deployments (long-lived branches)
keyresult task: project view: https://phabricator.wikimedia.org/project/view/2117/
 * Convert our production deployment strategy to use long-lived branches

SWAT deploy changes

 * European SWAT deploys next steps
 * NEXT: stalled pending finding people to do the SWAT window while Antoine and Zeljko are on vacation
 * Week of 20th August: let's gogogo

CI Scaling/Nodepool

 * Wait time for Nodepool instances https://grafana.wikimedia.org/dashboard/db/releng-kpis
 * Zuul repackaged with latest upstream. Will upgrade the whole fleet early this week.
 * debian-glue job enhanced
 * TODO: Zuul packaging tutorial
 * TODO: Android job. Move to Jessie; hacked together over the weekend, has to be polished https://phabricator.wikimedia.org/T139137
 * MySQL on CI slaves either shuts down randomly or does not start on boot :(

Differential migration
Differential weekly (https://etherpad.wikimedia.org/p/diffuerential-weekly ) TODOs:


 * Mukunda had questions for Antoine re Puppet (do keys for the CI image builder go into the private store, production or other?)
 * see: https://cloudbees.zendesk.com/hc/en-us/articles/203802500-Injecting-Secrets-into-Jenkins-Build-Jobs


 * Update documentation on creating/renaming of repos in Diffusion
 * https://phabricator.wikimedia.org/T139688


 * Update task with discussion about ACLs?
 * https://phabricator.wikimedia.org/T130786


 * Announce plan to migrate MW-Vagrant to Differential
 * https://phabricator.wikimedia.org/T131419#2439362
 * outstanding patches should be either merged, abandoned or migrated to differential revisions.

Beta Cluster

 * "deployment-fluorine becomes unresponsive frequently" - https://phabricator.wikimedia.org/T140313
 * TODO: Submit patch ( https://gerrit.wikimedia.org/r/#/c/299672/ ) for PuppetSWAT?

Other

 * Figure out how to help Jaime with the DB schema inconsistencies issue:
 * https://phabricator.wikimedia.org/T132416 and https://phabricator.wikimedia.org/T104459 (see also: https://www.mediawiki.org/wiki/Development_policy#Database_patches )
 * What can we do in CI to help prevent, mostly?
 * Chad will lick this cookie :)

Scap now queries Logstash for the canaries:
 * Email to review: https://etherpad.wikimedia.org/p/scap-announce-2016-07-25

Last week

 * Gerrit upgrade / Zuul upgrade
 * Target host to replace gallium
 * Sync up with Tyler for CI / gallium phase out
 * Moaar maintenance
 * Offsite site/date

This week

 * Zuul upgrade to latest upstream
 * Zuul packaging doc
 * Vacations backup plan

Last week

 * Moar Gerrit. Train. Choo choo.

This week

 * Gerrit. Remove precise remnants from puppet, tune cache stuff, CSS tweaks for crap UI. Triaging old bugs to see which are fixed / invalid / still fixable.
 * DB consistencies thingie for Jaime. I owe him one.

Last week

 * Getting back

This week

 * Start poking at MW-Vagrant jessie base image https://phabricator.wikimedia.org/T136429
 * Figure out where we're at with Malu

Last week

 * Get the merge-wmf-branch script cleaned up and shared with the team for feedback
 * Brainstorm improvements / other ideas around branch merging / cherry-picking

This week

 * T141278: Decide how ReleaseTaggerBot fits into the brave new world of long-lived-branches https://phabricator.wikimedia.org/T141278

Last week

 * MW Canary work

Last week

 * trying to do the first SWAT (depending on https://phabricator.wikimedia.org/T140264 MediaWiki deployment shell access request for zfilipin)
 * Analyze (and share analysis of) the browser testing feedback survey https://phabricator.wikimedia.org/T139247
 * Run language screenshots script for VisualEditor in Jenkins https://phabricator.wikimedia.org/T139613

This week
Vacation