Wikimedia Release Engineering Team/Checkin archive/20160829

= 2016-08-29 =

Vacations/Important dates
How to do it: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Time_off
 * Sept 02: Q2 goals draft published, Dan out
 * Sept 05: US Holiday (Labor Day)
 * Sept 23: Q2 goals finalized
 * Oct 01: Start of Q2
 * October 10: US Holiday (Indigenous People's Day)
 * October 17-21: Offsite in Washington D.C.
 * October 31: Mukunda
 * October 28 - Nov 2 (ish) - Chad
 * November 24: US Holiday (Thanksgiving)
 * January 9-11: Dev Summit
 * January 12-13: All Hands

Time spent spreadsheet

 * Week 34 - https://docs.google.com/spreadsheets/d/1IrwGPdTDZ6H8x9Mf5dmCYlkK4hZ8sbUSLODEM4cFc4g/edit#gid=385336525

Rotating positions and absences
Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/u/blockers

weeks of Aug 22 and Aug 29

 * Train: Antoine
 * wmf.16
 * wmf.17
 * SoS: Tyler
 * https://phabricator.wikimedia.org/E155/21
 * https://phabricator.wikimedia.org/E155/22
 * Out:

weeks of Sep 05 and Sep 12

 * Train: Chad
 * wmf.18
 * wmf.19
 * SoS: Mukunda
 * https://phabricator.wikimedia.org/E155/23
 * https://phabricator.wikimedia.org/E155/24
 * Out:
 * Sept 05 (Monday): US Holiday (Labor Day)

Actions from last meeting

 * ✅: Chad - lay out ideation on the LongLivedBranches task to then get Timo to review ( https://phabricator.wikimedia.org/T140921 )

Scrum of Scrums

 * https://phabricator.wikimedia.org/project/board/64/
 * Blocked on us: https://phabricator.wikimedia.org/maniphest/query/h7YTCBTJsepS/#R

This week

 * Blocking
 * Blocked
 * https://gerrit.wikimedia.org/r/#/c/300092/ ("contint: tidy Nodepool slaves config history")
 * Help requested: Upgrade base MW-Vagrant image to Jessie - https://phabricator.wikimedia.org/T136429
 * Outline from bd808: https://phabricator.wikimedia.org/T136429#2572195
 * Ori suggesting Ops support: https://phabricator.wikimedia.org/T136429#2572433
 * Updates

Last week

 * Blocking
 * Parsoid scap3 deployment: https://phabricator.wikimedia.org/T142990
 * Blocked
 * (Ops) Contint networking: https://phabricator.wikimedia.org/T140257#2553490
 * (Ops/puppet) https://gerrit.wikimedia.org/r/#/c/300092/ ("contint: tidy Nodepool slaves config history")
 * (Ops) Help requested: Upgrade base MW-Vagrant image to Jessie - https://phabricator.wikimedia.org/T136429
 * Outline from bd808: https://phabricator.wikimedia.org/T136429#2572195
 * Ori suggesting Ops support: https://phabricator.wikimedia.org/T136429#2572433
 * Updates
 * Bugfix release of scap
 * European SWAT started Monday!

Offsite

 * what do you want to talk about? Fill this out/vote on ideas:
 * https://etherpad.wikimedia.org/p/releng-offsite201610-proposedtopics
 * meeting with her at 11am Pacific to talk travel arrangements

"Upgrade all mw* servers to debian jessie"

 * https://phabricator.wikimedia.org/T143536
 * What should we make sure happens here?
 * Greg to find and comment on the Beta Cluster one

Q2 (Oct - Dec) Goals

 * Greg updated https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Goals/201617Q2 since last meeting

Previously listed goals

 * Differential
 * fix Jenkins tests, maybe
 * migrate android
 * not a goal


 * Malu
 * pause


 * LLB + MW + Extension deploys to scap3 ?
 * not a goal

New goal proposals

 * Python software deployment via scap3 (Zuul + Nodepool)
 * think more on it (Tyler and Antoine), not a goal for now


 * CI Tech Debt
 * Determine long term plan for Nodepool
 * This needs to be more specific
 * Anything else?
 * think about how we use queues
 * split queue per branch (eg: security releases hitting multiple branches, 600 jobs), so more jobs can run in parallel
 * SWAT deploys: make the wmf branch go through as fast as it can
 * "Review and adjust CI queues for more parallel operations"
 * MW deploy tech debt (Experiment/Stretch)
 * scap swat
 * ability to have multiple checks on MW deploys (in addition to logstash), eg swagger spec for MW (node endpoints checking)
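
The "split queue per branch" idea in the CI Tech Debt notes above can be illustrated with a minimal sketch. This is not Zuul's actual scheduler or configuration; the function name `split_queue_by_branch` and the change dictionaries are hypothetical, chosen only to show how one shared queue could be partitioned so that each branch is processed independently.

```python
from collections import defaultdict


def split_queue_by_branch(changes):
    """Partition one shared queue into per-branch queues.

    `changes` is a list of dicts with a "branch" key (hypothetical shape).
    Order within each branch is preserved, so changes on the same branch
    still run in sequence while different branches could run in parallel.
    """
    queues = defaultdict(list)
    for change in changes:
        queues[change["branch"]].append(change)
    return dict(queues)


# A security release touching several branches lands in one shared queue.
shared_queue = [
    {"id": 1, "branch": "master"},
    {"id": 2, "branch": "wmf.17"},
    {"id": 3, "branch": "master"},
    {"id": 4, "branch": "REL1_27"},
]
per_branch = split_queue_by_branch(shared_queue)
```

With a single queue the four changes are serialized; after the split, only the two `master` changes depend on each other.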

Replace primary production Continuous Integration host -

 * NEXT: https://phabricator.wikimedia.org/T139771 - "Identify metric (or metrics) that gives a useful indication of user-perceived (Wikimedia developer) service of CI"

Upgrade Beta Cluster database servers to Maria10/Jessie -

 * Chad will review Dan's patches
 * Dan will coord with Jaime for $whenever_works_for_them

Reduce Technical Debt
Perform a technical debt analysis of software and services maintained by WMF Release Engineering -

Next steps:
 * Greg to get the documentation written up and call it done (for this goal for this quarter)

Streamline deployments (long-lived branches)
keyresult task: project view: https://phabricator.wikimedia.org/project/view/2117/
 * Convert our production deployment strategy to use long-lived branches -


 * Tooling will probably be done
 * static asset conclusion might not be
 * scap swat coming along nicely
 * use Gerrit REST API (has needed features not available over SSH)
 * will need some sort of shared account (with frequent credential rotation, potentially each deploy)
 * can use a .netrc right now
 * scraping Deployment calendar page is crappy
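
As a rough illustration of the Gerrit REST API point above, here is a minimal sketch of two details any client has to handle: Gerrit prefixes JSON responses with `)]}'` to prevent cross-site script inclusion, and authenticated requests go under the `/a/` path prefix. The helper names and the example host are hypothetical, and this is not how `scap swat` is actually implemented. Credentials could come from a `.netrc` file, which HTTP libraries such as `requests` read by default.

```python
import json

# Gerrit prepends this line to JSON responses to prevent XSSI attacks.
GERRIT_XSSI_PREFIX = ")]}'"


def parse_gerrit_json(body):
    """Strip Gerrit's anti-XSSI prefix (if present) and decode the JSON."""
    if body.startswith(GERRIT_XSSI_PREFIX):
        body = body[len(GERRIT_XSSI_PREFIX):]
    return json.loads(body)


def authenticated_url(base_url, endpoint):
    """Build a URL under /a/, the prefix Gerrit uses for authenticated calls."""
    return base_url.rstrip("/") + "/a/" + endpoint.lstrip("/")


# Hypothetical host and response body, for illustration only.
url = authenticated_url("https://gerrit.example.org/r/", "/changes/300092")
change = parse_gerrit_json(")]}'\n{\"status\": \"MERGED\"}")
```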

SWAT deploy changes

 * European SWAT deploys (
 * Future changes?
 * requiring a task associated with each change being pushed out?
 * Add all swatters to each swat window, stop segmenting based on their availability (worst case they get a ping when they're not online)

Beta Cluster

 * Long lived cherry pick stuff popping up again
 * Antoine got one out thanks to Brandon

DB Inconsistencies
https://phabricator.wikimedia.org/T132416 and https://phabricator.wikimedia.org/T104459 (see also: https://www.mediawiki.org/wiki/Development_policy#Database_patches )

Last week

 * Catch up on Nodepool incident - DONE
 * Migrate jobs back to Nodepool instance - Week of Aug 29
 * Ideally get quota raised
 * Figure out contint1001 network with ops / Tyler
 * done: clear out 3 weeks worth of mails
 * personal: learn how to play https://www.youtube.com/watch?v=d9i_zXmULyk
 * pet project: rake / rspec on puppet.git and tox for operations/software.git

This week

 * Migrate jobs back to Nodepool instance. Chase to monitor wmflabs as we progressively switch back. Starting on Tuesday Aug 30th
 * Figure out contint1001 network with ops / Tyler
 * Haven't pushed for it. Faidon is on vacation this week. -- solved with Mark --> public IP
 * personal: working on Ukulele major chords. C, D, F, G done. Todo: A B E. Probably gonna buy a guitar.
 * Branch cut / train deploy - done

Last week

 * MW release today (finally)
 * Finally going to do DB consistency script -- per our 1:1 this shouldn't be so hard
 * Long lived branches (long may they live)

This week

 * Diving into the DB consistency script. Doable, but hard :)
 * More long lived branches

Last week

 * Migrate deployment-prep to jessie https://phabricator.wikimedia.org/T138778
 * Scap group_size funzies https://phabricator.wikimedia.org/T142990

This week

 * Start poking at MW-Vagrant jessie base image https://phabricator.wikimedia.org/T136429
 * Migrate deployment-prep to jessie https://phabricator.wikimedia.org/T138778

Last week

 * Finish the `scap swat` tool which is taking shape nicely.
 * Propose improvements to the scap remote execution API to make it easy to use from scap plugins
 * This could facilitate development of arbitrary scap checks which can be run separately from deployments
 * Will discuss with Tyler during the deployments meeting and go from there.

This week

 * Still finishing up scap swat
 * Hopefully, resolve the screen scraping debate
 * Start on `scap merge` branch management tooling

Last week

 * Bugfix scap update
 * nodepool things

Last week

 * https://phabricator.wikimedia.org/T139613 Run language screenshots script for VisualEditor in Jenkins
 * https://phabricator.wikimedia.org/T139247 Analyze (and share analysis of) the browser testing feedback survey

This week

 * https://phabricator.wikimedia.org/T139613 Run language screenshots script for VisualEditor in Jenkins
 * https://phabricator.wikimedia.org/T139247 Analyze (and share analysis of) the browser testing feedback survey
 * Review existing documentation on browser testing