Wikimedia Release Engineering Team/Checkin archive/20190408

= 2019-04-08 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * April 9-12: Greg at tech-mgt F2F in Portland
 * April 11: Dan out
 * April 17-19 (Wednesday - Friday) - Željko vacation
 * April 18-19 (Thursday, Friday) - Lars on vacation in Chicago
 * April 22 (WMF Holiday) - US Staff
 * April 22-27: Team offsite in Chicago
 * April 29: Moved WMF Holiday for US staff at offsite
 * May 1st - Lars, Antoine and Željko, Labor Day / May Day
 * May 8th - Antoine, 1945 victory
 * May 15 (Wednesday) - Željko vacation
 * May 16-20 - Wikimedia Hackathon 2019 (Prague, Czechia)
 * Attending: Greg, JR, Zeljko, James, and Jeena
 * May 30th-31st - Antoine, Feast of the Ascension
 * June 10th - Antoine, Pentecost -- see https://en.wikipedia.org/wiki/Eastertide for Antoine/France Easter holidays
 * May 27 (Memorial Day) - US Staff
 * June 6-7 - Brennen, Apogaea
 * June 19 (Juneteenth) - US Staff
 * July 22 - August 9 - Željko vacation
 * August 25 - September 4 - Brennen vacation

Train

 * Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R


 * Jan 07 - wmf.12 - Dan
 * Jan 14 - wmf.13 - Dan
 * Jan 21 - wmf.14 - Mukunda
 * Jan 28 - wmf.15 - No Train (All Hands)
 * Feb 04 - wmf.16 - Mukunda
 * Feb 11 - wmf.17 - Tyler
 * Feb 18 - wmf.18 - Tyler
 * Feb 25 - wmf.19 - Antoine
 * Mar 04 - wmf.20 - Antoine
 * Mar 11 - wmf.21 - Zeljko 🐌
 * Mar 18 - wmf.22 - Zeljko 💣
 * Mar 25 - wmf.23 - Dan
 * Apr 01 - wmf.24 - Dan [train not finished yet]
 * Apr 08 - wmf.25 - Mukunda
 * Apr 15 - 1.34.0-wmf.1 - Mukunda
 * Apr 22 - wmf.2 - NO TRAIN, team offsite
 * Apr 29 - wmf.3 - Tyler
 * May 06 - wmf.4 - Tyler
 * May 13 - wmf.5 - Antoine
 * May 20 - wmf.6 - Antoine
 * May 27 - wmf.7 - Zeljko
 * June 03 - wmf.8 - Zeljko

SoS

 * Zeljko 4eva! :)

Timespent spreadsheet

 * For the avoidance of doubt: fill out the sheet week number for the previous week


 * W16 https://docs.google.com/spreadsheets/d/1urCLNQXeEi1DOR8Iu0qW0yPt-glxX1laqlMovbGyCW0/edit#gid=0
 * James: Should I be doing this now? (I don't have access.)
 * Greg: Yes, will deal with this later.
 * TODO: Greg give James access
 * TODO: Greg clarify distinction between "maintenance" and tec1
 * TODO: CI/CD book, education/prof dev column? for now "Other"

Book club

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club
 * Notes: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club/Continuous_Delivery
 * Next:
 * At the team offsite
 * Up through Chapter 9

Spring Offsite

 * Location: Chicago, IL (Central timezone, UTC-5 while we're there)
 * Dates: Arrive Monday 4/22, Depart Saturday 4/27.
 * Activity day
 * Museum of Science and Industry on Friday
 * Cubs game Tuesday night
 * Program:
 * Forming....
 * Come prepared to discuss team mission and scope
 * Current priority of topics based on the etherpad votes:
 * 1) Future of WMF CI:
 * 1a) what tooling do we commit to for the next phase, processes of using CI/CD, implementation plan for new tooling/versions
 * 1b) Discussion of rubric (see mail - [RelEng] CI evaluation, phase 2: criteria)
 * 1c) Showcase integration/pipelinelib Pipeline Builder and how it could enable self-serve CI
 * 2) Continuation/"conclusion" of team scope/mission
 * 3) Future of the Beta Cluster
 * 3a) Things we said during annual plan discussions: https://etherpad.wikimedia.org/p/betaclusterwhat
 * 4) Discussion of Prodlike and how to get there
 * 5) How do we organise and track our own work? (Greg)
 * 6) Maintenance of documentation
 * 7) PGP training and keysigning (liw) see https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Onboarding/GPG
 * 8) logspam cleanup epic (follow-up from the book club discussion on 3/21)
 * 9) Book club discussion - Up through chapter 9
 * 10) Everybody does deployments (p271 Every member of the team should know how to deploy)

Skill matrix redux

 * cf: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Skill_matrix


 * I plan to have you update it next week (the week before the offsite).
 * Should we add people outside the team who have significant skills in our matrix? / bus factor individuals.
 * Yeah, we should note it somehow.
 * Can we transpose the table now that it's so wide? +1

Here is the current table, please add/strike-through/leave comments for how to improve it/make it relevant to your work today.


 * Developer Tools Support
 * MediaWiki-Vagrant
 * Elastic-search
 * Gerrit maint
 * Phabricator maint
 * Maintenance of misc. tools like Docker image list, misc. monitoring stuff, etc.?
 * Continuous Integration Infra
 * Jenkins maint
 * Zuul maint
 * Nodepool maint
 * CI config / JJB
 * docker-pkg
 * Quibble
 * Testing Tooling and Education
 * Unit test maint tooling
 * Integration test maint tooling
 * Acceptance test tooling
 * MW-Selenium (Ruby) deprecated in 2017 https://phabricator.wikimedia.org/J79
 * Selenium (NodeJS)
 * Integration Test Environments
 * Beta Cluster
 * Deploying software
 * Deploying new MW branches/The Train
 * backports & SWAT deploys
 * Developing scap
 * Debugging and/or Reporting log errors
 * Deployment Pipeline
 * k8s
 * minikube
 * blubber
 * pipelinelib
 * local-charts
 * MediaWiki Releases
 * Doing major releases
 * Doing point releases
 * Doing security releases

Monthly reflection on accomplishments - April '19 edition

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Monthly_notable_accomplishments
 * Add as you have them!


 * Phabricator vandalism rollback tool completed 🎉 (blog post? 😉)

Annual Planning

 * https://etherpad.wikimedia.org/p/releng-fy1920ap-tec1
 * https://etherpad.wikimedia.org/p/releng-fy1920ap-tec3
 * https://etherpad.wikimedia.org/p/releng-fy1920ap-tec12
 * https://etherpad.wikimedia.org/p/releng-fy1920ap-tec13
 * https://etherpad.wikimedia.org/p/releng-fy1920ap-new


 * Nothing new right now...
 * I'm talking with Mark tomorrow morning (he won't be in Portland, sadly)
 * apparently he's coming now, I'll talk to him there :)

[Task] Add Scribunto to extension-gate in CI

 * https://phabricator.wikimedia.org/T125050
 * https://gerrit.wikimedia.org/r/#/c/integration/config/+/497574/
 * calling into question time spent on unit tests in pre-merge tests.
 * yes to having better guidelines

task-series scap plugin broken

 * Friendly reminder to Mukunda :) Just need it by end of week
 * https://phabricator.wikimedia.org/T219192

Incoming from last week

 * Blocking:

Release Engineering

 * Blocked by:
 * Blocking:
 * Updates:
 * Train Health
 * Last week: 1.33.0-wmf.24 - https://phabricator.wikimedia.org/T206678
 * This week: 1.33.0-wmf.25 - https://phabricator.wikimedia.org/T206679
 * Next week: 1.34.0-wmf.1 - [NEEDS TASK]
 * Code Health
 * Log Health

Callouts

 * Release Engineering

Train status and happenings

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor


 * Blocked :/


 * RefreshLinks, last action:
 * [2019-04-05T16:02:11Z]  Synchronized php-1.33.0-wmf.24/includes/jobqueue/jobs/RefreshLinksJob.php: Ib1ac31365f9c / T220037 (duration: 00m 59s)


 * Need to fix scap clean :\
 * thcipriani has a crappy fix in mind until http tokens in gerrit are back
 * Any idea when HTTP tokens will come back? Weeks? Months? Never? :-(
 * When security lets us, hopefully Soon™. I'm pushing for it :\

Quarterly Goals for Q4
https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2018-19_Q4

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Undeploy the CodeReview extension.
 * WHO: James, need help from CPT


 * I will ping CPT about this this week

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Set up 1-3 of the CI WG options (Zuul v3, Argo, GitLab)
 * WHO:


 * Focus on a couple noteworthy repos: e.g.,
 * core
 * extensions
 * ops/puppet
 * Maybe set up in serial, i.e., a week per evaluation


 * Questions:
 * RelEng/Extended working group?
 * At least in the WG eval it was good to have non-familiar people
 * But for setting up the options it might be beneficial to have people experienced with the current setup.
 * Folks outside the original working group to join in to set up options; people TBD
 * Do we need a rubric before we do this prototyping? (yes)
 * DONE lars to work on rubric week of 2019-04-01
 * See email 2019-04-08

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Instrument Quibble for data collection
 * WHO: Mukunda, Antoine


 * Still no progress / nowhere to store this data and other tasks taking priority

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Create a graph of where time is spent and make a prioritized list for improvements.
 * WHO: Mukunda, Antoine

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Prepare the Deployment Pipeline for changes to our CI tooling.
 * WHO: ???, ???


 * Blocked by not having new CI tooling yet

TEC3 (Pipeline): Outcome 3 / Output 3.1

 * GOAL: Create a .pipeline/config.yaml standard to give users more control over how their tests are run in the pipeline and allow the easy saving of artifacts at pipeline completion. (RelEng)
 * WHO: Dan, Tyler, ???


 * Dan has a patch up for pipelinelib https://gerrit.wikimedia.org/r/#/c/integration/pipelinelib/+/500134/
 * needs review/is set to WIP

TEC3 (Pipeline): Outcome 3 / Output 3.1

 * GOALS:
 * Adopt more services into Deployment pipeline -
 * Wikidata Termbox SSR, Kask for Session Storage Service, cpjobqueue (stretch), ORES (stretch)
 * WHO: Dan, Tyler, Lars

There are tasks: https://phabricator.wikimedia.org/T220403


 * changeprop


 * ORES
 * cf: Dan's comments


 * Wikidata Termbox SSR


 * Kask for Session Storage Service


 * cpjobqueue (stretch)

TEC12 (DevProd): Outcome 1 / Output 1.1

 * GOAL: Provide an "Official" Docker base image for local development of MediaWiki based on the production tooling.
 * WHO: Jeena, Brennen

TEC13 (Code Health): Outcome 1 / Outcome 3

 * GOALs: Presentation/session(s) at the Wikimedia Hackathon on the current state of Code Health projects (technical debt and code stewardship)
 * WHO: JR


 * no progress

TEC13 (Code Health): Outcome 1 / Output 1.1

 * GOAL:
 * Publish a re-imagination of the Review Queue process.
 * Develop and implement metrics around task and code-review responsiveness
 * WHO: Greg, JR (and Andre)


 * no progress

TEC13 (Code Health): Outcome 4 / Output 4.2

 * GOALs:
 * Expand SonarQube reporting into CI infrastructure
 * Perform SonarQube analysis on all extensions
 * Engage user communities in direct feedback solicitation
 * WHO: JR, Zeljko, Code Health Metrics


 * new CI patches submitted. Going to be moving off experimental.
 * merged a patch Friday for polling results/printing results to stdout
 * Perform SonarQube analysis on all extensions - done https://sonarcloud.io/organizations/wmftest

Selenium

 * T217544 selenium-daily-beta-MediaWiki fails due to QuickSurveys inserting HTML in the content
 * T220035 Drop Ruby Selenium CI jobs; we don't support them any more
 * T174018 [EPIC] Port Minerva's browser tests to Selenium Node.js
 * T219815 Create integration tests to cover potential issues with editing and uploading on Commons

Gerrit

 * Deploy barricade tomorrow
 * Revert tool work by EOW, hopefully
 * Discussion of recent threads/crashing: https://groups.google.com/forum/#!topic/repo-discuss/pBMh09-XJsw

Phabricator

 * vandalism rollback tool
 * Started working on some other phab security hardening

QA/Code Health

 * Daniel met with Code Health Metric WG to discuss his work around Cycle dependency.
 * Setting up meeting with Corey and Marcella to discuss next steps/annual planning re software maintenance.

SCAP

 * Need to fix scap clean for ssh
 * for the time being I will comment it out until there is a fix for: https://phabricator.wikimedia.org/T218750

Antoine

 * What I plan to do this week
 * quibble and zuul upgrade
 * upgraded zuul 5-6 hours ago
 * should unblock gerrit 2.16
 * Friday Gerrit outage (cf: https://groups.google.com/forum/#!topic/repo-discuss/pBMh09-XJsw )
 * What I'm blocked on
 * Other?

Brennen

 * What I plan to do this week
 * Get first pass at releng/dev-images to a reviewable state
 * Got docker-pkg / CI image notes from Antoine on Friday, acting on that
 * Push from last week: Follow up with Eric Gardner re: docs
 * What I'm blocked on
 * Other?

Dan

 * What I plan to do this week
 * Finish up 1.33.0-wmf.24 train
 * Fix up an outstanding issue with the pipelinelib Pipeline Builder feature
 * https://gerrit.wikimedia.org/r/c/integration/pipelinelib/+/499918
 * It's a big change, so: refactor change into separate commits and write some nice commit messages
 * Write up an email to Analytics (MUST do)
 * What I'm blocked on
 * Other?

Greg

 * What I plan to do this week
 * tech-mgt F2F Tue-Fri, slow response
 * Quality Team follow-up
 * Gerrit incident meeting (today) follow-up
 * Annual Budget process kickoff/walk through meeting today
 * Talking with Deb today about Offsite agenda and support
 * Quarterly goal checkin meeting today (tyler attending?)
 * What I'm blocked on
 * Other?

James

 * What I plan to do this week
 * I owe Jeena some Mac testing of local-charts
 * CodeReview status wrangling from CPT
 * What I'm blocked on
 * Other?

Jean-Rene

 * What I plan to do this week
 * reach out/start Code Review workgroup
 * Continue work on Test Strategy
 * What I'm blocked on
 * Other?

Jeena

 * What I plan to do this week
 * Address comments on mac mw install script/readme for local-charts
 * Work on volume mounting script in local-charts
 * Work on using Xdebug in the local-charts env
 * What I'm blocked on
 * Other?

Lars

 * What I plan to do this week
 * read Go book
 * discuss rubric for CI evaluation
 * attempt to get Minikube working in a VM
 * What I'm blocked on
 * Other?
 * six months at WMF today

Mukunda

 * What I plan to do this week
 * Train
 * Phabricating phabricator phabulously
 * Train task generation is broken
 * Phabricator global search is partially broken
 * Phabricator calendar is broken
 * What I'm blocked on
 * Time
 * Other?

Tyler

 * What I plan to do this week
 * Gerrit revert tool
 * Gerrit plugin deploys
 * Gerrit explosion monitoring
 * What I'm blocked on
 * Other?
 * Quick/dirty fix for scap clean

Zeljko

 * What I plan to do this week
 * T217544 selenium-daily-beta-MediaWiki fails due to QuickSurveys inserting HTML in the content
 * T220035 Drop Ruby Selenium CI jobs; we don't support them any more
 * T174018 [EPIC] Port Minerva's browser tests to Selenium Node.js
 * T219815 Create integration tests to cover potential issues with editing and uploading on Commons
 * What I'm blocked on
 * Other?

Team Kanban Board Review and Triage

 * Closed and touched in the last 7 days
 * No update for 4 weeks
 * No update for 3 weeks
 * No update for 2 weeks
 * No update for 1 week
 * All Open
 * Review To Triage column of #releng
 * Assigned
 * Unassigned

Once / month-ish review of backlog(s)

 * releng Review To Triage column of #releng
 * releng-kanban Review unassigned in kanban
 * releng-kanban Review 'backlog' column of -kanban
 * releng-next - Review for things we need to put on our kanban backlog
 * releng-backlog - oh my, the huge backlog of things...

Kanban stats

 * Burnup chart