Wikimedia Release Engineering Team/Checkin archive/20190220

From mediawiki.org


2019-02-20

Vacations/Important dates

https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
How to do it
  • February 19 - March 1 - Dan, vacation
  • March 11 (WMF Holiday) - US Staff
  • April 22 (WMF Holiday) - US Staff
  • April 22-27: Team offsite in Chicago
  • April 22nd - Antoine, Easter - we're flying to Chicago?
  • May 1st - Antoine and Željko, Labor Day / May Day
  • May 8th - Antoine, 1945 victory
  • May 17-19 - Wikimedia Hackathon 2019 (Prague, Czechia)
  • May 30th-31st - Antoine, Feast of the Ascension
  • June 10th - Antoine, Pentecost -- see https://en.wikipedia.org/wiki/Eastertide for Antoine/France Easter holidays
  • May 27 (Memorial Day) - US Staff
  • June 19 (Juneteenth) - US Staff

Rotating positions

Train

Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R
  • Jan 07 - wmf.12 - Dan
  • Jan 14 - wmf.13 - Dan
  • Jan 21 - wmf.14 - Mukunda
  • Jan 28 - wmf.15 - No Train (All Hands)
  • Feb 04 - wmf.16 - Mukunda
  • Feb 11 - wmf.17 - Tyler
  • Feb 18 - wmf.18 - Tyler
  • Feb 25 - wmf.19 - Antoine
  • Mar 04 - wmf.20 -
  • Mar 11 - wmf.21 -
  • Mar 18 - wmf.22 -
  • Mar 25 - wmf.23 -
  • Apr 01 - wmf.24 -
  • Apr 08 - wmf.25 -

SoS

  • Zeljko 4eva! :)

Team Business

Book club


Spring Offsite

  • Location: Chicago, IL (Central timezone, UTC-5 while we're there)
  • Dates: Arrive Monday 4/22, Depart Saturday 4/27.
  • BOOK FLIGHTS BY
  • Activity day: Send your suggestions to me if you have them :) I'll make the voting spreadsheet later.
    • Chicago Bulls!!!11!oneone
      • April 10 -- Regular Season ends, so only if they're good this year :)
    • I've heard there's good pizza :P
      • I'm sure we'll have some of that for our dinners, unless you want to do a cooking class :)
    • Garfield Park Conservatory?
    • Museum of Science and Industry - https://www.msichicago.org/
  • Program: Haven't started yet :)
    • Any American sport would be fun (basketball, football, baseball...) (Lars doesn't like watching sports, but would be happy to sit somewhere quiet for the duration) (thcipriani: baseball isn't so much about watching baseball :))

Technical Advice IRC Meetings


Monthly reflection on accomplishments

  • Let's start keeping a list of accomplishments we've had over the last month (instead of monthly or weekly)
  • Purpose: helps with morale :) and can be a way of identifying good blog posts/other ways of showcasing our work
  • blubber uses blubberoid.wikimedia.org in the pipeline and pipeline is almost there for end-to-end functionality (can't yet deploy to production, but nearly can)
  • scap development back on gerrit -- new contributors
  • local-charts repo created
  • docker SIG announced/setup
  • Developer satisfaction survey results https://www.mediawiki.org/wiki/Developer_Satisfaction

Incoming/Needs attention

beta-update-databases-eqiad still failing

The instances hosting the master and slave MySQL databases crashed last week.


"What does the Pipeline mean for X?"

  • Beta Cluster?
    • https://phabricator.wikimedia.org/T215217
    • What I said on task: "Things will be changing with what is possible and what is needed as we migrate more and more parts of our infrastructure to the Deployment Pipeline. We (RelEng and SRE) should scope out what that is and how that impacts Beta Cluster in the short, medium, and long term (read: that's the conversation that should happen to move this stewardship review forward)."
Questions
    • How long will it get worse?
    • What is the replacement?
    • What to do if it breaks fatally before the replacement is ready?
    • How does this fit with the Staging/Canaries work?
Discussion
    • If one of the outcomes of stewardship is that beta continues to exist, we need to better define the use-cases that beta will support. We need to break apart the use-cases: i.e., new use-cases covered by "staging", x-use-cases covered by "beta"
    • More resources not necessarily going to solve this until we sort use-cases
    • Some of the use-cases identified during discussion with SRE re:staging -- might be a good next step
    • Greg does not volunteer :)
    • ACTION: thcipriani: bring up the eventual "staging" -- what that means -- during cross-team meeting


  • CI "k8s cluster"
    • See email from Alexandros I forwarded to the team list
    • Nebulous specs from Antoine last year: https://docs.google.com/document/d/1IV4bprNRDWBX-OZHZC5tS1-c7GCY3HEfoblqMT_CRJs/edit
    • This would be *instead of* the WMCS VPS usage.
    • Lars: Does the dependent builds thing affect this?
    • original idea: not running docker on VMs in WMCS and run containers on k8s instead
    • How will the pipeline build images?
      • Unclear
    • still ideal to move off of WMCS infra
    • Blocked on unknowns related to zuul
      • Currently have a legacy fork of upstream zuulv2
      • migration to zuulv3 is a large overhaul
        • Requires nodepool, zookeeper, etc
        • Make zuulv3 move to k8s
    • Open question: does this involve some k8s cluster?
    • TODO Make a group to make a decision on this
      • Lars
      • <Greg to talk to people in 1:1s to find the who>
      • Guest speaker on Zuulv3: Paladox?


Scrum of Scrums

Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums

Incoming from last week


Outgoing this week (wrong section heading is on purpose for copy/pasting into Scrum of Scrums etherpad)

Release Engineering


Callouts

  • Release Engineering


Train status and happenings

https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor



Quarterly Goals for Q3

https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2018-19_Q3

TEC1 (Maint): Outcome 1 / Output 1.1

GOAL: Automate the generation of change log notes
WHO: Mukunda, (Tyler on backup)
  • Still want to trigger on branch creation of mw/core...still in large list of TODOs

TEC1 (Maint): Outcome 1 / Output 1.1

GOAL: Investigate notification methods for developers with changes that are riding any given train
WHO: Mukunda, Tyler
  • No movement this week

TEC3 (Pipeline): Outcome 1 / Output 1.2

GOAL: Instrument Quibble for data collection
WHO: Mukunda, Antoine


TEC3 (Pipeline): Outcome 1 / Output 1.2

GOAL: Create a graph of where time is spent and make a prioritized list for improvements.
WHO: Mukunda, Antoine
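
As a rough illustration of the kind of data this goal needs, here is a minimal sketch of per-stage timing in Python. Everything in it is an assumption made for illustration: `StageTimer`, the stage names "clone" and "tests", and the report format are hypothetical and are not part of Quibble. It only shows one way to accumulate wall-clock time per named stage and rank the stages so the slowest ones surface first as candidates for the prioritized list.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named stage.

    Hypothetical helper for illustration only; not part of Quibble.
    """

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the timer can be tested deterministically.
        self._clock = clock
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        """Time the enclosed block and add it to the total for `name`."""
        start = self._clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage raised an exception.
            self.totals[name] += self._clock() - start

    def ranked(self):
        """Return (stage, seconds, share_of_total) tuples, slowest first."""
        total = sum(self.totals.values())
        rows = sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
        return [(name, secs, secs / total if total else 0.0) for name, secs in rows]


if __name__ == "__main__":
    timer = StageTimer()
    with timer.stage("clone"):   # placeholder for real work
        time.sleep(0.02)
    with timer.stage("tests"):   # placeholder for real work
        time.sleep(0.05)
    for name, secs, share in timer.ranked():
        print(f"{name:10s} {secs:6.3f}s {share:5.1%}")
```

The ranked output could feed a graph directly; whether the real instrumentation would live inside Quibble or wrap it from the outside is not decided in these notes.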


TEC3 (Pipeline): Outcome 2 / Output 2.1

GOAL: Select and integrate a code health metric solution into our tooling.
WHO: JR, ...
  • Pending code metrics workgroup work

TEC3 (Pipeline): Outcome 3 / Output 3.1

GOALS:
Adopt more services into Deployment pipeline - task T212801
cxserver, ORES (partially), citoid, changeprop, cpjobqueue (stretch)
Deploy eventgate
WHO: Dan, Tyler, Lars
  • citoid, eventgate: have images built via the pipeline
  • just merged cxserver move to pipeline
  • feedback from pipeline in progress


TEC12 (DevProd): Outcome 1 / Output 1.1

GOAL: Conduct interviews with development stakeholders and compile a report that informs future work creation of a rubric.
WHO: Jeena, Mukunda


TEC13 (Code Health): Outcome 1 / Output 1.1

GOALs:
Develop and communicate guidelines and best practices for successful Code Stewardship.
(Continued from Q2) Update/refresh review queue (review process for initial code deployment)
WHO: JR

minor progress

TEC13 (Code Health): Outcome 2 / Output 2.2

GOAL: 5 of the 15 prioritized repositories have at least 1 end-to-end test - task T206621
WHO: Zeljko



TEC13 (Code Health): Outcome 2 / Output 2.3

GOALs:
Evolve/develop tools and processes to support the PE refactoring effort to improve code health.
Develop a common test strategy that enables teams to engage in more effective and efficient testing practices. (maybe should be output 2.4?)
WHO: JR, Core Platform Team

Met up with CPT last week to discuss unit testing and code coverage tooling/process. Next steps defined.

TEC13 (Code Health): Outcome 3 / Output 3.2

GOALs:
Speak at All Hands on the status of Technical Debt
Engage and coach development teams on their approach to managing technical debt.
WHO: JR, Core Platform Team


No progress


TEC13 (Code Health): Outcome 4 / Output 4.1

GOALs: Code Health Dashboard with 50% of repositories covered.
WHO: JR, Core Platform Team

Core platform codebase now included in SonarQube POC.

Other non-goal work

Selenium

Gerrit

Phabricator

  • I spent some time over the weekend experimenting with running phabricator in kubernetes
    • Most of this was time spent learning all of the tooling: minikube, kubectl and helm
    • Limited success! Got a vanilla phabricator container running from helm

Jenkins

QA/Code Health

Dom joined the Foundation as a QA engineer.

Discussions are starting up again re: Code Maintenance and how to properly plan and resource the ongoing work. Corey and Marcella are driving the discussion.

SCAP

Standup!

Antoine

  • What I plan to do this week
  • What I'm blocked on
  • Other?


Brennen

  • What I plan to do this week
  • What I'm blocked on
  • Other?


Dan

  • What I plan to do this week
    • Vacation!
  • What I'm blocked on
  • Other?


Greg

  • What I plan to do this week
    • ISOSSTWG results, maybe
    • Read the book
    • Get James approved for offsite
    • Review queue brain dump
    • Schedule retro of ExternalGuidance deployment
  • What I'm blocked on
    • Is everyone selected for the Hackathon going to go? (JR, Zeljko (yeah), James (yes), Greg, Jeena)
  • Other?

James

  • What I plan to do this week
    • Helped Krinkle with Fresnel/performance testing
    • Still reading into docker/CD stuff
  • What I'm blocked on
  • Other?
    • Out next week.


Jean-Rene

  • What I plan to do this week
    • Code Stewardship reviews continued
    • Code Stewardship best practices
  • What I'm blocked on
  • Other?


Jeena

  • What I plan to do this week
    • Add restbase to local charts
    • document install process on Mac for local charts
    • read book
  • What I'm blocked on
  • Other?


Lars

  • What I plan to do this week
    • Set up and run Quibble locally on my laptop
    • Skim Quibble source code, see about instrumenting it to see where time is spent
    • Read CD book
    • Read Go book
  • What I'm blocked on
  • Other?
    • Not getting Phabricator notification emails about new comments to tickets I'm subscribed to - is this normal?


Mukunda

  • What I plan to do this week
    • Release MediaWiki 1.32.1
    • Deploy phabricator update!
      • This hasn't happened for a long time due to a long list of interruptions and delays: difficult-to-resolve merge conflicts, followed by the offsite, holidays, All Hands, moving into my new place, and a broken local test environment.
      • Woohoo.
      • Lots of good upstream changes are incoming, so we should get some nice new functionality.
      • I intend to write a phame blog post covering some key changes.
    • Also continue to work on phabricator in kubernetes for local dev / test environment.
  • What I'm blocked on
  • Other?


Tyler

  • What I plan to do this week
    • train
    • scap release
    • scap local dev stuff (pairing friday)
    • maybe
      • pipelinelib gerrit commenting
      • branch notes on branch creation
  • What I'm blocked on
  • Other?


Zeljko

  • What I plan to do this week
    • T206621 5 of the 15 prioritized repositories have at least 1 end-to-end test
    • T204068 QA: Automation Testing - port Echo Notification tests to Node.js
    • T207046 Code health metrics spike
  • What I'm blocked on
  • Other?


Grooming

Team Kanban Board Review and Triage


Once / month-ish review of backlog(s)


Kanban stats

Burnup chart