Wikimedia Release Engineering Team/Checkin archive/2021-02-03

= 2021-02-03 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * 15 Feb: Presidents' Day -- US staff with reqs


 * 29 Mar: US staff with reqs


 * 12 Apr: US staff with reqs
 * 22 Apr: Earth Day -- US staff with reqs


 * I made this: https://wikitech.wikimedia.org/wiki/Deployments/Yearly_calendar
 * Am I missing anything?

Train

 * Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R
 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Important_dates


 * 16 Nov - wmf.18 - Ahmon + Antoine
 * 23 Nov - wmf.19 - No Train - Thanksgiving Thurs/Fri https://phabricator.wikimedia.org/T263185
 * 30 Nov - wmf.20 - Antoine + Mukunda
 * 7 Dec - wmf.21 - Mukunda + Dan
 * 14 Dec - wmf.22 - Dan + Jeena
 * 21 Dec - wmf.23 - No Train
 * 28 Dec - wmf.24 - No Train
 * 4 Jan - wmf.25 - Jeena + Antoine (in place of Lars)
 * NB: Lars is only back from holiday on Thursday Jan 7
 * 11 Jan - wmf.26 - Lars + Jeena
 * 18 Jan - wmf.27 - Brennen + Lars (Monday is a holiday)
 * 25 Jan - wmf.28 - Ahmon + Brennen


 * 1 Feb - wmf.29 - Antoine + Ahmon
 * 8 Feb - wmf.30 - Mukunda + Antoine
 * 15 Feb - wmf.31 - Dan + Mukunda (Monday is a holiday)
 * 22 Feb - wmf.32 - Jeena + Dan
 * 1 Mar - wmf.33 - Lars + Jeena
 * 8 Mar - wmf.34 - Brennen + Jeena

Status

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor

SoS

 * 2019-08-14 onwards: Zeljko 🎸 🎷 \o/
 * 2020-08-26 onwards: Deb is in charge/SoS is async
 * 2020-11-25: Brennen
 * 2020-12-02: Ahmon
 * 2020-12-09: Tyler
 * 2020-12-16: Antoine
 * 2021-01-06: Tyler
 * 2021-01-13: Text only update
 * 2021-01-20: Mukunda
 * 2021-01-27: Text only update
 * 2021-02-03: Thcipriani

Outgoing

 * Blocked by:
 * MemcachedPeclBagOStuff: Serialization of 'Closure' is not allowed
 * Listed as PET, has a patch
 * docker-pkg: "certificate verify failed: unable to get local issuer certificate" for docker-registry.discovery.wmnet when publishing dev-images from contint2001
 * Have checked the obvious stuff, but not really sure how to proceed.
 * Blocking:
 * Updates:
 * [All] Deployments/Covid-19 https://wikitech.wikimedia.org/wiki/Deployments/Covid-19
 * Train Health
 * Last week: 1.36.0-wmf.28 T271342
 * This week: 1.36.0-wmf.29 T271343
 * Next week: 1.36.0-wmf.30 T271344

Callouts

 * [RelEng] After several failed attempts to roll out wmf.28, we abandoned the release and moved on to wmf.29. This was done because (1) wmf.29 is a superset of the code in wmf.28, and (2) we need a stable base to roll back to and there was not enough time to determine wmf.28's stability.

Incoming/Needs attention

 * Everything from this doc should be in betterworks: https://docs.google.com/document/d/1xxPYTb6mGjC0z3kEuFFWxV9YcOipvs4wSMkk3wPEl7s/edit#


 * Observability is moving Icinga to victorops. I mentioned that it might be good to have a training with them where we can ask questions (like the dumb question I asked: Are you talking about nagios? I get nagios alerts, is that affected? No.)
 * Interested? Put your name here (If you get alerts you ought to go -- Mukunda, Antoine(?)):
 * Antoine
 * Mukunda
 * Brennen
 * Jeena


 * 2021-02-10: Brennen demos logspam-watch!!!


 * Train policy: https://wikitech.wikimedia.org/wiki/Deployments/Holding_the_train#Issues_that_hold_the_train
 * Revisions needed?
 * Probably the one about if a security patch doesn't apply without 3-way - https://phabricator.wikimedia.org/T269153#6793657
 * Lars: until Scap apply-patches is fixed, restore the manual instructions on the deployment page? That would make this not be a train blocker.
 * Proposal:
 * >= 2 new messages from a version
 * "new" -- is a term that needs to be defined
 * intermittent noise -- bots hitting obscure pages, etc.
 * Client errors? (should be)
 * Could be a question for observability -- how do we get to something more rigorous than logstash filters?
 * TODO: brennen to update docs: https://phabricator.wikimedia.org/T273802
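
A minimal sketch of how the proposed ">= 2 new messages from a version" rule could be checked. The notes leave "new" undefined, so this assumes a message is new when its exact text never appears in the previous version's logs. The function names and the list-of-strings input are hypothetical, and filtering out intermittent noise or client errors is not covered.

```python
from typing import Iterable, List, Set

# From the proposal: ">= 2 new messages from a version".
NEW_MESSAGE_THRESHOLD = 2


def new_messages(current: Iterable[str], baseline: Iterable[str]) -> List[str]:
    """Return the distinct messages in `current` that never occur in `baseline`.

    Assumption: "new" means "not logged by the previous version".
    """
    seen: Set[str] = set(baseline)
    fresh: List[str] = []
    for message in current:
        if message not in seen:
            seen.add(message)  # count each distinct message only once
            fresh.append(message)
    return fresh


def holds_train(
    current: Iterable[str],
    baseline: Iterable[str],
    threshold: int = NEW_MESSAGE_THRESHOLD,
) -> bool:
    """Return True when the number of new distinct messages reaches the threshold."""
    return len(new_messages(current, baseline)) >= threshold
```

Comparing exact message text is the simplest possible definition. A more rigorous one, such as normalizing IDs and timestamps out of the message first, is the open question for Observability noted above.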

Book club/Lunch and Learn

 * https://www.mediawiki.org/wiki/Wikimedia_Engineering_Productivity_Team/Book_club
 * https://www.mediawiki.org/wiki/Wikimedia_Engineering_Productivity_Team/Lunch_and_learn
 * https://www.mediawiki.org/wiki/Wikimedia_Engineering_Productivity_Team/Read_papers_and_talk
 * Feb 1st: Zeljko -- Cal Newport's Deep Work
 * Happened: https://www.mediawiki.org/wiki/Wikimedia_Engineering_Productivity_Team/Lunch_and_learn/2021-02-01
 * Feb 15th: Lars -- David Allen's Getting Things Done (GTD)

Monthly reflection on accomplishments - Jan '21 edition

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Monthly_notable_accomplishments
 * Add as you have them!


 * Update dev images to split apache and php containers for local dev
 * Gerrit security bug discovery and deployed fix by Antoine
 * In sync with Gerrit upstream war (Java compiled code)
 * Target releases for apt packages in blubber deployed so WVUI can use npm

Ahmon

 * Blocked by:
 * none
 * Blocking:
 * SRE, who wants to see the results of my m8s experiments
 * Updates:
 * Train duty last week, m8s this week, and meetings. (m7i on k8s -> m8s -> MATES!)

Antoine

 * Blocked by:
 * Train blocked on some magic cache/serialization issue https://gerrit.wikimedia.org/r/c/mediawiki/extensions/FeaturedFeeds/+/661357/ . Root cause not identified (I am betting a cache related change in mediawiki/core)
 * Phatality is missed :-] +100
 * Blocking:
 * Switch Quibble jobs to 0.0.46 (Apache support) / Buster (catchup prod)
 * Updates:
 * gearman-java lib and plugin built/fixed for Java 11. Gotta test Jenkins/gearman with Java 11
 * Upgraded Gerrit 3.2.7 🚀🚀
 * Addresses a memory leak I found *flexes*. Some nice reading at https://bugs.chromium.org/p/gerrit/issues/detail?id=13858
 * Train log triage on Thursday European morning

Brennen

 * Blocked by:
 * None
 * Blocking:
 * Some MediaWiki-Docker users on macOS, probably
 * Updates:
 * Last couple of weeks mostly eaten by train
 * Ongoing dev-images work
 * Need to get brain around GitLab auth better
 * Will do tweaks to train docs

Dan

 * Blocked by:
 * docker-pusher issues on releases-jenkins (haven't filed a patch yet)
 * Blocking:
 * Updates:
 * m8s image is building on releases-jenkins but not publishing yet

Jeena

 * Blocked by:
 * none
 * Blocking:
 * local dev cli changes
 * Updates:
 * Got target releases for apt packages in blubber deployed
 * Reviewing local dev cli changes
 * Adding ability to use credentials by whitelist in pipelinelib
 * Plan to work on mw-on-k8s secrets

Lars

 * Blocked by:
 * Python ecosystem
 * Blocking:
 * nope?
 * Updates:
 * Python2 projects in CI that use Blubber are broken at the moment, because the pip tool no longer supports Python2.
 * https://phabricator.wikimedia.org/T273793
 * This is currently breaking Scap CI
 * Working around this in a local container results in other things breaking
 * I also noticed the order of installing stuff (pip, setuptools) for python seems to not work
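
For context on the pip breakage: pip 21.0 dropped Python 2 support, so a Python 2 environment needs a pip release below 21. The sketch below only illustrates that version pin. The helper name is hypothetical, and this is not necessarily the fix chosen in T273793. As noted above, working around the problem locally broke other things.

```python
from typing import Tuple


def pip_requirement(python_version: Tuple[int, int]) -> str:
    """Return a pip requirement specifier that can run on the given interpreter.

    Hypothetical helper: pins pip below 21 for Python 2, because pip 21.0
    removed Python 2 support.
    """
    if python_version[0] == 2:
        return "pip<21"
    return "pip"
```

On a Python 2 interpreter the resulting pin would be installed with something like `python2 -m pip install "pip<21"`.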

Mukunda

 * Blocked by:
 * kibana 7 is a pain in the ass
 * Blocking:
 * everyone
 * Updates:
 * Working on Phatality, hoping to get that deployed this week. This is pretty much all of my focus currently. +100!!!!!!!!!!!!! \o/

Tyler

 * Blocked by:
 * Nothing.
 * Blocking:
 * Everything.
 * Updates:
 * Deployment calendar bot appears to be ~working
 * Updated deployment calendar the past two weeks
 * Next stop: Cron job \o/
 * Hiring kickoff with SRE for the GitLab/Misc services role today
 * Working on m8s update for senior leadership for next Tuesday