Wikimedia Release Engineering Team/Checkin archive/20181105

= 2018-11-05 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * November 8-9 - Dan vacation in Mexico City 🇲🇽🌮🎉
 * November 12th - Holiday (Veterans Day, observed)
 * November 22+23 - Holidays (Thanksgiving)
 * November 25-December 2 - Mukunda vacation (in California ahead of the offsite)
 * Week of December 3rd - Team offsite
 * December 24-28 - Holidays (Christmas)

Train

 * Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R


 * Oct 08 - wmf.25 - Dan (No train due to DC switchover)
 * Oct 15 - wmf.26 - Mukunda (last 1.32 wmf.XX release, 1.33 starts the next week)
 * Oct 22 - wmf.1 - Mukunda (warning, TechConf happening, ping Greg if you need responses from anyone there...)
 * Oct 29 - wmf.2 - Tyler
 * Nov 05 - wmf.3 - Tyler <
 * Nov 12 - wmf.4 - Antoine
 * Nov 19 - wmf.5 - No Train (Thanksgiving)
 * Nov 26 - wmf.6 - Antoine
 * Dec 03 - wmf.7 - No Train (Offsite)
 * Dec 10 - wmf.8 - Zeljko
 * Dec 17 - wmf.9 - Zeljko
 * Dec 24 - wmf.10 - No Train (Holiday break)
 * Dec 31 - wmf.11 - No Train (Holiday break)
 * Jan 07 - wmf.12 - Dan
 * Jan 14 - wmf.13 - Dan
 * Jan 21 - wmf.14 - Mukunda
 * Jan 28 - wmf.15 - No Train (All Hands)
 * Feb 04 - wmf.16 - Mukunda
 * Feb 11 - wmf.17 - Tyler
 * Feb 18 - wmf.18 - Tyler
 * Feb 25 - wmf.19 - Antoine

SoS

 * Oct 10 - Zeljko
 * Oct 17 - Zeljko
 * Oct 24 - Zeljko
 * Oct 31 - Zeljko
 * Nov 07 - Zeljko <
 * Nov 14 - Zeljko
 * Nov 21 - Zeljko
 * Nov 28 - Zeljko
 * Dec 05 - Zeljko
 * Dec 12 - Zeljko
 * Dec 19 - Zeljko
 * Dec 26 - Zeljko
 * Jan 02 - Zeljko
 * Jan 09 - Zeljko
 * Jan 16 - Zeljko
 * Jan 23 - Zeljko
 * Jan 30 - Zeljko
 * Feb 06 - Zeljko
 * Feb 13 - Zeljko
 * Feb 20 - Zeljko
 * Feb 27 - Zeljko

Hiring

 * Software Engineer position is open; reviewing/hiring for it now
 * https://boards.greenhouse.io/wikimedia/jobs/1225258


 * update....

December Offsite
Details:
 * Week of December 3rd
 * At the Queen Mary hotel in Long Beach
 * Deb T will be facilitating

Topics!
 * https://etherpad.wikimedia.org/p/RelEng-Offsite-201811-Topics
 * Deb and I talked on Friday; she is starting to get the schedule in place.

REMINDER: Deadline to book travel is Nov 8th!

All Hands

 * Registration: https://office.wikimedia.org/wiki/All_hands/2019/Registration
 * Needed for everyone
 * NOTE: There's a way to request a hotel room for semi-local people (commutes longer than 1.5 hours)

Needs attention

 * gerrit security release 2018-10-08
 * https://groups.google.com/forum/m/#!topic/repo-discuss/eH0iLt2XawU
 * jGit update, we are unaffected
 * may want to hold off until next week: https://bugs.chromium.org/p/gerrit/issues/detail?id=9836
 * 2018-10-15 -- paladox tells me they're working on a fix and should have a 2.15.6 tagged Soon™
 * 2018-10-22 -- jGit updated to fix leaks https://gerrit-review.googlesource.com/c/gerrit/+/201273
 * 2018-10-29 -- 2.15.6 released: https://groups.google.com/forum/?hl=en#!topic/repo-discuss/9EUYI2eyIZM
 * thcipriani: Will send email today to update on...Wednesday? Anyone wanna work on this with me?
 * Antoine to pair, and be point next time
 * 2018-11-05: built and testing https://gerrit.wikimedia.org/r/#/c/operations/software/gerrit/+/471758/-1..1


 * deploy1001:/srv/mediawiki out of date?
 * https://phabricator.wikimedia.org/T207602
 * Found because the Security team noticed that a previously deployed security patch was no longer deployed; should sync up with them this week about that (Reedy or Brian)
 * See: https://phabricator.wikimedia.org/T207600
 * 2018-10-22: no idea, thcipriani will look, I guess
 * 2018-10-29: scap updated, needs release this week
 * 2018-11-05:
 * Need to poke Reedy re: T207600
 * scap still needs release - mukunda will take care of it


 * deployment-prep region migration
 * See email with same subject on releng@lists
 * Question: incrementally or not?
 * looks like "however Andrew wants to do it"
 * REMINDER: send an email update to wikitech-l@/qa@ with the planned timeline/outage
 * 2018-10-29: ACTION: Tyler to reply saying "take it away, andrew, and when are you going to do it?"
 * 2018-11-05: Email response ✅ -- blocking task from Krenair https://phabricator.wikimedia.org/T208101 -- Dan and Mukunda graciously volunteered ;)

Scrum of Scrums

 * Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums

Incoming from last week

 * Blocking:

Release Engineering

 * Blocked by:
 * Blocking:
 * Updates:
 * Train Health:
 * Last week: 1.33.0-wmf.2 deployment blockers https://phabricator.wikimedia.org/T206656
 * wmf.2 was late last week due to an odd HHVM issue: https://phabricator.wikimedia.org/T208549
 * This week: 1.33.0-wmf.3 deployment blockers https://phabricator.wikimedia.org/T206657
 * Next week:
 * Log Health:
 * Code Health:

Callouts

 * Release Engineering

Train status and happenings

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor


 * OMG discussion on how to do incident reporting and analysis better
 * https://phabricator.wikimedia.org/T208632
 * mukunda to make some comments

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Release MediaWiki 1.32
 * WHO: Mukunda, (Tyler on backup)

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Determine the procedure and requirements for an automated MediaWiki branch cut.
 * WHO: Mukunda, Tyler, Antoine


 * Created a bunch of subtasks of https://phabricator.wikimedia.org/T156445 for automating release
 * most are needed for MW Branch cut as well as release automation

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Formalize the collection of CI infrastructure and tooling metrics
 * WHO: Dan, Antoine

TEC3 (Pipeline): Outcome 2 / Output 2.3

 * GOAL: Develop set of metrics to assess incident reports/post mortems
 * WHO: Greg, Zeljko

https://docs.google.com/spreadsheets/d/1AUqMgzThBHNL7DgI8C9PO_YQ1oD5CSd0iWvcVbowzdg/edit#gid=1154483822

TEC3 (Pipeline): Outcome 3 / Output 3.1

 * GOALS:
 * Adopt more services into Deployment pipeline
 * Migrate graphoid to the Deployment pipeline
 * Deploy zotero v2 to the Deployment pipeline
 * Deploy blubberoid
 * WHO: Dan, Tyler, Lars


 * Lars, Dan, and thcipriani had a pairing session Friday to move Blubberoid forward

TEC12 (DevProd): Outcome 2 / Output 2.1

 * GOAL: The Annual Developer Productivity Survey results are synthesized and shared, creating a first year baseline.
 * WHO: Mukunda, Greg


 * This is finally sent out and we've already gotten a lot of (IMO useful) responses.

TEC13 (Code Health): Outcome 1 / Output 1.1

 * GOAL: Update/refresh review queue (review process for initial code deployment)
 * WHO: JR

TEC13 (Code Health): Outcome 2 / Output 2.2

 * GOAL: 5 of the 15 prioritized repositories have at least 1 end-to-end test
 * WHO: Zeljko

TEC13 (Code Health): Outcome 2 / Output 2.3

 * GOAL: Assess Platform unit test practices and define improvement plan
 * WHO: JR, Core Platform Team

TEC13 (Code Health): Outcome 3 / Output 3.2

 * GOAL: Core Platform and Search Platform teams are using TDM PoC
 * WHO: JR, Core Platform Team

TEC13 (Code Health): Outcome 3 / Output 3.4

 * GOALS:
 * Identify key Tech Debt areas
 * Put in place Tech Debt management process for PEP
 * WHO: JR, Core Platform Team

TEC13 (Code Health): Outcome 4 / Output 4.1

 * GOAL: Metrics defined and deployed for all 4 Code Health areas.
 * WHO: JR, Code Health Metrics Working Group

Antoine
Relocated Wikibase client job is ready to migrate to Docker. The repo one has to wait until we see why its scope is so different.


 * What I plan to do this week
 * Look at Wikibase repo
 * Java 8 security update fallout? Probably want to upgrade the CI container
 * What I'm blocked on
 * DonationInterface migration pending on fundraising
 * Other?

Dan

 * What I plan to do this week
 * Write a blog post about October 2018 CI build data analysis
 * Working title: "It's a zombie party: bring in 'da noise, bring in defunct"
 * Analysis: https://docs.google.com/spreadsheets/d/1-HLTy8Z4OqatLnufFEszbqkS141MBXJNEPZQScDD1hQ/edit?usp=sharing
 * Still prometheus-ing
 * What I'm blocked on
 * Other?

Greg

 * What I plan to do this week
 * catch up on l10nupdate follow-ups
 * follow-up from TechConf program committee (cleaning/sanitizing notes and posting to wiki mostly)
 * a quick pass through any remaining updates to the onboarding process/task structure (incorporate learnings from Lars' onboarding)
 * What I'm blocked on
 * dunno?
 * Other?
 * dunno?

Jean-Rene

 * What I plan to do this week
 * What I'm blocked on
 * Other?

Lars

 * What I plan to do this week
 * Delivery pipeline architecture diagram to understand what the goal and status quo are.
 * Find and read existing delivery pipeline code. (thcipriani: in integration/config)
 * Study Kanban boards.
 * What I'm blocked on
 * Lack of superbrain
 * Other?
 * Nada

Mukunda

 * What I plan to do this week
 * Get the latest scap deb released
 * keyholder review
 * I didn't get the MW 1.32.0-rc1 tarball done last week; get that done this week for sure
 * (with Dan) Fix beta cluster static IPs for transition to the new cloud region
 * Outline proposal for incident report forms
 * What I'm blocked on
 * Other?

Tyler

 * What I plan to do this week
 * Train
 * Gerrit
 * Fundraising CI job
 * What I'm blocked on
 * Other?

Zeljko

 * What I plan to do this week
 * T199133 Find top 15 target projects that could use Selenium tests to prevent incidents
 * What I'm blocked on
 * Other?

Team Kanban Board Review and Triage

 * closed and touched in the last 7 days
 * No update for 4 weeks
 * No update for 3 weeks
 * No update for 2 weeks
 * No update for 1 week
 * All Open
 * Review To Triage column of #releng
 * Assigned
 * Unassigned

Once / month-ish review of backlog(s)

 * releng - Review To Triage column of #releng
 * releng-kanban - Review unassigned in kanban
 * releng-kanban - Review 'backlog' column of -kanban
 * releng-next - Review for things we need to put on our kanban backlog
 * releng-backlog - oh my, the huge backlog of things...

Kanban stats

 * Burnup chart