Wikimedia Release Engineering Team/Checkin archive/20190513

= 2019-05-13 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * May 13–15: James working from London
 * May 15 (Wednesday) - Željko vacation
 * May 16-20 - Wikimedia Hackathon 2019 (Prague, Czechia)
 * Attending: Greg, JR, Zeljko, James, and Jeena
 * May 17th: Mukunda day off - Concert.
 * May 17th: thcipriani - half day - airport run
 * May 20-31 - Jeena Vacation
 * May 21: James still travelling back to SF
 * May 27 (Memorial Day) - US Staff
 * May 28th-31st - thcipriani - family in town
 * May 30th - Lars, Ascension
 * May 30th-31st - Antoine, Feast of the Ascension


 * June 6-7 - Brennen, Apogaea
 * June 10th - Antoine, Pentecost -- see https://en.wikipedia.org/wiki/Eastertide for Antoine/France Easter holidays
 * June 10-? - Dan leave (4-6 weeks, then additional leave later)
 * June 19 (Juneteenth) - US Staff


 * July 4 (US Independence Day) - US Staff
 * July 22 - August 9 - Željko vacation
 * July 22 - Lars, Midsummer


 * August 7–9 - James off
 * August 12 (Glorious Twelfth) - US Staff
 * August 14–18 - Wikimania
 * Attending: James, ? …
 * August 25 - September 4 - Brennen vacation


 * September 2 (Labor Day) - US Staff


 * October 14 (Indigenous Peoples' Day) - US Staff


 * November 11 (Veterans' Day) - US Staff
 * November 28–29 (Thanksgiving) - US Staff


 * December 6 - Lars, Finnish Independence Day
 * December 25–31 (Christmas) - US Staff
 * December 25-26 - Lars, Christmas


 * 2020 January 1 (New Year's Day) - US Staff, Lars

Train

 * Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R


 * Apr 29 - wmf.3 - Tyler
 * May 06 - wmf.4 - Tyler
 * May 13 - wmf.5 - Antoine
 * May 20 - wmf.6 - Antoine
 * May 27 - wmf.7 - Zeljko
 * June 03 - wmf.8 - Zeljko
 * June 10 - wmf.9 - Mukunda
 * June 17 - wmf.10 - No Train (Juneteenth)
 * June 24 - wmf.11 - Mukunda
 * July 1 - wmf.12 - No train (Fourth of July)
 * July 8 - wmf.13 - Tyler
 * July 15 - wmf.14 - Tyler
 * July 22 - wmf.15 - Antoine
 * July 29 - wmf.16 - Antoine
 * Aug 5 - wmf.17 - one of Mukunda/Antoine/Tyler (Antoine and Zeljko on vacation)
 * Aug 12 - wmf.18 - Zeljko (during Wikimania)
 * Aug 19 - wmf.19 - Zeljko (after Wikimania)
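
The numbering in the schedule above advances by one every Monday, and the "No train" weeks still use up a number (wmf.10, wmf.12). As a rough sketch only, assuming that pattern holds and that the branch prefix stays 1.34.0-wmf. (as in the Train Health links), the branch for a given date could be derived like this. The function and constant names are made up for illustration and are not part of any RelEng tooling.

```python
from datetime import date

# Anchor taken from the schedule: wmf.3 belongs to the week of Monday 2019-04-29.
ANCHOR_MONDAY = date(2019, 4, 29)
ANCHOR_NUMBER = 3
# Assumption: the prefix stays the same for the whole schedule.
BRANCH_PREFIX = "1.34.0-wmf."


def train_version(day: date) -> str:
    """Return the branch name for the train week that contains `day`."""
    days_since_anchor = (day - ANCHOR_MONDAY).days
    if days_since_anchor < 0:
        raise ValueError("date is before the first week in the schedule")
    # One branch number per calendar week, skipped trains included.
    weeks = days_since_anchor // 7
    return f"{BRANCH_PREFIX}{ANCHOR_NUMBER + weeks}"


if __name__ == "__main__":
    print(train_version(date(2019, 5, 13)))
    print(train_version(date(2019, 8, 19)))
```

This only reproduces the numbering; who conducts the train and which weeks are skipped still come from the list above.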

SoS

 * Zeljko 4eva! :)

Timespent spreadsheet

 * For the avoidance of doubt: fill out the sheet week number for the previous week


 * link to week starting May 13: https://docs.google.com/spreadsheets/d/1urCLNQXeEi1DOR8Iu0qW0yPt-glxX1laqlMovbGyCW0/edit#gid=571269651

Book club

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club
 * Notes: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club/Continuous_Delivery
 * Next: June 14th, chapters 10+11

Spring Offsite
Follow-ups:


 * Greg: Email mark re gerrit/phab hosting discussion
 * DONE: scheduled for Wed May 8th.
 * Attending: Mark, mutante, Alex, Joe, Lars, Antoine, Dan, Mukunda, Tyler, Me
 * Not just about gerrit/phab, also "the pipeline transition period and what that means for Beta Cluster" (hence all the people)
 * https://etherpad.wikimedia.org/p/ep-sre-ap-sync


 * Greg: email mark about capex request for next year for pipeline
 * I'm actually not sure what this is about/what the ask is, help?!
 * "staging" pipeline?
 * Production access?


 * Tyler: write out a justifiable ask for hardware resources for Gerrit
 * DONE: https://phabricator.wikimedia.org/T222391


 * ????: re Integration environments: establish SLAs between the teams for what is their responsibility and ours, what is the working relationship
 * I think there's something more here that needs to be fleshed out, see the relevant section here: https://docs.google.com/document/d/1Y-cYrPKT0dvN2oj0hScIjRjkM2zWL5NY9xMYfMuC2Do/edit?ts=5c9cd50b#heading=h.vbm26ktfhprv
 * Greg: flesh out/say more on this
 * 2019-05-13: not yet...


 * Mukunda: talk with Timo and Filippo about our prioritized list of feature requests for LMM
 * Note: Gergo confirmed that SRE is going to work on Sentry in Q1/Q2 (from a conversation with Faidon and Filippo)


 * Greg: announce that RelEng is backup only for SWAT (removing individual people's names so they are not pinged every time on IRC) and we'll start working on automating the train
 * Still need to do Q4 goals...table this “doing” until Q1?
 * Greg will send a signed email if someone writes it up ;)
 * Željko will write the e-mail this week - done


 * Greg: setup the new project/task management process in Phab based on feedback
 * taskified: https://phabricator.wikimedia.org/T222496
 * Demo time!
 * kanban: https://phab.wmflabs.org/project/board/37/query/all/
 * TODO: https://phab.wmflabs.org/project/board/36/
 * RelEng (categories): https://phab.wmflabs.org/project/board/35/


 * Greg: collect mission/scope output in a central living place
 * DONE: https://docs.google.com/document/d/13wF7e9ZkdoFytl3tbO2UywsXCj0xxbJ4avvfTAwn-gk/edit#

Monthly reflection on accomplishments - May '19 edition

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Monthly_notable_accomplishments
 * Add as you have them!


 * Phabricator vandalism rollback tool completed 🎉 (blog post? 😉)
 * Upgrade Zuul to 2.5.1-wmf6 (which unblocks the Gerrit upgrade to 2.16) - https://phabricator.wikimedia.org/T208426
 * Team offsite in Chicago
 * Repository-hosted CI/CD pipeline configurations now supported (.pipeline/config.yaml) - https://phabricator.wikimedia.org/T210267
 * Train notes published on branch cut

Annual Planning

 * Metrics are in for Core/Operational work. Waiting on the C-level announcement on the Priority work.
 * https://docs.google.com/document/d/1GueI1JhQkWjnZXKUmN8T7SS3s3alJQY2fYFfCRZp7So/edit#heading=h.v9gue7m4adr5

FYI from SRE: "Percentage of services in the Deployment Pipeline having SLOs defined and agreed upon together with their service owner" 50% by end of FY2019/20, 100% in 3-5 years.

Annual Reviews
Overview: https://office.wikimedia.org/wiki/FY_2018-19_Annual_Review_and_Retrospective
 * Note: there is a workshop you can attend to get advice: https://office.wikimedia.org/wiki/FY_2018-19_Annual_Review_and_Retrospective#Sprints_&_trainings_-_support_from_T&C

Deadlines
Everyone:
 * Starting now: You and I discuss who your peer reviewers should be
 * April 26th: Enter your peer reviewers into Namely (please run them by me first)
 * May 17th: Deadline to complete self-reviews, peer reviews, and reviews of your manager.
 * May 20th: I start reviewing the peer reviews and writing my feedback on you.

Non SafeGuard (aka US Employees):
 * June 14th: Deadline for managers to complete all 1:1 meetings with direct reports and provide written feedback in Namely.

SafeGuard:
 * June 14th - Managers of those employed by Safeguard submit their reviews to HR for submission to Safeguard
 * July 12th - Deadline to have a 1:1 and share final manager review with direct report in Namely

Incoming/Needs attention

 * node6-node10 migration in CI: https://phabricator.wikimedia.org/T211784
 * James needs a CI root to push new config/images
 * TODO: Give James access (to contint-admins in puppet + integration/config +2)
 * Done: https://phabricator.wikimedia.org/T223137 // https://gerrit.wikimedia.org/r/#/c/509891/


 * FYI: Java logging session from gehel last week https://drive.google.com/file/d/1gkA1CUrBkiN6XkUNWSChK-XQMZSbxGQT/view

Incoming from last week

 * Blocking:
 * Callouts: Changing a WikibaseCirrusSearch config default (to activate its functionality by default when installed) appears to break several browser tests in CI. Guidance requested on what to do about this: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikibaseCirrusSearch/+/507597/
 * I took a look. It looks to me that the tests fail because search is not working (properly) in CI. Is it similar to this? https://phabricator.wikimedia.org/T188507
 * I guess Antoine needs to take a look at this.


 * Callouts: some tests are really long: https://phabricator.wikimedia.org/T222757
 * Greg already left a comment. I think Antoine is already working on making Quibble faster. I've looked at the phab board, but couldn't find a task: https://phabricator.wikimedia.org/project/view/2772/
 * Do we need to leave a comment saying we're working on it? Antoine again? :)


 * Language: Add abi to l10n-watchers group in Gerrit (https://phabricator.wikimedia.org/T222015)
 * This seems to be a policy problem. Greg or Tyler? thcipriani: what policy blocks this? Not sure. People seem confused. Maybe the rules just need to be made explicit.

Release Engineering

 * Blocked by:
 * Blocking:
 * Updates:
 * Train Health
 * Last week: 1.34.0-wmf.4 - https://phabricator.wikimedia.org/T220729
 * This week: 1.34.0-wmf.5 - https://phabricator.wikimedia.org/T220730
 * Next week: 1.34.0-wmf.6 - https://phabricator.wikimedia.org/T220731
 * Code Health
 * Log Health

Callouts

 * Release Engineering

Train status and happenings

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor


 * Need to fix scap clean :\
 * thcipriani has a crappy fix in mind until http tokens in gerrit are back
 * Any idea when HTTP tokens will come back? Weeks? Months? Never? :-(
 * ~Weeks
 * 2019-05-06: cleaned up stuff last week on deploy hosts, just not the gerrit branches
 * 2019-05-13: …


 * 1.33 branch cut for extensions is blocked (except tarball ones, which James did manually)
 * 2019-05-06: Mukunda to do it this week
 * Greg: email Cindy re process of this release
 * 2019-05-13: We talked on Thursday. Mukunda will review hexmode's work, Cindy will email Greg with plan of action re timeline.

Quarterly Goals for Q4
https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2018-19_Q4

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Undeploy the CodeReview extension.
 * WHO: James, need help from CPT


 * James will ping CPT about this this week (April 8th)
 * … and again w/c 15 April.
 * … and again w/c 6 May (in SoS).

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Setup 1-3 of the CI WG options (Zuul v3, Argo, GitLab)
 * WHO:


 * Focus on a couple noteworthy repos: e.g.,
 * core
 * extensions
 * ops/puppet
 * Maybe setup in serial, i.e., a week per evaluation


 * Questions:
 * RelEng/Extended working group?
 * At least in the WG eval it was good to have non-familiar people
 * But for setting up the options it might be beneficial to have people experienced with the current setup.
 * Folks outside the original working group to join-in to setup options; people TBD
 * Do we need a rubric before we do this prototyping? (yes)
 * DONE lars to work on rubric week of 2019-04-01
 * See email 2019-04-08


 * 2019-05-06: Feedback from Android. Working on an arch document. Do in Q1?

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Instrument Quibble for data collection
 * WHO: Mukunda, Antoine


 * Still no progress / nowhere to store this data and other tasks taking priority

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Create a graph of where time is spent and make a prioritized list of improvements.
 * WHO: Mukunda, Antoine

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Prepare the Deployment Pipeline for changes to our CI tooling.
 * WHO: ???, ???


 * Blocked by not having new CI tooling yet

TEC3 (Pipeline): Outcome 3 / Output 3.1

 * GOAL: Create a .pipeline/config.yaml standard to give users more control over how their tests are run in the pipeline and allow the easy saving of artifacts at pipeline completion. (RelEng)
 * WHO: Dan, Tyler, ???


 * Dan's pipeline work merged
 * Followup to update the pipeline to *use* that new fancy code
 * General problem of shared resources on staging and in helm test stage (ci namespace on staging)


 * What does the annual plan *actually say* for this?

TEC3 (Pipeline): Outcome 3 / Output 3.1

 * GOALS:
 * Adopt more services into Deployment pipeline -
 * Wikidata Termbox SSR, Kask for Session Storage Service, cpjobqueue (stretch), ORES (stretch)
 * WHO: Dan, Tyler, Lars

There are tasks: https://phabricator.wikimedia.org/T220403


 * changeprop


 * ORES
 * cf: Dan's comments


 * Wikidata Termbox SSR


 * Kask for Session Storage Service


 * cpjobqueue (stretch)

TEC12 (DevProd): Outcome 1 / Output 1.1

 * GOAL: Provide an "Official" Docker base image for local development of MediaWiki based on the production tooling.
 * WHO: Jeena, Brennen
 * https://phabricator.wikimedia.org/T212449

TEC13 (Code Health): Outcome 1 / Outcome 3

 * GOALs: Presentation/session(s) at the Wikimedia Hackathon on the current state of Code Health projects (technical debt and code stewardship)
 * WHO: JR

Met and discussed Hackathon session with Code Health Metrics WG. Daniel will also be having a related session on Cycle Dependencies.
 * T216630 Present Code Health Metrics at the Hackathon
 * image updated
 * jobs publish
 * need to actually use jobs in zuul, waiting on T222210

TEC13 (Code Health): Outcome 1 / Output 1.1

 * GOAL:
 * Publish a re-imagination of the Review Queue process.
 * Develop and implement metrics around task and code-review responsiveness
 * WHO: Greg, JR (and Andre)


 * No activity

TEC13 (Code Health): Outcome 4 / Output 4.2

 * GOALs:
 * Expand SonarQube reporting into CI infrastructure
 * Perform SonarQube analysis on all extensions
 * Engage user communities in direct feedback solicitation
 * WHO: JR, Zeljko, Code Health Metrics


 * Continued work towards integrating SonarQube into CI

Release MW 1.33

 * ETA: end of May
 * "just" producing the tarballs, Cindy is on point for what's in/what's out.
 * Can I get a volunteer?
 * Mukunda can build tarballs
 * Greg: email Cindy asking for status
 * See previous discussion above

Selenium

 * Progress on various Phabricator tickets and/or Gerrit patches

Phabricator

 * Merged and deployed upstream changes
 * Fixed the calendar default view, no more fatal error.
 * Reviewed and deployed several weeks of new translations from translatewiki

QA/Code Health

 * Sent out participation request for Code Review Workgroup, 19 interested respondents so far.


 * Discussions with existing TEs moving over to Q&T Engineering started last week. New team is being received well.

Antoine

 * What I plan to do this week
 * Train!
 * tox upgrade from 2.9 to 3.10.0
 * What I'm blocked on
 * Zuul dos - our zuul is too old. I give up eventually.
 * Bring up a new zuul-merger instance, but the provisioning is subtly broken
 * Reorganize CI projects in Phabricator https://phabricator.wikimedia.org/T223134


 * Other?
 * For those heading to hackathon, Kosta is willing to work on: https://phabricator.wikimedia.org/T87781 //Split mediawiki core tests into unit and integration tests//
 * if you build a Docker container, it might not be immediately available due to replication delay between codfw (active) and eqiad (replica, serving docker-registry.wikimedia.org) https://phabricator.wikimedia.org/T222210#5176863
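
Because of the replication delay noted above, a job that needs a freshly built image may have to wait and retry rather than fail on the first miss. A minimal sketch of that idea, assuming the caller supplies its own availability check; `wait_until_available` and its parameters are hypothetical names, not part of any existing CI tooling, and no real registry call is shown:

```python
import time
from typing import Callable


def wait_until_available(
    check: Callable[[], bool],
    attempts: int = 5,
    delay: float = 2.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` until it returns True or `attempts` is exhausted.

    `check` is supplied by the caller (for example, a function that asks
    the registry whether an image tag exists). The wait between tries
    grows by `backoff` each time to ride out replication lag.
    """
    for attempt in range(attempts):
        if check():
            return True
        # No point sleeping after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
            delay *= backoff
    return False


if __name__ == "__main__":
    # Stand-in check that succeeds on the third try.
    state = {"calls": 0}

    def image_is_visible() -> bool:
        state["calls"] += 1
        return state["calls"] >= 3

    print(wait_until_available(image_is_visible, delay=0.1))
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting.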

Brennen

 * What I plan to do this week
 * Work on publishing of starter dev images before hackathon
 * Whatever else might be useful for tweaking local-charts prior to showing it to a bunch of people
 * Review writing
 * What I'm blocked on
 * Other?

Dan

 * What I plan to do this week
 * Get a .pipeline/config.yaml working for... blubber(oid)?
 * Needs integration/config
 * What to do? Seed job?
 * Reviews
 * What I'm blocked on
 * Other?

Greg

 * What I plan to do this week
 * Annual Reviews
 * Annual Planning
 * Hackathon
 * What I'm blocked on
 * Other?

James

 * What I plan to do this week
 * CI/npm stuff, as above.
 * More MW static configuration concept work
 * Hackathon
 * Pipeline documentation work with Martyav.
 * Helping more ServiceOps with the HHVM -> PHP72 migration
 * What I'm blocked on
 * Other?

Jean-Rene

 * What I plan to do this week
 * Prep for Hackathon
 * Travel/Hackathon
 * Reviews
 * What I'm blocked on
 * Other?

Jeena

 * What I plan to do this week
 * Finish reviews
 * merge parsoid patch for blubberfile
 * get Xdebug in mediawiki dockerfile
 * Hackathon prep/Hackathon
 * What I'm blocked on
 * Other?

Lars

 * What I plan to do this week
 * help Antoine with the train group 0
 * write reviews of self, Greg, peers
 * improve CI arch document, reach out for more feedback
 * start looking at what it takes to get one of the CI candidates running
 * What I'm blocked on
 * nada
 * Other?
 * nada

Mukunda

 * What I plan to do this week
 * Write reviews
 * T222638 Talk with Timo and Filippo about Grafana and Sentry
 * T222829 merge branch.py and make-wmf-branch
 * Try to get some movement on "T200392 release notes automation"
 * Read a book
 * What I'm blocked on
 * Other?

Tyler

 * What I plan to do this week
 * gerrit 2.15.13
 * working on code health stuff
 * pipeline policy file
 * gerrit logging
 * What I'm blocked on
 * Other?
 * annual review
 * book

Zeljko

 * What I plan to do this week
 * Prepare for Wikimedia hackathon (submit sessions and projects)
 * Attend the hackathon
 * What I'm blocked on
 * Other?

Team Kanban Board Review and Triage

 * Closed and touched in the last 7 days
 * No update for 4 weeks
 * No update for 3 weeks
 * No update for 2 weeks
 * No update for 1 week
 * All Open
 * Review To Triage column of #releng
 * Assigned
 * Unassigned

Once / month-ish review of backlog(s)

 * releng Review To Triage column of #releng
 * releng-kanban Review unassigned in kanban
 * releng-kanban Review 'backlog' column of -kanban
 * releng-next - Review for things we need to put on our kanban backlog
 * releng-backlog - oh my, the huge backlog of things...

Kanban stats

 * Burnup chart