Wikimedia Release Engineering Team/Checkin archive/20190318

= 2019-03-18 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * March 29–April 1: James out (New Hampshire)
 * April 9-12: Greg at tech-mgt F2F in Portland
 * April 17-19 (Wednesday - Friday) - Željko vacation
 * April 22 (WMF Holiday) - US Staff
 * April 22-27: Team offsite in Chicago
 * April 29: Moved WMF Holiday for US staff at offsite
 * May 1st - Lars, Antoine and Željko, Labor Day / May Day
 * May 8th - Antoine, 1945 victory
 * May 15 (Wednesday) - Željko vacation
 * May 16-20 - Wikimedia Hackathon 2019 (Prague, Czechia)
 * Attending: Greg, JR, Zeljko, James, and Jeena
 * May 30th-31st - Antoine, Feast of the Ascension
 * June 10th - Antoine, Pentecost -- see https://en.wikipedia.org/wiki/Eastertide for Antoine/France Easter holidays
 * May 27 (Memorial Day) - US Staff
 * June 6-7 - Brennen, Apogaea
 * June 19 (Juneteenth) - US Staff
 * July 22 - August 9 - Željko vacation
 * August 25 - September 4 - Brennen vacation

Train

 * Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R


 * Jan 07 - wmf.12 - Dan
 * Jan 14 - wmf.13 - Dan
 * Jan 21 - wmf.14 - Mukunda
 * Jan 28 - wmf.15 - No Train (All Hands)
 * Feb 04 - wmf.16 - Mukunda
 * Feb 11 - wmf.17 - Tyler
 * Feb 18 - wmf.18 - Tyler
 * Feb 25 - wmf.19 - Antoine
 * Mar 04 - wmf.20 - Antoine
 * Mar 11 - wmf.21 - Zeljko
 * Mar 18 - wmf.22 - Zeljko
 * Mar 25 - wmf.23 - Dan
 * Apr 01 - wmf.24 - Dan
 * Apr 08 - wmf.25 - Mukunda
 * Apr 15 - 1.34.0-wmf.1 - Mukunda
 * Apr 22 - wmf.2 - NO TRAIN, team offsite
 * Apr 29 - wmf.3 - Tyler
 * May 06 - wmf.4 - Tyler
 * May 13 - wmf.5 - Antoine
 * May 20 - wmf.6 - Antoine
 * May 27 - wmf.7 - Zeljko
 * June 03 - wmf.8 - Zeljko

SoS

 * Zeljko 4eva! :)

Book club

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club
 * Notes: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club/Continuous_Delivery
 * Next: March 21st at the "same" time (9am Pacific/16:00 UTC)

Spring Offsite

 * Location: Chicago, IL (Central timezone, UTC-5 while we're there)
 * Dates: Arrive Monday 4/22, Depart Saturday 4/27.
 * BOOK YOUR FLIGHTS BY: March 21
 * Activity day
 * Fill out the spreadsheet: https://docs.google.com/spreadsheets/d/1zqO8Mk1wUU2ZtyAM9xU68CQTpJFEOPALfDKCj7aMNo4/edit
 * Program:
 * start listing your topics! https://etherpad.wikimedia.org/p/releng-offsite-201904-topics

Monthly reflection on accomplishments - March '19 edition

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Monthly_notable_accomplishments
 * Add as you have them!


 * CI tooling future WG started, blogged
 * GerritBot comments on patches going through the pipeline (with fancy badges and the like)
 * Train deploy notes are now automatically generated on branch push
 * Scap 3.9.2-1 released in production
 * Phabricator upgrade: https://phabricator.wikimedia.org/phame/post/view/147/projects_forms_and_subtypes_oh_my/
 * Published the ISOSTWG results and recommendation on officewiki and announced: https://office.wikimedia.org/wiki/Internal_Support_for_Open_Source_Tools_Working_Group
 * SWAT tags now show up in the deployment schedule (via Lua magic)

Q4 Goals planning

 * etherpad: https://etherpad.wikimedia.org/p/releng-1819Q4-goals
 * Due: Monday March 18th, aka this Friday

Posted online at their respective locations:
 * https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC12:_Developer_Productivity/Goals#Q4_Goals
 * https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC3:_Deployment_Pipeline/Goals#Q4_Goals
 * https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC13:_Code_Health/Goals#Q4_Goals
 * https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC1:_Reliability,_Performance,_and_Maintenance/Goals#Q4_Goals

Annual Planning is coming up

 * 2019-03-13: I emailed mark re future testing/"evaluation" environments
 * See notes here: https://docs.google.com/document/d/1QU_6Svn4iduK0TPLSOghYP4g1lK-byCv-0ZKoHfIAVY/edit#heading=h.6gq2j7lm5pz8
 * 2019-03-18: updates....

Pywikibot CI

 * https://phabricator.wikimedia.org/T132138
 * Antoine to take a time boxed look into this, this week
 * 2019-03-18: Antoine was blocked last week

Merge blocker: The table 'l10n_cache' is full in quibble-vendor-mysql-hhvm-docker

 * https://phabricator.wikimedia.org/T217654
 * "The bump from 256M to 320M must be good enough and I have updated the Jenkins jobs. Lowering priority to High." -- https://phabricator.wikimedia.org/T217654#5020364
 * closed

Merge blocker: quibble-vendor-mysql-hhvm-docker in gate fails for most merges (exit status -11)

 * https://phabricator.wikimedia.org/T216689
 * "I have rollbacked the jobs container:" -- https://phabricator.wikimedia.org/T216689#5020757
 * See T218209 though. :-(
 * closed
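
For context on the -11 in the task title, here is a minimal sketch. It assumes the job runner reports child exit statuses the way Python's subprocess module does, where a negative value -N means the child process was killed by signal N. The helper name describe_exit_status is hypothetical and is not part of Quibble or the Jenkins jobs.

```python
import signal


def describe_exit_status(returncode: int) -> str:
    """Turn a subprocess-style return code into a readable description.

    Assumption: negative values follow the Python subprocess convention,
    i.e. -N means "terminated by signal N".
    """
    if returncode < 0:
        signum = -returncode
        try:
            # signal.Signals maps a signal number to its symbolic name.
            name = signal.Signals(signum).name
        except ValueError:
            # Number does not correspond to a signal known on this platform.
            name = f"unknown signal {signum}"
        return f"killed by {name}"
    return f"exited with code {returncode}"


if __name__ == "__main__":
    for status in (0, 1, -11):
        print(status, "->", describe_exit_status(status))
```

On Linux, signal 11 is SIGSEGV, so a job reporting exit status -11 most likely had a child process that segfaulted rather than a test that failed normally.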

Merge blocker: Failed to create /nonexistent/.pki/nssdb directory

 * https://phabricator.wikimedia.org/T218209
 * Caused by revert for T216689
 * closed

FYI: Wikimedia-production-error (Shared Build Failure)

 * https://phabricator.wikimedia.org/project/profile/3298/

Cannot access beta cluster db

 * https://phabricator.wikimedia.org/T217938
 * Mukunda to take a look
 * joe claimed the task, has some patches

Deploy Extension:WikimediaEditorTasks to Beta

 * https://phabricator.wikimedia.org/T218137
 * needed today
 * James can and will deal

branch cutting

 * Our current branch cut method is broken because the HTTP token on Gerrit was disabled for security reasons.
 * TODO: create a task about this, add to train as a blocker
 * https://phabricator.wikimedia.org/T218597
 * Tyler and Mukunda and $OTHERS to chat after this meeting

Scrum of Scrums

 * Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums

Incoming from last week

 * Blocking:

Release Engineering

 * Blocked by:
 * Blocking:
 * Updates:
 * Help my CI job fails with exit status -11 https://phabricator.wikimedia.org/phame/post/view/152/help_my_ci_job_fails_with_exit_status_-11/
 * Train Health:
 * Last week: 1.33.0-wmf.21 - https://phabricator.wikimedia.org/T206675
 * This week: 1.33.0-wmf.22 - https://phabricator.wikimedia.org/T206676
 * Next week: 1.33.0-wmf.23 - https://phabricator.wikimedia.org/T206677
 * Code Health:

Callouts

 * Release Engineering

Train status and happenings

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor

Quarterly Goals for Q3
https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2018-19_Q3

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Automate the generation of change log notes
 * WHO: Mukunda, (Tyler on backup)


 * should now run on branch cut https://integration.wikimedia.org/ci/job/train-deploy-notes/

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Investigate notification methods for developers with changes that are riding any given train
 * WHO: Mukunda, Tyler

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Instrument Quibble for data collection
 * WHO: Mukunda, Antoine


 * I haven't gotten any responses about where to put the data. Hopefully Graphite and Prometheus will work. Otherwise, I guess Logstash?

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Create a graph where time is spent and make a prioritized list for improvements.
 * WHO: Mukunda, Antoine

TEC3 (Pipeline): Outcome 2 / Output 2.1

 * GOAL: Select and integrate a code health metric solution into our tooling.
 * WHO: JR, ...

TEC3 (Pipeline): Outcome 3 / Output 3.1

 * GOALS:
 * Adopt more services into Deployment pipeline -
 * cxserver, ORES (partially), citoid, changeprop, cpjobqueue (stretch)
 * Deploy eventgate
 * WHO: Dan, Tyler, Lars


 * cxserver
 * Images built via deployment pipeline
 * Namespaces created for k8s eqiad/codfw
 * helm charts created


 * ✅ citoid
 * Images built via deployment pipeline
 * Deployed
 * Traffic switched


 * changeprop


 * ✅ eventgate
 * Image built via pipeline
 * Chart
 * Deployed


 * ORES
 * cf: Dan's comments

TEC12 (DevProd): Outcome 1 / Output 1.1

 * GOAL: Conduct interviews with development stakeholders and compile a report that informs future work creation of a rubric.
 * WHO: Jeena, Mukunda


 * ✅ Results are posted: https://www.mediawiki.org/wiki/Developer_Satisfaction

TEC13 (Code Health): Outcome 1 / Output 1.1

 * GOALs:
 * Develop and communicate guidelines and best practices for successful Code Stewardship.
 * (Continued from Q2) Update/refresh review queue (review process for initial code deployment)
 * WHO: JR

Relocated the Code Stewardship page and created the base structure for Resources/Best practices.

TEC13 (Code Health): Outcome 2 / Output 2.2

 * GOAL: 5 of the 15 prioritized repositories have at least 1 end-to-end test -
 * WHO: Zeljko

TEC13 (Code Health): Outcome 2 / Output 2.3

 * GOALs:
 * Evolve/develop tools and processes to support the PE refactoring effort to improve code health.
 * Develop a common test strategy that enables teams to engage in more effective and efficient testing practices. (maybe should be output 2.4?)
 * WHO: JR, Core Platform Team

TEC13 (Code Health): Outcome 3 / Output 3.2

 * GOALs:
 * Speak at All Hands on the status of Technical Debt
 * Engage and coach development teams on their approach to managing technical debt.
 * WHO: JR, Core Platform Team

TEC13 (Code Health): Outcome 4 / Output 4.1

 * GOALs: Code Health Dashboard with 50% of repositories covered.
 * WHO: JR, Core Platform Team

Waiting on patch review/merge from RelEng. Upon merge, all extensions will be able to run experimental to perform code analysis.

Phabricator

 * Vandalism revert tool should have been finished last week but that didn't happen, should be done this week.

Antoine

 * What I plan to do this week
 * pywikibot tests run- https://phabricator.wikimedia.org/T186208
 * Some Quibble improvements?!
 * Catch up on CI working group
 * What I'm blocked on
 * Please proofread/improve https://phabricator.wikimedia.org/phame/post/view/152/help_my_ci_job_fails_with_exit_status_-11/
 * And sign at the bottom if you did any fix :-]
 * Other?
 * Strike at school on Tuesday so kids will be at home some part of the day

Brennen

 * What I plan to do this week
 * CI WG
 * Run through Zuul v3 quick start
 * https://phabricator.wikimedia.org/T218138
 * https://zuul-ci.org/docs/zuul/admin/quick-start.html
 * Finish local-charts sshfs scripting - https://gerrit.wikimedia.org/r/c/releng/local-charts/+/497013
 * If longma's not already on it, tackle mediawiki blubber.yaml definition - https://phabricator.wikimedia.org/T218360
 * Read next 2 chapters of book
 * What I'm blocked on
 * Nothing
 * Other?
 * Fiddling with YubiKey 5 and SSH keys

Dan

 * What I plan to do this week
 * Implementing .pipeline/config.yaml https://phabricator.wikimedia.org/T210267
 * Drafting email to Analytics re: long-term event log storage
 * Evaluating Jenkins X
 * What I'm blocked on
 * Nada
 * Other?

Greg

 * What I plan to do this week
 * Quality discussion
 * Schedule a meeting for us before offsite to talk annual planning kickoff
 * TechConf planning meeting and follow-up with Deb/etc
 * "Wikimedia Foundation's Health and WellBeing Benefits Survey", due Friday March 22nd
 * will email reminder to team list
 * Write down some more notes about CD book
 * What I'm blocked on
 * A bit sick as well :/
 * Other?

James

 * What I plan to do this week
 * Most SDC stuff (potentially bumpy train deployment, as there's a DOM change for Commons File pages)
 * Train blocker fun
 * More CD reading.
 * What I'm blocked on
 * Other?

Jean-Rene

 * What I plan to do this week
 * Continue work on test strategy
 * continue work of Code Stewardship best practices
 * Q4 Code Health Metrics WG goals.
 * Start work on DevEd Unit testing work with Guillaume
 * What I'm blocked on
 * Other?

Jeena

 * What I plan to do this week
 * Figure out helm charts issue turning number strings into floats. Then finish and test mediawiki automated install for local-charts
 * Update my computer to try and stop it from frrreezing and shutting down
 * Read book
 * Work on documentation for local-charts
 * What I'm blocked on
 * Other?

Lars

 * What I plan to do this week
 * Read CD book chapter 7, prepare for and participate in book club meeting on Thursday.
 * Finish the CI WG work as much as possible (deadline on Monday next week).
 * What I'm blocked on
 * Other?

Mukunda

 * What I plan to do this week
 * Figure out storage for quibble instrumentation
 * Finish deploying vandalism revert tool in phabricator
 * Document branch cut via ssh / pushInsteadOf
 * What I'm blocked on
 * No storage for metrics from Quibble. I'm hoping to use Prometheus to collect the metrics if I can figure it out.
 * Other?

Tyler

 * What I plan to do this week
 * fix wikimedia branch cut docs
 * blubber policy; made upstream patch
 * kosta and paladox review. My review queue is backed up :(
 * What I'm blocked on
 * Other?
 * sick :(
 * brain scatter

Zeljko

 * What I plan to do this week
 * T206676 1.33.0-wmf.22 deployment blockers
 * T217325 Consider and evaluate possible new CI tooling
 * What I'm blocked on
 * Other?
 * code health metrics blocked on releng (Antoine/Tyler):
 * https://gerrit.wikimedia.org/r/c/integration/config/+/494548 integration/config Remove requirement for properties file, import coverage if present
 * https://gerrit.wikimedia.org/r/c/integration/quibble/+/497222 integration/quibble Add Parsoid to docker image and run for Selenium tests
 * https://phabricator.wikimedia.org/T218598 Generate code coverage and make it available to wmf-sonar-scanner

Team Kanban Board Review and Triage

 * closed and touched in the last 7 days
 * No update for 4 weeks
 * No update for 3 weeks
 * No update for 2 weeks
 * No update for 1 week
 * All Open
 * Review To Triage column of #releng
 * Assigned
 * Unassigned

Once / month-ish review of backlog(s)

 * releng Review To Triage column of #releng
 * releng-kanban Review unassigned in kanban
 * releng-kanban Review 'backlog' column of -kanban
 * releng-next - Review for things we need to put on our kanban backlog
 * releng-backlog - oh my, the huge backlog of things...

Kanban stats

 * Burnup chart