Wikimedia Release Engineering Team/Checkin archive/20190114

= 2019-01-14 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * January 20 - North American Lunar Eclipse: https://www.space.com/42976-blood-moon-lunar-eclipse-2019-coming-soon.html
 * January 21 (MLK Day) - US Staff - team meeting cancelled?
 * Moving to Wednesday, overwriting those 1:1s
 * January 28 - February 1 - All Hands
 * February 2 - February 9 - Lars on vacation
 * Fri. Feb. 8th - Mon. Feb. 17th - Antoine, school vacations
 * February 18 (President's Day) - US Staff
 * February 19 - March 1 - Dan, vacation
 * March 11 (WMF Holiday) - US Staff
 * April 22 (WMF Holiday) - US Staff
 * April 22nd - Antoine, Easter
 * May 1st - Antoine, labor day
 * May 8th - Antoine, 1945 victory
 * May 30th-31st - Antoine, Feast of the Ascension
 * June 10th - Antoine, Pentecost -- see https://en.wikipedia.org/wiki/Eastertide for Antoine/France Easter holidays
 * May 27 (Memorial Day) - US Staff
 * June 19 (Juneteenth) - US Staff

Train

 * Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R


 * Jan 07 - wmf.12 - Dan
 * Jan 14 - wmf.13 - Dan
 * Jan 21 - wmf.14 - Mukunda
 * Jan 28 - wmf.15 - No Train (All Hands)
 * Feb 04 - wmf.16 - Mukunda
 * Feb 11 - wmf.17 - Tyler
 * Feb 18 - wmf.18 - Tyler
 * Feb 25 - wmf.19 - Antoine
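
The schedule above is plain weekly arithmetic: one branch number per Monday, and the number still advances in the All Hands week even though no train runs. A minimal sketch of that pattern, assuming the dates are in 2019 (the year of these notes) and using a hypothetical helper name, `branch_for_week`:

```python
from datetime import date, timedelta

# Assumption: the first row above (Jan 07, wmf.12) is in 2019, the year of these notes.
FIRST_MONDAY = date(2019, 1, 7)
FIRST_BRANCH = 12


def branch_for_week(monday: date) -> str:
    """Return the wmf branch label for the week starting on `monday`."""
    weeks, remainder = divmod((monday - FIRST_MONDAY).days, 7)
    if remainder != 0 or weeks < 0:
        raise ValueError("expected a Monday on or after 2019-01-07")
    # The number advances every week, including weeks with no train.
    return f"wmf.{FIRST_BRANCH + weeks}"


if __name__ == "__main__":
    # Print the same eight weeks that are listed above.
    for i in range(8):
        monday = FIRST_MONDAY + timedelta(weeks=i)
        print(monday.strftime("%b %d"), "-", branch_for_week(monday))
```

Who conducts each train and which weeks are skipped are still decided by hand; the sketch only covers the numbering.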

SoS

 * Zeljko 4eva! :)

Book club

 * It's now a thing: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club
 * Buy your copy and expense it.
 * SUMMARY: Read Part I
 * Greg to find a magical time to discuss

Spring Offsite

 * Give me your "no travel" dates: https://docs.google.com/spreadsheets/d/1q_y0rmSKbi-qzDtHbDES2jEANWN2CZ4TxpHAaXFfGxU/edit#gid=0
 * 2019-04-22
 * 2019-05-06

Gerrit 2.15.7

 * https://phabricator.wikimedia.org/T210785
 * After break
 * Prep this week
 * Tyler to ping Antoine
 * Jeena may join in as well
 * Dan may want to as well - depends on schedule + Train
 * 2019-01-07: thcipriani did nothing here
 * 2019-01-14: this is now 2.15.8
 * thcipriani updated stable-2.15
 * plan is to wrestle plugins today
 * hopefully this week

LFS objects are not mirroring from Github through Phab to Gerrit consistently

 * https://phabricator.wikimedia.org/T212818 -> now: https://phabricator.wikimedia.org/T212962
 * We need to figure out the correct long term solution here.
 * Constraints:
 * They want git-lfs support
 * They also want to have development happening on github ("external contributors")
 * question: can we just use git-fat?
 * Antoine: probably because ORES models stored outside of git predate our adoption of git-fat and at the time we went with lfs
 * Possible outcome: migrate to git-fat and archiva?
 * references/see also:
 * https://phabricator.wikimedia.org/T181678 - Plan migration of ORES repos to git-lfs - Nov 2017 - June 2018
 * https://phabricator.wikimedia.org/T180627 - Support git-lfs in scap
 * 2019-01-14:
 * LFS doesn't support mirroring automatically
 * Mirroring ***from*** github may be broken in other ways, not sure

Mukunda to follow up on: https://phabricator.wikimedia.org/T212962 - articlequality repo mirroring is broken
 * this is more complex than it seemed initially, I'm working on sorting this out


 * TODO: Greg, make time on Tuesday for this to be a discussion topic, especially wrt Continuous Deployment and mid-term planning

Scrum of Scrums

 * Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums

Incoming from last week

 * Blocking: nothing...

Release Engineering

 * Blocked by:
 * Blocking:
 * Updates:
 * Train Health:
 * Last week: 1.33.0-wmf.12 - https://phabricator.wikimedia.org/T206666 - no problemo
 * This week: 1.33.0-wmf.13 - https://phabricator.wikimedia.org/T206667 - no problemo (so far) :P
 * Next week: 1.33.0-wmf.14 - https://phabricator.wikimedia.org/T206668 - last train before All Hands (no train that week)
 * Log Health:
 * Code Health:

Callouts

 * Release Engineering

Train status and happenings

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor


 * That 60 second timeout issue is noisy
 * The new normal? Do we filter it out?
 * https://phabricator.wikimedia.org/T204871

Quarterly Goals for Q3
https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2018-19_Q3

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Automate the generation of change log notes
 * WHO: Mukunda, (Tyler on backup)


 * There is a Jenkins Job!
 * Docs updated https://wikitech.wikimedia.org/wiki/Heterogeneous_deployment/Train_deploys#Update_deploy_notes
 * Next step: figure out what versions to generate for automagically
 * Working on the wednesday

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Investigate notification methods for developers with changes that are riding any given train
 * WHO: Mukunda, Tyler


 * Maybe we should send an email this week?
 * add all committers to train task
 * TODO: Tyler to draft a thing

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Instrument Quibble for data collection
 * WHO: Mukunda, Antoine

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Create a graph of where time is spent and make a prioritized list for improvements.
 * WHO: Mukunda, Antoine

TEC3 (Pipeline): Outcome 2 / Output 2.1

 * GOAL: Select and integrate a code health metric solution into our tooling.
 * WHO: JR, ...

TEC3 (Pipeline): Outcome 3 / Output 3.1

 * GOALS:
 * Adopt more services into Deployment pipeline -
 * cxserver, ORES (partially), citoid, changeprop, cpjobqueue (stretch)
 * Deploy eventgate
 * WHO: Dan, Tyler, Lars


 * eventgate-ci has a pipeline-produced image: https://people.wikimedia.org/~thcipriani/docker/wikimedia/eventgate-ci/tags/
 * New Blubber docs (complete with fancy logo): https://wikitech.wikimedia.org/wiki/Blubber
 * Debating how to do more than one service per repo: https://phabricator.wikimedia.org/T210267

TEC12 (DevProd): Outcome 1 / Output 1.1

 * GOAL: Conduct interviews with development stakeholders and compile a report that informs future work and the creation of a rubric.
 * WHO: Jeena, Mukunda


 * Did some interviews last week
 * More this week
 * Need to record the results somewhere outside of my notebook

TEC13 (Code Health): Outcome 1 / Output 1.1

 * GOALs:
 * Develop and communicate guidelines and best practices for successful Code Stewardship.
 * (Continued from Q2) Update/refresh review queue (review process for initial code deployment)
 * WHO: JR

TEC13 (Code Health): Outcome 2 / Output 2.2

 * GOAL: 5 of the 15 prioritized repositories have at least 1 end-to-end test -
 * WHO: Zeljko


 * commit for one of the selected repos created before I contacted them https://gerrit.wikimedia.org/r/c/mediawiki/extensions/AbuseFilter/+/476519
 * contacted people from all relevant teams (could not find mail lists for teams) including #releng ;)
 * should have added Greg to cc to make it more scary :)
 * got a contact for PageTriage, followed up
 * Flow said no thanks
 * wmde and fr-tech people forwarded to their list, no reply yet
 * language-eng looped in more people, no reply yet
 * releng replied, no next steps yet ;P

TEC13 (Code Health): Outcome 2 / Output 2.3

 * GOALs:
 * Evolve/develop tools and processes to support the PE refactoring effort to improve code health.
 * Develop a common test strategy that enables teams to engage in more effective and efficient testing practices. (maybe should be output 2.4?)
 * WHO: JR, Core Platform Team

TEC13 (Code Health): Outcome 3 / Output 3.2

 * GOALs:
 * Speak at All Hands on the status of Technical Debt
 * Engage and coach development teams on their approach to managing technical debt.
 * WHO: JR, Core Platform Team

TEC13 (Code Health): Outcome 4 / Output 4.1

 * GOALs: Code Health Dashboard with 50% of repositories covered.
 * WHO: JR, Core Platform Team

Selenium

 * ERR No Time

Phabricator

 * Meeting with Evan Priestley today about Corey's plan to fund upstream development on a couple of features:
 * Reporting/Charting - $9,000
 * Workboard Column Triggers - $4,000
 * Workboard Realtime Updates - $3,000
 * Task Types - Needs Discussion
 * PERT/Dependency Graph - Needs Discussion
 * More Flavors of Dependencies - Needs Discussion

QA/Code Health

 * Code Health Metrics Workgroup (Zeljko and Kosta) will be doing a 5 min SonarQube demo this Thursday at the Tech-Demo meeting.
 * The video! https://www.youtube.com/watch?v=PExZ2o5luos (private to wikimedia.org accounts)

SCAP
Python 2/3 compat changes needing review:
 * scap lock
 * scap say

Antoine

 * What I plan to do this week
 * Continue on CI-slipway (migrate out of permanent slaves) https://phabricator.wikimedia.org/project/view/3722/
 * CI jobs now use tox 2.9.1 (was 2.6.0). Cache corruption for binary wheels since we switched from Jessie to Stretch, libs are different (typically mysql-python linked in cache to libmysqlclient.so.18 which does not exist in Stretch). Had to nuke castor cache.
 * Timo migrating npm jobs to NodeJS 10 (and npm 6)
 * What I'm blocked on
 * E too many things (haven't looked at optimizing Quibble (marble: https://phabricator.wikimedia.org/project/view/3765/ ))
 * No bandwidth to context switch to the SonarQube effort :-/
 * Does no-antoine block this work?
 * Tyler has reviewed the commit, so not blocked.
 * Ah coool!! -- Antoine :)
 * Other?
 * Pipeline migration related:
 * Scrum of Scrum wikidata/query/gui https://phabricator.wikimedia.org/T210286, blocked on providing a Nginx image https://phabricator.wikimedia.org/T209292
 * wikimedia/portals ( for https://www.wikipedia.org/ ) deployed as submodule of operations/mediawiki-config and served by MediaWiki app servers. Maybe it can be migrated to k8s instead?
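
The castor cache problem noted above (wheels built on Jessie and linked to libmysqlclient.so.18, which Stretch lacks) can be checked for directly by asking the dynamic loader whether the soname still opens. A minimal sketch, assuming a Linux host; `soname_loads` is a hypothetical helper and not something the CI jobs are known to run:

```python
import ctypes


def soname_loads(soname: str) -> bool:
    """Return True if the dynamic loader can open `soname` on this host."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # The loader could not find or open the library.
        return False
    return True


if __name__ == "__main__":
    # libmysqlclient.so.18 is the soname named in the note above.
    name = "libmysqlclient.so.18"
    if soname_loads(name):
        print(name, "loads")
    else:
        print(name, "is missing; cached wheels linked to it are stale")
```

A check like this only detects the mismatch; clearing the cache is still the fix.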

Dan

 * What I plan to do this week
 * Manually defining artifacts results in default copy of all project files
 * Antoine: CI uses LOG_DIR env variable as a convention, default to /log iirc for containers (used to be $WORKSPACE/log ). So then Jenkins just archives log/* :)
 * This is different :)
 * Train
 * What I'm blocked on
 * Other?
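
The LOG_DIR convention Antoine describes above amounts to "use the environment variable if set, otherwise a fixed default". A minimal sketch of that lookup, assuming the "/log" default he recalls ("iirc") is right; `resolve_log_dir` is a hypothetical name, not an existing CI helper:

```python
import os

# Assumption: "/log" is the container default recalled in the notes; not verified here.
DEFAULT_LOG_DIR = "/log"


def resolve_log_dir(env=None):
    """Return the directory a job should write logs to, honouring LOG_DIR."""
    env = os.environ if env is None else env
    # An unset or empty LOG_DIR falls back to the assumed default.
    return env.get("LOG_DIR") or DEFAULT_LOG_DIR


if __name__ == "__main__":
    print(resolve_log_dir())
```

Jobs that write under this directory get their logs archived without declaring artifacts by hand.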

Greg

 * What I plan to do this week
 * Read and respond to a question from the other Evan P(rodromou) re mutation testing: https://en.wikipedia.org/wiki/Mutation_testing
 * Guillaume L (gehel) gave a talk on this!
 * Still need to make myself decide if I have any feedback on the Gerrit Policy from Tim
 * CTO hiring
 * Other hiring
 * mid-year check-ins continuing
 * What I'm blocked on
 * Other?

Jean-Rene

 * What I plan to do this week
 * What I'm blocked on
 * Other?

Jeena

 * What I plan to do this week
 * interviews for local dev
 * Need to update the phabricator task
 * Record interview notes somewhere
 * If I have time, try to work on containerizing one of the services in our list
 * What I'm blocked on
 * Other?

Lars

 * What I plan to do this week
 * finish minikube + helm + blubberoid setup notes \o/
 * process feedback on my CD essay, continue discussion, maybe widen audience (wikitech? google doc? some mailing list?)
 * pipeline group?
 * start learning Go
 * start re-reading CD book
 * What I'm blocked on
 * brain capacity
 * Other?
 * more feedback on CD essay is welcome!

Mukunda

 * What I plan to do this week
 * Meet with Evan Priestley about funded phabricator changes
 * Still need to figure out what is actually broken in articlequality repo mirroring
 * review tyler's patches
 * gpg signing:
 * Try to figure out why my gpg subkeys don't validate
 * Re-sign the 1.32 build if I can't get the existing signature to validate
 * Mark the 1.32 release resolved. ( https://phabricator.wikimedia.org/T207529 )
 * What I'm blocked on
 * Other?

Tyler

 * What I plan to do this week
 * scap 2/3 compat work
 * draft email to increase developers' awareness of the train
 * Gerrit 2.15.8
 * What I'm blocked on
 * Other?

Zeljko

 * What I plan to do this week
 * T207044 Give a code health metrics talk at All Hands
 * Presenting at local meetup https://www.meetup.com/testival/events/257897967/
 * T206621 5 of the 15 prioritized repositories have at least 1 end-to-end test
 * contacted teams
 * one repo working on tests https://gerrit.wikimedia.org/r/c/mediawiki/extensions/AbuseFilter/+/476519
 * What I'm blocked on
 * T207046 Code health metrics spike
 * Kosta asked for review (Antoine and/or Tyler) https://gerrit.wikimedia.org/r/c/integration/config/+/475470
 * Other?
 * Uploaded photos from all offsites to team drive
 * Code Health Metrics Workgroup 5 min SonarQube demo
 * video: https://www.youtube.com/watch?v=PExZ2o5luos (private to wikimedia.org accounts)
 * blog post: https://phabricator.wikimedia.org/phame/post/view/133/code_health_metrics_and_sonarqube/

Team Kanban Board Review and Triage

 * closed and touched in the last 7 days
 * No update for 4 weeks
 * No update for 3 weeks
 * No update for 2 weeks
 * No update for 1 week
 * All Open
 * Review To Triage column of #releng
 * Assigned
 * Unassigned

Once / month-ish review of backlog(s)

 * releng Review To Triage column of #releng
 * releng-kanban Review unassigned in kanban
 * releng-kanban Review 'backlog' column of -kanban
 * releng-next - Review for things we need to put on our kanban backlog
 * releng-backlog - oh my, the huge backlog of things...

Kanban stats

 * Burnup chart