Wikimedia Release Engineering Team/Checkin archive/20181217

= 2018-12-17 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * Fridays in December: Greg off
 * Dec 20-21: Dan - if Greg approves it :)
 * Dec 21st: thcipriani
 * Dec 21st: mukunda also
 * December 24 - January 1 - Holidays (Christmas + New Years)
 * Jan 2nd: mukunda maybe

Train

 * Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R


 * Dec 10 - wmf.8 - Zeljko
 * Dec 17 - wmf.9 - Zeljko
 * Dec 24 - wmf.10 - No Train (Holiday break)
 * Dec 31 - wmf.11 - No Train (Holiday break)
 * Jan 07 - wmf.12 - Dan
 * Jan 14 - wmf.13 - Dan
 * Jan 21 - wmf.14 - Mukunda
 * Jan 28 - wmf.15 - No Train (All Hands)
 * Feb 04 - wmf.16 - Mukunda
 * Feb 11 - wmf.17 - Tyler
 * Feb 18 - wmf.18 - Tyler
 * Feb 25 - wmf.19 - Antoine

SoS

 * Zeljko for ever :)

All Hands

 * Registration: https://office.wikimedia.org/wiki/All_hands/2019/Registration
 * Due Friday December 14th
 * Needed for everyone

Incoming Triage/Needs attention

 * Migrate the Integration cloud project to eqiad1-r
 * https://phabricator.wikimedia.org/T208803
 * 2018-11-12: Need a point person to work with Andrew on this
 * 2018-11-19: Tyler and Andrew migrated a few, no issues so far. integration-publishing migrated. castor02 can be migrated the same way as the other slaves; problems might occur, see task for details.
 * 2018-11-26: Need to migrate `castor` tomorrow morning.

Gerrit 2.15.7 https://phabricator.wikimedia.org/T210785
 * After break
 * Prep this week
 * Tyler to ping Antoine
 * Jeena may join in as well

Scrum of Scrums

 * Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums

Incoming from last week

 * Blocking: nothing...

Release Engineering

 * Blocked by:
 * SRE patch (re)review
 * jenkins agent on releases-jenkins
 * install docker on releases-jenkins
 * Blocking:
 * Updates:
 * Train Health:
 * Last week: 1.33.0-wmf.8 deployment blockers https://phabricator.wikimedia.org/T206662
 * This week: 1.33.0-wmf.9 deployment blockers https://phabricator.wikimedia.org/T206663
 * Next week: No train, holidays!
 * No Train (nor other deploys) weeks of December 24th and December 31st
 * Log Health:
 * Code Health:

Callouts

 * Release Engineering

Train status and happenings

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Release MediaWiki 1.32
 * WHO: Mukunda, (Tyler on backup)


 * If no new blockers, we'll call it the final release
 * Cindy will review and decide.
 * A meeting this Thursday to make the call/do the release.

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Determine the procedure and requirements for an automated MediaWiki branch cut.
 * WHO: Mukunda, Tyler, Antoine


 * Few open patches to merge, listed in SoS
 * Requirements determined, getting them merged

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Formalize the collection of CI infrastructure and tooling metrics
 * WHO: Dan, Antoine


 * Trying to get _something_ done for this, but it's unclear at this point whether we should continue setting up Prometheus or go with ES
 * https://phabricator.wikimedia.org/T78705
 * vs. https://phabricator.wikimedia.org/T182759

tl;dr: talk with godog about what we want/need and what they can do with/for us.

TEC3 (Pipeline): Outcome 2 / Output 2.3

 * GOAL: Develop set of metrics to assess incident reports/post mortems
 * WHO: Greg, Zeljko

TEC3 (Pipeline): Outcome 3 / Output 3.1

 * GOALS:
 * Adopt more services into Deployment pipeline
 * Migrate graphoid to the Deployment pipeline
 * Deploy zotero v2 to the Deployment pipeline
 * Deploy blubberoid
 * WHO: Dan, Tyler, Lars


 * graphoid in stewardship review
 * zotero ✅
 * blubberoid today \o/

TEC12 (DevProd): Outcome 2 / Output 2.1

 * GOAL: The Annual Developer Productivity Survey results are synthesized and shared, creating a first year baseline.
 * WHO: Mukunda, Greg


 * Jeena started qualitative review/coding
 * ACTION: Mukunda, Jeena, and Greg to sync on Wednesday morning, get it out by Thursday

TEC13 (Code Health): Outcome 1 / Output 1.1

 * GOAL: Update/refresh review queue (review process for initial code deployment)
 * WHO: JR

No activity.

TEC13 (Code Health): Outcome 2 / Output 2.2

 * GOAL: 5 of the 15 prioritized repositories have at least 1 end-to-end test
 * WHO: Zeljko

A quarter late. :/ Željko will send an initial e-mail to repository owners this quarter. I think it is doable to catch up next quarter.

TEC13 (Code Health): Outcome 2 / Output 2.3

 * GOAL: Assess Platform unit test practices and define improvement plan
 * WHO: JR, Core Platform Team

Started conversation with CPT regarding Q3 goals. More to come this week.

TEC13 (Code Health): Outcome 3 / Output 3.2

 * GOAL: Core Platform and Search Platform teams are using TDM PoC
 * WHO: JR, Core Platform Team

done.

TEC13 (Code Health): Outcome 3 / Output 3.4

 * GOALS:
 * Identify key Tech Debt areas
 * Put in place Tech Debt management process for PEP
 * WHO: JR, Core Platform Team

done.

TEC13 (Code Health): Outcome 4 / Output 4.1

 * GOAL: Metrics defined and deployed for all 4 Code Health areas.
 * WHO: JR, Code Health Metrics Working Group

no activity. Q2 goals done.

TEC3 (Pipeline): Outcome 2 / Output 2.3

 * GOAL: Outline options for managing incident reports creation, follow-ups, and analysis
 * WHO: Greg, Mukunda, Zeljko
 * Task: https://phabricator.wikimedia.org/T208632


 * ACTION: Greg to review and probably resolve.
 * ACTION: Zeljko to write down his ideas of specific improvements to the report template and the script.

Selenium

 * No major progress this quarter. :|
 * T206624 Q2 Selenium framework improvements

Gerrit

 * 2.15.7 update prep

Phabricator

 * Got Corey introduced to Evan and got a conversation started about upstream work on the various Phabricator improvements we want to sponsor
 * We have the beginnings of a plan for which improvements to tackle ourselves, which ones to push upstream, etc.

QA/Code Health

 * Code Health Newsletter published today (finally). Kinda good that it was delayed, as I got to add the first "help wanted" request.

SCAP

 * Meeting with thcipriani, mmodell and effie later today
 * something about scap state?
 * Meeting on Friday went over the horrifying history and myriad functions of scap.
 * Working on what's needed for "percentage rollout" now that we have a shared understanding

Antoine

 * What I plan to do this week
 * What I'm blocked on
 * Other?
 * Other?
 * Other?

Dan

 * What I plan to do this week
 * setting up a meeting w/ filippo about collecting CI build metrics
 * seeing off Blubber
 * deciding on blubber logo (help! vote?)
 * What I'm blocked on
 * Other?
 * Other?

Greg

 * What I plan to do this week
 * Internal Support for Open Source Tools Working Group (mouthful) survey analysis
 * Dev Productivity survey with Jeena and Mukunda
 * Get Jeena registered for the Hackathon
 * Q3 goals meeting on Thursday - 3 hours /o\
 * Review and probably resolve https://phabricator.wikimedia.org/T208632
 * Vote on logos!
 * What I'm blocked on
 * Other?
 * Other?

Jean-Rene

 * What I plan to do this week
 * Define Code Review Workgroup scope
 * Work on metrics for incident reports: pushing out until Q3
 * Wrap up Q2 goal loose ends (tasks, etc.)
 * Wrap up Q3 goal planning
 * What I'm blocked on
 * Other?
 * Other?

Jeena

 * What I plan to do this week
 * Finish analyzing survey results
 * Participate in blubberoid deployment
 * Participate in Gerrit upgrade(?) (thcipriani: will set up a pairing session with Antoine this week)
 * Register for hackathon
 * What I'm blocked on
 * Other?
 * Other?

Lars

 * What I plan to do this week
 * Experiment with ways to store Jenkins builds data for longer than 14 days.
 * Re-do work to set up Helm and deploy Blubberoid to minikube, with notes.
 * Re-do UML sequence diagram delivery pipeline using seqdiag instead of plantuml.
 * https://wikitech.wikimedia.org/wiki/Streamlined_Service_Delivery_Design/research#Process_as_a_UML_sequence_diagram
 * What I'm blocked on
 * Broken hard disk at home.
 * Jet lag.
 * Other?
 * N/A

Mukunda

 * What I plan to do this week
 * Follow up with Corey Floyd and Evan Priestley about Phabricator upstream work.
 * Release MediaWiki 1.32.0 on Thursday (or cut another rc if we have any blockers)
 * Help with finishing the dev satisfaction survey results and publish them by EOW
 * Taking Friday off; preparing for next week's trip to Boston!


 * What I'm blocked on
 * Other?
 * Other?

Tyler

 * What I plan to do this week
 * Scap q3 goal
 * gerrit prep
 * end of q2/real year task cleanup
 * What I'm blocked on
 * Review -- two for pipeline:
 * https://gerrit.wikimedia.org/r/#/c/integration/config/+/476593/
 * https://gerrit.wikimedia.org/r/#/c/integration/config/+/476600/
 * Review to push to SoS (releases jenkins):
 * jenkins agent on releases-jenkins
 * install docker on releases-jenkins
 * Other?

Zeljko

 * What I plan to do this week
 * T206663 1.33.0-wmf.9 deployment blockers
 * T206621 5 of the 15 prioritized repositories have at least 1 end-to-end test
 * Send the e-mail :|
 * T204871 Investigate the spikes of "web request took longer than 60 seconds and timed out" during deployments
 * paperwork
 * What I'm blocked on
 * T206662 1.33.0-wmf.8 deployment blockers
 * T211885 ErrorException from line 47 of /srv/mediawiki/php-1.33.0-wmf.8/extensions/Kartographer/includes/ApiQueryMapData.php: PHP Warning: data error
 * https://wikitech.wikimedia.org/wiki/Incident_documentation/20181212-Train-1.33.0-wmf.8
 * Other?

Team Kanban Board Review and Triage

 * Closed and touched in the last 7 days
 * No update for 4 weeks
 * No update for 3 weeks
 * No update for 2 weeks
 * No update for 1 week
 * All Open
 * Review To Triage column of #releng
 * Assigned
 * Unassigned

Once / month-ish review of backlog(s)

 * releng Review To Triage column of #releng
 * releng-kanban Review unassigned in kanban
 * releng-kanban Review 'backlog' column of -kanban
 * releng-next - Review for things we need to put on our kanban backlog
 * releng-backlog - oh my, the huge backlog of things...

Kanban stats

 * Burnup chart