
Wikimedia Release Engineering Team/Checkin archive/20181029

From mediawiki.org


2018-10-29


Vacations/Important dates

https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
How to do it
  • November 1 (Thursday) - Holiday (All Saints' Day - Željko)
  • November 1st / 2nd - Holiday (Antoine)
  • November 12th - Holiday (Veterans Day, Observed)
  • November 22+23 - Holidays (Thanksgiving)
  • November 25 - December 2nd: Mukunda vacation (in California ahead of the offsite)
  • Week of December 3rd - Team offsite
  • December 24-28 - Holidays (Christmas)

Rotating positions


Train

Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R
  • Oct 08 - wmf.25 - Dan (No train due to DC switchover)
  • Oct 15 - wmf.26 - Mukunda (last 1.32 wmf.XX release, 1.33 starts the next week)
  • Oct 22 - wmf.1 - Mukunda (warning, TechConf happening, ping Greg if you need responses from anyone there...)
  • Oct 29 - wmf.2 - Tyler <----
  • Nov 05 - wmf.3 - Tyler
  • Nov 12 - wmf.4 - Antoine
  • Nov 19 - wmf.5 - No Train (Thanksgiving)
  • Nov 26 - wmf.6 - Antoine
  • Dec 03 - wmf.7 - No Train (Offsite)
  • Dec 10 - wmf.8 - Zeljko
  • Dec 17 - wmf.9 - Zeljko
  • Dec 24 - wmf.10 - No Train (Holiday break)
  • Dec 31 - wmf.11 - No Train (Holiday break)
  • Jan 07 - wmf.12 - Dan
  • Jan 14 - wmf.13 - Dan
  • Jan 21 - wmf.14 - Mukunda
  • Jan 28 - wmf.15 - No Train (All Hands)
  • Feb 04 - wmf.16 - Mukunda
  • Feb 11 - wmf.17 - Tyler
  • Feb 18 - wmf.18 - Tyler
  • Feb 25 - wmf.19 - Antoine
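
The numbering in the schedule above advances by one every Monday, including the "No Train" weeks. A minimal sketch of that arithmetic, assuming the pattern holds for the whole 1.33 cycle (the function name and the constant are illustrative and not part of any team tooling):

```python
from datetime import date

# Assumption: 1.33.0-wmf.1 falls on Monday 2018-10-22, as listed in the schedule.
FIRST_TRAIN_MONDAY = date(2018, 10, 22)


def wmf_version_for(monday: date) -> str:
    """Return the 1.33 wmf version scheduled for the given Monday.

    Hypothetical helper: it assumes one version number per week with no
    gaps, so weeks without a train still consume a number.
    """
    delta_days = (monday - FIRST_TRAIN_MONDAY).days
    if delta_days < 0 or delta_days % 7 != 0:
        raise ValueError("expected a Monday on or after 2018-10-22")
    return f"1.33.0-wmf.{1 + delta_days // 7}"
```

For example, `wmf_version_for(date(2018, 10, 29))` gives the version Tyler is conducting this week.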


SoS

  • Oct 10 - Zeljko
  • Oct 17 - Zeljko
  • Oct 24 - Zeljko
  • Oct 31 - Zeljko <----
  • Nov 07 - Zeljko
  • Nov 14 - Zeljko
  • Nov 21 - Zeljko
  • Nov 28 - Zeljko
  • Dec 05 - Zeljko
  • Dec 12 - Zeljko
  • Dec 19 - Zeljko
  • Dec 26 - Zeljko
  • Jan 02 - Zeljko
  • Jan 09 - Zeljko
  • Jan 16 - Zeljko
  • Jan 23 - Zeljko
  • Jan 30 - Zeljko
  • Feb 06 - Zeljko
  • Feb 13 - Zeljko
  • Feb 20 - Zeljko
  • Feb 27 - Zeljko

Team Business


Hiring

  • update....

First Offsite


Details:

  • Week of December 3rd
  • At the Queen Mary hotel in Long Beach
  • Deb T will be facilitating

Topics!

Needs attention




  • deployment-prep region migration
    • See email with same subject on releng@lists
    • Question: incrementally or not?
      • looks like "however Andrew wants to do it"
      • REMINDER: send an email update to wikitech-l@/qa@ with the planned timeline/outage
      • Tyler to reply saying "take it away, andrew, and when are you going to do it?"

Scrum of Scrums

Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums

Incoming from last week

  • Blocking:
    • Fundraising Tech: CRM tests still regularly failing due to full mysql partition on integration hosts. Possible fix noted by Eileen on https://phabricator.wikimedia.org/T205950
      • ACTION: Tyler to comment on the task with his proposal

Outgoing this week (wrong section heading is on purpose for copy/pasting into Scrum of Scrums etherpad)


Release Engineering

  • Blocked by:
  • Blocking:
  • Updates:
    • Train Health:
      • Last week: 1.33.0-wmf.1 deployment blockers https://phabricator.wikimedia.org/T206655
        • Six blockers closed, one opened on Friday after conclusion of the train
        • Group1 was delayed until Thursday and Group2 finally happened late Thursday evening.
        • Otherwise it was a mostly uneventful train. The blockers were perhaps more worrisome than destructive; no outages occurred.
      • This week: 1.33.0-wmf.2 deployment blockers https://phabricator.wikimedia.org/T206656
      • Next week:
    • Log Health:
    • Code Health:

Callouts

  • Release Engineering

Train status and happenings

https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor
  • Six blockers closed, one opened on Friday after conclusion of the train
  • Group1 was delayed until Thursday and Group2 finally happened late Thursday evening.
  • Otherwise it was a mostly uneventful train. The blockers were perhaps more worrisome than destructive; no outages occurred.
  • Mukunda: TODO: simplified log version of incident report
  • OMG discussion on how to do this better
    • TODO: Greg to file a task

Quarterly Goals for Q2


TEC1 (Maint): Outcome 1 / Output 1.1

GOAL: Determine the procedure and requirements for an automated MediaWiki branch cut.
WHO: Mukunda, Tyler, Antoine

- Filed task for figuring out job storage for releases-jenkins


TEC3 (Pipeline): Outcome 1 / Output 1.2

GOAL: Formalize the collection of CI infrastructure and tooling metrics
WHO: Dan, Antoine


TEC3 (Pipeline): Outcome 2 / Output 2.3

GOAL: Develop set of metrics to assess incident reports/post mortems - task T206622
WHO: Greg, Zeljko

https://docs.google.com/spreadsheets/d/1AUqMgzThBHNL7DgI8C9PO_YQ1oD5CSd0iWvcVbowzdg/edit#gid=1154483822


TEC3 (Pipeline): Outcome 3 / Output 3.1

GOALS:
Adopt more services into Deployment pipeline - task T205919
Migrate graphoid to the Deployment pipeline
Deploy zotero v2 to the Deployment pipeline
Deploy blubberoid
WHO: Dan, Tyler, Lars



TEC12 (DevProd): Outcome 2 / Output 2.1

GOAL: The Annual Developer Productivity Survey results are synthesized and shared, creating a first year baseline.
WHO: Mukunda, Greg


TEC13 (Code Health): Outcome 1 / Output 1.1

GOAL: Update/refresh review queue (review process for initial code deployment)
WHO: JR

Reviewed existing review queue process


TEC13 (Code Health): Outcome 2 / Output 2.2

GOAL: 5 of the 15 prioritized repositories have at least 1 end-to-end test - task T206621
WHO: Zeljko


TEC13 (Code Health): Outcome 2 / Output 2.3

GOAL: Assess Platform unit test practices and define improvement plan
WHO: JR, Core Platform Team

No progress

TEC13 (Code Health): Outcome 3 / Output 3.2

GOAL: Core Platform and Search Platform teams are using TDM PoC
WHO: JR, Core Platform Team

No progress

TEC13 (Code Health): Outcome 3 / Output 3.4

GOALS:
Identify key Tech Debt areas
Put in place Tech Debt management process for PEP
WHO: JR, Core Platform Team

Discussion with Editing team re Code Health and Tech Debt.

TEC13 (Code Health): Outcome 4 / Output 4.1

GOAL: Metrics defined and deployed for all 4 Code Health areas.
WHO: JR, Code Health Metrics Working Group

WG worked on a spike task focused on getting a single metric running in our CI environment using a single tool (SonarQube). Antoine deployed phpmetrics into CI (https://integration.wikimedia.org/ci/job/mediawiki-core-phpmetrics-docker/3/console, https://doc.wikimedia.org/mediawiki-core/master/phpmetrics/, and https://doc.wikimedia.org/mediawiki-core/master/phpmetrics/violations.html).

Other work


Selenium


Gerrit

  • Upgrade this week barring anything strange from the mailing lists
    • Need to test out upgrade on local instance

Phabricator

  • I've implemented a prototype extension that can map urls like https://phabricator.wikimedia.org/train/1.33.0-wmf.1/ to the corresponding train blockers task.
    • This will also support /train/current to map to the currently active train.
    • This can also support other task-series schemes such as /swat/2018-10-31.1/
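
The URL mapping described above could be sketched roughly as follows. This is not the prototype extension itself (which is not shown in these notes), only an illustration in Python of the lookup idea. The task IDs come from the Train Health notes in this check-in; `resolve_train_path`, the lookup table, and the choice of "current" train are assumptions made for the example.

```python
import re
from typing import Optional

BASE_URL = "https://phabricator.wikimedia.org"

# Version -> blocker task, taken from the Train Health notes in this check-in.
TRAIN_BLOCKER_TASKS = {
    "1.33.0-wmf.1": "T206655",
    "1.33.0-wmf.2": "T206656",
}

# Assumption: "current" means this week's train (week of Oct 29).
CURRENT_TRAIN = "1.33.0-wmf.2"

# Matches /train/<key> with an optional trailing slash.
TRAIN_PATH = re.compile(r"^/train/(?P<key>[^/]+)/?$")


def resolve_train_path(path: str) -> Optional[str]:
    """Map a /train/<version>/ path to its blocker task URL.

    Hypothetical helper: returns None when the path is not a train path
    or when no task is known for the requested version.
    """
    match = TRAIN_PATH.match(path)
    if match is None:
        return None
    key = match.group("key")
    if key == "current":
        key = CURRENT_TRAIN
    task = TRAIN_BLOCKER_TASKS.get(key)
    if task is None:
        return None
    return f"{BASE_URL}/{task}"
```

A scheme such as /swat/2018-10-31.1/ would need its own pattern and lookup table; the sketch above deliberately returns None for it.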


Jenkins


QA


SCAP


Standup!


Antoine

  • What I plan to do this week
    • Complete Wikibase tests comparison and migrate to Docker
    • Progress on mediawiki extensions dependencies doc
  • What I'm blocked on
    • No feedback from fundraising team for DonationInterface, will probably just switch it
  • Other?
    • Not there 11/1 and 11/2 (holiday + relocating)


Dan

  • What I plan to do this week
    • Continue with Jenkins/Prometheus
  • What I'm blocked on
    • Nada
  • Other?


Greg

  • What I plan to do this week
    • take the rest of today and most of tomorrow off, I lost a weekend and a half with the travel/work last week
    • catch up on l10nupdate follow-ups
    • follow-up from TechConf program committee (cleaning/sanitizing notes and posting to wiki mostly)
    • follow-up from TechConf hallway session asks
    • Phabricator meeting on Wednesday
  • What I'm blocked on
    • sickness (just your basic cold from hanging around people from around the world)
  • Other?
    • I won't make you all do 8 days straight of work without a break somewhere in there, ftr


Jean-Rene

  • What I plan to do this week
    • Continue work on Code Health Metrics
    • Continue work on Review Queue/RoO
  • What I'm blocked on
  • Other?


Lars

  • What I plan to do this week
    • Try to understand how Blubberoid works.
    • See if I can find a way to run it locally.
    • If I can run it locally, sketch out the beginnings of a user guide.
    • Start a sketch of a very high level architecture diagram of how the deployment pipeline works.
    • Browse the various team Kanban boards to be comfortable with them.
  • What I'm blocked on
    • Nada
  • Other?
    • Zilch


Mukunda

  • What I plan to do this week
    • Dev Productivity Survey is ready to go out this week
    • Pairing with Tyler on keyholder and scap stuff
    • Continue work on `scap swat` and swat-in-phab stuff
    • Phabricator improvements meeting Wednesday
    • Train incident report
    • Work on paring down my list of open tasks
  • What I'm blocked on
  • Other?


Tyler

  • What I plan to do this week
    • Gerrit
    • Scap
    • Train
    • Finish keyholder
  • What I'm blocked on
  • Other?


Zeljko

  • What I plan to do this week
    • T199133 Find top 15 target projects that could use Selenium tests to prevent incidents - almost there(tm): I have all the data and am working on a report (one big sheet to rule them all) - need help from Greg and/or JR with picking 5 repos once the report is ready
    • T207046 Code health metrics spike - need help from Antoine
  • What I'm blocked on
  • Other?


Grooming


Team Kanban Board Review and Triage



Once / month-ish review of backlog(s)



Kanban stats

Burnup chart