Wikimedia Release Engineering Team/Checkin archive/20190304

= 2019-03-04 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * March 11 (WMF Holiday) - US Staff
 * April 9-12: Greg at tech-mgt F2F in Portland
 * April 22 (WMF Holiday) - US Staff
 * April 22-27: Team offsite in Chicago
 * April 22nd - Antoine, Easter - we're flying to Chicago?
 * May 1st - Antoine and Željko, Labor Day / May Day
 * May 8th - Antoine, 1945 victory
 * May 17-19 - Wikimedia Hackathon 2019 (Prague, Czechia)
 * Attending:
 * May 30th-31st - Antoine, Feast of the Ascension
 * June 10th - Antoine, Pentecost -- see https://en.wikipedia.org/wiki/Eastertide for Antoine/France Easter holidays
 * May 27 (Memorial Day) - US Staff
 * June 19 (Juneteenth) - US Staff

Train

 * Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R


 * Jan 07 - wmf.12 - Dan
 * Jan 14 - wmf.13 - Dan
 * Jan 21 - wmf.14 - Mukunda
 * Jan 28 - wmf.15 - No Train (All Hands)
 * Feb 04 - wmf.16 - Mukunda
 * Feb 11 - wmf.17 - Tyler
 * Feb 18 - wmf.18 - Tyler
 * Feb 25 - wmf.19 - Antoine
 * Mar 04 - wmf.20 - Antoine
 * Mar 11 - wmf.21 - Zeljko
 * Mar 18 - wmf.22 - Zeljko
 * Mar 25 - wmf.23 - Dan
 * Apr 01 - wmf.24 - Dan
 * Apr 08 - wmf.25 - Mukunda
 * Apr 15 - wmf.26 - Mukunda
 * Apr 22 - 1.34.0-wmf.1 - NO TRAIN, team offsite
 * Apr 29 - wmf.2 - Tyler
 * May 06 - wmf.3 - Tyler
 * May 13 - wmf.4 - Antoine
 * May 20 - wmf.5 - Antoine
 * May 27 - wmf.6 - Zeljko
 * June 03 - wmf.7 - Zeljko

SoS

 * Zeljko 4eva! :)

Book club

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club
 * Scheduled: Mar 7th, 9am Pacific

Spring Offsite

 * Location: Chicago, IL (Central timezone, UTC-5 while we're there)
 * Dates: Arrive Monday 4/22, Depart Saturday 4/27.
 * BOOK FLIGHTS BY: March 21
 * Activity day: Send your suggestions to me if you have them :) I'll make the voting spreadsheet later.
 * Chicago Bulls!!!11!oneone
 * April 10 -- Regular Season ends, so only if they're good this year :)
 * I've heard there's good pizza :P
 * I'm sure we'll have some of that for our dinners, unless you want to do a cooking class :)
 * Greenfield park conservatory?
 * Museum of Science and Industry - https://www.msichicago.org/
 * Any American sport would be fun (basketball, football, baseball..) (Lars doesn't like watching sports, but would be happy to sit somewhere quiet for the duration) (thcipriani: baseball isn't so much about watching baseball :)) (Lars: going to the baseball stadium is stressful when there's thousands of others there (I'm difficult, sorry))
 * Program: Haven't started yet :)
 * start listing your topics! https://etherpad.wikimedia.org/p/releng-offsite-201904-topics

Monthly reflection on accomplishments - March '19 edition

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Monthly_notable_accomplishments
 * Add as you have them!


 * CI tooling future WG started, blogged
 * GerritBot comments on patches going through the pipeline (with fancy badges and the like)
 * Train deploy notes are now automatically generated on branch push
 * Scap 3.9.1-1 released in production
 * Phabricator upgrade: https://phabricator.wikimedia.org/phame/post/view/147/projects_forms_and_subtypes_oh_my/

Mid-term planning and what it means for us

 * See: https://office.wikimedia.org/wiki/Annual_planning/FY19-20/Medium-term_Planning
 * Specifically: https://office.wikimedia.org/wiki/Annual_planning/FY19-20/Medium-term_Planning#Foundation-wide_goals_in_development

Q4 Goals planning

 * TEC-3: Deployment Pipeline - https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC3:_Deployment_Pipeline/Goals
 * Please discuss during the RelEng meeting on Tuesday
 * I would join but I have a conflict (the mid-term planning working group thing)
 * TEC-12: Developer Productivity - https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC12:_Developer_Productivity/Goals
 * Given the strong overlap, please try to accomplish along with TEC3 on Tuesday
 * TEC-13: Code Health - https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC13:_Code_Health/Goals
 * JR, Zeljko, and I should have a call this week
 * TODO: usurp Zeljko's 1:1 on Wednesday

Annual Planning is coming up

 * Regardless of the mid-term planning above, we need to have a clear idea of what we want to accomplish next fiscal year (for the newbies: our fiscal year is July 1st - June 30th)
 * As it stands now Platform Evolution is one of the main mid-term goals with Engineering Productivity as a highlighted sub-goal/project
 * Timeline: https://office.wikimedia.org/wiki/Annual_planning/FY19-20

Recover from corrupted beta MySQL slave (deployment-db04)

 * https://phabricator.wikimedia.org/T216067
 * THANKS MUKUNDA!
 * Process Documented: https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/MariaDB_Slave_instance_setup

Pywikibot CI

 * https://phabricator.wikimedia.org/T132138
 * Antoine to take a time boxed look into this, this week

Scrum of Scrums

 * Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums

Incoming from last week

 * Blocking:

Release Engineering

 * Blocked by:
 * Blocking:
 * Updates:
 * https://phabricator.wikimedia.org/phame/post/view/147/projects_forms_and_subtypes_oh_my/
 * Train Health:
 * Last week: 1.33.0-wmf.19 - https://phabricator.wikimedia.org/T206672
 * This week: 1.33.0-wmf.20 - https://phabricator.wikimedia.org/T206673
 * Next week: 1.33.0-wmf.21 - https://phabricator.wikimedia.org/T206674
 * Log Health:
 * Code Health:

Callouts

 * Release Engineering

Train status and happenings

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor


 * "the site is still up"
 * EventBus and logstash blew up, lost some events
 * a few new error spam reports filed in phabricator

Quarterly Goals for Q3
https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2018-19_Q3

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Automate the generation of change log notes
 * WHO: Mukunda, (Tyler on backup)


 * ✅ should now run on branch cut https://integration.wikimedia.org/ci/job/train-deploy-notes/

TEC1 (Maint): Outcome 1 / Output 1.1

 * GOAL: Investigate notification methods for developers with changes that are riding any given train
 * WHO: Mukunda, Tyler

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Instrument Quibble for data collection
 * WHO: Mukunda, Antoine

TEC3 (Pipeline): Outcome 1 / Output 1.2

 * GOAL: Create a graph of where time is spent and make a prioritized list for improvements.
 * WHO: Mukunda, Antoine

TEC3 (Pipeline): Outcome 2 / Output 2.1

 * GOAL: Select and integrate a code health metric solution into our tooling.
 * WHO: JR, ...

TEC3 (Pipeline): Outcome 3 / Output 3.1

 * GOALS:
 * Adopt more services into Deployment pipeline -
 * cxserver, ORES (partially), citoid, changeprop, cpjobqueue (stretch)
 * Deploy eventgate
 * WHO: Dan, Tyler, Lars


 * CI future WG work: starting first evaluations this week
 * cxserver
 * Images built via deployment pipeline
 * Namespaces created for k8s eqiad/codfw


 * citoid
 * Images built via deployment pipeline
 * Deployed
 * Traffic not switched yet


 * changeprop


 * ✅ eventgate
 * Image built via pipeline
 * Chart
 * Deployed


 * ORES
 * cf: Dan's comments

TEC12 (DevProd): Outcome 1 / Output 1.1

 * GOAL: Conduct interviews with development stakeholders and compile a report that informs future work creation of a rubric.
 * WHO: Jeena, Mukunda


 * Results are posted: https://www.mediawiki.org/wiki/Developer_Satisfaction

TEC13 (Code Health): Outcome 1 / Output 1.1

 * GOALs:
 * Develop and communicate guidelines and best practices for successful Code Stewardship.
 * (Continued from Q2) Update/refresh review queue (review process for initial code deployment)
 * WHO: JR

went on walkabout - pivoting a little

TEC13 (Code Health): Outcome 2 / Output 2.2

 * GOAL: 5 of the 15 prioritized repositories have at least 1 end-to-end test -
 * WHO: Zeljko

TEC13 (Code Health): Outcome 2 / Output 2.3

 * GOALs:
 * Evolve/develop tools and processes to support the PE refactoring effort to improve code health.
 * Develop a common test strategy that enables teams to engage in more effective and efficient testing practices. (maybe should be output 2.4?)
 * WHO: JR, Core Platform Team

JS code coverage gap may be close to being addressed. Code Health Metrics WG (Kosta) may have a good solution in place.

TEC13 (Code Health): Outcome 3 / Output 3.2

 * GOALs:
 * Speak at All Hands on the status of Technical Debt
 * Engage and coach development teams on their approach to managing technical debt.
 * WHO: JR, Core Platform Team

TEC13 (Code Health): Outcome 4 / Output 4.1

 * GOALs: Code Health Dashboard with 50% of repositories covered.
 * WHO: JR, Core Platform Team

Current mixing of unit and integration tests in our use model is potentially causing issues with pulling in results to SonarQube/Cloud.

Gerrit

 * jGit GC index bugs https://gerrit.wikimedia.org/r/c/operations/puppet/+/493963
 * Gerrit upgrade pending....dzahn in a different part of the world
 * Gerrit users unable to login should be OK now -- https://phabricator.wikimedia.org/T216605 -- would like to schedule some downtime for this Soon™ish

Phabricator

 * Phabricator updates last week fixed a few bugs and rolled out improved subtype support, see blog post for details:
 * https://phabricator.wikimedia.org/phame/post/view/147/projects_forms_and_subtypes_oh_my/

QA/Code Health
interesting stuff :-)

Antoine

 * What I plan to do this week
 * What I'm blocked on
 * Other?
 * Other?
 * Other?

Brennen

 * What I plan to do this week
 * CI evaluation - we have time scheduled to meet twice
 * Review and test Jeena's restbase chart - https://gerrit.wikimedia.org/r/c/releng/local-charts/+/493766
 * New developer docs discussion with Eric Gardner and Hana Worku
 * Evaluate existing dev bootstrapping docs around Docker, etc.
 * Maybe summarize some book notes
 * What I'm blocked on
 * Other?
 * Other?

Dan

 * What I plan to do this week
 * Get caught up
 * What I'm blocked on
 * Kauai brain 8D
 * Other?

Greg

 * What I plan to do this week
 * OMG Planning (4 hours of workshopping tomorrow)
 * Talk with people
 * Loop back to Beta Cluster code stewardship
 * Review queue, ugh
 * Book highlighting/scribbles to notes
 * Loop back to TechConf thinking/planning
 * What I'm blocked on
 * Other?
 * Other?

James

 * What I plan to do this week
 * What I'm blocked on
 * being sick :(
 * Other?
 * Other?

Jean-Rene

 * What I plan to do this week
 * Catch up on book
 * Code stewardship review activities
 * Code Stewards best practices (metrics to enable)
 * What I'm blocked on
 * Other?
 * Other?

Jeena

 * What I plan to do this week
 * Merge restbase chart
 * Update tickets and documentation for Developer Productivity work
 * Work on improving local env performance (look into suggestions from Kosta)
 * New Developer Docs Discussion
 * Read book
 * What I'm blocked on
 * Other?
 * Other?

Lars

 * What I plan to do this week
 * Get Quibble running on my laptop
 * Start evaluating CI tooling options
 * What I'm blocked on
 * n/a
 * Other?
 * n/a

Mukunda

 * What I plan to do this week
 * Finally release the vandalism revert script
 * Deploy conduit method that is required by the above ^
 * Read a book
 * Fix bulk editing in phabricator ( https://phabricator.wikimedia.org/T216867 )
 * More work with subtypes
 * What I'm blocked on
 * Other?
 * Other?

Tyler

 * What I plan to do this week
 * work-todo: Scheduled:  TODO Review pruning docker image jenkins job work
 * work-todo: Scheduled:  TODO Scap Python3: Make working scap dev environment
 * work-todo: Scheduled:  TODO Scap Python3: Investigate python3 print v Jenkins shell
 * work-todo: Scheduled:  TODO Train Automation: Ping AndyRussG about updating HEAD
 * work-todo: Scheduled:  TODO Continuous Delivery Bookclub: Inspectional Read Ch 1-4
 * work-todo: Scheduled:  TODO Continuous Delivery Bookclub: Outline Ch 1-4
 * work-todo: Scheduled:  TODO Gerrit 2.15.1(0|1): Schedule upgrade
 * work-todo: Scheduled:  TODO Gerrit 2.15.1(0|1): Investigate JGit GC issue ✅ (cf: https://gerrit.wikimedia.org/r/c/operations/puppet/+/493963 )
 * work-todo: Scheduled:  TODO Gerrit 2.15.1(0|1): Investigate Gerrit prometheus export
 * work-todo: Scheduled:  TODO Blubber: Policy File: Write a policyfile to enforce wmf base images
 * work-todo: Scheduled:  TODO Blubber: Policy File: Where to store policyfile...production-charts? puppet?
 * work-todo: Scheduled:  TODO Deployment Pipeline: Draft TEC3 Goal email
 * work-todo: Scheduled:  TODO SonarQube: deploy job-template change
 * What I'm blocked on
 * Other?
 * Other?

Zeljko

 * What I plan to do this week
 * T217325 Consider and evaluate possible new CI tooling
 * T217008 Report results from SonarCloud to Gerrit
 * T217051 Echo automation smoke test
 * What I'm blocked on
 * Other?
 * it's crazy hat season https://usercontent.irccloud-cdn.com/file/pIiDQBLh/IMG_20190302_162303.jpg https://en.wikipedia.org/wiki/Yosemite_Sam

Team Kanban Board Review and Triage

 * closed and touched in the last 7 days
 * No update for 4 weeks
 * No update for 3 weeks
 * No update for 2 weeks
 * No update for 1 week
 * All Open
 * Review To Triage column of #releng
 * Assigned
 * Unassigned

Once / month-ish review of backlog(s)

 * releng Review To Triage column of #releng
 * releng-kanban Review unassigned in kanban
 * releng-kanban Review 'backlog' column of -kanban
 * releng-next - Review for things we need to put on our kanban backlog
 * releng-backlog - oh my, the huge backlog of things...

Kanban stats

 * Burnup chart