Wikimedia Release Engineering Team/Checkin archive/20171106

= 2017-11-06 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * Nov 10 (Fri) - Veterans Day
 * November 17 (Friday) Željko - conference (Coderetreat)
 * Nov 20th - Dec 1st: Greg vacation
 * Nov 20th - 22nd: Mukunda vacation
 * Nov 23+24 - Thanksgiving
 * December 25 (Monday): Željko - holiday (Christmas Day)
 * December 26 (Tuesday): Željko - holiday (St Stephen's Day)
 * Dec 25-Jan 1 - End of year/new year holidays
 * January 1 (Monday): Željko - holiday (New Year's Day)

Rotating positions and absences
Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R

Oct 23 and Oct 30

 * Train Chad
 * wmf.5
 * wmf.6
 * SOS: Mukunda
 * Out
 * Week of Oct 23: Tyler
 * Oct 25: Željko
 * Nov 1: Željko

Nov 6 and Nov 13

 * Train Chad
 * wmf.7
 * wmf.8
 * SOS: Tyler
 * Out
 * Nov 10 (Fri) - Veterans Day
 * November 17 (Friday) Željko - conference (Coderetreat)

This week

 * Blocking
 * Blocked
 * Please port your browser tests to the Node.js framework. Seven repositories have not started yet (they are still in Ruby, and the Ruby framework is no longer maintained).
 * See: https://phabricator.wikimedia.org/T139740
 * Notably: Global Collaboration Team, Fundraising Tech, Wikibase, and Multimedia
 * Updates
 * No MW train the week of the 20th due to Thanksgiving; SWATs will be open on Monday and Tuesday (Wednesday is “Friday” that week).
 * [TechDebt program] First pass of the service levels for component ownership was shared with the Code Health Group last week; feedback is ongoing.
 * [TechDebt program] The next blog post should be posted Real Soon Now™ (done on our side).
 * [SSD Program] Working on getting the mathoid tests running on submit.
 * [SSD Program] A new release of Blubber is on the horizon.
 * [scap tech debt] Working to support hosts with git older than 2.11 as well as hosts with git 2.11 or newer (namely Trusty, and Jessie (with backports)/Stretch, respectively) so that we can use newer functionality (notably `--jobs`) where it is available.
 * [CI] Most tox jobs are moved to Docker containers
 * [CI] Investigating why many docker containers are left behind after a SIGTERM
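
For the scap item above, one way to support hosts on either side of git 2.11 is to check the installed version and pass `--jobs` only when it is new enough. The sketch below is an illustration, not scap's actual code: the function names, the choice of `git submodule update` as the command, and the use of 2.11 as the cut-off (taken from these notes, not from git's release history) are all assumptions.

```python
import re
import subprocess

# Assumption: 2.11 is the dividing line named in the notes above.
MIN_JOBS_VERSION = (2, 11, 0)


def parse_git_version(output):
    """Turn a string such as 'git version 2.11.0' into (2, 11, 0)."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError("unrecognised git version string: %r" % output)
    # A missing patch level (e.g. '2.11') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def installed_git_version():
    """Ask the local git binary for its version (requires git on PATH)."""
    output = subprocess.check_output(["git", "--version"]).decode()
    return parse_git_version(output)


def submodule_update_args(version, jobs=4):
    """Build a submodule update command, adding --jobs only on newer git."""
    args = ["git", "submodule", "update", "--init", "--recursive"]
    if version >= MIN_JOBS_VERSION and jobs > 1:
        args += ["--jobs", str(jobs)]
    return args
```

A caller would run `submodule_update_args(installed_git_version())` on each host, so older hosts get the plain command and newer ones get the parallel variant.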

Last week

 * Blocking
 * Blocked
 * Updates:
 * We had to pause the MediaWiki train last week due to a hard-to-diagnose issue in production: https://phabricator.wikimedia.org/T179156
 * T171852 Tech talk: Selenium tests in Node.js https://phabricator.wikimedia.org/phame/post/view/78/tech_talk_selenium_tests_in_node.js/
 * T173488 Selenium Ruby framework deprecated https://phabricator.wikimedia.org/phame/post/view/79/selenium_ruby_framework_deprecation_october/
 * CI: Antoine was sick last week, but he set up a simple package manager caching system for the Docker-based CI (a port of “castor” from our nodepool-based CI). He will migrate some of the tox jobs related to CI config to it.

Puppet SWAT

 * list of patches you want to submit to puppet swat

Logspam \ Last week's train updates

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor


 * pretty quiet week
 * parameterized a few log spam messages (to glob them together), db and swift errors notably

Other Team Business

 * T179824 pws is unusable on releng-secrets, choking on a key
 * Looks like Željko's expired key messed things up, will fix

Q2 goal/project check-in

 * All of it in table form: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Goals/201718Q2

Program 1: Outcome 5: Milestone 1: Migrate majority of developers to JavaScript based browser test framework (webdriver.io)

 * Due: End of this quarter
 * Quarter Goal Task: Port Selenium tests from Ruby to Node.js -


 * T139740 Port Selenium tests from Ruby to Node.js
 * +1 (Popups)
 * T179546 Popups Selenium tests daily targeting beta cluster
 * Done with minor CI refactoring, thanks to Antoine for help
 * T179190 Run Cucumber+Selenium+Node.js in CI
 * Talked with Antoine about minor CI refactoring
 * T179188 Video recording for Selenium tests in Node.js
 * researching

Ruby

 * T167432 Run Wikibase daily browser tests on Jenkins
 * only JJB config left

Program 3: Outcome 1: Objective 1: Define a set of code stewardship levels (from high to low expectations)

 * Due: End of this quarter
 * Quarterly Goal task: -


 * First pass was completed last week and shared with the Code Health team
 * Made some updates to the Developers/Maintainers page to reflect the stewardship approach, which spurred some conversation (and a revert).

Program 3: Outcome 1: Objective 2: Identify and find stewards for high-priority/high use code segment orphans

 * Due: End of next quarter
 * Quarterly Goal task -

Program 3: Outcome 2: Objective 1: Define a “Technical Debt Project Manager” role that regularly communicates with all Foundation engineering teams regarding their technical debt

 * Due: End of this quarter

Program 3: Outcome 2: Objective 2: Define and implement a process to regularly address technical debt across the Foundation

 * Due: End of next quarter


 * tech debt series blog post 1 completed and ready to post

==== Program 6: Outcome 2: Objective 2: Set up a continuous integration and deployment pipeline to publish new versions of an application to production via testing and staging environments that reliably reproduce production ====
 * Due: End of this quarter
 * Keyword: SSD
 * Complete build phase of release pipeline


 * Build test variant
 * Run test entrypoint w/developer feedback - services dependency
 * Build production variant w/developer feedback - services dependency
 * Tag production container
 * Push to production docker registry - ops dependency - staging namespace
 * Tracking: https://phabricator.wikimedia.org/T157469
 * current status: https://phabricator.wikimedia.org/project/view/2453/


 * Trying to get mathoid tests to run automagically on submit...having troubles
 * https://phabricator.wikimedia.org/T177954
 * do pipeline jobs register with the gearman plugin?
 * Need to (always need to :)) code review patch implementing config validation
 * https://phabricator.wikimedia.org/D868
 * New blubber release Soon™

Program 1: Outcome 1: Objective 1: Scap (Tech Debt Sprint FY201718-Q2)

 * workboard

 * Still needs someone to accept the patch: https://phabricator.wikimedia.org/D866
 * Made more progress on submodule disk space usage, I expect to resolve this week: https://phabricator.wikimedia.org/T137124
 * Converted most of the git commands to use the "sh" module (https://amoffat.github.io/sh/), which is much nicer than the standard subprocess module.
 * Unbreak scap-vagrant, mostly
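
To illustrate the "sh" conversion mentioned above, here is a minimal sketch of the same call made through `sh` and through the standard library. It is not taken from scap: the `run` helper and its `cwd` parameter are hypothetical, and the fallback branch exists only so the sketch still runs where `sh` is not installed.

```python
import subprocess

try:
    import sh  # third-party module: https://amoffat.github.io/sh/
except ImportError:
    sh = None  # the sketch falls back to subprocess below


def run(program, *args, cwd=None):
    """Run `program` with `args` and return its stdout as text."""
    if sh is not None:
        # sh wraps a program as a callable; `_cwd` sets the working directory.
        command = sh.Command(program)
        kwargs = {"_cwd": cwd} if cwd else {}
        return str(command(*args, **kwargs))
    # Equivalent call using only the standard library.
    return subprocess.check_output([program] + list(args), cwd=cwd).decode()


# Hypothetical usage, reading the current commit of a checkout:
#   run("git", "rev-parse", "HEAD", cwd="/path/to/checkout").strip()
```

With `sh` available, the same thing can also be written directly as `sh.git("rev-parse", "HEAD")`, which is the style that makes it read more naturally than building argument lists for `subprocess`.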


 * New scap sayings:
 * S.C.A.P.: scaring children away promptly
 * S.C.A.P.: soilent contents are people
 * S.C.A.P.: say carl and pause
 * S.C.A.P.: stupid captions annotate pictures
 * S.C.A.P.: sh -c awk | perl
 * thcipriani: this is maybe my new favorite :)
 * Mukunda agrees o.O
 * S.C.A.P.: sulphur, carbon, arsenic, phosphorus
 * S.C.A.P.: syntax: conjunction, article, pronoun

Program 1: Outcome 5: Objective 1: Maintain existing shared Continuous Integration infrastructure

 * Goal: A generalized POC for a docker-based CI.
 * https://phabricator.wikimedia.org/project/view/3008/ (shipyard workboard)


 * Most tox jobs moved to Docker containers
 * Docker containers are left behind after Jenkins sends SIGTERM to 'docker run'

Program 1: Outcome 6: Milestone 2: Maintain Phabricator

 * Mostly Uneventful. Upstream took a week off from stable channel updates.
 * https://phabricator.wikimedia.org/D831 broke phabricator.wmflabs.org. I'll have to work around the issue, which is caused by the fact that phab.wmflabs doesn't use OAuth or LDAP.

Program 1: Outcome 5: Objective 1: MW Nightlies server

 * Um. So on branching....

Team Kanban Board Review and Triage

 * Closed and touched in the last 7 days
 * No update for 4 weeks
 * No update for 3 weeks
 * No update for 2 weeks
 * No update for 1 week
 * All Open
 * Review To Triage column of #releng


 * Assigned
 * Unassigned

Once / month-ish review of backlog(s)

 * releng Review To Triage column of #releng
 * releng-kanban Review unassigned in kanban
 * releng-kanban Review 'backlog' column of -kanban
 * releng-next - Review for things we need to put on our kanban backlog
 * releng-backlog - oh my, the huge backlog of things...

Kanban stats

 * Burnup chart