Wikimedia Release Engineering Team/Checkin archive/20170724

= 2017-07-24 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * July 24: 1-2 days - Željko vacation (maybe)
 * July 28th: Greg vacation
 * August 3-9: Željko vacation
 * August 8-15: Greg @ Wikimania&Tech-mgrs F2F
 * August 9-13: Wikimania
 * Aug 10-13: Dan on vacation
 * Aug 11-13: Chad maybe on vacation
 * Some weeks in August: Antoine, Probably starting Aug 8th
 * Aug 14th: thcipriani Birthday!
 * Aug 17th: Mukunda - court again
 * Aug 21st - thcipriani eclipse!

Rotating positions and absences
Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R

July 17 and July 24

 * Train: Chad
 * wmf.10
 * wmf.11
 * SoS: Tyler
 * Out:
 * July 24: 1-2 days - Željko vacation
 * July 28th: Greg vacation

July 31 and Aug 7

 * Train: Mukunda
 * wmf.12
 * wmf.13
 * SoS: Chad
 * Out:
 * August 3-9: Željko vacation
 * August 8-15: Greg @ Wikimania&Tech-mgrs F2F
 * August 9-13: Wikimania
 * Aug 11-13: Chad maybe on vacation

Aug 14 and Aug 21

 * Train: Tyler
 * wmf.14
 * wmf.15
 * SoS: Mukunda
 * Out:
 * Aug 14th: thcipriani Birthday!
 * Aug 17th: Mukunda court :-/
 * Aug 21st - thcipriani eclipse!

Actions from last meeting

 * Tyler: Some runjobs thing changed logging channels -- need to file meta task
 * ❌ needs some investigation still :|
 * Tyler: remove a whole bunch of stuff from /srv/deployment -- task to be filed
 * ✅ - https://phabricator.wikimedia.org/T170881

This week

 * Blocking
 * Blocked
 * Updates
 * Updates
 * Updates

Last week

 * Blocking
 * On beta commons, thumbnailing of 3D files is broken still https://phabricator.wikimedia.org/T170444
 * thcipriani replied to an email
 * mukunda cherry picked a thing
 * resolved? From our side I think so...
 * Parser tests fail if default Skin for unit tests makes use of doEditSectionLink https://phabricator.wikimedia.org/T170880
 * Antoine will take a look
 * Blocked
 * Updates
 * Train update
 * Train update

Not a SoS thing, but: https://phabricator.wikimedia.org/T171371 Investigate 30x increase in Jobrunner errors
 * Antoine: caused by ukwikimedia being moved from closed.dblist to deleted.dblist. A bunch of HTMLCacheUpdate jobs can no longer run as a result (no wiki found).
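
A minimal sketch of the failure mode described above, assuming a dblist is a plain-text file with one wiki database name per line and `#` comment lines. The function names, the sample job tuples, and the list contents are hypothetical and are not taken from the jobrunner code; the point is only that a job queued for a wiki that is no longer in the set of known wikis has nowhere to run.

```python
def parse_dblist(text):
    """Return the set of wiki database names in a dblist-style text blob.

    Assumption: one database name per line; blank lines and lines
    starting with '#' are ignored.
    """
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line)
    return names


def orphaned_jobs(jobs, known_wikis):
    """Return the (wiki, job_type) pairs whose wiki is not in known_wikis."""
    return [job for job in jobs if job[0] not in known_wikis]


# Hypothetical data: "examplewiki" is still resolvable, "ukwikimedia" is not.
known = parse_dblist("# wikis the runner can still resolve\nexamplewiki\n")
jobs = [
    ("ukwikimedia", "HTMLCacheUpdate"),
    ("examplewiki", "HTMLCacheUpdate"),
]
print(orphaned_jobs(jobs, known))
```

A check like this could flag queued jobs for review before a wiki is moved between lists, rather than after the error rate climbs.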

Logspam

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor

Other Team Business

 * Reminder on annual personal goals: see email


 * Deployment process improvements:
 * https://wikitech.wikimedia.org/w/index.php?title=Deployments&action=historysubmit&type=revision&diff=1765924&oldid=1765923
 * https://wikitech.wikimedia.org/wiki/Deployments/Holding_the_train


 * Jenkins: Assert no PHP errors (notices, warnings) were raised or exceptions were thrown
 * https://phabricator.wikimedia.org/T50002
 * "OK. mw-error.log and mw-exception.log are empty again for phpunit and qunit jobs. Let's stay on top of this and enforce it soon? (@hashar, @greg)"


 * Releng Secrets repo
 * https://phabricator.wikimedia.org/source/releng-secrets/
 * Get a gpg key: https://alexcabal.com/creating-the-perfect-gpg-keypair/
 * keysigning hangout?
 * Will try to schedule a meeting on Friday about this
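
For the Jenkins item above (T50002), the enforcement step could look roughly like the following. This is a sketch under stated assumptions, not the actual CI job: it assumes the two log files named in the quote are written to a single directory, and the function names and the directory argument are hypothetical.

```python
import os


def non_empty_logs(log_dir, names=("mw-error.log", "mw-exception.log")):
    """Return the paths of watched log files that exist and contain data.

    Assumption: a missing file is treated the same as an empty one.
    """
    found = []
    for name in names:
        path = os.path.join(log_dir, name)
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            found.append(path)
    return found


def exit_status(log_dir):
    """Return 1 if any watched log has content, else 0."""
    offenders = non_empty_logs(log_dir)
    for path in offenders:
        print("non-empty log:", path)
    return 1 if offenders else 0
```

A job wrapper could call `sys.exit(exit_status(log_dir))` after the phpunit or qunit run, so that a notice, warning, or exception written to either log fails the build.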

Program 6: Streamlined service delivery

 * Define functional tests for Mathoid running on the staging Kubernetes cluster for use in future gating decisions -
 * Define method for monitoring and reacting to the above functional tests -


 * No meeting last week because ops goes to management meetings
 * still need to get this https://phabricator.wikimedia.org/T169557 (seems to have some movement)
 * Dan working with Helm to see if he can get a POC going

Deprecate use of Trebuchet across production -

 * https://phabricator.wikimedia.org/T129290


 * StatsV deployed https://phabricator.wikimedia.org/T129139
 * logstash-logback-encoder https://gerrit.wikimedia.org/r/#/c/366466/
 * cassandra-metrics-collector https://gerrit.wikimedia.org/r/#/c/366404/
 * requested to make a how-to-deploy-for-the-first-time-with-scap3 page
 * jobrunner is all weird https://wikitech.wikimedia.org/wiki/Incident_documentation/20170718-JobQueue

Migrate majority of developers to JavaScript based browser test framework (webdriver.io) -

 * T164721 Run WebdriverIO tests in CI for extensions
 * Resolved! (insert party emoji) (A big thank you to Antoine) 🎈🎉 🎊 🎈
 * T164024 Rewrite Related pages browser tests in Node.js
 * Done (as far as #releng cares): refactored the patch to use the page object pattern; ready to be merged, waiting for review
 * T162256 [EPIC] Port Selenium tests from Ruby to Node.js on Reading Web extensions
 * I will port one test per repository and make sure it runs in CI

Quality improvements

 * Code Health
 * started reviewing core code base
 * Meeting today to socialize code health
 * This week working with Kevin on Tech Debt plans
 * Jenkins Emails to QA Alerts
 * Not an issue after all.

Phabricator

 * New diffusion is nice
 * Upstream no longer accepting new users on https://secure.phabricator.com
 * They are directing people to https://discourse.phabricator-community.org/
 * I don't think this matters much to us, Mukunda is an upstream contributor and...
 * Anyone with an existing account on secure.phab can still contribute
 * Finally making some progress on phab1001
 * Untangled some dependencies so that we can move forward without waiting on @traffic

Docker for CI
No need to say anything, look at patches if interested
 * rewrite of rakefile https://gerrit.wikimedia.org/r/#/c/366591/
 * builder pattern https://gerrit.wikimedia.org/r/#/c/366726/2

Misc CI

 * Mukunda got pinged on irc about https://phabricator.wikimedia.org/T170458
 * Setting up CI in Differential, waiting on @fdns to test it
 * R language job polishing
 * Android Periodic tests made 4x faster (thanks Michael Holloway)
 * Castor failed due to LDAP issue
 * A bunch of Beta cluster instances were no longer reachable due to an LDAP issue; solved, filed tickets to fix Puppet
 * Webperformance job going to dedicated slaves
 * Giuseppe rewriting Puppet Rakefile
 * rewrite of rakefile https://gerrit.wikimedia.org/r/#/c/366591/

Team Kanban Board Review and Triage

 * All Open
 * Assigned
 * Unassigned
 * No update for 1 week
 * No update for 2 weeks
 * No update for 3 weeks
 * No update for 4 weeks

Kanban stats

 * Burnup chart