Wikimedia Release Engineering Team/Checkin archive/20180326

From mediawiki.org


2018-03-19

Vacations/Important dates

https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
How to do it
  • Mar 26-29 (short week, since Fri is a WMF holiday): thcipriani vacation
  • Mar 30 (Fri): WMF Holiday
  • April 2: Željko (Holidays in Croatia - Easter Monday)
  • Apr 3-13: Greg vacation
  • April 16 (Mon): WMF Holiday
  • May 1: Željko (Holidays in Croatia - Labor Day / May Day)
  • May 14-17: Team offsite in Barcelona
  • May 18-21: Wikimedia Hackathon in Barcelona
  • May 21 (Mon): Tech-Mgt F2F
  • May 31: Željko (Holidays in Croatia - Corpus Christi)

Rotating positions

Train

Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R
  • Feb 19 - wmf.22 - Mukunda
  • Feb 26 - wmf.23 - Tyler
  • Mar 05 - wmf.24 - Tyler
  • Mar 12 - wmf.25 - Chad
  • Mar 19 - wmf.26 - Chad
  • Mar 26 - wmf.27 - Mukunda <----
  • Apr 02 - wmf.28 - Mukunda
  • Apr 09 - wmf.29 - Tyler
  • Apr 16 - wmf.30 - Tyler

SoS

  • Feb 19 - Chad
  • Feb 26 - Mukunda
  • Mar 05 - Mukunda
  • Mar 12 - Tyler
  • Mar 19 - Tyler
  • Mar 26 - Chad <----
  • Apr 02 - Chad
  • Apr 09 - Mukunda
  • Apr 16 - Mukunda

Team Business

Updates

Scrum of Scrums

Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums

This week

Release Engineering

  • Blocking
  • Blocked
  • Updates

Last week

Release Engineering

  • Blocking
  • Blocked
  • Updates
    • Minor Gerrit upgrade planned for this week (2.14.6 -> 2.14.7)
    • Incident analysis of the last year’s worth of incident reports started last week
    • Scap 3.7.7 should be rolled out to production this week
  • Quarterly goal dependency update:

Train status and happenings

https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor


Past week status updates

All of it in table form: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Goals/201718Q3

Quarterly Goals

Program 1: Outcome 5: Milestone 1: Develop and migrate to a JavaScript-based browser testing stack

Due: End of this quarter
What: Specific improvements to the now canonical framework, see: task T182421, notably:
  • Upgrade webdriverIO to version 4.9
  • Investigate replacing nodemw with mwbot
  • Video recording for Selenium tests in Node.js
Task: task T182421

Priority: high

  • T179188 Video recording for Selenium tests in Node.js - in progress - will do this week
  • T180144 Upgrade WebdriverIO to 4.12.0 - resolved
  • T181284 Replace nodemw with mwbot - in progress - almost done, updating documentation
    • T190426 Refactor AdvancedSearch browser tests which use nodemw module - in progress - helping AdvancedSearch team

Priority: normal

  • T164721 Run Selenium tests in CI for extensions - not started - CI changing
    • T180125 Refactor mediawiki-core-qunit-selenium-jessie Jenkins job so qunit/karma and webdriverio are invoked via npm script - not started
    • T180482 Create mediawiki-core-qunit-selenium-composer-jessie - not started
  • T182692 Document differences between Ruby and Node.js Selenium frameworks - not started - not hard to do, will do this week
  • T185011 Create selenium-MediaWiki-jessie daily Jenkins job - in progress
  • T188740 Retrospective for T139740 Port Selenium tests from Ruby to Node.js - resolved

Priority: low

  • T182412 Investigate if WebdriverIO `sync: false` would be useful to us and document how to use it - in progress - it could be useful for some tests, documentation pending, will do this week
  • T182691 Selenium tests should be easier to run - in progress - blocked by upstream bug
  • T183160 Sample code in Node.js for repositories that still have Selenium+Ruby tests - not started
  • T183162 Patches in Gerrit deleting Selenium+Ruby tests for repositories that still have them - not started
  • T185094 Update page object pattern in Selenium tests - in progress - done, but probably will not be implemented, discussion with upstream to revert to previous recommendation is probably the best thing to do
  • T187859 Move one Selenium tests from mediawiki/core to mediawiki/skins/Vector - in progress - blocked on understanding how it breaks Minerva
  • T188742 Should selenium-EXTENSION-jessie run for all repositories with Selenium tests? - not started - have to contact repository owners

Program 1: Outcome 5: Objective 1: Maintain existing shared Continuous Integration infrastructure

Goals
  • Draft requirements for a Kubernetes based solution for CI - task T183513
  • Migrate MediaWiki PHPUnit tests to Shipyard (docker-based CI) (~40% of Nodepool usage) - task T183512
    • Will be worked on after the long tail task T187797
  • Unify production and CI docker image build process - task T177276
    • Done 01/15



Program 3: Outcome 1: Objective 2: Identify and find stewards for high-priority/high use code segment orphans

Due: End of quarter
task T174091

Pivoted on the stewardship review process. Working with delegates prior to engaging with Toby and Victoria. Scheduled a standing monthly review with Toby and Victoria.

Program 3: Outcome 2: Objective 2: Define and implement a process to regularly address technical debt across the Foundation

Due: End of quarter
task T174095

Worked on the technical debt avoidance framework.

Program 3: Outcome 2: Objective 3: Promote and surface important technical debt topics at large gatherings of Wikimedia developers (e.g., DevSummit and Hackathon(s))

Due: End of next quarter
task T174096

No activity

Program 6: Outcome 2: Objective 2: Set up a continuous integration and deployment pipeline

Due: End of this quarter
Keyword: SSD
phab project: https://phabricator.wikimedia.org/project/view/2453/
Goal: Verify basic functionality of 'production' deployment and image (initially targeting mathoid):
  • Functional PoC within integration in the deployment-pipeline
  • Deploy to isolated k8s

thcipriani update

This is a severely long bit of notes about what I did last week so that you all can pick up where I left off...hopefully

  • We are sooo close to getting the PoC working, I was trying to build an image that worked that I could then puppetize
  • I ended up blocked on a few things, some of which were resolved over the weekend.

Creating a new minikube agent

1. Create a new machine in horizon named like: integration-slave-k8s-10XX

2. ssh to machine (have to wait a bit, puppet needs to run there)

3. Fix weird self hosted puppet issues (see https://www.mediawiki.org/wiki/Continuous_integration/Docker#Jenkins_Agent_Creation )

    sudo rm -fR /var/lib/puppet/ssl
    sudo mkdir -p /var/lib/puppet/client/ssl/certs
    sudo puppet agent -tv
    sudo cp /var/lib/puppet/ssl/certs/ca.pem /var/lib/puppet/client/ssl/certs
    sudo puppet agent -tv

4. Apply the role role::ci::slave::labs::docker to the instance via horizon

5. sudo puppet agent -tv (this was failing last week, see https://phabricator.wikimedia.org/T190584 )

6. Setup minikube:

    sudo apt-get install -y helm minikube kubernetes-client
    export MINIKUBE_WANTUPDATENOTIFICATION=false
    export MINIKUBE_WANTREPORTERRORPROMPT=false
    export MINIKUBE_HOME=$HOME
    export CHANGE_MINIKUBE_NONE_USER=true
    mkdir $HOME/.kube || true
    touch $HOME/.kube/config
    export KUBECONFIG=$HOME/.kube/config

    sudo -E minikube start --vm-driver none --bootstrapper=localkube

7. Clone all necessary repos

    git clone https://gerrit.wikimedia.org/r/operations/deployment-charts
    git clone https://gerrit.wikimedia.org/r/mediawiki/services/mathoid

8. Build mathoid image

    cd mathoid
    blubber dist/pipeline/blubber.yaml production | docker build -t mathoid -f - .

9. Setup helm/tiller. This is where I got stuck :( See: https://phabricator.wikimedia.org/T190589

10. helm install && helm test? Maybe? Didn't get this far :(
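For whoever picks this up, the command-line parts of the steps above (3, 6, 7 and 8) could be collected into one script. This is only a rough sketch, not something that was run on an agent: the commands are copied from the notes, but the DRY_RUN variable, the run helper and the function names are made up for illustration. The Horizon steps (1, 2, 4, 5) and the helm steps (9, 10) are left out. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of steps 3 and 6-8 from the notes above.
# DRY_RUN (hypothetical knob) defaults to 1, so commands are only printed.
# Set DRY_RUN=0 on a real agent to execute them.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 3: reset the self-hosted puppet SSL state.
# No "set -e" here, so a failing puppet run does not abort the sequence.
fix_puppet_ssl() {
    run sudo rm -fR /var/lib/puppet/ssl
    run sudo mkdir -p /var/lib/puppet/client/ssl/certs
    run sudo puppet agent -tv
    run sudo cp /var/lib/puppet/ssl/certs/ca.pem /var/lib/puppet/client/ssl/certs
    run sudo puppet agent -tv
}

# Step 6: install and start minikube with the "none" VM driver.
setup_minikube() {
    run sudo apt-get install -y helm minikube kubernetes-client
    export MINIKUBE_WANTUPDATENOTIFICATION=false
    export MINIKUBE_WANTREPORTERRORPROMPT=false
    export MINIKUBE_HOME="$HOME"
    export CHANGE_MINIKUBE_NONE_USER=true
    # mkdir -p stands in for "mkdir ... || true" from the notes.
    run mkdir -p "$HOME/.kube"
    run touch "$HOME/.kube/config"
    export KUBECONFIG="$HOME/.kube/config"
    run sudo -E minikube start --vm-driver none --bootstrapper=localkube
}

# Steps 7-8: clone the repos and build the mathoid image via blubber.
build_mathoid_image() {
    run git clone https://gerrit.wikimedia.org/r/operations/deployment-charts
    run git clone https://gerrit.wikimedia.org/r/mediawiki/services/mathoid
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ (cd mathoid && blubber dist/pipeline/blubber.yaml production | docker build -t mathoid -f - .)"
    else
        (cd mathoid && blubber dist/pipeline/blubber.yaml production | docker build -t mathoid -f - .)
    fi
}

fix_puppet_ssl
setup_minikube
build_mathoid_image
```

Keeping each step in its own function should make it easier to re-run just the part that failed, and the dry-run default means the script can be read through on any machine before it touches an agent.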


Quarterly non-goal "Work"

Program 1: Outcome 1: Objective 1: Scap (Tech Debt Sprint FY201718-Q2)

workboard
  • Worked with awight on git-lfs + scap

Program 1: Outcome 5: Objective 1: Maintain existing shared Continuous Integration infrastructure

  • https://phabricator.wikimedia.org/T189660
    • Fixed the phabricator-jessie-diffs job. Thanks to Antoine for identifying the problem.
    • Also improved the logging on failures so jenkins-bot will now comment with more useful info.


Program 1: Outcome 6: Milestone 1: Maintain Gerrit

Program 1: Outcome 6: Milestone 2: Maintain Phabricator

Streamline logspam workflows by adding some integration with phabricator
Store git-lfs (and other phab uploads) in swift: task T182085
    • Finally got back into this during the second half of the week.
      • Found out that there is already a swift cluster in deployment-prep and started configuring phab.wmflabs.org to work with this shared swift cluster.


Other work

  • Selenium retrospective took place last week. See: https://phabricator.wikimedia.org/phame/post/view/88/selenium_tests_in_node.js_project_retrospective/
  • Post Mortem on 20180129-MediaWiki Incident. See: https://etherpad.wikimedia.org/p/postmortem-20180129-MediaWiki_Incident
  • Code Health Group Meeting. See: https://etherpad.wikimedia.org/p/codehealthgroup-20180321


Standup!

Antoine

  • What I plan to do this week
    • Demo of quibble right now
    • Add experimental job to CI for mediawiki/core that would run some subset of phpunit/qunit/composer test/npm test and webdriver.io
  • What I'm blocked on
  • Other?
    • mediawiki/core suite fails on sqlite or when LANG is different from C.
      • I didn't know there were other LANGs ;-)
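One way to reproduce the LANG-dependent failures locally might be to run the same command under different locales and compare. A minimal sketch follows; the with_lang helper is a made-up name, and the non-C locale has to be installed on the machine for the comparison to mean anything.

```shell
#!/bin/sh
# with_lang (hypothetical helper): run a command with LANG and LC_ALL
# both set to the given locale, leaving the calling shell untouched.
with_lang() {
    lang="$1"
    shift
    LANG="$lang" LC_ALL="$lang" "$@"
}

# Show the locale the child process actually sees.
with_lang C sh -c 'echo "LANG=$LANG LC_ALL=$LC_ALL"'
```

The test suite command would go in place of the sh -c call, once with C and once with another locale.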


Chad

  • What I plan to do this week
    • abusefilter private logs / data pruning
    • gerrit missing branch thingie? I hate git
    • helm helm helm
    • MW general release planning?
  • What I'm blocked on
  • Other?


Dan

  • What I plan to do this week
    • Integrate new Blubber release into pipeline script
    • Publish a common policy file for Blubber to integration.wikimedia.org
    • Refactor scap's CI jobs to use blubber
    • Start working on composer support in Blubber
  • What I'm blocked on
  • Other?
    • thcipriani: see update Program 6 Outcome 2 Objective 2 for where I left off last week...


Greg

  • What I plan to do this week
    • MW Release meeting as well
    • talking with Mark&Faidon re 'staging' tomorrow
    • apparently another budget [urgent] review item
    • Q4 team goals
    • SWAT changes
  • What I'm blocked on
  • Other?


Jean-Rene

  • What I plan to do this week
    • Finish up Q3 goal work re Technical Debt process
    • Q3 Stewardship review
  • What I'm blocked on
  • Other?


Mukunda

  • What I plan to do this week
    • Swift, Swift and train
    • more Swift
  • What I'm blocked on
    • n/a
  • Other?


Tyler

  • What I plan to do this week
    • Vacation
  • What I'm blocked on
    • Blocked? Baby I'm on vacation!
  • Other?
    • <3 you all -- have a good week (I posted an update in program 6)


Zeljko

  • What I plan to do this week
    • Should I move tasks marked not started to T182986 Selenium framework improvements?
      • Greg: yeah, I think so
    • T179188 Video recording for Selenium tests in Node.js
    • T190426 Refactor AdvancedSearch browser tests which use nodemw module
    • T182692 Document differences between Ruby and Node.js Selenium frameworks
    • T188740 Retrospective for T139740 Port Selenium tests from Ruby to Node.js
    • T185011 Create selenium-MediaWiki-jessie daily Jenkins job
    • T182412 Investigate if WebdriverIO `sync: false` would be useful to us and document how to use it
  • What I'm blocked on
    • T182691 Selenium tests should be easier to run - blocked by upstream or a new idea
    • T185094 Update page object pattern in Selenium tests - waiting to see if Timo will explain to upstream that they are doing it wrong
    • T187859 Move one Selenium tests from mediawiki/core to mediawiki/skins/Vector - blocked on understanding how it breaks Minerva
  • Other?
    • T190039 - CirrusSearch smoke selenium tests cause failures of mediawiki-core-qunit-selenium-jessie job for extensions - CI fixed
    • Will there be Q4 Selenium framework improvements?
    • Ordered Kinesis Advantage2 <3

Grooming

Team Kanban Board Review and Triage


Once / month-ish review of backlog(s)


Kanban stats

Burnup chart