Wikimedia Release Engineering Team/Checkin archive

This is the Wikimedia Release Engineering Team's archive of our weekly check-ins. We take notes on an etherpad during the meeting and archive them here afterwards.

=2014-10-21=

Team Business

 * Phabricator workboard discussion
 * https://phabricator.wikimedia.org/project/board/20/
 * Make sure you have an account :)
 * Join the "project"
 * Things to think about, we probably want to take a 2 hour chunk of time to do this well enough:
 * What is our generalized process? We can keep our current process and make the tool match that.
 * "size" of things in this specific board (do we need two?)
 * "Blocked" column?
 * "Needs review" column?
 * "Done" column (and related "Archive" column)?
 * other things?

Scrum of Scrums
(kept as long url because all short url providers are blocked by mw.org)
 * Dependency wall

Phabricator

 * Direct phabricator / migration questions to the #wikimedia-devtools irc channel
 * QChris finished the phabricator plugin for gerrit, we should see gerritbot posting to phabricator soon
 * https://phabricator.wikimedia.org/T169
 * See a sample of data migrated from bugzilla to phabricator to get a feel for how it's going to look and identify any problems:
 * Preview instance: https://bugzillapreview.wmflabs.org
 * (antoine) it (phabricator upstream) is awesome. Can file bugs at https://secure.phabricator.com/ , which are triaged quickly

Deployment tooling

 * ready to merge the l10n stuff
 * sync up with Niklas re l10n

Jenkins
Zuul / Gearman related:
 * Jobs being stuck in Zuul queue due to an error not being handled https://bugzilla.wikimedia.org/show_bug.cgi?id=72113, patch proposed upstream
 * Jobs not being triggered by Zuul suddenly https://bugzilla.wikimedia.org/show_bug.cgi?id=63760, hard to track
 * Fixed: "Jenkins: jobs created via JJB are not properly registered in Zuul Gearman server" https://bugzilla.wikimedia.org/show_bug.cgi?id=63758, pending upstream release although already deployed
 * (Zeljkof, Antoine) some rubocop ruby2.0 related work over the week.

Beta cluster

 * Have some Icinga/Graphite monitoring in staff. Not sure whom to notify besides Yuvi, Greg and me. Ideas? Maybe a public mailing list similar to qa-alerts ( https://lists.wikimedia.org/pipermail/qa-alerts/ ), but not qa-alerts itself since it is super spammy. Maybe betacluster-alerts or something
 * create a task in the workboard (greg)
 * Andrew is still waiting on bids for virt hardware, and frowning a lot

Browser tests

 * (Dan) Making progress on env layer
 * Antoine got Dan to commit to giving a TDD talk to WMF :)
 * Chris in SF, working mostly directly with Flow folks (mostly S I think) and a little with Rummana for VE.
 * Almost 80 people RSVPd for http://www.meetup.com/wikimedia-tech/events/207856222/

Vagrant

 * (Dan) Survey out—around 38 replies so far. Let it roll? Send to community?
 * (Dan) Start analysis of survey results

Hiring

 * Release Engineer in-progress: https://boards.greenhouse.io/wikimedia/jobs/29435?t=5fw24x

Vacations/Confs/etc

 * 10/20 - 10/30 - Antoine might skip morning and work during evening {european pov}.
 * 10/21-22-23 Chris in SF (Elisabeth Hendrickson talk at WMF Oct 22)
 * 10/23 - Antoine travelling during morning
 * 10/27 - 10/29 Chris at Google Test Automation Conference Seattle
 * 10/30 - 10/31 Chris vacation
 * 11/3 - 11/7: Antoine - OpenStack Summit Paris
 * 11/11 - Antoine Holiday (WW1)

=2014-10-14=

Scrum of Scrums

 * Dependency wall: https://wikimedia.mingle.thoughtworks.com/projects/scrum_of_scrums/cards/list?style=list&tab=All

Jenkins

 * Jobs created with Jenkins Job Builder should now properly register ( https://bugzilla.wikimedia.org/show_bug.cgi?id=63758 )
 * Tobi completed the WD migration (refactor in progress)
 * rubocop etc
 * "Jenkins Performance"
 * No time remaining last week to reproduce it

Beta cluster

 * virt100x outage last week, we have lost deployment-cxserver01. Had trouble rebuilding it due to labs partitioning improvement. Kindly fixed and enhanced by Andrew/Coren
 * Andrew and Rob reviewing quotes for additional virt hardware today (re: new beta cluster)
 * Maybe we need to backup instances
 * Second beta cluster still being discussed

Browser tests

 * Stuff went red while Chris was away
 * mediawiki/selenium has env to fix xvfb race condition. Only used for local browser tests though.
 * since all jobs are on Sauce, should we stop throttling them?
 * Let Chris work on getting the builds back to green first
 * Jobs were running on SauceLabs and had timeouts, so unrelated to xvfb race condition
 * Conclusion: keep them throttled
 * Continuing work on environment abstraction

Vagrant

 * Release trafficcontrol MWV role (uses tc + iptables to simulate network conditions)
 * Perform MMV perf tests using trafficcontrol profiles
 * Distribute survey! (today)
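
The trafficcontrol note above mentions tc + iptables for simulating network conditions. As a rough illustration only, the sketch below builds a `tc` netem command line from a named profile. The profile names, their values and the `netem_command` helper are all hypothetical and not taken from the actual MWV role; the iptables side is left out entirely, and the command is only constructed, never executed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NetworkProfile:
    """Hypothetical description of a simulated network condition."""
    delay_ms: int
    loss_percent: float
    rate_kbit: int


# Illustrative values only; these are assumptions, not the role's profiles.
PROFILES = {
    "3g": NetworkProfile(delay_ms=200, loss_percent=1.0, rate_kbit=750),
    "dsl": NetworkProfile(delay_ms=50, loss_percent=0.0, rate_kbit=2000),
}


def netem_command(interface: str, profile: NetworkProfile) -> List[str]:
    """Build (but do not run) a tc command applying the profile via netem."""
    cmd = ["tc", "qdisc", "replace", "dev", interface, "root", "netem",
           "delay", f"{profile.delay_ms}ms"]
    # Only add a loss clause when the profile actually drops packets.
    if profile.loss_percent > 0:
        cmd += ["loss", f"{profile.loss_percent}%"]
    cmd += ["rate", f"{profile.rate_kbit}kbit"]
    return cmd


if __name__ == "__main__":
    print(" ".join(netem_command("eth0", PROFILES["3g"])))
```

Returning the argument list instead of running it keeps the sketch safe to try on any machine; actually applying it would need root and a real interface name.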

Hiring

 * Elena starts today(!!)
 * Buddying with James and Rummana, I chatted with her yesterday, will sync up later in the week as well
 * Chris wrote this long ago but it's still mostly relevant: https://www.mediawiki.org/wiki/Quality_Assurance/first_week
 * welcome email to-be-sent (already written)

Vacations/Confs/etc

 * 10/13 - 10/17 Greg at MediaWiki Core offsite
 * 10/20 - 10/30 - Antoine might skip morning and work during evening {european pov}.
 * 10/21-22-23 Chris in SF (Elisabeth Hendrickson talk at WMF Oct 22)
 * 10/27 - 10/29 Chris at Google Test Automation Conference Seattle
 * 10/30 - 10/31 Chris vacation
 * 11/3 - 11/7: Antoine - OpenStack Summit Paris
 * 11/11 - Antoine Holiday (WW1)

= 2014-10-07 =

Team Business

 * All Hands Excursion
 * https://docs.google.com/a/wikimedia.org/spreadsheets/d/1qTLhQstAiTIHH6iBjHINLuaOaZHs1CnIAYWmcLjxQCM/edit#gid=0

Scrum of Scrums
One card (ops/release) Phabricator related:
 * Dependency wall: (grrr spam filter)
 * https://wikimedia.mingle.thoughtworks.com/projects/scrum_of_scrums/cards/119

Phabricator
Lots happening in Phabricator world.....
 * We announced that the wikimedia phabricator is now open to everyone...
 * log in (or register) and try to break it
 * link your account to LDAP and OAuth provider here: https://phabricator.wikimedia.org/settings/panel/external/
 * New projects opening on hold till Bugzilla migration
 * bug qgil if you have an urgent need for a new project before the migration
 * Lots of tasks finally closed out of spite.  https://phabricator.wikimedia.org/maniphest/query/oRIJDB5MpxjI/#R

Deployment tooling
Summary: currently app servers sync l10nupdate directly from tin instead of going through the rsync proxies as scap does. BitTorrent?!!
 * [Ops] LocalisationUpdate == useless Tin

Jenkins

 * (Antoine) All jobs depending on mediawiki core now have mediawiki/vendor cloned as well. Unblocks Bryan Davis changes to core logging
 * (Antoine, Timo) Nasty regression in Jenkins Git plugin under Trusty (fixed)
 * (Zeljkof, Tobi) Wmde browsertests jobs migrated \O/
 * (Antoine, Dan, Zeljkof) JJB macro to easily run bundle commands
 * (Timo) Labs slaves monitoring: https://integration.wikimedia.org/monitoring/
 * (Antoine) Integration job to assert our PHPUnit fork works with mw/core release branches / master
 * (Antoine, devs) pywikibot running python3.4 tests on Ubuntu Trusty
 * Proposal to merge JJB and Zuul config repositories (see QA list)
 * ACTION self deploy on +2

Beta cluster

 * Andrew has been too busy (migrating LDAP) to order new hardware for nightly cluster. Still hoping to do that soon.
 * Nightly cluster needs resources from the whole engineering team. (Antoine, Greg during 1/1)

Browser tests

 * (Dan) Checking in with MM team about MMV metrics and need (or not) to setup traffic shaping
 * http://multimedia-metrics.wmflabs.org/dashboards/mmv#media_viewer_vs_file_page-graphs-tab
 * (Dan) Helping Zero with first browser tests for the Zero Portal
 * (Dan) Moving ahead on environment abstraction layer
 * (Tobi, Zeljkof, Antoine) Jenkins Performance Plugin enabled on all jobs (see QA list)
 * (Antoine, Dan, Zeljkof) Firefox local browser tests have a race condition killing Xvfb

Vagrant

 * (Dan) Draft of survey is in Qualtrics (needs final feedback)
 * (Dan) Tech Talk in November with Bryan Davis
 * Install party at Dev Summit: https://www.mediawiki.org/wiki/MediaWiki_Developer_Summit_2015#Workshops

Hiring

 * Elena starts on Tuesday (SF based)

Vacations/Confs/etc

 * 10/8: Zeljko - Croatian Holiday
 * 10/6 - 10/10 Chris vacation
 * 10/13 - 10/17 Greg at MediaWiki Core offsite
 * 10/20 - 10/30 - Antoine might skip morning and work during evening {european pov}.
 * 10/21-22-23 Chris in SF (Elisabeth Hendrickson talk at WMF Oct 22)
 * 10/27 - 10/29 Chris at Google Test Automation Conference Seattle
 * 10/30 - 10/31 Chris vacation
 * 11/3 - 11/7: Antoine - OpenStack Summit Paris
 * 11/11 - Antoine Holiday (WW1)

= 2014-09-30 =

Team Business

 * Metrics
 * Should we set up a labs instance to capture and store metrics? (crons + some db + limn)
 * Registration for MediaWiki Developers Summit
 * Sam, Antoine, Mukunda :)
 * Ideas for Saturday during All Hands?
 * http://www.computerhistory.org/
 * http://www.bayareabrewerytours.com/

Scrum of Scrums

 * Dependency wall:

Phabricator

 * Quim discovered a couple of issues with our 'secure' task hiding
 * One is a weakness that could expose private tasks via herald: https://phabricator.wikimedia.org/T493
 * This is getting addressed upstream thanks to chase's proposal to epriestley: https://secure.phabricator.com/T6211
 * Another issue was that the reporter of an issue wasn't actually able to access the maniphest task once it got submitted. https://phabricator.wikimedia.org/T475
 * This is fixed by https://gerrit.wikimedia.org/r/#/c/163753/
 * Still working on https://phabricator.wikimedia.org/T419 and https://phabricator.wikimedia.org/T169

Deployment tooling

 * Elasticsearch upgraded on logstash100[1-3] to match production. Other packages upgraded etc. Hopefully this increases stability. logstash upgrades to come per beta below (reedy)

Jenkins

 * ✅ Zuul cloner bug, https://bugzilla.wikimedia.org/show_bug.cgi?id=71133
 * should the bug be FIX/RESO? :)
 * MediaWiki extensions qunit jobs migrated to it, except VisualEditor. New job pending testing by VE team. (Antoine, Timo)
 * Wrote a diagnostic tool for Zuul (zuul-gearman.py). Doc updated at https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Debugging (needs more doc)
 * Zuul stops processing jobs from time to time. Gathered traces which indicate it is most probably an issue in the Gearman server.
 * C Scott proposed to merge zuul-config and jenkins-job-builder-config repositories. Thoughts?
 * Zeljkof, +1
 * Antoine to file a bug about it and handle the merging + updating related jobs.
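
For the Zuul/Gearman debugging mentioned above, one useful check is whether queued functions have any worker registered. The sketch below is not zuul-gearman.py; it is a minimal, assumed illustration that parses the text returned by the Gearman administrative `status` command (one tab-separated line per function: name, total jobs, running jobs, available workers, terminated by a line containing a single `.`). The function names in the sample are made up.

```python
from typing import Dict, List, NamedTuple


class FunctionStatus(NamedTuple):
    total: int    # jobs queued or running for this function
    running: int  # jobs currently running
    workers: int  # workers able to run this function


def parse_gearman_status(text: str) -> Dict[str, FunctionStatus]:
    """Parse the reply of the Gearman admin 'status' command."""
    result: Dict[str, FunctionStatus] = {}
    for line in text.splitlines():
        if line == ".":  # end-of-response marker
            break
        if not line.strip():
            continue
        name, total, running, workers = line.split("\t")
        result[name] = FunctionStatus(int(total), int(running), int(workers))
    return result


def functions_without_workers(status: Dict[str, FunctionStatus]) -> List[str]:
    """Return functions that have jobs queued but no worker to run them."""
    return sorted(name for name, s in status.items()
                  if s.total > 0 and s.workers == 0)


if __name__ == "__main__":
    # Hypothetical sample reply; real function names will differ.
    sample = "build:job-a\t3\t0\t0\nbuild:job-b\t1\t1\t2\n.\n"
    print(functions_without_workers(parse_gearman_status(sample)))
```

A function listed by `functions_without_workers` would be a candidate for the "job not registered" or "job stuck in queue" symptoms, though other causes are possible.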

Beta cluster

 * logstash upgraded to 1.4.2 for testing, prior to deployment to production (reedy)
 * addition of redis yesterday broke beta labs: https://bugzilla.wikimedia.org/show_bug.cgi?id=71415 . Bug was closed but re-opened because editing is still busted, and Preferences also (at least)
 * greg to email deployers...

Browser tests

 * major updates to Echo and Flow repos in process
 * basic training for Rummana today
 * i10n screenshots (with Amir and Vikas)
 * some tests failing but he's coming back/still around
 * the missing font issue (solved)
 * user selectable language (default is a long list)
 * JJB browser tests cucumber macro needs evil refactoring. Been copy pasted all over the place.

Vagrant

 * Survey is going out this week? hopefully, once Dan's better and can get it in Qualtrics

Hiring

 * Elena Tonkovidova starts 14 October. Bay Area based.

Other
Selenium Workshop
 * Zeljkof working with Nik
 * asked Rachel for room/etc

Vacations/Confs/etc

 * 10/3: Zeljko - Conference
 * 10/8: Zeljko - Croatian Holiday
 * 10/6 - 10/10 Chris vacation
 * 10/20 - 10/30 - Antoine might skip morning and work during evening {european pov}.
 * 10/21-22-23 Chris in SF (Elisabeth Hendrickson talk at WMF Oct 22)
 * 10/27 - 10/29 Chris at Google Test Automation Conference Seattle
 * 10/30 - 10/31 Chris vacation
 * 11/3 - 11/7: Antoine - OpenStack Summit Paris
 * 11/11 - Antoine Holiday (WW1)

= 2014-09-23 =

Team Business
From mail Antoine sent on RelEng list (corrected):
 * Mon 19 - FREE (US holiday) / Travel day
 * Tue 20 - FREE
 * Wed 21 - All Hands
 * Thu 22 - All Hands
 * Fri 23 - Usual work day at office
 * Sat-Sun: Free || offsite || team socializing at Alcatraz
 * Mon 26th - tech days
 * Tue 27th - tech days
 * Wed 28th - FREE
 * Thu 29th - FREE
 * Fri 30th - FREE
Follow up on mailing list
 * FOSS OPW projects? (similar to Google Summer of Code but restricted to ladies)
 * modernize rspec?
 * TODO: Chris to write up the idea (with Zeljko)
 * All Hands/Offsite current thinking:
 * 1 day at All Hands (Tuesday, the day before)
 * Week before Paris Hackathon
 * antoine: Wanna come to Nantes? :-]  maybe ;)   Also, why are you the same color?!
 * What's happening in Paris? yearly european hackathon like Zurich last year. Ohhh. i was thinking something "sooner" :D

Scrum of Scrums
The proxy itself is like OWASP Zed or whatever, we want to create the ability to send Selenium traffic to that particular proxy.
 * Dependency wall: (stupid mingle url...)
 * Chris/Dan card #135 https://wikimedia.mingle.thoughtworks.com/projects/scrum_of_scrums/cards/135

Phabricator

 * stuffs

Deployment tooling

 * scap stuffs

Jenkins

 * MediaWiki jobs switched to Zuul cloner. Now use the proper branch (was always using 'master')
 * Random fails of extensions patches against wmf branches.
 * Bug in Zuul cloner https://bugzilla.wikimedia.org/show_bug.cgi?id=71133 patch pending test + deploy

Beta cluster

 * monitoring based on graphite / Shinken. See YuviPanda announcements
 * http://shinken.wmflabs.org/host/beta-cluster (guest/guest)
 * http://graphite.wmflabs.org/ ( Look in the left tree for Graphite -> deployment-prep, all instances have metrics generated by Diamond)
 * Notifications:
 * IRC to #wikimedia-qa already
 * emails sent to a few people ACTION: need more people to be notified and act
 * TODO: Greg to sync up with the potential deployers

Browser tests

 * Pretty much everything that should be passing is passing
 * Throttling executors on Jenkins has improved pass rate
 * Chris is combing through Flow/Echo repos (and a little bit of MobileFrontend) doing refactoring and education per the quarterly goal
 * Chris would like to set up pairing sessions in SF Oct 22/23
 * MMV tests are driving performance measurement
 * Getting them up and running on a new labs instance (multimedia-perf.eqiad.wmflabs)
 * SauceLabs is likely culprit of inaccurate metrics
 * Yslow related, bug asking to add it as a job to run on patchset proposal https://bugzilla.wikimedia.org/show_bug.cgi?id=57137

Vagrant

 * Spun up a new labs instance for MMV performance testing using MWV (yay)
 * Finalize and send out the survey!
 * Researching:
 * Lightweight monitoring
 * resource monitoring
 * auto bug report helper
 * anonymous reporting
 * cookies & donuts?

Hiring


Vacations/Confs/etc

 * 10/3: Zeljko - Conference
 * 10/8: Zeljko - Croatian Holiday
 * 10/6 - 10/10 Chris vacation
 * 10/20 - 10/30 - Antoine might skip morning and work during evening {european pov}.
 * 10/27 - 10/29 Chris at Google Test Automation Conference Seattle
 * 10/30 - 10/31 Chris vacation
 * 11/3 - 11/7: Antoine - OpenStack Summit Paris
 * 11/11 - Antoine Holiday (WW1)

= 2014-09-16 =

Phabricator

 * redirection scripts (BZ urls to Phab urls)
 * almost finished
 * Just waiting on launch, will work with chase to deploy it
 * the gerrit -> phab bot
 * Java :)
 * ask for help from Nik/Chad/Christian
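
The BZ-to-Phab redirection idea above can be sketched roughly as follows. This is not the actual redirection script: the `bugzilla_to_phab` helper and its `id_offset` parameter are hypothetical, and the real mapping between Bugzilla bug numbers and Phabricator task numbers is not stated in these notes, so the default of 0 is only a placeholder.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

PHAB_BASE = "https://phabricator.wikimedia.org"


def bugzilla_to_phab(url: str, id_offset: int = 0) -> Optional[str]:
    """Map a Bugzilla show_bug.cgi URL to a Phabricator task URL.

    id_offset is a hypothetical knob for whatever numbering shift the
    migration ends up using; it is an assumption, not a known value.
    Returns None when the URL is not a recognisable bug URL.
    """
    parsed = urlparse(url)
    if not parsed.path.endswith("/show_bug.cgi"):
        return None
    ids = parse_qs(parsed.query).get("id", [])
    if len(ids) != 1 or not re.fullmatch(r"[0-9]+", ids[0]):
        return None
    return f"{PHAB_BASE}/T{int(ids[0]) + id_offset}"


if __name__ == "__main__":
    print(bugzilla_to_phab(
        "https://bugzilla.wikimedia.org/show_bug.cgi?id=70446"))
```

Returning `None` for anything that is not a plain bug URL lets a caller fall back to a generic landing page rather than guessing a task number.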

Deployment tooling

 * l10n and scap colliding
 * https://bugzilla.wikimedia.org/show_bug.cgi?id=70446

Jenkins

 * extensions now being tested with:
 * proper mediawiki/core branch (deployed today)
 * mediawiki/vendor
 * Still have to migrate the extensions -qunit jobs (WIP)
 * Wikidata related jobs partly reintegrated on Wikimedia Jenkins
 * Next items:
 * early adopt phabricator
 * isolating tests using labs infra.
 * merge zuul-config and jjb-config repositories (suggested by cscott)
 * auto deploy CI related changes on +2
 * Anyone interested in some Jenkins training? ( Timezones suck :-/ )
 * Chris started re-doing docs on mw.o. Starting with getting rid of references to Cloudbees.
 * Jenkins perf improvements
 * plan a sane master/slave arrangement
 * Design load structure
 * TODO: GREG to find out Timo's involvement

Beta cluster
Antoine: both are huge additions with a long record of proven success.
 * jeremyb rampage
 * matanya granted root on beta (ops & puppet volunteer)
 * Search slowness; https://bugzilla.wikimedia.org/show_bug.cgi?id=70869
 * We have monitoring on beta cluster thanks to Yuvi!
 * Wait on second cluster

Browser tests

 * early WIP of environment abstraction layer
 * https://gerrit.wikimedia.org/r/#/c/159644/
 * Geolocation use case? Chris to send email I think :)
 * helping MMV with performance testing using MW-Selenium and an "isolated" labs instance
 * Chris to refactor Echo tests first in conjunction with corefeatures team

Vagrant

 * helping MMV with performance testing using MW-Selenium and an "isolated" labs instance
 * looking for ways to setup a "traffic shaper" role in MWV (using `tc` perhaps)
 * not sure how to achieve more isolation in labs (bigger instance == more dedicated?)
 * need to finalize MWV survey
 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Vagrant_survey

Hiring
HR.....................................................