Wikimedia Release Engineering Team/Checkin archive/20150310

= 2015-03-10 =

Team Quarterly Goals
 * https://phabricator.wikimedia.org/maniphest/query/O9isnUt5IGLP/#R
 * Next quarter goals: made by groups. Ours is ops + all platform.
 * Next year's budget: send doc to team (Greg)
 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/201415Q4

Scrum of Scrums

 * https://phabricator.wikimedia.org/project/board/64/
 * Blocked on us: https://phabricator.wikimedia.org/maniphest/?statuses=open%2Cstalled&allProjects=PHID-PROJ-arpazvuktn2l647rb6us#R

Beta Cluster stability

 * https://phabricator.wikimedia.org/project/board/497/?order=priority
 * Quarterly Priority: Green nightly builds on staging: https://phabricator.wikimedia.org/T88701


 * Working on merging staging and production into unified puppet roles
 * Quarterly Priority: Stable uptime metrics of the Staging cluster: https://phabricator.wikimedia.org/T88705


 * mmodell proposed and prototyped a solution to create metrics from the varnish proxy logs/stats. Essentially the metric is error_rate = 5xx_responses / 2xx_responses (per time period)
 * Something went badly wrong with either beta or Jenkins or both overnight 9 March PST (aka last night?). I've looked in the logs on beta labs and didn't find any smoking guns.
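
A minimal sketch of the error-rate metric described above, error_rate = 5xx_responses / 2xx_responses per time period. It is not mmodell's prototype: the input shape (pairs of Unix timestamp and HTTP status), the function name `error_rates`, and the 60-second default period are assumptions made for illustration, and parsing the varnish logs into such pairs is left out.

```python
from collections import Counter, defaultdict


def error_rates(records, period_seconds=60):
    """Return {period_start: 5xx_count / 2xx_count} for each time period.

    `records` is an iterable of (unix_timestamp, http_status) pairs.
    This input shape is hypothetical; real varnish log lines would need
    to be parsed into it first.
    """
    buckets = defaultdict(Counter)
    for timestamp, status in records:
        # Round the timestamp down to the start of its period.
        period_start = int(timestamp) // period_seconds * period_seconds
        # Count by status class: 2 for 2xx, 5 for 5xx, and so on.
        buckets[period_start][status // 100] += 1

    rates = {}
    for period_start, counts in sorted(buckets.items()):
        ok, errors = counts[2], counts[5]
        # The ratio is undefined for a period with no 2xx responses,
        # so report None instead of dividing by zero.
        rates[period_start] = errors / ok if ok else None
    return rates


if __name__ == "__main__":
    sample = [(0, 200), (10, 200), (20, 503), (30, 200), (59, 200)]
    print(error_rates(sample))
```

One design point worth settling before graphing this: a period with errors but no successful responses has no defined ratio under this formula, so the sketch reports `None` for it rather than a number.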

Test history

 * Quarterly Priority: By team test history: https://phabricator.wikimedia.org/T88706

Trying to work around a Selenium bug with Chrome and how it interacts with WMF-style "overlays". Affects MobileFrontend, Echo, probably VisualEditor repos at least. Antoine:
 * Outlined a few different options with varying scope and complexity
 * Option #2 is worth a spike (Mama Bear's option; just right); can fall back on option #1
 * Need Elasticsearch node for storing build/cucumber results (extend JSON formatter for structured build info and cucumber scenario results)
 * Dashboard to display results

That one is new to me. OpenStack sends results to Logstash / Elasticsearch and build artifacts to Swift. One can reach out on #openstack-infra to figure out how it is done for them.

IIRC they are using a test result protocol named 'subunit' and have plans to collect all their tests in a huge central DB to build reports from. Subunit info: http://www.tech-foo.net/making-the-most-of-subunit.html

Dan: slide 29 of http://docs.openstack.org/infra/publications/2014-gerrit_user_summit-overview/#%2829%29

Demo: http://logstash.openstack.org/
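
As a rough illustration of the "structured build info and cucumber scenario results" idea above, the sketch below flattens a Cucumber JSON report into one document per scenario, which is the kind of record that could then be indexed and shown on a dashboard. It assumes the layout produced by Cucumber's JSON formatter (a list of features, each with `elements` whose `steps` carry a `result` with `status` and `duration`). The function name `scenario_documents`, the output field names, and the `build` dictionary are hypothetical, and nothing here talks to Elasticsearch.

```python
def scenario_documents(report, build):
    """Flatten a parsed Cucumber JSON report into per-scenario documents.

    `report` is the parsed JSON (a list of feature dicts). `build` is a
    dict of build metadata; its keys are made up for this example.
    """
    docs = []
    for feature in report:
        for element in feature.get("elements", []):
            # Backgrounds are shared setup steps, not scenarios.
            if element.get("type") == "background":
                continue
            results = [step.get("result", {}) for step in element.get("steps", [])]
            statuses = [result.get("status", "undefined") for result in results]
            if statuses and all(status == "passed" for status in statuses):
                status = "passed"
            elif "failed" in statuses:
                status = "failed"
            else:
                # Skipped, pending, or undefined steps with no failure.
                status = "incomplete"
            docs.append({
                "build": build,
                "feature": feature.get("name"),
                "uri": feature.get("uri"),
                "scenario": element.get("name"),
                "status": status,
                # Sum of step durations as reported by the formatter.
                "duration": sum(result.get("duration", 0) for result in results),
            })
    return docs
```

Keeping one document per scenario, with the build metadata repeated on each, makes it straightforward to group results by repository or team later, at the cost of some duplication.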

Isolated CI instances

 * CI board https://phabricator.wikimedia.org/tag/continuous-integration/board/?order=priority

Kunal "Legoktm" / Timo "Krinkle" have been quite busy.
 * https://phabricator.wikimedia.org/tag/continuous-integration-isolation/
 * Quarterly Priority: Disposable VMs - https://phabricator.wikimedia.org/T47499

Next: Packaging with dh-virtualenv (which embeds python modules in the .deb) is a mess. We should probably migrate the whole infra to Debian/Jessie including the production boxes.
 * Created a Debian package for Zuul targeting Ubuntu Precise https://phabricator.wikimedia.org/T48552 // https://gerrit.wikimedia.org/r/#/c/195272/
 * test / refine it
 * create one for Trusty
 * Design the Jenkins isolation architecture https://phabricator.wikimedia.org/T86171
 * OVERDUE: Antoine to meet with relevant ops.  Who?
 * Action for Antoine: file a task for ops. Requires one labs ops + a Debian packaging guru :)
 * off meeting: filed https://phabricator.wikimedia.org/T92324

MediaWiki Releases

 * Quarterly Priority: Release MediaWiki 1.25: https://phabricator.wikimedia.org/T88709

Vacations/Confs/etc

 * Chad at Elasticon with Nik 10-11 March, in SF
 * Dan in France the week before offsite (May 11-15)
 * Elena on vacation May 26 - June 7
 * Week before hackathon: Team offsite in France - https://phabricator.wikimedia.org/T89036
 * May: Hackathon in Lyon, France
 * Lyon used to be the capital of the area a long time ago. Nice old city, lots of great food, reasonably sunny/hot.
 * Chad vacation after offsite