Wikimedia Release Engineering Team/Checkin archive/20150324

= 2015-03-24 =

Team Business
(Antoine summarized)
 * Deployment discussions happening
 * Streamline our service development and deployment process https://phabricator.wikimedia.org/T93428
 * Gabriel Wicke contacted Antoine about using Docker to provision/deploy services and wondered whether we could use it for CI. It seems to overlap with Vagrant; it would be nice if Dan could figure that out with Gabriel :-)
 * Next quarter:
 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/201415Q4#People.2FThings_for_Q4
 * https://www.mediawiki.org/wiki/Wikimedia_Engineering/2014-15_Goals#Release_Engineering

Team Quarterly Goals
https://phabricator.wikimedia.org/maniphest/query/O9isnUt5IGLP/#R

Scrum of Scrums

 * https://phabricator.wikimedia.org/project/board/64/
 * Blocked on us: https://phabricator.wikimedia.org/maniphest/?statuses=open%2Cstalled&allProjects=PHID-PROJ-arpazvuktn2l647rb6us#R

Isolated CI instances

 * https://phabricator.wikimedia.org/tag/continuous-integration/board/?order=priority
 * Quarterly Priority: Disposable VMs - https://phabricator.wikimedia.org/T47499


 * Procurement for 2 servers in labs subnet with Jessie: https://phabricator.wikimedia.org/T93076
 * Zuul packaged for Precise and apparently working \o/
 * Have to rebase Trusty, then work on a Jessie package with OpenStack folks / Debian Python teams.
 * Started Nodepool packaging targeting Jessie (way easier)
 * Will need Puppet work with Zeljkof.

Beta Cluster stability

 * https://phabricator.wikimedia.org/project/board/497/?order=priority
 * Quarterly Priority: Green nightly builds on staging: https://phabricator.wikimedia.org/T88701
 * Quarterly Priority: Stable uptime metrics of the Staging cluster: https://phabricator.wikimedia.org/T88705

 * Possibly need additional volumes mounted on the backend (see what we do with Elastic?)
 * l10nupdate is broken: the log file (/var/log/l10nupdate/l10nupdate.log) says it is out of disk space (no task filed)
 * beta-scap-eqiad has rebuilt the l10n cache on every run since March 17th, causing the build to take more than 10 minutes.
 * https://phabricator.wikimedia.org/T93737
 * Chad working on finalizing staging-tin
 * I'm digging into Swift stuff... confused... but making progress
 * Aside from Parsoid, most of what remains is MW app servers, i.e. they need staging-tin

Test history

 * Quarterly Priority: By team test history: https://phabricator.wikimedia.org/T88706


 * Spiking on Elasticsearch + custom dashboard this week
 * If it's too difficult/expensive, will implement Jenkins dashboards next

CI

 * Jenkins upgraded to latest LTS yesterday at 11pm UTC (security update)
 * Some Jenkins plugins upgraded (some we don't want to upgrade, like git* and ansicolor)
 * Gallium/Lanthanum disks are filling up because mediawiki/core is cloned in each job.
 * https://phabricator.wikimedia.org/T93703
 * We could use a shared clone out of a mirror.
 * In Zuul, most repos share the same gate-and-submit queue because they now share common jobs.
 * Would need to patch Zuul server to change the behavior.
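
One possible shape for the "shared clone out of a mirror" idea above, as a minimal sketch only: git's `--reference` option lets a fresh clone borrow objects from a local repository instead of storing its own copy, which should cut per-job disk usage. The mirror path, remote URL, workspace name, and the helper name `build_reference_clone_command` are all hypothetical and not taken from the actual Jenkins setup.

```python
import shlex


def build_reference_clone_command(mirror_path, remote_url, workspace):
    """Return the git argv for a clone that borrows objects from a local mirror.

    All three arguments are placeholders; adapt them to the real job layout.
    """
    return [
        "git", "clone",
        "--reference", mirror_path,  # reuse objects already stored in the mirror
        remote_url,                  # upstream repository to clone from
        workspace,                   # per-job checkout directory
    ]


if __name__ == "__main__":
    cmd = build_reference_clone_command(
        "/srv/git-mirrors/mediawiki-core.git",     # hypothetical mirror location
        "https://example.org/mediawiki/core.git",  # hypothetical remote URL
        "workspace/mediawiki-core",                # hypothetical job workspace
    )
    # Print the command rather than running it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

A clone made this way depends on the mirror staying in place, so the mirror would have to be kept on the slave for as long as job workspaces reference it.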

MediaWiki Releases

 * Quarterly Priority: Release MediaWiki 1.25: https://phabricator.wikimedia.org/T88709
 * Branching at wmf26 (i.e. on April 8th)

Vacations/Confs/etc

 * Dan in France the week before offsite (May 11-15)
 * Week before hackathon: Team offsite in France - https://phabricator.wikimedia.org/T89036
 * May: Hackathon in Lyon, France
 * Chad vacation after offsite (maybe, dubious now)
 * Elena on vacation May 26 to June 7
 * Antoine: observing French holidays: Mon April 6th, Fri May 1, Fri May 8, Thurs May 14