Wikimedia Release Engineering Team/Quarterly review, April 2014/Notes

The following are notes from the Quarterly Review meeting with the Wikimedia Foundation's Release and QA team, April 30, 2014, 11:00 am - 12:30 pm PDT.

Present: Greg Grossmeier, Tilman Bayer (taking minutes), Chris Steipp, Dan Garry, James Forrester, Rob Lanphier, Quim Gil, Erik Möller, Terry Chay

Participating remotely: Bryan Davis, Mark Bergsma, Chris McMahon, Tomasz Finc, Željko Filipin, Arthur Richards, Antoine Musso

''Please keep in mind that these minutes are mostly a rough transcript of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.''

Agenda

Greg:

welcome

last quarter: What we said
still processing pain points from session we had

refactored Scap, now in Python

matrix of deployment requirements per tool - useful to see differences between tools

e.g. can see that "single tool to rule them all" won't happen

Beta Cluster:

last q, planned migration to eqiad - was a much bigger task than anticipated

set up db slaves in beta

and use beta to test Scap

Scap is now the only thing that deploys code on beta cluster

Erik: Awesome ;) how is it deployed, in batches, or every time?

Bryan: job every 10min will update core + all extensions, takes about 1min to check out from git

then triggers a 2nd Jenkins job that runs Scap

Currently every Puppet run updates Scap to latest master

In the future we will deploy Scap itself with Trebuchet, so we can cherry-pick things from Gerrit
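
The two-stage flow Bryan describes (a periodic job updates core and all extensions, then triggers a second job that runs Scap) can be sketched roughly as follows. This is an illustration only, not the actual Jenkins configuration: the function name `update_then_sync`, the injected callables, and the rule that the sync step is skipped when any update fails are all assumptions made for the example.

```python
from typing import Callable, Iterable, List


def update_then_sync(
    repos: Iterable[str],
    update_repo: Callable[[str], bool],
    run_sync: Callable[[], None],
) -> List[str]:
    """Update every repository, then trigger the sync step.

    `update_repo` stands in for the git checkout of one repository and
    returns True on success. `run_sync` stands in for the second job that
    runs Scap. Assumption: the sync is only triggered when every update
    succeeded. Returns the repositories that failed to update.
    """
    failed = [repo for repo in repos if not update_repo(repo)]
    if not failed:
        run_sync()
    return failed


if __name__ == "__main__":
    # Hypothetical repository names, with stubs in place of git and Scap.
    failures = update_then_sync(
        ["core", "extensions/Example"],
        update_repo=lambda repo: True,
        run_sync=lambda: print("sync triggered"),
    )
    print("failed updates:", failures)
```

Passing the update and sync steps in as callables keeps the sketch independent of how the real jobs invoke git or Scap.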

Chris M: on browser testing:

wanted to be able to use API for test runs

made that happen (Firefox), create article, create user

Bug 62509 - investigate updating test2 Jenkins builds weekly to pull by branch

Work with Mobile took priority

Erik: so we will eliminate all dependencies on Cloudbees?

Chris: yes, working on that currently

had a lot of false failures simply because of Internet connection problems, don't own either endpoint, so can't debug this

will also get 5x increase in speed

and many diagnostic tools we don't want to build ourselves, such as screencasts (e.g. currently trying to debug VE stuff without actually seeing what happens on the screen)

Erik: yes, matches what I heard from VE team

Chris M: don't want to decommission Cloudbees completely yet though

Greg: last q planned 2 hires, Release Engineer and Automation Engineer, getting close

shows diff from Goals page ;) (slide) - things changed a lot in 6 months

next quarter
deployment tooling:

big one: integrate HHVM - lot of work for Bryan

stretch goal: integrate Scap and Trebuchet

Beta cluster:

completed transition to Scap

HHVM support

stretch goal: Swift cluster (media storage, thumbnails) into beta cluster. Depends on new hire or support from Ops

Mark: Ops may be able to work on this. We are also looking at including Swift in Labs, could reuse that

Erik: Thought about a provisioning mechanism for virtual machines, e.g. for Vagrant testing

Greg: Vagrant support in general is on

Erik: also to test multiple branches. Beta labs can't be the only part of testing infrastructure. new engineer could work on that?

RobLa: yes

Chris: yes, within expertise of new hires

(Greg:)

MediaWiki release (mostly my own work)

support release of MW 1.23 with Markus and Mark

begin second RFP around mid May

Chris St: do we have browser tests for MW 1.23, since it's kind of LTS?

Chris M: not really

James: VE hopefully will keep working with 1.23, other extensions maybe too

QA

ACTION: create a plan for browser testing of MediaWiki 1.23
 * use tags for builds corresponding to release versions
 * retire Cloudbees
 * integrate WMF Jenkins with new Saucelabs account (5x faster)
 * Use API - extensively used for MobileFrontend, but also e.g. VE, Flow.
 * Work with Matt on browser tests. had some browser tests for GS, but they became less useful. Growth team has fast iterations, longer term tests less useful. New era now where we can target any dev environment

New hires (cf. above)

Dependencies

Ops: Swift (should be fine per Mark's remarks), Icinga - have instance, but still needs to be set up for Beta labs

Mark: Actually, Ops is moving away from Icinga (new tool: Shinken). I will look into that on our side.

Greg: and I'll talk to Antoine

ACTION: Greg to get firm requirements from Antoine, then circle back to Mark, who'll have an idea of Ops' timeline for production Shinken.

Questions/discussion
James: right now, HipHop uses the vast majority of testing time/resources... and still fails

could that be done after the build, for now? ("by the way, this didn't pass HipHop")

Greg: ok, action item. Some repositories may still need this

ACTION: figure out how to keep HHVM unit tests from delaying +2 for standard production commits
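
One way to picture James's suggestion is a merge gate in which some jobs report their result without blocking. The sketch below is only an illustration under that assumption: `JobResult`, `can_merge`, `advisory_notes` and the job names are hypothetical and do not describe the actual Jenkins or Gerrit setup.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class JobResult:
    name: str
    passed: bool
    voting: bool = True  # a non-voting job reports but never blocks a merge


def can_merge(results: Iterable[JobResult]) -> bool:
    """A change may merge when every voting job has passed."""
    return all(r.passed for r in results if r.voting)


def advisory_notes(results: Iterable[JobResult]) -> List[str]:
    """Collect "by the way" notes for non-voting jobs that failed."""
    return [
        f"by the way, this didn't pass {r.name}"
        for r in results
        if not r.voting and not r.passed
    ]


if __name__ == "__main__":
    # Hypothetical job names for illustration.
    results = [
        JobResult("unit tests", passed=True),
        JobResult("HipHop", passed=False, voting=False),
    ]
    print(can_merge(results))
    print(advisory_notes(results))
```

Repositories that still need the HHVM result to block could keep that job voting, which matches Greg's remark that some repositories may still need it.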

Erik: VE team figured out how to run browser tests locally some weeks ago. Wondering about this kind of knowledge gap in other teams. Chris, are there other teams you want me to talk to about this?

Chris M: worked out some things with Mobile (Jon), two week deep dive. With VE, I learned things that enabled me to run things more efficiently, find bugs. So these 1:1 conversations are very helpful. Talked with Flow. Not yet with Growth team beyond some basic conversations with Matt.

Erik: Growth might not need this level of browser testing for their short-term experiments - but for their stuff that goes into production, yes

What about Language Engineering?

ChrisM: Zeljko?

Zeljko: talked to various members of the team about testing, productive conversation

Erik: working on browser tests for content translation?

Zeljko: yes

Erik: I'm most worried about that one, lot of complexity (uses VE + some). I'll look into other opportunities for collaboration

Quim: about syncing between your deployment newsletter and Tech News, working out?

Greg: yes, frequently PMing with Odder

Wondering about coverage in the Signpost - Tech News not that well read everywhere

involve DCE (director of community engagement)

Any feedback from VE and Mobile?

James: really happy about move from Cloudbees. loss of Jeff was sad of course, but:

CI worked well for us, Antoine was super helpful. We are happy with the level of support

Regarding Beta Features, not the same level of testing there

Greg: add objective: move conversation forward on ...[fill this out]

Erik: Arthur, do you want to comment on team practices, org-wide?

Arthur: mobile web team pushed forward on testing, I paired with ChrisM, got a lot of pain points resolved during 2 weeks

might be a challenge on a wider scale because different teams have different testing priorities

QA did a good job changing culture with e.g. browser tests, but more to be done re e.g. unit testing

some standards and best practices might be needed, but don't have anything concrete right now

Erik: not a fan of global strict enforcement of standards, but should raise awareness among code reviewers

would welcome your ideas about e.g. cross team embedding

Arthur: getting a good sense of what works from current collaboration with e.g. Flow

code review is more the enforcement aspect, but need culture change before that

Greg: Other feedback?

ChrisM: want to highlight Zeljko's work, he's also the contact point for Wikidata

Tomasz: support for apps?

ChrisM: no plan right now, should be possible with Saucelabs though

Erik, Greg: thanks everyone