Continuous integration/Browser tests

Status: https://integration.wikimedia.org/ci/view/BrowserTests/ runs the browser tests on Sauce Labs browser instances on a schedule. CloudBees is no longer used.

Tracking bug: | Tree view

Context
VisualEditor is an important project for the Wikimedia community. It has been deployed in production and now needs to mature by gaining support for more languages, input methods and browsers. The QA team is pairing with the VisualEditor team to have them write Selenium-based browser tests. We eventually want those tests to run whenever a new patchset is sent to Gerrit and to block the patchset whenever a regression occurs.

Ultimately, the QA team and the release team would like the browser tests to run whenever someone sends a new patch to the Gerrit repository, to make sure patches landing in the repository do not introduce regressions.

Scope
The Selenium WebDriver uses Ruby gems that are not welcome in production since they are often not packaged for Debian. We have historically worked around that limitation by using the hosted Jenkins service at cloudbees.com. However, that service comes with several limitations:


 * The current CloudBees setup has a limited number of executors. We could get more if needed though.
 * Gerrit events are processed by Zuul which, as of September 18, 2013, is not able to drive more than one Jenkins instance. It could potentially drive more than one Jenkins if we get it upgraded to support Gearman and have CloudBees install the related plugin. That seems unrealistic, since Antoine has already been trying to get Zuul upgraded for a few months.

The browser tests run twice per day. Web browsers are provisioned on Sauce Labs, a service we will definitely reuse. Each browser is pointed at test2.wikipedia.org or the beta cluster instance.

Goals
After a few discussions with Željko, the main subprojects are:


 * Migrating the jobs from CloudBees. ✅
 * Have a labs instance set up with Ruby, the related gems, the Selenium WebDriver and PhantomJS.
 * Make the labs instance a Jenkins slave. ✅
 * Reuse the mw/core QUnit idea to bootstrap a MediaWiki instance that browsers will be pointed at.
 * Migrate CloudBees jobs to Wikimedia Jenkins. ✅
 * Configure Zuul to trigger the jobs. Currently the tests run twice a day on a schedule.
 * Configure Sikuli for testing typing in different languages.

CloudBees jobs migrations
Ideally we would want the CloudBees jobs to be migrated to Jenkins Job Builder. That requires some extra steps which are not really necessary to successfully trigger VisualEditor jobs from Gerrit. It seems easier to simply import whatever templating system is being used on CloudBees and reuse it as is on the Wikimedia Jenkins server.

The migration to Jenkins Job Builder should be done later on.

Labs instance setup
We can stick to a small instance at the beginning. Scaling out must be as simple as creating a new instance, applying a puppet manifest and adding the instance as a slave of the Wikimedia Jenkins installation. This part of the document describes the building blocks needed for a worthwhile Jenkins slave able to drive browser tests.

phantomjs
Željko knows about this part

The idea would be to point a headless JavaScript browser at the MediaWiki instance. It would interact with it over the loopback interface and would barely be subject to any network latency. Passing the PhantomJS tests would be a prerequisite before having the Sauce Labs browsers pointed at the patch.
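The prerequisite ordering described above can be sketched as a short pipeline that stops at the first failing stage. This is only an illustration: the stage names and the `run_stages` helper are hypothetical, and the runners are placeholders for whatever actually launches the PhantomJS and Sauce Labs runs.

```python
from typing import Callable, List, Tuple

# A stage is a (name, runner) pair; the runner returns True when the stage passes.
# Both the type alias and the stage names used with it are hypothetical.
Stage = Tuple[str, Callable[[], bool]]


def run_stages(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that were actually executed, so a
    failing local PhantomJS stage means the remote stage is never started.
    """
    executed = []
    for name, runner in stages:
        executed.append(name)
        if not runner():
            break  # later, more expensive stages are skipped
    return executed
```

With a list such as `[("phantomjs-local", run_phantomjs), ("saucelabs-remote", run_saucelabs)]`, the remote browsers are only used once the local headless run has passed.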

Selenium webdriver
The Selenium WebDriver uses Ruby and several gems. The list is maintained in the Gemfile of the qa/browsertests git repository. One will first have to install the bundler gem, then use it to interpret the Gemfile.lock and install Selenium and all its dependencies.

The bundler gem is not provided as a Debian package, which makes it hard to puppetize properly. Wikimedia puppet does not support gems as a package provider, hence we would have to get puppet to run the gem install command.

Whenever an update to the qa/browsertests repository alters the gem list, we will have to rerun the bundler installer.
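One way to decide when the installer needs to run again is to compare a digest of Gemfile.lock with the digest recorded after the last successful install. The sketch below is an assumption about how this could be done, not a description of the existing setup: the function names and the stamp file are hypothetical, and the install step itself (for example `bundle install`) is left to the caller.

```python
import hashlib
from pathlib import Path


def lockfile_digest(lockfile: Path) -> str:
    """Return the SHA-256 hex digest of the lock file contents."""
    return hashlib.sha256(lockfile.read_bytes()).hexdigest()


def needs_bundle_install(lockfile: Path, stamp: Path) -> bool:
    """True when no install was recorded yet or the gem list changed since."""
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != lockfile_digest(lockfile)


def record_bundle_install(lockfile: Path, stamp: Path) -> None:
    """Remember which lock file contents the last install was run against.

    The stamp file is a hypothetical marker kept next to the checkout.
    """
    stamp.write_text(lockfile_digest(lockfile))
```

A job would call `needs_bundle_install` after updating the checkout, run the installer only when it returns True, then call `record_bundle_install`.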

The qa/browsertests repository provides several features and implementations for others to build on. The VisualEditor extension has its own set of features provided in.

Jenkins slave
The puppet class role::ci::slave is suitable for running most of the Wikimedia Jenkins jobs. It would need to be made more generic, since it currently:
 * installs packages not needed by the browser tests (contint::packages)
 * sets the host as a Gerrit replication destination, which might be unwanted in our context
 * creates a tmpfs

Once we have a generic class, we can extend it to set up a browser tests Jenkins slave.

The bug requesting the puppet class is

MediaWiki script
When a patch is submitted, we will want to fetch MediaWiki, clone the VisualEditor patchset under extensions/ then install MediaWiki in a unique directory. We need a generic Apache virtual host that would publish MediaWiki under .instance-proxy.wmflabs.org.

The setup is very similar to the qunit jobs we have described in Jenkins Job Builder. We would need:
 * integration/Jenkins scripts (tools/extensions-loader.php and fetch-mw-ext)
 * an Apache configuration file
 * the shell one-liner that runs MediaWiki maintenance/install.php

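The steps above can be sketched as a function that only plans the commands, without running them. Everything specific here is an assumption made for illustration: the directory naming, the example.org repository URLs, the wiki and admin names, the password and the choice of SQLite are all hypothetical, and plain git commands stand in for the fetch-mw-ext script mentioned above. The `refs/changes/...` layout is the usual Gerrit convention for patchset refs.

```python
from pathlib import Path
from typing import List


def gerrit_ref(change: int, patchset: int) -> str:
    """Build the Gerrit ref for a patchset: refs/changes/<last 2 digits>/<change>/<patchset>."""
    return f"refs/changes/{change % 100:02d}/{change}/{patchset}"


def install_dir(base: Path, change: int, patchset: int) -> Path:
    """Unique directory per patchset (hypothetical naming scheme)."""
    return base / f"change-{change}-ps{patchset}"


def plan_commands(target: Path, core_url: str, ext_url: str, ext_ref: str) -> List[List[str]]:
    """Return the commands to fetch MediaWiki, check out the patchset and install."""
    ext_dir = target / "extensions" / "VisualEditor"
    return [
        ["git", "clone", "--depth", "1", core_url, str(target)],
        ["git", "clone", ext_url, str(ext_dir)],
        ["git", "-C", str(ext_dir), "fetch", "origin", ext_ref],
        ["git", "-C", str(ext_dir), "checkout", "FETCH_HEAD"],
        # SQLite, the wiki name, the admin name and the password are assumptions.
        ["php", str(target / "maintenance" / "install.php"),
         "--dbtype", "sqlite", "--dbpath", str(target / "data"),
         "--pass", "hypothetical-password", "TestWiki", "WikiAdmin"],
    ]
```

Keeping the planning separate from the execution makes it easy to log or review the commands before a Jenkins job runs them.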
Zuul configuration
The exact workflow would need to be figured out with the VisualEditor team. A first step would be to run the browser tests after a change has been merged and report back in Gerrit for their information.

At first we will only want to run browser tests after a change has been merged. This lets us load the system progressively without disrupting the VisualEditor authors.

Later on, we can make Zuul post a message in Gerrit stating the result of the tests. That will raise awareness among developers, who can then work on fixing the failing tests. Additionally, we can have the browser tests triggered on every patchset, in which case the load on the labs instance will have to be carefully monitored.

Once the tests are passing reliably and the setup has proven useful, we can make Zuul block patchsets that do not pass the browser tests. The developers would then be required to fix the code or the test to have their change merged.
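The progressive rollout can be summarized as a small decision table. The sketch below is one reading of the stages described in this section, not an actual Zuul configuration: the `Phase` names and the `decide` helper are hypothetical. Only the event names `change-merged` and `patchset-created` come from Gerrit itself.

```python
from dataclasses import dataclass
from enum import IntEnum


class Phase(IntEnum):
    """Rollout stages, in the order they would be enabled (hypothetical names)."""
    POST_MERGE = 1          # run after merge only
    POST_MERGE_REPORT = 2   # also report the result back in Gerrit
    EVERY_PATCHSET = 3      # also run on every new patchset
    GATING = 4              # failing tests block the patchset


@dataclass(frozen=True)
class Decision:
    run: bool       # should the browser tests be triggered?
    report: bool    # should the result be posted in Gerrit?
    blocking: bool  # does a failure prevent the merge?


def decide(phase: Phase, event: str) -> Decision:
    """Map a rollout phase and a Gerrit event name to the expected behaviour."""
    if event == "change-merged":
        run = True
    elif event == "patchset-created":
        run = phase >= Phase.EVERY_PATCHSET
    else:
        run = False
    report = run and phase >= Phase.POST_MERGE_REPORT
    blocking = run and phase is Phase.GATING and event == "patchset-created"
    return Decision(run, report, blocking)
```

Moving from one phase to the next only widens what is triggered or reported, which matches the intent of loading the system gradually.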

This is tracked by