Continuous integration/status

Last update on: 2013-11-04

2011-07-25
This project aims to rebuild the legacy Wikimedia continuous integration server (currently hosted on a virtual machine) on a dedicated server in eqiad, our new data center. Chad Horohoe started to consolidate the platform to run automated tests systematically at post-commit time, to check that the SVN trunk is in an (almost) constantly deployable state. This project also supports the goal of more frequent code deployments, as continuous integration will give us more confidence in new code if it has already passed the automated tests. The new server will be combined with TestSwarm, a distributed continuous integration tool for JavaScript, currently hosted on the Toolserver. Timo Tijhof reached out to the TestSwarm team, who were enthusiastic about incorporating our improvements, notably on performance.

2011-08-31
Chad Horohoe continued to set up the virtual machine environment, while the Operations team set up the physical hardware in the Virginia data center. The final server will use Jenkins instead of CruiseControl.

2011-09-30
Chad Horohoe worked with Daniel Zahn to set up the dedicated server in our Virginia data center. Its configuration was automated with Puppet. TestSwarm remains to be puppetized. The server is expected to be put in production in early October.

2011-10-31
Chad Horohoe worked with the operations team to finalize the setup of the testing server, now online at https://integration.wikimedia.org. It is currently running Jenkins (PHPUnit), and TestSwarm should be added soon. Antoine Musso will be leading this project going forward.

2011-11-16
Antoine Musso has been making many improvements and additions to the system. Chad Horohoe has the beginnings of a TestSwarm .deb package, which he will soon check in and hand off to Antoine (see bug 32433).

2011-11-30
Chad Horohoe and Antoine Musso created a Debian package for TestSwarm, so it can be installed in our common infrastructure with Jenkins. Antoine and Timo Tijhof wrote a script to fetch individual revisions of MediaWiki's code, in order to run tests against each one of them. PostgreSQL testing in Jenkins is planned for implementation in the coming weeks.

2011-12-16
TestSwarm (used to test our JavaScript) was packaged and now runs from Labs. This is a first step toward production deployment (tracked by RT 2059).

2011-12-22
TestSwarm is now deployed in production and is linked from the continuous integration portal. The package was Debianized and its configuration is almost entirely managed by Puppet. The operations team was a great help in assisting us with the deployment. Thanks to them!

2011-12-31
<section begin=2011-12-31/>The TestSwarm package was Debianized and its configuration almost entirely entered into Puppet. Antoine Musso and Daniel Zahn deployed TestSwarm to the production continuous integration portal.<section end=2011-12-31/>

2012-01-12
<section begin=2012-01-12/>Jenkins now runs our PHPUnit test suite against a PostgreSQL backend. That should help us stabilize our support for that DBMS.<section end=2012-01-12/>

2012-01-31
<section begin=2012-01-31/>The team has rearranged Jenkins jobs to make them easier to manage in the long run, and to add capacity. TestSwarm is pending testing of the new Special:JavaScriptTest page.<section end=2012-01-31/>

2012-02-29
<section begin=2012-02-29/>The focus for February has been around integration with Git and Gerrit (bug 34141). Various Ant configurations were merged into a single Ant configuration, reducing duplication and opportunity for breakage. Also new in February is that the tests have been broken up into three suites: dbless, db, and parser. Only dbless tests (which run quickly) will block a commit. Work on CI slowed down due to Antoine being pulled into 1.19 bugfixing.<section end=2012-02-29/>
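
The blocking rule from the February 2012 entry can be sketched as follows. This is a minimal illustration, not the actual Ant or Jenkins configuration: the function name `commit_is_blocked` and the shape of the `results` dictionary are hypothetical, and only the three suite names come from the entry above.

```python
# Hypothetical sketch: only the fast "dbless" suite gates a commit.
# The suite names come from the status entry; everything else is assumed.
BLOCKING_SUITES = {"dbless"}
ALL_SUITES = ("dbless", "db", "parser")


def commit_is_blocked(results):
    """Return True if any blocking suite has failed.

    `results` maps a suite name to True (passed) or False (failed).
    Suites missing from `results` are treated as not yet run and
    therefore do not block.
    """
    return any(results.get(suite) is False for suite in BLOCKING_SUITES)
```

Under these assumptions, a failure in the `db` or `parser` suite is still reported but does not stop the commit, which keeps feedback on the blocking path fast.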

2012-03-31
<section begin=2012-03-31/>This activity was somewhat deprioritized in March in favor of the Git migration. Nonetheless, Jenkins is now running the PHPUnit test suite and reporting test results in the Gerrit interface. This will help catch possible culprits as soon as a patch is submitted. Timo Tijhof wrote workflow specifications for continuous integration. Over the course of April, the Jenkins/Gerrit interaction will be polished and we will start looking at Selenium and bringing TestSwarm back into action.<section end=2012-03-31/>

2012-04-25
<section begin=2012-04-25/>Jenkins has been upgraded, providing a nicer GUI, and the jobs rewrite has been deployed.

Progress was made on implementing a universal linter for all Gerrit changes (so far it lints all modified PHP files as part of the Jenkins job for each mediawiki/core changeset in Gerrit).

The TestSwarm connection with Jenkins has been established, and TestSwarm is now running MediaWiki's QUnit test suite again. The TestSwarm installation became idle after the migration to Gerrit and Git because it was configured for SVN. The old configuration is now disabled, and everything is handled by Jenkins instead.

Christian Aistleitner created a test suite for the (rewritten) MWDumper system; the tests are also monitored live on Jenkins.<section end=2012-04-25/>

2012-04-monthly
<section begin=2012-04-monthly/>Jenkins has been upgraded, providing a nicer GUI. Progress was made on implementing a universal linter for all Gerrit changes, and not just those with modified PHP files. The TestSwarm connection with Jenkins has been established, and TestSwarm is now running MediaWiki's QUnit test suite again. Christian Aistleitner created a test suite for the (rewritten) MWDumper system, and the tests are monitored live on Jenkins.<section end=2012-04-monthly/>

2012-05-monthly
<section begin="2012-05-monthly"/>Timo Tijhof continued to work on the TestSwarm rewrite. The team is considering moving the continuous integration environment into Wikimedia Labs. The new TestSwarm version will probably be first deployed in the new environment instead of the current environment.<section end="2012-05-monthly"/>

2012-06-27
<section begin="2012-06-27"/>Timo Tijhof is working on installing the new TestSwarm version on a Labs instance. We will use the BrowserStack system to set up various browsers to run tests for us.

Jenkins has been upgraded to the latest version, 1.472.<section end="2012-06-27"/>

2012-06-monthly
<section begin="2012-06-monthly"/>Timo Tijhof is working on setting up the new TestSwarm in Wikimedia Labs. We will use the TestSwarm and BrowserStack API through the testswarm-browserstack bridge to automatically populate the swarm with needed browsers. Antoine Musso upgraded Jenkins to the latest version, 1.472.<section end="2012-06-monthly"/>

2012-07-16
<section begin="2012-07-16"/>A recurring request is to run extension unit tests. Antoine wrote a set of Ant helpers to get extension code and apply a Gerrit change. The first experiment was with the Wikidata project, which revealed issues with other parts of the build scripts, so this is still a work in progress. In the medium term, we will add unit testing for ArticleFeedbackv5 and MobileFrontend. In September the rest of the extensions will follow.<section end="2012-07-16"/>

2012-07-monthly
<section begin="2012-07-monthly"/>Antoine Musso automated the process of updating extension code from Git/Gerrit using Ant, for purposes of automating unit tests on extensions. The first experiment was with the Wikidata project which revealed issues with other parts of the build scripts, so this is still a work in progress. Antoine will be out for much of August, and his primary focus has been on Beta Labs, so work in this area will resume in September.<section end="2012-07-monthly"/>

2012-08-29
<section begin="2012-08-29"/>Extension:TitleBlacklist is the first extension for which tests are now automatically run under Jenkins. The dashboard is at https://integration.wikimedia.org/ci/job/Ext-TitleBlacklist/ and build status is sent back to Gerrit.<section end="2012-08-29"/>

2012-08-monthly
<section begin="2012-08-monthly"/>The TitleBlacklist extension is the first MediaWiki extension for which tests are now automatically run via Jenkins. The dashboard is at https://integration.wikimedia.org/ci/job/Ext-TitleBlacklist/ and build status is sent back to Gerrit.<section end="2012-08-monthly"/>

2012-09-24
<section begin="2012-09-24"/>Antoine is integrating a new Gerrit/Jenkins gateway to let us finely tune how we trigger jobs in Jenkins. The system comes from OpenStack and is written in Python: http://ci.openstack.org/zuul/zuul.html<section end="2012-09-24"/>

2012-09-27
<section begin="2012-09-27"/>The new gateway has been set up on Labs. The production jobs are being migrated to use the new system. The Gerrit tool will need some upstream patches; the way to get them onto our production server is being discussed with Chad Horohoe.<section end="2012-09-27"/>

2012-09-monthly
<section begin="2012-09-monthly"/>Antoine is integrating a new Gerrit/Jenkins gateway to let us finely tune how we trigger jobs in Jenkins. The system comes from OpenStack and is written in Python. Also, we've set up the new gateway on Labs. The production jobs are being migrated to use the new system. The Gerrit tool will need some upstream patches, and the way to get them onto our production server is being discussed with Chad Horohoe. Timo Tijhof has rewritten the testswarm-browserstack bridge in preparation for a more scalable deployment with automated browser worker creation and termination following the TestSwarm queue. This is currently being tested at integration.wmflabs.org.<section end="2012-09-monthly"/>

2012-10-16
<section begin="2012-10-16"/>The new Gerrit/Jenkins gateway has been integrated on Labs and is under alpha testing. Antoine has been experimenting with a Jenkins job builder that should make it easier to maintain our various jobs.<section end="2012-10-16"/>

2012-10-monthly
<section begin="2012-10-monthly"/>The continuous integration server has been upgraded to Precise, which will let us install more recent versions of various testing software. This upgrade also made it possible to deploy Zuul in production.<section end="2012-10-monthly"/>

2012-11-13
<section begin="2012-11-13"/>A CI summit happened in the Netherlands over the weekend; more information will be forthcoming this week.<section end="2012-11-13"/>

2012-11-20
<section begin="2012-11-20"/>integration-jenkins2 is now fully operational with Jenkins / Gerrit and a Zuul installation. Antoine has generated the new MediaWiki core Jenkins jobs using the Jenkins Job Builder Python script. Testing showed that our old Ant script does not play well with the new setup, so we will attempt to migrate PHP linting and PHPUnit triggering to Grunt tasks.<section end="2012-11-20"/>

2012-11-22
<section begin="2012-11-22"/>Zuul has been deployed in production successfully. It triggers a new set of Jenkins jobs that will eventually replace the old MediaWiki-.* ones. Nothing will be reported back to Gerrit until we are happy with the new jobs.<section end="2012-11-22"/>

2012-11-26
<section begin="2012-11-26"/>The new Jenkins jobs for MediaWiki core (triggered by Zuul) have been tested in production and are successful. The new workflow has been documented on Continuous integration/Workflow and comes with a flowchart.<section end="2012-11-26"/>

2012-11-monthly
<section begin="2012-11-monthly"/>A continuous integration summit occurred during the Netherlands Hackathon. integration-jenkins2 is now fully operational with Jenkins / Gerrit and a Zuul installation. Antoine Musso has generated the new MediaWiki core Jenkins jobs. Zuul has been deployed in production successfully. It triggers a new set of Jenkins jobs that will eventually replace the old MediaWiki-.* ones. The new Jenkins jobs for MediaWiki core (triggered by Zuul) have been tested in production and are successful. The new workflow has been documented.<section end="2012-11-monthly"/>

2012-12-11
<section begin="2012-12-11"/>Jenkins jobs are now triggered by Zuul. Several MediaWiki extensions have been added in Jenkins. We still have to figure out how to install an extension with all its dependencies. The votes cast by Jenkins are being reviewed and will most probably be adapted by the end of the month.<section end="2012-12-11"/>

2012-12-14
<section begin="2012-12-14"/>The last Jenkins jobs (mostly Analytics ones) that were still using the Gerrit Trigger plugin have been migrated to be triggered by Zuul. Volunteer Merlijn van Deen built a script to replicate our Jenkins installation (https://lists.wikimedia.org/pipermail/wikitech-l/2012-December/065088.html) and worked on having extension tests run on different MediaWiki branches.<section end="2012-12-14"/>

2012-12-19
<section begin="2012-12-19"/>Zuul now supports triggering tests for whitelisted users. This has been deployed to let Wikimedia staff have unit tests run whenever they send a patchset to mediawiki/core.<section end="2012-12-19"/>
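
The whitelist check described in this entry could look roughly like the sketch below. It is an illustration only and does not reproduce Zuul's configuration syntax or internals: the function name `should_trigger_tests`, the email pattern, and the idea of matching on the author's email address are all assumptions made for the example.

```python
import re

# Hypothetical whitelist: this pattern is an assumption for illustration,
# not the list actually deployed.
WHITELIST_PATTERNS = [re.compile(r"^.+@wikimedia\.org$")]

# The entry only mentions mediawiki/core, so the sketch is limited to it.
WHITELISTED_PROJECT = "mediawiki/core"


def should_trigger_tests(author_email, project):
    """Return True if a new patchset should trigger unit tests.

    Tests run only for the whitelisted project, and only when the
    patchset author matches one of the whitelist patterns.
    """
    if project != WHITELISTED_PROJECT:
        return False
    return any(p.match(author_email) for p in WHITELIST_PATTERNS)
```

Restricting automatic runs to known users limits who can cause arbitrary code to execute on the test server, at the cost of other contributors waiting for their tests to be triggered.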

2012-12-monthly
<section begin="2012-12-monthly"/>The last Jenkins jobs (mostly Analytics ones) that were still using the Gerrit Trigger plugin have been migrated to being triggered by Zuul. Zuul now supports triggering tests for whitelisted users. This has been deployed to let trusted users have unit tests run whenever they send a patchset to mediawiki/core. Volunteer Merlijn van Deen built a script to replicate our Jenkins installation and worked on having extension tests run on different MediaWiki branches.<section end="2012-12-monthly"/>

2013-01-22
<section begin="2013-01-22"/>Integrated PHP_CodeSniffer for mediawiki/core.git; it reports style errors. That is a long-awaited feature which was tracked with. Basic documentation is available at https://www.mediawiki.org/wiki/Continuous_integration/PHP_CodeSniffer<section end="2013-01-22"/>

2013-01-31
<section begin="2013-01-31"/>Many extensions have been added to the Jenkins system, some of them with unit tests.<section end="2013-01-31"/>

2013-01-monthly
<section begin="2013-01-monthly"/>Antoine Musso worked with several MediaWiki extension authors to ensure that the unit tests for those extensions are run by Jenkins and that they work. He hopes to have all extensions that run on the Wikimedia production cluster fully operational by the end of February. Antoine also integrated PHP CodeSniffer into our automated test runs.<section end="2013-01-monthly"/>

2013-02-monthly
<section begin="2013-02-monthly"/>Antoine Musso worked with several MediaWiki extension authors to ensure that the unit tests for those extensions are run by Jenkins and that they work. He hopes to have all extensions that run on the Wikimedia production cluster fully operational by the end of February. Antoine also integrated PHP CodeSniffer into our automated test runs.<section end="2013-02-monthly"/>

2013-03-monthly
<section begin="2013-03-monthly"/>Timo Tijhof implemented a Jenkins job to run the QUnit JavaScript tests for MediaWiki core. That will help us catch most JavaScript issues.

The continuous integration site has been moved from integration.mediawiki.org to integration.wikimedia.org and is now always on HTTPS. The index page has been rewritten based on Twitter Bootstrap (see integration.wikimedia.org).

Antoine Musso has given our Zuul status page an overhaul. It features live reloading through Ajax and contains direct links to the Gerrit changesets and Jenkins jobs. This is a big improvement over the plain text version.

Antoine Musso and Timo Tijhof set up the new doc.wikimedia.org portal. The MediaWiki core (Doxygen-generated) PHP documentation has been moved here (svn.wikimedia.org/doc is now a redirect). We're currently working on packaging JSDuck and writing Jenkins jobs to generate JavaScript documentation with it.

We've packaged various Python modules for the Debian project, which will in turn let us simplify deployment. Meanwhile, we're experimenting with having our Debian/Ubuntu packages built by Jenkins directly.

This month we've continued to extend Jenkins coverage for Gerrit repositories. We're happy to announce that almost all repositories for MediaWiki extensions in Gerrit now have Jenkins integration.<section end="2013-03-monthly"/>

2013-04-24
<section begin="2013-04-24"/>Upgraded Zuul to the latest master version. This was made possible because the Python module dependencies have been packaged for Debian.<section end="2013-04-24"/>

2013-04-monthly
<section begin="2013-04-monthly"/>In April, the Jenkins/Zuul platform encountered several issues, such as the gating job running tests against the current version of the branch instead of the to-be-merged change. Antoine Musso solved several performance issues by using tmpfs and a new SSD drive, and by upgrading Zuul to the latest upstream version.

Timo Tijhof overhauled the automatically generated MediaWiki documentation for JavaScript and PHP with Doxygen 1.7. He also fixed the duplicate test runs that happened in specific cases. Finally, he set up QUnit tests for the VisualEditor extension; if this proves successful, QUnit runs will be generalized to all extensions.

Mark Holmquist improved the Jenkins jobs that track Parsoid regression tests.

Finally, we've added linters for several languages: Python, Ruby and even YAML. If your Git repositories are missing a lint check, please contact us or file a bug against Wikimedia > Continuous Integration.<section end="2013-04-monthly"/>

2013-05-06
<section begin="2013-05-06"/>Upgraded Zuul; it now supports project templates in its configuration files and reports the duration of the job run back to Gerrit.<section end="2013-05-06"/>

2013-05-monthly
<section begin="2013-05-monthly"/>At the beginning of May, Jenkins/Zuul was overloaded for a few days; this was resolved by upgrading Zuul and tweaking some time-expensive parts of the code. Zuul now lets us define which jobs it triggers by using a predefined template, which makes it easier to add new projects. Zuul is now faster to report changes back to Gerrit; slow reporting had been a complaint during rush hours.

The Wikibase client and repo components are now tested in Jenkins. All Puppet repositories now verify the Puppet manifests and ERB templates for syntax validity. The QUnit tests being run for MediaWiki core and VisualEditor seem to be in good shape.

PHP CodeSniffer has been upgraded as well as the standard for MediaWiki code. We have yet to enforce it though, since the current code base does not pass the standard.<section end="2013-05-monthly"/>

2013-06-11
<section begin="2013-06-11"/>The Jenkins queue is now being monitored in Ganglia. The feature request was tracked with .<section end="2013-06-11"/>

2013-06-monthly
<section begin="2013-06-monthly"/>Timo Tijhof and Antoine Musso triaged continuous integration bugs. Antoine has set up a Jenkins slave and migrated most jobs to it. Adding new servers will now be very easy.<section end="2013-06-monthly"/>

2013-08-28
<section begin="2013-08-28"/>Simplified the Jenkins Job Builder configuration files by dropping the 'gerrit-name' variable in favor of using $ZUUL_PROJECT directly.<section end="2013-08-28"/>

2013-09-19
<section begin="2013-09-19"/>For the last few weeks, Antoine has been working toward having jobs running on a second machine. Today, all jobs are ready to freely roam on the Jenkins slave we have and Antoine started migrating the pep8 and pyflakes jobs.<section end="2013-09-19"/>

2013-10-monthly
<section begin="2013-10-monthly"/>October has been dedicated to consolidating the Jenkins configuration to make it easier to edit. Most actions are now handled by shell scripts under ; editing the scripts doesn't require updating Jenkins jobs. The second slave server has been added to production and is successfully running PHPUnit tests. The packaging of dependencies required to upgrade Zuul has been completed, and Antoine Musso now has a version working in Labs. Finally, we investigated the possibility of running the browser tests whenever a change is submitted in Gerrit; that work is still in progress. Thanks to Carl Fürstenberg's work during the summer, we are now able to build some Debian packages directly in Jenkins using a dedicated instance and the Jenkins Debian glue scripts. The jobs are listed in Jenkins under the Ops-DebGlue view.<section end="2013-10-monthly"/>

2013-11-04
<section begin="2013-11-04"/>Antoine installed a Zuul version in Labs that uses Gearman to trigger jobs, and did all the Puppet work for it. It will need more thorough testing before we schedule an upgrade of the production setup.<section end="2013-11-04"/>