Continuous integration/Jenkins

Jenkins is a Java tool used to handle recurring tasks such as running tests or building packages. Our primary install is at https://integration.wikimedia.org/ci/.

The tool is permanently connected to our review tool (Gerrit) and can be made to react to changes submitted to Gerrit. A typical example is running MediaWiki unit tests whenever a change is submitted to the repository.

The main repository for the layout of Jenkins itself is installed in /var/lib/jenkins on gallium.

The configuration of individual jobs is abstracted via Jenkins Job Builder. The jobs are triggered by Zuul (which abstracts Gerrit's events).

You can add new Jenkins slaves.

Services
These services run alongside Jenkins on the same server. They exist because certain jobs publish their results outside the Jenkins realm for other uses:

Nightly snapshots
Snapshots are generated at 3am UTC by the Jenkins job "Nightly - MediaWiki core". The ant target, nightly-mediawiki-core, is straightforward: it gets the latest version of master and uses git-archive to generate a zip file. The zip file is then copied into the /org/mediawiki/integration/nightly hierarchy, which is publicly available via https://integration.wikimedia.org/nightly/.

The latest snapshot is always available at https://integration.wikimedia.org/nightly/mediawiki/core/mediawiki-latest.zip (it is a symbolic link to the latest snapshot).
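The snapshot step can be approximated in a few lines of shell. This is only a sketch, not the actual ant target: the function names, the dated file name pattern and the directory arguments are assumptions made for illustration. Only git archive, the master branch and the mediawiki-latest.zip link come from the description above.

```shell
#!/bin/sh
# Hypothetical helper: build a dated file name for a snapshot.
# The naming scheme really used by the ant target is not documented here.
snapshot_filename() {
    echo "mediawiki-$1.zip"
}

# Point mediawiki-latest.zip at the given snapshot file (relative symlink).
# $1: directory holding the snapshots, $2: snapshot file name
update_latest_link() {
    dir=$1
    file=$2
    ln -sfn "$file" "$dir/mediawiki-latest.zip"
}

# Archive the master branch of a checkout into a zip and refresh the link.
# $1: path to a git checkout, $2: absolute output directory, $3: date stamp
make_nightly_snapshot() {
    repo=$1
    outdir=$2
    name=$(snapshot_filename "$3")
    git -C "$repo" archive --format=zip --output="$outdir/$name" master &&
        update_latest_link "$outdir" "$name"
}
```

It could be called as make_nightly_snapshot ~/src/mediawiki-core /tmp/nightly "$(date -u +%Y-%m-%d)". Using a relative symlink keeps the link valid if the whole directory is moved or served from another path.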

Android applications
The WMF produces three Android applications (WikipediaMobile, WLMMobile and WiktionaryMobile), hosted on GitHub. A webhook notifies Jenkins whenever a change is pushed to GitHub; the ping back URL is https://integration.wikimedia.org/ci/github-webhook/. Jenkins then fetches the branch the commit was made in and builds the application.

Jenkins expects an ant script named build.xml in the root directory and executes ant debug. Once a build completes, Jenkins copies the generated apk under /srv/org/mediawiki/integration/*Mobile to make it publicly available. For example, the WLMMobile application is published at https://integration.wikimedia.org/WLMMobile/nightly/.
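Roughly, the build-and-publish step amounts to the following. This is a hedged sketch rather than the job's real configuration: the function name and its two arguments are hypothetical, and it assumes that ant debug leaves the package in bin/ with a -debug.apk suffix, which is the usual layout for ant-based Android projects but is not stated on this page.

```shell
#!/bin/sh
# Build an Android project with ant and copy the resulting apk(s).
# $1: workspace containing build.xml, $2: publicly served destination directory
build_and_publish_apk() {
    workspace=$1
    dest=$2
    if [ ! -f "$workspace/build.xml" ]; then
        echo "no build.xml in $workspace" >&2
        return 1
    fi
    # Run the build in a subshell so the caller's working directory is kept.
    (cd "$workspace" && ant debug) || return 1
    mkdir -p "$dest" || return 1
    # Assumption: the debug build writes its package to bin/*-debug.apk.
    cp "$workspace"/bin/*-debug.apk "$dest"/
}
```

The check for build.xml mirrors what Jenkins expects, so a misconfigured repository fails early with a readable message instead of an ant error.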

The mobile team is informed of build completion via an IRC bot idling in their IRC channel, irc://irc.freenode.net/#wikimedia-mobile.

Automatic installation
curl https://raw.github.com/valhallasw/wikimedia-mkjenkins/master/mkjenkins.sh | bash

To make the install go faster, it helps to have a mediawiki-core checkout in ~/src/mediawiki-core. If this repository exists, the script makes a local clone of it. If it doesn't, the script downloads from Gerrit instead (slow!).
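The choice between the local checkout and Gerrit can be pictured like this. The function is a hypothetical illustration of that decision, not an excerpt from mkjenkins.sh; the Gerrit URL is the one used in the manual installation steps.

```shell
#!/bin/sh
# Print where mediawiki-core should be cloned from: the local checkout when
# it exists, otherwise Gerrit.
# $1: local checkout path (defaults to ~/src/mediawiki-core)
mw_core_clone_source() {
    checkout=${1:-$HOME/src/mediawiki-core}
    if [ -d "$checkout" ]; then
        echo "$checkout"
    else
        echo "https://gerrit.wikimedia.org/r/p/mediawiki/core.git"
    fi
}
```

For example, git clone --mirror "$(mw_core_clone_source)" some-target-directory would pick the fast local source whenever it is available.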

Manual installation

 * git clone https://gerrit.wikimedia.org/r/p/integration/jenkins.git ~/.jenkins
    * ~/.jenkins is the default Jenkins configuration directory
 * WM-specific configuration patch I: ln -s $HOME/.jenkins /var/lib/jenkins
    * because some jobs assume Jenkins is installed in /var/lib/jenkins
 * Download Jenkins (jenkins.war) and place it in ~/.jenkins
 * Install the following plugins (download into ~/.jenkins/plugins):
    * git
    * git-client
    * ansicolor
    * notification
    * scm-api
    * timestamper
    * build-timeout
    * xunit
 * Download Jenkins Job Builder and its configuration; see Continuous_integration/Jenkins job builder for more information. You don't need a password when you install Jenkins locally.
 * WM-specific configuration patch II: patch the JJB configuration that depends on Zuul (see the mkjenkins script for a diff)
 * If you already have a checkout of mediawiki-core: git clone --mirror -l -- your_existing_checkout /var/lib/jenkins/git/mw-core-bare. Otherwise, git clone --mirror -- https://gerrit.wikimedia.org/r/p/mediawiki/core.git /var/lib/jenkins/git/mw-core-bare
 * Start Jenkins: cd ~/.jenkins && java -jar jenkins.war &
 * When Jenkins is running, install the JJB jobs: rm -f $HOME/.cache/jenkins_jobs/jenkins_jobs_cache.yml && jenkins-jobs --conf jenkins_jobs.ini update config/
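The plugin download step is the most repetitive part of the manual install, so here is a sketch of how it could be scripted. The download location is an assumption: the script presumes the public Jenkins update site serves the newest release of each plugin as latest/NAME.hpi, so check that before relying on it. The function and variable names are made up for this example.

```shell
#!/bin/sh
# Plugins named in the manual installation list.
JENKINS_PLUGINS="git git-client ansicolor notification scm-api timestamper build-timeout xunit"

# Assumption: the public Jenkins update site serves the newest release of a
# plugin as latest/<name>.hpi.
plugin_url() {
    echo "https://updates.jenkins.io/latest/$1.hpi"
}

# Download every plugin into the given directory (default ~/.jenkins/plugins).
download_plugins() {
    plugin_dir=${1:-$HOME/.jenkins/plugins}
    mkdir -p "$plugin_dir" || return 1
    for plugin in $JENKINS_PLUGINS; do
        curl -fsSL -o "$plugin_dir/$plugin.hpi" "$(plugin_url "$plugin")" || return 1
    done
}
```

Run download_plugins before starting Jenkins. Note that this fetches only the listed plugins; it does not resolve any further dependencies they may declare.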

Hung beta code/db update

 * Take deployment-bastion offline in Jenkins https://integration.wikimedia.org/ci/computer/deployment-bastion.eqiad/markOffline
 * Kill any Jenkins jobs running on deployment-bastion via Jenkins UI
 * Kill all pending jobs in the Jenkins queue that are "waiting on executors"
 * Disconnect deployment-bastion https://integration.wikimedia.org/ci/computer/deployment-bastion.eqiad/disconnect
 * Bring deployment-bastion back online (button labeled "Bring this node back online")
 * Launch slave agent (there's a button that says this)
 * Check agent log to see that it connected https://integration.wikimedia.org/ci/computer/deployment-bastion.eqiad/log

Sometimes you have to do this whole dance twice before Jenkins realizes that there are a bunch of executors it can use.

This deadlock seems to happen most often during or after a database update that takes a while to complete.
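The node URLs used in the steps above all follow one pattern, which makes it easy to check a node's state from a shell. The helpers below are a sketch with made-up names. They assume that the Jenkins remote API serves a JSON description of the node at computer/NODE/api/json and that it contains an "offline" field; the actions themselves (markOffline, disconnect) still need a logged-in session and are best done in the web UI.

```shell
#!/bin/sh
CI_URL="https://integration.wikimedia.org/ci"

# Build the URL of a node page or action, e.g. markOffline, disconnect, log.
# $1: node name, $2: page or action
node_url() {
    echo "$CI_URL/computer/$1/$2"
}

# Read a node's api/json document on stdin; succeed if it reports offline.
# The leading quote keeps "temporarilyOffline" from matching.
reports_offline() {
    grep -q '"offline" *: *true'
}
```

For example: curl -s "$(node_url deployment-bastion.eqiad api/json)" | reports_offline && echo "still offline".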

Restart all of Jenkins
The Jenkins init script is buggy and does not kill the process :-/

via Web interface
Apply the self-serve Jenkins repair!

a) Head to https://integration.wikimedia.org/ci/safeRestart
b) Log in with your labs account, which must be part of the 'wmf' LDAP group
c) Press [Yes]
d) In #wikimedia-operations: !log restarting stuck Jenkins

After a few minutes it should be back. Zuul will repool jobs automatically.

Hardcore way
On gallium.wikimedia.org find the pid of Jenkins (it is a java process) and kill -9 it.

Make sure it is gone.

/etc/init.d/jenkins start

After a few minutes it should be back. Zuul will repool jobs automatically.
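The "make sure it is gone" part is easy to get wrong by hand, so here is a sketch of the whole sequence. It is meant to be run as root. The function names are hypothetical, and matching the process on "jenkins.war" is an assumption about how Jenkins is launched on the host; check with ps first.

```shell
#!/bin/sh
# Wait until the process with the given pid has exited.
# $1: pid, $2: maximum number of checks, one second apart (default 30)
# kill -0 only probes the process. Run this as root (or as the process
# owner), otherwise a permission error would be mistaken for "gone".
wait_until_gone() {
    pid=$1
    tries=${2:-30}
    while kill -0 "$pid" 2>/dev/null; do
        tries=$((tries - 1))
        if [ "$tries" -le 0 ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Kill Jenkins the hard way and start it again through the init script.
# Assumption: the Jenkins java process can be found by matching "jenkins.war".
hard_restart_jenkins() {
    jenkins_pid=$(pgrep -f jenkins.war | head -n 1)
    if [ -n "$jenkins_pid" ]; then
        kill -9 "$jenkins_pid"
        wait_until_gone "$jenkins_pid" || return 1
    fi
    /etc/init.d/jenkins start
}
```

The start command only runs once the old process has really disappeared, which avoids ending up with two Jenkins instances fighting over the same home directory.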

'demon', 'krinkle', 'reedy', 'mholmquist' have the proper sudo rights. And ops of course :-]

Debugging
Start Jenkins with the following Java option to make the Git plugin log verbosely:

-Dhudson.plugins.git.GitSCM.verbose="true"
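On a local install started by hand, the option goes on the java command line. The wrapper below is a small sketch with a made-up function name; the directory default and jenkins.war follow the manual installation steps on this page.

```shell
#!/bin/sh
# Start a local Jenkins with verbose Git plugin logging.
# JVM system properties (-D...) must come before -jar; anything placed after
# the war file is handed to Jenkins itself instead of the JVM.
# $1: Jenkins directory containing jenkins.war (default ~/.jenkins)
start_jenkins_verbose() {
    (
        cd "${1:-$HOME/.jenkins}" &&
            exec java -Dhudson.plugins.git.GitSCM.verbose=true -jar jenkins.war
    )
}
```

The subshell keeps the caller's working directory unchanged and makes the function fail cleanly when the directory does not exist.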

Text thread dump: https://integration.wikimedia.org/ci/monitoring?part=threadsDump