Continuous integration/Jenkins

Jenkins is a Java tool used to handle recurring tasks such as running tests or building packages. Our primary install is at https://integration.wikimedia.org/ci/.

The tool is permanently connected to our review tool (Gerrit) and can be made to react to changes submitted to Gerrit. A typical example is running MediaWiki unit tests whenever a change is submitted to the mediawiki/core.git repository.

The main repository for the layout of Jenkins itself is integration/jenkins.git (installed in /var/lib/jenkins on contint1001).

The configuration of individual jobs is abstracted via Jenkins job builder. The jobs are triggered with Zuul (which abstracts Gerrit's events).

You can add new Jenkins slaves.

Services

These services run alongside Jenkins on the same server. They exist because certain jobs publish results outside the Jenkins realm for other uses:

Nightly snapshots

Nightly snapshots are generated at 3am UTC by the Jenkins job "Nightly - MediaWiki core". The ant target is nightly-mediawiki-core and is straightforward: it gets the latest version of master and uses git-archive to generate a zip file. The file is then copied under the /org/mediawiki/integration/nightly hierarchy, which is publicly available via https://integration.wikimedia.org/nightly/.

The latest snapshot is always available at https://integration.wikimedia.org/nightly/mediawiki/core/mediawiki-latest.zip (it is a symbolic link to the latest snapshot).
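
The snapshot step described above could be sketched roughly as follows. This is not the actual ant target: the function name make_snapshot, its arguments, and the date-based file name are assumptions made for illustration. Only the use of git-archive, the master branch, and the mediawiki-latest.zip symbolic link come from the description above.

```shell
#!/usr/bin/env bash
# Sketch only: make_snapshot and the dated file name are hypothetical.

make_snapshot() {
    local repo_dir="$1"   # path to a local git clone
    local out_dir="$2"    # directory that is published over HTTP
    local zip_name
    zip_name="mediawiki-$(date -u +%Y-%m-%d).zip"

    mkdir -p "$out_dir" || return 1
    # Resolve to an absolute path, because git -C changes directory first.
    out_dir="$(cd "$out_dir" && pwd)"

    # git-archive exports the tree of master; --format=zip writes a zip file.
    git -C "$repo_dir" archive --format=zip \
        --output="$out_dir/$zip_name" master || return 1

    # Repoint the "latest" symbolic link at the new snapshot.
    ln -sfn "$zip_name" "$out_dir/mediawiki-latest.zip"
}
```

For example, make_snapshot ~/src/mediawiki-core /tmp/nightly would leave a dated zip file and a mediawiki-latest.zip link in /tmp/nightly.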

Wikipedia Android alpha app builds

The mobile apps team uses Jenkins to build its alpha release of the Wikipedia app, available at http://android-builds.wmflabs.org/. Each time a change to /apps/android/wikipedia is merged in Gerrit, an apps-android-wikipedia-publish job is triggered in Jenkins. Upon successful completion, the resulting APK and a JSON file containing the build timestamp are preserved as build outputs.

Every 15 minutes, a script invoked by a cron job running on the (somewhat misleadingly named) 'android-builder' instance in Wikimedia Labs checks for a new APK in the job outputs, and publishes it at the above web site if found.
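
As a rough illustration of the polling step, the sketch below copies an APK only when it is newer than the one already published. The real script is not shown here and may instead compare the build timestamp from the JSON file. The function name publish_if_newer, both path arguments, and the script path in the crontab comment are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: publish_if_newer and all paths are hypothetical.
# A crontab entry that runs a script every 15 minutes looks like:
#   */15 * * * * /path/to/publish-alpha.sh

publish_if_newer() {
    local build_apk="$1"      # APK fetched from the Jenkins job outputs
    local published_apk="$2"  # APK currently served by the web site

    # In bash, -nt is true when the first file is newer than the second,
    # or when the first exists and the second does not.
    if [ "$build_apk" -nt "$published_apk" ]; then
        cp -p "$build_apk" "$published_apk"
        echo "published"
    else
        echo "up to date"
    fi
}
```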

For more info, see Wikimedia_Apps/Team/Android/App_hacking/Alpha_build_server.

Installation

Automatic installation

curl https://raw.github.com/valhallasw/wikimedia-mkjenkins/master/mkjenkins.sh | bash

To make the install go faster, it helps to have a mediawiki-core checkout in ~/src/mediawiki-core. If this repository exists, the script makes a local clone from it; if it doesn't, the script downloads from Gerrit instead (slow!).
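
The local-versus-remote choice could look roughly like this. It is a sketch of the idea, not code taken from mkjenkins.sh: the function name clone_core and the Gerrit clone URL are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: clone_core is hypothetical and not part of mkjenkins.sh.

clone_core() {
    local dest="$1"
    local local_src="${2:-$HOME/src/mediawiki-core}"
    # Assumed Gerrit clone URL for mediawiki/core.
    local remote_src="https://gerrit.wikimedia.org/r/mediawiki/core.git"

    if [ -d "$local_src/.git" ]; then
        # Fast path: clone from the checkout that is already on disk.
        git clone -q "$local_src" "$dest"
    else
        # Slow path: fetch everything from Gerrit.
        git clone -q "$remote_src" "$dest"
    fi
}
```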

Manual installation

Issue?

Hung beta code/db update

This deadlock seems to happen most often during or right after a database update that takes a while to complete.

Sometimes you have to do this whole dance several times before Jenkins realizes that there are a bunch of executors that it can use.

Alternate method:

This second method may interrupt communication between running Jenkins jobs and Zuul, but it seems to work even when the offline/online method fails to clear the deadlock.

Restart

The Debian init.d script for Jenkins is broken (unable to find the PID, T53817).

Zuul should not be restarted. It preserves its queue and continues once Jenkins has restarted.

Via web interface

Apply the self-serve Jenkins repair!

With a safeRestart, any currently running jobs block the restart until they finish or are canceled, so long-running jobs should be killed. Check for jobs on the main Jenkins dashboard and cancel any long-running ones there. Bonus points: using the Zuul dashboard, make a note of the patches whose jobs you canceled, and comment "recheck" on any patches in the test queue that you aborted.

  1. Head to https://integration.wikimedia.org/ci/safeRestart
  2. Log in with your Labs account, which must be part of the 'wmf' LDAP group
  3. Press "Yes"
  4. In #wikimedia-operations: "!log restarting stuck Jenkins".

Shell

On contint1001.wikimedia.org, find the PID of Jenkins (it is a java process) and kill it with sudo -u jenkins kill -9 <pid>.

Ensure the process is gone (grep through ps aux).

$ sudo /etc/init.d/jenkins start
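
The shell steps above can be strung together roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and it assumes Jenkins is the only java process owned by the jenkins user. Nothing runs until restart_jenkins is called.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; run on the Jenkins host.

process_alive() {
    # kill -0 sends no signal, it only checks that the PID can be signaled.
    # It fails with EPERM for other users' processes, so also check /proc.
    kill -0 "$1" 2>/dev/null || [ -d "/proc/$1" ]
}

# Wait until PID $1 is gone, polling once per second for up to $2 seconds.
wait_for_exit() {
    local pid="$1" timeout="${2:-30}" waited=0
    while process_alive "$pid"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

restart_jenkins() {
    local pid
    # Assumption: Jenkins is the only java process owned by user jenkins.
    pid="$(pgrep -u jenkins -x java | head -n 1)"
    if [ -n "$pid" ]; then
        sudo -u jenkins kill -9 "$pid"
        if ! wait_for_exit "$pid" 30; then
            echo "Jenkins (PID $pid) is still running" >&2
            return 1
        fi
    fi
    sudo /etc/init.d/jenkins start
}
```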

'demon', 'krinkle', 'reedy', 'mholmquist' have the proper sudo rights. And ops of course :-]

OOM Issues

Troubleshooting

Whenever Jenkins appears to be stuck or facing high CPU usage, you will want to look at the Java threads: https://integration.wikimedia.org/ci/threadDump

To get the same thread dump from the CLI:

   jstack -l -F <pid of jenkins>

Last time this happened (2017-05-20), a restart of Jenkins "fixed" the problem, but we were unable to troubleshoot without a stack trace from jstack.
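
To avoid losing the evidence again, the dump can be saved to a file before restarting. A minimal sketch, assuming jstack is on the PATH: the function name save_thread_dump, the output file naming, and the JSTACK override variable are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: save_thread_dump and the JSTACK override are hypothetical.

save_thread_dump() {
    local pid="$1"
    local out_dir="${2:-/tmp}"
    local jstack_bin="${JSTACK:-jstack}"   # override to use another binary
    local out_file
    out_file="$out_dir/jenkins-threads-$(date -u +%Y%m%dT%H%M%SZ).txt"

    mkdir -p "$out_dir" || return 1
    # Same invocation as above: -l adds lock details, -F forces the dump.
    "$jstack_bin" -l -F "$pid" > "$out_file" || return 1
    # Print the path so the caller can attach the file to a task.
    echo "$out_file"
}
```

For example, save_thread_dump <pid of jenkins> /var/tmp prints the path of the saved dump.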

Debugging

Start Jenkins with Java option:

-Dhudson.plugins.git.GitSCM.verbose="true"
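
One way to pass the option, assuming the Debian package reads JAVA_ARGS from /etc/default/jenkins (an assumption, so check the init script on the host before relying on it), is to append it to the existing arguments:

```shell
# Sketch only. Assumption: the init script passes JAVA_ARGS to the JVM.
# Append the verbose flag while keeping whatever arguments are already set.
JAVA_ARGS="${JAVA_ARGS:-} -Dhudson.plugins.git.GitSCM.verbose=true"
```

Jenkins must be restarted for the new option to take effect.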


Text thread dump: https://integration.wikimedia.org/ci/monitoring?part=threadsDump

See also