GitLab/Workflows/Deploying services to production

GitLab implementation

This page describes the process of deploying your application on the Wikimedia production cluster using GitLab CI. You can learn more about every part of the process by reading the materials linked in every section and at the end of this page.

Overview

1. Create a project in GitLab.
2. Push your code to the newly created project.
3. Create your production container using Blubber.
4. Configure your GitLab CI pipeline with Kokkuri to:
   - test your code
   - build your container
   - push your container to the Wikimedia registry
5. Add your project to trusted runners in the gitlab-trusted-runner repository.
6. Run your GitLab CI pipeline to create an image in the Wikimedia registry.
7. Create a Helm configuration for your project in the deployment-charts repository.
8. Deploy your application to production.
9. Configure and publish your project’s documentation using docpub.

Explore code examples in the examples repository.

The rest of this page provides more details for every stage of this process.

GitLab

To use GitLab CI, you need to host your code on GitLab. You can create a new project and push the code from your local repository, or migrate from a different code hosting service. For information on how to do this, see Importing code to GitLab or the upstream documentation. Projects created on the GitLab instance hosted by the Wikimedia Foundation must follow the rules and guidelines outlined in Hosting a project on GitLab.

If you are migrating from Gerrit and already have a CI configuration, see GitLab/pipeline conversion for details on how to switch to the new deployment pipeline.

Using a custom container image

With your code hosted in the right place, you can now think about configuring the runtime environment for your application.

Production deployments using GitLab CI run applications inside containers on a Kubernetes cluster. This makes configuring a container for your application an important part of setting up your deployment pipeline.

A common way of configuring a container is to use a Dockerfile. A Dockerfile will work for testing your application and publishing your container to an external registry. But if you plan to publish the container to the Wikimedia registry or to deploy it to Wikimedia production, you must use Blubber.

Blubber provides a layer of abstraction around Dockerfiles, creating a configuration that has reasonable and well-tested defaults. If you have a Dockerfile for the application you intend to deploy to production, you must rewrite it as a Blubber file. This page uses Blubber in all descriptions and examples.

Configuring a container using Blubber

This section describes common elements of Blubber configuration necessary to build, test, and run your application. For more information about the Blubber specification, see Blubber reference documentation. For up-to-date examples of Blubber configuration, see the examples repository.

To use Blubber, create the .pipeline/blubber.yaml configuration file in your project’s root directory.

Blubber configuration overview

A standard .pipeline/blubber.yaml file has two main elements:

syntax declaration
variant definitions with settings for separate container images

Syntax declaration

The first lines of the file contain basic information about the Blubber specification. You can copy these lines directly into your configuration.

The first line makes it possible to use blubber.yaml directly when running docker build. The second line indicates the version of the Blubber specification; it changes only when a new version of Blubber introduces breaking changes.

# syntax = docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v0.21.1
version: v4

Container variants

The next lines specify container image variants, which are essentially separate container images built from the same configuration file.

For example, a basic NodeJS application might use a configuration like this:

variants:
  build-js:
    base: docker-registry.wikimedia.org/nodejs18-devel:0.0.1-20231102
    copies: [local]
    node: { requirements: [package.json, package-lock.json],
            use-npm-ci: true }
  test-js:
    includes: [build-js]
    entrypoint: [npm, test]
  run-js:
    includes: [build-js]
    entrypoint: [npm, run, server]
    node: { env: production }

The build-js variant uses the following keywords:

base specifies the variant’s base image: nodejs18-devel, available in the Wikimedia Docker registry.
copies indicates the need to copy local context (that is, the root directory of your repository) to the container.
node provides the settings necessary to establish a NodeJS environment in the container by running the appropriate npm commands. In this case, the container uses npm ci to install dependencies and requires package.json and package-lock.json to be present.

The test-js variant uses the following keywords:

includes indicates that it inherits the configuration of the build-js variant. This is a helpful mechanism for sharing configuration between different variants.
entrypoint specifies the command to run after the container starts: npm test.

The run-js variant resembles test-js, except for the following settings:

entrypoint specifies a different command: npm run server
node specifies that the container is a production environment, which will set the NODE_ENV variable and run npm dedupe.

For more information on the configuration options used in this section, see Blubber reference documentation.
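
The includes mechanism also makes it cheap to add further variants. As a minimal sketch, assuming your package.json defines a lint script (an assumption that is not part of the example above), a hypothetical lint-js variant could be added under the existing variants key:

```yaml
variants:
  # ... build-js, test-js, and run-js as defined in the example above ...
  lint-js:                          # hypothetical variant name
    includes: [build-js]            # reuse the base image, copied files, and npm setup
    entrypoint: [npm, run, lint]    # assumes a "lint" script exists in package.json
```

Because lint-js inherits everything from build-js, only the entrypoint differs, which keeps the variants from drifting apart.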

Building and running a custom container locally

With the Blubber configuration in place, build your container locally:

docker build -f .pipeline/blubber.yaml --tag test-js --target test-js .

-f .pipeline/blubber.yaml specifies the Blubber configuration file to use when building the image
--tag test-js tags the image, allowing you to later run it using the tag
--target test-js specifies the Blubber variant to build

After building the container, run it using:

docker run test-js

You can then inspect the output of the command to ensure your application works as expected.

Building and running a custom container using GitLab CI

To build and run your container using GitLab CI, you must configure your project’s GitLab CI pipeline by editing the .gitlab-ci.yml file.

To learn about the fundamental concepts of using GitLab CI, see Get started with GitLab CI/CD (upstream). For a full reference of keywords you can use in .gitlab-ci.yml, see .gitlab-ci.yml keyword reference.

GitLab CI configuration overview

A standard .gitlab-ci.yml file has the following elements:

Kokkuri includes
pipeline stages that group different jobs together
workflow rules that define when GitLab CI should run
separate jobs for different CI tasks, such as:
- unit tests
- container image deployment to registry
- end-to-end tests

The following sections give a more detailed overview of these elements in the form of a configuration example. For even more examples, see the examples repository.

Including Kokkuri

To make it easier to construct idiomatic, complex pipelines, the Release Engineering team maintains Kokkuri. Kokkuri provides GitLab CI templates and includes that simplify your GitLab CI configuration.

All code available in the Kokkuri repository is well documented. You can explore usage examples in the README, or analyze the YAML includes available in the includes directory.

To add Kokkuri to your project, use the include key at the beginning of your .gitlab-ci.yml:

include:
  - project: 'repos/releng/kokkuri'
    file: 'includes/images.yaml'

Defining stages

In GitLab CI, stages group related jobs, and run in the order they’re defined.

stages:
  - unit-test
  - end-to-end-test
  - deploy

Defining workflow settings

Use the workflow key to specify when GitLab CI should run.

workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == 'merge_request_event'
    - if: $CI_COMMIT_REF_PROTECTED

if: $CI_PIPELINE_SOURCE == 'merge_request_event' indicates that the pipeline should run for changes to a merge request.
if: $CI_COMMIT_REF_PROTECTED indicates that the pipeline should run after changes to a protected branch or tag.

For information on predefined variables, like CI_PIPELINE_SOURCE or CI_COMMIT_REF_PROTECTED, see Predefined variables reference (upstream). To learn how to specify when jobs should run, see Choose when jobs run (upstream), specifically Common if clauses for rules (upstream).

Defining jobs

This section has examples of three jobs you might want to use:

test for unit tests of your application
build-prod for publishing the container in a registry
end-to-end-test-prod for end-to-end tests of your application

test:
  extends: .kokkuri:build-and-run-image
  stage: unit-test
  variables:
    BUILD_VARIANT: test-js

The test job has the following features:

It extends the standard .kokkuri:build-and-run-image job. You can use it to build and run your container variant.
It runs as part of the unit-test stage.
It builds the container variant test-js defined in Blubber configuration in .pipeline/blubber.yaml.

build-prod:
  extends: .kokkuri:build-and-publish-image
  stage: end-to-end-test
  variables:
    BUILD_VARIANT: run-js

The build-prod job has the following features:

It extends the standard .kokkuri:build-and-publish-image job. You can use it to deploy your application’s container to the Wikimedia registry (for production, or to use it in another job).
It runs as part of the end-to-end-test stage.
It builds the container variant run-js defined in Blubber configuration.

end-to-end-test-prod:
  stage: end-to-end-test
  image: docker-registry.wikimedia.org/wmfdebug:0.0.6-20231106-20231106
  needs: [build-prod]
  services:
    - name: "${BUILD_PROD_IMAGE_REF}"
      alias: run-prod
  script:
    - curl -sS http://run-prod:4040/

The end-to-end-test-prod job has the following features:

It’s defined in this .gitlab-ci.yml file directly and doesn’t use Kokkuri.
It runs as part of the end-to-end-test stage.
It uses a minimal Docker image with commonly used debug tools (wmfdebug).
It specifies a needs keyword, indicating that it depends on the build-prod job to run.
It runs the container published in the build-prod job as a service, referring to the image in the name keyword. That image reference is automatically available under a variable with the following name: <job-name>_IMAGE_REF, in this case: BUILD_PROD_IMAGE_REF.
The running service is then available under a network name specified in the alias.
Inside the wmfdebug container, this job runs the curl command, connecting to the service running the build-prod container built and published earlier.
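
As written, curl -sS exits successfully even when the service answers with an HTTP error status, so the job would still pass. The variation below is a sketch of a stricter check, not the canonical configuration. It assumes that the curl version shipped in the wmfdebug image supports the --fail, --retry, and --retry-connrefused options, and that five retries are enough for your service to start; both are assumptions you should verify for your project.

```yaml
end-to-end-test-prod:
  stage: end-to-end-test
  image: docker-registry.wikimedia.org/wmfdebug:0.0.6-20231106-20231106
  needs: [build-prod]
  services:
    - name: "${BUILD_PROD_IMAGE_REF}"
      alias: run-prod
  script:
    # --fail: exit with a non-zero status on HTTP 4xx/5xx responses, failing the job.
    # --retry 5 --retry-connrefused: retry while the service is still starting up
    # and refusing connections (the retry count is an arbitrary choice).
    - curl -sS --fail --retry 5 --retry-connrefused http://run-prod:4040/
```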

Publishing an image for use in production

Publishing an image means pushing it to the Wikimedia Docker registry. This is possible in GitLab CI jobs running for protected branches or tags on a trusted runner. To learn more about protecting branches and tags, see Protected branches (upstream) and Protected tags (upstream). To learn more about runners, see Runners.

To use trusted runners, you must add your project to the list of trusted projects in projects.json. You can do this by creating a merge request in the gitlab-trusted-runner repository, or by reaching out to the Release Engineering team.

When your project is on the list, you can add the trusted tag to the job that publishes your image, as shown below:

publish-my-production-variant:
  extends: .kokkuri:build-and-publish-image
  stage: publish
  variables:
    BUILD_VARIANT: production
  tags:
    - trusted
  rules:
    - if: $CI_COMMIT_TAG && $CI_COMMIT_REF_PROTECTED

This job:

extends the .kokkuri:build-and-publish-image job, which is used for publishing images to a registry
builds the production variant specified in the Blubber file
has a trusted tag
runs when a protected tag is created

This job will now run on a trusted runner and publish the resulting image in the Wikimedia registry. You can use this image to deploy your application in production.

Publishing an image without the trusted tag will place it in a temporary registry instead of the Wikimedia registry. You can use an image from the temporary registry, for example, for end-to-end tests like in the end-to-end-test-prod job defined earlier. It’s impossible to deploy an image from the temporary registry to production.
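
Note that the publish-my-production-variant job uses a publish stage and a production variant, neither of which appears in the earlier examples on this page. GitLab CI rejects a job whose stage is not declared, so the stages list needs a matching entry. A minimal sketch, assuming you keep the stages defined earlier and want publishing to happen after the end-to-end tests (the position of publish in the list is an assumption):

```yaml
stages:
  - unit-test
  - end-to-end-test
  - publish          # stage used by publish-my-production-variant
  - deploy
```

Similarly, BUILD_VARIANT must name a variant that exists in your .pipeline/blubber.yaml file; with the Blubber example from this page, that would be run-js rather than production.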

Deploying an image to production

Deployments to the production cluster use Helm charts and the helmfile command.

If you want to deploy a new service, you first need to create a set of Helm configuration files in the deployment-charts repository. You can do this using the create_new_service.sh script available in the same repository by following the instructions in the README file. To learn more about deployment charts, see Deployment Charts.

Once the files are in place, all further updates will involve changes in the values.yaml and values-*.yaml configuration files for your service. You must ensure that image and version values in these configuration files match the image created and published to the Wikimedia registry in the GitLab CI pipeline. Unlike in the Jenkins-powered deployment pipeline, GitLab CI doesn’t update image versions automatically.
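
As an illustration of keeping these values in sync with the published image, a values file might pin the image as in the sketch below. The main_app key layout, the image path repos/my-team/my-service, and the version v1.2.3 are all assumptions made for illustration; use the keys defined by the chart your service is based on, and the image name and tag reported by your GitLab CI publish job.

```yaml
# values.yaml for your service in the deployment-charts repository
# (sketch only; the key names below are assumed, not taken from this page)
main_app:
  # Image path in the Wikimedia registry, without the tag (hypothetical project path).
  image: repos/my-team/my-service
  # Must match the tag of the image published by the GitLab CI pipeline (hypothetical value).
  version: v1.2.3
```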

After making sure that the Helm configuration is correct, log in to the deployment server (deployment.eqiad.wmnet always points to the active server). Then run the following commands:

cd /srv/deployment-charts/helmfile.d/services/${SERVICE}
 
helmfile -e ${CLUSTER} -i apply --context 5

${SERVICE} is the name of your service in Helm configuration
${CLUSTER} is the name of the Kubernetes cluster you are deploying to

This will deploy your image. Typically, you will want to deploy to all three clusters: staging, eqiad, and codfw, which means running the helmfile command three times. First, deploy to the staging cluster and run smoke tests to ensure your application is working as expected. Then deploy to the remaining clusters.

To learn more about this part of the process, see Kubernetes/Deployments.

Publishing project documentation

To publish your project’s documentation on doc.wikimedia.org, read the docpub documentation or follow the instructions below.

1. Include docpub in your project’s .gitlab-ci.yml file.
2. Write a GitLab CI stage and job responsible for building the documentation.
3. Add .docpub:build-docs to the extends section of the job that builds the documentation.
   - This job can’t use the after_script keyword, as it would override the after_script specified by docpub.
   - This job must specify the DOCS_DIR variable, pointing to the location of the documentation produced by the job.
4. Write a GitLab CI stage and job responsible for publishing the documentation. This job must:
   - extend the .docpub:publish-docs job
   - depend on the job responsible for building the documentation
   - specify the PUB_LOCATION variable, pointing to the desired location of your documentation published on doc.wikimedia.org.
5. Add your project to allowed_projects by creating a merge request or reaching out to Release Engineering.

For an example of GitLab CI configuration that builds and publishes documentation to doc.wikimedia.org, see the docpub README file. For more information, see Publishing docs.
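
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the configuration from the docpub README: the include path, the stage and job names, the npm run docs command, and the values of DOCS_DIR and PUB_LOCATION are all hypothetical and must be adapted to your project.

```yaml
include:
  # Assumed include path; check the docpub README for the current one.
  - project: 'repos/releng/docpub'
    file: 'includes/publish.yml'

stages:
  - build-docs       # hypothetical stage names
  - publish-docs

build-docs:
  extends: .docpub:build-docs
  stage: build-docs
  variables:
    DOCS_DIR: docs/build        # hypothetical location of the generated documentation
  script:
    - npm run docs              # hypothetical command that builds the documentation

publish-docs:
  extends: .docpub:publish-docs
  stage: publish-docs
  needs: [build-docs]           # depends on the job that builds the documentation
  variables:
    PUB_LOCATION: my-project    # hypothetical target location on doc.wikimedia.org
```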

Extra information

Runners

GitLab CI jobs run their containers in environments called runners. The Wikimedia Foundation GitLab instance has two types of runners: trusted and untrusted.

Trusted runners

You can use a trusted runner for deployments to Wikimedia production. These runners can push to the Wikimedia-hosted Docker registry, and can use images hosted in the same registry, but not images from other registries. These runners don’t support the use of Dockerfiles for configuring your containers. If you need to use a custom container image on a trusted runner, you must configure it using Blubber.

For special purposes, such as building base images or other foundational images, a special Trusted Dockerfile runner is available to a restricted set of projects. However, this runner cannot be used to build service images.

For more information on trusted runners, see Trusted Runners.

Untrusted runners

You can use untrusted runners for jobs that don’t deploy to Wikimedia production or push to the Wikimedia Docker registry.

Two types of untrusted runners are available in GitLab CI pipelines:

Runners on the DigitalOcean Kubernetes cluster - used by default. These runners allow pipelines to use Docker images from different registries (notably Docker Hub). Configuration and administrator documentation of these runners is available in the Gitlab Cloud Runner repository.
Runners on WMCS infrastructure - used when a job or pipeline has the wmcs tag. These runners allow pipelines to use Docker images from the Wikimedia registry, just like trusted runners. For administrator documentation of these runners, see Shared runners.

All untrusted runners support both Blubber and Dockerfile container configurations.
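
For example, to route a job to the WMCS runners, add the wmcs tag to it. The sketch below reuses the test job from earlier on this page; sending this particular job to WMCS is only an illustration, not a recommendation.

```yaml
test:
  extends: .kokkuri:build-and-run-image
  stage: unit-test
  variables:
    BUILD_VARIANT: test-js
  tags:
    - wmcs           # run on the WMCS runners instead of the default untrusted runners
```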
