Wikimedia Release Engineering Team/CI Futures WG/Requirements

As of February/March 2019, the Release Engineering team at the Wikimedia Foundation is looking at what the tooling for the deployment pipeline should be like. As the first step, we're collecting requirements. Feel free to add yours here; we are also reaching out to interview the most important stakeholders.

had some answers related to this.

Requirements that are better than our current situation are labelled (NEW!) for clarity.

= Very hard requirements =

Most CI software will meet most hard requirements. This section helps to quickly rule out some solutions. These are meant to be easy and quick to check for each candidate.


 * Must be free software / open source. "Open core" like GitLab might be good enough.
 * Must be hostable by the Foundation. It's not acceptable to rely on outside services.
 * Must support git.
 * Must have a version we can easily use for evaluation.
 * Must be understandable to our developers without too much effort, so that they can use CI/CD productively.
 * (NEW!) Must support self-serve CI, meaning we don't block people if they want CI for a new repo.

= Hard requirements =


 * Must be fast enough that it isn't perceived as a bottleneck by developers.
 * Must make its status and what-is-going-on visible so that its operation can be monitored and so that our developers can check the status of their builds themselves.
 * Must provide feedback to the developers as early as possible for the various stages of a build, especially the early stages ("can get source from git", "can build", "can run unit tests", etc.).
 * Must be secure enough that we can open it to community developers to use without too much supervision.
 * Must be maintained and supported upstream.
 * Must be able to handle the number of repositories, projects, builds, and deployments that we have, and will have in the foreseeable future.
 * Must enable us to instrument it to get metrics for CI use and effectiveness as we need. Things like cycle times, build times, build failures, etc.
 * Must empower our developers and remove the Release Engineering team as a bottleneck for their productivity.
 * Must work with Gerrit as well as other self-hostable code-review systems (e.g., GitLab), if we decide to move to that later.
 * Must enable us to have a short cycle time (from idea to running in production). CI is not the only thing that affects this, but it is an important factor.
 * Must promote (copy) Docker images and other build artifacts from "testing" to "staging" to "production", rather than rebuilding them, since rebuilding takes time and can fail.
 * Must allow developers to replicate locally the tests that CI runs. This is necessary to lower friction in development, as well as to aid debugging.
 * Must allow deployment to be fully automated.
 * Must be automatically deployable by us or SRE, using Puppet, onto a fresh server. I don't know if that means .deb packages are needed or not, but it certainly wouldn't be a bad thing.
 * Must be horizontally scalable: we need to be able to add more hardware easily to get more capacity.
 * Must be able to support all programming languages we currently support.
 * Must support linking to build results for easier reference and discussion.
 * Must support saving of build artifacts.
 * Must keep configuration in version control.
 * Must support gating / pre-merge testing.
 * Must support periodic / scheduled testing.
 * Must support post-merge testing.
 * Must support tooling to do the merging, instead of developers.
 * Must support storing tests in version control.
 * Must support collection of relevant metrics: cycle time, job run time, wait time, etc.
 * Must support reporting to Gerrit, IRC, and Phabricator.
 * Must have some way to declare dependent repositories / software needed for testing.
 * Must support services for tests; e.g., some PHPUnit tests require MySQL.
 * Must allow us to move the git repository hosting, code review, and ticketing systems away from Gerrit and Phabricator.
 * Must protect production by detecting problems before they're deployed, and must in general support a sensible CI/CD pipeline.
 * Must allow the Release Engineering team to enforce tests on top of what a self-serving developer specifies, to allow us to set minimal technical standards.
 * Must support dependency caching – we have castor, maybe we could do better? Maybe some CI systems have this figured out?
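
To make the metrics requirements above (cycle time, job run time, wait time) more concrete, here is a minimal Python sketch of how they could be derived from per-build timestamps. The `BuildRecord` type and its field names are hypothetical and do not come from any particular CI system. Cycle time is approximated here as "change pushed to change deployed", which is an assumption and narrower than "from idea to running in production".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BuildRecord:
    """Hypothetical timestamps for one change going through CI/CD."""
    pushed_at: datetime    # change was pushed for review
    queued_at: datetime    # CI job entered the queue
    started_at: datetime   # CI job began executing
    finished_at: datetime  # CI job ended
    deployed_at: datetime  # change reached production


def wait_time(build: BuildRecord) -> timedelta:
    """Time the job spent waiting in the queue before it started."""
    return build.started_at - build.queued_at


def run_time(build: BuildRecord) -> timedelta:
    """Time the job spent actually executing."""
    return build.finished_at - build.started_at


def cycle_time(build: BuildRecord) -> timedelta:
    """Assumed definition: from push to running in production."""
    return build.deployed_at - build.pushed_at
```

Whatever CI system is chosen would need to expose at least these timestamps (or equivalents) for such numbers to be collected.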

= Softer requirements =


 * Should have a hosted instance we can play with for evaluation to avoid having to install everything from scratch.
 * (NEW!) Should not require software development from the Foundation.
 * Should allow builds to happen in K8s containers, but probably should also support running jobs on bare metal or VMs. For example, building Docker containers can't happen inside Docker containers.
 * (NEW!) Should allow the developers to define or declare at least parts of the pipeline jobs in the repository: what commands to run for building, testing, etc.
 * Should be highly available - can restart any component without disrupting service.
 * Should be fast at checking out code and running tests to give quick feedback to developers.
 * Should have live console output of build.
 * Should have build timeouts.
 * Should support secure storage of credentials / secrets.
 * Should provide a clean workspace for each test run - either a clean VM or container.
 * Should allow archiving build logs and possible artifacts for a long period, to allow extracting metrics from a long time period.
 * (NEW!) Should have rate limiting - one user/project cannot take over most/all resources.
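
As an illustration of the rate limiting item above, the following Python sketch shows one possible policy: a per-project cap on concurrently running jobs. The function name `pick_jobs` and its parameters are hypothetical; a real CI system may limit by user, by queue, or by resource usage instead.

```python
from collections import Counter


def pick_jobs(pending, running, total_slots, per_project_cap):
    """Choose which pending jobs to start next.

    pending: list of (project, job_id) tuples in queue order.
    running: list of project names, one entry per currently running job.
    total_slots: total number of jobs that may run at once.
    per_project_cap: maximum concurrent jobs for any single project.
    """
    active = Counter(running)  # project -> number of running jobs
    free = total_slots - sum(active.values())
    started = []
    for project, job_id in pending:
        if free <= 0:
            break  # no capacity left at all
        if active[project] >= per_project_cap:
            continue  # this project is at its cap; let others through
        active[project] += 1
        free -= 1
        started.append((project, job_id))
    return started
```

With three free slots and a cap of two, a project that queues three jobs gets two of them started, and the third slot goes to the next project in the queue.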

= Would be nice =


 * (NEW!) Would be nice for Release Engineering team to not be a bottleneck: we should not be required to approve a change to how a job runs for a repo.
 * (NEW!) Would be nice for tests not to have to be written in shell script.
 * Would be nice for tests not to have to be written in Groovy script.
 * (NEW!) Would be nice for tests to be written (most likely) not in a "language" but in an abstraction.
 ** The abstraction should be easily extended in a "real" language – more than shell, ideally.
 * Would be nice for test abstractions to limit boiler-plate, i.e., all of our services are tested roughly the same way.
 * Would be nice to prioritize jobs.
 ** Use case: if there is a queue of jobs, there should be some mechanism of jumping that queue for jobs that have a higher priority.
 ** We currently have a Gating queue that is a higher priority than periodic jobs that calculate Code Coverage.
 * Would be nice to support isolation / sandboxing.
 ** Jobs should be isolated from one another.
 ** Jobs should be able to install apt-packages without affecting dependencies of other jobs.
 * (NEW!) Would be nice to have configurable job requirements/affinity.
 ** Be able to schedule a job only on nodes that have at least X available disk space/ram/cpu/whatever OR try to schedule on nodes where a current build of this job isn't already running.
 * Would be nice to build artifacts suitable for production.
 ** Currently we only do container images in a limited way – nice to haves: deb packages, java jars, go binaries, packagist downloads.
 * Would be nice to make it easy for developers to recreate failures locally.
 * (NEW!) Would be nice to bisect post-merge to find the patch that caused a particular problem with a Selenium test.
 * (NEW!) Would be nice to have a mechanism for deployment to staging, production, pypi, packagist, toollabs.
 * (NEW!) Would be nice to have efficient matrix builds.
 ** E.g., we currently run PHPUnit tests and browser tests for the Cartesian product of [PHP7, PHP7.1, PHP7.2, HHVM] × [MySQL, SQLite, PostgreSQL] × [Composer, MediaWiki vendor], but we perform setup and git clone separately for every one of those tests. Doing that in a space- and time-efficient way would be good.
 * (NEW!) Developers should have an option to SSH into the VM/container that CI used to run the tests, for debugging.
 * (NEW!) Would be nice to support building and testing mobile applications (at minimum for iOS and Android).
 * (NEW!) Would be nice to be able to run CI for secret/security patches.
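
The matrix build item above can be sketched in Python as follows. This is only an illustration of the idea of doing the checkout once and reusing it across all combinations; the names `run_matrix`, `checkout`, and `run_tests` are hypothetical, and a real system would also have to deal with parallelism and workspace isolation.

```python
from itertools import product

# The three axes named in the requirement above.
PHP_RUNTIMES = ["PHP7", "PHP7.1", "PHP7.2", "HHVM"]
DATABASES = ["MySQL", "SQLite", "PostgreSQL"]
DEPENDENCY_SOURCES = ["Composer", "MediaWiki vendor"]


def matrix():
    """Return every (runtime, database, dependency source) combination."""
    # 4 runtimes x 3 databases x 2 dependency sources = 24 combinations
    return list(product(PHP_RUNTIMES, DATABASES, DEPENDENCY_SOURCES))


def run_matrix(checkout, run_tests):
    """Run tests for every combination, sharing a single checkout.

    checkout: callable that clones the repository once and returns a
        workspace handle (for example, a path).
    run_tests: callable taking (workspace, runtime, database, deps)
        and returning the result for that combination.
    """
    workspace = checkout()  # done once, not once per combination
    return {
        combo: run_tests(workspace, *combo)
        for combo in matrix()
    }
```

The point of the sketch is the shape of the cost: one clone plus 24 test runs, instead of 24 clones plus 24 test runs.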