Wikimedia Release Engineering Team/CI Evaluation Cabal/2019-09-10

This meeting was focused on demoing and talking about GitLab.

Meta

 * Weights are done
 * Spreadsheet is done
 * Rankings 0-10

Lars' evaluation writeup
Executive summary: GitLab could be made to work, but Argo is probably preferable.

My GitLab PoC is not currently working; something about TLS is broken. However, here is a writeup based on my notes.

The PoC uses GitLab for CI/CD only, not as a git server. Our developers do not get direct access to GitLab in any form: they push to Gerrit, and Gerrit notifies a custom controller component, which drives GitLab to do builds and tests. Other custom components store build artifacts, manage test environments, and deploy to test environments and production.

I wrote three custom components: controller, VCS worker, and deployer. The VCS worker retrieves code from Gerrit; it is a separate component to isolate access to credentials for non-public Gerrit repositories (think security repositories).

Good things about building on top of GitLab:
 * used by some quite big projects, such as Debian and GNOME, which suggests both that it will scale to our needs and that GitLab and its CI are reasonably battle-tested
 * has an API that seems sufficiently powerful and flexible to allow us to extend CI as needed
 * supports both VMs and K8s containers as build workers, and each project can be configured to prefer one kind or the other

Bad things:
 * will require writing and maintaining some extra components, of which the controller is the most challenging
 * not fully free software, but "open core"
 * GitLab as a whole seems rather complicated with many moving parts; experience tells me this means it's more likely to break from time to time
 * internals are not entirely clean and are at least partly written in Ruby; I don't know if the team knows Ruby
 * for unknown reasons, my GitLab prototype is broken (see above), which may or may not be a deal-breaker, but is a bad smell

Irrelevant things:
 * Some people would like us to switch from Gerrit to GitLab for git hosting and code review. We can do that regardless of what we use for our new CI system: GitLab can be used with any CI; it doesn't need to be used with its own.

My conclusion: I think we could live with GitLab as the base of our new CI, and I would love to write some custom components to extend it for our use cases. However, I suspect GitLab is not the best candidate: I think Argo may be a better choice, if we can bend it to our will.

GitLab overview Q&A

 * GitLab Evaluation Sheet: https://docs.google.com/spreadsheets/d/1bLIWKRfq0-H9b3HSxuwAMb7cDFu71NS_Nhwy5trwSoo/edit#gid=593312859


 * We can create a CI system around GitLab

Problems

 * We need a "controller" that gives the components commands to do certain things
 * Estimates:
   * 1 month to spec it out
   * 2(ish) months to write code (if we use Python)
 * Need a Zuul replacement for GitLab
 * It is "open core" not fully FOSS -- although we don't need anything specifically from the enterprise version
 * GitLab is complicated/has many moving parts -- the more parts the more breakage
 * Written in Ruby (not our strongest language as a team)
   * Seems like it's not a deal breaker, but it is something we will need to learn

Positives

 * GitLab is being used as a git-server + CI system for large projects (e.g., Debian and GNOME) -- GitLab can scale for us
 * This project has existed for a decade
 * Powerful API that can be used to extend it
 * Good documentation
 * Build workers that are: bare metal, VMs, or Containers
 * Will allow us to move away from Gerrit, although moving away from Gerrit does not require choosing GitLab as the CI system

Conclusions

 * "I think we should use Argo instead" - liw
 * Argo will scale up better than GitLab
 * The controller, primarily

Questions

 * Has a plugin API?
   * Lars has only been using the API.
   * https://docs.gitlab.com/ee/administration/plugins.html


 * Code for controller, etc.
   * http://git.liw.fi/wmf-ci-arch/tree/api.py
   * Single file contains all three components
     * Controller
       * Tells VCS worker to push repo to GitLab
       * Waits for the build to finish, gets the build artifacts, and puts them in the test env
     * VCS worker
     * Deployer
       * Runs on its own host -- takes artifacts and deploys them to a test env or prod env
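
The division of labour between the three components can be sketched roughly as follows. This is not the code from api.py: every class and method name here is hypothetical, GitLab is replaced by an in-memory stand-in, and the "build" finishes immediately instead of being waited on.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeGitLab:
    """In-memory stand-in for GitLab (assumption: no real API calls)."""

    pushed_refs: List[str] = field(default_factory=list)

    def receive_push(self, ref: str) -> None:
        self.pushed_refs.append(ref)

    def wait_for_build(self, ref: str) -> str:
        # A real controller would wait for the build to finish;
        # here the "build" completes immediately.
        if ref not in self.pushed_refs:
            raise ValueError(f"nothing was pushed for {ref}")
        return f"artifact:{ref}"


class VcsWorker:
    """The only component that would hold Gerrit credentials."""

    def __init__(self, gitlab: FakeGitLab) -> None:
        self._gitlab = gitlab

    def push_to_gitlab(self, repo: str, change: str) -> str:
        # Retrieving the code from Gerrit is omitted in this sketch.
        ref = f"{repo}/{change}"
        self._gitlab.receive_push(ref)
        return ref


class Deployer:
    """Records which artifacts were deployed to which environment."""

    def __init__(self) -> None:
        self.environments: Dict[str, List[str]] = {"test": [], "prod": []}

    def deploy(self, artifact: str, env: str) -> None:
        self.environments[env].append(artifact)


class Controller:
    """Drives the other components in response to a Gerrit notification."""

    def __init__(self, vcs: VcsWorker, gitlab: FakeGitLab, deployer: Deployer) -> None:
        self._vcs = vcs
        self._gitlab = gitlab
        self._deployer = deployer

    def handle_change(self, repo: str, change: str) -> str:
        ref = self._vcs.push_to_gitlab(repo, change)   # 1. push repo to GitLab
        artifact = self._gitlab.wait_for_build(ref)    # 2. wait for the build
        self._deployer.deploy(artifact, "test")        # 3. put it in the test env
        return artifact
```

Keeping the Gerrit-facing code inside the VCS worker means the controller and deployer never need credentials for non-public repositories.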


 * Concerns about concurrency for api.py?
   * e.g., two patchsets uploaded to the same repo
   * Yes: it will cause issues, especially in this particular implementation
   * Conceptually, this is not a problem for a controller
   * Each patchset could use a uuid4/new branch
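
A minimal sketch of the uuid4/new-branch idea, assuming the controller pushes each patchset to its own branch. The function name and the "ci/" naming scheme are made up for illustration and are not part of the prototype.

```python
import uuid


def ci_branch_name(change_number: int, patchset: int) -> str:
    """Return a unique branch name for one CI run of one patchset."""
    # uuid4() is random, so a collision between two runs is practically
    # impossible, even when both runs are for the same patchset.
    return f"ci/{change_number}/{patchset}/{uuid.uuid4()}"
```

With names like these, two patchsets uploaded to the same repo at the same time end up on different branches and no longer overwrite each other's state.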


 * You think Argo is better - if this group decides on GitLab, would you be opposed?
   * No - I think all 3 candidates are sufficiently good. Perhaps we should use dice.


 * The only way to get GitLab CI to run is by pushing to GitLab itself
   * Concern: tight coupling to GitLab code hosting/code review
   * GitLab could revamp its CI, since it is tightly coupled to the code host, without expecting that anyone is using GitLab CI by itself
   * Pushing triggers a CI job; we don't expect that to change, but there is a possibility that GitLab would put us in a Zuulv2 situation


 * Concern: if we go all in on GitLab, it would make a lot of sense to use its CI, but obscuring the CI from users sets aside many of the benefits of GitLab itself


 * Is there artifact storage in GitLab?
   * Yes.
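
Since the controller has to fetch build artifacts, one option is the job artifacts endpoint of GitLab's v4 REST API (GET /projects/:id/jobs/:job_id/artifacts). The sketch below only builds the request and does not send it. The function name is hypothetical, and the host, project path, job ID and token are placeholders to be supplied by the caller.

```python
from urllib.parse import quote
from urllib.request import Request


def artifacts_request(base_url: str, project: str, job_id: int, token: str) -> Request:
    """Build (but do not send) a request for a job's artifacts archive."""
    # Project paths such as "group/project" must be URL-encoded in the API path.
    project_id = quote(project, safe="")
    url = f"{base_url.rstrip('/')}/api/v4/projects/{project_id}/jobs/{job_id}/artifacts"
    # GitLab accepts a personal access token in the PRIVATE-TOKEN header.
    return Request(url, headers={"PRIVATE-TOKEN": token})
```

Passing the returned object to urllib.request.urlopen would download the artifacts archive for that job.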


 * How do you define a job in GitLab?
   * .gitlab-ci.yml -- https://docs.gitlab.com/ee/ci/yaml/
   * The prototype used this for the build and unit test phases