User:TCipriani (WMF)/GitLab consultation

From mediawiki.org

Why

For the past two years, our developer satisfaction survey has shown some dissatisfaction with Gerrit, our code review system, particularly among our volunteer communities. This dissatisfaction, coupled with an internal review of our CI tooling and practice, makes this an opportune moment to revisit our code review choices.

While Gerrit’s workflow is in many respects best-in-class, its interface suffers from usability deficits, and its workflow differs from mainstream industry practices. This creates barriers to entry for the community and slows onboarding for WMF technical staff. In addition, a growing number of individuals and teams (both staff and non-staff) are opting to forgo Gerrit and instead use a third-party hosted option such as GitHub. Reasons for choosing third-party hosting vary but, based on informal communication, they fall into three main groupings:

  • lower friction to create new repositories
  • easier setup and self-service of Continuous Integration configuration
  • more familiarity with pull-request style workflows

All these explanations point to friction in our existing code-review system slowing development rather than fostering it.

The choice to use third-party code-hosting hurts our collaboration (both internal and external), adds to the confusion of onboarding, and makes it more difficult to maintain code standards across repositories. At the same time, all software deployed to Wikimedia production is required to be hosted on and deployed from Gerrit.

If we fail to address the real usability problems that users have with Gerrit, people will continue to launch and build projects on whatever system they prefer. Wikimedia's GitHub already contains 152 projects that are not on Gerrit.

Gerrit improvements

This raises the question: if Gerrit has identifiable problems, why can't we solve those problems in Gerrit? Gerrit is open source (Apache licensed) software; modifications are a simple matter of programming.

We are unique in the community of Gerrit users, which includes large companies such as SAP, Ericsson, Qualcomm, and Google. Google, in particular, is singular in its use of Gerrit for projects like Android and Chromium. To support these large, open projects, multi-site capabilities are needed; however, much of that work is either closed-source or does not support multi-site writes.

What

The Wikimedia Release Engineering team has taken an initial look at GitLab.

GitLab is a capable and scalable code review system written in Ruby. GitLab is available for self-hosting, so there is parity with the rest of our development tooling infrastructure, which alleviates concerns about data privacy or usage restrictions of third-party hosting. As GitLab offers an MIT licensed community edition (CE), it adheres to the Foundation's guiding principle of Freedom and open source. Further, GitLab is used successfully by many other members of the Free Software community (Debian, freedesktop.org, KDE, and GNOME).

Finally, GitLab was evaluated as part of the Release Engineering team's Continuous Integration working group. While GitLab's continuous integration systems were found to be adequate for our needs, code review was out of scope for the charter of that working group. Therefore, GitLab's capabilities were weighed against the social friction and added complexity that integrating GitLab's CI with Gerrit would likely introduce. Replacing Gerrit with GitLab would significantly change the equation of our CI tooling, allowing us to use an industry-standard workflow and the self-serve CI tooling built into GitLab.

For these reasons we'd like to limit the scope of this discussion to evaluating whether it’s feasible and advisable to move from Gerrit to GitLab for code review. Things within scope include: code review workflows and processes; tools and bots involved in the code review process; merge, branch, and deployment strategies; as well as unforeseen blockers to adopting a new code review system within our technical community. For the avoidance of doubt, continuous integration (CI) and task/project tracking are out of scope of this evaluation.

Gerrit upstream has improved the UI in recent releases, and releases have become more frequent; however, upgrade path documentation is often lacking. The migration from Gerrit 2 to Gerrit 3, for example, required several upstream patchsets to avoid the recommended path of several days of downtime. This is the effort required just to maintain the status quo. Even small improvements require effort and time because our use case is often very different from that of the rest of the Gerrit community.

Definitions and comparisons

A few useful terms and definitions for the comparison of GitLab and Gerrit code review:

Mainline branch
A single, shared branch that acts as the current state of the repo. [1]
Integration
The process of combining a new piece of code into a mainline codebase. [2]
Patchset
The method of integration in Gerrit. A single commit under review. It is very common to amend a commit during the code review process. [3]
Merge request
The method of integration in GitLab. A request to merge one branch into another. [4]
Feature branching
Put all work for a feature on its own branch, integrate into mainline when the feature is complete. [5] These branches are often short-lived. [6]
Continuous integration
Developers do mainline integration as soon as they have a healthy commit they can share. [7]
Squashing
Combining a series of Git commits into a single Git commit. [8]
Fast-forward
Merging Git commit history while retaining the existing commit history, without creating a merge commit. [9]
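
To make the last two definitions concrete, here is a minimal Python sketch of squashing and fast-forwarding on a toy commit graph. It is an illustration only, not how Git, Gerrit, or GitLab implement these operations; the names `Commit`, `is_ancestor`, `can_fast_forward`, and `squash` are hypothetical, and the model assumes a simple linear branch history.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Commit:
    """Toy commit (hypothetical model, not Git's object format)."""
    message: str
    parents: Tuple["Commit", ...] = ()


def is_ancestor(ancestor: Commit, commit: Commit) -> bool:
    """Return True if `ancestor` is `commit` itself or reachable via parent links."""
    stack = [commit]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        stack.extend(current.parents)
    return False


def can_fast_forward(mainline: Commit, branch: Commit) -> bool:
    """A fast-forward is possible when mainline has not diverged from the branch,
    i.e. the mainline tip is an ancestor of the branch tip. No merge commit is needed."""
    return is_ancestor(mainline, branch)


def squash(base: Commit, branch: Commit) -> Commit:
    """Collapse every commit between `base` and the `branch` tip into one commit.

    Assumption: the branch history is linear, so following the first parent
    from the tip eventually reaches `base`.
    """
    if not is_ancestor(base, branch):
        raise ValueError("base must be an ancestor of branch")
    messages = []
    current = branch
    while current != base:
        if not current.parents:
            raise ValueError("non-linear history is not supported in this sketch")
        messages.append(current.message)
        current = current.parents[0]
    # Oldest message first, all folded into a single commit on top of base.
    return Commit("; ".join(reversed(messages)), (base,))


root = Commit("initial commit")
mainline = Commit("fix typo", (root,))
feature = Commit("add parser tests", (Commit("add parser", (mainline,)),))

print(can_fast_forward(mainline, feature))   # True: mainline has not moved
diverged = Commit("update docs", (mainline,))
print(can_fast_forward(diverged, feature))   # False: mainline gained a commit
print(squash(mainline, feature).message)     # add parser; add parser tests
```

In this toy model, a feature branch with two commits can either be fast-forwarded onto an unchanged mainline, keeping both commits, or squashed into a single commit before integration.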