Topic on Talk:GitLab consultation

OpenCore / our use case does not match GitLab business cases

Hashar (talk | contribs)

I have been involved in the migration from Subversion to Git/Gerrit. My primary responsibility is maintaining the CI infrastructure, and a secondary one is administering our Gerrit.


I would like to dig into GitLab's business model (open core): how it splits features between the free-to-use and proprietary tiers, which features we would definitely want, and which workarounds we would have to implement. My thesis is that if we stick solely to the open source version, we do not fit GitLab's vision and will endure a long journey of implementing our own tooling on top of their limited open source offering. It might not even be cost effective and would not address the maintenance and upgrade needs.

My conclusion is that we should rather adopt their full offering, which implies relying on proprietary code but with the benefit of a fully integrated environment. If we do not want to compromise on the open source principle, the alternative is to seek another code review system or stick with the status quo and invest in it.

GitLab open core

business model

One thing I really like about GitLab as a company is that they are extremely transparent (see for example their staff handbook). There is thus no surprise in their offering.

GitLab is a company and has to make money somehow, which it achieves by selling a product to customers. Their business model is open core: the main features of their product are released as open-source software and are thus free to use (the license grants you the ability to use and modify it however you want AND the license is not subject to a fee). Extended features are offered under a proprietary license which comes with a fee. In GitLab's case the code is additionally available for reading, but you cannot use it without agreeing to their proprietary license.

My understanding is that the open core business model emerged as a way to fund open source software. Restricting the availability of some features effectively forces corporate users to pay for them, which in turn funds the company and the open source part. Wikimedia already uses the open source part of such open core projects: Kafka, Cassandra, ElasticSearch, Redis to name a few.

open core and Wikimedia

One of the Wikimedia Foundation's guiding principles is freedom and open source. All software written by the Foundation is open-sourced. My opinion is that this allows anyone to fork our projects and have the whole software stack to do so without requiring any license.

For the organization itself, we do prefer using open source software, but do not forbid proprietary software. When there is no effective open-source alternative or the infrastructure would be too challenging to maintain, we adopt a proprietary solution. As an example, the Wikimedia Foundation uses the Google suite for email, calendar, some documents and video calls. Open source alternatives do exist, but integrating them all together in a user-friendly suite is probably not achievable.

I don't think GitLab's proprietary code goes against the Foundation's guiding principle about freedom and open source, as long as it is shown that those proprietary features can't be fulfilled effectively by open source tools. After all, the code review system is not essential in order to fork a wiki project, just like using Gmail or a proprietary code editor does not prevent producing open source software.

features tiers

The way GitLab determines whether a feature should be open source or in one of the proprietary tiers is described on their stewardship page and especially on their pricing page. In summary, when a feature is introduced, they ask themselves who the likely buyer is going to be:

  • a single developer: open source
  • a team manager, director, executive: proprietary tiers.

I will take as examples two features that were discussed on this talk page previously:

merge approval
when you are a single user, it does not make much sense to request a self approval or ask yourself to review your code. The feature is not intended for a single developer and is thus in a paid tier.
searching
as a single developer, you already have all the repositories on your local machine and can just search through them on disk. You would probably not bother setting up an ElasticSearch backend and running the indexer either. It is thus in a paid tier.
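The buyer-based rule above can be sketched as a small lookup. This is only an illustration of the heuristic as I summarized it, not GitLab's actual process: the names `Buyer` and `tier_for` are hypothetical, and the buyer attached to each example feature is my own assumption.

```python
from enum import Enum


class Buyer(Enum):
    """Likely buyer of a feature, following the summary above."""
    SINGLE_DEVELOPER = "single developer"
    TEAM_MANAGER = "team manager"
    DIRECTOR = "director"
    EXECUTIVE = "executive"


def tier_for(buyer: Buyer) -> str:
    """Return the tier the heuristic assigns to a feature for this buyer."""
    if buyer is Buyer.SINGLE_DEVELOPER:
        return "open source"
    return "proprietary"


# The two examples from this talk page. The buyer is an assumption made
# for illustration: neither feature targets a single developer.
EXAMPLES = {
    "merge approval": Buyer.TEAM_MANAGER,
    "searching": Buyer.TEAM_MANAGER,
}

for feature, buyer in EXAMPLES.items():
    print(f"{feature}: {tier_for(buyer)}")
```

Both examples land in the proprietary tiers because nobody would buy them for a single developer, which is exactly what makes them unavailable to us.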

Based on community feedback, they might move some features to the open-source tier. That said, and quoting GitLab: the premium product needs to hold value. They can't just open-source everything, or there would be little incentive for corporate users to pay for a license, which in turn would dry up the company financially.

Features requirements

GitLab's features by tier are listed at https://about.gitlab.com/pricing/self-managed/feature-comparison/ . We have already listed the features we are after and how the gaps could be filled: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/GitLab/Features . The bottom of the tables lists them as likely non-blockers; I thus assume those at the top are the hard requirements.

There are a few deal breakers left:

merge approvals
it is the feature that enforces code review. The lack of that feature prevents us from implementing our privilege policy and would make it challenging to figure out whether a merge request is suitable for approval. A proposed workaround is marge-bot, but its documentation mentions that you have to turn on the proprietary required approvals feature in GitLab. marge-bot seems to be more of a CI system that runs the tests of the branch against the target branch to ensure nothing breaks, which is the same feature we have in our current CI / Zuul. I could not find any lead as to how marge-bot could potentially replace the proprietary feature; it actually seems to rely on it.
Merge Request Dependencies
for a single project in Gerrit, you would chain your commits in the order you want them merged; in GitLab that would be a branch and a single merge request. We often have to express a dependency between projects: for a breaking change in mediawiki/core that affects several extensions, we want to make sure the extensions get updated before the breaking change is merged. It is not really a blocker for GitLab, since Gerrit does not manage cross-repository dependencies either; that is enforced by the CI system (Zuul). Still, it would be nice to rely on the built-in GitLab feature.
permissions
GitLab comes with five roles: Guest, Reporter, Developer, Maintainer, Owner. The model seems to have a clear separation of concerns, but it is not clear to me whether it would fit our current permission schemes, which are based on groups of users rather than generic roles. As an example, the mediawiki/core fundraising branches can only be acted upon by the Wikimedia fundraising team. GitLab offers a way to protect branches, but that is based on one of the existing roles (for example Developers). We would thus be unable to restrict that branch to a subset of people.
Cross repositories search
I could not find a way to find related merge requests across projects, similar to how in Gerrit I can look for any change I am a reviewer for (is:open reviewer:self) or any change related to a given bug (bug:T12345).
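To make the Gerrit side of that last point concrete, here is a minimal sketch of how those two searches map onto Gerrit's REST API, which lists changes across all projects through the `/changes/` endpoint. The base URL and the helper names `changes_url` and `parse_changes` are assumptions for illustration; the sketch only builds the URL and decodes a response body, it does not perform the request.

```python
import json
from urllib.parse import quote

# Assumption: base URL of the Gerrit instance; adjust for your installation.
GERRIT_BASE = "https://gerrit.wikimedia.org/r"

# Gerrit prefixes its JSON responses with this line to prevent XSSI attacks.
XSSI_PREFIX = ")]}'"


def changes_url(query: str, base: str = GERRIT_BASE) -> str:
    """Build the REST URL listing changes that match a Gerrit search query."""
    return f"{base}/changes/?q={quote(query, safe=':')}"


def parse_changes(body: str) -> list:
    """Strip the XSSI prefix from a response body and decode the JSON list."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


# The two searches mentioned above, spanning every project at once.
print(changes_url("is:open reviewer:self"))
print(changes_url("bug:T12345"))
```

Note that `reviewer:self` only makes sense for an authenticated request, so a real client would need to send credentials. The point is that a single query covers every repository, which is what I could not find in the open source tier of GitLab.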

GitLab's core / open-source set of features is strong; however, there are several proprietary features that would also be very nice to have to replace existing tooling. By excluding them, we prevent ourselves from proposing a nicely integrated solution. A partial list of such features would be:

Code Owners
It assists in finding reviewers. We currently run a bot configured via https://www.mediawiki.org/wiki/Git/Reviewers, and our Gerrit installation does have a similar feature (the reviewers plugin), although it is not used. We can surely port our existing bot to GitLab but would still lack the nice integration.
global search
currently worked around with https://codesearch.wmcloud.org/search/ and mirroring to GitHub.
All the contributions metrics served currently by Bitergia.
Security analysis of dependencies, currently implemented as a custom script run by CI or relying on GitHub security analysis.


Benefit from the full suite

GitLab offers a well integrated suite and could even replace a lot of our custom tooling, which would enhance our overall experience. Unfortunately, restricting ourselves to the open-source features alone prevents us from benefiting from the whole experience, which in my opinion should have been the main driver for migrating to GitLab besides just git hosting and code review. Security audit, code search and metrics are all features that are currently badly exposed to developers and would be shown more prominently via a higher tier of GitLab, in turn leading people to use them and improving our daily tasks.

The compromise of restricting ourselves to just the open source features might address the usability issues that people encounter with Gerrit. But it comes with significant costs that we should not underestimate:

  • developers' workflows will be disrupted, and we must accept that some of the existing workflows would not be implementable in GitLab.
  • the GitLab architecture involves several components we would have to maintain. Whereas Gerrit is a Java application and flat files, GitLab involves far more components (Rails, Redis, PostgreSQL and diff storage at least). We should not underestimate the resources that would need to be allocated to sustaining it. That has already proven to be troublesome for Gerrit.
  • a lot of our tools do not have a drop-in replacement and would have to be migrated.

Getting the most out of GitLab

I would like to suggest we evaluate GitLab's proprietary paid tiers. Oh, I see pitchforks being raised. As alluded to above, the freedom and open source guidance applies first and foremost to the software we write and to the wiki projects. It still allows the use of proprietary software when no alternative exists. There does not seem to be any proper open source alternative for a modern code forge (besides Phabricator/Differential, which we ended up rejecting).

I strongly believe in open-source, but we also have to be pragmatic and understand that not all open-source software can be heavily funded via fundraising. People end up having to make a living out of it somehow, and open core is a compromise between economic reality and the purity of open-source. If we were to adopt the proprietary features, we would relieve ourselves of the burden of reinventing the wheel and in turn allocate the saved resources toward producing more open source software and better sustaining our very own projects.

One might as well ask whether it still makes sense to self-host the application. The Gerrit upgrades proved to be somewhat problematic due to lack of funding and/or active participation with upstream (though we had at least two volunteers helping dramatically on that front). I would guess we would suffer from the same trouble with GitLab, whose architecture is an order of magnitude more complicated, or at least involves more components. Using a SaaS or a managed on-premise appliance would free us from the maintenance burden.


Conclusion

Hypothetically, if we were to agree to use proprietary software and SaaS, we would have a state-of-the-art code hosting solution with all the bells and whistles that make the life of developers so much easier. Under that hypothesis, we might as well consider using GitHub, which is already the canonical place for several repositories. GitHub does have an on-premise offer which would fit some of our privacy requirements. After all, GitLab and GitHub offer very similar experiences in the end.

The cost (time and money) of migrating to the open source version will be fairly large, and I don't think it is offset by the limited set of features offered. We will still need to deal with the infrastructure maintenance and add a lot of new and custom tools on top of GitLab to make it fit our requirements.

There are alternatives though. One is to elevate ourselves from being a Gerrit consumer to being an actor in its open-source community. A lot of the usability concerns can be addressed by proposing code and enhancing the software. The UI now uses a JavaScript templating engine instead of Java. Changing the project creation capability from a global one to a regular permission is probably not that complicated to implement if a couple of our Java developers were to look into it. But that needs resourcing on our side, either from our own developers or by contracting developers familiar with Gerrit.

We could also consider other forms of hosting for Gerrit, be it through a company such as GerritForge or via a like-minded organization: OpenStack and their OpenDev project. The latter offers more or less the same stack of tooling we use, is entirely open source, and is where we borrowed our CI system from.

The full GitLab suite is a good fit, but the limited subset of features in the open source version only gives us the branch workflow, a somewhat more pleasant UI and repository creation. It is in my opinion too limited to be worth migrating to.

Antoine "hashar" Musso (talk) 12:40, 1 October 2020 (UTC)
