GitLab/Hosting a project on GitLab

Where should your project live?
You have two options:


 1) Host the project under your personal namespace. This is appropriate and expected for things that benefit the Wikimedia movement and/or the MediaWiki ecosystem and are:
    * Personal or user-specific tooling such as dotfiles
    * Early experimental prototypes and short-lived exploratory work
    * Forks of projects you can't push to directly, but wish to make merge requests for
 2) Host the project under a group. This is appropriate for:
    * Long-lived, shared projects
    * ...that benefit the Wikimedia movement and/or the MediaWiki ecosystem

Supported GitLab projects live in a group named for the functional area of code, under the top-level /repos. This gives access to a shared pool of CI runners.

There are some exceptions to this layout, where there are differences in policy around project trust and access to CI runners:
 * gitlab:toolforge-repos
 * gitlab:cloudvps-repos

Users should typically be added to the project's group with an appropriate access level, rather than to the specific project.

Locating a group
First, check the list of subgroups under repos to see if an appropriate group already exists. Most of the existing ones correlate to WMF teams or affiliate organizations, or large functional areas like MediaWiki.

If you see a group that seems like a match, view the group and click on "Subgroup information" in the upper lefthand corner. Then navigate to "Members" to see existing members of the group. Ask one of them to invite you to the group. (TODO: What's the best general contact mechanism here? Phabricator form?)

For projects traditionally under the  namespace on Gerrit, we're porting most of the existing structure across:


 * Extensions:
 * Skins:
 * Services:

If your new project fits in one of these categories, create or request a project under the appropriate group.

Note: Gerrit supports parent repositories which contain both code and child repos. For example, on Gerrit  is both a git repo with code and a container for extension repos. This is not possible on GitLab, so  is a group and the code repo there is moving to.

Creating a new group
If you do not find an appropriate group but think one should be created, please use this form to request a new group.

Keep in mind that groups may only contain other groups and individual projects. They may not themselves be a code repository, so you may have to change project layouts you are familiar with from Gerrit.

Creating a new project

 * Visit https://gitlab.wikimedia.org and make sure you're signed in with your Wikimedia developer account.
 * Click "Menu"
 * Under "Projects", click "Create new project"
 * Click "Create blank project" or "Create from template"
    * "Create from template" will prompt you to select from a list of templates, then on to the project creation form
    * Keep in mind that templates may include use of features which are not supported on the Wikimedia instance, such as issues and wikis. If so, these will be automatically disabled at some point after your project is created.
 * Look under "Project URL" and click the "Pick a group or namespace" field
    * For a project under your user namespace, type your username in the search box or scroll down until you see the "Users" section and select your username
    * For a project under a group, type the group name in the search box or scroll until you see it
 * Click "Create project"

GitLab private (restricted) repos
Wikimedia aims for open collaboration—free code is free knowledge.

However, there are times when developers may wish to share sensitive information within a restricted group.

GitLab private repos are for restricted information
Restricted information is non-confidential information that is sensitive and should only be shared with a small group.

Examples include:
 * Security incident response activities
 * Security configurations (e.g., CAPTCHA dictionaries, spam filter settings)
 * Embargoed information, soon to be public
 * Hiring tasks
 * Data that is unlikely to be used to cause harm and is private for administrative reasons
 * Sensitive analyses of internal data

GitLab private repos are NOT for confidential information
Confidential information is regulated, privileged, or highly sensitive.

Confidential information is NEVER ALLOWED in GitLab private repos.

Examples include:
 * Personally Identifiable Information (PII)
 * IP addresses
 * Plain-text or weak-encryption (e.g., md5) passwords
 * Private key material

Requesting a private (restricted) repo
Prerequisites
 * Ensure that you and any collaborators are using strong passwords and have two-factor authentication set up for your account

Process

To request a private repo, file a request on Wikimedia Phabricator using the GitLab private (restricted) repo request form.

If you're requesting a private repo within your personal namespace (as opposed to a group namespace), first create a public repo yourself; it can then be converted into a private repo. A GitLab admin cannot create a private repo within your personal namespace.

Once you file the request, it will be reviewed at the next weekly Developer Experience team meeting and somebody will be assigned to handle it.

GitLab private repo requirements

 * If you intend to make a private repo public at some point, create a new public repo instead when you're ready. If there was sensitive/restricted information in the repo at some point, making it public will also make the entire commit history public. Once you get a repo to a "ready for public consumption" state, you should start a fresh commit history. What to do with the original (private) repo is up to you, but I suggest deleting it and using the new (public) repo as the source of truth going forward.
 * Do not have a single shared repository for multiple unrelated or loosely connected projects; use one repo per logical project. Since we have to request private repos it may be tempting to just request a single repo for all private work instead of requesting separate repos for different projects. However, "monorepos" are both a version control anti-pattern and create additional, unnecessary risk. Having multiple repos means you can manage access for them separately, and giving somebody access to one repo will not automatically give them access to other repos/projects.
 * PII like IP addresses cannot be shared even in private repos; use -restricted tasks in Phabricator only. There's an established, recognized process for keeping Phabricator info Security- or T&S-protected, and it also decreases the number of potential points of failure for that information. If you have to analyze traffic or edit data for specific IP addresses, either:
    * Store the IP addresses in a separate table to join with
    * Use variables in your query:
       * Store the IP addresses in an untracked (never committed) file or as environment variables
       * Read the file or access the environment variables in the notebook
       * Fill in the variables in the query with the values
       * Don't print the query before running it (or if you did for troubleshooting, make sure to remove those cells before committing)
 * Essentially: a user should not be able to use your notebook or repo's commit history to see which pages an IP address visited more than 90 days ago. (Or what IP addresses a registered user used more than 90 days ago.)
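
Two of the requirements above can be sketched as shell helpers. This is a minimal illustration, not an official tool: the function names export_without_history and staged_changes_have_no_ipv4 and their arguments are hypothetical. The first copies only the current files into a new repository with a single commit. The second is a crude pre-commit check for IPv4-looking strings; it also flags dotted version numbers and does not detect IPv6.

```shell
#!/bin/sh
# Sketch only: function names and arguments are illustrative assumptions.

# Copy the files at HEAD of a private checkout into a new directory and
# start a brand-new repository there, so no earlier commits come along.
export_without_history() {
    private_checkout="$1"
    public_dir="$2"
    mkdir -p "$public_dir" &&
    git -C "$private_checkout" archive --format=tar HEAD |
        tar -xf - -C "$public_dir" &&
    git -C "$public_dir" init --quiet &&
    git -C "$public_dir" symbolic-ref HEAD refs/heads/main &&
    git -C "$public_dir" add --all &&
    git -C "$public_dir" commit --quiet -m "Initial public commit"
}

# Return 0 when the staged changes add no IPv4-looking strings,
# 1 when they do, and 2 when the diff itself could not be produced.
staged_changes_have_no_ipv4() {
    repo_dir="$1"
    staged_diff=$(git -C "$repo_dir" diff --cached --unified=0) || return 2
    if printf '%s\n' "$staged_diff" |
        grep -E '^\+.*([0-9]{1,3}\.){3}[0-9]{1,3}' >/dev/null; then
        return 1
    fi
    return 0
}
```

For example, export_without_history ~/private-repo ~/public-repo would leave a one-commit repository in ~/public-repo (both paths are placeholders). Review the exported files before pushing, since anything still present at HEAD is copied as-is.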

Importing code to GitLab

 * Click on New Project > Import Project
 * Choose Repository by URL
 * Get URL from source repository
    * Go to source code management tool (Gerrit, GitHub, gitlab.com, etc.), find https URL
    * Something like: https://gerrit.wikimedia.org/r/mediawiki/services/[project]
 * Paste URL into "Git repository URL"
 * Name can be free-form
 * Slug should correspond to previously existing repo name
 * Fill out project description in detail
 * Visibility will be public by default; you will likely not be able to change this.
 * Press "Create Project"

This will create a complete copy of the repository, including all branches and review notes from Gerrit.
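
One way to spot-check that every branch came across is to compare the branch heads advertised by the source and by the imported repository. The helper below is a sketch under assumptions: the name same_branch_heads is hypothetical, both URLs must be readable by your git client, and it compares branches only (not tags or review notes).

```shell
#!/bin/sh
# Sketch only: the function name and arguments are illustrative assumptions.

# Succeed when both repositories advertise the same branch names
# pointing at the same commits.
same_branch_heads() {
    source_url="$1"
    imported_url="$2"
    source_heads=$(git ls-remote --heads "$source_url" | sort)
    imported_heads=$(git ls-remote --heads "$imported_url" | sort)
    [ "$source_heads" = "$imported_heads" ]
}
```

Usage would look like same_branch_heads [source url] [gitlab url], with a zero exit status meaning the branch lists match.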

Branch renaming
By policy (see T281593), repositories on GitLab should use "main" as the default branch name, unless they already use a default other than "master", such as "production" or "wmf/stable". To rename the default branch:


 * New branch -> Called main
 * Settings -> Repository -> Branch defaults -> main
 * Settings -> Repository -> Protected branches -> add main -> Protect
 * Branches -> trash can icon for master, type master
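
The first step above (creating main from master) can also be done from an existing checkout rather than the web UI. The following is a sketch, assuming a remote named origin that you are allowed to push to; the function name create_main_from_master is hypothetical. Changing the default branch, protecting main, and deleting master still happen in the project settings as listed.

```shell
#!/bin/sh
# Sketch only: the function name and argument are illustrative assumptions.

# Create a "main" branch on the remote that points at the same commit
# as the remote's "master", without touching local branches.
create_main_from_master() {
    checkout_dir="$1"
    git -C "$checkout_dir" fetch --quiet origin &&
    git -C "$checkout_dir" push --quiet origin \
        refs/remotes/origin/master:refs/heads/main
}
```

Pushing the remote-tracking ref rather than a local branch means the new main matches what is on the server, even if the local master is out of date.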

Updating references
All references to the repo need to be updated. This may include existing checkouts, submodule URLs, scripts, Puppet configuration, etc.

Tip: Use codesearch to find references in Wikimedia code.
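
For references inside a single checkout, git grep can list the tracked files that still mention the old location. This is a small sketch; the function name list_old_url_references is hypothetical, and the old URL is whatever your repository was previously cloned from.

```shell
#!/bin/sh
# Sketch only: the function name and arguments are illustrative assumptions.

# Print the tracked files in a checkout that contain the old URL as a
# literal string. Exits non-zero when nothing matches.
list_old_url_references() {
    checkout_dir="$1"
    old_url="$2"
    git -C "$checkout_dir" grep --files-with-matches --fixed-strings -e "$old_url"
}
```

This only covers files tracked in that one repository; references in other repositories, Puppet configuration, or external docs still need a wider search.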

Updating existing checkouts
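    # Point the existing checkout at the new GitLab location
    git remote set-url origin [gitlab url]
    # Download the branches from GitLab, including main
    git fetch
    # Switch to main, then delete the old local master branch
    git checkout main
    git branch -d master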

Updating .gitreview
If there's an existing .gitreview, update the file with GitLab info (for users who may want to use gerritlab), using the example below as a template. Replace  with your project id (which appears in the top left of the page when you visit the repo in https://gitlab.wikimedia.org).

Archiving old projects

 * TODO: Remove from CI

Mirroring projects to other code forges
See GitLab/Mirroring projects.

Enabling GitLab CI for a project
For new projects, see GitLab/Workflows/CI.

For projects with an existing, see GitLab/pipeline conversion.

To publish docs to, see GitLab/Publishing docs

TODOs

 * TODO: Update Phabricator mirroring?
 * TODO: Updating external references and docs
 * TODO: cf https://phabricator.wikimedia.org/project/profile/2829/