Security/Application Security Pipeline

Purpose
This document provides guidance on how to integrate security into the CI/CD pipeline, leveraging both GitLab's integrated tools and custom tools provided and developed by the Security Team.

In an effort to improve application security testing, our goal has been to “shift left” to remove more vulnerabilities earlier. The idea is to empower developers to find and fix vulnerabilities earlier in the software development lifecycle, when changes are less costly and more timely.

With security embedded into the development workflow, developers get feedback on the security of their code as they work and can remediate in real time. This frees up the Security Team's time to focus on monitoring issues, assessing risk, and solving vulnerabilities that can't be fixed by the developer. Continuously testing even small, incremental code changes also avoids an avalanche of work at the end of the SDLC.

Note: The Security Team strongly recommends including security scanning tools in the pipelines of both migrated and new repositories. These tools have to be triggered both for new merge requests and for Continuous Deployment/Delivery.

Templates Provided by the Security Team
The Security Team strongly recommends using the relevant templates located at this GitLab repository. To include these templates, please see the documentation.
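
As a rough sketch of what including a template from another project can look like, GitLab's include keyword accepts a project, a ref, and a file. The project path and file name below are placeholders and not the actual names used by the Security Team; the templates repository and its documentation give the real values.

```yaml
# Hypothetical example: the project path and file name are placeholders.
include:
  - project: 'example-group/security-templates'  # assumption: replace with the templates repository path
    ref: main                                     # assumption: branch or tag to pin
    file: 'example-scanner.yml'                   # assumption: replace with the template you need
```

Pinning ref to a branch or tag keeps a pipeline from silently picking up template changes.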

Custom Templates
Although the Security Team strongly advises against the use of custom templates, it is possible to implement your own.

GitLab Default Templates
For languages and frameworks not covered by the Security Team's templates, it is possible to use the GitLab default SAST template. It features automatic language detection, which works even for mixed-language projects. If any supported language is detected in the project source code, it automatically runs the appropriate SAST analyzers.

Even though the results may not be ideal, they can still provide some value. Here is the list of GitLab supported languages.

How to configure GitLab default SAST templates
To enable and configure SAST with default settings:
 * 1) On the top bar, select Menu > Projects and find your project.
 * 2) On the left sidebar, select Security & Compliance > Configuration.
 * 3) In the SAST section, select Configure with a merge request.
 * 4) Review the draft MR that enables SAST with the default recommended settings in the .gitlab-ci.yml file.
 * 5) Merge the MR to enable SAST. You should see SAST jobs run in that MR’s pipeline.
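
For reference, the merge request generated by these steps typically adds an include of GitLab's SAST template to .gitlab-ci.yml. A minimal sketch, assuming the project has no other CI configuration (the template path can differ between GitLab versions, so prefer whatever the generated MR contains):

```yaml
# Sketch of a minimal .gitlab-ci.yml enabling GitLab's default SAST jobs.
# Assumption: the template path may differ by GitLab version.
stages:
  - test

include:
  - template: Security/SAST.gitlab-ci.yml
```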

OSV NodeJS
Name of the template:

Description: OSV is a vulnerability database and triage infrastructure for open source projects aimed at helping both open source maintainers and consumers of open source. This include file uses the osv-scanner client to analyze various lock files and SBOMs.

How to include it: change your .gitlab-ci.yml accordingly:

These tools are triggered for every new push or merge request. It is also possible to trigger them manually by visiting the CI/CD section on GitLab.

Environment variables:
 * the suggested image is:
 * overrides default arguments passed. The default is package-lock.json.

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output.

OSV Php
Name of the template:

Description: OSV is a vulnerability database and triage infrastructure for open source projects aimed at helping both open source maintainers and consumers of open source. In order to use the vulnerability scanner written in Go, a Software Bill Of Materials (SBOM) has to be generated from a "lockfile" which contains information about the versions of packages.

How to include it: change your .gitlab-ci.yml accordingly:

These tools are triggered for every new push or merge request. It is also possible to trigger them manually by visiting the CI/CD section on GitLab.

Environment variables:
 * the suggested image is:

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output.

OSV Golang
Name of the template:

Description: OSV is a vulnerability database and triage infrastructure for open source projects aimed at helping both open source maintainers and consumers of open source. In order to use the vulnerability scanner written in Go, a Software Bill Of Materials (SBOM) has to be generated from a "lockfile" which contains information about the versions of packages.

How to include it: change your .gitlab-ci.yml accordingly:

These tools are triggered for every new push or merge request. It is also possible to trigger them manually by visiting the CI/CD section on GitLab.

Environment variables:
 * the suggested image is:

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output.

OSV Python
Name of the template:

Description: OSV is a vulnerability database and triage infrastructure for open source projects aimed at helping both open source maintainers and consumers of open source. In order to use the vulnerability scanner written in Go, a Software Bill Of Materials (SBOM) has to be generated from a "lockfile" which contains information about the versions of packages.

How to include it: change your .gitlab-ci.yml accordingly:

These tools are triggered for every new push or merge request. It is also possible to trigger them manually by visiting the CI/CD section on GitLab.

Environment variables:
 * the suggested image is:

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output.

Npm Audit
Name of the template:

Description: The audit command submits a description of the dependencies configured in your project to your default registry and asks for a report of known vulnerabilities. If any vulnerabilities are found, then the impact and appropriate remediation will be calculated. If the fix argument is provided, then remediations will be applied to the package tree.

How to include it: change your .gitlab-ci.yml accordingly:

These tools are triggered for every new push or merge request. It is also possible to trigger them manually by visiting the CI/CD section on GitLab.

Environment variables:
 * the suggested image is:
 * Use this variable to specify custom flags. Default ones:

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output. can also output in.

Nancy
Name of the template:

Description: Nancy is a tool to check for vulnerabilities in your Golang dependencies, powered by Sonatype OSS Index. It currently works for projects that use  or   for dependencies.

How to include it: change your .gitlab-ci.yml accordingly:

Environment variables:
 * the suggested image is:
 * Use this variable to specify custom flags.

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output. Multiple different output formats are supported such as:,  ,   and.

AuditJS
Name of the template:

Description: Audits JavaScript projects using the OSS Index v3 REST API to identify known vulnerabilities and outdated package versions.

How to include it: change your .gitlab-ci.yml accordingly:

Environment variables:
 * the suggested image is:
 * use this variable to specify custom flags.

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output. can output directly as  or as   specifically formatted for   test cases.

Npm Outdated
Name of the template:

Description: This tool will check the registry to see if any (or, specific) installed packages are currently outdated.

How to include it: change your .gitlab-ci.yml accordingly:

Environment variables:
 * the suggested image is:
 * use this variable to specify custom flags.

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output. can also output in.

Php composer outdated
Name of the template:

Description: Composer Outdated is a sub-function of composer that checks for outdated dependencies.

How to include it: change your .gitlab-ci.yml accordingly:

Environment variables:
 * the suggested image is:
 * Use this variable to specify custom flags.

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output. can also output in.

Php security checker
Name of the template:

Description: PHP Security Checker is a command line tool that checks if your PHP application depends on PHP packages with known security vulnerabilities. It uses the Security Advisories Database behind the scenes.

How to include it: change your .gitlab-ci.yml accordingly:

Environment variables:
 * the suggested image is:
 * Use this variable to specify custom flags.

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output. can also output in,  ,  ,  , and.

Safety db
Name of the template:

Description: Safety DB is a database of known security vulnerabilities in Python packages. The data is made available by pyup.io and synced with this repository once per month. Most of the entries are found by filtering CVEs and changelogs for certain keywords and then manually reviewing them.

How to include it: change your .gitlab-ci.yml accordingly:

Environment variables:
 * the suggested image is:
 * Use this variable to specify custom flags.

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output.

Bandit
Name of the template:

Description: Bandit is a tool designed to find common security issues in Python code. To do this Bandit processes each file, builds an AST from it, and runs appropriate plugins against the AST nodes. Once Bandit has finished scanning all the files it generates a report.

Bandit was originally developed within the OpenStack Security Project and later rehomed to PyCQA.

How to include it: change your .gitlab-ci.yml accordingly:

Environment variables:
 * the suggested image is:
 * Use this variable to specify custom flags.

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output. can also output in.

Php Phan-taint-check
Name of the template:

Description: Phan is a static analyzer for PHP that prefers to minimize false-positives. Phan attempts to prove incorrectness rather than correctness. The phan-taint-check plugin is a phan plugin which performs security-related static analysis against PHP codebases, with a specific adaptor for MediaWiki codebases.

How to include it: change your .gitlab-ci.yml accordingly:

Environment variables:
 * the suggested image is:  (PHP 8+ is required for a Symfony dependency of Phan)
 * Use this variable to specify custom cli flags.
 * The branch of mediawiki to clone, if analyzing a MediaWiki extension or skin. Default is  .  At some point, it would be nice to automate this to reliably use.
 * The current supported MediaWiki types are  and  .  For any other type of php application, set this variable to an empty string or any other value than "extensions" or "skins".
 * The project's git name or slug. Defaults to.
 * Specify whether you'd like to see all phan issues (prefixed with ) or just those from the phan-taint-check plugin (prefixed with  ).  Defaults to

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output.

Gosec
Name of the template:

Description: Inspects source code for security problems by scanning the Go AST.

How to include it: change your .gitlab-ci.yml accordingly:

Environment variables:
 * the suggested image is:
 * Use this variable to specify custom flags.

Expected output: The default output is the textual report of vulnerabilities in the GitLab CI console output. also supports,  ,  ,  ,  ,   and   output formats.

Semgrep
To use semgrep, include the relevant CI template within your .gitlab-ci.yml like any other provided security include/template.

You can configure which rule set or collection of rules to run by changing the variable. Note that you can chain multiple  options to leverage multiple policies or rules. For example:

Note: It is always strongly recommended to use one of the available python-based docker images from https://docker-registry.wikimedia.org/.

Note: The Security Team discourages the use of  because metrics will be reported back to Semgrep.

Default Rule Sets
Unfortunately, due to potential licensing issues with Semgrep's primary rule repository (see also: T304737), we cannot currently take advantage of any default rules or policies hosted at https://semgrep.dev/r, which are typically referenced via.

Semgrep Merge Tool
The Semgrep Merge Tool (hosted on Toolforge) is a tool developed and maintained by the Wikimedia Security Team for use with the semgrep CI template. This tool offers a curated group of OSI- and free-culture-compliant rules and policies which can be referenced via the URLs linked from the index page. For additional usage questions, please contact the current project maintainers.

Note: The Semgrep Merge Tool should be considered the default method of using semgrep rules within GitLab CI/CD.

Custom Rule Sets
Semgrep supports the development of custom rules. The suggested way to do this is to create a new  file that contains your rules (either within your repository, or within the wikimedia-semgrep-rules repository), and then reference these   files within the   file like this:

Default Scanning behaviour
By default, semgrep will only check files altered by a GitLab push or merge request. To trigger a full scan, you must manually run the pipeline or schedule it.

Note: It is still possible to override this default behavior by setting the variable  to "true" within your repository's   file:

Results
Security automation is far from perfect. Even though our goal is to have straightforward scan results and fully automated remediations, that is not always possible. Results may be polluted by false positives, and remediations may be unclear. When these scenarios occur, please get in touch with the Security Team to work on possible solutions.

Risk review and acceptance
Note: Repository owners implicitly accept the risk of the findings from the various security tools run in GitLab CI. It is thus their responsibility to fix issues or to accept the risk of these results. Please contact the Security Team for help on how to address the issues.

How to check results
To check the output of the various security tools, go to the Jobs page under the  section of your repository. Make sure you are viewing the correct branch of the repository as well.

Depending on the specific tool configuration, a job may never fail, or may fail only when new critical/high/moderate vulnerabilities are discovered. For example, the  and   are set to allow failures within a given pipeline, implying they are more informational in nature and shouldn't block any merge or deployment. By contrast,  and other tools would fail within a pipeline, with the expectation that a code maintainer would work through the reported issues and either resolve them or report them as false positives, possibly suppressing future output with the mechanisms supported by the individual tool.
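
The difference between informational and blocking jobs can be sketched with GitLab's allow_failure keyword. The job names and commands below are hypothetical placeholders and are not taken from the Security Team's templates:

```yaml
# Hypothetical job names and commands, for illustration only.
informational-scan:
  stage: test
  script:
    - ./run-informational-scan.sh   # assumption: placeholder command
  allow_failure: true               # a failure is shown as a warning; the pipeline still passes

blocking-scan:
  stage: test
  script:
    - ./run-blocking-scan.sh        # assumption: placeholder command
  allow_failure: false              # default behaviour: a non-zero exit fails the pipeline
```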

JSON Artifacts
Sometimes security tools emit JSON output in the form of a report. The JSON report file can be downloaded from the CI pipelines page, or the pipelines tab on merge requests by setting  to. For more information see the documentation and. This can be a more secure and customizable approach to security reports, but as nearly all Wikimedia repos, merge requests, etc. are or will be public, there isn't much advantage to this approach given current realities.
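
One generic way to make a JSON report downloadable from a pipeline is GitLab's artifacts keyword. A minimal sketch, assuming a hypothetical job whose tool writes its report to report.json (the job name, command, and file name are placeholders):

```yaml
# Hypothetical job: the name, command, and report file are placeholders.
example-security-scan:
  stage: test
  script:
    - ./run-scan.sh --output report.json   # assumption: the tool writes its JSON report here
  artifacts:
    when: always          # keep the report even when the scan job fails
    paths:
      - report.json
    expire_in: 1 week     # artifacts are removed after this period
```

Setting when: always matters for scanners that exit non-zero on findings, since artifacts are otherwise only uploaded for successful jobs.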

Example

For phpcs-security-tool add this to the  file:

Known Guides and Documentation

 * Configure SAST in the UI with default settings
 * GitLab CI Security Repository