Manual:Developing libraries/en

These are guidelines for creating, publishing and managing libraries based on code originally developed as a part of MediaWiki or a related Wikimedia project. Some of these are PHP-specific, but others are general and apply to all languages.

Rationale
There is a growing desire to separate useful libraries from the core MediaWiki application and make them usable in projects that are not based on MediaWiki. This "librarization" of MediaWiki is thought to have several long-term advantages:


 * Make life better for new (and experienced) developers by organizing the code into simple components that can be easily understood.
 * Reverse the inertia toward an ever-expanding monolithic core by encouraging developers (in core) to develop their work as reusable modules with clearly defined interfaces.
 * Start making true unit testing of core viable by having individually testable units.
 * Provide an interim step on the way to a service-oriented architecture in a way that is useful independently of that goal.
 * Encourage reuse and integration with the larger software ecosystem. Done correctly, this will provide a useful means of expanding our development capacity through recruitment of library authors eager to showcase their work on a top 10 website.
 * Share our awesome libraries with others and encourage contributions from them even if they aren't particularly interested in making our sites better.

In order for this strategy to be successful, these libraries need to develop a life of their own, independent of MediaWiki. Therefore, it will be important for library authors to have some latitude and independence in making the library successful. The policies surrounding these libraries should largely be dictated by the primary maintainer of the library, and the choices made may diverge from MediaWiki core. Note that the primary maintainer may not be the original author of the majority of the code (or even any of it), and will have latitude to make independent decisions about the library. The amount of latitude a maintainer gets is proportional to their commitment, credibility and hard work with respect to the library they maintain.

Repository hosting guidelines

 * Hosted in Gerrit with GitHub mirror
 * Hosted under the Wikimedia GitHub organization
 * Hosted under another GitHub organization
 * Hosted in Phabricator with GitHub mirror (?? are we ready to try arc out with libraries?)

The Wikimedia Foundation will likely invest in tooling that transfers code review from GitHub to internal review tools (Phabricator) in the future. Eventually this should eliminate the difference between hosting the primary git repository with the Wikimedia Foundation and hosting it on GitHub. In the near term, Gerrit hosting is the default hosting option except in cases where an effort is being made to attract a significant portion of contributions from external developers.

Don't host under an individual user's GitHub account. It complicates code review based on pull requests for the repository owner and makes management of the repository by a shared group difficult. Note: this doesn't mean it has to be hosted under the Wikimedia account specifically; in fact, it can be more convenient to have the project under a different organizational account specific to the project, like CSSJanus.

Repository naming guidelines
Naming conventions vary somewhat with the hosting location. Follow the local conventions as far as is reasonable and possible. Do not add "wikimedia-" prefixes to repository names.

Issue tracking guidelines

 * Phabricator
 * GitHub

Your library's issue tracking should match your git hosting, in order to reduce the friction of matching commits and pull requests to issues and vice versa. Thus Gerrit-hosted repos should use Wikimedia's Phabricator instance and GitHub-hosted repos should use GitHub's built-in issue tracker.

Code review guidelines
Project code review should use the tool most closely associated with the primary git hosting. Regardless of choice of hosting platform, pre-merge code review and unit testing are strongly encouraged. Blatant self-merge behavior should be seen just as distasteful on GitHub as it generally is in Gerrit.

If primary hosting is via GitHub, changes should be proposed via pull requests rather than direct push to master. In most cases the pull requests should originate from a fork of the repository associated with the user's own GitHub account. GerritHub is a Gerrit-powered code review service that can be used with GitHub-hosted repositories by projects that want to use Gerrit but for some reason do not want to host with Wikimedia.

Code style guidelines
We encourage MediaWiki style or PHP PSR-2, but the most important things are clarity, consistency, and best likelihood of adoption by the library's developer community.

When creating a new repository, you MUST choose a coding style standard and enforce it with CodeSniffer. Also point to the style guide in the project's README file.

For the MediaWiki coding style, use the [//packagist.org/packages/mediawiki/mediawiki-codesniffer mediawiki/mediawiki-codesniffer] package.

Be sure to use the latest version of mediawiki-codesniffer (check packagist.org). Don't use a wildcard version: upgrades must be done explicitly so that a new release cannot put the master branch into a non-passing state.
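
The rule against wildcard versions can be checked mechanically. The following is a minimal sketch in Python, not part of any Wikimedia tooling: the function name `floating_constraints`, the `mediawiki/` prefix filter and the version `1.0.0` are illustrative assumptions. It also takes a stricter reading than the text requires, treating anything that is not an exact version (ranges such as `^1.0` as well as `*`) as floating.

```python
import json
import re

# Exact versions such as "1.2.3" or "v1.2.3"; anything else is treated as floating.
EXACT_VERSION = re.compile(r"^v?\d+\.\d+\.\d+(\.\d+)?$")


def floating_constraints(composer_json_text, package_prefix="mediawiki/"):
    """Return {package: constraint} for dependencies that are not pinned exactly.

    Only packages whose name starts with package_prefix are inspected
    (a hypothetical filter chosen for this example).
    """
    data = json.loads(composer_json_text)
    found = {}
    for section in ("require", "require-dev"):
        for package, constraint in data.get(section, {}).items():
            if not package.startswith(package_prefix):
                continue
            if not EXACT_VERSION.match(constraint.strip()):
                found[package] = constraint
    return found


if __name__ == "__main__":
    # "1.0.0" is a placeholder, not a recommended version.
    pinned = '{"require-dev": {"mediawiki/mediawiki-codesniffer": "1.0.0"}}'
    wildcard = '{"require-dev": {"mediawiki/mediawiki-codesniffer": "*"}}'
    print(floating_constraints(pinned))
    print(floating_constraints(wildcard))
```

A check like this could run alongside lint in pre-merge testing, so that a floating constraint is caught in review rather than after an upstream release.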

For the PSR-2 coding style, use PHP CodeSniffer.

Automated testing guidelines
Both pre-merge and post-merge testing should be used. The testing should include basic lint, unit tests and coding style checks.

GitHub
GitHub-hosted projects can use Travis. See cssjanus's .travis.yml and composer.json as an example.

Gerrit
Projects hosted in Wikimedia Gerrit can use Jenkins. Use the  entry point. See Continuous integration/Entry points for details.

Once created, be sure to also enable the post-merge publisher for automatically generating documentation and code coverage (e.g. for IPSet, https://doc.wikimedia.org/IPSet/ and https://integration.wikimedia.org/cover/IPSet/). These jobs can be enabled for your project in the  repository (example commit: ). Alternatively, file a task in the #ci-config project.

Packagist guidelines
PHP libraries should be published on packagist.org. When adding new packages:


 * 1) Add a composer.json file to your repository that follows the composer.json best practices.
 * 2) Configure the Packagist.org service hook in the GitHub repo:
     * Use the github.com mirror as the git URL; this allows composer to download zipballs, which can be cached.
     * Ask an owner in the Wikimedia GitHub organization to set up a Packagist.org service hook in the GitHub repo.
 * 3) Submit the repo to Packagist.org:
     * Log in to Packagist.org with the wikimedia account and submit the GitHub URL.
 * 4) Ensure both the "mediawiki" and "wikimedia" accounts are maintainers of the package.

The following people have access to the different accounts in case a package needs to be updated for any reason:


 * wikimedia: BDavis, ...? (credentials kept on palladium and accessible by roots)
 * mediawiki: ^demon, hashar, Legoktm, BDavis, Krinkle ...? (credentials kept on palladium and accessible by roots)

License guidelines
For almost anything that gets extracted from MediaWiki, it's likely that it will need to be GPLv2 (or later). All contributors must agree to a change of license from GPLv2+ in order for anyone to change the license (other than changing to GPLv3). The license of the new library needs to remain clearly marked in the headers of the code, and the full license file (typically called "LICENSE" or "COPYING") must be carried into the new project.

For a library consisting entirely of new code, any license complying with the Open Source Definition is likely to be acceptable, but the MIT, GPLv2+ and Apache License 2.0 licenses may be the easiest to adopt. Include both contributor copyright grant clauses which are important for ensuring the integrity of the project's code base. The Apache2 and MIT licenses are seen as more permissive, in that they allow derivative works to include a separate license for new contributions.

Readme
Any library should have a README.md file that describes the project at a high level. This file should be formatted using Markdown syntax (e.g. for headers, links, and code blocks) which is commonly supported by the majority of git browsers while also being human readable.

A good README.md will include:
 * A brief description of the primary use case the library solves
 * How to install the library
 * How to use the library (prose and brief code example if possible)
 * How to contribute
 * License name (GPL-2, ...)
 * Where to submit bugs
 * Where to submit patches
 * Link to coding standard
 * How to run tests
 * Where to see automated test results
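
The checklist above can be turned into a rough automated reminder. This is a minimal sketch in Python under stated assumptions: the `CHECKLIST` keyword map and the function name `missing_readme_items` are hypothetical, and a plain keyword search is only a crude proxy for actually reading the README.

```python
# Hypothetical keyword map: checklist item -> substrings that suggest it is covered.
CHECKLIST = {
    "install": ("install",),
    "usage": ("usage", "how to use"),
    "contribute": ("contribut",),
    "license": ("license", "licence"),
    "bugs": ("bug", "issue"),
    "patches": ("patch", "pull request"),
    "coding standard": ("coding standard", "coding style", "code style"),
    "tests": ("test",),
    "test results": ("test results", "build status"),
}


def missing_readme_items(readme_text):
    """Return checklist items for which no keyword appears in the README text."""
    text = readme_text.lower()
    return [
        item
        for item, keywords in CHECKLIST.items()
        if not any(keyword in text for keyword in keywords)
    ]


if __name__ == "__main__":
    sample = "## Installation\nRun composer require.\n## License\nGPL-2.0-or-later\n"
    for item in missing_readme_items(sample):
        print("README may be missing:", item)
```

A keyword match does not show that a section is useful, only that the topic is mentioned, so the result is best treated as a prompt for the reviewer.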

Code documentation
Add a pipeline for generating code documentation. Common choices are:


 * Doxygen (for PHP)
 * JSDuck (for JavaScript)
 * Sphinx (for Python)
 * Yard (for Ruby)

View existing libraries' documentation at https://doc.wikimedia.org for an example of what these look like. See Continuous integration/Entry points for how to configure these.

On-wiki documentation
Libraries should have a brief page on mediawiki.org documenting their purpose, history, and related links. Examples (complete list):

Common combinations

 * Hosted in Gerrit and published to Packagist under the  namespace
 * Hosted in the Wikimedia GitHub organization and published to Packagist under the  namespace
 * Hosted in another GitHub organization and published to Packagist as something other than wikimedia or mediawiki (e.g. cssjanus).

When hosting is in Gerrit the project should run as any other "typical" Wikimedia sponsored project with Gerrit code review, Phabricator bug tracking and wiki documentation.

If hosted at GitHub, the project could reasonably choose to do code review via pull requests and host the bug tracker on GitHub. This choice should be considered carefully on a project-by-project basis, as divergence of code review and issue tracking tools from the larger Wikimedia community has some disadvantages:


 * The Bugwrangler will not be expected to monitor your project for issues.
 * Members of the MediaWiki developer community will probably file bugs against your product in Phabricator anyway.
 * Moving a bug report between your project and MediaWiki will be a more involved process.

These downsides may diminish over time if better bots can be created to integrate between the two environments.

Hosting under an independent GitHub organization makes sense for certain projects (CSSJanus, Wikidata, Semantic MediaWiki, ...) where an effort is being made to develop an independent and sustaining community for the project. It is especially reasonable in the case of a library like CSSJanus that is attempting to establish a cross-community and cross-language standard set of tools where only a portion of the tools overlap with the Wikimedia universe.

Transferring an existing GitHub repo to Wikimedia

 * File a ticket in the [//phabricator.wikimedia.org/maniphest/task/create?projects=Librarization&title=GitHub+transfer:&priority=50 Phabricator Librarization] project requesting transfer.
 * When contacted, add the responding Wikimedia GitHub administrator to the project as a "Collaborator".
 * The administrator will move the project to Wikimedia and give you access.

Tips for extracting a library
The details of extracting a library will vary depending on the code being extracted and its current entanglement with other MediaWiki specific classes. In the best case, the code you want to extract is already contained in the  directory of mediawiki/core.git and thus completely unencumbered. It is suggested that code which is not in this state is progressively updated and refactored until that is the case.

Once you get all the code into, things become a little more straightforward:
 * 1) Create a new project following the rest of the guidelines in this RFC.
 * 2) Import the code from   into your new project.
     * It may be possible to use  to extract a copy of the files with commit history preserved, but that is not currently considered a prerequisite for extraction. It should be sufficient to take the current head of the files into the new project and provide documentation of the file provenance in the new project's README.
 * 3) Create a proper   file that follows best practices for the project and publish to Packagist.
 * 4) Tag the repository to create a stable release.
 * 5) Propose a change to   importing the stable release of your new project. See Manual:External libraries for additional details.
 * 6) Propose a change to   in   to require the stable release of your new project.

It may also be necessary to introduce shim classes  to provide a backwards compatible bridge between your extracted library and the existing MediaWiki code base. The CDB library did this to provide backwards-compatible class names which did not require the use of the new  namespace.

Tagging a release

 * When determining the next version number, stick to semantic versioning
 * GPG sign the tag by adding  to the   command
 * You may also want to provide a changelog in the annotated tag notes
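
As an illustration of the semantic versioning rule, the next version number follows from the kind of change being released. This is a minimal sketch in Python; the function name `next_version` and the change labels `breaking`, `feature` and `fix` are assumptions made for this example, and pre-release or build metadata suffixes are not handled.

```python
def next_version(current, change):
    """Compute the next MAJOR.MINOR.PATCH version for a given kind of change.

    change is one of the hypothetical labels "breaking", "feature" or "fix".
    """
    major, minor, patch = (int(part) for part in current.lstrip("v").split("."))
    if change == "breaking":
        # Incompatible API change: bump MAJOR, reset MINOR and PATCH.
        return f"{major + 1}.0.0"
    if change == "feature":
        # Backwards-compatible functionality: bump MINOR, reset PATCH.
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        # Backwards-compatible bug fix: bump PATCH only.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    print(next_version("1.4.2", "fix"))       # 1.4.3
    print(next_version("1.4.2", "feature"))   # 1.5.0
    print(next_version("1.4.2", "breaking"))  # 2.0.0
```

Note that semantic versioning treats 0.y.z releases as initial development, where anything may change at any time, so the mapping above mainly applies from 1.0.0 onward.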

After the first release
Once the repository is bootstrapped and the initial release has been made, here are a few next steps to consider:


 * Create a page for the library here on mediawiki.org. This page should point to:
     * Source code (e.g. git.wikimedia.org, or github.com).
     * Published package (e.g. on Packagist.org or npmjs.org).
     * Issue tracker (e.g. Phabricator workboard, or GitHub Issues).
     * API Documentation (e.g. doc.wikimedia.org).
 * Enter a description and URL for the GitHub mirror. This is easy to forget, especially if the repository is only mirrored to GitHub. Enter a one-line description and the URL of the mediawiki.org page (if it exists) or else the doc.wikimedia.org page. See https://github.com/wikimedia/cdb and https://github.com/wikimedia/oojs for examples.

RFC information
This originated as a request for comment, "Guidelines for extracting, publishing and managing libraries". The decision was to move this page out of the RFC space and improve it as documentation.