Manual:Developing libraries

These are guidelines for creating, publishing and managing libraries based on code originally developed as a part of MediaWiki or a related Wikimedia project.

Some of these are PHP-specific, but others are general and apply to all languages.

Rationale
There is a growing desire to separate useful libraries from the core MediaWiki application and make them usable in non-MediaWiki based projects.

This "librarization" of MediaWiki is thought to have several long-term advantages:

 * Make life better for new (and experienced) developers by organizing the code into simple components that can be easily understood.
 * Reverse inertia toward an ever-expanding monolithic core by encouraging developers (in core) to develop their work as reusable modules with clearly defined interfaces.
 * Start making true unit testing of core viable by having individually testable units.
 * Provide an interim step on the way to service-oriented architecture in a way that is useful independently of that goal.
 * Encourage reuse and integration with the larger software ecosystem. Done correctly, this will provide a useful means of expanding our development capacity through recruitment of library authors eager to showcase their work on a top 10 website.
 * Share our awesome libraries with others and encourage contributions from them even if they aren't particularly interested in making our sites better.

In order for this strategy to be successful, these libraries need to develop a life of their own, independent of MediaWiki.

Therefore, it will be important for library authors to have some latitude and independence in making the library successful.

The policies surrounding these should largely be dictated by the primary maintainer of the library, and the choices made may diverge from MediaWiki core.

Note that that primary maintainer may not be the original author of the majority of the code (or even any of it), and will have latitude to make independent decisions about the library.

The amount of latitude a maintainer gets is proportional to the amount of commitment, credibility and hard work with respect to the library they maintain.

Repository hosting guidelines

 * Hosted in Gerrit with GitHub mirror


 * Hosted under the [https://github.com/wikimedia Wikimedia GitHub organization]


 * Hosted under another GitHub organization


 * Hosted in Phabricator with GitHub mirror (?? are we ready to try [http://phabricator.org/applications/arcanist/ arc] out with libraries?)

The Wikimedia Foundation will likely invest in tooling that transfers code review from GitHub to internal review tools (Phabricator) in the future.

Eventually this should eliminate the difference between hosting the primary git repository with the Wikimedia Foundation or GitHub.

In the near term, Gerrit hosting is the default hosting option except in cases where an effort is being made to attract a significant portion of contributions from external developers.

Don't host under an individual user's GitHub account.

It complicates code review based on pull requests for the repository owner and makes management of the repository by a shared group difficult.

Note: this doesn't mean it has to be hosted under the Wikimedia account specifically; in fact, it can be more convenient to have the project under a different organizational account specific to the project, like [https://github.com/cssjanus CSSJanus].

Repository naming guidelines
Naming conventions vary somewhat with the hosting location. Follow the local conventions as far as reasonable and possible. Do not add "wikimedia-" prefixes to repository names.

Issue tracking guidelines

 * Phabricator
 * GitHub

Your library's issue tracking should match your git hosting, in order to reduce the friction of matching commits and pull requests to issues and vice versa. Thus Gerrit-hosted repos should use Wikimedia's Phabricator instance and GitHub-hosted repos should use GitHub's built-in issue tracker.

Code review guidelines
Project code review should use the tool most closely associated with the primary git hosting. Regardless of choice of hosting platform, pre-merge code review and unit testing are strongly encouraged. Blatant self-merge behavior should be seen just as distasteful on GitHub as it generally is in Gerrit.

If primary hosting is via GitHub, changes should be proposed via pull requests rather than direct push to master. In most cases the pull requests should originate from a fork of the repository associated with the user's own GitHub account. GerritHub is a Gerrit-powered code review service that can be used with GitHub-hosted repositories for projects that want to use Gerrit but for some reason do not want to host with Wikimedia.

Code style guidelines
We encourage MediaWiki style (see Manual:Coding conventions) or PHP PSR-2, but the most important things are clarity, consistency, and best likelihood of adoption by the library's developer community.

When creating a new repository, you MUST choose a coding style standard and enforce it with CodeSniffer. Also point to the style guide in the project's README file.

For the MediaWiki coding style, use the [//packagist.org/packages/mediawiki/mediawiki-codesniffer mediawiki/mediawiki-codesniffer] package.

Be sure to use the latest version of mediawiki-codesniffer (check packagist.org). Don't use a wildcard version; upgrades must be done explicitly so that a new release cannot put the master branch into a non-passing state.
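The wildcard rule can be checked mechanically. The following is a minimal Python sketch, not an official Wikimedia tool: the function name `wildcard_constraints` and the sample package versions are hypothetical and only illustrate the idea. It flags any constraint containing `*`; other Composer range operators are not covered and would need their own policy decision.

```python
import json


def wildcard_constraints(composer_json_text, sections=("require", "require-dev")):
    """Return (section, package, constraint) tuples whose constraint contains '*'."""
    data = json.loads(composer_json_text)
    found = []
    for section in sections:
        # A missing section is treated as empty rather than as an error.
        for package, constraint in data.get(section, {}).items():
            if "*" in constraint:
                found.append((section, package, constraint))
    return found


# Illustrative input only: the version numbers are placeholders,
# not recommendations, and "example/pinned-tool" is a made-up package.
SAMPLE = """
{
    "require-dev": {
        "mediawiki/mediawiki-codesniffer": "0.*",
        "example/pinned-tool": "1.2.3"
    }
}
"""

if __name__ == "__main__":
    for section, package, constraint in wildcard_constraints(SAMPLE):
        print(f"{section}: {package} uses wildcard constraint {constraint!r}")
```

A check like this could run alongside the lint step so that a wildcard is caught at review time rather than after an upstream release breaks the build.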

For the PSR-2 coding style, use PHP CodeSniffer.

Automated testing guidelines
Both pre-merge and post-merge testing should be used. The testing should include basic lint, unit tests and coding style checks.

GitHub
GitHub-hosted projects can use Travis. See cssjanus's .travis.yml and composer.json as an example.

Gerrit
Projects hosted in Wikimedia Gerrit can use Jenkins. Use the  entry point. See Continuous integration/Entry points for details.

Once created, be sure to also enable the post-merge publisher for automatically generating documentation and code coverage (e.g. for IPSet, https://doc.wikimedia.org/IPSet/ and https://integration.wikimedia.org/cover/IPSet/). This can be done in  in the integration/config repo. (Or file a task in the #ci-config project.)

Packagist guidelines
PHP libraries should be published on packagist.org. When adding new packages:


 * 1) Follow the composer.json best practices (see Manual:composer.json best practices) and add a composer.json file to your repository.
 * 2) Add the package to packagist.org:
     * Use the github.com mirror as the git URL; this allows Composer to download zipballs, which can be cached.
     * Ask an [https://github.com/orgs/wikimedia/people owner in the Wikimedia GitHub organization] to set up a Packagist.org service hook in the GitHub repo (using the API token of the Packagist user "mediawiki").
 * 3) Add the "mediawiki" and "wikimedia" accounts as co-maintainers.

The following people have access to the different accounts in case a package needs to be updated for any reason:


 * mediawiki: ^demon, hashar, Legoktm, BDavis, ...? (credentials kept on palladium and accessible by roots)
 * wikimedia: BDavis, ...? (credentials kept on palladium and accessible by roots)

License guidelines
For almost anything that gets extracted from MediaWiki, it's likely that it will need to be GPLv2 (or later). All contributors must agree to a change of license from GPLv2+ in order for anyone to change the license (other than changing to GPLv3). The license of the new library needs to remain clearly marked in the headers of the code, and the full license file (typically called "LICENSE" or "COPYING") must be carried into the new project.

For a library consisting entirely of new code, any license complying with the Open Source Definition is likely to be acceptable, but the MIT, GPLv2+ and Apache License 2.0 licenses may be the easiest to adopt. Include both contributor copyright grant clauses, which are important for ensuring the integrity of the project's code base. The Apache2 and MIT licenses are seen as more permissive, in that they allow derivative works to include a separate license for new contributions.

Readme
Every library should have a README file that describes the project at a high level. This file should be formatted using Markdown syntax (e.g. for headers, links, and code blocks), which is supported by most git browsers while also being human-readable.

A good README will include:


 * A brief description of the primary use case the library solves
 * How to install the library
 * How to use the library (prose and brief code example if possible)
 * How to contribute
 * License name (GPL-2, ...)
 * Where to submit bugs
 * Where to submit patches
 * Link to coding standard
 * How to run tests
 * Where to see automated test results
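Part of this checklist can be verified automatically. The sketch below is one possible approach in Python, under loud assumptions: the `CHECKLIST` keywords and the function name `missing_readme_sections` are hypothetical, only `#`-style Markdown headings are recognized, and a matching heading says nothing about the quality of the section beneath it.

```python
import re

# Hypothetical mapping from checklist item to keywords that might appear in a
# Markdown heading; real projects word their headings differently.
CHECKLIST = {
    "installation": ("install",),
    "usage": ("usage", "how to use"),
    "contributing": ("contribut",),
    "license": ("license", "licence"),
    "bug reports": ("bug", "issue"),
    "running tests": ("test",),
}

# Matches ATX headings such as "## Installation"; other heading styles are ignored.
HEADING = re.compile(r"^#{1,6}[ \t]+(.*)$", re.MULTILINE)


def missing_readme_sections(readme_text):
    """Return checklist items with no matching heading, in checklist order."""
    headings = [h.strip().lower() for h in HEADING.findall(readme_text)]
    return [
        item
        for item, keywords in CHECKLIST.items()
        if not any(keyword in heading for heading in headings for keyword in keywords)
    ]


if __name__ == "__main__":
    example = "# Foo\n\n## Installation\n\n## Usage\n\n## License\n"
    for item in missing_readme_sections(example):
        print(f"README has no heading for: {item}")
```

Keyword matching is deliberately loose so that headings like "Installing" or "Reporting issues" still count; a project with stricter conventions could compare against exact heading names instead.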

Code documentation
Add a pipeline for generating code documentation. Common choices are:


 * Doxygen (for PHP)
 * JSDuck (for JavaScript)
 * Sphinx (for Python)
 * Yard (for Ruby)

View existing libraries' documentation at https://doc.wikimedia.org for an example of what these look like. See Continuous integration/Entry points for how to configure these.

Common combinations

 * Hosted in Gerrit and published to Packagist under the  namespace
 * Hosted in the Wikimedia GitHub organization and published to Packagist under the  namespace
 * Hosted in another GitHub organization and published to Packagist as something other than wikimedia or mediawiki (e.g. cssjanus).

When hosting is in Gerrit, the project should run like any other "typical" Wikimedia-sponsored project, with Gerrit code review, Phabricator bug tracking and wiki documentation.

If hosted at GitHub, the project could reasonably choose to do code review via pull requests and host the bug tracker on GitHub. This choice should be considered carefully on a project-by-project basis, as divergence of code review and issue tracking tools from the larger Wikimedia community has some disadvantages:


 * The Bugwrangler will not be expected to monitor your project for issues.
 * Members of the MediaWiki developer community will probably file bugs against your product in Phabricator anyway.
 * Moving a bug report between your project and MediaWiki will be a more involved process.

These downsides may diminish over time if better bots can be created to integrate between the two environments.

Hosting under an independent GitHub organization makes sense for certain projects (CSSJanus, Wikidata, Semantic MediaWiki, ...) where an effort is being made to develop an independent and sustaining community for the project. It is especially reasonable in the case of a library like CSSJanus that is attempting to establish a cross-community and cross-language standard set of tools where only a portion of the tools overlap with the Wikimedia universe.

Transferring an existing GitHub repo to Wikimedia

 * File a ticket in the [//phabricator.wikimedia.org/maniphest/task/create?projects=Librarization&title=GitHub+transfer:&priority=50 Phabricator Librarization] project requesting transfer.
 * When contacted, add the responding Wikimedia GitHub administrator to the project as a "Collaborator".
 * The administrator will move the project to Wikimedia and give you access.

Tips for extracting a library
The details of extracting a library will vary depending on the code being extracted and its current entanglement with other MediaWiki specific classes. In the best case, the code you want to extract is already contained in the  directory of mediawiki/core.git and thus completely unencumbered. It is suggested that code which is not in this state is progressively updated and refactored until that is the case.

Once you get all the code into that directory, things become a little more straightforward:


 * 1) Create a new project following the rest of the guidelines on this page.
 * 2) Import the code from   into your new project.
     * It may be possible to use  to extract a copy of the files with commit history preserved, but that is not currently considered a prerequisite for extraction. It should be sufficient to take the current head of the files into the new project and provide documentation of the file provenance in the new project's README.
 * 3) Create a proper   file that follows best practices for the project and publish to Packagist.
 * 4) Tag the repository to create a stable release.
 * 5) Propose a change to   importing the stable release of your new project . See Manual:External libraries for additional details.
 * 6) Propose a change to   in   to require the stable release of your new project.

It may also be necessary to introduce shim classes  to provide a backwards compatible bridge between your extracted library and the existing MediaWiki code base. The [//github.com/wikimedia/cdb CDB] library did this to provide backwards-compatible class names which did not require the use of the new  namespace.

After the first release
Once the repository is bootstrapped and the initial release has been made, here are a few next steps to consider:


 * Create a page for the library here on mediawiki.org. This page should point to:
     * Source code (e.g. git.wikimedia.org, or github.com).
     * Published package (e.g. on Packagist.org or npmjs.org).
     * Issue tracker (e.g. Phabricator workboard, or GitHub Issues).
     * API documentation (e.g. doc.wikimedia.org).
 * Enter a description and URL for the GitHub mirror. This is easy to forget, especially if the project is only mirrored to GitHub. Enter a one-line description and the URL of the mediawiki.org page (if it exists) or else the doc.wikimedia.org page. See https://github.com/wikimedia/cdb and https://github.com/wikimedia/oojs for examples.

RFC information
This originated as a request for comment, "Guidelines for extracting, publishing and managing libraries". [http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/20150114.txt The decision] was to move this page out of the RFC space and improve it as documentation.