Requests for comment/Streamlining Composer usage

Background
MediaWiki core and its extensions depend on libraries that are managed via composer. This RFC continues where Requests for comment/Composer managed libraries for use on WMF cluster left off. The library infrastructure effort for MediaWiki will greatly increase the use of Wikimedia-maintained libraries. To avoid hindering this effort, we need to streamline the process for adding and upgrading composer dependencies, and for building and deploying with composer.

Besides managing library dependencies, composer can be used to build an autoloader for parts of core and parts of extensions. To make proper use of that, we need to ensure that our build and deployment process works with it.
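As an illustration, a hypothetical extension (the name, namespace and paths below are made up, not taken from an actual extension) can declare an autoload section in its composer.json and have composer generate the class map:

 # Illustrative only: declare a PSR-4 autoload mapping in the
 # extension's composer.json...
 cat > composer.json <<'EOF'
 {
     "autoload": {
         "psr-4": { "MediaWiki\\Extension\\Foo\\": "src/" }
     }
 }
 EOF
 # ...then have composer (re)generate vendor/autoload.php and an
 # optimized class map from it:
 composer dump-autoload --optimize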

For development purposes, most people currently run composer in the root of MediaWiki. This loads the composer-merge-plugin, which merges in the dependencies from other specified composer.json files, usually those of extensions. For Wikimedia production deployment from the wmf branches, we do not use composer directly, but instead an intermediate repository, mediawiki/vendor, which is updated manually. In between, we have the master development branches, the continuous integration jobs and the beta cluster environment, which all currently use mediawiki/vendor. Each of the branches might need a different continuous integration strategy in the future.
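A minimal sketch of that development setup (the include pattern follows the composer.local.json-example shipped with core; details are illustrative):

 # In the MediaWiki root: core's composer.json loads composer.local.json
 # via the merge plugin; this include pattern merges in every
 # extension's composer.json:
 cat > composer.local.json <<'EOF'
 {
     "extra": {
         "merge-plugin": {
             "include": [ "extensions/*/composer.json" ]
         }
     }
 }
 EOF
 # one composer run then installs core plus all merged extension
 # dependencies into vendor/:
 composer update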

Wikidata (Wikibase, related extensions and dependencies)
Wikidata will bring in 19 more library components, all maintained by people trusted with Wikimedia merge rights (i.e. no components from outside Wikimedia). Its CI uses a single composer run, as during development, instead of mediawiki/vendor. Once per day, a virtual machine builds the Wikidata "extension", which contains all extensions needed for Wikidata.org plus their dependencies. The build output is proposed as a patch to Gerrit for the mediawiki/extensions/Wikidata repository, and is then +2ed by a human.
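The daily build job has roughly the following shape (a sketch only; the actual job definition, flags and commit message are assumptions):

 # In a working copy prepared with the build's composer.json:
 composer install --no-dev --prefer-dist   # resolves Wikibase and all libraries
 git add -A                                # stage the built extension tree
 git commit -m "New Wikidata build $(date +%F)"
 git review                                # propose the patch to Gerrit for human +2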

Wikidata dependencies:

 * 1) composer/installers @v1.0.21 (already in core)
 * 2) data-values/common @0.2.3
 * 3) data-values/data-types @0.4.1
 * 4) data-values/data-values @1.0.0
 * 5) data-values/geo @1.1.4
 * 6) data-values/interfaces @0.1.5
 * 7) data-values/javascript @0.7.0
 * 8) data-values/number @0.4.1
 * 9) data-values/serialization @1.0.2
 * 10) data-values/time @0.7.0
 * 11) data-values/validators @0.1.2
 * 12) data-values/value-view @0.14.5
 * 13) diff/diff @2.0.0
 * 14) serialization/serialization @3.2.1
 * 15) wikibase/data-model @3.0.0
 * 16) wikibase/data-model-javascript @1.0.2
 * 17) wikibase/data-model-serialization @1.4.0
 * 18) wikibase/internal-serialization @1.4.0
 * 19) wikibase/javascript-api @1.0.3
 * 20) wikibase/serialization-javascript @2.0.3
 * 21) propertysuggester/property-suggester @2.2.0 (extension, would become submodule of core)
 * 22) wikibase/wikibase @dev-master (extension, would become submodule of core)
 * 23) wikibase/Wikidata.org @dev-master (extension, would become submodule of core)
 * 24) wikibase/wikimedia-badges @dev-master (extension, would become submodule of core)

Problem
TODO: submodule updates in the wmf deployment branches are now automatic.

Double Review
Consider upgrading a dependency of, for example, the Wikibase extension, with that dependency included in mediawiki/vendor.git. This is work that could be automated. Today a human reviewer acts as the catch-all prevention mechanism, yet might not notice when something doesn't match; the problems they are meant to prevent are not specified in enough detail to know which automatic mechanisms could prevent them instead. This is extra manual review work at a time when Wikimedia can't even keep up with the normal influx of reviews :-(. The current process (a sketch of steps 4 and 5 follows the list):
 * 1) A patch to the dependency (e.g. wikibase/data-model) is proposed.
 * 2) It is reviewed and merged by a Wikimedian.
 * 3) A release for the dependency is done.
 * 4) A patch that updates the requirement in mediawiki/extensions/Wikibase.git is proposed, reviewed and merged.
 * 5) An update of mediawiki/vendor is proposed, causing a second review!
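Steps 4 and 5 roughly look as follows (version numbers and paths are illustrative):

 # Step 4: bump the requirement in the extension:
 cd extensions/Wikibase
 composer require "wikibase/data-model:~3.0"   # updates composer.json
 git commit -am "Update data-model requirement to 3.0.0"
 git review
 # Step 5: mirror the same change in mediawiki/vendor, causing the
 # second review:
 cd /path/to/mediawiki/vendor
 composer update wikibase/data-model
 git commit -am "Update wikibase/data-model to 3.0.0"
 git review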

A dependency is updated in master
CI and the beta cluster run composer to bring in the new dependency. Staging breaks because the dependency is missing from vendor. On wmf branch cut, the first job breaks because the dependency has not been included in mediawiki/vendor.

To unbreak staging, one needs to push the new dependency to mediawiki/vendor. On wmf branch cut, it will then be included for production use.

Potential breakage:
 * 1) Extension Foo updates its dependency Bar from 2.0.0 to 3.0.0.
 * 2) CI against the master branch passes, since it uses composer.
 * 3) Beta works, since it uses composer.
 * 4) A mediawiki/vendor change is proposed to Gerrit, but is not merged in time.
 * 5) Staging deployment happens automatically.
 * 6) Staging is broken, because it still contains Bar 2.0.0 while the Foo extension requires 3.0.0.
 * 7) Usually, after the cut of the WMF branches, the tests are triggered manually. These would fail a phpunit and a selenium test because of the inconsistency, and we can't deploy. (Currently, the check in update.php that vendor is up to date with composer.json is not sufficient, because it does not look at the extension composer.json files merged by the merge plugin; see the sketch after this list.)
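One illustrative way a check could take the merged extension requirements into account (this is a sketch, not the existing update.php logic):

 # Run in the MediaWiki root with the merge plugin active, so the
 # extension composer.json files are part of the resolved requirements.
 # If composer would change anything, vendor/ is stale:
 if ! composer update --dry-run --no-interaction 2>&1 | grep -q 'Nothing to install'; then
     echo "vendor/ is out of date with the merged composer.json files" >&2
     exit 1
 fi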

A class is added to an extension that uses a composer-generated autoloader
As above, if vendor is not updated, this will break staging, and a phpunit and a selenium test before production deployment: mediawiki/vendor.git needs the updated class map for its autoloader.

A new wmf branch is cut
New core, extensions and vendor branches are created.

Creation of 1.26wmf9:
 * 20c7219 - Submitting branch for review so that it gets tested by jenkins. refs T101551 (7 days ago)
 * 11015b2 - Creating new WMF 1.26wmf9 branch (7 days ago)
 * 6521b36 - Creating new WMF 1.26wmf9 branch (7 days ago)

So some Jenkins tests are run (see https://gerrit.wikimedia.org/r/#/c/217043/), done for https://phabricator.wikimedia.org/T101551 ("Run checkComposerLockUpToDate.php after creating a new WMF deployment branch").
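The check can also be run by hand from the root of a core checkout of the new branch:

 # Exits with an error when composer.lock is out of date with
 # composer.json:
 php maintenance/checkComposerLockUpToDate.php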

A dependency is updated in vendor.git wmf branch
This works fine, because the vendor submodule in core is still pointing to the old version. A patch to mediawiki/core then updates the vendor submodule together with any changes needed to make core or other submodules compatible, all in the same patch.
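A sketch of such a patch (the placeholder commit and the messages are illustrative):

 # In a mediawiki/core wmf-branch checkout:
 cd vendor
 git fetch origin
 git checkout <new-vendor-commit>   # placeholder for the updated vendor revision
 cd ..
 git add vendor                     # record the new submodule pointer
 # ...add any core/extension compatibility fixes, then commit both together:
 git commit -m "Update vendor submodule and adapt callers"
 git review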

Circular dependencies
We are unable to update libraries in core and vendor without breakage, due to version mismatches and circular dependencies (https://phabricator.wikimedia.org/T88211#1317971). The short-term suggestion is to use --skip-external-dependencies in the vendor CI. This allows one to update vendor ahead of core, and then to unbreak core with a follow-up change in core.

operations/mediawiki-config
There is a composer.json file, and the dependencies are embedded in the git repository under /multiversion/vendor. This could cause a version conflict with mediawiki/core and/or mediawiki/vendor.

Usage of GitHub
Some of the components involved might be hosted on GitHub. Manual:Developing_libraries suggests this is acceptable; the composer-merge-plugin itself is hosted there.

Proposal
Automatically build and commit mediawiki/vendor (composer): https://phabricator.wikimedia.org/T101123

The envisioned strategy to bring in dependencies is:

TODO upload image

CI would then pick the matching strategy based on the branch a patch is proposed against.

If we had the staging cluster, the vendor repository would already be up to date, and we could thus cut the wmf branch from it.

The person cutting the wmf branches and deploying the train is likely affected the most by this.

The main questions are:

How do we update mediawiki/vendor.git automatically?

Do we instead do this during scap? Would we then stop using mediawiki/vendor.git?
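One possible shape for the automated job from T101123, purely as a sketch: the clone URLs, the sync step and the commit flow are assumptions, not an agreed design.

 set -e
 git clone https://gerrit.wikimedia.org/r/mediawiki/core core
 cd core
 # extension requirements come in via the merge plugin (see Background):
 composer update --no-dev            # fills core/vendor/
 cd ..
 # Copy the result over a mediawiki/vendor checkout and propose the diff:
 git clone https://gerrit.wikimedia.org/r/mediawiki/vendor vendor
 rsync -a --delete --exclude .git core/vendor/ vendor/
 cd vendor
 git add -A
 git commit -m "Automatic vendor update from master" || exit 0   # nothing changed
 git review                          # humans still review and merge the generated patch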

See also:

https://www.mediawiki.org/wiki/Requests_for_comment/Extensions_continuous_integration

https://www.mediawiki.org/wiki/Requests_for_comment/Extension_registration

https://www.mediawiki.org/wiki/Requests_for_comment/Improving_extension_management