Extension:WikibaseMediaInfo/Development

This extension is still under active development. This document contains some additional information that may be of interest to developers contributing to this project.

Setting up a Federated Development Environment in Vagrant
In production, WikibaseMediaInfo is used to enrich media files on Commons with data that lives elsewhere (Properties and Items from Wikidata). The process by which these two separate wikis communicate is known as Federation.

Setting up a similar relationship between two local wikis is not strictly required for WBMI development, but it is often useful, and such a system more closely approximates the production environment.

Configure Vagrant
MediaWiki-Vagrant can be configured to set up a local "Wikidata" wiki and a local "Commons" wiki on separate URLs. Here's how to do it:


 * 1) Follow the quick-start instructions for MediaWiki-Vagrant (install VirtualBox, install Vagrant, pull down the latest MediaWiki code with , and run the   script).
 * 2) Run the following commands to update Vagrant's configuration:
 * 3) Enable the appropriate Vagrant roles. In this case, you'll want the following
 * 4) Spin up Vagrant and provision all roles:   (warning, can take a long time)
 * 5) Run `vagrant git-update` once that process completes

Configure Wikis
Once the Vagrant environment is ready, you'll need to add wiki-specific configuration (certain settings need to be enabled only for certain wikis). We are primarily concerned with Wikidatawiki (http://wikidata.wiki.local.wmftest.net:8080) and Commonswiki (http://commons.wiki.local.wmftest.net:8080/wiki).

Per the MediaWiki-Vagrant recommendations, config should be placed in a dedicated file inside of the   directory; see the README file in that directory for more details. Note: files created directly inside  are safe, but anything placed inside the sub-directories is expected to be managed by Puppet, meaning it will be erased if you run   later.

Configuration files inside  should be prefixed with a 2-digit numerical code that determines the order they are to be run in. In this case, we want a collection of WBMI-specific settings that will be applied after everything else, so a name like  would make sense.

The file below assumes that the ID of your local "depicts" property is P1; if not, replace it with the appropriate property ID.

Import Wikidata (optional)
If you'd prefer not to manually populate the local Wikidata wiki with properties and items for WBMI to use, you can use the WikibaseImport extension to import data from the command line.

Warning: typically when importing a given entity, many related entities will get imported along with it; if you don't set a very limited range you could end up with a process that runs for hours and ends up importing hundreds or thousands of items.


 * 1) Clone the WikibaseImport repo into your extensions folder
 * 2) Add   to your config file (in settings.d or LocalSettings).
 * 3) Shell into Vagrant  and run the following commands:

Set Admin passwords
You may need to set a default admin password for some relevant wikis; I use  for all local wikis here.

Update search index(es)
After setup and import of bulk Wikidata items, you may need to manually update various search indexes in both Commonswiki and Wikidatawiki.

Feature checklist
At this point, the following features should (hopefully) be working:


 * Adding structured data to file pages in commonswiki (including autocomplete suggestions that are populated by entities in the local wikidata instance)
 * Searching local files in commonswiki using Special:MediaSearch (searches should find matches based on both full text of file pages as well as any structured data that has been added)
 * Searches using the traditional search box on commonswiki should support  style queries

Help, I just need to tweak one small UI thing!
If you don't need a fully federated setup, consider using MediaWiki-Docker  and adding   – this will allow Special:MediaSearch to just query the production search API on Commons so you don't need to run queries locally.

Working with Wikibase
WikibaseMediaInfo sits on top of Wikibase. If you're working with WikibaseMediaInfo you'll need to keep an eye on what's happening in Wikibase, because changes there can affect you. For example, Wikibase JS config vars can come and go, and code that relies on one can break if it disappears.

Also watch out for conceptual differences between the two. For example, in MediaInfo every File page has a corresponding MediaInfo item. Sometimes that item doesn't exist in the database, in which case it is a virtual item consisting only of an ID. As far as Wikibase is concerned, that item doesn't exist. This has tripped us up in the case of Wikibase's  hook: it doesn't fire if there is no Wikibase entity in the database, and we need it to fire for a virtual item as well as a concrete one, so we can't use it.
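
The difference between a stored and a virtual item can be illustrated with a small JavaScript sketch. This is not the extension's code: the in-memory `storedEntities` map and the functions `lookupStoredEntity` and `getMediaInfoForPage` are hypothetical names invented for illustration, and the "M" + page ID convention is assumed here.

```javascript
'use strict';

// Hypothetical stand-in for the database: only M10 has stored data.
const storedEntities = new Map( [
	[ 'M10', { id: 'M10', statements: { P1: [ 'Q42' ] } } ]
] );

// What a storage-backed lookup sees: null when nothing is in the database.
function lookupStoredEntity( id ) {
	return storedEntities.get( id ) || null;
}

// What MediaInfo needs: every file page gets an entity, so fall back to a
// virtual one that consists only of an ID (assumed format: "M" + page ID).
function getMediaInfoForPage( pageId ) {
	const id = 'M' + pageId;
	return lookupStoredEntity( id ) || { id: id };
}

const concrete = getMediaInfoForPage( 10 );
const virtual = getMediaInfoForPage( 11 );

console.log( concrete.id, lookupStoredEntity( concrete.id ) !== null );
console.log( virtual.id, lookupStoredEntity( virtual.id ) !== null );
```

Anything that is triggered only by the storage-backed lookup (the first function) will never see page 11, which is the kind of mismatch described above.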

Wikibase code is heavily abstracted and it can take some work to understand it and how to use it from WikibaseMediaInfo. Instantiating objects in particular can be a bit tricky - factories are wrapped inside callbacks that are in turn wrapped inside dispatching factories (factories that delegate object instantiation to other factories depending on their inputs). There's a  service locator which you can access statically using , and to get utility classes like serializers or lookups you can usually find some kind of   method on the service locator that will give you what you need.

Here's an example - on the File page we want to be able to make the MediaInfo item associated with the page available to javascript. We do that by writing a serialized version of the item to a js config var inside the  hook in. Here's a simplified version of the code:

The factories that the service locator ultimately uses are defined in  - for example the serializer used above is defined like this:

Wikibase code munges all the serializer factories together into a dispatching factory, and then when you call  from the service locator with a MediaInfo entity id it uses the callback defined above to return a MediaInfoSerializer.
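
To make the idea of a dispatching factory more concrete, here is a minimal sketch in JavaScript. It is not Wikibase's implementation (which is PHP); every name in it (`serializerFactoryCallbacks`, `entityTypeFromId`, `createDispatchingSerializerFactory`, `getSerializerFor`) is hypothetical, and the mapping from ID prefix to entity type is an assumption made for the example.

```javascript
'use strict';

// Per-entity-type factory callbacks. Each callback builds a serializer
// only when it is first needed.
const serializerFactoryCallbacks = {
	item: () => ( {
		serialize: ( entity ) => ( { type: 'item', id: entity.id } )
	} ),
	mediainfo: () => ( {
		serialize: ( entity ) => ( {
			type: 'mediainfo',
			id: entity.id,
			statements: entity.statements || {}
		} )
	} )
};

// Assumed convention for this sketch: "M" IDs are MediaInfo, "Q" IDs are items.
function entityTypeFromId( id ) {
	const prefixes = { M: 'mediainfo', Q: 'item' };
	const type = prefixes[ id.charAt( 0 ) ];
	if ( !type ) {
		throw new Error( 'Unknown entity ID prefix: ' + id );
	}
	return type;
}

// The dispatching factory: it creates nothing itself, it only picks the
// right callback for the input and caches what that callback returns.
function createDispatchingSerializerFactory( callbacks ) {
	const cache = {};
	return {
		getSerializerFor( entityId ) {
			const type = entityTypeFromId( entityId );
			if ( !cache[ type ] ) {
				cache[ type ] = callbacks[ type ]();
			}
			return cache[ type ];
		}
	};
}

const factory = createDispatchingSerializerFactory( serializerFactoryCallbacks );
const serialized = factory.getSerializerFor( 'M123' ).serialize( { id: 'M123' } );
console.log( JSON.stringify( serialized ) );
// prints {"type":"mediainfo","id":"M123","statements":{}}
```

The caller only ever talks to the dispatching factory; which concrete serializer comes back depends entirely on the entity ID passed in.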

Testing Strategy
WikibaseMediaInfo is a complicated extension, with complicated dependencies (i.e. Wikibase). Automated testing can play an important role in helping to manage this complexity.

To do this, we are using three different types of tests, which can be likened to levels in a "testing pyramid". The three levels are: JS and PHP unit tests (the "base" of the pyramid), PHPUnit API/integration tests (the middle layer), and end-to-end tests in Selenium (the top of the pyramid).

JavaScript unit tests (headless Node/QUnit)
WikibaseMediaInfo introduces lots of new JS code, much of which is concerned with introducing new UI elements that enable users to view and edit structured data in various places (File pages, UploadWizard, Search, etc.). Wherever possible, we want to try and test these new JS components in isolation, using a headless Node.js testing framework instead of the traditional Special:JavascriptTest approach. There is a good discussion around the advantages and reasoning behind this approach at this RFC on Phabricator.

Requirements
Node.js v10 is required to run these tests. QUnit is used as the testing framework. The JSDOM and Sinon libraries are also used extensively.

Writing Tests
For JS code in a MediaWiki extension to be testable this way, we need to be able to load it in an isolated context using Node's  statement. This means that the relevant part of the codebase needs to be rewritten using ResourceLoader's new PackageFiles feature. Then the individual JS files used in this module must define a  property (these files no longer need to be wrapped in self-executing functions). In addition to making code more testable, refactoring in this way lets us write JavaScript in a way that is more in line with the current practices of the wider JS community. This refactoring is currently in progress (some modules in our  use PackageFiles, while others still define an array of scripts).

Tests live in:  and are organized into subfolders. Here is an example with a few simple tests for the LicenseDialogWidget, a basic UI component.

Having good coverage at the JS component level will help to catch regressions and make it easier to refactor code. Things to test for at this level include basic interactions (toggling a component in or out of edit state, for example), ensuring that appropriate API requests are sent when an action is taken, etc.

Running Tests
To run Node QUnit tests, open a terminal and run. They are also included in the larger  script (which means they will run in CI).

PHP tests
PHPUnit tests are located in. They must be run using MediaWiki core’s  like this   (in the Vagrant dev environment   is in  ).

Normal unit tests are in. Integration tests are in.

End-to-end tests (Selenium)
End-to-end tests represent the highest level of the "testing pyramid". Tests at this level should focus on the "happy path" for a user. They can also be used to ensure that basic functionality (like logging in and editing a page) is never hampered by a regression.

Currently it is not feasible to run extension-specific Selenium tests for WikibaseMediaInfo in the regular CI process. Instead, tests can be run against Beta Commons on a regular schedule. These tests need to live in their own location ("specs_betacommons" instead of "specs") so that they are not picked up by the Selenium script run by Core (which does happen in the CI pipeline).

There is currently an in-progress patch that adds Selenium tests to this extension. This document will be updated with more information about how to write and run these tests once that patch is merged.