Release status: beta
|Description||Wikibase extension to manage structured metadata of media files|
|Author(s)||The Wikidata team|
|Latest version||continuous updates|
|License||GNU General Public License 2.0 or later|
WikibaseMediaInfo is an extension to Wikibase adding a MediaInfo entity for handling structured data about multimedia files.
The extension hooks into the File Page. It stores supplemental metadata (captions and depicts statements) about the file in a MediaInfo Entity. The user can view, create, edit, and delete this data.
- 1 Requirements
- 2 Installation
- 3 Configuration
- 4 MediaInfo Glossary
- 5 MediaInfo UI
- 6 Tests
- 7 See also
- Ensure CirrusSearch and Wikibase (client and repo) are set up properly.
- Download and place the file(s) in a directory called
- Run Composer to install PHP dependencies by issuing
composer install --no-dev in the extension directory. (See T173141 for potential complications.)
- Add the following code at the bottom of your LocalSettings.php:
wfLoadExtension( 'WikibaseMediaInfo' );
- Run the update script which will automatically create the necessary database tables that this extension needs.
- Done – Navigate to Special:Version on your wiki to verify that the extension is successfully installed.
- Add required configuration.
Extension configuration variables are sets of key-value pairs. They are documented in more detail in
WikibaseMediaInfo/extension.json. All config variables are added to
The following config options are available for this extension:
Required Config (must be added to LocalSettings)
$wgMediaInfoEnableFilePageDepicts (temporary feature flag): Enables the MediaInfo depicts widget on the File Page when set to true.
$wgMediaInfoProperties: Establishes the main linked property used to build the MediaInfo entity in Wikibase. Value is an array of key-value pairs connecting a label name to an existing Wikibase database ID.
$wgDepictsQualifierProperties: Establishes the descriptors or qualifiers of the MediaInfo entity defined in
$wgMediaInfoProperties. Value is an array of key-value pairs connecting a label name to an existing Wikibase database ID.
['features' => 'P2', 'color' => 'P3', 'wears' => 'P4', 'part' => 'P5', 'inscription' => 'P6', 'symbolizes' => 'P7', 'position' => 'P8', 'quantity' => 'P9',];
$wgUploadWizardConfig['wikibase']['enabled']: Enables MediaInfo data on UploadWizard when set to true.
$wgMediaInfoEnable: Defaults to true.
$wgMediaInfoShowQualifiers (temporary feature flag): Defaults to true.
A property is used to categorize or describe a file. It has a unique id in wikibase in the form
Pxxx such as
P123. Examples of file properties are ‘depicts’ (what an image is a picture of), ‘resolution’, ‘created by’, ‘license’.
An item is the concept, topic, or object. It is represented by a unique id in the form
Qxxx. For example on Wikidata the planet Earth is the item
Q2 and the CC0 licence is
A single fact about a media file consisting of a key-value pair (usually a property-item) such as
Depicts=Dog. Claims are stored simply as strings, using the property ids and item ids as appropriate. For example, an image depicting a black cat could have the claim
A short piece of text describing a media file, plus its language. This is used by WikibaseMediaInfo to provide a short description of the file (the same as ‘labels’ in Wikibase).
MediaInfo Entity (M-item)
A Wikibase entity that contains structured data about media files. It is stored in a slot on a File page and consists of
- an ID in the form Mxxx, where xxx is the id of the associated wiki page
- any number of captions
- any number of claims
(Note: if there is no caption or claim data then the entity is not stored in the database - in this case the entity is known as a ‘virtual entity’)
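The structure described above can be pictured with a small Python sketch. The class and attribute names (MediaInfoEntity, page_id, entity_id, is_virtual) are hypothetical and chosen for illustration only. The extension itself is written in PHP and its real classes may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class MediaInfoEntity:
    """Illustrative stand-in for a MediaInfo entity (not the extension's own class)."""

    page_id: int                                   # ID of the associated wiki page
    captions: dict = field(default_factory=dict)   # language code -> caption text
    claims: list = field(default_factory=list)     # claim strings, e.g. "P1=Q146"

    @property
    def entity_id(self) -> str:
        # The entity ID is "M" followed by the ID of the associated wiki page.
        return f"M{self.page_id}"

    @property
    def is_virtual(self) -> bool:
        # With no captions and no claims there is nothing to store,
        # so the entity only exists as a 'virtual entity'.
        return not self.captions and not self.claims
```

In this sketch an entity created with only a page ID is virtual, and it stops being virtual as soon as a caption or claim is added.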
A qualifier is a secondary claim that modifies the primary claim. For example an image might have a tree in the foreground and the sea in the background, in which case it could have 2 ‘depicts’ claims associated with it - ‘depicts=tree(applies to part=foreground)’ and ‘depicts=sea(applies to part=background)’.
MediaInfo entities are shown on, and can be edited from, their associated File page. Captions and claims are shown separately, and claims are split into ‘depicts’ claims and ‘other’ claims.
Users can search for files by their MediaInfo captions just as they would search for anything else. For example, if a user uploads a picture of the Eiffel Tower, and enters ‘Tour Eiffel’ (French) and ‘Eiffel Tower’ (English) as multilingual file captions, the picture is findable by another user searching for either ‘Eiffel Tower’ or ‘Tour Eiffel’.
Searching for a single claim
['depicts' => 'P1'] is the MediaInfo property used in the examples below.
To search for a claim, use the
haswbstatement keyword. For example, to search for images with Mont Blanc (Q583) search for
haswbstatement:P1=Q583. Searches for claims can also use qualifiers. For example, to search for images with Mont Blanc (Q583) in the background (Q13217555), where P518 is the property ‘applies to part’ use:
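A rough Python sketch of how such search terms could be assembled is given here. The helper name haswbstatement_term is hypothetical, and the bracketed qualifier syntax is an assumption based on the propertyID=value[qualifierPropertyID=qualifierValue] format that this page gives for indexed claims.

```python
def haswbstatement_term(prop, value, qualifier=None):
    """Build a single haswbstatement search term.

    qualifier is an optional (property_id, value) pair. Appending it in
    square brackets is an assumption that the keyword mirrors the
    indexed claim format.
    """
    term = f"haswbstatement:{prop}={value}"
    if qualifier is not None:
        q_prop, q_value = qualifier
        term += f"[{q_prop}={q_value}]"
    return term


# Mont Blanc (Q583), using P1 as the 'depicts' property
plain = haswbstatement_term("P1", "Q583")
# Mont Blanc in the background (Q13217555), via 'applies to part' (P518)
qualified = haswbstatement_term("P1", "Q583", ("P518", "Q13217555"))
```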
Searching across multiple claims at once
Claims can be combined using a logical OR in a single search keyword using the pipe character
|. For example files depicting a cat (Q146) OR a dog (Q144) can be found using
Claims can be combined using a logical AND by using 2 separate search keywords. For example, files depicting a cat AND a dog can be found using:
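The two ways of combining claims can be sketched in Python as follows. The function names any_of and all_of are hypothetical, and the exact query layout is an assumption derived from the description above (a pipe inside one keyword for OR, separate keywords for AND).

```python
def any_of(claims):
    """Logical OR: a single keyword, claims joined with the pipe character."""
    return "haswbstatement:" + "|".join(claims)


def all_of(claims):
    """Logical AND: one separate keyword per claim, separated by spaces."""
    return " ".join(f"haswbstatement:{claim}" for claim in claims)


cat, dog = "P1=Q146", "P1=Q144"   # depicts cat, depicts dog
cat_or_dog = any_of([cat, dog])
cat_and_dog = all_of([cat, dog])
```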
Searching for claims with quantity qualifiers
To search for a claim with a quantity, use the
wbstatementquantity keyword. For example, files that depict 2 humans (Q5) can be found using:
The comparison operators
<= can also be used, so files depicting more than 2 humans can be found using:
Searching for a range of values
Ranges can be searched for using two
wbstatementquantity keywords at once. For example, to find files depicting between 2 and 5 humans (Q5) use:
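As a hedged illustration, quantity searches could be assembled like this in Python. Only the wbstatementquantity keyword and the existence of comparison operators come from the text above. The helper names, the set of accepted operators, and the layout property=value followed by operator and amount are assumptions.

```python
ALLOWED_OPERATORS = {"=", "<", "<=", ">", ">="}  # assumed operator set


def quantity_term(prop, value, operator, amount):
    """Build one wbstatementquantity term (assumed layout)."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    return f"wbstatementquantity:{prop}={value}{operator}{amount}"


def quantity_range(prop, value, low, high):
    """Inclusive range: two wbstatementquantity terms in one query."""
    return " ".join([
        quantity_term(prop, value, ">=", low),
        quantity_term(prop, value, "<=", high),
    ])


exactly_two = quantity_term("P1", "Q5", "=", 2)    # depicts 2 humans
more_than_two = quantity_term("P1", "Q5", ">", 2)  # depicts more than 2 humans
two_to_five = quantity_range("P1", "Q5", 2, 5)     # depicts between 2 and 5 humans
```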
When the File page is saved, the following MediaInfo data is written to the Elasticsearch index (all examples use Wikidata Property and Item ids):
- Captions data in every language is stored in the
- Claims are stored in the format
propertyID=value as array elements in the
statement_keywords field using the Wikibase property ID (and item ID, if the value is an item) - e.g. ‘depicts house cat’ is stored as
- Claims with qualifiers are stored in the
statement_keywords field along with their qualifiers in the format
propertyID=value[qualifierPropertyID=qualifierValue]. For example, the Mona Lisa painting (Wikidata item Q12418) depicts a sky in the background (Wikidata item Q13217555, attached with the ‘applies to part’ property P518). If we arrange this data in a Wikibase claim, it would be: ‘depicts sky, applies to part background’, which would be stored as
- Note that claims with qualifiers are also stored without the qualifier, to increase their findability. So, for example, if someone entered the above claim-plus-qualifier, the bare claim (property ID and sky item ID, with no bracketed qualifier) is also stored, so that someone can find the file by searching for ‘depicts sky’ alone, as well as by searching for ‘depicts sky, applies to part background’.
- Claims data with qualifiers where the qualifier value is a quantity is stored in the
statement_quantity field in the format
propertyID=value|quantity, e.g. ‘depicts human, quantity 1’ is stored as
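The storage formats listed above can be sketched in Python as follows. This is not the extension's indexing code. The function name index_fields and the input shape are hypothetical, the default quantity property P9 is borrowed from the example $wgDepictsQualifierProperties value on this page, and sending quantity qualifiers only to statement_quantity is a simplifying assumption.

```python
def index_fields(claims, quantity_property="P9"):
    """Turn claims into the two statement fields described above.

    claims is a list of dicts with keys "property", "value" and an
    optional "qualifiers" list of (property_id, value) pairs.
    """
    keywords, quantities = [], []
    for claim in claims:
        base = f"{claim['property']}={claim['value']}"
        # The bare claim is always stored, even when qualifiers exist.
        keywords.append(base)
        for q_prop, q_value in claim.get("qualifiers", []):
            if q_prop == quantity_property:
                # Quantity qualifier: propertyID=value|quantity
                quantities.append(f"{base}|{q_value}")
            else:
                # Other qualifier: propertyID=value[qualifierPropertyID=qualifierValue]
                keywords.append(f"{base}[{q_prop}={q_value}]")
    return {"statement_keywords": keywords, "statement_quantity": quantities}


fields = index_fields([
    {"property": "P1", "value": "Q146", "qualifiers": [("P518", "Q13217555")]},
    {"property": "P1", "value": "Q5", "qualifiers": [("P9", 1)]},
])
```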
Note that not all claims are stored. A claim will be indexed in Elasticsearch only if ALL of the following conditions are true:
- The claim has a real value (i.e. its value is not ‘no value’ or ‘unknown value’) AND
- We know how to process its value for indexing. More value processors may be added in future, but currently we require the claim’s value to be either a Q item ID, a string (alphanumeric), or a quantity (numeric) AND
- The claim’s Wikidata property ID is NOT in a configurable list of excluded IDs (
$wgWBRepoSettings['searchIndexPropertiesExclude']) AND either its property ID is in a configurable list of property IDs that should be indexed (
$wgWBRepoSettings['searchIndexProperties']) OR its property type is in a configurable list of property types that should be indexed (
Note that for a claim’s quantities to be stored, the claim must meet all the criteria above AND the property ID for the quantity qualifier must be present in a configurable list of property IDs (
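The conditions above combine as a single boolean test, sketched here in Python. The function names, the dict keys, and the value-kind markers ("novalue", "somevalue", "item", "string", "quantity") are hypothetical labels for illustration. The three collections stand in for the configurable lists mentioned above.

```python
def should_index(claim, excluded_ids, indexed_ids, indexed_types):
    """Decide whether a claim would be indexed, following the listed rules.

    claim is a dict with keys "property", "property_type" and "value_kind".
    """
    # 1. The claim has a real value (not 'no value' or 'unknown value').
    has_real_value = claim["value_kind"] not in ("novalue", "somevalue")
    # 2. The value is of a kind that can be processed for indexing.
    processable = claim["value_kind"] in ("item", "string", "quantity")
    # 3. Not excluded, AND (property ID listed OR property type listed).
    not_excluded = claim["property"] not in excluded_ids
    selected = (claim["property"] in indexed_ids
                or claim["property_type"] in indexed_types)
    return has_real_value and processable and not_excluded and selected


def should_index_quantity(claim_is_indexed, qualifier_property, quantity_ids):
    """Quantities need an indexable claim AND a listed qualifier property."""
    return claim_is_indexed and qualifier_property in quantity_ids
```

Note that the exclusion list takes precedence in this sketch: a property that appears in both the excluded list and the indexed list is not indexed.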
PHPUnit tests are located in
tests/phpunit. You can run tests not requiring the MediaWiki framework (located in
tests/phpunit/composer) by running
composer test. This command also runs code style checks using PHPCS.
Tests relying on the MediaWiki framework (located in
tests/phpunit/mediawiki) must be run using MediaWiki core’s