Extension:MachineVision/Developers

This page documents the internals of the MachineVision extension, with a focus on the backend (PHP) logic.

Overview
When a new image is uploaded to Wikimedia Commons, the MachineVision extension queues a delayed job. When the job runs, it checks that the image is still present (i.e., not deleted) and, if so, requests and stores image label suggestions generated by one or more machine vision labeling providers. These label suggestions are then filtered and served to reviewers on the Special:SuggestedTags page on Commons. Accepted label suggestions are saved to the image's structured data as depicts (P180) statements.

In addition to new uploads, lists of image file page titles may be passed to the maintenance script  to have label suggestions retrieved and stored on demand.

When label suggestions are received for an image, an Echo event is fired to notify the uploader that image label suggestions are available for review, according to the uploader's notification preferences.

The extension is designed to support arbitrary machine vision providers (including issuing requests to multiple providers simultaneously), but the only provider for which support is currently implemented is Google Cloud Vision.

Image
Images are stored by their SHA1 hash in the  table. This means that if an image file is uploaded that is identical to one for which a record exists in the DB, it is the same image for the MachineVision extension's purposes, and labels will not be requested again.
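
The deduplication described above can be sketched as follows. This is a minimal illustration in Python rather than the extension's PHP; the `ImageRegistry` class and its in-memory set are hypothetical stand-ins for the database table, and the hex digest format is an assumption made for readability.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    """Return the hex SHA-1 digest of the raw file bytes."""
    return hashlib.sha1(data).hexdigest()


class ImageRegistry:
    """Hypothetical in-memory stand-in for the image table."""

    def __init__(self):
        self._known = set()

    def needs_labels(self, data: bytes) -> bool:
        """Record the image; return True only the first time this content is seen."""
        digest = sha1_of(data)
        if digest in self._known:
            return False  # identical content already has a record
        self._known.add(digest)
        return True


registry = ImageRegistry()
first = registry.needs_labels(b"same bytes")    # new content
second = registry.needs_labels(b"same bytes")   # byte-identical re-upload
third = registry.needs_labels(b"other bytes")   # different content
```

Because the key is the content hash rather than the file title, a re-upload under a different name is still treated as the same image.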

The extension only handles bitmap images and disregards all other file types.

Label
A label is stored as a Wikidata item ID (Q-number) in the  table. Human-readable labels associated with the item ID are fetched from Wikidata at the point of presentation to the end-user. A label will be associated with an image no more than once, even if the label is subsequently suggested by a different machine vision provider.

The distinction between labels and suggestions (below) is to ensure that a suggested label does not receive more than one set of votes (which may be inconsistent); labels should only be voted upon once regardless of the number of times they are suggested by different providers.

Suggestion
A suggestion (stored in the  table) refers to a single instance of a label being suggested for an image. There may be more than one suggestion that refers to an image-label pair, that is, one for each provider that suggests the same label for a given image.

As of September 2020, since there has only ever been one provider configured (namely Google Cloud Vision), there should be a one-to-one relationship between labels and suggestions in practice.
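
The relationship between labels and suggestions can be illustrated with a small sketch. It is written in Python for brevity and is not the extension's schema: the `LabelStore` class, its field names, and the provider name `other_provider` are all hypothetical, and `Q146` is used only as an example item ID.

```python
from dataclasses import dataclass, field


@dataclass
class LabelStore:
    # (image_sha1, wikidata_id) -> review state; one entry per image-label pair
    labels: dict = field(default_factory=dict)
    # (image_sha1, wikidata_id, provider); one entry per provider suggestion
    suggestions: set = field(default_factory=set)

    def add_suggestion(self, image_sha1: str, wikidata_id: str, provider: str) -> None:
        """Store a suggestion, creating the label only if it does not exist yet."""
        self.labels.setdefault((image_sha1, wikidata_id), 0)  # 0 = unreviewed
        self.suggestions.add((image_sha1, wikidata_id, provider))


store = LabelStore()
store.add_suggestion("abc123", "Q146", "google")
store.add_suggestion("abc123", "Q146", "other_provider")  # hypothetical second provider
```

After both calls there is one label with a single review state and two suggestions, which is what keeps a label from collecting more than one set of votes.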

Waiting period
A waiting period is enforced between upload time and the submission of an image to a machine vision provider for label suggestions. This is to reduce the likelihood of making a labeling request for an image that is soon to be deleted. As of September 2020, the waiting period is 48 hours. This value is configured in.
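
As a rough illustration of the check a delayed job would make, the following Python sketch combines the waiting period with the existence check. The function name and the `>=` boundary are assumptions; the real value comes from the extension's configuration.

```python
from datetime import datetime, timedelta, timezone

# 48 hours per the text above; hard-coded here only for illustration.
WAITING_PERIOD = timedelta(hours=48)


def ready_for_labeling(uploaded_at: datetime, now: datetime, still_exists: bool) -> bool:
    """True when the waiting period has elapsed and the file was not deleted."""
    return still_exists and (now - uploaded_at) >= WAITING_PERIOD


uploaded = datetime(2020, 9, 1, 12, 0, tzinfo=timezone.utc)
too_early = ready_for_labeling(uploaded, uploaded + timedelta(hours=47), True)
on_time = ready_for_labeling(uploaded, uploaded + timedelta(hours=48), True)
deleted = ready_for_labeling(uploaded, uploaded + timedelta(hours=72), False)
```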

Review state
Review state is a critical concept in the MachineVision extension, because it governs which images are presented on Special:SuggestedTags, and to which audiences. In the extension's internal logic, review states apply to labels rather than to images (on which see "Review state and data model" under "Quirks and gotchas" below). The review states are represented as integers, with a default state of 0 (unreviewed). The possible states are:


 * Unreviewed (0): The default label review state. The label may be presented in either the "popular" or "personal uploads" tab on Special:SuggestedTags.
 * Accepted (1): The label was accepted by a contributor. A corresponding depicts statement should have been created, and the label should no longer appear on Special:SuggestedTags.
 * Rejected (-1): The label was rejected by a contributor. It should no longer appear on Special:SuggestedTags.
 * Withhold from "popular" (-2): The initial review state for a label which is unreviewed but should be withheld from the "popular" tab and only shown to its uploader in the "personal uploads" tab. A label may receive this review state based on the SafeSearch ratings of the image to which it pertains.
 * Withhold from all (-3): The review state for a label pertaining to an image which should be withheld completely from Special:SuggestedTags.
 * Not displayed (-4): A special review state assigned to labels when an attempt to display them fails because a human-readable label could not be found in the requested language. This results in the label no longer being shown on Special:SuggestedTags.
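
The list above can be summarized as an integer enumeration. The sketch below is in Python and is illustrative only: apart from WITHHOLD_ALL, the constant names are hypothetical, and `tabs_for` is an assumed helper that restates the tab visibility described in the list.

```python
from enum import IntEnum


class ReviewState(IntEnum):
    UNREVIEWED = 0
    ACCEPTED = 1
    REJECTED = -1
    WITHHOLD_POPULAR = -2
    WITHHOLD_ALL = -3
    NOT_DISPLAYED = -4


def tabs_for(state: ReviewState) -> set:
    """Return the Special:SuggestedTags tabs in which a label in this state may appear."""
    if state == ReviewState.UNREVIEWED:
        return {"popular", "personal uploads"}
    if state == ReviewState.WITHHOLD_POPULAR:
        return {"personal uploads"}
    # Accepted, rejected, withheld-from-all and not-displayed labels are never shown.
    return set()
```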

Concept mapping
To interpret suggested labels from Google Cloud Vision as Wikidata item IDs, we rely on a historical mapping (publicly available here) between Freebase IDs and Wikidata item IDs. We take advantage of the fact that many Google entity IDs originated as Freebase IDs, and have only changed in their format; for example, the Freebase ID  would correspond to the Google entity ID. As part of the extension setup, these mappings must be retrieved from their public archive and loaded into the  table.

A drawback to the current setup is that these mappings date from 2013 (when Freebase was acquired by Google) and are naturally becoming outdated over time, as new concepts are added by Google, and Wikidata items are added, updated, and deleted (see also "Redirects and deletions" under "Quirks and gotchas" below). Task T231105 has been filed to create a strategy for keeping our concept mappings up to date.

Priority
Images that belong to target classes are shown in the "popular" tab on Special:SuggestedTags in preference to general user uploads. To support this, images are assigned a numeric priority value. This value is stored in the  table as.

Label suggestion lifecycle
TODO

Developer setup
Developer setup for the extension is well documented in the README file.

Image and label filtering
Label suggestions have multiple filters applied in  before storage, and each operates differently from the others.

The first filtering pass, based on, is intended to withhold images completely from being shown on Special:SuggestedTags. If a label in  is among the suggested labels returned for an image, the initial review state for all of that image's suggested labels is set to WITHHOLD_ALL, so the image is shown in neither the "popular" nor the "personal uploads" tab on Special:SuggestedTags. The labels are, however, retained in the database.

The second filtering pass, based on, conditionally withholds images from the "popular" tab. If an image receives a SafeSearch rating that exceeds the allowed value on any of the configured dimensions, it is withheld from the "popular" tab but still available to the uploader in the "personal uploads" tab on Special:SuggestedTags. All suggested labels are retained in the database.

The third and final pass, based on, is intended to discard specific label suggestions judged not to be useful to the projects. Suggestions corresponding to labels in  are simply discarded before the remaining suggested labels are stored.
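
The three passes can be sketched together as a single function. This is a simplified Python illustration, not the extension's code: the parameter names are hypothetical stand-ins for the configuration settings, SafeSearch ratings are modeled as plain integers (an assumption), and giving the withhold-all pass precedence over the SafeSearch pass is also an assumption.

```python
UNREVIEWED, WITHHOLD_POPULAR, WITHHOLD_ALL = 0, -2, -3


def filter_suggestions(labels, safe_search, withhold_triggers, safe_search_limits, blocklist):
    """Return (initial_review_state, labels_to_store) for one image."""
    state = UNREVIEWED
    if any(label in withhold_triggers for label in labels):
        # Pass 1: a trigger label withholds the image from both tabs.
        state = WITHHOLD_ALL
    elif any(safe_search.get(dim, 0) > limit for dim, limit in safe_search_limits.items()):
        # Pass 2: a rating above the allowed value withholds it from "popular" only.
        state = WITHHOLD_POPULAR
    # Pass 3: blocklisted labels are discarded and never stored.
    kept = [label for label in labels if label not in blocklist]
    return state, kept


# Hypothetical inputs: item IDs, rating values and limits are made up.
example = filter_suggestions(
    labels=["Q1", "Q2", "Q3"],
    safe_search={"adult": 4},
    withhold_triggers={"Q9"},
    safe_search_limits={"adult": 3},
    blocklist={"Q2"},
)
```

Note the difference in effect: the first two passes only change the initial review state and keep every label, while the third removes labels outright.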

Review state and data model
There is a conceptual mismatch between the extension's data model and its presentation layer. Because the extension was written to support multiple providers, review state is a property of a label rather than an image. On Special:SuggestedTags, however, all labels for an image are reviewed at once, image by image, and there is only one labeling provider (Google). As a result, the data model is more complicated than current usage requires: an image's eligibility for presentation on Special:SuggestedTags must be derived from the review states of its various suggested labels rather than being stored as a property of the image itself. Besides being confusing, this created early problems with query performance.
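
To make the derivation concrete, here is a hedged Python sketch of how image-level visibility could be computed from label-level states. The rule used (an image is shown in a tab if at least one of its labels is still eligible for that tab) is an assumption for illustration and may not match the extension's actual queries.

```python
UNREVIEWED, WITHHOLD_POPULAR = 0, -2


def image_tabs(label_states) -> set:
    """Derive the tabs an image may appear in from its labels' review states."""
    states = set(label_states)
    tabs = set()
    if UNREVIEWED in states:
        tabs.update({"popular", "personal uploads"})
    if WITHHOLD_POPULAR in states:
        tabs.add("personal uploads")
    return tabs


fully_reviewed = image_tabs([1, -1, 1])   # only accepted and rejected labels
partly_reviewed = image_tabs([1, 0])      # one label still unreviewed
uploader_only = image_tabs([-2, -2])      # withheld from "popular"
```

If review state were stored on the image, this aggregation over label rows would not be needed.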

Redirects and deletions
A common source of bugs is that values in the Freebase-Wikidata mappings may refer to a Wikidata item which has been redirected or deleted. The code attempts to resolve redirects as needed to mitigate the effects of outdated mappings, but it is not perfect.
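
Redirect handling of this kind can be sketched as below. This is not the extension's implementation: the function, the in-memory `redirects` and `deleted` collections, the hop limit, and the item IDs are all hypothetical, and a real implementation would consult Wikidata instead.

```python
def resolve_item(item_id, redirects, deleted, max_hops=10):
    """Follow redirects to a live item ID; return None if the target is deleted or unresolvable."""
    seen = set()
    current = item_id
    while current in redirects:
        if current in seen or len(seen) >= max_hops:
            return None  # redirect loop or chain too long
        seen.add(current)
        current = redirects[current]
    return None if current in deleted else current


# Hypothetical data: Q100 redirects to Q200, which redirects to Q300; Q400 was deleted.
redirects = {"Q100": "Q200", "Q200": "Q300"}
deleted = {"Q400"}

resolved = resolve_item("Q100", redirects, deleted)
gone = resolve_item("Q400", redirects, deleted)
unchanged = resolve_item("Q500", redirects, deleted)
```

Returning `None` for deleted or unresolvable items lets the caller drop the suggestion instead of storing a label that cannot be displayed.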