Extension:ORES

The ORES MediaWiki extension integrates data from an ORES service into the RecentChanges view.

Installation
After the extension is deployed, run the CheckModelVersions.php maintenance script once. After that, you can also run PopulateDatabase.php.

Config variables
Here are the config variables, their default values, and a short description of each.

ORES service responses
The ORES extension is little more than an interface to the ORES service. The service returns a probability score for an edit being damaging, like this (API v1): It means this edit (diff=724030089) is 10% likely to have caused damage. Note that a score of 90% does not mean that nine out of ten such edits will be vandalism. Thresholds should be chosen by analysing recall (the percentage of vandalism the threshold catches) or the false positive rate. In ORES the "soft" threshold is the one at which recall is 75% (meaning it will include 75% of all damaging edits) and the "hard" threshold is the one at which recall is 90%. You can get the thresholds from the model info.
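The idea of choosing a threshold from recall can be sketched as follows. This is only an illustration, not how the ORES service computes its thresholds: the function names `threshold_for_recall` and `false_positive_rate` are hypothetical, and the scores and labels are made up.

```python
import math


def threshold_for_recall(scores, labels, target_recall):
    """Return the highest threshold that still catches at least
    target_recall of the damaging edits (score >= threshold)."""
    damaging = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    if not damaging:
        raise ValueError("need at least one damaging edit")
    # Number of damaging edits that must score at or above the threshold.
    needed = max(1, math.ceil(target_recall * len(damaging)))
    return damaging[needed - 1]


def false_positive_rate(scores, labels, threshold):
    """Share of non-damaging edits that would be flagged at this threshold."""
    harmless = [s for s, y in zip(scores, labels) if not y]
    return sum(1 for s in harmless if s >= threshold) / len(harmless)


# Made-up sample: probability of "damaging" and the true label per edit.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [True, True, False, True, False, True, False, False]

soft = threshold_for_recall(scores, labels, 0.75)  # recall of 75%
hard = threshold_for_recall(scores, labels, 0.90)  # recall of 90%
print(soft, false_positive_rate(scores, labels, soft))
print(hard, false_positive_rate(scores, labels, hard))
```

Raising the target recall lowers the threshold, so more damaging edits are caught at the cost of flagging more harmless ones.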

Database schema
The ORES extension introduces two new tables:

ores_model
This table keeps data about models and their versions.


 * oresm_id: Primary key. oresc_model is a foreign key to this column.
 * oresm_name: Name of the model, such as "damaging".
 * oresm_version: Version of the model, such as "0.1.1".
 * oresm_is_current: Whether this version of the model is the current version in the service.

ores_classification
This table keeps the scores of each revision, one row per model and class.


 * oresc_id: Primary key
 * oresc_rev: Foreign key to rev_id in the revision table.
 * oresc_model: Foreign key to oresm_id in the ores_model table.
 * oresc_class: The class that the score applies to. In binary models (such as damaging and goodfaith), 1 is true and 0 is false. Multi-class models use more numbers, one per class (for example in wp10, 0 is for "Stub", 1 for "B" and so on).
 * oresc_probability: The probability that the ORES service returns, rounded to 3 decimal places for performance.
 * oresc_is_predicted: Whether the ORES service considers this the best-fitting class. For a binary model this is 1 when the probability is more than 50%; for multi-class models it is more complicated.
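To make the relationship between the two tables concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The column types are assumptions chosen for SQLite and are not the extension's actual schema definition; the helper `current_scores` and the sample rows are hypothetical.

```python
import sqlite3

# Column names follow the lists above; the types are assumptions.
SCHEMA = """
CREATE TABLE ores_model (
    oresm_id INTEGER PRIMARY KEY,
    oresm_name TEXT NOT NULL,
    oresm_version TEXT NOT NULL,
    oresm_is_current INTEGER NOT NULL
);
CREATE TABLE ores_classification (
    oresc_id INTEGER PRIMARY KEY,
    oresc_rev INTEGER NOT NULL,
    oresc_model INTEGER NOT NULL REFERENCES ores_model (oresm_id),
    oresc_class INTEGER NOT NULL,
    oresc_probability REAL NOT NULL,
    oresc_is_predicted INTEGER NOT NULL
);
"""


def current_scores(conn, rev_id, model_name):
    """Return (class, probability, is_predicted) rows for one revision,
    restricted to the current version of the named model."""
    return conn.execute(
        """SELECT oresc_class, oresc_probability, oresc_is_predicted
           FROM ores_classification
           JOIN ores_model ON oresc_model = oresm_id
           WHERE oresc_rev = ? AND oresm_name = ? AND oresm_is_current = 1
           ORDER BY oresc_class""",
        (rev_id, model_name),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO ores_model VALUES (1, 'damaging', '0.1.1', 1)")
# One row per class of the binary "damaging" model (illustrative values).
conn.executemany(
    "INSERT INTO ores_classification VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 724030089, 1, 0, 0.9, 1), (2, 724030089, 1, 1, 0.1, 0)],
)
print(current_scores(conn, 724030089, "damaging"))
```

Joining through ores_model is what lets a query ignore scores that belong to an outdated model version.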

Scores
Once an edit is made, the extension triggers a job that queries the service and stores the results in the ores_classification table. This means the table will not include scores for edits made before the deployment. To fill the database, you can run the maintenance script PopulateDatabase.php. It queries the service and stores the scores for the last 5,000 edits. You can run it several times if needed.
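The step from a service response to stored rows might look roughly like the sketch below. The response shape (a "prediction" plus a "probability" mapping) is an assumption about the service output, the function names and the `class_ids` mapping are hypothetical, and the numbers are illustrative; this is not the extension's actual job code.

```python
def label_of(value):
    """Normalise a prediction to the string form used as a probability key."""
    # A JSON boolean arrives as a Python bool; probability keys are strings.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def rows_from_score(rev_id, model_id, score, class_ids):
    """Build one ores_classification-style row per class of a score."""
    predicted = label_of(score["prediction"])
    rows = []
    for label, probability in score["probability"].items():
        rows.append({
            "oresc_rev": rev_id,
            "oresc_model": model_id,
            "oresc_class": class_ids[label],
            # Stored rounded to 3 decimal places, as described above.
            "oresc_probability": round(probability, 3),
            "oresc_is_predicted": int(label == predicted),
        })
    return sorted(rows, key=lambda row: row["oresc_class"])


# Assumed response for one revision of a binary model (made-up numbers).
score = {"prediction": False, "probability": {"false": 0.89964, "true": 0.10036}}
rows = rows_from_score(724030089, 1, score, {"false": 0, "true": 1})
for row in rows:
    print(row)
```

A binary model yields two rows per revision; a multi-class model would yield one row per class, with exactly one of them marked as predicted.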

Once a model is updated to a newer version, the CheckModelVersions.php maintenance script needs to be run to update the ores_model table. This makes the scores already stored in ores_classification for the old version obsolete. You can clean up these obsolete scores by running the PurgeScoreCache.php maintenance script.
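The effect of a version update followed by a purge can be sketched with `sqlite3`. This only models the behaviour described above and is not the code of either maintenance script; the function names `set_current_version` and `purge_obsolete_scores` are hypothetical, and the table definitions are trimmed to the columns the sketch needs.

```python
import sqlite3


def set_current_version(conn, name, version):
    """Record a new current version of a model; return True if it changed."""
    row = conn.execute(
        "SELECT oresm_version FROM ores_model "
        "WHERE oresm_name = ? AND oresm_is_current = 1",
        (name,),
    ).fetchone()
    if row is not None and row[0] == version:
        return False  # already up to date
    # Older versions stay in the table but are no longer current.
    conn.execute(
        "UPDATE ores_model SET oresm_is_current = 0 WHERE oresm_name = ?",
        (name,),
    )
    conn.execute(
        "INSERT INTO ores_model (oresm_name, oresm_version, oresm_is_current) "
        "VALUES (?, ?, 1)",
        (name, version),
    )
    return True


def purge_obsolete_scores(conn):
    """Delete scores that belong to non-current model versions."""
    cursor = conn.execute(
        "DELETE FROM ores_classification WHERE oresc_model IN "
        "(SELECT oresm_id FROM ores_model WHERE oresm_is_current = 0)"
    )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ores_model (
    oresm_id INTEGER PRIMARY KEY, oresm_name TEXT,
    oresm_version TEXT, oresm_is_current INTEGER);
CREATE TABLE ores_classification (
    oresc_id INTEGER PRIMARY KEY, oresc_rev INTEGER, oresc_model INTEGER,
    oresc_class INTEGER, oresc_probability REAL);
INSERT INTO ores_model VALUES (1, 'damaging', '0.1.1', 1);
INSERT INTO ores_classification VALUES (1, 724030089, 1, 0, 0.9);
INSERT INTO ores_classification VALUES (2, 724030089, 1, 1, 0.1);
""")

changed = set_current_version(conn, "damaging", "0.1.2")
purged = purge_obsolete_scores(conn)
print(changed, purged)
```

Keeping the two steps separate mirrors the text: updating the version only marks old scores as obsolete, and removing them is a distinct cleanup run.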

Interface
The extension does not show anything when it is first deployed. Instead, it registers itself as a beta feature (Extension:BetaFeatures is a dependency of this extension). Once a user enables it, the extension uses hooks in ChangesList (RecentChanges, Watchlist, and RelatedChanges), in both old and enhanced mode, and highlights edits whose score exceeds the given threshold.