ORES/FAQ

Overview
Following is a list of questions and answers about ORES. Its purpose is to help readers become more familiar with ORES and how to get involved with the project. If you have a question not covered in this FAQ, please ask on the Discussion Page.

What does ORES stand for?
ORES is an acronym for Objective Revision Evaluation Service. ORES is pronounced like the English word "ores" (ɔɹz). It's a data-mining metaphor because machine learning models are a product of data-mining analysis.

Predicting edit and article quality
ORES is an artificial intelligence service that helps human editors improve the quality of edits and articles on Wikipedia. ORES uses a combination of open data and open-source machine learning algorithms to train models and create scores that help predict the quality of edits as they are made. Learn more about the background and basics of ORES.

Tools and services that use ORES scores
Many tools use ORES scores. These tools automate time-consuming workflows that would otherwise be done manually by human editors.

ORES tools have been used to predict the quality of new edits and articles, quickly identify and address damaging edits (sometimes called "vandalism"), check for copyright violations, and patrol recent changes to articles.

What's an ORES score?
An ORES score is a score assigned to individual edits to articles on Wikipedia. ORES scores help humans and machines describe the quality of an edit.

ORES scores allow humans and machines to generate better article content and to improve existing content.

ORES generates two types of scores, “article” and “edit” quality.

The “edit quality” ORES score helps determine which edits are damaging to an article and which were made in good faith. This makes it easier to identify well-intended folks who might need extra support to make better edits to articles. It can also help identify and revert obvious vandalism.

What is a patroller?
A patroller is a human user on Wikipedia who has taken on the role of determining whether edits might be damaging. ORES helps support this by using specific filters to determine whether new edits may be damaging. Potentially damaging edits are flagged and brought to the attention of a human user, who makes the final determination.

What are damaging edits (sometimes called "vandalism")?
"Vandalism" is an edit that intentionally damages an article.

Sometimes folks make edits that have damaging effects, even if they make them with the best intentions. A patroller's job is to look for "damaging" edits, whether the damage was intended or not.

Additional information is available on the ORES review tools page.

What is an article?
A wiki article is a webpage you can edit.

Because Wikipedia is an encyclopedia anyone can edit and new information is always emerging, articles will change over time.

When ORES scores the content of an article, the score is based on the article's revisions.

What is an edit?
Wiki articles can be changed by anyone clicking on the "Edit" button, as explained on this help page. Each edit is made by one user, who opens the article, edits and saves. Edits cannot be changed after the fact, although you can make a new edit to correct mistakes.

When ORES scores an edit, we are analyzing the change to guess whether it was helpful or damaging to the article, and whether the change was made in good faith.

Edits or changes are also known as "revisions". Reading the info page for an edit (example) will show you two things, the change made in that edit, and the contents of the article once that edit is applied.

What does quality mean?
Article content is scored using scales such as the Wikipedia 1.0 assessment, which are designed to give an idea of the completeness and readability of an article, as well as the richness of citations to supporting material. Each wiki language and project will use its own scale. See the English Wikipedia 1.0 scale for more information.

What is a model?
You can think of a model as a trained machine, which takes documents to be scored and outputs scores.

How do models get created?
Models are trained by the Scoring Platform staff, and uploaded to the server.

How do I get support for my wiki in ORES?
See ORES/Get support.

Getting help
Many folks who have a basic knowledge of how ORES works are interested in how to get help or how to help with ORES projects. A good place to start is the ORES support page. You can also look at our support table to see if we already support your Wiki.

How do I use ORES?
ORES is built into the RecentChanges, Watchlist, and Contributions pages for supported wikis. We recommend using the new filter interface, which gives more flexible searching and highlighting for several thresholds of damaging and good-faith prediction.

If your wiki isn't supported yet, please help us put together the language assets and let us know that you're interested in working with us to develop support. See "How do I get support for my wiki" above.

How can I use ORES to support my editing activities?
The best way to make use of ORES is to find a tool that uses ORES predictions. However, you can always query ORES directly via the API. For example, https://ores.wikimedia.org/v3/scores/enwiki/234234320/damaging returns:

{
  "enwiki": {
    "models": {
      "damaging": {
        "version": "0.3.0"
      }
    },
    "scores": {
      "234234320": {
        "damaging": {
          "score": {
            "prediction": false,
            "probability": {
              "false": 0.9785340256606994,
              "true": 0.021465974339300656
            }
          }
        }
      }
    }
  }
}
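To illustrate how a tool might consume such a response, here is a minimal Python sketch that unpacks the "damaging" prediction for one revision. The sample data is the example response above; in practice you would obtain the JSON with any HTTP client.

```python
import json

# The example response from /v3/scores/enwiki/234234320/damaging, as above.
SAMPLE = json.loads("""
{"enwiki": {"models": {"damaging": {"version": "0.3.0"}},
 "scores": {"234234320": {"damaging": {"score": {
   "prediction": false,
   "probability": {"false": 0.9785340256606994, "true": 0.021465974339300656}}}}}}}
""")

def damaging_score(response, wiki, rev_id):
    """Extract the 'damaging' prediction and the probability that the
    edit is damaging from a v3 scores response."""
    score = response[wiki]["scores"][str(rev_id)]["damaging"]["score"]
    return score["prediction"], score["probability"]["true"]

prediction, p_true = damaging_score(SAMPLE, "enwiki", 234234320)
print(prediction, round(p_true, 3))  # False 0.021
```

Here a low "true" probability means ORES considers the edit very unlikely to be damaging.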

What tools are available that make use of ORES?
See our list of tools that use ORES for examples of what's available.

How should I write code for querying ORES? (Best practices)
Generally, we can handle
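One broadly applicable practice when querying ORES is to batch revision IDs into a single request, rather than issuing one request per revision. The v3 API accepts pipe-separated `models` and `revids` query parameters; the batch size of 50 below is an assumption, so check the API documentation for current limits. A minimal sketch:

```python
from urllib.parse import urlencode

BASE = "https://ores.wikimedia.org/v3/scores"

def batch_score_urls(wiki, rev_ids, models, batch_size=50):
    """Yield ORES request URLs, batching pipe-separated revision IDs.
    batch_size=50 is an assumed limit; consult the API docs."""
    for i in range(0, len(rev_ids), batch_size):
        chunk = rev_ids[i:i + batch_size]
        query = urlencode({
            "models": "|".join(models),
            "revids": "|".join(str(r) for r in chunk),
        })
        yield f"{BASE}/{wiki}?{query}"

# batch_size=2 just to demonstrate chunking: yields two URLs, the first
# covering revisions 101 and 102, the second covering 103.
urls = list(batch_score_urls("enwiki", [101, 102, 103], ["damaging"], batch_size=2))
```

Batching keeps the number of HTTP round trips low and lets the service schedule related scoring jobs together.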

edit quality...?
There are three "edit quality" models, which each analyze changes to article content. The two advanced models try to predict whether the edit was damaging, and whether it was made in good faith.

The basic "reverted" model will try to predict whether the edit will be reverted, but this is problematic because it's based on which articles have already been reverted, which we suspect encodes some reviewer bias, and it also collapses a number of potential problems with the edit into a single outcome. Try to use the advanced models instead, if they are available for your wiki.

article quality...?
This model evaluates the entire article content, and gives it an overall quality score. This lets us evaluate the improvement in an article or set of articles over time, find high-quality articles, or find articles needing work. In English, the article quality model makes predictions using quality classes from the Wikipedia 1.0 scale.

draft quality...?
This model tries to predict which new articles will need to be speedily deleted.

Who do I talk to about problems/ideas/etc.?
Please contact the Scoring Platform team, using any of the means under "Contact us" at the bottom of the page.

How do I report problems with ORES' scores?
Funny that you should ask. A dedicated feedback system called Judgment and Dialogue Engine (JADE) is in the early stages of development, and that will be the best place to report problems. In the meantime, please create a wiki page, for example under the relevant Patrolling project, and alert us about the page using Phabricator or other means.

For development?
The best place to start is to install MediaWiki-Vagrant, and enable the following roles: "betafeatures", "ores", "ores_service", and "wikilabels".

As a public service?
Nobody outside of the Scoring Platform group has tried this yet. Please contact the team directly for more support.

What does ORES' architecture look like?


Requests to ORES first go to a set of load balancers that distribute the load to a set of WSGI workers that validate requests and perform basic IO for scoring jobs. If a score has already been generated, the WSGI workers will find it in the score cache and respond immediately. Otherwise, the scoring job is farmed out to a set of celery workers via a Redis task queue.
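The cache-or-queue decision made by the WSGI workers can be sketched as follows. This is illustrative pseudocode only; names like `score_cache` and `task_queue` are stand-ins, not the actual ORES internals (the real system uses Redis and celery, as described above).

```python
from queue import Queue

score_cache = {}       # stand-in for the real score cache
task_queue = Queue()   # stand-in for the Redis task queue

def handle_request(wiki, model, rev_id):
    """Illustrative WSGI-worker logic: serve from the cache when a score
    already exists, otherwise enqueue a scoring job for the workers."""
    key = (wiki, model, rev_id)
    if key in score_cache:
        return {"status": "cached", "score": score_cache[key]}
    task_queue.put(key)   # a celery worker would pick this up
    return {"status": "queued"}

score_cache[("enwiki", "damaging", 1234)] = {"prediction": False}
handle_request("enwiki", "damaging", 1234)   # served immediately from the cache
handle_request("enwiki", "damaging", 5678)   # farmed out via the task queue
```

The key point is that repeated requests for the same revision are cheap: only the first one triggers actual model computation.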

What deployments of ORES are there?
We maintain a production-ish space, where we optimize for uptime and stability, as well as an experimental space, where new models and features are deployed in a high-performance and flexible environment.

How is ORES configured?
ORES is configured via two repositories. The production-ish configuration lives in https://phabricator.wikimedia.org/source/ores-deploy/ and the experimental configuration lives in https://github.com/wiki-ai/ores-wmflabs-deploy. The primary configuration asset is config/00-main.yaml. Other configuration at the machine level comes from https://github.com/wikimedia/puppet.

How do I deploy ORES?
See our documentation on the Wikitech wiki: https://wikitech.wikimedia.org/wiki/ORES/Deployment

Who should I contact about ORES stuff?

 * File bugs on the Phabricator board: https://phabricator.wikimedia.org/tag/scoring-platform-team
 * Contact the wiki AI community via our mailing list: https://lists.wikimedia.org/mailman/listinfo/ai
 * Contact the team on our WMF email address: scoring-internal@wikimedia.org
 * Join us on IRC:

What are my options when querying ORES?
ORES provides basic documentation about how to query the system via Swagger. See https://ores.wikimedia.org for links to the Swagger documentation for each version of our API. For example, https://ores.wikimedia.org/v2/ lists the documentation for the v2 interface.

What models are available and what are their fitness statistics?
The best way to find out which models are available is to query ORES directly. For example, https://ores.wikimedia.org/v2/scores/ lists all of the wikis and models that are supported, along with their versions. To get fitness statistics, add "?model_info" to the URL (https://ores.wikimedia.org/v2/scores/?model_info). Eventually we'll have a UI to make accessing this information easier.
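A small sketch of walking such a listing in Python. The response shape here is a trimmed, assumed example for illustration; the real listing covers many more wikis and includes per-model details.

```python
# Trimmed, assumed shape of the /v2/scores/ listing (illustrative only).
SAMPLE_LISTING = {
    "scores": {
        "enwiki": {
            "damaging": {"version": "0.5.1"},
            "goodfaith": {"version": "0.5.1"},
        },
        "wikidatawiki": {
            "damaging": {"version": "0.5.0"},
        },
    }
}

def list_models(listing):
    """Flatten a scores listing into (wiki, model, version) tuples."""
    return [
        (wiki, model, info["version"])
        for wiki, models in listing["scores"].items()
        for model, info in models.items()
    ]

for wiki, model, version in list_models(SAMPLE_LISTING):
    print(f"{wiki}/{model} v{version}")
```

This kind of flattening is handy for checking whether a particular wiki/model pair is supported before building a tool around it.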

Where should I set my thresholds for filtering/highlighting?
This depends on what you want to optimize for. For example, when using the "damaging" model it's useful to optimize for high recall to make sure that most vandalism is caught. ORES reports "threshold optimizations" that you can use to identify an appropriate threshold. For example:

https://ores.wikimedia.org/v2/scores/enwiki/damaging?model_info=statistics.thresholds.true.'maximum precision @ recall >= 0.9' returns:

{
  "scores": {
    "enwiki": {
      "damaging": {
        "info": {
          "statistics": {
            "thresholds": {
              "true": [
                {
                  "!f1": 0.883,
                  "!precision": 0.996,
                  "!recall": 0.794,
                  "accuracy": 0.797,
                  "f1": 0.233,
                  "filter_rate": 0.77,
                  "fpr": 0.206,
                  "match_rate": 0.23,
                  "precision": 0.134,
                  "recall": 0.901,
                  "threshold": 0.09295862121864444
                }
              ]
            }
          }
        },
        "version": "0.4.0"
      }
    }
  }
}

This means you should set your threshold at 0.093 and expect to get 13.4% precision at 90% recall.

Where do I get more information about what ORES is and how it is used?
 * ORES on MediaWiki
 * ORES MediaWiki Extension
 * PythonHosted: https://pythonhosted.org (uses Python's Sphinx doc framework) (Audience: developers / contributors)