ORES/FAQ

Overview
The following is a list of questions and answers about the ORES AI for Wikipedia.

The purpose of this FAQ is to help readers become more familiar with what ORES is, how ORES is used on Wikipedia and other projects, and how to get involved.

If you are interested in learning more about artificial intelligence, machine learning, or data science, getting involved with ORES is a great place to start!

If you have a question that is not covered in this FAQ, please ask on the Discussion Page.

What does ORES stand for?
ORES is an acronym for Objective Revision Evaluation Service. ORES is pronounced like the English word "ores" (ɔɹz).

We chose a mining metaphor for ORES, because machine learning models are a product of data mining analysis.

Predicting edit and article quality
ORES is an artificial intelligence (AI) service that helps human editors improve the quality of edits and articles on Wikipedia.

ORES uses a combination of open data and open source machine learning algorithms to train models and create scores. ORES scores help predict the quality of edits, as they are made.

Learn more about the background and basics of ORES on Wikipedia.

Tools and services that use ORES scores
Many tools use ORES scores. These tools make it possible to automate time-consuming workflows that would otherwise have to be done manually by human editors.

ORES tools use machine learning to predict the quality of new edits and articles, quickly identify and address damaging edits (sometimes called "vandalism"), check for copyright violations, and patrol recent changes to Wikipedia articles.

What's an ORES score?
ORES scores are assigned to individual edits made to articles on Wikipedia. They describe the quality of edits and help humans determine which edits are damaging to articles and which are made in good faith. ORES scores make it easier to identify editors who might need extra support to make higher-quality edits. They can also help identify and revert obvious vandalism.

ORES scores allow humans and machines to work together to generate better article content and to improve existing content.

What is a patroller?
A patroller is a human user on Wikipedia who helps determine whether edits might be damaging. ORES helps support patrollers by using specific filters to determine if new edits may be damaging.

Potentially damaging edits are flagged and brought to the attention of human patrollers who make the final decisions regarding edit quality.

What are damaging edits (sometimes called "vandalism")?
"Vandalism" occurs when an editor decides to make a damaging edit on purpose.

Sometimes people make edits that have damaging effects, even if they make these edits with the best intentions. A patroller's job is to look for "damaging" edits, whether the damage was done on purpose or not.

Additional information is available on the ORES review tools page.

What is an article?
An article, or entry, is a page that anyone can edit. Wikipedia is made of encyclopedic articles. Because new information is always emerging, articles are revised often and may change over time.

ORES uses information from the article's revisions to score the content of the article.

What is an edit?
Wikipedia articles can be changed by anyone who wants to make changes or contribute new information. Each edit is made by a user who opens the article, edits it, and saves it. Edits cannot be changed after the fact, but a new edit can be made to correct mistakes.

When ORES AI scores an edit, it analyzes the change to determine whether it was helpful or damaging to the article and whether the change was made in good faith.

Edits or changes are also known as "revisions." The info page for an edit (example) shows the change made in that edit and the contents of the article, once that edit is applied.

What does quality mean?
Article content is scored using scales such as the Wikipedia 1.0 assessment, which are designed to give an idea of the completeness and readability of an article, as well as the richness of citations to supporting material. Each wiki language and project will use its own scale. See the English Wikipedia 1.0 scale for more information.

What is a model?
Models are trained by the Scoring Platform staff and uploaded to the server. Models are trained to take in documents and output ORES scores.

How do I get support for my wiki in ORES?
See ORES/Get support.

How do I use ORES?
ORES is built into the RecentChanges, Watchlist, and Contributions pages for supported wikis. We recommend using the new filter interface, which gives more flexible searching and highlighting for several thresholds of damaging and good-faith prediction.

If your wiki isn't supported yet, please help us put together the language assets and let us know that you're interested in working with us to develop support. See "How do I get support for my wiki in ORES?" above.

How can I use ORES to support my editing activities?
The best way to make use of ORES is to find a tool that uses ORES predictions. However, you can always query ORES directly via the API. For example, https://ores.wikimedia.org/v3/scores/enwiki/234234320/damaging returns:

{
  "enwiki": {
    "models": {
      "damaging": { "version": "0.3.0" }
    },
    "scores": {
      "234234320": {
        "damaging": {
          "score": {
            "prediction": false,
            "probability": {
              "false": 0.9785340256606994,
              "true": 0.021465974339300656
            }
          }
        }
      }
    }
  }
}
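A minimal Python sketch of the same lookup, assuming the v3 URL layout and response shape in the example above. The helper names `score_url` and `extract_score` are illustrative and not part of any ORES client library. The response is parsed from a literal string so that the example runs without network access.

```python
import json

ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def score_url(wiki, rev_id, model):
    """Build the scoring URL for one revision and one model."""
    return f"{ORES_BASE}/{wiki}/{rev_id}/{model}"


def extract_score(response, wiki, rev_id, model):
    """Return (prediction, probability of 'true') from a v3 response dict."""
    score = response[wiki]["scores"][str(rev_id)][model]["score"]
    return score["prediction"], score["probability"]["true"]


# The sample response from the FAQ example, as a JSON string.
SAMPLE = """
{"enwiki": {"models": {"damaging": {"version": "0.3.0"}},
 "scores": {"234234320": {"damaging": {"score": {
   "prediction": false,
   "probability": {"false": 0.9785340256606994,
                   "true": 0.021465974339300656}}}}}}}
"""

url = score_url("enwiki", 234234320, "damaging")
prediction, p_damaging = extract_score(
    json.loads(SAMPLE), "enwiki", 234234320, "damaging"
)
print(url)
print(prediction, p_damaging)
```

To work with a live response instead of the sample, fetch `url` with any HTTP client and pass the decoded body to `json.loads`.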

What tools are available that make use of ORES?
See our list of tools that use ORES for examples of what's available.

What types of models are available?
edit quality...? There are three "edit quality" models. Each analyzes changes to article content.

The basic "edit quality" model tries to predict whether an edit will be reverted. The two other advanced models try to predict whether the edit was damaging, and whether it was made in good faith.

Note: This basic model can be problematic. It is based on edits that have already been reverted by human reviewers, so some reviewer bias may be built into the model. If the advanced models are available for your wiki, we recommend you use one of them.

article quality...? This model evaluates the entire article content and gives it an overall quality score. This lets us evaluate the improvement in an article or set of articles over time, find high-quality articles, or find articles needing work.

On the English version of Wikipedia, the article quality model makes predictions using quality classes from the Wikipedia 1.0 scale.

draft quality...? This model is used to predict if new articles will need to be speedy deleted.

How do I run an instance of ORES?
For development? The best place to start is to install MediaWiki-Vagrant, and enable the following roles: "betafeatures", "ores", "ores_service", and "wikilabels".

As a public service? Nobody outside of the Scoring Platform group has tried this yet. Please contact the team directly for more support.

What does ORES' architecture look like?


Requests to ORES first go to a set of load balancers, which distribute the load across a set of WSGI workers that validate requests and perform basic IO for scoring jobs. If a score has already been generated, the WSGI workers will find it in the score cache and respond immediately. Otherwise, the scoring job is farmed out to a set of Celery workers via a Redis task queue.
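The cache-then-queue flow described above can be sketched in a few lines of Python. This is an illustration only, not ORES code: a plain dict stands in for the score cache, `queue.Queue` stands in for the Redis task queue, and the names `handle_request`, `run_one_job`, and `fake_model` are hypothetical. The sketch also does not model how the web worker waits for and returns the finished score.

```python
import queue

score_cache = {}             # stand-in for the score cache
task_queue = queue.Queue()   # stand-in for the Redis task queue


def handle_request(wiki, rev_id, model):
    """Web-worker step: validate, then answer from cache or enqueue a job."""
    if not isinstance(rev_id, int) or rev_id <= 0:
        raise ValueError("rev_id must be a positive integer")
    key = (wiki, rev_id, model)
    if key in score_cache:
        return {"status": "cached", "score": score_cache[key]}
    task_queue.put(key)
    return {"status": "queued"}


def run_one_job(score_fn):
    """Scoring-worker step: take one job, compute the score, cache it."""
    key = task_queue.get()
    score_cache[key] = score_fn(*key)
    task_queue.task_done()
    return key


def fake_model(wiki, rev_id, model):
    """Hypothetical scorer; a real worker would apply a trained model."""
    return {"prediction": False}


first = handle_request("enwiki", 234234320, "damaging")   # cache miss
run_one_job(fake_model)                                    # worker fills the cache
second = handle_request("enwiki", 234234320, "damaging")  # cache hit
print(first["status"], second["status"])
```

The point of the design is that repeated requests for the same revision and model never reach the scoring workers a second time.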

How can I get help with ORES?
Many people who have a basic knowledge of how ORES works are interested in how to get help or how to help with ORES projects. A good place to start is the ORES support page. You can also look at our support table to see if we already support your wiki.

Who do I talk to about problems/ideas/etc.? Please contact the Scoring Platform team, using any of the means under "Contact us" at the bottom of the page.

How do I report problems with ORES' scores? A dedicated feedback system called Judgment and Dialogue Engine (JADE) is in the early stages of development, and that will be the best place to report problems. In the meantime, please create a wiki page, for example under the relevant Patrolling project, and alert us about the page using Phabricator or other means.

What deployments of ORES are there?
We maintain two deployments: a production-ish space that is optimized for uptime and stability, and an experimental space where new models and features are deployed in a high-performance and flexible environment.

How is ORES configured?
ORES is configured via two repositories. The production-ish configuration lives in https://phabricator.wikimedia.org/source/ores-deploy/ and the experimental configuration lives in https://github.com/wiki-ai/ores-wmflabs-deploy. The primary configuration asset is config/00-main.yaml. Other configuration at the machine level comes from https://github.com/wikimedia/puppet.

How do I deploy ORES?
See our documentation on the Wikitech wiki: https://wikitech.wikimedia.org/wiki/ORES/Deployment

Who should I contact about ORES stuff?

 * File bugs on the Phabricator board: https://phabricator.wikimedia.org/tag/scoring-platform-team
 * Contact the wiki AI community via our mailing list: https://lists.wikimedia.org/mailman/listinfo/ai
 * Contact the team on our WMF email address: scoring-internal@wikimedia.org
 * Join us on IRC:

What are my options when querying ORES?
ORES documents how to query the system using Swagger. See https://ores.wikimedia.org for links to the Swagger documentation for each version of our API. For example, https://ores.wikimedia.org/v2/ lists the documentation for the v2 interface.

What models are available and what are their fitness statistics?
The best way to find out what models are available is to query ORES directly. For example, https://ores.wikimedia.org/v2/scores/ lists out all of the wikis and models that are supported with their versions. In order to get fitness statistics, add "?model_info" to the URL (https://ores.wikimedia.org/v2/scores/?model_info). Eventually we'll have a UI to make accessing this information easier.

Where should I set my thresholds for filtering/highlighting?
This depends on what you want to optimize for. For example, when using the "damaging" model it's useful to optimize for high recall to make sure that most vandalism is caught. ORES reports "threshold optimizations" that you can use to identify an appropriate threshold. For example:

https://ores.wikimedia.org/v2/scores/enwiki/damaging?model_info=statistics.thresholds.true.'maximum precision @ recall >= 0.9' returns:

{
  "scores": {
    "enwiki": {
      "damaging": {
        "info": {
          "statistics": {
            "thresholds": {
              "true": [
                {
                  "!f1": 0.883,
                  "!precision": 0.996,
                  "!recall": 0.794,
                  "accuracy": 0.797,
                  "f1": 0.233,
                  "filter_rate": 0.77,
                  "fpr": 0.206,
                  "match_rate": 0.23,
                  "precision": 0.134,
                  "recall": 0.901,
                  "threshold": 0.09295862121864444
                }
              ]
            }
          }
        },
        "version": "0.4.0"
      }
    }
  }
}

This means you should set your threshold at 0.093 and expect to get 13.4% precision at 90% recall.
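As a rough Python sketch of how such a threshold might be used, the block below builds the query URL (percent-encoding the spaces, quotes, and symbols in the optimization string), reads the threshold out of a response shaped like the one above, and applies it to a few scores. The helper names and the revision probabilities are hypothetical, and flagging an edit when its "true" probability is at or above the threshold is an assumption of this sketch. The response is an abbreviated copy of the example, kept as a literal so that nothing here needs network access.

```python
import json
from urllib.parse import quote

BASE = "https://ores.wikimedia.org/v2/scores"


def threshold_url(wiki, model, optimization):
    """Build the model_info URL, percent-encoding the optimization string."""
    path = f"statistics.thresholds.true.'{optimization}'"
    return f"{BASE}/{wiki}/{model}?model_info={quote(path, safe='')}"


def read_threshold(response, wiki, model):
    """Pull the first 'true' threshold entry out of a response dict."""
    info = response["scores"][wiki][model]["info"]
    return info["statistics"]["thresholds"]["true"][0]


def flag_edits(probabilities, threshold):
    """Return revision IDs whose 'true' probability meets the threshold."""
    return [rev for rev, p in probabilities.items() if p >= threshold]


# Abbreviated copy of the example response (only the fields used here).
SAMPLE = """
{"scores": {"enwiki": {"damaging": {"info": {"statistics": {"thresholds":
 {"true": [{"f1": 0.233, "precision": 0.134, "recall": 0.901,
            "threshold": 0.09295862121864444}]}}},
 "version": "0.4.0"}}}}
"""

url = threshold_url("enwiki", "damaging", "maximum precision @ recall >= 0.9")
entry = read_threshold(json.loads(SAMPLE), "enwiki", "damaging")

# Hypothetical revision IDs and 'damaging' probabilities for illustration.
probabilities = {1001: 0.021, 1002: 0.35, 1003: 0.09}
flagged = flag_edits(probabilities, entry["threshold"])

# Sanity check: F1 is the harmonic mean of precision and recall.
p, r = entry["precision"], entry["recall"]
f1 = 2 * p * r / (p + r)
print(flagged, round(f1, 3))
```

With these made-up probabilities only revision 1002 is flagged, and the recomputed F1 rounds to the 0.233 reported in the response. Low precision at high recall means most flagged edits will turn out to be fine, which is the expected trade-off when the goal is to miss as little vandalism as possible.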

Where do I get more information about what ORES is and how it is used?
ORES on MediaWiki

ORES MediaWiki Extension

PythonHosted: https://pythonhosted.org (Uses Python's Sphinx documentation framework) (Audience: Developers / contributors)