JADE, the Judgment and Dialogue Engine, is a system for wiki communities to annotate pages, revisions, and diffs.
- How does JADE work?
- The JADE extension to MediaWiki provides a new namespace called "Judgment" that stores annotations to the wiki's content. Annotations are stored on these pages in a machine-readable format called JSON. These machine-readable annotations can be used to provide feedback to automated systems like ORES, for example by challenging the assertions that ORES makes. Users can edit Judgment-namespace pages directly, or they may interact with them indirectly using tools, including those used in counter-vandalism work.
- What is ORES?
- ORES is a machine learning service run by the Wikimedia Foundation which makes extremely fast predictions on matters such as a given article's quality or whether a given edit to a page damages that page. It helps make the work of human reviewers more efficient by providing data which helps those reviewers triage their work.
- ORES is not JADE, and JADE is not ORES. However, human contributions made through JADE can be used to help make ORES better.
- What is being judged?
- In the first phase of deployment, JADE will be used to judge wiki pages, revisions (individual versions of pages), and diffs (differences between revisions). Later, we'll want to judge other entities such as admin actions (using log entries as a proxy), users, and more. Each entity type can be judged according to several quantitative schemas, for example the "Wikipedia 1.0" assessment scale.
- The quantitative scales in JADE closely mirror ORES predictions, making the data easy to feed back into AI training, but we'll also be studying the full range of rich expression afforded by free text and talk pages.
- What kind of annotations are being made?
- These annotations are necessarily of a subjective nature, including judgments on matters such as the quality of a given page or whether a given diff is damaging to a page. For that reason, Judgment-namespace pages should not be used to store data from automated sources like ORES or other bots making automated assessments.
- How are these annotations stored?
- In the Judgment namespace, annotations for a given page, revision, or diff are stored on a single wiki page. For example, the annotations concerning the page with an ID of 123 would be at Judgment:Page/123. The annotations concerning revision 456 would be stored at Judgment:Revision/456. The annotations concerning the difference between revision 456 and its parent revision would be stored at Judgment:Diff/456.
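The naming convention described above can be sketched in a few lines of Python. This is only an illustration of the title scheme: the helper name `judgment_title` and the `ENTITY_PREFIXES` mapping are hypothetical and are not part of the JADE extension.

```python
# Hypothetical helper illustrating the Judgment-namespace title scheme.
# The names below are assumptions made for this sketch, not JADE's API.
ENTITY_PREFIXES = {
    "page": "Page",
    "revision": "Revision",
    "diff": "Diff",
}


def judgment_title(entity_type: str, entity_id: int) -> str:
    """Return the Judgment-namespace page title for a wiki entity."""
    prefix = ENTITY_PREFIXES.get(entity_type.lower())
    if prefix is None:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    if entity_id <= 0:
        raise ValueError("entity ID must be a positive integer")
    return f"Judgment:{prefix}/{entity_id}"


if __name__ == "__main__":
    print(judgment_title("page", 123))      # annotations for page ID 123
    print(judgment_title("revision", 456))  # annotations for revision 456
    print(judgment_title("diff", 456))      # revision 456 vs. its parent
```

Note that a diff is identified by the ID of its newer revision, so `Judgment:Diff/456` and `Judgment:Revision/456` share an ID but describe different entities.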
What are judgments?
Judgments in JADE begin as individuals' subjective opinions about wiki entities, for example that a given article has reached Featured Article quality, or that an edit is damaging to the article. These judgments can be refined through collaborative authorship, complete with a talk namespace, into what we expect to be a new gold standard for data quality: collaborative auditing.
Judgments consist of free text, an evaluation on one of our quantitative scales, and indirectly the associated talk page and metadata about the edit history of the judgment. Currently, each scale maps to an ORES model (damaging, goodfaith, wp10, itemquality, or drafttopic), so that judgments can be used directly to retrain those models.
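To make the shape of a judgment more concrete, here is a minimal Python sketch of a record that pairs free text with an evaluation on one of the five scales listed above. The class name `Judgment` and its field names (`scale`, `value`, `notes`) are assumptions made for illustration; they are not the JSON schema actually used by the extension, which is documented in Extension:JADE.

```python
from dataclasses import dataclass

# The five scales named in the text; each corresponds to an ORES model.
ORES_SCALES = frozenset(
    {"damaging", "goodfaith", "wp10", "itemquality", "drafttopic"}
)


@dataclass(frozen=True)
class Judgment:
    """Hypothetical in-memory form of a judgment (not JADE's real schema)."""

    scale: str       # which quantitative scale the evaluation uses
    value: object    # the evaluation itself, e.g. False or a quality class
    notes: str = ""  # free-text explanation from the author

    def __post_init__(self) -> None:
        # Reject scales that do not correspond to a known ORES model.
        if self.scale not in ORES_SCALES:
            raise ValueError(f"unknown scale: {self.scale!r}")


if __name__ == "__main__":
    j = Judgment(scale="damaging", value=False, notes="Routine copyedit.")
    print(j)
```

Keeping the scale names in one set makes it easy to extend the sketch if further scales are added later.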
Judgment content will normally be authored using tools, although it can be edited and administered as raw wiki pages when needed. JADE data is stored as regular wiki pages in a special-purpose namespace (for example, https://en.wikipedia.beta.wmflabs.org/wiki/Judgment:Diff/376901). The content format is a bit unpleasant to write by hand, and we're still prototyping the reference UI, so please join us in welcoming the small ecosystem of user interfaces that we expect to be developed over time.
The current technical implementation is documented in Extension:JADE and won't be treated further here.
Why is JADE important?
JADE will serve several purposes: to give a rich structure to patrolling and assessment workflows, and to produce high-quality feedback for the ORES AIs. Other uses will likely emerge.
It will facilitate a community of people overseeing our AIs, perhaps even in "partnership" with the AI. JADE is needed so that editors can effectively challenge the AIs' automated judgments. Currently this work is done ad hoc on wiki pages, e.g. it:Progetto:Patrolling/ORES. JADE represents basic infrastructure to better support this auditing process. The goal is to put more power into the hands of the people that ORES' predictions affect.
We hope that JADE will become especially useful to the patroller community as a way to coordinate work across workflows. For example, edits that have been patrolled as good can be fed into ORES' AI training as non-damaging examples.
JADE data should also become an important reservoir of counterexamples which help challenge assumptions and mitigate biases in ORES or other AIs. Judgments in JADE can be used to audit ORES (e.g. tracking bias) as well as to retrain ORES. Doing this in an open and collaborative way will encourage democratic oversight, rather than a handful of technical staff making all the decisions about how to build ORES.
Continuous collaborative auditing has been explored in the industry and is a promising method. Our approach is unique due to the massive public collaboration possible in wiki projects, so we're eagerly looking forward to seeing what emerges.
What will JADE support?
- MediaWiki integration
- Allows users to review a judgment for a wiki entity (revisions, pages, etc.)
- Allows users to submit and edit judgments
- Public API for tool developers and extension developers (Huggle, RC Filters, etc.)
- ORES integration
- Judgments returned along with predictions
- Consensus patterns
- Users file dissenting judgments
- Structured discussions (or talk pages) for every wiki entity
- Collaborative analysis
- Judgments openly licensed and publicly accessible
- Machine-readable dumps/API for generating fitness and bias trend reports
- Curation and suppression
- Recent judgments appear in Special:RecentChanges
- Basic suppression actions supported (hide comment, user, etc.)
Sign up to be contacted about discussions: JADE contact list
See JADE/Implementations for alternative potential technical implementations.
- "Best practices for AI in the social spaces: Integrated refutations"
- Technical work: task T148700 and its subtasks.
- JADE/Open questions
- Past examples of (manual, wiki-based) ORES auditing: