|This page in a nutshell: introducing a new software initiative to provide human feedback about AI judgments.|
The Wikimedia Scoring Platform team is proposing the Judgement And Dialog Engine (JADE) as a way to attach human judgments and dialogue to any wiki artifact (e.g. a revision, a whole page, a user, or a log event), much as the Talk namespace relates to articles, but with associated structured data, such as an article quality rating, that can challenge predictions made by ORES. Judgments submitted to JADE can be used both to audit ORES (e.g. to track bias) and to retrain it. JADE will serve as a system for gathering false positives and other feedback, to tune the AI and to encourage democratic oversight.
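To make the idea concrete, a structured judgment attached to a wiki artifact might look like the following minimal sketch. All field names here are illustrative assumptions, not JADE's actual data model:

```python
# Illustrative sketch of a human judgment attached to a wiki artifact.
# Every field name below is an assumption for illustration only;
# JADE's real schema may differ.

judgment = {
    "artifact": {                 # the wiki entity being judged
        "type": "revision",       # could also be "page", "user", "log event", ...
        "id": 123456,             # hypothetical revision ID
    },
    "schema": "articlequality",   # which kind of rating this judgment gives
    "value": "B",                 # the human's rating, possibly disagreeing with ORES
    "preferred": False,           # consensus flag: set True once discussion settles
    "notes": "Well sourced, but missing an infobox.",  # free-text rationale/dialogue
}

# Dissenting judgments would simply be additional entries on the same artifact,
# coexisting until one is marked "preferred" by consensus.
print(judgment["artifact"]["type"], judgment["value"])
```

Because each judgment carries both machine-readable fields and free-text notes, the same record can feed model retraining and human discussion at once.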
Why is JADE important?
JADE will facilitate a community of people overseeing the AI, perhaps even in "partnership" with the AI. JADE is needed so that editors can effectively challenge the AI's automated judgments. Currently this work is done ad hoc on wiki pages, for example it:Progetto:Patrolling/ORES. JADE represents basic infrastructure to better support this auditing process. The goal is to put more power into the hands of the people that ORES' predictions affect.
What will JADE support?
- MediaWiki integration and a public API
- Users can submit and refute judgments
- Accessible to tool developers and extension developers (Huggle, RC Filters, etc.)
- ORES integration
- Judgments returned along with predictions
- Consensus patterns
- Users file dissenting judgments
- Structured discussions for every artifact
- Consensus recorded via a "preferred" judgment flag
- Collaborative analysis
- Judgments open to human review and analysis
- Machine readable for fitness and bias trend reports
- Curation and suppression
- Recent judgments appear in Special:RecentChanges
- Basic suppression actions supported (hide comment, user, etc.)
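A tool such as Huggle might submit a judgment through the public API described above. The sketch below only builds the request parameters; the action name "jadesubmitjudgment" and all parameter names are hypothetical, since JADE's API was still being designed when this page was written:

```python
# Sketch of how a tool could assemble a judgment submission for a
# MediaWiki-style action API. The action name "jadesubmitjudgment" and
# its parameters are hypothetical assumptions, not JADE's real API.

def build_judgment_request(rev_id: int, schema: str, value: str,
                           notes: str = "") -> dict:
    """Assemble action-API-style parameters for submitting a judgment."""
    return {
        "action": "jadesubmitjudgment",  # hypothetical action name
        "format": "json",
        "entitytype": "revision",        # the artifact being judged
        "entityid": rev_id,
        "schema": schema,                # e.g. "damaging" or "articlequality"
        "data": value,                   # the human's judgment
        "notes": notes,                  # free-text rationale for discussion
    }

# Example: refuting an ORES "damaging" prediction as a false positive.
params = build_judgment_request(
    123456, "damaging", "false",
    notes="Good-faith edit; ORES flagged it incorrectly.",
)
print(params["action"], params["entityid"])
```

In practice these parameters would be POSTed to the wiki's api.php endpoint with an edit token, like any other MediaWiki action API write.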
Sign up to be contacted about discussions: JADE contact list
- "Best practices for AI in the social spaces: Integrated refutations"
- Technical work: task T148700 and its subtasks.
- JADE/Open questions
- Past examples of (manual, wiki-based) ORES auditing: