This is a draft paper (< 750 words for the first iteration) explaining the role JADE might play in the global digital political economy.
Human Judgment and Democratic Oversight of AI: Defenses Against Systemic Bias?
We present JADE, an infrastructure that creates a democratic feedback loop between Wikipedia's principal AIs and the editors in whose interest those AIs act. JADE will be a collaborative AI auditing support system in which human judgments stand on equal footing with automated predictions: they are returned alongside the automated scores and can be used to retrain the AIs.
We hope to set a new precedent for transparent algorithms at scale, and we hope that JADE feedback will help us identify and mitigate systemic biases. Most industrial machine learning algorithms are black boxes, subject to business pressures that create systemic biases. For example, a loan application evaluation algorithm will tend to make decisions that maximize the company's profit regardless of social justice, e.g. by denying already credit-starved people access to credit. On the Wikimedia wikis, we have an unusual incentive to reveal and mitigate biases: to fight the emergence of any privileged class of edit or editor, and to keep Wikipedia running as a communication commons in which all volunteers can participate equally in knowledge production.
Examples of threats to Wikimedia as a commons
These are examples of threats that ORES and JADE can counteract:
Restrictive work backlogs
New article creation has been increasingly restricted over time because the backlog of review work is unmanageable. This has privileged existing editors over new editors, because the distinction makes a convenient test for whether article content can be blindly trusted. ORES can solve this bottleneck by directing new articles to specific communities of interest, which have the domain expertise to review and support new editors in their work.
These work backlogs can themselves be oppressive, forcing editors to spend time on difficult and alienating work when they might rather be doing creative, intellectual work.
As the work backlogs pile up, we create scarcity. Review is a necessary input that enables content to be seen by readers, and unreviewed content prevents editors from building on earlier work, so unreviewed work has much lower use value. Restrictions on workflows such as new article creation slow content production, reducing overall productivity and discouraging editors.
Similarly, new editors face multiple obstacles to participation, from simple "autopatrolled" filters that prevent them from editing popular articles, to biases in ORES that must be identified. We encountered multiple pitfalls during ORES development; an important one was that the training algorithms favored a boolean "new editor" feature as a strong predictor of an edit's badness. While this may be true as an aggregate observation, made by machine learning without the help of human biases, it is not a fair way to treat new editors as individuals. In general, conditioning a prediction on any characteristic of the editor is a dangerous game, because its effect is to privilege the status quo and make it harder for new editors to begin participating normally. In this case, we removed [verify] the "new editor" feature entirely, finding that doing so had a very small impact on model "health" but yielded a large improvement in algorithm fairness.
We've made a conscious effort to upgrade from the simplest available training set, "reverted" revisions, to the more advanced "damaging" and "goodfaith" edit quality models whenever possible. The "reverted" model is problematic because we blindly feed a history of known bad edits into our training without understanding why those edits were rejected. A k-means cluster analysis showed between 4 and 7 [check] clusters, depending on the wiki, which means there were at least that many types of bad edit. Some of those classes would have been edits that existing patrollers rejected for reasons we're uncomfortable perpetuating. For example, there is a known issue with adding material about women or people from so-called developing countries, for whom citations are drawn from sources outside the dominant canon of English-language, North America- and Eurocentric media. If we simply train on the history of reversions, we encode these status quo biases into the machine learning and create a vicious feedback cycle: editors see the lower score and are encouraged to revert the new material. If instead we ask the explicit question "Is this edit damaging to the encyclopedia?" to create our training labels, it is more difficult for the same types of bias to enter.
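The kind of cluster analysis mentioned above can be sketched as follows. This is a stand-in, not Amir's actual analysis: the feature vectors are synthetic, the number of generating groups is chosen arbitrarily, and silhouette score is just one plausible way of picking the cluster count.

```python
# Sketch of a k-means cluster analysis over "reverted edit" feature vectors,
# selecting k by silhouette score. Data is synthetic: five loose groups stand
# in for distinct types of bad (or merely disliked) edits.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(42)
centers = rng.normal(0, 5, size=(5, 8))          # 5 hidden edit "types"
X = np.vstack([c + rng.normal(0, 1, size=(100, 8)) for c in centers])

# Evaluate candidate cluster counts by silhouette score (higher is better).
scores = {}
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print("silhouette by k:", {k: round(s, 3) for k, s in scores.items()})
print("best k:", best_k)
```

On real edit data the clusters are far less clean, which is exactly the point: each recovered cluster then needs a human interpretation of *why* those edits were reverted before the label can be trusted for training.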
Bias favoring dominant culture
Still, there will be biases, and JADE exists to help us find them. Since ORES is designed to emulate part of the status quo of content curation, we need to actively mitigate its tendency to make editing more difficult for people outside the mainstream. JADE will create an open space in which dominant curation practices can be challenged.
One concern about the JADE system is that we might accidentally encourage group polarization, in which a small number of like-minded people working in an area of curation begin to form their own dominant consensus. Since our data will be quantitative, it may be possible to detect and study this effect.
- Themis principles
- Amir's cluster analysis
- Sylvain Firer-Blaess and Christian Fuchs. "Wikipedia: An Info-Communist Manifesto." Television & New Media. doi:10.1177/1527476412450193
- Vasilis Kostakis (2012). "The political economy of information production in the Social Web: chances for reflection on our institutional design." Contemporary Social Science 7(3): 305-319. doi:10.1080/21582041.2012.691988
- Lincoln Dahlberg (2010). "Cyber-Libertarianism 2.0: A Discourse Theory/Critical Political Economy Examination." Cultural Politics 6(3): 331-356. doi:10.2752/175174310X12750685679753
- Xiaoquan (Michael) Zhang and Feng Zhu (2011). "Group Size and Incentives to Contribute: A Natural Experiment at Chinese Wikipedia." American Economic Review 101(4): 1601-1615.
- Brent Hecht and Darren Gergle (2009). "Measuring self-focus bias in community-maintained knowledge repositories." In Proceedings of the Fourth International Conference on Communities and Technologies (C&T '09), 11-20. ACM, New York.