Wikimedia Product/Perspectives/Tools

Overview
In early 2006, a large-scale vandalism attack hit the English Wikipedia, wiping out thousands of articles. The editors were defenseless and the site was vulnerable. In response, four Wikipedians came together to write TawkerBot, the first anti-vandalism bot for Wikipedia. This bot proved to be a lifesaver for the site. Today more than 300 bots work around the clock on English Wikipedia to keep the site functioning smoothly. Tools such as Twinkle (a tool library that helps editors perform wiki maintenance), Huggle (a diff browser for rapidly reverting vandalism), HotCat (which lets a user easily add categories to pages and remove them), and AutoWikiBrowser (a semi-automated MediaWiki editor) drive many of the tasks power editors do on English Wikipedia every single day.

At the same time, smaller language wikis like Hindi Wikipedia have problems coping with vandalism and keeping up with content moderation needs. Unlike English Wikipedia, they don't have the corps of volunteer developers able to write tools to defend and curate the site's content. It's a lot harder for those communities to grow their content or their editor base, because the active contributors are stuck doing manual drudge work that the bigger wikis automated years ago.

Tools for Developers
Empowering our volunteer developers to write better code that can work across wikis is going to be a key factor in helping us gather the sum of all knowledge. Wikis need code contributors as much as they need content contributors. Templates, gadgets, and bots act as superpowers in making editors more efficient at their tasks. Experienced editors use these tools to create and maintain backlogs, keep track of the quality of incoming edits, perform mass actions on pages, ward off vandalism, and more. However, this superpower is limited to wikis that have contributors able to write code for the site. This creates a disparity in the resources available to wikis. Bringing these important resources to all wikis is fundamental to bridging the equity gap across all language wikis.

This paper advocates for building a platform that can support tools which work on all our wikis seamlessly. Right now a lot of developer code lives on the wikis (gadgets, Lua modules, templates), where it really isn’t possible to do any kind of testing, code review, or debugging; nor is there any straightforward way to add localization or RTL support. This often leads to issues like security vulnerabilities, conflicts with deployed MediaWiki extensions, and bugs due to lack of maintenance. Also, hosting code on the wikis on a per-project basis, as is done today, makes it hard to get into the mindset of having the code work across wikis. It’s easy to get sucked into customization and forget to think about things like RTL rendering or localization.

One aspect of the future platform depends on making services available that developers and communities can use to build better tools. These may be services for better copyright violation detection, vandalism detection, and image recognition, services that provide access to better statistics, and so on. Part of the growth of services involves better partnerships with companies like Google, Turnitin, and others that provide such services. Another very important aspect of the platform is for Wikimedia engineering to collaborate with our volunteer developer communities to come up with documentation and best practices for creating new tools. An example of this could be tutorials and guidelines on how gadgets can make use of OOUI to standardize our interfaces and make them more accessible for everyone. Tools that facilitate communication among engineers and volunteer developer communities are key to achieving this goal.

Tools for Organizers
Movement organizers are another key audience for tool development in the coming years. The Foundation’s 2018-19 annual plan recognizes organizers as “fundamental implementers” and a “core asset” of the free-knowledge movement. But tools that support organizers’ efforts are frequently ad hoc, poorly documented, and available only on certain wikis. Access problems can be particularly acute in smaller communities, where the technical skills required to set up and run bots, scripts, and other technologies are often scarce.

Organizers’ needs fall into three main areas. “Community-building” tools are required to help organizers inform, engage, and manage the work of their communities. “Outreach and promotion” tools will help organizers advertise their activities and recruit new members. “Event-management” tools are necessary to carry out tasks like event signup and conference scheduling more efficiently.

Finally, two overarching meta-problems are key areas of interest among organizers. One is the need for better guidance about best practices and the tools that do exist. As one organizer put it, “There are a lot of tools we don’t know about or know what they can do for us. We need someone to help us understand what we are missing, and what to do and how to do it.” The second is the need for a mechanism that can replace or augment categories, so that organizers will be able to classify content effectively and more efficiently tap volunteers’ subject interests, which are a primary motivator, especially for new editors.

Tools for Moderators
A critical but often overlooked aspect of the workflows that make our projects successful is the set of tools and processes used to review and moderate our content. For the most part, the Wikimedia Foundation has taken a hands-off approach to content moderation tools and let the community develop its own solutions (with a few exceptions such as Recent Changes filtering). As one would expect, the systems built by the community utilize the building blocks at their disposal: templates, gadgets, user scripts, and wikitext. These tools and processes suffer from several significant problems, many of which have already been mentioned above: lack of portability, limited capabilities, lack of automated testing and code review, lack of localization support, etc.

Another major problem with these tools and processes, especially those created for content moderation, is their steep learning curve. As one example, on English Wikipedia there is a system for submitting, reviewing, and publishing draft articles called Articles for Creation (AfC). In order to participate as a reviewer in AfC, you have to install a special user script, be granted a special right through a unique vetting process, and use several obscure templates and categories. The complexity of this process limits the number of people who are able and willing to participate, which in turn leads to a less diverse pool of reviewers. This lack of diversity may contribute to problems of systemic bias in our content. The small number of reviewers also makes the review process slow: a submitted draft often waits a week or longer for review. This delay likely contributes to the overall effect of the process, which is to decrease newcomer productivity. Unless we make these moderation tools work for less technical users, it is unlikely that the pool of moderators will grow or diversify.

Similar examples can be found throughout the moderation processes for our projects, including workflows for article assessment, deletion, and problem triaging; workflows for reviewing edits; workflows for reviewing and organizing multimedia contributions; workflows for proofreading Wikisource transcriptions; and more. While the Wikimedia Foundation has historically focused on building software for reading and editing content, the other critical pieces of the wiki ecosystem have been largely neglected, leading to volunteers feeling overwhelmed and unsupported. In a 2015 survey of experienced editors across 10 projects, only 35% said that the Foundation was mostly or completely meeting the community’s technical needs around identifying and surfacing content problems. Clearly, there is a lot of work for us to do in this area as we have only scratched the surface thus far. If we want to increase the capacity of our communities to efficiently and effectively moderate content, it is time for the Foundation to begin investing seriously in this area.

Unfortunately, the Foundation’s hands-off approach has resulted in a lack of credibility in this area. To build our credibility, we should first focus on the areas where there is a clear need for better tools, such as fighting vandalism and sock-puppetry. We should also investigate how editors transition into becoming moderators so that we can better facilitate that transition. Once we have proven our capacity to understand their motivations and work with moderators to build effective tools, we will then have the mutual trust needed to tackle more difficult workflows such as article deletion and conflict mediation.

Aspects

 * Tools for Developers
 * Tools for Organizers
 * Tools for Moderators

Examples

 * HotCat
 * Huggle
 * Twinkle
 * AutoWikiBrowser
 * Programs and Events Dashboard
 * Wikimedia Cloud Services
 * CentralNotice
 * GeoNotice

Areas of Impact

 * Templates
 * Gadgets
 * Bots
 * Editing and Administration APIs
 * Discussion systems
 * Messaging systems
 * Contributor Analytics
 * Developer Advocacy and Outreach
 * Translation and Localization Infrastructure
 * API and Tool Documentation