Content Generation

Summary

Content generation is the aspect of wiki activity related to adding new facts, writing, translations, or images to the wikis. The Wikimedia movement's ambitious aspiration to make the sum of all knowledge available to everyone in the world means that the movement has a tremendous amount of work to do with regard to content parity across all wiki projects. Most of the hundreds of Wikipedia language editions have fewer than 10% as many articles as English Wikipedia, and even the largest Wikipedias have serious gaps in both the depth of their articles and the subject matter they cover.

Augmentation is a potential pathway to closing the gaps described above. By applying algorithms and artificial intelligence in the right ways, human editors can be assisted in generating the most important content for the wikis, allowing us to close the most important gaps fastest. This kind of human-machine partnership is not new in the wikis. As early as 2002, Rambot^[1] generated 32,000 stub articles in English Wikipedia using Census data, and now in 2018, thousands of articles are translated between languages with the help of machine translation algorithms. On the horizon are technologies like Quicksilver,^[2] which detects facts about entities from news articles and collates them for human editors to turn into needed articles.

As humans and machines work together to generate content, we can think about that interaction as a spectrum defined by how much of the work the human editor does and how much the machine does. In some scenarios, the machine may just suggest a task that the human editor then does in its entirety. In other scenarios, the human may edit and improve on work done primarily by a computer. This paper explores some specific examples of content generation activities that could exist in the future, drawing from all along the spectrum of the human-machine partnership.

Because bias and unfairness already exist in the contents of the Wikimedia projects, algorithms have the potential to magnify and exacerbate those problems. The Wikimedia movement should confront this with the same principles that have led to our success in the past: transparency and the ability for anyone to contribute.

White Paper

DRAFT

Resources

References

[1] The History of Bots on Wikipedia

[2] https://quicksilver.primer.ai/
