Growth/Personalized first day/Structured tasks/Add an image

This page describes work on an "add an image" structured task, which is a type of [[Special:MyLanguage/Growth/Personalized first day/Structured tasks|structured task]] that the Growth team will offer through the [[Special:MyLanguage/Growth/Personalized first day/Newcomer homepage|newcomer homepage]]. The [[Special:MyLanguage/Wikimedia Apps/Team/Android|Android team]] is also thinking about a similar task for the Wikipedia Android app using the same underlying components. Discussion and updates on this page are relevant to the work of both teams.

This page contains major assets, designs, open questions, and decisions.

Most incremental updates on progress will be posted on the general [[Special:MyLanguage/Growth/Growth team updates|Growth team updates]] page, with some large or detailed updates posted here.

Current status

 * 2020-06-22: initial thinking about ideas to create a simple algorithm to recommend images
 * 2020-09-08: evaluated a first attempt at a matching algorithm in English, French, Arabic, Korean, Czech, Vietnamese
 * 2020-09-30: evaluated a second attempt at a matching algorithm in English, French, Arabic, Korean, Czech, Vietnamese
 * 2020-10-26: internal engineering discussion of the feasibility of an image recommendation service
 * 2020-12-15: running initial round of user tests to start to understand whether newcomers might succeed at this task

Summary
[[Special:MyLanguage/Growth/Personalized first day/Structured tasks|Structured tasks]] are meant to break down editing tasks into step-by-step workflows that make sense for newcomers and make sense on mobile devices. The Growth team believes that introducing these new kinds of editing workflows will allow more new people to begin participating on Wikipedia, some of whom will learn to do more substantial edits and get involved with their communities. After [[Special:MyLanguage/Growth/Personalized first day/Structured tasks#Community%20discussion|discussing the idea of structured tasks with communities]], we decided to build the first structured task: "[[Special:MyLanguage/Growth/Personalized first day/Structured tasks/Add a link|add a link]]".

Even as we build that first task, we have been thinking about what a next structured task could be, and we think that adding images could be a good fit for newcomers. The idea is that a simple algorithm would recommend images from Commons to be placed on articles that have no images. To start with, it would use only existing connections that can be found in Wikidata, and newcomers would use their judgment to place the image on the article or not.

We know that there are many open questions around how this would work, and many potential reasons that it might not go right. That's why we are hoping to hear from lots of community members and have an ongoing discussion as we decide how to proceed.

Why images?
 Looking for substantial contributions 

When we first discussed structured tasks with community members, many pointed out that adding wikilinks is not a particularly high-value type of edit. Community members brought up ideas for how newcomers could make more substantial contributions. One idea is images. Wikimedia Commons contains 65 million images, but in many Wikipedias, over 50% of articles have no images. We believe that many images from Commons can make Wikipedia substantially more illustrated.

 Interest from newcomers 

We know that many newcomers are interested in adding images to Wikipedia. "To add an image" is a common response newcomers give on the [[Special:MyLanguage/Growth/Personalized first day/Welcome survey|welcome survey]] for why they are creating their account. We also see that one of the most frequent [[Growth/Focus on help desk/Help panel|help panel]] questions is about how to add images, and this is true across all the wikis we work with. Though most of these newcomers are probably bringing their own image that they want to add, this hints at how images can be engaging and exciting. That makes sense, given the image-heavy elements of the other platforms that newcomers participate in -- things like Instagram and Facebook.

 Difficulty of working with images 

The many help panel questions about images reflect that the process of adding them to articles is too difficult. Newcomers have to understand the difference between Wikipedia and Commons, rules around copyright, and the technical parts of inserting and captioning the image in the right place. Finding an image in Commons for an unillustrated article requires even more skills, such as knowledge of Wikidata and categories.

 Success of "Wikipedia Pages Wanting Photos" campaign 

The [[metawiki:Special:MyLanguage/Wikipedia_Pages_Wanting_Photos|Wikipedia Pages Wanting Photos]] campaign (WPWP) was a surprising success: 600 users added images to 85,000 pages. They did this with the assistance of a couple of [[Special:MyLanguage/metawiki:Wikipedia_Pages_Wanting_Photos/Resources|community tools]] that identify pages with no images and suggest possible images through Wikidata. While there are important lessons to be learned about how to help newcomers succeed with adding images, this gives us confidence that users can be enthusiastic about adding images and that they can be assisted by tools.

 Taking this all together 

Thinking about all this information together, we think that it could be possible to build an "add an image" structured task that is both fun for newcomers and productive for Wikipedias.

Algorithm
Our ability to make a structured task for adding images depends on whether we can create an algorithm that generates sufficiently good recommendations. We definitely do not want to urge newcomers to add the wrong images to articles, which would cause work for patrollers to clean up after them. Therefore, trying to see if we could make a good algorithm is one of the first things we've worked on.

Logic
We have been working with the [https://research.wikimedia.org/ Wikimedia Research team], and so far we have been testing an algorithm that prioritizes accuracy and human judgment. Rather than using any computer vision, which can generate unexpected results, it simply aggregates existing information in Wikidata, drawing on connections made by experienced contributors. These are the three main ways that it suggests matches to unillustrated articles:


 * Look at the Wikidata item for the article. If it has an image (P18), choose that image.
 * Look at the Wikidata item for the article. If it has a Commons category associated (P373), choose an image from the category.
 * Look at the articles about the same topic in other language Wikipedias. Choose a lead image from those articles.

The algorithm also includes logic to do things like exclude images that are likely icons or that are present on an article as part of a navbox.
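The three strategies above, together with the icon exclusion, can be sketched roughly as follows. This is only an illustration, not the team's implementation: the `Article` structure, the `suggest_image` and `looks_like_icon` names, the icon heuristic, and the order in which the three sources are tried are all assumptions made for the example. Only the Wikidata property IDs P18 and P373 come from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Article:
    """Hypothetical, simplified view of an unillustrated article."""
    title: str
    # Wikidata claims for the article's item, e.g. {"P18": ["Foo.jpg"]}
    wikidata_claims: dict = field(default_factory=dict)
    # Files found in the Commons category named by P373 (assumed pre-fetched)
    commons_category_files: list = field(default_factory=list)
    # Lead images of articles on the same topic in other language Wikipedias
    other_language_lead_images: list = field(default_factory=list)

# Assumed heuristic for spotting icons; the real exclusion logic is not specified here.
ICON_HINTS = ("icon", "logo", "flag", "stub")

def looks_like_icon(filename: str) -> bool:
    name = filename.lower()
    return name.endswith(".svg") or any(hint in name for hint in ICON_HINTS)

def suggest_image(article: Article) -> Optional[tuple]:
    """Return (filename, source) for the first usable candidate, or None."""
    claims = article.wikidata_claims
    sources = [
        ("wikidata-p18", claims.get("P18", [])),
        ("commons-category", article.commons_category_files if "P373" in claims else []),
        ("other-wikis", article.other_language_lead_images),
    ]
    # Assumption: sources are tried in the order they are listed above.
    for source, candidates in sources:
        for filename in candidates:
            if not looks_like_icon(filename):
                return (filename, source)
    return None

example = Article(
    title="Example town",
    wikidata_claims={"P373": ["Example town"]},
    commons_category_files=["Map icon.svg", "Town hall.jpg"],
)
print(suggest_image(example))
```

Returning the source alongside the filename is a design choice for the sketch: it would let an evaluation compare how often each kind of connection yields a good match.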

Performance
As of December 2020, we've gone through two rounds of testing the algorithm, each time looking at matches to articles in six languages: English, French, Arabic, Vietnamese, Czech, and Korean. The evaluations were done by our team's ambassadors, who are native speakers of each language. Looking at 50 matches in each language, we went through and classified them into these groups:

A question that runs throughout the work on an algorithm like this is: how accurate does it need to be? If 75% of matches are good, is that enough? Does it need to be 90% accurate? Or could it be as low as 50% accurate? This depends on how good the judgment of the newcomers using it is, and how much patience they have for weak matches. We'll learn more about this when we user test the algorithm with real newcomers.

In the first evaluation, the most important thing is that we found a lot of easy improvements to make to the algorithm, including types of articles and images to exclude. Even without those improvements, about 20-40% of matches (depending on the wiki) were "2s", meaning great matches for the article. [[phab:T260857#6444769|You can see the full results and notes from the first evaluation here]].

For the second evaluation, many improvements were incorporated, and the accuracy increased. Between 50% and 70% of matches were "2s" (depending on the wiki). But increasing the accuracy can decrease the coverage, i.e. the number of articles for which we can make matches. Using conservative criteria, the algorithm may only be able to suggest tens of thousands of matches in a given wiki, even if that wiki has hundreds of thousands or millions of articles. We believe that that kind of volume would be sufficient to build an initial version of this feature. [[phab:T263374|You can see the full results and notes from the second evaluation here]].
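To make the accuracy and coverage trade-off concrete, here is a minimal sketch with made-up numbers. The rating codes other than "2", the per-match `confidence` score, and the function names are assumptions for illustration; none of the data below comes from the actual evaluations.

```python
def share_of_great_matches(ratings, great=2):
    """Fraction of evaluated matches rated as a great match ("2")."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(1 for r in ratings if r == great) / len(ratings)

def accuracy_and_coverage(matches, min_confidence):
    """Keep matches at or above a confidence cutoff.

    matches: list of (confidence, is_good) pairs, where confidence is a
    hypothetical score and is_good is an evaluator's judgment.
    Returns (accuracy of kept matches, number of kept matches).
    """
    kept = [is_good for confidence, is_good in matches if confidence >= min_confidence]
    if not kept:
        return (None, 0)
    return (sum(kept) / len(kept), len(kept))

# Invented sample, only to show the shape of the trade-off.
sample = [(0.9, True), (0.8, True), (0.7, False), (0.6, True),
          (0.4, False), (0.3, False), (0.2, True), (0.1, False)]

print(accuracy_and_coverage(sample, 0.0))  # permissive: more matches, lower accuracy
print(accuracy_and_coverage(sample, 0.5))  # conservative: fewer matches, higher accuracy
```

In this invented sample, raising the cutoff halves the number of suggested matches while raising the share of good ones, which is the same tension described above between conservative criteria and the number of articles that receive a suggestion.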

We are continuing to make improvements to the algorithm, and in December 2020, we are trying a third evaluation, which you can follow along with [[phab:T266271|here]].

Open questions
Images are such an important and visible part of the Wikipedia experience. It is critical that we think hard about how a feature enabling the easy adding of images would work, what the potential pitfalls might be, and what the implications would be for community members. To that end, we have many open questions, and we want to hear any others that community members can bring up.


 * Will our algorithm be sufficiently accurate such that plenty of good matches are provided?
 * What metadata from Commons and the unillustrated article do newcomers need in order to make a decision about whether to add the image?
 * Will newcomers have sufficiently good judgment when looking at recommendations?
 * Will newcomers who don't read English be equally able to make good decisions, given that much of Commons metadata is in English?
 * Will newcomers be able to write good captions to go along with images that they place in the articles?
 * How much should newcomers judge images based on their "quality" as opposed to their "relevance"?
 * Will newcomers think this task is interesting? Fun? Difficult? Easy? Boring?
 * How exactly should we determine which articles have no images?
 * Where in the unillustrated article should the image be placed? Is it sufficient to put it at the top of the article?
 * How can we be mindful of potential bias in the recommendations? For example, perhaps the algorithm will make many more matches for topics in Europe and North America.
 * Will such a workflow be a vector for vandalism? How can this be prevented?

Validating the idea


Thinking about the open questions above, in addition to community input, we want to generate some quantitative and qualitative information to help us evaluate the feasibility of building an "add an image" feature. Though we have been evaluating the algorithm amongst staff and Wikimedians, it is important to see how newcomers react to it, and to see how they use their judgment when deciding on whether an image belongs in an article.

To that end, we are going to run tests with usertesting.com, in which people new to Wikipedia editing can go through potential image matches in a prototype and respond "Yes", "No", or "Unsure". We built a quick prototype for the test, backed with real matches from the current algorithm. The prototype just shows one match after another, all in a feed. The images are shown along with all the relevant metadata from Commons:


 * Filename
 * Size
 * Date
 * User
 * Description
 * Caption
 * Categories
 * Tags

Though this may not be what the workflow would be like for real users in the future, the prototype was made so that testers could go through lots of potential matches quickly, generating lots of information.

To try out the interactive prototype, [https://o0cg2e.axshare.com/ use this link]. Note that this prototype is primarily for viewing the matches from the algorithm -- we have not yet thought hard about the actual user experience. It does not actually create any edits. It contains 60 real matches proposed by the algorithm.

Here's what we'll be looking for in the test:


 * 1) Are participants able to confidently confirm matches based on the suggestions and data provided?
 * 2) How accurate are participants at evaluating suggestions? Do they think they are doing a better or worse job than they are actually doing?
 * 3) How do participants feel about the task of adding images to articles this way? Do they find it easy/hard, interesting/boring, rewarding/irrelevant?
 * 4) What information do participants find most valuable in helping them evaluate image and article matches?
 * 5) Are participants able to write good captions for images they deem a match using the data provided?
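As a rough illustration of how question 2 could be measured, the sketch below scores one participant's "Yes" / "No" / "Unsure" responses against an evaluator's judgment of the same matches. The function name, the data layout, and the choice to leave "Unsure" answers out of the accuracy figure are assumptions, not part of the test plan.

```python
def score_participant(responses, truth):
    """Compare a participant's answers with evaluator judgments.

    responses: dict mapping match id -> "Yes", "No", or "Unsure"
    truth:     dict mapping match id -> True if evaluators consider it a good match
    """
    if not responses:
        raise ValueError("responses must not be empty")
    # Assumption: "Unsure" answers are reported separately rather than counted as wrong.
    decided = {m: r for m, r in responses.items() if r != "Unsure"}
    correct = sum(1 for m, r in decided.items() if (r == "Yes") == truth[m])
    return {
        "accuracy": correct / len(decided) if decided else None,
        "unsure_rate": (len(responses) - len(decided)) / len(responses),
    }

# Invented example data.
responses = {"a": "Yes", "b": "No", "c": "Unsure", "d": "Yes"}
truth = {"a": True, "b": True, "c": False, "d": True}
print(score_participant(responses, truth))
```

Comparing the resulting accuracy with the accuracy a participant reports for themselves would indicate whether they over- or underestimate how well they are doing.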

Concept A vs. B
In thinking about design for this task, we face a question similar to the one we faced for "add a link" with respect to Concept A and Concept B. In Concept A, users would complete the edit at the article, while in Concept B, they would do many edits in a row all from a feed. Concept A gives the user more context for the article and editing, while Concept B prioritizes efficiency.

In the interactive prototype above, we used Concept B, in which the users proceed through a feed of suggestions. We did that because in our user tests we wanted to see many examples of users interacting with suggestions. That's the sort of design that might work best for a platform like the Wikipedia Android app. For the Growth team's context, we're thinking more along the lines of Concept A, in which the user does the edit at the article. That's the direction we chose for "add a link", and we think that it could be appropriate for "add an image" for the same reasons.

Single vs. Multiple
Another important design question is whether to show the user a single proposed image match, or give them multiple image matches to choose from. When giving multiple matches, there's a greater chance that one of the matches is a good one. But it also may make users think they should choose one of them, even if none of them are good. It will also be a more complicated experience to design and build, especially for mobile devices. We have mocked up three potential workflows:


 * Single: in this design, the user is given only one proposed image match for the article, and they only have to accept or reject it. It is simple for the user.
 * Multiple: this design shows the user multiple potential matches, and they could compare them and choose the best one, or reject all of them. A concern would be if the user feels like they should add the best one to the article, even if it doesn't really belong.
 * Serial: this design offers multiple image matches, but the user looks at them one at a time, records a judgment, and then chooses a best one at the end if they indicated that more than one might match. This might help the user focus on one image at a time, but adds an extra step at the end.