Growth/Personalized first day/Structured tasks/Add a link

This page describes the 1>Special:MyLanguage/Growth|Growth team's work on the "structured tasks" project, which is related to the "2>Special:MyLanguage/Growth/Personalized first day/Newcomer tasks|newcomer tasks" and "3>Special:MyLanguage/Growth/Personalized first day/Newcomer homepage|newcomer homepage" projects. This page contains major assets, designs, open questions, and decisions. Most incremental updates on progress will be posted on the general 1>Special:MyLanguage/Growth/Growth team updates|Growth team updates page, with some large or detailed updates posted here.

Design
While the "structured task sketch" section above contains some quick initial sketches to demonstrate the idea behind structured tasks, this section contains our current design thinking. To look into the full set of thinking around designs for the "add a link" structured task, , which contains background, user stories, and initial design concepts.

Comparative review
When we design a feature, we look into similar features in other software platforms outside of the Wikimedia world. These are some highlights from comparative reviews done in preparation for Android’s suggested edits feature, which remain relevant for our project.


 * Task types – are divided into five main types: Creating, Rating, Translating,  Verifying content created by others (human or machine), and Fixing content created by others.
 * Visual design & layout – incentivizing features (stats, leaderboards, etc) and onboarding is often very visually rich, compared to pared back, simple forms to complete short edits. Gratifying animations often compensate for lack of actual reward.
 * Incentives – Most products offered intangible incentives grouped into: Awards and ranking (badges) for achieving set milestones, Personal pride and gratification (stats), or Unlocking features (access rights)
 * Users motivations – those with more altruistic motivations (e.g., help others learn) are more likely to be incentivized by intangible incentives than those with self-interested motivations (e.g., career/financial benefits)
 * Personalization/Customization – was used in some way on most apps reviewed. The most common customization was via surveys during account creation or before a task; and geolocalization used for system-based personalization.
 * Guidance – Almost all products reviewed had at least basic guidance prior to task completion, most commonly introductory ‘tours’. In-context help was also provided in the form of instructional copy, tooltips, step-by-step flows,  as well as offering feedback mechanisms (ask questions, submit feedback)

Initial wireframes
After organizing our thoughts and doing background research, the first visuals in the design process are "wireframes". These are simply meant to experiment and display some of the ideas we think could work well in a structured task workflow. For full context around these wireframes, see the.

Mobile mockups: August 2020
Translate this section

Our team discussed the wireframes from the previous section. We considered what would be best for the newcomers, taking into account the preferences expressed by community members, and thinking about engineering constraints. In August 2020, we took the next step of creating mockups, meant to show in more detail what the feature might look like. These mockups (or similar versions) will be used in team discussions, community discussions, and user tests. One of the most important things we thought about with these mockups is the concern we heard consistently from community members during talk>Talk:Growth/Personalized first day/Structured tasks|the discussion: structured tasks may be a good way to introduce newcomers to editing, but we also want to make sure they can find and use the traditional editing interfaces if they are interested.

We have mockups for two different design concepts. We're not necessarily aiming to choose one design concept or the other. Rather, the two concepts are meant to demonstrate different approaches. Our final designs may contain the best elements from both concepts:


 * Concept A: the structured task edit takes place in the Visual Editor. The user can see the whole article, and switch out of "recommendation mode" into source or visual editor mode.  Less focused on adding the links, but easier access to the visual and source editors.
 * Concept B: the structured task edit takes place in its own new area. The user is shown only the paragraph of the article that needs their attention, and can go edit the article if they choose.  Fewer distractions from adding links, but more distant access to the visual and source editors.

Please note that the focus in this set of mockups is on the user flow and experience, not on the words and language. Our team will go through a process to determine the best way to write the words in the feature and to explain to the user whether a link should be added.



Static mockups

To view these design concepts, we recommend viewing the full set of slides below.



Interactive prototypes

You can also try out the "interactive prototypes" that we're using for live user tests. These prototypes, for Concept A and for Concept B, show what it might feel like to use "add a link" on mobile. They work on desktop browsers and Android devices, but not iPhones. Note that not everything is clickable -- only the parts of the design that are important for the workflow.

 Essential questions 

In discussing these designs, our team is hoping for input on a set of essential questions:


 * 1) Should the edit happen at the article (more context)?  Or in a dedicated experience for this type of edit (more focus, but bigger jump to go use the editor)?
 * 2) What if someone wants to edit the link target or text?  Should we prevent it or let them go to a standard editor?  Is this the opportunity to teach them about the visual editor?
 * 3) We know it’s essential for us to support newcomers discovering traditional editing tools. But when do we do that? Do we do it during the structured task experience with reminders that the user can go to the editor? Or periodically at completion milestones, like after they finish a certain number of structured tasks?
 * 4) Is "bot" the right term here? What are some other options? "Algorithm", "Computer", "Auto-", "Machine", etc.?"   What might better help convey that machine recommendations are fallible and the importance of human input?

Mobile user testing: September 2020
Background

During the week of September 7, 2020, we used usertesting.com to conduct 10 tests of the mobile interactive prototypes, 5 tests each of Concepts A and B, all in English. By comparing how users interact with the two different approaches at this early stage, we wanted to better understand whether one or the other is better at providing users with good understanding and ability to successfully complete structured tasks, and to set them up for other kinds of editing afterward. Specific questions we wanted to answer were:


 * Do users understand how they are improving an article by adding wikilinks?
 * Do users seem like they will want to cruise through a feed of link edits?
 * Do users understand that they're being given algorithmic suggestions?
 * Do users make better considerations on machine-suggested links when they have the full context of the article (like in Concept A)?
 * Do users complete tasks more confidently and quickly in a focused UI (like in Concept B)?
 * Do users feel like they can progress to other, non-structured tasks?

Key findings


 * The users generally were able to exhibit good judgment for adding links. They understood that AI is fallible and that they have to think critically about the suggestions.
 * While general understanding of what the task would be ("adding links") was low at first, they understood it well once they actually started doing the task. Understanding in Concept B was marginally lower.
 * Concept B was not better at providing focus. The isolation of excerpts in many cases was mistaken for the whole article. There were also many misunderstandings in Concept B about whether the user would be seeing more suggestions for the same term, for the same article, or for different articles.
 * Concept A better conveyed expectations on task length than Concept B. But the additional context of a whole article did not appear to be the primary factor of why.
 * As participants proceed through several tasks, they become more focused on the specific link text and destination, and less on the article context. This seemed like it could lead to users making weak decisions, and this is a design challenge. This was true for both Concepts A and B.
 * Almost every user intuitively knew they could exit from the suggestions and edit the article themselves by tapping the edit pencil.
 * All users liked the option to view their edits once they finished, either to verify or admire them.
 * “AI” was well understood as a concept and term. People knew the link suggestions came from AI, and generally preferred that term over other suggestions. This does not mean that the term will translate well to other languages.
 * Copy and onboarding needs to be succinct and accessible in multiple points. Reading our instructions is important, but users tended not to read closely. This is a design challenge.

Outcome


 * We want to build Concept A for mobile, but absorbing some of the best parts of Concept B's design. These are the reasons why:
 * User tests did not show advantages to Concept B.
 * Concept A gives more exposure to rest of editing experience.
 * Concept A will be more easily adapted to an “entry point in reading experience”: in addition to users being able to find tasks in a feed on their homepage, perhaps we could let them check to see if suggestions are available on articles as they read them.
 * Concept A was generally preferred by community members who commented on the designs, with the reason being that it seemed like it would help users understand how editing works in a broader sense.
 * We still need to design and test for desktop.

Ideas

The team had these ideas from watching the user tests:


 * Should we consider a “sandbox” version of the feature that lets users do a dry run through an article for which we know the “right” and “wrong” answers, and can then teach them along the way?
 * Where and when should we put the clear door toward other kinds of editing?  Should we have an explicit moment at the end of the flow that actively invites them copyedit or do another level task?
 * It’s hard to explain the rules of adding a link before they try the task, because they don't have context. How might we show them the task a little bit, before they read the rules?
 * Perhaps we could onboard the users in stages?  First they learn a few of the rules, then they do some links, then we teach them a few more pointers, then they do more links?
 * Should users have a cooling-off period after doing lots of suggestions really fast, where we wait for patrollers to catch up, so we can see if the user has been reverted?

Desktop mockups: October 2020
After designing, testing, and deciding on Concept A for mobile users, we moved on to thinking about desktop users. We again have the same question around Concepts A and B. The links below open interactive prototypes of each, which we are using for user testing.


 * Concept A: the structured task takes place at the article, in the editor, using some of the existing visual editor components. This gives users greater exposure to the editing context and may make it more likely that they explore other kinds of editing tasks.
 * Concept B: the structured task takes place on the newcomer homepage, essentially embedding the compact mobile experience into the page. Because the user doesn't have to leave the page, this may encourage them to complete more edits. They could also see their impact statistics increase as they edit.

We are user testing these designs during the week of October 23. See below for mockups showing the main interaction in each concept.

Link recommendation algorithm
See this page for an explanation of the link recommendation algorithm and for statistics around its accuracy.. In short, we believe that users will experience an accuracy around 75%, meaning that 75% of the suggestions they get should be added. It is possible to tune this number, but the higher the accuracy is, the fewer candidate link we will be able to recommend. After the feature is deployed, we can look at revert rates to get a sense of how to tune that parameter.

Link recommendation service backend
To follow along with engineering progress on the backend "add link" service, please see this page on Wikitech.