Growth/Personalized first day/Newcomer tasks/ja

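This page describes the Growth team's work on the "newcomer tasks" project, which is a specific project under the larger "Personalized first day" initiative. This page contains major assets, designs, and decisions. Most incremental updates on progress are posted on the general Growth team updates page, with some large or detailed updates posted here.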

You can get an overview of what the team is building by looking at the prototypes (use arrow keys to navigate):


 * Desktop
 * Mobile
 * Topic selection

Work on this project began on July 24, 2019. The first version was deployed to four target wikis on November 20, 2019.

Current status

 * 2019-07-24: first meeting to discuss newcomer tasks
 * 2019-08-27: team meeting on design concepts
 * 2019-09-09: began tracking development work in Phabricator
 * 2019-09-23: desktop user tests completed
 * 2019-09-30: mobile user tests completed
 * 2019-11-20: V1.0 deployed to Czech, Korean, Arabic, and Vietnamese Wikipedias
 * 2019-12-13: first variant test ("initiation") deployed to Czech, Korean, Arabic, and Vietnamese Wikipedias
 * 2020-01-14: testing the addition of topic matching, to be deployed the week of 2020-01-20.
 * 2020-01-21: the option to select topics of interest was added to the suggested edits module
 * 2020-03-05: topic matching upgraded to use ORES models
 * 2020-04-03: results from first variant test
 * Next: engineering work on adding guidance

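Summary
We think that new users should have every opportunity to be successful when they first arrive at the wiki. But frequently, newcomers attempt a task that is too difficult for their current skills, cannot find a task they want to do, or make a first edit and then cannot find a way to keep going. As a result, many of them leave and do not come back. Past attempts to recommend tasks to editors have worked well, so we believe the newcomer homepage is a suitable place to recommend tasks suited to newcomers.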

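'''We have to keep several things in mind:'''
 * Many newcomers arrive with something specific they want to do, such as adding a photo to a particular article. We do not want to get in the way of that goal.
 * Newcomers build up their editing skills by progressing step by step from easier tasks to harder ones.
 * Newcomers who succeed early on are more motivated to keep editing.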

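With these things in mind, we want to recommend tasks that arrive at the right place and time for each newcomer, match their interests, and teach the skills they need to edit successfully.

The welcome survey is one tool that can help us choose tasks suited to each newcomer. It was originally built so that each newcomer could have a personalized experience. The survey collects each newcomer's goals on the wiki and their topics of interest, and we plan to make it possible to use those answers as optional input when recommending tasks.

One of the biggest challenges is choosing the right tasks for newcomers. There are plenty of existing sources, such as maintenance templates on articles, recommendations in the Content Translation tool, or suggestions from tools like Citation Hunt. The question is which of these options will help each newcomer accomplish their goals.

Initially we will focus on recommending tasks through the newcomer homepage, but in the longer term we could build new features that extend recommendations into the editing experience and help newcomers complete their tasks.

Also in the longer term, we may look at connecting task recommendations with other parts of the newcomer experience, such as the impact module on the homepage or the help panel.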

Why this idea is prioritized
We know from research and experience that many newcomers fail early in their editing journey for one of these reasons:


 * They arrive with a very challenging edit in mind, such as writing a new article or adding an image. Those tasks are difficult enough that they likely fail and don't return.
 * They arrive without knowing what to edit, and can't find any edits to make.

We also know that on the newcomer homepage, the most frequently clicked-on module is the "user page" module -- the only thing on the page that encourages users to start editing. This makes us think that many users are looking for a clear way to get started with editing.

And from past Wikimedia endeavors, we've seen that task recommendations can be valuable. SuggestBot is a project that sends personalized recommendations to experienced users, and is a well-received service. The Content Translation tool also serves personalized recommendations based on past translations, and has been shown to increase the volume of editing.

For all these reasons, we think that recommending specific editing tasks for newcomers will give them a clear way to get started. For those newcomers that have an edit in mind that they want to do, we'll encourage them to try some easy edits first to build up their skills. For those newcomers who do not have a specific preference on what to edit, they'll hopefully find some good edits from this feature.

Glossary
''There are many terms that sound similar and can be confusing. This section defines each of them.''


 * "Newcomer tasks"
 * The entire workflow that recommends edits for newcomers and guides them through the edits.


 * "Suggested edits"
 * The name of the specific module that the newcomer tasks workflow adds to the newcomer homepage.


 * "Task recommendations" or "Task suggestions"
 * Lists of articles that need editing work, suggested automatically to users.


 * "Personalized"
 * Software that adapts automatically to each user to fit their needs.


 * "Customized"
 * Software that the user adapts to fit their needs.


 * "Topic"
 * A content subject, such as "Art", "Music", or "Economics".


 * "Topic matching"
 * The ability to find tasks for newcomers that match their topics of interest.


 * "Guidance"
 * Features that help the newcomer complete the suggested task while they are working on it.


 * "Maintenance template"
 * Templates that are put on articles indicating that work needs to be done on them.

Recommending tasks
The core challenge of this project is: Where will the tasks come from and how will we give the right ones to the right newcomers?

The graphic below shows our priorities when recommending tasks to newcomers.

As shown in the graphic above, we would give newcomers tasks that...


 * ...arrive at the right time and place for a newcomer's journey.
 * ...teach relevant conceptual and technical skills.
 * ...gradually guide users to build up their editing abilities.
 * ...are personalized to their interests.
 * ...show them the value and impact of editing.
 * ...motivate them to participate continually.

For instance, we do not want to give newcomers tasks that are irrelevant to what they hope to accomplish. If a newcomer wants to write a new article, then asking them to add a title description will not teach them skills they need to be successful.

We're splitting this challenge into two parts: sourcing the tasks and topic matching.

Sourcing the tasks
There are many different places we could find tasks for newcomers to do. Our team listed as many as we could think of and evaluated them for whether they seem to be achievable for the first version of the feature. Below is a table showing the many sources of tasks that we evaluated in coming to the decision to start by using maintenance templates.

Version 1.0: basic workflow
In version 1.0, we will deploy the basic parts of the newcomer tasks workflow. It will recommend articles to newcomers that require different types of edits, but it will not match the articles to the newcomers' topics of interest (version 1.1), and it will also not guide the newcomers in completing the task (version 1.2).

Maintenance templates
We are starting by using maintenance templates and categories to identify articles that need work. All of our target wikis use some set of maintenance templates or categories on thousands of articles, tagging them as needing copyediting, references, images, links, or expanded sections. Previous task recommendation software, such as SuggestBot, has used them successfully. These are some examples of maintenance categories:


 * Articles needing links in Arabic Wikipedia
 * Articles needing copyediting in Korean Wikipedia
 * Articles needing references in Czech Wikipedia
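As a rough illustration of how maintenance templates could be turned into a pool of tasks, here is a minimal Python sketch. The template names and task type labels are illustrative assumptions (each wiki uses its own local templates), and this is not the team's actual implementation.

```python
# Hypothetical mapping from maintenance template names to task types.
# Real template names differ from wiki to wiki.
TEMPLATE_TO_TASK = {
    "Copy edit": "copyedit",
    "Unreferenced": "references",
    "Underlinked": "links",
    "Image requested": "images",
    "Expand section": "expand",
}


def task_types_for(article_templates):
    """Return the sorted task types implied by an article's templates."""
    return sorted(
        {TEMPLATE_TO_TASK[t] for t in article_templates if t in TEMPLATE_TO_TASK}
    )


def build_task_pool(articles):
    """Group article titles by the type of work they need.

    `articles` maps an article title to the list of templates found on it.
    Articles without any known maintenance template are left out.
    """
    pool = {}
    for title, templates in articles.items():
        for task in task_types_for(templates):
            pool.setdefault(task, []).append(title)
    return pool
```

Counting the entries per task type in such a pool is one way to check whether a wiki has enough tasks of each kind.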



In this Phabricator task, we investigated exactly which templates are present and in what quantities, to get a sense of whether there will be enough tasks for newcomers. There seem to be sufficient numbers for the initial version of this project. We are likely to incorporate other task sources from the table below in future versions.

It's also worth noting that it could be possible to supplement many of these maintenance templates with automation. For instance, it is possible to automatically identify articles that have no internal links, or articles that have no references. This is an area for future exploration.
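The kind of automation mentioned above could be sketched as follows. This is a simplified assumption-laden example that inspects raw wikitext with regular expressions; the function names are hypothetical, and a production check would more likely rely on parsed link and reference data than on pattern matching.

```python
import re

# Matches [[Target]] or [[Target|label]]. Links whose target starts with a
# Latin-letter prefix followed by a colon (e.g. [[File:...]], [[Category:...]])
# are skipped. This is a simplification: it also skips article titles that
# happen to contain such a prefix, and misses localized namespace names.
INTERNAL_LINK = re.compile(r"\[\[(?![A-Za-z]+:)[^\[\]|]+(?:\|[^\[\]]*)?\]\]")

# Matches <ref>, <ref name="...">, and <ref/>, but not <references />.
REF_TAG = re.compile(r"<ref[\s>/]", re.IGNORECASE)


def needs_links(wikitext):
    """True if the wikitext contains no internal article links."""
    return INTERNAL_LINK.search(wikitext) is None


def needs_references(wikitext):
    """True if the wikitext contains no <ref> tags."""
    return REF_TAG.search(wikitext) is None
```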

During the week of October 21, 2019, the members of the Growth team did a hands-on exercise in which we attempted to edit articles with maintenance templates. This helped us understand what challenges we can expect newcomers to face, and gave us ideas for addressing them. Our notes and ideas are published here.

Comparative review
Our team's designer reviewed the way that other platforms (e.g. TripAdvisor, Foursquare, Amazon Mechanical Turk, Google Crowdsource, Reddit) offer task recommendations to newcomers. We also reviewed Wikimedia projects that incorporate task recommendations, such as the Wikipedia Android app and SuggestBot. We think there are best practices we can learn from other software, especially when we see the same patterns across many different types of software. Even as we incorporate ideas from other software, we will still make sure to preserve Wikipedia's unique values of openness, clarity, and transparency. The main takeaways are below, and the full set of takeaways is on this page:


 * Task types – bucket into 4 types: Rating content, Creating content, Moderating/Verifying content, Translating content
 * Incentives – Most products offered intangible incentives mainly bucketed into the form of: Awards and ranking (badges), Personal pride and gratification (stats), or Unlocking features (access rights)
 * Reward incentives – promote badges or attainments of specific milestones (e.g., a badge for adding 50 citations)
 * Personalization/Customization – Most have at least one facet of personalization/customization. The most common customization is user input on surveys upon account creation or before a task; the most common system-based personalization is geolocalization.
 * Visual design & layout – incentivizing features (stats, leaderboards, etc.) and onboarding are visually rich compared to the pared back, simple forms used to complete short edits.
 * Guidance – Almost all products reviewed had at least basic guidance prior to task completion, most commonly introductory ‘tours’. In-context help was also provided in the form of instructional copy, tooltips, and step-by-step flows, as well as feedback mechanisms (ask questions, submit feedback).

Mockups
Our evolving designs can always be found in two sets of interactive mockups (use arrow keys to navigate):
 * Desktop
 * Mobile

Those mockups contain explorations of all the different parts of the user journey, which we have broken down into several parts:


 * 1) Gathering information from the newcomer: learning what we need in order to recommend relevant tasks.
 * 2) Feature discovery: the way the newcomer first encounters task recommendations.
 * 3) Task recommendations: the interface for filtering and choosing tasks.
 * 4) Guidance during editing: once the newcomer is doing a task, the guidance that helps them understand what to do.
 * 5) User feedback: ways in which the newcomer can indicate that they are not satisfied with the recommended task.
 * 6) Next edit: how we continue the user's momentum after they save an edit.

Below are some of the original draft design concepts as the team continues to refine our approach.

Desktop
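During the week of September 30, 2019, we ran six tests of the desktop newcomer tasks prototype on usertesting.com, with internet users not affiliated with the Wikimedia movement. Paid respondents used the prototype, described aloud what they observed, and answered questions about their experience. The full results are in this Phabricator task. The goals of this test were: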


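 * 1) Gauge how discoverable the newcomer tasks module is.
 * 2) Identify usability improvements for the task module.
 * 3) Do users understand how to select and review suggested articles?
 * 4) Do users understand that they can filter by interest and task difficulty?
 * 5) Do users understand how to edit a suggested article?
 * 6) Gauge user reactions to the recommendations and their expectations of guidance while doing a task.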


Summary of findings


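 * All respondents felt that receiving recommendations based on their topics of interest was reasonable and meaningful.
 * Likewise, all respondents reacted positively to tasks having different difficulty levels.
 * The overall usability of the suggested edits module was rated very high. Respondents understood what to click to see more articles, anticipated that filters would change the topic or task level, and understood that clicking a card opens the suggested edit.
 * Four of six respondents did not initially understand that the "See suggested edits" button would help them reach their goal of writing a new article. This suggests a common mental model in which "creating a new article" and "editing" are separate things.
 * All respondents clearly understood that the start module was the starting point. In addition, many focused on the "See suggested edits" button as the way to proceed from the start module.
 * Users read the explanations of adding topics and of task levels in the introductory dialogs, and clearly understood and anticipated that suggested articles would be shown based on them.
 * Everyone was able to select popular topics and add their own topics without problems.
 * Everyone understood the purpose of the suggested edits module.
 * Two respondents mistakenly believed that they could not create a new article until they had finished the easy and medium tasks.
 * Once in editing mode, five of six respondents understood that they could press the help panel button when they wanted guidance.
 * Four respondents expected to be able to contact a mentor from the help panel.
 * Several respondents said the task tips were not helpful at all.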


Recommendations


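 * Improve the copy explaining that creating new content is also a form of editing, and educate users on this beforehand.
 * Update the impact module in line with the test results to help users understand suggested edits.
 * Provide content help that is useful during editing. This is important for users attempting an edit.
 * Introduce a "checklist" that users update themselves in the help panel's task tips.
 * Add a short description of what to do.
 * Tell users that they do not need to copyedit the entire article.
 * Showing real-time search results makes users more interested in editing suggested articles and encourages them to use filters to find articles that suit them.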

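Mobile
During the week of September 30, 2019, we ran six tests of the mobile newcomer tasks prototype on usertesting.com. The full results are in this Phabricator task. The goals were the same as for the desktop test, with an added item on how the mobile experience should differ from the desktop one. Mobile testers were given a scenario in which they wanted to add an image to Wikipedia (the desktop scenario was creating a new article).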

Summary of findings


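 * Almost all users said that the (redesigned) start module gave clear, step-by-step guidance on how to get started.
 * Adding the "suggested edits" module as shown below did not cause confusion, but it did not convey to users that they could open it for help partway through their task of adding an image.
 * Suggested edits were very effective at motivating use, and participants understood and used each element (filters, seeing more articles, and so on). However, suggested edits were seen only as something to do for learning or to pass the time.
 * Some respondents wanted more specific, fine-grained topics than the broad ones offered.
 * Providing in-depth information is educational but also risks being overwhelming. All respondents were surprised that "adding an image" was classified as a difficult task and, to varying degrees, expressed dissatisfaction.
 * The ability to filter by interest was very well received.
 * Toward the end of the test, three respondents said that users should have to pass a "check" or prerequisite on "easy" tasks before moving on to medium or hard tasks.
 * All respondents understood that suggested edits are editing tasks meant to teach users how to edit, and said they realized that some edits are difficult.
 * All users struggled to make use of the guidance provided in the help panel while editing. This needs serious attention before we start building.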

Recommendations


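 * Integrate suggested edits into the start module rather than giving it its own card.
 * Improve the copy and onboarding images to convey that suggested edits are worth more than just editing practice, that task difficulty is only a guide, and that tasks can be done in any order.
 * Add an overlay that leads each user to personalized suggested edits.
 * Add real-time result counters for both tasks and topics.
 * Allow users to search for fine-grained topics of interest.
 * When a user opens a suggested edit, emphasize that it is a real edit with real impact.
 * Redesign the in-task help panel so that all the help content it offers is clearly visible.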

Version 1.1: topic matching
Past research and development shows that users are more likely to do recommended tasks if the tasks match their topical interests. SuggestBot uses an editor's past editing history to find similar articles, and those intelligent results are shown in this paper to be acted on more often than random results. The Content Translation tool also recommends articles based on a user's previous translation history, and those recommendations have increased the translation volume.

In looking at the usage of V1.0 of newcomer tasks, which does not contain topic matching, we see that there are users who navigate through many suggested articles, and end up clicking on none. There are also users who navigate through many, and end up editing only the ones they happen to find that belong to a certain topic, such as medicine. These are also good indicators that topics can be valuable to help newcomers find articles they want to edit.

Our challenge with newcomers is a "cold start problem": newcomers do not have any edit history that we could use to find relevant articles for them to edit. We want an algorithm that identifies the topic of each article, and we want to use that to filter the articles that have maintenance templates.
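The filtering step described above could look roughly like this in Python. The function name and data shapes are assumptions made for illustration: `article_topics` stands in for the output of whichever topic algorithm is used, and the topic labels are examples only.

```python
def filter_tasks_by_topic(tasks, article_topics, chosen_topics):
    """Keep tasks whose article has at least one of the chosen topics.

    tasks: list of (title, task_type) pairs drawn from articles that
        carry maintenance templates.
    article_topics: dict mapping a title to a set of topic labels
        (hypothetical output of a topic algorithm).
    chosen_topics: set of topics the newcomer selected. An empty set
        means no topic filter is applied.
    """
    if not chosen_topics:
        return list(tasks)
    return [
        (title, task_type)
        for title, task_type in tasks
        if article_topics.get(title, set()) & chosen_topics
    ]
```

Because the newcomer states their topics directly, this approach does not need any edit history, which is what makes it usable in a cold start situation.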

Algorithms


There are multiple approaches with which we might find articles that match a user's stated topic of interest. While our team identified many, we built prototypes for three methods and tested them:


 * morelike: assign a seed list of articles that represent each topic area (e.g. "Art" might be represented by the articles for "Painting", "Sculpture", "Dance", and "Weaving".) Use that seed list to find other articles that are similar to those in the seed list by using a similarity algorithm called "morelike".
 * free text: instead of choosing from a set list of topics, allow newcomers to type in any phrase they want to indicate a topic. Use regular Wikipedia search to surface articles relevant to that phrase.
 * ORES: a machine learning service that, among other things, can return a predicted topic for any article. Though this prediction service only works in English Wikipedia, there are ways to translate predictions from English to other wikis.
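To make the morelike approach more concrete, the sketch below builds a search request that uses the `morelike:` keyword of Wikipedia's search through the standard MediaWiki search API (`action=query`, `list=search`). The helper names are hypothetical, and the source does not say whether the prototype queried search in exactly this way, so treat this as an assumption.

```python
from urllib.parse import urlencode


def morelike_search_params(seed_titles, limit=10):
    """Build API parameters for a morelike search seeded by article titles."""
    if not seed_titles:
        raise ValueError("at least one seed article is required")
    return {
        "action": "query",
        "list": "search",
        # morelike: takes one or more page titles separated by "|"
        "srsearch": "morelike:" + "|".join(seed_titles),
        "srlimit": str(limit),
        "format": "json",
    }


def morelike_search_url(api_url, seed_titles, limit=10):
    """Return the full request URL for the given wiki API endpoint."""
    return api_url + "?" + urlencode(morelike_search_params(seed_titles, limit))
```

For the "Art" example above, the seed list would be `["Painting", "Sculpture", "Dance", "Weaving"]`, and the search results would then be intersected with the articles that have maintenance templates.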

In this Phabricator task, we evaluated the three methods, and decided to proceed with the ORES model. The Growth team worked with the Scoring team to strengthen the model, and with the Search team to make the model predictions available to the newcomer tasks workflow. During the time that this work was happening, we deployed the somewhat worse-performing morelike algorithm, and switched to the ORES model about a month later.

The ORES model we use now offers 64 topics, and we chose to expose 39 of them to newcomers. The evaluation in four different languages showed that on average, 8.5 out of 10 suggestions for a given topic seem like good matches for that topic.

Design
In designing interfaces that allow newcomers to choose topics of interest, these are some of the considerations:
 * How to make a long list of about 30 topics not overwhelming to the user?
 * How to handle multiple layers of topics (e.g. if "Science" has sub-topics of "Biology", "Chemistry", etc.)
 * Whether users can give feedback when a topic does not match what they selected?

These mockups contain our current designs for this interface. You can navigate with your keyboard's arrow keys. Below are some images of the mockups:

Version 1.2: guidance
After newcomers have selected an article from the suggested edits module, they should receive guidance about how to click edit and complete the edit successfully. While it is exciting that some portion of newcomers are completing suggested edits without guidance, we're confident that by adding guidance, we will substantially increase how many newcomers edit.

We have decided to repurpose the help panel as the place to deliver this guidance. Reusing the help panel will allow us to build quickly. The guidance contains three phases:


 * 1) When the user has arrived on the article and before they click edit.
 * 2) After clicking edit and before saving an edit.
 * 3) After saving an edit.

Some of the ideas we are considering implementing include:


 * Guidance tailored to each type of edit, varying depending on whether the suggested edit is a copyedit, adding links, adding references, etc.
 * Reminder that an edit can be small, and that the user does not have to edit the whole article.
 * Step-by-step walkthrough that is like a checklist for completing the edit.
 * Highlighting the maintenance templates in the article so that the user can see why the article was suggested.
 * An indicator that encourages the user to click the edit button.
 * A place to put videos that demonstrate how to complete the edit.
 * Suggestions for additional edits after saving the initial edit.
 * Ability for the user to notify their mentor that they have done an edit, so the mentor can check their work and thank them.

During the last week of December 2019, we user tested desktop and mobile prototypes, which can be found below. We will post the user test results after assembling them.


 * Desktop prototype
 * Mobile prototype

Below are some images of the prototype:

Variant testing
After deploying the first version of newcomer tasks, we want to start testing different variants of the feature, so that we can improve it iteratively. Rather than just having one design of newcomer tasks, and seeing if newcomers are more productive with it than without it, we plan to test more than one variant of newcomer tasks at a time, and compare them. We have compiled an exhaustive list of all the ideas of variants to test, but we will only end up testing perhaps 10 per year, because of the effort and time it takes to build, test, and analyze.

In March, April, and May 2020, we'll be testing variants that aim to get more users into the newcomer tasks flow.

See this page for the list of variant tests and their results.

Usage counts
As of 2020-01-13, 10,835 distinct users have visited their homepage since the deployment of newcomer tasks on 2019-11-20. The tables below show how far into the newcomer tasks workflow those users progressed. We see that generally one sixth of users who visit their homepage interact with the suggested edits module. Of those, most of them click a task. Most surprisingly, we see many users clicking on tasks and even going all the way through saving an edit (166 users saving about 400 edits). This is surprising because the feature does not yet contain topic matching (which would make the tasks more appealing to users) or guidance (which would help them understand how to save edits).

The second table shows percentages of the raw numbers in the first table.
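As a small worked example of how such percentages are derived, the sketch below uses only the two exact user counts quoted in the text (10,835 homepage visitors and 166 users who saved a suggested edit) plus the approximate figure of 400 edits. The step labels and function name are illustrative, and the intermediate funnel steps are omitted because their raw numbers are not given here.

```python
def funnel_percentages(counts):
    """Express each funnel step as a percentage of the first step."""
    steps = list(counts.items())
    base = steps[0][1]
    return {name: round(100 * n / base, 1) for name, n in steps}


counts = {
    "visited homepage": 10835,
    "saved a suggested edit": 166,
}
percentages = funnel_percentages(counts)

# "About 400 edits" is an approximate figure, so this ratio is approximate too.
edits_per_saving_user = round(400 / 166, 1)
```

With these inputs, roughly 1.5% of homepage visitors saved a suggested edit, and each of those users saved about 2.4 edits on average.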

Quality of edits
The Growth team's ambassadors have gone through over 300 edits saved by newcomers and marked whether or not each edit was productive (meaning that it improved the article). We are happy to see that about 75% of the edits are productive. This is similar to the baseline rate for newcomer edits, and we're glad that this feature has not encouraged vandalism. Most of the edits are copyedits, with many also adding links, and some even adding content and references. About a third of users who make one suggested edit go on to make additional suggested edits, and many also go on to make edits that are not suggested by the feature, which is behavior we are happy to see.

The quality of the edits is encouraging to the team, and we will keep improving the feature so that more newcomers try this workflow and continue through to completing an edit.