Growth/Positive reinforcement/cs

This page describes the Growth team's work on the "positive reinforcement" project, which is part of the Growth features. It contains key background material, interface designs, open questions, and decisions.

More updates about the Growth team's work can be found on the general Growth team updates page. Major updates will also be posted here.



Current status

 * 2021-03-01: project page created
 * 2022-02-25: project started, discussion with team members under way
 * 2022-03-01: project page expanded
 * 2022-05-11: community discussion
 * 2022-08-12: user testing completed
 * 2022-11-24: current designs and the measurement and experiment plan added
 * 2022-12-01: new Impact module released to the pilot wikis
 * Next: the next iteration of design and engineering begins with Leveling up and Personalized praise

Summary
The Growth team has focused on building a "cohesive newcomer experience" that gives newcomers "access" to the elements that help them join the Wikipedia community of practice. For instance, with suggested edits we gave newcomers easy access to contributing to Wikipedia, and with mentorship we gave them access to a mentor. Suggested edits have helped more newcomers save their first edit. Building on this success, we decided to encourage newcomers to save even more edits. This brings our attention to an element that is also important to newcomers and that nobody has worked on yet: assessment of progress. We call this project "positive reinforcement".

We want newcomers to understand that sustained contribution to Wikipedia brings progress and value, which should increase engagement among users who have taken the first step of making edits.

One of the questions we are asking is: how can we encourage newcomers who have tried their Homepage to keep editing, and so make use of their energy?

Background
When the newcomer Homepage was first deployed in 2019, it contained a basic version of the Impact module, which shows the number of views of articles the newcomer has edited. This is the only part of the Growth features that gives newcomers any feedback. We have not improved this part of the Homepage since then. With this as a starting point, we have gathered several important findings about positive reinforcement:


 * Community members have given us good feedback about the module, and experienced editors said they find it interesting and valuable.
 * Past research has shown that appreciation from other users increases user retention. One example is thanking newcomers for their edits (see here and here). Another is an experiment that ran on German Wikipedia. We believe that appreciation from real people will have a greater effect than appreciation sent by the system.
 * Community members explained that it is important to them that newcomers move on to more complex tasks once they have learned to handle the simple ones.
 * Other platforms, such as Google, Duolingo, or GitHub, use several positive reinforcement methods, such as badges or goals.
 * Communities are concerned that this project could encourage undesirable editing behavior. We have seen similar behavior in the past: when editing contests offered cash prizes, some participants saved edits that formally met the rules but were nonetheless problematic. Something similar also happens with user rights (for example, autoconfirmed status) that depend only on the number of saved edits.



User persona
There are many parts of the newcomer journey where we could try to increase retention. We could focus on newcomers who stopped editing after one or a few edits, or we could focus further along, on newcomers who stopped editing after weeks of activity. For this project, we decided to focus on newcomers who have completed their first editing session and whom we want to bring back for a second session. The diagram marks them with a yellow star.

We want to focus on newcomers at this point because it is the nearest next place where we can help increase newcomer retention. It is also a point where we currently see a significant drop-off of newcomers. If we managed to retain more newcomers at this point, it should have a significant effect on the growth of the Wikipedian community.



Research and design
We researched various mechanisms that different projects use to motivate users to contribute. These are our main research findings:


 * Wikimedians' motivation is multifaceted and changes over time. New Wikipedians are more often motivated by curiosity and social ties than by ideology.
 * Internal projects focus on intrinsic incentives, appeal to altruistic motivations, and are not systematically applied.
 * Broadening the appeal beyond ideological motivations may improve diversity of retained editors on Wikipedia.
 * Positive messages from experienced Wikipedians and mentors have been shown to be effective at increasing short-term retention.

For a summary of our current positive reinforcement designs, see our Design Brief. Our designs will continue to change as community consultations and usability tests proceed.

Ideas
When we talk about positive reinforcement, we have three main ideas in mind. For each of them, there are several specific ideas that could be implemented.

Impact

 * Impact: An improved version of the Impact module, built on statistics, graphs, and other information about the user's contributions. The new version of the Impact module would show newcomers more information about their impact on the Wikimedia project. It could also encourage newcomers to keep editing. Areas to explore include:
 * Suggested edits milestone, to nudge users to try suggested edits.
 * Statistics on how the user has edited over time (similar to the information available in XTools).
 * Number of "thanks received", to highlight the opportunity to earn appreciation from the community.
 * Recent editing activity, including the number of consecutive days on which the newcomer has edited, to encourage continued activity on the project.
 * View reading activity on articles newcomers have edited over time (similar to info on en:Wikipedia:Pageview_statistics).

Leveling up

 * Leveling up: It is important to communities that newcomers progress to more valuable tasks. We want to nudge those who complete many easy suggested edits to try more difficult tasks as well. This could happen after they complete a certain number of easy tasks, or by encouragement on their homepage. Areas to explore include:
 * After saving an edit, the newcomer sees a message that motivates them to make another edit.
 * In the Suggested Edits module, provide opportunities to do more difficult edits, so that newcomers can become more skilled editors.
 * In the Impact module, we could include a section dedicated to awards or milestones.
 * On the Homepage, add a new module with set challenges to attain some reward (badge/certificate).
 * Use notifications to encourage newcomers to try more difficult tasks.

Personalized praise

 * Personalized praise: Research shows that praise and encouragement from other users increase newcomer retention. We want to think about how to encourage experienced users to thank and award newcomers for good contributions. Perhaps mentors could be encouraged to do this on their mentor dashboards or through notifications. We can use existing communication mechanisms that past studies have shown to have a degree of positive effect. Areas to explore include:
 * A personalized message from the mentor, displayed on the newcomer's Homepage.
 * An Echo notification sent on behalf of the newcomer's mentor or the Wikimedia Foundation Growth team.
 * Thanks sent for a specific edit.
 * A new milestone badge awarded by the mentor or the Wikimedia Growth team relating to a specific edit.



Community discussion
We discussed the Positive Reinforcement project with community members from ar:ويكيبيديا:مشروع فريق النمو (التعزيز الإيجابي), bn:উইকিপিডিয়া:আলোচনাসভা, cs:Diskuse k Wikipedii:Zkušenosti nových wikipedistů/Pozitivní posílení, fr:Discussion Projet:Aide et accueil/Volontaires, and here on mediawiki.org.

We received direct feedback about the three main ideas, along with many other ideas for improving new editor retention.

Below is a summary of the main themes from the feedback, along with how we plan to iterate based on the feedback.

Other ideas:
Community members suggested several other ideas for improving newcomer engagement and retention. We think these are all valuable ideas (some of which we are already exploring or want to work on in the future) but the following ideas won't fit within the scope of the current project:
 * Send newcomers onboarding and welcome emails (the Growth team is actually currently exploring engagement emails in collaboration with the Marketing and the Fundraising teams).
 * Expose newcomers to Wikiprojects that relate to their interests.
 * Include a customizable widget on the newcomer homepage to allow wikis to promote certain newcomer tasks or events.
 * Send notifications to users who welcome newcomers once the newcomer reaches certain editing milestones (to help prompt the user to offer Thanks or Wikilove).



Usability testing
Along with community discussion, we wanted to validate and add to our initial designs and hypotheses by testing designs with readers and editors from several countries. Our design research team therefore conducted Positive Reinforcement user testing to better understand the project's impact on newcomer contribution across several different languages.

We tested several static Positive Reinforcement designs with Wikipedia readers and editors in Arabic, Spanish, and English. Alongside the Positive Reinforcement designs, we introduced data visualizations from XTools to better understand how newcomers perceive them.



User testing results

 * Make impact data actionable: Impact data was a compelling feature for participants with more experience editing, which several related to their interest in data—an unsurprising quality for a Wikipedian. For those new to editing, impact data, beyond views and basic editing activity, may be more compelling if linked to goal-setting and optimizing impact.
 * Evaluate the ideal editing interval: Across features, daily intervals seemed likely to be overly ambitious for new and casual editors. Participants also reflected on ignoring similar mechanisms on other platforms when they were unrealistic. Consider consulting usage analytics to identify “natural” intervals for new and casual editors to make goals more attainable.
 * Ensure credibility of assessments: Novice editor participants were interested in the assurance of their skills and progress that the quality score, article assessment, and badges offer. Some hoped that badges could lend credibility to their work when it is reviewed by more experienced editors. With that potential, it could be valuable to evaluate whether the assessments are meaningful measures of skill and to further explore how best to leverage them to earn community trust in newcomers.
 * Reward quality and collaboration over quantity: Both editor and reader participants from esWiki were more interested in recognition of their knowledge or expertise (quality) than the number of edits they have made (quantity). Similarly, some Arabic and English editors are motivated by their professional interests and skill development to edit. Orienting goals and rewards to other indicators of skilled edits, such as adding references or topical contributions, and collaboration or community involvement may also help mitigate concerns about competition overtaking collaboration.
 * Prioritize human recognition: While scores and badges earned through Growth tasks are potentially valued, recognition from other editors appears to be more motivating. Features that promote giving, receiving, and revisiting thanks seemed most compelling, and editors may benefit from selecting the impact data that demonstrates the engagement with readers or editors they find most compelling.
 * Experiment with playfulness of designs: While some positive reinforcement features can be seen as the product of “gamification”, some participants (primarily from esWiki) felt that simple, fun designs were overly childish or playful for the seriousness of Wikipedia. Consider experimenting with visual designs that vary in levels of playfulness to evaluate broader reactions to “fun” on Wikipedia.

Design
Below are the current designs for Positive Reinforcement. We have refined the three main ideas outlined above, but the scope of plans and the actual designs have evolved based on feedback from community discussions and user testing.

Impact
The revised impact module provides new editors with more context about their impact. The new design includes far more personalized info and data visualizations than the previous design. This new design is fairly similar to the design we shared previously when discussing this feature with communities. You can view the current engineering progress at beta wiki, and we hope to release this feature to Growth pilot wikis soon.

Leveling up
The Leveling up features focus on encouraging newcomers to progress to more valuable tasks. Ideas also include some prompts for new editors to try suggested edits, since structured tasks have been shown to improve newcomer activation and retention.
 * “Level up” post-edit dialog message: A new post-edit dialog message type is added to encourage newcomers to try a new task type. We hope this will encourage some users to learn new editing skills as they progress to different, more challenging tasks.
 * Post-edit dialog for non-suggested edits: Introduce newcomers who complete ‘normal’ edits to suggested edits. We plan to experiment by showing newcomers a prompt after their 3rd and 7th edits. Desktop users who click through to try a suggested edit will also see their Impact module, which we hope helps engage newcomers and provides a small degree of automated positive reinforcement. We will carefully measure this experiment and ensure there aren't any unintentional negative effects.
 * New notifications: New Echo notifications to encourage newcomers to start or continue suggested edits. These act as a proxy for “win-back” emails for those who have an email address and have email notifications enabled.

Personalized praise
Personalized praise features are based on research results that show that encouragement and thanks from other users increases editor retention.
 * Encouragement from Mentors: We will add a new module to the Mentor dashboard that is designed to encourage Mentors to send personalized messages to newcomers who meet certain criteria. We will allow Mentors to customize and control how and when "praise-worthy" mentees are surfaced.
 * Increasing Thanks across the wiki: We plan to fulfill the community wishlist item to Enable Thanks Button by default in Watchlists and Recent Changes (T51541, T90404). We hope this will increase Thanks and positivity across the wikis, and hopefully newcomers will benefit from this directly or indirectly.



Hypotheses
The Positive Reinforcement features aim to provide or improve the tools available to newcomers and mentors in three specific areas that will be described in more detail below. Our hypothesis is that once a newcomer has made a contribution (say by making a structured task edit), these features will help create a positive feedback cycle that increases newcomer motivation.

Below are the specific hypotheses that we seek to validate across the newcomer population. We will also have hypotheses for each of the three sets of features that the team plans to develop. These hypotheses drive the specifics for what data we will collect and how we will analyse that data.


 * 1) The Positive Reinforcement features increase our core metrics of retention and productivity.
 * 2) Since the Positive Reinforcement features do not feature a call to action that asks newcomers to make edits, we will see no difference in our activation core metric.
 * 3) Newcomers who get the Positive Reinforcement features are able to determine that making un-reverted edits is desirable, and we will see a decrease in the proportion of reverted edits.
 * 4) The positive feedback cycle created by the Positive Reinforcement features will lead to a significantly higher proportion of "highly active" newcomers.
 * 5) The Positive Reinforcement features increase the number of Daily Active Users of Suggested edits.
 * 6) The average number of edit sessions during the newcomer period (first 15 days) increases.
 * 7) "Personalized praise" will increase mentors' proactive communication with their mentees, which will lead to an increase in retention and productivity.
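Hypothesis 6 depends on counting edit sessions within the newcomer period. The sketch below shows one way such a count could be computed. It assumes that a session is a run of edits separated by gaps of at most one hour; this page does not define a session, so the `max_gap` threshold and the function name `count_edit_sessions` are illustrative assumptions, not the Growth team's actual metric definition.

```python
from datetime import datetime, timedelta


def count_edit_sessions(registered, edit_times,
                        max_gap=timedelta(hours=1),
                        period=timedelta(days=15)):
    """Count edit sessions in the newcomer period after registration.

    Assumption: a new session starts whenever the gap since the previous
    edit exceeds max_gap. The one-hour default is hypothetical.
    """
    cutoff = registered + period
    # Keep only edits inside the newcomer period, in chronological order.
    in_period = sorted(t for t in edit_times if registered <= t < cutoff)

    sessions = 0
    previous = None
    for t in in_period:
        if previous is None or t - previous > max_gap:
            sessions += 1
        previous = t
    return sessions


# Illustrative timestamps only, not real data.
registered = datetime(2022, 12, 1)
edits = [
    datetime(2022, 12, 1, 10, 0),
    datetime(2022, 12, 1, 10, 20),   # same session as the previous edit
    datetime(2022, 12, 1, 15, 0),    # new session after a long gap
    datetime(2022, 12, 3, 9, 0),     # new session on a later day
    datetime(2022, 12, 20, 9, 0),    # outside the 15-day period, ignored
]
print(count_edit_sessions(registered, edits))  # prints 3
```

Averaging this count over all newcomers in each experiment group would give the quantity that hypothesis 6 compares.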

Experiment plan
As we have done for previous Growth team projects, we want to test our hypotheses through controlled experiments (also called "A/B tests"). This will allow us to establish a causal relationship (e.g. "The Leveling Up features cause an increase in retention of xx%"), and it will allow us to detect smaller effects than if we were to give the features to everyone and analyze the effects pre/post deployment.

In this controlled experiment, a randomly selected half of users will get access to Positive Reinforcement features (the "treatment" group), and the other randomly selected half will instead get the current (September 2022) Growth feature experience (the "control" group). In previous experiments, the control group has not gotten access to the Growth features. The team has decided to move away from that (T320876), which means that the current set of features is the new baseline for a control group.
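The 50/50 split described above can be illustrated with a short sketch. It assumes a deterministic, hash-based assignment keyed on a numeric user ID. This page does not say how the assignment is actually implemented, so `assign_group` and the experiment label are hypothetical.

```python
import hashlib


def assign_group(user_id: int, experiment: str = "positive-reinforcement") -> str:
    """Assign a user to "treatment" or "control" with equal probability.

    Hypothetical sketch: hashing the experiment label together with the
    user ID gives a stable, effectively random 50/50 split without having
    to store the assignment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return "treatment" if digest[0] % 2 == 0 else "control"


# The same user always lands in the same group.
print(assign_group(12345) == assign_group(12345))  # prints True

# Over many users the two groups are roughly the same size.
groups = [assign_group(uid) for uid in range(10_000)]
print(groups.count("treatment"), groups.count("control"))
```

Including the experiment label in the hash means that a later experiment would reshuffle users rather than reuse the same halves.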

The Personalized Praise feature is focused on mentors. There is a limited number of mentors on every wiki, whereas when it comes to newcomers the number increases steadily every day as new users register on the wikis. While we could run experiments with the mentors, we are likely to run into two key challenges. First, the limited number of mentors could mean that the experiments would need to run for a long time. Second, and more importantly, mentors are well integrated into the community and communicate with each other, meaning they are likely to figure out if some have access to features that others do not. We will therefore give the Personalized Praise features to all mentors and examine activity and effects on newcomers pre/post deployment in order to understand the feature’s effectiveness.

In summary, this means we are looking to run two consecutive experiments with the Impact and Leveling up features, followed by a deployment of the Personalized Praise features to all mentors. These experiments will first run on the pilot wikis. We can extend this to additional wikis if we find a need to do that, but it would only happen after we have analyzed the leading indicators and found no concerns.

Each experiment will run for approximately one month, and for each experiment we will have an accompanying set of leading indicators that we will analyze two weeks after deployment. The list below shows what the planned experiments will be:


 * 1) Impact: treatment group gets the updated Impact module.
 * 2) Leveling up: treatment group gets both the updated Impact module and the Leveling up features.
 * 3) Personalized praise: all mentors get the Personalized praise features.

Leading indicators and plan of action
While we believe that the features we develop are not detrimental to the wiki communities, we want to be careful when experimenting with them. It is good practice to define a set of leading indicators, together with plans for what action to take if a leading indicator suggests something isn't going the way it should. We have done this for all our past experiments and do so again for the experiments we plan to run as part of this project.

Impact
Impact module interactions: We find that the proportion of newcomers who interact with the old module (6.1%) is significantly higher than for the new module (5%): $$\chi^2 = 17.5, df = 1, p \ll 0.001$$ This difference showed up early in the experiment, and we examined the data more closely to understand what was happening. One issue we identified early on was that not all interaction events were instrumented, which we subsequently resolved. Examining further, we find that many of those who get the old module click on links to the articles or the pageviews. In the new module, a graph of the pageviews is available, which removes some of the need to visit the pageview tool. As a result, we decided that no changes were needed.

Mentor module interactions: We find no significant difference in the proportion of newcomers who interact with the Mentor module. The proportion for newcomers who get the old module is 2.4%, for those who get the new module it's 2.2%. A Chi-square test finds this difference not significant: $$\chi^2 = 1.5, df = 1, p = 0.219$$
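The p-values reported for the two chi-square tests above can be sanity-checked from the statistics alone. For one degree of freedom, the chi-square statistic is the square of a standard normal variable, so the p-value is `erfc(sqrt(chi2 / 2))`. The sketch below applies that identity to the rounded statistics quoted on this page; the helper name `chi2_p_value_df1` is an illustrative choice, and small differences from the reported p-values are to be expected because the published statistics are rounded.

```python
import math


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    Uses the identity P(X > chi2) = erfc(sqrt(chi2 / 2)), which holds only
    for df = 1.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


# Statistics as quoted on this page (rounded to one decimal place).
p_impact = chi2_p_value_df1(17.5)   # Impact module interactions
p_mentor = chi2_p_value_df1(1.5)    # Mentor module interactions

print(f"Impact module: p = {p_impact:.2e}")   # far below 0.001
print(f"Mentor module: p = {p_mentor:.3f}")   # roughly 0.22, not significant
```

Both values agree with the conclusions above: the Impact module difference is significant, while the Mentor module difference is not.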

Mentor module questions: We do not see a substantial difference in the number of questions asked between the old module (269 edits) and the new module (281 edits). The proportion of newcomers who ask their mentor a question is also the same for both groups, at 1.5%.

Edits and revert rate: We do not see a substantial difference in the number of edits nor in the revert rate between the two groups measured on a per-user average basis. There are differences between the groups, but these are driven by some highly prolific editors, particularly on the mobile platform.