Growth/Positive reinforcement/pt-br

This page describes work on "positive reinforcement" as part of the Growth feature set. This page contains major assets, designs, open questions, and decisions.

Most incremental updates on progress will be posted on the general Growth team updates page, with some large or detailed updates posted here.



Current status

 * 2021-03-01: project page created
 * 2022-02-25: project started with team discussions
 * 2022-03-01: project page expanded
 * 2022-05-11: community discussion
 * 2022-08-12: user testing completed
 * 2022-11-24: current designs and measurement and experiment plan added
 * 2022-12-01: new Impact module released to pilot wikis
 * 2023-02-07: Leveling up and Personalized praise work started, and second community discussion started
 * 2023-02-14: published analysis that will help guide the Leveling up work
 * 2023-03-22: Leveling up features released as an A/B test at Growth pilot wikis
 * 2023-03-24: published Thanks Usage analysis
 * 2023-05-25: released Personalized praise module on Growth pilot wikis
 * Next: release of the Personalized praise module

Summary
The Growth team has been focused on building a “cohesive newcomer experience” that gives newcomers the “access” they need to the elements that help them join Wikipedia's practices. For example, with newcomer tasks, we gave them access to opportunities to participate, and with the mentorship module, we gave them access to mentorship. Suggested edits have led more newcomers to make their first edits. With that success, we want to take steps to encourage newcomers to keep making more edits. This draws our attention to an undeveloped element that newcomers need access to: evaluating performance. We’re calling this project “positive reinforcement”.

We want newcomers to understand there is progression and value to sustained contributions on Wikipedia, increasing retention for those users who took the first step in making an edit.

Our big question here is: How might we encourage newcomers who have visited our homepage and tried our features to keep editing and build on their momentum?

Background
When the newcomer homepage was deployed in 2019, it contained a basic "impact module", which listed the number of pageviews for the pages the newcomer had edited. This is the only part of the Growth features that gives newcomers a sense of their impact, and we have not improved it since it was first deployed. With that as a starting point, we have gathered some important learnings:


 * We have heard good feedback from community members about the module, with experienced editors saying it is interesting and valuable to them.
 * Appreciation from other users has been shown to increase retention, such as in the case of "thanks" (here and here) and in an experiment on German Wikipedia. We believe that these reinforcements from real people would be more effective than automated ones coming from the system.
 * Community members have explained that it is a high priority for newcomers to move on to more valuable tasks after starting with easy ones, rather than getting stuck doing only easy tasks.
 * Other platforms, such as Google, Duolingo, and GitHub, use several mechanisms, such as badges and goals.
 * Communities are wary of incentivizing unhealthy editing. We have seen that when editing contests offer cash prizes, or even just when useful roles like "extended confirmed" depend on edit counts, this can encourage people to make many problematic edits.

User persona
There are many parts of the newcomer journey where we could try to increase retention. We could focus on those who stopped editing after only one or a few edits, or we could focus on newcomers who stopped editing after weeks of activity. For this project, we decided to focus on newcomers who have completed their first editing session and whom we want to return for a second session. The diagram illustrates them with a yellow star.

We want to focus on newcomers at this stage because it is the next stage of the funnel at which we can help improve retention. It is also where we currently see very significant attrition, so if we can help retain newcomers at this point, there should be a significant impact over time.



Research and design
Research was conducted on the various mechanisms that have been used to encourage people to contribute content, both on and off wiki. The following are some of the key findings from the research:


 * Motivations for Wikipedia editors are multifaceted and change with time and experience. New editors are often driven more by curiosity and social connection than by ideology.
 * Internal projects focus on intrinsic incentives, appeal to altruistic motivations, and are not systematically applied.
 * Broadening the appeal beyond ideological motivations may improve editor diversity on Wikipedia.
 * Positive messages from experienced users and mentors have been shown to be effective for short-term retention.

For a summary of the current design ideas, see this page. Our designs will evolve further through community feedback and several rounds of user testing.

Ideas
We have three main ideas. We may pursue several ideas as we work on this project.

Impact

 * Impact: A redesign of the Impact module based on incorporating statistics, graphs, and other contribution information. The revised impact module would give new editors more context about their impact, as well as encourage them to keep contributing. Areas for exploration include:
 * Suggested edits milestone, to nudge users to try suggested edits.
 * Statistics on how much the user has edited over time (similar to what is in XTools).
 * A count of “thanks received”, to highlight the ability to receive recognition from the community.
 * Recent editing activity, including consecutive days on which newcomers have edited, to encourage continued engagement or remind people to restart their contributions.
 * View reading activity on articles newcomers have edited over time (similar to info on en:Wikipedia:Pageview_statistics).



Leveling up

 * Leveling up: It is important to communities that newcomers progress to more valuable tasks. For those who complete many easy tasks, we want to encourage them to try harder tasks. This could happen after they complete a certain number of easy tasks, or through encouragement on their homepage. Areas for exploration include:
 * Newcomers will see post-edit success messages that motivate them to make more edits of the same or different difficulty levels.
 * In the Suggested edits module, provide opportunities to make harder edits, so that newcomers can become more skilled editors.
 * In the Impact module, include a milestone counter or awards area.
 * On the homepage, add a new module with defined challenges for earning some reward (a badge or certificate).
 * Add notifications that prompt newcomers to try a harder task.



Personalized praise

 * Personalized praise: research shows that praise and encouragement from other users increase newcomer retention. We want to think about how to encourage experienced users to thank and reward newcomers for good contributions. Perhaps mentors could be encouraged to do this from their mentor dashboards or through notifications. We can use existing communication mechanisms that previous studies have shown to have some degree of positive effect. Areas for exploration include:
 * A personal message from the newcomer's mentor appearing on the homepage.
 * A notification from the mentor or the Wikimedia Growth team.
 * “Thanks” on a specific edit.
 * A new milestone badge awarded by the mentor or the Wikimedia Growth team related to a specific edit.



Community discussion
We discussed the project with community members on ar:ويكيبيديا:مشروع فريق النمو (التعزيز الإيجابي), bn:উইকিপিডিয়া:আলোচনাসভা, cs:Diskuse k Wikipedii:Zkušenosti nových wikipedistů/Pozitivní posílení, fr:Discussion Projet:Aide et accueil/Volontaires, and here on mediawiki.org.

We received direct feedback on the three main ideas, along with many other ideas for improving new editor retention.

Below is a summary of the main themes of the feedback, along with how we plan to iterate based on it.

Impact


Personalized praise


Other ideas
Community members suggested several other ideas for improving newcomer engagement and retention. We think all of these are valuable ideas (some of which we are already exploring or want to work on in the future), but the following ideas do not fit within the scope of the current project:
 * Send newcomers onboarding and welcome emails (the Growth team is actually currently exploring engagement emails in collaboration with the Marketing and the Fundraising teams).
 * Expose newcomers to WikiProjects related to their interests.
 * Include a customizable widget on the newcomer homepage to allow wikis to promote certain newcomer tasks or events.
 * Send notifications to users who welcome newcomers once the newcomer reaches certain editing milestones (to help prompt the user to offer Thanks or Wikilove).

Second community consultation:
In February 2023, we completed a community consultation in which we reviewed the most recent Leveling up designs with the Growth Pilot wikis. This consultation was completed in English on MediaWiki, and at Arabic Wikipedia, Bengali Wikipedia, Czech Wikipedia, and Spanish Wikipedia. (T328356) In general, feedback was quite positive. These two tasks help address feedback from those who responded to our questions:


 * Leveling up: Community configuration (T328386)
 * Leveling up: Second design iteration of "Try a new task" dialog (T330543)

In March 2023, we completed a community consultation in which we reviewed the most recent Personalized praise designs with the Growth Pilot wikis. This consultation was completed on English Wikipedia, Arabic Wikipedia, Bengali Wikipedia, Czech Wikipedia, French Wikipedia, Spanish Wikipedia, and at MediaWiki in English. (T328356) Most feedback was supportive of Personalized praise features, but several further improvements were requested. We've created Phabricator tasks to address these further improvements.


 * On Arabic Wikipedia, and other wikis with Flagged Revisions, mentors want to see not only the number of edits a user had completed, but more details on the review status of edits (T333035)
 * Mentors want to be able to view the number or percentage of reverts their mentee has, and customize how many reverts a newcomer can have to be considered praiseworthy (T333036)
 * Mentors would appreciate knowing which edit a mentee is Thanked for (T51087)

User testing
Along with community discussion, we wanted to validate and add to our initial designs and hypotheses by testing designs with readers and editors from several countries. Our design research team therefore conducted Positive Reinforcement user testing, which aimed to better understand the project's impact on newcomer contribution across several different languages.

We tested several static Positive Reinforcement designs with Wikipedia readers and editors in Arabic, Spanish, and English. Along with testing Positive Reinforcement designs, we introduced data visualizations from XTools to better understand how newcomers perceive them.



User testing results

 *  Make impact data actionable:  Impact data was a compelling feature for participants with more experience editing, which several related to their interest in data—an unsurprising quality for a Wikipedian. For those new to editing, impact data, beyond views and basic editing activity, may be more compelling if linked to goal-setting and optimizing impact.
 *  Evaluate the ideal editing interval:  Across features, daily intervals seemed likely to be overly ambitious for new and casual editors. Participants also reflected on ignoring similar mechanisms on other platforms when they were unrealistic. Consider consulting usage analytics to identify “natural” intervals for new and casual editors to make goals more attainable.
 *  Ensure credibility of assessments:  Novice editor participants were interested in the assurance of their skills and progress that the quality score, article assessment, and badges offer. Some hoped that badges could lend credibility to their work when it is reviewed by more experienced editors. With that potential, it could be valuable to verify that the assessments are meaningful measures of skill and to further explore how best to leverage them to garner community trust of newcomers.
 *  Reward quality and collaboration over quantity:  Both editor and reader participants from esWiki were more interested in recognition of their knowledge or expertise (quality) than the number of edits they have made (quantity). Similarly, some Arabic and English editors are motivated by their professional interests and skill development to edit. Orienting goals and rewards to other indicators of skilled edits, such as adding references or topical contributions, and collaboration or community involvement may also help mitigate concerns about competition overtaking collaboration.
 *  Prioritize human recognition:  While scores and badges via Growth tasks are potentially valued, recognition from other editors appears to be more motivational. Features which promote giving, receiving, and revisiting thanks seemed most compelling, and editors may benefit from selecting the impact data that demonstrates the engagement with readers or editors they find most compelling.
 *  Experiment with playfulness of designs:  While some positive reinforcement features can be seen as the product of “gamification”, some participants (primarily from EsWiki) felt that simple, fun designs were overly childish or playful for the seriousness of Wikipedia. Consider experimenting with visual designs that vary in levels of playfulness to evaluate broader reactions to “fun” on Wikipedia.

Design
Below are the current designs for Positive Reinforcement. We have refined the three main ideas outlined above, but the scope of plans and the actual designs have evolved based on feedback from community discussions and user testing.

Impact
The revised impact module provides new editors with more context about their impact. The new design includes far more personalized info and data visualizations than the previous design. This new design is fairly similar to the design we shared previously when discussing this feature with communities. You can view the current engineering progress at beta wiki, and we hope to release this feature to Growth pilot wikis soon.

Leveling up
The Leveling up features focus on encouraging newcomers to progress to more valuable tasks. Ideas also include some prompts for new editors to try suggested edits, since structured tasks have been shown to improve newcomer activation and retention.
 * “Level up” post-edit dialog message: A new post-edit dialog message type is added to encourage newcomers to try a new task type. We hope this will encourage some users to learn new editing skills as they progress to different, more challenging tasks.
 * Post-edit dialog for non-suggested edits: Introduce newcomers who complete ‘normal’ edits to suggested edits. We plan to experiment by showing newcomers a prompt after their 3rd and 7th edits. Desktop users who click through to try a suggested edit will also see their Impact module, which we hope helps engage newcomers and provides a small degree of automated positive reinforcement. We will carefully measure this experiment and ensure there aren't any unintentional negative effects.
 * New notifications: New echo notifications to encourage newcomers to start or continue suggested edits. This acts as a proxy to “win-back” emails for those who have an email address and settings on to receive email notifications.

Personalized praise
Personalized praise features are based on research results that show that encouragement and thanks from other users increase editor retention.
 * Encouragement from Mentors: We will add a new module to the Mentor dashboard, that is designed to encourage Mentors to send personalized messages to newcomers who meet certain criteria. We will allow Mentors to customize and control how and when "praise-worthy" mentees are surfaced.
 * Increasing Thanks across the wiki: We plan to fulfill the community wishlist item to Enable Thanks Button by default in Watchlists and Recent Changes (T51541, T90404). We hope this will increase Thanks and positivity across the wikis, and hopefully newcomers will benefit from this directly or indirectly.

Hypotheses
The Positive Reinforcement features aim to provide or improve the tools available to newcomers and mentors in three specific areas that will be described in more detail below. Our hypothesis is that once a newcomer has made a contribution (say by making a structured task edit), these features will help create a positive feedback cycle that increases newcomer motivation.

Below are the specific hypotheses that we seek to validate across the newcomer population. We will also have hypotheses for each of the three sets of features that the team plans to develop. These hypotheses drive the specifics for what data we will collect and how we will analyse that data.


 * 1) The Positive Reinforcement features increase our core metrics of retention and productivity.
 * 2) Since the Positive Reinforcement features do not feature a call to action that asks newcomers to make edits, we will see no difference in our activation core metric.
 * 3) Newcomers who get the Positive Reinforcement features are able to determine that making un-reverted edits is desirable, and we will see a decrease in the proportion of reverted edits.
 * 4) The positive feedback cycle created by the Positive Reinforcement features will lead to a significantly higher proportion of "highly active" newcomers.
 * 5) The Positive Reinforcement features increase the number of Daily Active Users of Suggested edits.
 * 6) The average number of edit sessions during the newcomer period (first 15 days) increases.
 * 7) "Personalized praise" will increase mentors' proactive communication with their mentees, which will lead to an increase in retention and productivity.

Experiment plan
As we have done for previous Growth team projects, we want to test our hypotheses through controlled experiments (also called "A/B tests"). This will allow us to establish a causal relationship (e.g. "The Leveling Up features cause an increase in retention of xx%"), and it will allow us to detect smaller effects than if we were to give it to everyone and analyze the effects pre/post deployment.

In this controlled experiment, a randomly selected half of users will get access to Positive Reinforcement features (the "treatment" group), and the other randomly selected half will instead get the current (September 2022) Growth feature experience (the "control" group). In previous experiments, the control group has not gotten access to the Growth features. The team has decided to move away from that (T320876), which means that the current set of features is the new baseline for a control group.
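
The 50/50 split described above can be illustrated with a minimal sketch. It assumes a hash-based bucketing scheme, which is one common way to make a random-looking assignment stable for each user; the source does not say how the Growth features actually assign users, and the function name `assign_group` and the `salt` value are hypothetical.

```python
import hashlib


def assign_group(user_id: int, salt: str = "positive-reinforcement-demo") -> str:
    """Assign a user to "treatment" or "control" with roughly equal odds.

    Hypothetical sketch: hashing the user ID with an experiment-specific
    salt gives a stable assignment, so the same user always lands in the
    same group for the whole experiment.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    # The parity of the first digest byte acts as a fair coin flip.
    return "treatment" if digest[0] % 2 == 0 else "control"


if __name__ == "__main__":
    groups = [assign_group(user_id) for user_id in range(10_000)]
    print("treatment share:", groups.count("treatment") / len(groups))
```

Changing the salt for each experiment keeps assignments in consecutive experiments independent of each other.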

The Personalized Praise feature is focused on mentors. There is a limited number of mentors on every wiki, whereas when it comes to newcomers the number increases steadily every day as new users register on the wikis. While we could run experiments with the mentors, we are likely to run into two key challenges. First, the limited number of mentors could mean that the experiments would need to run for a long time. Second, and more importantly, mentors are well integrated into the community and communicate with each other, meaning they are likely to figure out if some have access to features that others do not. We will therefore give the Personalized Praise features to all mentors and examine activity and effects on newcomers pre/post deployment in order to understand the feature’s effectiveness.

In summary, this means we are looking to run two consecutive experiments with the Impact and Leveling up features, followed by a deployment of the Personalized Praise features to all mentors. These experiments will first run on the pilot wikis. We can extend this to additional wikis if we find a need to do that, but it would only happen after we have analyzed the leading indicators and found no concerns.

Each experiment will run for approximately one month, and for each experiment we will have an accompanying set of leading indicators that we will analyze two weeks after deployment. The list below shows what the planned experiments will be:


 * 1) Impact: treatment group gets the updated Impact module.
 * 2) Leveling up: treatment group gets both the updated Impact module and the Leveling up features.
 * 3) Personalized praise: all mentors get the Personalized praise features.

Leading indicators and plan of action
While we believe that the features we develop are not detrimental to the wiki communities, we want to make sure we are careful when experimenting with them. It is good practice to define a set of leading indicators together with plans for what action to take if a leading indicator suggests something isn't going the way it should. We have done this for all our past experiments and do so again for the experiments we plan to run as part of this project.

Impact
 Impact module interactions:  We find that the proportion of newcomers who interact with the old module (6.1%) is significantly higher than for the new module (5%): $$\chi^2 = 17.5, df = 1, p \ll 0.001$$ This difference showed up early in the experiment, and we examined the data more closely to understand what was happening. One issue we identified early on was that not all interaction events were instrumented, which we subsequently resolved. Examining further, we find that many of those who get the old module click on links to the articles or the pageviews. In the new module, a graph of the pageviews is available, which removes some of the need to visit the pageview tool. As a result, we decided that no changes were needed.

 Mentor module interactions:  We find no significant difference in the proportion of newcomers who interact with the Mentor module. The proportion for newcomers who get the old module is 2.4%, for those who get the new module it's 2.2%. A Chi-square test finds this difference not significant: $$\chi^2 = 1.5, df = 1, p = 0.219$$
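
As a rough illustration of the tests reported above, the following sketch computes a Pearson chi-square statistic for two proportions and converts a statistic with one degree of freedom into a p-value. The group sizes used in the example are made up, since the source does not report them, so the computed statistic will not match the reported values. Whether the original analysis applied a continuity correction is also unknown, and none is applied here.

```python
import math


def chi_square_2x2(a_yes: int, a_total: int, b_yes: int, b_total: int) -> float:
    """Pearson chi-square statistic (no continuity correction) comparing
    the proportion of "yes" outcomes in groups A and B.

    Assumes every expected cell count is positive.
    """
    n = a_total + b_total
    yes_total = a_yes + b_yes
    no_total = n - yes_total
    cells = [
        (a_yes, a_total, yes_total),
        (a_total - a_yes, a_total, no_total),
        (b_yes, b_total, yes_total),
        (b_total - b_yes, b_total, no_total),
    ]
    chi2 = 0.0
    for observed, row_total, col_total in cells:
        expected = row_total * col_total / n
        chi2 += (observed - expected) ** 2 / expected
    return chi2


def p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with df = 1."""
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Hypothetical counts: 10,000 newcomers per group, 6.1% vs 5% interacting.
    stat = chi_square_2x2(610, 10_000, 500, 10_000)
    print(f"chi2 = {stat:.2f}, p = {p_value_df1(stat):.5f}")
    # The statistics reported in the text map to p-values as follows.
    print(f"chi2 = 17.5 -> p = {p_value_df1(17.5):.6f}")
    print(f"chi2 = 1.5  -> p = {p_value_df1(1.5):.3f}")
```

With one degree of freedom the chi-square statistic is the square of a standard normal variable, which is why the p-value reduces to a single `math.erfc` call.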

 Mentor module questions:  We do not see a substantial difference in the number of questions asked between the old module (269 edits) and the new module (281 edits). The proportion of newcomers who ask their mentor a question is also the same for both groups, at 1.5%.

 Edits and revert rate:  We do not see a substantial difference in the number of edits nor in the revert rate between the two groups measured on a per-user average basis. There are differences between the groups, but these are driven by some highly prolific editors, particularly on the mobile platform.

Levelling up
 Levelling up post-edit dialog interactions:  We find a higher proportion of newcomers interacting with the post-edit dialog in the Levelling Up group (90.8%) compared to the standard post-edit dialog (86.5%). This is largely driven by mobile where the Levelling Up interaction proportion (88%) is a lot higher than the other group (81.6%). The proportion is still higher for the Levelling Up group on desktop (93.6%) compared to the control (92.2%), but we regard it as "virtually identical" because the high proportion in the control group means there is little room for an increase.

 Try a suggested edit click through rates:  21.9% of newcomers who see the "Try a suggested edit" post-edit dialog choose to click through, which is significantly higher than the threshold set. The proportion is higher on desktop (24%) than on mobile (19.7%), but in neither case is there a reason for concern.

 Increase your skill level click through rates:  We find that 73.1% of newcomers who see the "increase your skill level" dialog click through to see the new task, which is a lot higher than our expected threshold of less than 10%. Proportions are high on both desktop (71.1%) and mobile (77.3%).

 Get started click through rates:  3.8% of newcomers who get the "Get started" notification click through to the Homepage. Users who registered on desktop are more likely to click the notification (5.5%) than those on mobile (2.5%). Because the overall rate is below the 5% threshold, we are investigating further to understand this difference between desktop and mobile behaviour, and in particular whether our 5% threshold is reasonable.

 Keep going click through rates:  We find that 9.6% of users who get the "Keep going" notification click through to the Homepage. As with the "Get started" notifications, we find a much higher proportion on desktop (16.2%) compared to mobile (4.7%). Our investigations into differences in notification behaviour by platform will hopefully give us more insight into this difference.

 Activation:  We find a decrease in constructive article activation (making a non-reverted article edit within 24 hours of registration): 27% compared to 27.7%. As soon as we noticed this, we opened T334411 to investigate the issue, focusing on patterns in geography (countries and wikis) and technology (devices and browsers). We did not find clear patterns explaining the issue. This decrease in activation will be investigated further in T337320.

Personalized praise
Data was gathered on 2023-06-13, from the four pilot wikis where the feature is deployed (Arabic Wikipedia, Bengali Wikipedia, Czech Wikipedia, and Spanish Wikipedia).

Personalized praise notification click through: Although this is still a relatively small sample, results seem healthy and show that Mentors are indeed receiving notifications and clicking through to view their praise-worthy mentees.

Personalized praise mentor dashboard module click through: Only 27.5% of Mentors are clicking through to a mentee's talk page. However, this is to be expected, since some of the mentees we surface may not deserve praise. Based on this data and feedback from Mentors, the Growth team will pursue the following tasks to help improve this feature:


 * Add revert scorecard to Personalized praise module on Mentor dashboard (T337510)
 * Exclude blocked accounts from the Personalized praise suggestions (T338525)

Experiment Results
Following a review of the experiment's leading indicators, a WMF data scientist meticulously analyzed each component of the experiment in relation to the main KPIs of the Growth team:


 * Constructive activation is defined as a newcomer making their first edit within 24 hours of registration, and that edit not being reverted within 48 hours.
 * Activation is defined in the same way as constructive activation, but without the non-revert requirement.
 * Constructive retention is defined as a newcomer coming back on a different day in the two weeks after constructive activation and making another edit, with said edit also not being reverted within 48 hours.
 * Retention is defined in the same way as constructive retention, but without the non-revert requirements.
 * Constructive edit volume is the overall count of edits made in a user's first two weeks, with edits that were reverted within 48 hours removed.
 * Revert rate is the proportion of edits that were reverted within 48 hours out of all edits made. This is by definition 0% for users who made no edits, and we generally exclude these users from the analysis.
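
To make some of these definitions concrete, here is a minimal sketch of how they could be computed from one user's edit history. It follows one literal reading of the definitions above. The `Edit` record and all function names are hypothetical, and the `reverted_within_48h` flag is assumed to be precomputed. The definition of revert rate does not state a time window, so the sketch uses whatever edits it is given.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Edit:
    timestamp: datetime
    reverted_within_48h: bool  # hypothetical, assumed to be precomputed


def is_activated(registered: datetime, edits: List[Edit]) -> bool:
    """Activation: the first edit is made within 24 hours of registration."""
    if not edits:
        return False
    first = min(edits, key=lambda e: e.timestamp)
    return first.timestamp - registered <= timedelta(hours=24)


def is_constructively_activated(registered: datetime, edits: List[Edit]) -> bool:
    """Constructive activation: activated, and the first edit was not reverted."""
    if not is_activated(registered, edits):
        return False
    first = min(edits, key=lambda e: e.timestamp)
    return not first.reverted_within_48h


def constructive_edit_volume(registered: datetime, edits: List[Edit]) -> int:
    """Count of non-reverted edits made in the user's first two weeks."""
    cutoff = registered + timedelta(days=14)
    return sum(
        1 for e in edits if e.timestamp <= cutoff and not e.reverted_within_48h
    )


def revert_rate(edits: List[Edit]) -> float:
    """Share of edits reverted within 48 hours; 0.0 for users with no edits.

    Callers may prefer to exclude users with no edits from the analysis.
    """
    if not edits:
        return 0.0
    return sum(e.reverted_within_48h for e in edits) / len(edits)


if __name__ == "__main__":
    registered = datetime(2023, 1, 1, 12, 0)
    history = [
        Edit(registered + timedelta(hours=2), reverted_within_48h=True),
        Edit(registered + timedelta(days=3), reverted_within_48h=False),
        Edit(registered + timedelta(days=20), reverted_within_48h=False),
    ]
    print(is_activated(registered, history))
    print(is_constructively_activated(registered, history))
    print(constructive_edit_volume(registered, history))
    print(revert_rate(history))
```

Retention and constructive retention are left out to keep the sketch short; they would additionally check for a later edit on a different day within the two weeks after activation.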

Impact module experiment results
As noted in our leading indicators, we initially noticed that the new Impact module seemed to correlate with a slight decrease in activation on mobile. This was quite surprising as the empty state for the old Impact module was nearly identical to the empty state of the new Impact module. We immediately followed up with several investigations to track down why this could be happening: T342150.

We restarted experiment data collection after making several small changes, and we now see that activation is identical between the experiment and control group, which is what we would expect.

We are pleased to have received positive feedback from new editors regarding the new Impact module. However, we have found that the Impact module alone hasn't resulted in significant changes in newcomer retention, edit volume, or revert rates.

Our next experiment will combine the new Impact module with the Leveling up features. We hope that this combination of Positive Reinforcement features will lead to substantial improvements in activation, retention, and edit volume. We will soon publish a detailed report that highlights the outcomes of this experiment.