User:SSethi (WMF)/Sandbox/Languages onboarding/Recommendations

From mediawiki.org

Initial Recommendations

In the Initial Recommendations phase, the main goal was to fully understand the problem space. This involved identifying the issues, gathering relevant data from the Wikimedia volunteer community, and analyzing it to find the root causes. Based on this analysis, preliminary recommendations were formulated, providing a foundation for further refinement in later phases.

1. Streamlining technical infrastructure

1.1. Simplifying content creation: Community conversations highlighted several challenges faced by individuals and communities contributing to the Wikimedia Incubator, particularly non-tech-savvy users. The main issues revolve around the technical complexity of the platform, which makes editing harder than on regular wikis. Simplifying the technical infrastructure is crucial to addressing these challenges, including issues with prefixes, template importing, and content formatting.

Editing on Incubator should feel similar to editing on normal wikis, but we are far from achieving this goal.

The following key editing challenges were discussed:

  • Prefixes and URL Access: Organizing content with prefixes poses a significant challenge. While there are gadgets that hide prefixes during editing, few technical contributors have the expertise to develop and maintain these tools. Furthermore, accessing pages on Incubator for a language without an ISO code is difficult.
  • Wikidata Support and Content Translation: Incubator is disconnected from Wikidata, which hinders interwiki linking. The fastest way for contributors to create new articles seems to be translating existing articles from other languages that they understand. However, the Content Translation tool available on other language Wikipedias is absent from Incubator, so contributors often resort to translating articles manually across tabs, which makes importing content from other languages cumbersome.
  • Templates and CSS Stylesheets: Users need to copy-paste infoboxes and templates from other Wikipedias, which are overly complicated to use and maintain because they rely on other templates and stylesheets. The lack of a central repository means templates must be copied and maintained separately for each project.
  • Search Features: Improved search functionality is needed to identify code errors in templates more easily, along with language search boxes similar to those on Wikipedia.org.

Currently, creating a new wiki on Incubator requires starting from scratch, which is time-consuming. Establishing basic building blocks for new wikis could streamline this process. Instead of a shared infrastructure like Incubator, a placeholder production wiki for each language could simplify matters, especially if it is not indexed in search results until it reaches a content threshold. This would require solving complex infrastructure issues, but it's feasible and necessary. Speeding up the creation of new wikis is essential, and evaluating this idea is recommended. Overall, these challenges are solvable, and addressing them would benefit many contributors. A related task about this idea has already been filed on Phabricator.

We should forget about Incubator completely. And, find another way of starting wiki. Because of the complexities around it, it might take time to improve the technical side of it.

1.2. Language approval & site creation process: Though the conversations were largely centered around the difficulty with editing experience, concerns were raised about the inconsistency in the language approval and site setup processes, where some projects are approved quickly while others face delays.

1.3. Progress tracking: Providing a progress bar for each test project with specific goals can motivate contributors and improve the approval process by giving them insight into their project's status.

1.4. Improving templates experience: Providing basic templates can simplify the editing process on wikis. Copying infoboxes from one wiki to another involves copying modules and templates, which can be complex. Starting a test wiki on Incubator is like starting from scratch. Having foundational building blocks for new wikis, like global templates, would be helpful. Including commonly used templates such as citation, references, infoboxes, and main pages at the beginning of a project can encourage community involvement and make editing more accessible. See also: Global templates.

1.5. Improving translation processes: It would be good to explore if we can translate directly from Incubator instead of Translatewiki.net, which is currently a lengthy process. Many of us use Translatewiki.net for translations, but its interface differs from Incubator's. It requires some expertise to translate on Translatewiki.net. Providing resources like a glossary of technical terms within Incubator could help contributors translate content more effectively, especially for those unfamiliar with technical language.  

2. Exploring social pathways

On the social front, these recommendations focus on fostering community growth and inclusivity within Wikimedia projects. Additionally, they propose exploring social pathways for language onboarding, including enhancing the discoverability of Incubator, creating welcoming pages, and orienting communities to relevant Wikimedia projects.

2.1. Enhancing social infrastructure for language onboarding: Develop social infrastructure for language onboarding to make the process more accessible and inclusive. This includes enhancing the discoverability of Incubator, creating welcoming pages, orienting communities to Wikimedia projects (such as Wiktionary, Wikisource, etc.) that might be more relevant for them than creating a Wikipedia, and guiding them on how to contribute effectively. Assist contributors in understanding which articles need to be written and provide guidance for creating articles, possibly through technical solutions like automated article suggestions. Encourage diverse topic selection and improve documentation accessibility. Enhance transparency in the approval process for articles.

2.2. Empowering resource sharing: Facilitate a forum for affiliates to share resources regarding language onboarding and creation, without necessarily coordinating their efforts. Encourage the sharing of templates for translations and maintain a compact list of common pages needed for projects. Set weekly or monthly goals for page creation/translation with specific themes.

2.3. Outreach and partnerships for linguistic diversity: Conduct outreach to language communities and facilitate connections with organizations experienced in supporting the development of language tools, fonts, keyboards, and apps, such as Giellatekno (Center for Language Technology) at the University of Tromsø, Norway, and the Language Diversity Hub.

2.4. Streamlining wiki incubation: Streamline the process of adding wikis to Incubator and graduating them from Incubator by collaborating with local communities, affiliates, and other regional Wikimedia organizations. Encourage feedback and collaboration to expedite the process. Ensure that articles are not lost during the migration of wikis from Incubator to independent sites.

2.5. Guiding principles for new wikis: Provide a basic set of guidelines for new wikis, including principles like neutrality, verifiability, and basic project policies, to facilitate their establishment and operation.

2.6. Enhancing community interaction: Improve on-wiki communication channels, such as Village Pump pages, to assist newcomers in understanding how wikis function and to foster community interaction.

2.7. Increasing language committee members: Expanding the number of language committee members can expedite the project approval process, reducing delays caused by limited manpower.

Proposed Recommendations

In the Proposed Recommendations phase, the initial suggestions were presented for feedback, and stakeholders from several Wikimedia Foundation product and tech teams were closely involved in refining and enhancing these proposals.

1. Streamlining technical infrastructure

These recommendations aim to establish a streamlined technical infrastructure for creating language wikis and improving the complex processes involved in each of the distinct phases of language onboarding.

Before Incubation

1.1. Automating New Language Addition and Approval in Incubator

Description: Automate the process for requesting the addition and approval of new language creation in Incubator. This process currently entails various manual steps: understanding project principles, creating Meta and Translatewiki.net accounts, confirming language eligibility, and translating essential messages. Upon approval, which currently involves a series of labor-intensive manual tasks related to tracking, approving, and rejecting requests by Language Committee members, the site is set up in Incubator. Also see: Help:Manual.

Hypothesis: If we automate the process for requesting new language additions in Incubator, it's likely to result in an increase in the number of requests and quicker turnaround times for request approval. This streamlined, automatic approach is expected to attract interest from smaller and underrepresented community members who might be new to the Wikimedia ecosystem and may find the current request process, involving the use of templates and editing in Wikitext, complex. Additionally, a better tracking system for Language Committee members can save them some manual time and aid in improving graduation rates. The potential number of requests created in the future can be measured by examining how requests remain in discussion status before incubation on the Request for new languages page and the time taken for request resolution.

During Incubation

1.2. Providing Access to Modern Wiki Features Beyond Incubator

Description: Allow wikis to access modern wiki features by initiating them directly as production wikis. Currently, Incubator faces technical limitations and lacks many of the modern features found in other Wikipedia wikis. This deficiency leads to a poor editing experience for contributors. To run this as an experiment, a selected number of wikis (3-5) could start directly as production wikis, bypassing the Incubator entirely, and their progress can be observed over a few months to derive lessons for future planning.

Hypothesis: If we provide production wiki access to 5 new languages, with or without Incubator, we will learn whether access to a full-fledged wiki with modern features such as those available on English Wikipedia (including ContentTranslation and Wikidata support, advanced editing and search results) aids in faster editing. Ultimately, this will inform us if this approach can be a viable direction for language onboarding for new or existing languages, justifying further investigation.

Additional notes: Existing research already indicates a higher percentage of graduates for projects (e.g., Punjabi Wiktionary) that have an immediately preceding project (e.g., Punjabi Wikipedia) that has already graduated from Incubator (see graph). This indicates how subsequent language wikis may benefit from existing knowledge, experience, and familiarity with infrastructure. By observing the progress of languages over a three-month (quarterly) period, we will assess whether this results in increased monthly edits (currently, several language wikis receive fewer than one hundred edits monthly) and ultimately helps languages graduate faster (currently, the average duration for a language wiki to graduate from the Incubator is 4.4 years). We will carefully select new languages to be a part of this pilot, factoring in various criteria to be decided and considering community sensitivities. Each language will have the same start and end time. The test would be time-limited, and the final outcome would be identical for all languages joining the pilot for robust A/B testing.

1.3. Improving Editing Experience Within the Incubator

Description: Enhance the editing experience of language communities in Incubator. Technical challenges in Incubator include new contributors facing complex wikitext and bureaucratic procedures, the absence of essential features like Content Translation integration and Wikidata, and content restrictions that hinder the process, such as the lack of search functionality or of the ability to add citations. See also: List of technical challenges in developing and reviewing content in Incubator.

Several areas where improvements can be made:

  1. User interface language is always English by default. It can be changed by each user, but it's not like that on other wikis.
  2. Special:Statistics shows statistics for all the languages, which are not useful for people contributing to a particular language.
  3. Special:DeadendPages and Special:OrphanedPages show up to 5000 pages, and they are sorted alphabetically, and since every Incubator page begins with a language code, they stop listing in the middle of languages whose code begins with the letter A.
  4. Special:LintErrors cannot be filtered by language.
  5. Special:LongPages is useless because contributors would want to know the longest page in their language, and not in all of the Incubator.
  6. Maintenance categories like Category:Maintenance:Pages with broken file links are not sorted by language.
  7. GrowthExperiments don't work at all. (The developers could argue that they were originally designed for medium-sized wikis and not for small ones, but there are features there that could be useful for small wikis, too.)
  8. In Notifications (Echo), all notifications are labeled as "Incubator" and not as a specific language.
  9. Wikidata (already mentioned; exists, but with limited support)
  10. Interlanguage links (it's completely impossible to point them from other wikis to Incubator, and people sometimes ask for it. Fixing it would involve MediaWiki core, ULS, and Wikidata).
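
Several of the items above share one gap: special pages list titles for the whole Incubator rather than for a single test wiki. As a rough illustration only, the Python sketch below groups page titles by their test-wiki prefix so that a per-language view can be derived. It assumes titles follow a `<project code>/<language code>/<page name>` pattern such as `Wp/pa/Example`; the function name and the sample titles are hypothetical, and this is not how any MediaWiki special page is implemented.

```python
from collections import defaultdict


def group_by_test_wiki(titles):
    """Group page titles by their 'project/language' prefix, e.g. 'Wp/pa'."""
    groups = defaultdict(list)
    for title in titles:
        # Split into at most three parts: project code, language code, page name.
        parts = title.split("/", 2)
        if len(parts) == 3:
            prefix = f"{parts[0]}/{parts[1]}"
            groups[prefix].append(parts[2])
        else:
            # Pages without a test-wiki prefix (help pages, etc.).
            groups[""].append(title)
    return dict(groups)


# Hypothetical sample titles, for illustration only.
titles = ["Wp/aaa/Main Page", "Wp/pa/Example", "Wt/pa/Word", "Help:Manual"]
by_wiki = group_by_test_wiki(titles)
print(by_wiki["Wp/pa"])
```

With a grouping like this, a truncated alphabetical listing would no longer hide languages whose codes sort late, because each test wiki gets its own list.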

Hypothesis: If we enhance a feature related to editing in Incubator, such as integrating Content Translation or Wikidata, it could significantly improve the editing experience for contributors. Existing research suggests that Visual Editor has a higher edit completion rate than the Wikitext editor. By measuring the increase in edit activity, new editors joining, and edit completion rate, we can learn over a few months whether certain wikis have the potential to graduate faster with these changes. Ultimately, this could reduce the time spent by wikis in Incubator and balance the ratio of test projects (698) to hosted projects (326). Also, see: Research:Incubator_and_language_representation_across_Wikimedia_projects

Number of test, hosted, and closed language editions per wiki project type (December 2023 stats)
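
To make the imbalance between test and hosted projects concrete, the short Python sketch below works out the ratio from the two figures quoted in the hypothesis. It uses only those two numbers and leaves closed projects out, since no count for them is given in the text.

```python
test_projects = 698    # figure quoted in the text
hosted_projects = 326  # figure quoted in the text

# Test projects per hosted project.
ratio = test_projects / hosted_projects

# Share of hosted projects among test and hosted projects combined.
hosted_share = hosted_projects / (test_projects + hosted_projects)

print(f"{ratio:.2f} test projects per hosted project")
print(f"{hosted_share:.1%} of test and hosted projects are hosted")
```

In other words, there are roughly two test projects still in Incubator for every project that has graduated.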

After Incubation

1.4. Automate Backend Site Creation

Description: Automate the backend site creation that begins once a language is ready to graduate from the Incubator. Currently this process is complex and involves a variety of systems. It includes configuring the domain and database, installing MediaWiki, initializing the new wiki's database, and setting up Parsoid, Wikidata, etc. On comparable MediaWiki-based sites such as WBStack and Fandom, the process is automated and takes only a few minutes.

Stretch Goal: Automate site creation on the frontend, which currently is manual and involves a series of complex steps such as monitoring which wikis are ready for the next steps with site setup, submitting a request on Phabricator to create a wiki, notifying the individuals who can address this request, and initiating the wiki creation, transition, or deletion process.

Hypothesis: If we automate the site creation process to make it much easier for creators of new wikis, it will facilitate the creation of more wikis. On average, 12 wikis have been approved each year for creation from Incubator since 2006 (Incubator:Site_creation_log). Additionally, it will save creators time, and potentially many more people can take it on as a volunteer task. A small step towards testing this idea could involve fixing a minor aspect of the process (for example, fixing the currently broken AddWiki script and potentially integrating it with MediaWiki core) to assess its impact. For example, the "Last meaningful edit date" on a project can already be tracked, along with the duration between the last edits and site creation, allowing us to examine whether that time decreases. This will help us understand whether the rate of wiki creation increases because of this new process and how to further evolve the backend infrastructure.
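
The measurement mentioned above, the time between a project's last meaningful edit and its site creation, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the records, wiki names, and dates are hypothetical placeholders, not data from the site creation log.

```python
from datetime import date
from statistics import mean

# Hypothetical records: (test wiki, last meaningful edit, site creation date).
records = [
    ("wiki-a", date(2023, 1, 10), date(2023, 3, 1)),
    ("wiki-b", date(2023, 2, 5), date(2023, 2, 25)),
]


def days_to_creation(last_edit, created):
    """Number of days between the last meaningful edit and site creation."""
    return (created - last_edit).days


waits = {name: days_to_creation(edit, created) for name, edit, created in records}
average_wait = mean(waits.values())
print(waits, average_wait)
```

Comparing this average before and after a process change would show whether the wait has actually shortened.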

SWOT Analysis

This SWOT analysis evaluates the strengths, weaknesses, opportunities, and threats associated with each of the ideas under Recommendation 1: Streamlining technical infrastructure.

New Language Addition and Approval

  • Strengths: Simplifies and accelerates language addition process; reduces manual labor for Language Committee; attracts smaller language communities
  • Weaknesses: Risk of overlooking language eligibility; potential approval discrepancies; lack of ownership
  • Opportunities: Increased language addition requests; greater inclusivity for underrepresented languages
  • Threats: Potential for spam in language addition requests; need for robust validation mechanisms; requires significant resources & budget allocation

Providing Access to Modern Wiki Features Beyond Incubator

  • Strengths: Access to modern wiki features; faster editing process; selective opt-in for initial wikis; SRE team support available
  • Weaknesses: May overlook valuable incubation process; risk of neglecting community building in Incubator
  • Opportunities: Faster editing for language wikis; potential for attracting new contributors
  • Threats: Limited opportunity for incubation and testing; rapid onboarding, but retention failure

Automate Backend Site Creation

  • Strengths: Streamlines site creation; reduces manual effort and time; improves efficiency and scalability
  • Weaknesses: Requires initial investment in automation; backend integration complexity may pose challenges; lack of ownership
  • Opportunities: Faster site creation turnaround; potential for increased wiki creation
  • Threats: Technical complexities and lack of ownership may hinder implementation; requires significant resources & budget allocation

Improving Editing Experience Within the Incubator

  • Strengths: Improves experience and accessibility for new & existing contributors; encourages higher engagement
  • Weaknesses: Requires significant technical upgrades; unclear feasibility of proposed improvements; risk of investment becoming obsolete, if Idea 1 becomes a reality; lack of ownership
  • Opportunities: Improved content quality and quantity in Incubator; higher contributor retention; faster graduation of wikis; enhanced community satisfaction
  • Threats: Technical complexities and lack of ownership may hinder implementation; requires significant resources & budget allocation

Considering the analysis:

1. Automating New Language Addition and Approval in Incubator simplifies and accelerates the language addition process, attracting interest from smaller language communities. Yet, there's a risk of overlooking eligibility criteria and potential misuse.

2. Providing Access to Modern Wiki Features Beyond Incubator offers the potential for faster editing and access to modern wiki features. However, it may overlook the valuable incubation process and community building that occurs in the Incubator.

3. Improving Editing Experience Within the Incubator enhances user experience and encourages higher engagement. However, it requires significant technical upgrades and poses implementation challenges.

4. Automate Backend Site Creation streamlines the site creation process, reducing manual efforts and improving efficiency. However, it may face technical challenges during implementation and require initial investment.

2. Exploring social pathways

The design questions for idea exploration were formulated in alignment with recommendations aimed at fostering community growth, inclusivity, and linguistic diversity within Wikimedia projects. In response, ideas were generated focusing on streamlining onboarding processes, enhancing infrastructure, and fostering community engagement.

Design Questions for Idea Exploration

  • How might we raise awareness about languages not yet featured on Wikipedia, facilitating their inclusion? How can a design intervention guide potential contributors in initiating the process of adding their languages?
  • How can we streamline the process for new communities to assess their readiness for launching a Wikipedia in their language? Additionally, how might we seamlessly direct them to alternative Wikimedia projects or contribution avenues if they are not yet prepared?
  • What design strategies can we implement to encourage potential contributors to take necessary actions to add their languages to Wikipedia? Can we simplify the process through a user-friendly form, automating steps where possible, to eliminate the complexity of manual submissions? Note: Currently, the process for creating a new wiki is highly manual and lengthy. Please refer to this 10-page document for a step-by-step description of the process: New wiki creation: step-by-step process description
  • How can we enhance the user experience for new Wikipedia contributors by providing personalized guidance, tips, and tricks throughout their editing journey, similar to the Wikipedia Adventure and Growth features?
  • How might we effectively visualize the progress of a language wiki to its contributors, ensuring transparency and motivation for continued contribution? What design elements could be employed to make this progress tracking intuitive and engaging for users?

2.1. Guided-onboarding for language contributors

Description: Implement a guided onboarding system for language contributors on Wikimedia projects. Upon sign-up, new contributors are asked what kinds of contributions they plan to make to Wikimedia projects and in which languages, and are provided personalized guidance, tips, and tricks throughout their editing journey, similar to the Wikipedia Adventure and Growth features. Contributors who select a language that has one or more existing Wikimedia projects with at least a few active contributors are pointed to those projects (which can be hosted projects or still under incubation). Contributors who select a language that is not currently active or does not yet exist on Wikimedia projects are asked further questions about the types of content they plan to bring to Wikimedia and can be pointed towards the projects that match their interests, such as Commons for multimedia, Wikidata for lexemes, Wikimedia Incubator for starting a new Wikipedia or Wiktionary, or Multilingual Wikisource for bringing digitized texts online.

Hypothesis: If new contributors to Incubator receive personalized guidance based on their preferred languages and contribution types, they are likely to become long-term contributors in projects and make sustained contributions to Wikimedia projects. This new system will result in an increase in monthly edit activity among new contributors who have been onboarded in a guided way, compared to new contributors of other projects who didn’t receive guidance.

2.2. Creating language-based portals

Description: Create language-based portals that connect with different knowledge formats across Wikimedia projects (Wikipedia, Wikisource, Commons, Wikidata, Wiktionary etc.) per language. Currently, there's a lack of connectivity between content on various Wikimedia platforms, making it challenging for users to navigate seamlessly between them. Additionally, language communities often focus solely on Wikipedia, neglecting other potentially valuable projects. Inspired by Google's dashboard, which provides access to diverse products like news and payments, a similar platform could be created to facilitate navigation and integration across Wikimedia projects. This platform would also serve as a portfolio for language communities, encouraging them to explore different types of contributions beyond Wikipedia. For example, while the Wiki documentaries project might inspire topics, the language portal could emphasize participation in various Wikimedia projects.

Hypothesis: If we promote cross-pollination of knowledge across Wikimedia projects, this will help improve the accessibility of various knowledge formats, empowering communities to build knowledge ecosystems that cater to their specific needs, regardless of having a full-fledged Wikipedia in their language. A language portal will help increase the percentage of monthly edits to lesser-known Wikimedia projects in a language community.

2.3. Fostering a community of language enthusiasts and developers

Description: Help foster a community of language enthusiasts and developers by continuing to convene the ongoing language community meetings and by exploring collaborations, such as with the Indic Tech Plan (under construction), which includes various movement stakeholders. Ensure that language-related tasks are well-structured on Wikimedia Phabricator and integrate them with upcoming hackathons. Explore collaborations with other movement stakeholders such as the Language Diversity Hub, the Language Committee, affiliates, and external partners for collective action.

Hypothesis: A thriving community of language enthusiasts and developers, fostered through sustained meetings, collaboration with diverse movement stakeholders (Indic Tech Plan, Language Diversity Hub, Language Committee, affiliates, external partners), and well-defined, hackathon-integrated tasks on Wikimedia Phabricator, is expected to lead to a significant increase in collective action for language-related initiatives within the Wikimedia ecosystem. This collaborative approach will likely result in richer content and improved representation for various languages on Wikimedia platforms.

SWOT Analysis

This SWOT analysis evaluates the strengths, weaknesses, opportunities, and threats associated with each of the ideas under Recommendation 2: Exploring social pathways.

Guided-onboarding for language contributors

  • Strengths: Improved user experience; increased engagement
  • Weaknesses: Needs design resources; might not cater to all contributor needs
  • Opportunities: Volunteer recruitment; data-driven insights
  • Threats: Might be complex to implement

Creating language-based portals

  • Strengths: Improves accessibility; encourages cross-pollination
  • Weaknesses: Technical complexity; user adoption
  • Opportunities: Opportunity for innovation
  • Threats: Long-term sustainability

Fostering a community of language enthusiasts and developers

  • Strengths: Increased collaboration with movement stakeholders
  • Weaknesses: Dependence on volunteers; need for technical expertise to help volunteer developers
  • Opportunities: Volunteer skill development; data-driven insights
  • Threats: Unforeseen technical challenges; volunteer burnout