Outreachy/Past projects

From mediawiki.org

This page tracks the current status of all past Outreach Program for Women/Outreachy projects.

See also Google Summer of Code/Past projects.

Quantitative summary of past Outreachy projects

Completed Outreachy projects since 2013:

In the 18 Outreachy rounds between 2013 and 2021, contributors joined from 22 countries: India, United States, Brazil, United Kingdom, Sri Lanka, Canada, Israel, Romania, Germany, Turkey, Cameroon, Kenya, Nigeria, Vietnam, Taiwan, Nepal, Bangladesh, Russia, Malaysia, Uganda, France, Pakistan.

Create a Ruby Gem to analyze Wikidata Statistics

  • Mentee: Sulagna Saha
  • Mentors: Sage Ross, Will Kent
  • Outcome: Published a gem that parses the differences between Wikidata revisions and extracts statistics about the changes. It enables accurate analysis of Wikidata edits, such as counting the number of claims, qualifiers, references, aliases, labels, descriptions and site links added, removed, and changed. The gem is integrated into the Programs and Events Dashboard and deployed.
  • Tech Stack: The gem is written in Ruby.
  • Relevant links: Wikidata-diff-analyzer, GitHub repo for the gem, Integration
  • Blog: Sulagna's Blog
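
The kind of revision comparison described in the outcome can be illustrated with a minimal sketch. This is not the gem's Ruby implementation or its API: the function name `count_label_changes` is hypothetical, and labels are assumed to be a plain mapping from language code to label text, which is a simplification of the real Wikidata entity format.

```python
def count_label_changes(old_labels, new_labels):
    """Count labels added, removed, and changed between two revisions.

    Both arguments map a language code to a label string. This is a
    simplified, assumed stand-in for the labels of a Wikidata entity.
    """
    # Languages present only in the new revision were added.
    added = [lang for lang in new_labels if lang not in old_labels]
    # Languages present only in the old revision were removed.
    removed = [lang for lang in old_labels if lang not in new_labels]
    # Languages present in both revisions with different text were changed.
    changed = [
        lang for lang in new_labels
        if lang in old_labels and old_labels[lang] != new_labels[lang]
    ]
    return {"added": len(added), "removed": len(removed), "changed": len(changed)}


# Illustrative data only, not taken from a real revision history.
old_revision = {"en": "Douglas Adams", "fr": "Douglas Adams", "de": "Douglas Adams"}
new_revision = {"en": "Douglas Noel Adams", "fr": "Douglas Adams", "es": "Douglas Adams"}
print(count_label_changes(old_revision, new_revision))
```

The same added/removed/changed comparison would apply to the other parts of an item listed above (aliases, descriptions, site links), with claims, qualifiers and references needing a nested comparison.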

Content Translation language imbalances

  • Mentee: Nathaly Toledo.
  • Mentors: Adam Wight, Kavitha Appakayala, Jan Dittrich.
  • Outcome: Two research questions were solved and one was advanced. The research questions and related reasoning were:
    • What is the content being translated the most? What patterns can be found? Try finding a dataset that will let you know what articles lack translations (calculate an average), and classify them to understand patterns that could lead to an answer.
    • RQ 3.1: What is the effect of MT availability on translation flow? Let’s consider three distinct types of events changing MT availability: enabling MT where there was none, changing the default MT engine, and disabling MT. The question explores whether MT availability leads to more translations being started or submitted because of the added convenience, and whether those translations are less likely to be deleted soon after being created.
    • Do users prefer to translate content in their native language(s)? If so, what influences this behavior? The question is based on the assumption that the strongest communities also correspond to the larger languages, and these communities tend to be under the “self-focus” bias, which prompts them to create and translate content in their first language first (and about their own culture first). It also assumes that the more comfortable someone is with their level in a language, the more likely they are to translate in it.
  • Tech stack: Python, Jupyter Notebooks.
  • Relevant links:

Content Translation language imbalances

  • Mentee: Abhishek Bhardwaj.
  • Mentors: Adam Wight, Kavitha Appakayala, Jan Dittrich.
  • Outcome: We developed a reusable, editable Python package that extracts data from the Wikimedia database. The current version of the package contains modules to extract language proficiency data of translators from all Wikipedia versions. We also cached the data that is costly to generate (saving hours of run time for anyone who wishes to use it). We analyzed the generated data to find trends in user activity, to examine the relation between translation imbalances and translator proficiency, and to identify the optimal language pair for translation for each user based on their self-reported language proficiency.
  • Tech Stack: Python, PAWS, MariaDB.
  • Relevant Links: Research Page, GitHub Repository

Develop a web app for editing Toolhub records

  • Mentees: Nicole Barnabee-Burns, Hannah Waruguru Njoroge
  • Mentors: Slavina Stefanova, Damilare Adedoyin
  • Outcome: Over the course of the internship, we developed a full-stack web application that could be used to improve discoverability of other Wikimedia tools. The tool identifies gaps in the Toolhub records of other tools, and presents a user-friendly interface for filling in the missing information.
  • Tech stack: The application was built with Vue.js on the front-end and Flask on the back-end, and is connected to a MariaDB database. Task queuing is handled by Celery, with Redis as a broker.
  • Relevant links: Toolhunt, Phabricator workboard, Frontend repository, Backend repository
  • Blog: Nicole's blog, Hannah's blog

Hybrid event production for QueeringWikipedia 2023

  • Mentee: André Rodrigues
  • Mentors: Željko Blaće, Owen Blacker
  • Outcome: After investigating various FLOSS options and considering time commitments, we decided to use Zoom for regular meetings, Jitsi for unconference style sessions, and BigBlueButton for workshops and explanatory sessions. In addition, I conducted outreach and held office hours to promote the event during the internship period.
  • Relevant links: Phabricator page
  • Blog: André's blog

Develop features for Wiki Loves Monuments App

Develop a web app for patrolling based on the new ML-based service to predict reverts

Rewrite Imagebulk tool to scale up

  • Mentee: Enow97
  • Mentors: Jay Prakash and Sudhanshu
  • Outcome: The project involved rewriting the existing web app codebase using Vue.js and Flask, along with integrating Celery to improve the scalability and performance of the system. The resulting system will be able to handle large volumes of traffic and complex user interactions while remaining responsive and efficient. Although the code has been written, deployment is still pending and will be handled by the mentor (Jay Prakash).
  • Tech stack:
  1. Vue.js on the front-end
  2. Flask on the back-end
  3. Task queuing in Celery along with Redis as the broker
  4. Docker

Add support for tracking specific namespaces to Programs & Events Dashboard

  • Student: Vaidehi Atpadkar
  • Mentor: Sage Ross
  • Outcome: The Dashboard now supports selecting specific wiki namespaces for tracking and displays the stats for them.
  • Relevant links: source code
  • Blog: Vaidehi's Blog

Build Python library to work with html-dumps

  • Student: Nazia Tasnim
  • Mentors: Martin Gerlach, Isaac Johnson
  • Outcome: mwparserfromhtml, a Python library to parse the Wikipedia HTML dumps.
  • Relevant links: source code
  • Blog: Nazia's Blog

What's in a name? Automatically identifying first and last author names for Wikicite and Wikidata

Automatically matching new Wikipedia articles with Wikidata items using Python

Automatically matching new Wikipedia articles with Wikidata items using Python

Develop learning toolkits and videos to demonstrate the use of essential tools for Wikimedia

Improve Wikidata support on Programs & Events Dashboard

  • Student: Ivana Novakovic-Lekovic
  • Mentors: Sage Ross
  • Outcome: Integrated Wikidata edit analysis into the Dashboard’s data update system; it now reports details of Wikidata edits, including merges, aliases, labels, claims, and more.
  • Relevant links: source code
  • Blog: Ivana's Blog

Refactor MediaWiki tests to use WebdriverIO Async

  • Student: Osama Tahir
  • Mentors: Soham Parekh, Željko Filipin
  • Outcome: Refactored MediaWiki tests in a wide range of extensions (such as Math, Newsletter, VisualEditor) to use WebdriverIO Async.
  • Relevant links: source code
  • Blog: Osama's Blog

WikiNav

  • Student: Muniza A.
  • Mentors: Martin Gerlach and Isaac Johnson
  • Outcome: Developed WikiNav, a tool that processes the Wikipedia clickstream data to generate statistics and visualizations that help make this data more accessible to folks with varying levels of programming and data wrangling experience.
  • Relevant links: Phabricator task, demo application
  • Blog: Muniza's Blog

Developing mwsql: A Python package for working with Wikimedia SQL dumps

Synchronising Wikidata and Wikipedias using pywikibot

Modules Research Tool

Wiki-Reliability: A Large Scale Dataset for Content Reliability on Wikipedia

Wiki Country Inference Tool: A Model that Infers countries from Wikipedia Articles

Developing a lightweight and efficient Content Filtration module for Wikimedia Commons

Review and improve Lua documentation on meta and mediawiki

Enhancements to gdrive-to-commons uploader tool

Productionize Wikidata-based Topic Model on ORES

WikiContrib: Gather and analyze user contributions on Wiki and GitHub

  • Student: Raymond Ndibe
  • Mentors: Srishti Sethi and Rammanoj Potla
  • Outcome: 1) Implemented a feature to count contributions made to Wikimedia repositories on GitHub 2) Implemented a contributions caching feature 3) Implemented a persistent URL feature 4) Fixed all outstanding issues and bugs 5) Improved the tool's UI/UX.

Converting Campaign pages to React

  • Student: Lalitha Reddy
  • Mentors: Sage Ross, Khyati Soneji
  • Outcome: Created the campaign navbar and the home tab component in React.
  • Relevant links: project task, bi-weekly reports

Improvements and User Testing of Wiki Education Dashboard Android App

A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia

Documentation improvements to the ~20 top 100 most viewed MediaWiki Action API pages on-wiki

Create regression automated tests for Special:Homepage functionality testing

Improve MediaWiki Action API Integration Tests

Documentation improvements to the ~20 top 70 most viewed MediaWiki Action API pages on-wiki

Improve Programs & Events Dashboard for use in the #1lib1ref campaign

  • Student: Khyati Soneji
  • Mentors: Sage Ross, Wes Reid
  • Outcome: Added support for counting references added to English Wikipedia articles in Programs & Events Dashboard, along with improved data download options and support for scoping via PetScan PSIDs.
  • Relevant links: Internship blog posts, project task

Research project on the editing patterns of users of wiki CX translation tool

  • Student: Doris Zhou
  • Mentors: Isaac Johnson, Jonathan Morgan
  • Outcome: Researched the editing patterns, article selection, and article writing quality of users who initiated article translation using the CX Translation tool. Looked in depth at English to French and did some English to Chinese analysis.
  • Relevant links: bi-weekly reports, research meta page

Improve top 50 viewed pages of the MediaWiki Action API & create a demo app to educate users

Update MediaWiki Action API docs, add Python code to repo, create a demo app, and write a tutorial for the demo which showcases several APIs.

Add a new Linter Category: Links-in-Links

Write code in Parsoid to detect links inside links, and in the PHP Linter extension to add this category.

Provide Test Support for Various Wikimedia Projects

Apply exploratory testing principles to test weekly maintenance releases of the Content Translation tool and VisualEditor.

QA: Testing Automation - port Echo Notification tests to Node.js

Created automated tests to check that changes made to the code base do not break existing components.

Create an event setup wizard for Programs & Events Dashboard

Design, create and test a wizard that makes it easy for users to set up an event with exactly the settings they need. The interface walks through all the main options and describes what they do and what they are for, to help configure an event.

Improve support for photo/media contribution campaigns on Wikimedia Programs & Events Dashboard

Made media contributions a first-class citizen in the Wikimedia Programs & Events Dashboard. The project included building dedicated user-friendly pages for viewing and assessing the metadata of uploads from a specific campaign, and adding upload contribution statistics in other views alongside article statistics.

Automatically detect spambot registration using machine learning like invisible reCAPTCHA

Create a captcha which is friendlier to humans and harder for bots to crack.

Improvements to Grants review and Wikimania scholarships web apps

Improve the scholarships and grant review applications through important bug fixes and feature additions.

Refactoring of MassMessage Extension

Clean up technical debt in MassMessage.

Translation outreach: User guides on MediaWiki.org

Create, test and document new strategies to recruit technical translators.

User Contribution Summary Tool

Create a tool that's optimized for presenting one's activity on Wikipedia in a CV-like manner.

Improve Programs & Events Dashboard support for Art+Feminism 2018

Improve the Programs & Events Dashboard from Wiki Education based on the feedback from the Art+Feminism campaign of 2018.

Remind me of this article in X days

Make it possible for a logged-in user to get a reminder of an article after a few days, with the possibility to enter a short comment.

Documentation on how to develop Zotero translators at translation-server

Document the process of writing Zotero web translators on the server side and on Scaffold, and how to get them into production.

Allow Programs & Events Dashboard to make automatic edits on connected wikis

  • Student: Medha Bansal
  • Mentors: Sage Ross and Jonathan Morgan
  • Status: All tasks as mentioned in the proposal and in the timeline have been completed. Project is live with all supporting documentation.
  • Link to project task on Phabricator: T158678
  • Link to weekly reports archives: Weekly reports

Creating User Profile Pages for Wiki Ed Dashboard and providing cumulative statistics for all programs a user has participated in.

Added customizable Profile pages to the Wiki Education Dashboard and generated contribution statistics of the users, providing them a brief overview of all the contributions they made to encourage them to do more.

Easier categorization of pictures in Upload to Commons Android app

This project improves the image categorization functionality of the app by offering relevant category suggestions based on geolocation, and making category search more flexible.

Reinvent Translation Search

The objective of this project is to offer a search tool to empower translators to find messages they want to translate and maintain consistency between translations.

Wikipedia article translation metrics

"This project aims at building a model that would estimate whether a page is translated or not, using statistical analysis and machine learning tools."

Pywikibot compat to core migration

"The purpose of this project is to improve all the documentation including getting started guides and project documentation in Pywikibot."

Wikipedia Education Program need-finding research

"The task is to improve the function, usability and design of the course pages for both professors and students."

Collaborative spelling dictionary building tool

"The project aims at developing a collaborative dictionary which shall also have an additional feature of checking spellings of the words."

Adding Performance Instrumentation to Parsoid

"This project will develop a dashboard of metrics that will allow users to, at-a-glance, understand Parsoid's performance. It will provide a resource for application tuning, quick assessments of production readiness, and troubleshooting sources of performance problems."

  • Student: Christy Okpo
  • Mentors: Subramanya Sastry
  • Wrap-up blogpost: Link
  • Phabricator Evaluation task: T92244
  • Status: Dashboards have been created, here and here. A glossary of metrics and a guide to performance instrumentation using Graphite have also been created.

Extending PyWikiBot support to sites on IWM

"PyWikiBot currently supports only a few wiki projects. At the end of this project, the benefits of automation of tasks by PWB will be provided to all MediaWiki sites on the meta:Interwikimap, and provide the basis for support of non-MediaWiki wiki sites and non-wiki sites."

  • Student: Manpreet Kaur
  • Mentors: John Mark Vandenberg, Fabian Neundorf
  • Wrap-up blogpost: Link
  • Phabricator Evaluation task: T92246
  • Status: Final report can be found here. Further work is to be done on non-MediaWiki sites.

Improving URL citations on Wikimedia

Aims to make citing sources in VisualEditor easier by generating a citation given a unique identifier such as a URL or DOI.

Enhancing Wikimaps/OpenHistoricalMaps Project

  • Student: Jaime Lyn
  • Mentors: Dr. Rob Warren
  • Wrap-up blogpost: Link
  • Final report:
  • Status:

Welcome to labs - Welcoming new contributors to Wikimedia Labs and Tool Labs

Finding the best and making them better: Evaluating, documenting, and improving MediaWiki web API client libraries

Feed the Gnomes - Wikidata Outreach

Template Matching for RDFIO

WikiHunt the 'Property': Wikidata Outreach Initiative