Migrate restrouter is Done and is now in the Services team's hands
Some portions of SonarQube are not open sourced, so we're looking into options
Streamline the Kibana -> Phab error reporting workflow - has a POC now and should be deployed soon
August 27, 2019 - In progress
For the work to streamline the Kibana -> Phab error reporting workflow, we're looking at deploying Phatality
POC for GitLab is Done and Zuul3 is Partially done
Scope out requirements for a self-hosted version of SonarQube is no longer stalled. We have a strategy that will use a combination of self-hosted and cloud-hosted instances, depending on the data. Essentially, the self-hosted open source version will not do branch-level analysis. We don't believe that will keep us from using it for non-branch-based analysis.
Expand set of repositories covered by code health metrics (via SonarQube) - we will have three new extensions added by the end of this month, and will add 3-6 more next month.
Set up an experimental Elasticsearch instance to store and analyze CI logs and metrics: We met to discuss this under the "Data ^3" project and laid out some basic objectives for a POC. This work will continue into next quarter.
Update the existing system test tooling and developer education:
We worked with the Core Platform team in the analysis and selection of an integration test tool. The expectation is for the Quality and Test Engineering team to take responsibility for this tooling once a SET is in place.
Code Health Metrics WG has spun off an effort to separate existing MediaWiki unit tests from integration tests (driven by WMDE)
September 2019 - In progress
Actionable code health metrics are provided for code stewards
We decided that, prior to investigating self-hosting of SonarQube, we wanted to assess its current perceived value. As such, we will be interviewing teams that are currently using SonarQube/SonarCloud as part of the Code Health Pipeline.
We've been incrementally adding new repos to the Code Health Pipeline in order to avoid overloading the CI. No issues so far. Looking to add all applicable repos by the end of Q2.
A clear set of unit, integration, and system testing tools is available for all supported engineering languages.
To date we've established a set of tools that are used across the organization for unit and system level automated testing. The CPT team has evaluated and deployed an integration testing tool that we look to make available more broadly. However, due to a lack of SET staffing, that is unlikely to happen in this FY. As the new Quality and Test Engineering team has been formed, we will be assessing the state of tools across other teams at the Foundation.
The Selenium documentation has been updated.
WebdriverIO has been upgraded from 4 to 5 for Core. We will need to start planning the migration for the other repos.
Done - [P-O14-D4] Run a series of interviews, office hours, or surveys to gather the volunteer editor community's input on citation needed template recommendations. The result of this work will inform the specifications of an API (to be developed) to surface citation needed recommendations as well as future directions for this research. task T228442
Done - [P-O14-D4] Complete the research on characterizing Wikipedia citation usage (Why We Leave Wikipedia). This goal will continue in Q2 and, depending on the submission results, potentially in Q3. task T227790
Done - [W-O6-D3] Computer vision consultation as part of Structured Data on Commons task T228440
Postponed - [P-O14-D6] Building a pipeline for image classification based on Commons categories. task T228441
Done - [P-O14-D4] Make substantial progress towards a comprehensive literature review about automatic detection of misinformation and disinformation on the Web. We expect this work to be completed in Q2 and inform the work in this direction in Q3+. task T229595
Done - [P-O14-D4] Understand patrolling on Wikipedia. A write-up describing how patrolling is being done on Wikipedia across the languages. This work may be extended further by understanding the patrolling on Wikipedia in the context of Wikipedia's interaction with other projects such as Wikidata, Wikimedia Commons, ... task T228817
Done - Conduct the analysis on reader surveys to understand the relation between demographics and the consumption of content on Wikipedia across languages (Why We Read Wikipedia + Demographics). This research will be concluded in Q2 and we expect substantial progress in Q1: task T228279
Done - Hiring and onboarding. We expect 1-2 scientists to join the team in Q1 and the onboarding work will need to happen. We also expect to open an engineering position in the team. task T229259
Done - [T-O12-D3] Determine important features of articles w/r/t level of reader interest across different demographic groups (as motivation for what aspects a general article category model should capture): task T228319
Reduce complexity of the platform: Reduce technical debt and increase automation to reduce workload and make it easier to add new search features
Refactor query highlighting to make it extensible by other extensions task T190130 In progress
Refactor Mjolnir jobs into separate smaller jobs In progress
Core work: Maintain CirrusSearch and the Search API and WDQS
Core maintenance work (always In progress)
Improve WDQS updater performance by writing custom code for updates task T212826 In progress
Full data reimport for WDQS to enable optimizations that were done last quarter Done
Work through the backlog of bugs and performance improvements for WDQS with our contractor Done
Start the hiring process for a new WDQS Engineer Done
Hardware renewal: replace elastic1017-1031 task T226843 In progress
Continue to identify and enable machine learning and natural language processing techniques to improve the quality of search
"Did you mean" suggestions: deploy method0 to production In progress
Underserved communities benefit from search techniques, such as machine learning–aided ranking, word embeddings, or language-specific analyzers, that to date are only used on big wikis: Language analysis / Phab work