Wikimedia Technology/Goals/2019-20 Q4

Technology Department Team Goals and Status for Q4 FY19/20 in support of the Medium Term Plan (MTP) Priorities and Annual Plan for FY19/20



Analytics
Team Manager: Nuria Ruiz
 * Modern Event Platform. Build a reliable, scalable, and comprehensive platform for creating services, tools and user facing features that produce and consume event data
 * Deploy a new event stream for analytics using the new Event Platform infrastructure
 * Vertical MEP from web to backend: Migrate SearchSatisfaction EventLogging event stream to Event Platform


 * Smart Tools for Better Data. Make it easier to understand the history of all Wikimedia projects
 * Design (together with core platform team) an alternative architecture for historic data endpoints used by iOS application
 * Define computation for active editors per project family
 * Implement foundations for newpyter (Hadoop-hosted distributed Jupyter notebook setup)
 * Enhancements to Wikistats UI so you can split/filter simultaneously


 * Smart Tools for Better Data. Increase Data Quality, Privacy and Security
 * Bots: Label high volume bot spikes in pageview data as automated traffic


 * Core. Operational Excellence. Increase Resilience of Systems
 * Create a MySQL replica for backups of every MySQL instance we run, such as those backing Oozie and Superset


 * Core. Operational Excellence. Reduce Operational Load by Phasing Out Legacy Systems/Technologies
 * Airflow as an easier job scheduling alternative, PoC for refine workflow
 * Unify stats and notebook cluster. Decommission notebook hosts and make the Puppet role of stat1007 the same as that of the other stats boxes


 * Cassandra3 migration plan proposal

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Core Platform
Team Manager: Corey Floyd
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Fundraising Tech
Team Manager: Erika Bjune
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Performance
Team Manager: Gilles Dubuc
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:

Quality and Test Engineering
Team Manager: JR Branaa
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:

Release Engineering
Team Manager: Tyler Cipriani
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Research
Team Manager: Leila Zia
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Machine Learning / Scoring Platform
Team Manager: Aaron Halfaker
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Search Platform
Team Manager: Guillaume Lederrey
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Security
Team Manager: John Bennett
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Site Reliability Engineering
Directors: Mark Bergsma and Faidon Liambotis
 * Cross-cutting

Service Operations
Team Manager: Mark Bergsma
 * Key Deliverable

Data Persistence
Team Manager: Mark Bergsma
 * Key Deliverable

Traffic
Team Manager: Brandon Black
 * Key Deliverable

Infrastructure Foundations
Team Manager: Faidon Liambotis
 * Key Deliverable

Observability
Team Manager: Faidon Liambotis
 * Key Deliverable

Data Center Operations
Team Manager: Willy Pao
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Technical Engagement
Team Manager: Birgit Müller
 * Increase visibility & knowledge of technical contributions, services and consumers across the Wikimedia ecosystem
 * Establish Coolest Tool Award
 * Share stories and insights from the technical community
 * Increase knowledge on scope and breadth of technical contributions and contributors
 * Train people how to use Phabricator to increase acceptance and foster collaboration
 * Successfully run Wikimedia’s technical internship and outreach programs
 * Successfully coordinate Outreachy and GSoC
 * Mentor the first season of Google Season of Docs
 * Submit and hold a session on Wikimedia's tech internships at WikiConference North America
 * Successfully coordinate Google Code-in 2019
 * Mentor 1 intern on the WikiContrib project via Outreachy round 20
 * Develop, test and evaluate different formats to build technical capacity in smaller wikis
 * Conduct workshop and document the technical challenges small wikis face in North America
 * A starter kit for small wikis containing a recommended set of templates, Gadgets, bots, etc. is available by Q4
 * Organize an online workshop series for Indic language small wikis
 * Write a report highlighting lessons learned from developing and testing different formats to build technical capacity in smaller wikis.
 * Create a hub for the Small Wiki Toolkits initiative

Wikimedia Cloud Services
Team Manager: Bryan Davis
 * Increase application security by hosting tools using unique hostnames rather than path based routing
 * Interwiki links support for $tool.toolforge.org
 * Update front proxy to support host based routing
 * Update `webservice` to support host based routing
 * Create a redirect system that keeps legacy URLs working after each tool is converted from path based to host based routing
 * Migrate 5+ early adopter/beta tester tools to host based routing
 * Migrate all tools to host based routing
 * WMCS Platform as a Service (PaaS)
 * Provide a more modern, secure, and performant PaaS experience for Toolforge tools
 * Increase quality of technical documentation for Toolforge and Cloud VPS users
 * PAWS Kubernetes rebuild
 * WMCS Infrastructure as a Service (IaaS)
 * Debian Jessie operating system deprecation
 * Ceph instance storage
 * OpenStack platform upgrades
 * Galera cluster
 * Fix Cloud VPS and Toolforge mail servers to work with the modern internet
 * WMCS Data as a Service (DaaS)
 * All Debian Jessie instances are removed/replaced in Cloud VPS hosted projects
 * Remove Debian Jessie from the Cloud VPS "toolsbeta" project
 * Remove Debian Jessie from the Cloud VPS "tools" project
 * Remove Debian Jessie from the Cloud VPS "openstack" project
 * Upgrade Toolforge Kubernetes to 1.16
 * Update `webservice` to support k8s 1.16 APIs
 * Fix PodSecurityPolicy (PSP) API group to work with k8s 1.16
 * Determine blockers for k8s 1.16 upgrade and assign as tasks/KRs to team
 * Deploy Kubernetes 1.16 in Toolforge
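
The redirect goal for host based routing listed above can be illustrated with a small sketch. This is not the front proxy's actual implementation. It assumes that legacy path based URLs have the form `https://tools.wmflabs.org/$tool/...` (the legacy hostname is an assumption and does not appear in this page) and that the new form is `https://$tool.toolforge.org/...`. The function name `legacy_to_host_based` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption: hostname that served path based tool URLs before migration.
LEGACY_HOST = "tools.wmflabs.org"
# Domain named in the goals above ($tool.toolforge.org).
NEW_DOMAIN = "toolforge.org"


def legacy_to_host_based(url: str) -> Optional[str]:
    """Map a legacy path based tool URL to its host based equivalent.

    Returns None when the URL is not on the legacy host or names no tool,
    so the caller can decide how to handle it.
    """
    parts = urlsplit(url)
    if parts.hostname != LEGACY_HOST:
        return None

    # The first path segment is the tool name; the rest is the tool's own path.
    segments = parts.path.lstrip("/").split("/", 1)
    tool = segments[0]
    if not tool:
        return None
    rest = "/" + (segments[1] if len(segments) > 1 else "")

    # Keep the query string and fragment so deep links continue to work.
    return urlunsplit(
        ("https", f"{tool}.{NEW_DOMAIN}", rest, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(legacy_to_host_based("https://tools.wmflabs.org/mytool/page?x=1"))
```

A redirect system built this way would answer each legacy request with a permanent redirect to the computed URL, which is one way to let early adopter tools migrate individually while old links stay valid.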

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status: