Wikimedia Technology/Goals/2019-20 Q4

Technology Department Team Goals and Status for Q4 FY19/20 in support of the Medium Term Plan (MTP) Priorities and Annual Plan for FY19/20



Analytics
Team Manager: Nuria Ruiz
 * MTP-Y1: Platform Evolution. Build a reliable, scalable, and comprehensive platform for creating services, tools, and user-facing features that produce and consume event data
 * 5% of production and analytics events have been migrated to the new event platform.
 * By June 2020, all production and consumption of new event data originating on our websites is flowing through this new...
 * Build a reliable, scalable, and comprehensive platform for building services, tools, and user-facing features that...
 * ✅ Client Error Logging is deployed to 1 wiki and error stats are displayed on our operation dashboards.
 * ✅ Enable better scaling of our production infrastructure by moving our events to a standardized, modern event system


 * MTP-Y1: Platform Evolution. Reduce the complexity of workflows for building, training, and deploying machine learning models to enable ML-aided product augmentation and research
 * ✅ Deploy a fully open source solution for GPU-enhanced computation infrastructure that improves training times
 * By the end of the fiscal year, present a design for how to speed up our model training by providing models


 * Modern Event Platform. Build a reliable, scalable, and comprehensive platform for creating services, tools, and user-facing features that produce and consume event data
 * ✅ Working Stream Config Service and client side library for sending events to EventGate in MediaWiki Vagrant
 * One new event stream created and deployed by Product by the end of Q3 2019/2020
 * 2 existing EventLogging analytics streams migrated to Modern Event Platform by end of Q4 2019/2020
 * Resolve Kafka Connect HDFS licensing issue and decide if we will use Kafka Connect (task T223626)
 * One new automated dashboard created and deployed by Product and Analytics engineering by end of Q4 2019/2020
 * Deploy a new event stream for analytics using the new Event Platform infrastructure
 * Vertical MEP from web to backend: Migrate SearchSatisfaction EventLogging event stream to Event Platform


 * Smart Tools for Better Data. Make it easier to understand the history of all Wikimedia projects
 * Wikistats UI is localized for languages and number formatting
 * Define computation for "Active Editors per project family"
 * Wikistats UI is more flexible when it comes to exploring metrics. Allow splitting and filtering simultaneously
 * Add "Active Editors per project family" as a metric to Wikistats UI
 * Design (together with the Core Platform Team) an alternative architecture for historic data endpoints used by the iOS application
 * Define computation for active editors per project family
 * Implement foundations for newpyter (Hadoop-hosted distributed Jupyter notebook setup)
 * Enhancements to Wikistats UI so you can split/filter simultaneously


 * Smart Tools for Better Data. Increase Data Quality, Privacy and Security
 * Bots: Label high volume bot spikes in pageview data as automated traffic


 * Core Operational Excellence. Increase Resilience of Systems
 * Create a MySQL replica for backups of every MySQL instance we run, such as those used by Oozie or Superset
 * Airflow as an easier job scheduling alternative, PoC for refine workflow
 * Unify stats and notebook clusters. Decommission notebook hosts and make the Puppet role of stat1007 just like the other stats boxes


 * Cassandra3 migration plan proposal



Platform
Team Manager: Corey Floyd
 * Create a cohesive documentation portal to onboard new developers to our API
 * Create infrastructure for developing better structured documentation to make it easier to build easy to read
 * Developers can easily understand the contents of the portal and find the information they need
 * Allow developers to quickly get started building knowledge-based applications using our APIs.
 * ✅ Build a prototype for the documentation portal
 * The portal is a hub for a thriving community of developers.
 * Develop a technical direction for the Wikimedia Platform to support Wikimedia Medium Term Plan
 * Enable the development of full-featured JavaScript web clients
 * Communicate the vision and plan for the Core Platform Team's work through the end of the FY resulting from PE
 * Develop a strategy to integrate JavaScript frameworks into the MediaWiki platform to enable easy development
 * Improve the sustainability of MediaWiki and the ease of building on top of it
 * Quantify and reduce coupling in MediaWiki Core
 * Initiatives that the Core Platform Team begins are driven to completion
 * Allow for more confident refactoring of core code
 * Close out MCR work
 * MW Core Code is better logically decomposed into libraries, introduction of new cross dependencies is...
 * ✅ Product requirements for upcoming Core Platform Team initiatives are documented
 * Further decoupling efforts
 * Close out actor and comment migration
 * Limit vandalism requests by bad actors and guarantee levels of service by securing the API
 * Reduce the risk of vandalism by bad actors by limiting throughput of anonymous API calls
 * Completion and shipping of the OAuth 2.0 initiative Epic 1 and 2
 * Reduce the risk of vandalism by bad actors by making it possible to disable access for known API users

Architecture
Team Manager: Kate Chapman
 * Define target architecture for structured content so pieces of content can be more easily used to engage users.
 * ✅ Perform task analysis modeling with product managers to help determine pain points and needed system capabilities
 * Engage stakeholders to present a plan for the system changes needed to better enable structured data.
 * ✅ Develop proposal for modern system to enable structured data.
 * Present proposal to CTO and CPO to gain support for no longer focusing on building page-building software
 * Make architectural decision process clear so teams have clear direction as to what decisions have been made and how to proceed.
 * Create plan for decision making process
 * Develop template for technical design and decisions
 * Engage stakeholders in decision making process for feedback.

Platform Engineering
Team Manager: Mat Nadrofsky
 * Drive the Delivery of Q4 Platform Engineering Initiatives
 * Limit the ability for bad actors and misinformed users to impact the availability of our services
 * Enable developers to make system changes while maintaining a consistent stable experience
 * Help a team migrate their service to Kubernetes



Fundraising Tech
Team Manager: Erika Bjune
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Performance
Team Manager: Gilles Dubuc
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:

Quality and Test Engineering
Team Manager: JR Branaa
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:

Release Engineering
Team Manager: Tyler Cipriani
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Research
Team Manager: Leila Zia
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Machine Learning / Scoring Platform
Team Manager: Aaron Halfaker
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Search Platform
Team Manager: Guillaume Lederrey
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Security
Team Manager: John Bennett
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Site Reliability Engineering
Directors: Mark Bergsma and Faidon Liambotis
 * Cross-cutting

Service Operations
Team Manager: Mark Bergsma
 * Key Deliverable

Data Persistence
Team Manager: Mark Bergsma
 * Key Deliverable

Traffic
Team Manager: Brandon Black
 * Key Deliverable

Infrastructure Foundations
Team Manager: Faidon Liambotis
 * Key Deliverable

Observability
Team Manager: Faidon Liambotis
 * Key Deliverable

Data Center Operations
Team Manager: Willy Pao
 * Key Deliverable

Dependencies on:

 Status 
 * April 2020 status:
 * May 2020 status:
 * June 2020 status:



Technical Engagement
Team Manager: Birgit Müller

Developer Advocacy
Team Manager: Birgit Müller
 * Develop, test and evaluate different formats to build technical capacity in smaller wikis
 * ✅ Conduct workshop and document the technical challenges small wikis face in North America
 * Organize an online workshop series for Indic language small wikis
 * Create a hub for the Small Wiki Toolkits initiative
 * A starter kit for small wikis containing a recommended set of templates, Gadgets, bots, etc. is available by Q4
 * Write a report highlighting lessons learned from developing and testing different formats to build technical capacity in smaller wikis


 * Increase visibility & knowledge of technical contributions, services and consumers across the Wikimedia ecosystem
 * ✅ Establish Coolest Tool Award
 * Share stories and insights from the technical community
 * Increase knowledge on scope and breadth of technical contributions and contributors
 * Train people how to use Phabricator to increase acceptance and foster collaboration


 * Successfully run Wikimedia’s technical internship and outreach programs
 * Successfully coordinate Outreachy and GSoC
 * ✅ Mentor the first season of Google Season of Docs.
 * ✅ Submit and hold a session on Wikimedia's tech internships at WikiConference North America
 * ✅ Successfully coordinate Google Code-in 2019
 * Mentor 1 intern on the WikiContrib project via Outreachy round 20

Wikimedia Cloud Services
Team Manager: Bryan Davis
 * All Debian Jessie instances are removed/replaced in Cloud VPS hosted projects
 * ✅ Remove Debian Jessie from the Cloud VPS "toolsbeta" project
 * ✅ Remove Debian Jessie from the Cloud VPS "tools" project
 * Remove Debian Jessie from the Cloud VPS "openstack" project


 * Increase application security by hosting tools using unique hostnames rather than path-based routing
 * ✅ Update front proxy to support host-based routing
 * ✅ Create redirect system to preserve function of legacy URLs following conversion from path-based to host-based routing of each tool
 * Migrate all tools to host-based routing
 * ✅ Update `webservice` to support host-based routing
 * ✅ Migrate 5+ early adopter/beta tester tools to host-based routing
 * Interwiki links support for $tool.toolforge.org
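
As a rough illustration of the redirect key result above, the mapping from a legacy path-based URL to its host-based equivalent could be sketched as follows. This assumes legacy tool URLs have the form `tools.wmflabs.org/<tool>/...` with the tool name as the first path segment, and that the new host is `<tool>.toolforge.org`. The function name `legacy_to_host_based` is hypothetical, and this is not the actual front proxy configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed legacy host and new base domain; adjust if the real values differ.
LEGACY_HOST = "tools.wmflabs.org"
NEW_DOMAIN = "toolforge.org"


def legacy_to_host_based(url):
    """Return the host-based URL for a legacy path-based tool URL.

    Returns None when the URL is not on the legacy host or names no tool,
    so the caller can fall through to its normal handling.
    """
    parts = urlsplit(url)
    if parts.hostname != LEGACY_HOST:
        return None

    # The first path segment is treated as the tool name (an assumption).
    segments = parts.path.lstrip("/").split("/", 1)
    tool = segments[0]
    if not tool:
        return None

    # Everything after the tool name is kept as the path on the new host.
    rest = "/" + (segments[1] if len(segments) > 1 else "")
    return urlunsplit(
        ("https", f"{tool}.{NEW_DOMAIN}", rest, parts.query, parts.fragment)
    )
```

A redirect layer built this way would answer a request for `https://tools.wmflabs.org/mytool/page?x=1` with a redirect to `https://mytool.toolforge.org/page?x=1`, preserving the remaining path and query string.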


 * Upgrade Toolforge Kubernetes to 1.16
 * Update `webservice` to support k8s 1.16 APIs
 * Determine blockers for k8s 1.16 upgrade and assign as tasks/KRs to team
 * ✅ Fix psp API group to work with k8s 1.16
 * Deploy Kubernetes 1.16 in Toolforge


 * WMCS Infrastructure as a Service (IaaS)
 * Debian Jessie operating system deprecation
 * OpenStack platform upgrades
 * Galera cluster
 * Ceph instance storage
 * Fix Cloud VPS and Toolforge mail servers to work with the modern internet


 * WMCS Platform as a Service (PaaS)
 * Provide a more modern, secure, and performant PaaS experience for Toolforge tools
 * Increase quality of technical documentation for Toolforge and Cloud VPS users
 * PAWS Kubernetes rebuild