Wikimedia Technology/Goals/2017-18 Q1

From mediawiki.org
Wikimedia Technology Goals, FY2017–18, Q1 (July – September)

Introduction

Purpose of this document

Goals for the Wikimedia Technology department, for the first quarter of fiscal year 2017–18 (July 2017 – September 2017). The goal setting process owner in each section is the person responsible for coordinating completion of the section, in partnership with the team and relevant stakeholders.

Goals for the Audiences department are available on their own page.

Legend

Annual Program/Outcome refers to items in the annual plan (draft).

Tech Goal categorizes work into one or more of these quadrants:

  • A: Foundation level goals
  • B: Features we build for others
  • C: Features that we build to improve our technology offering
  • D: Modernization, renewal and tech debt goals

ETA fields may use the initialism EOQ (End of Quarter).

Status fields can use the following templates: In progress, To do, Postponed, Done, or Partially done.

CTO

Goal setting process owner: Victoria Coleman

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 4: Technical community building

Outcome 5: Organize Wikimedia Developer Summit

Objective 1: Developer Summit web page published four months before the event

Decide on Dev Summit event location, dates, theme, deadlines, etc. and publicize the information B Technical Collaboration EOQ To do

Analytics

Goal setting process owner: Nuria Ruiz

The Analytics Engineering team makes Wikimedia-related data available for querying and analysis to both WMF and the different wiki communities and stakeholders. We develop infrastructure so that all our users, both within the Foundation and within the different communities, can access data in a self-service fashion that is consistent with the values of the movement. Next quarter we will have fewer resources (about two fewer people for most of the quarter), so our commitments are smaller. The second tier of priorities is marked as STRETCH; we will start working on those once the primary goals are almost completed.


Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 7: Smart Tools for Better Data.

Outcome 1: Foundation staff and community have better tools to access data.

Objective 1: Wikistats 2.0 redesign.

  • (carried over from last quarter) Initial deployment of Wikistats 2.0 UI task T160370
  • Start development of a new backend and API on top of the Data Lake Edit data in Hadoop. task T156384
A. Org Level Priority. B. Serving our Audiences. Cloud Services Not done
Program 7: Smart Tools for Better Data.

Outcome 1: Foundation staff and community have better tools to access data.

Objective 1: Wikistats 2.0 redesign.

  • Add edit count to Data Lake per event, for all events and all wikis since the beginning of time. Done task T161147, task T165233 Done
  • STRETCH: Add metadata about Data Lake calculations task T155507 Not done
A. Org Level Priority. B. Serving our Audiences. Ops, DBAs
Program 7: Smart Tools for Better Data.

Outcome 1: Foundation staff and community have better tools to access data.

Objective 3: Experiments with real-time data and community support for new datasets available.

Deprecate Socket.IO RC feed and help clients migrate (by July 7th) task T156919 Done B. Serving our Audiences. Ops Done
Program 7: Smart Tools for Better Data.

Outcome 1: Foundation staff and community have better tools to access data.

Objective 2: Better visual access to EventLogging data

Spike: Provide top domain and data to truly test Superset task T166689 Not done
Cross Departmental: Privacy, Security And Data Management. Track 2: Technology.

Outcome 2: To protect user data and uphold movement values, the Wikimedia Foundation continues compliance with best practices for data management.

Objective 2 & 3: Better offboarding / onboarding for data access. Ensure retention guidelines are being followed. Done

  • Audit existing stat/analytics shell accounts and incorporate expiration dates; make all accounts compliant with "time restricted access" task T170878 Done
  • stat1002/3 replacement task T152712 Done
  • Revamp docs for access to data Done
  • (carried over from last quarter) Data purging for EventLogging task T156933 Not done
C. improve our own feature set

D. Tech Debt

Ops
Program 1: Availability, performance, and maintenance.

Outcome 3: Scalable, reliable and secure systems for data transport.

Objective 1: Consolidation of Kafka infrastructure to tier-1 requirements, including TLS encryption

  • Bootstrap the new Kafka cluster (hardware refresh) with an upgraded version, TLS encryption, and access control list support. task T152015 Done
  • STRETCH: Switch the Varnishkafka cache misc configuration to the new cluster and enable TLS authentication for it. Not done
C. improve our own feature set

D. Tech Debt

Ops Done
Program 1: Availability, performance, and maintenance.

Outcome 3: Scalable, reliable and secure systems for data transport.

Objective 3: Software, hardware upgrades, and maintenance on analytics stack to maintain current level of service

  • Hadoop health monitoring (add newer checks such as default rack, corrupted blocks, etc.): task T166140 Done
  • Prep work for EventLogging databases refresh (OS install, racking, etc.) task T156844 Done
  • STRETCH: Stats for Varnishkafka errors: task T164259 Not done
C. improve our own feature set

D. Tech Debt

Ops

Goal-setting process owner: Bryan Davis

Work is in progress on choosing which tasks to designate as goals on Phabricator.

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 1: Availability, performance, and maintenance

Outcome 4: VPS hosting

Objective 2: Pay down tech debt by deploying OpenStack Neutron

Create a detailed migration plan for implementing Neutron as our OpenStack SDN layer C: Improving our offering

D: Tech debt

EOQ In progress
Program 1: Availability, performance, and maintenance

Outcome 4: VPS hosting

Objective 1: Maintain existing OpenStack infrastructure and services

Define a metric to track OpenStack system availability B: Serving our audiences EOQ Done
Program 4: Technical community building

Outcome 1: Improve documentation

Objective 2: Create tutorial content

Plan contract documentation work B: Serving our audiences

C: Improving our offering

EOQ Not done
Program 7: Smart tools for better data

Outcome 3: Data services

Objective 1: Provide reliable and available access to dumps

Begin migrating customer-facing Dumps endpoints to Cloud Services C: Improving our offering EOQ In progress
Program 10: Public cloud services & support

Outcome 1: PaaS is easy to use

Objective 2: Migrate workflows to Striker

Manage shared tool accounts via Striker C: Improving our offering EOQ Done
Program 10: Public cloud services & support

Outcome 2: Rebranding

Objective 1: Complete core rebranding

Perform initial Cloud Services rebranding C: Improving our offering Q2 In progress
Program 10: Public cloud services & support

Outcome 3: Outreach

Objective 1: Promote services and products

Attend Wikimania to support and promote Cloud Services products with the Wikimedia communities B: Serving our audiences August Done
Program 10: Public cloud services & support

Outcome 4: First line tech support

Objective 1: Provide first line technical support resources

Hire first line technical support contractor B: Serving our audiences Q2 Done

Goal setting process owner: Katie Horn

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Product Program 7: Payment processor investigation and long-term strategy
  • Outcome 1: Advancement and fr-tech find a solution that lowers or does not increase current maintenance costs.
Continue Ingenico integration Some MVP of Ingenico Ingenico (external vendor) EOQ To do
Product Program 8: Donor retention
  • Outcome 1: We hope to reduce the hours spent manually deleting duplicate records during Big English.
Time box work on Civi duplicate records Time boxed effort to resolve merge conflicts Major Gifts feedback EOQ To do

MediaWiki Platform

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 2: MediaWiki
  • Outcome 1: Stakeholders in MediaWiki development will have a sense of progress and direction in MediaWiki.
  • Hire product manager
  • Develop MediaWiki roadmap
D: Modernization, tech debt
  • PM
EOQ In progress
Program 2: MediaWiki
  • Outcome 2: MediaWiki code quality will be improved
C: Improving our offering

D: Tech debt

  • Brion
  • Tim
EOQ To do
Program 8: Multi-datacenter support
  • Outcome 1: Our audiences enjoy improved MediaWiki and REST API availability and reduced wiki read-only impact from data center fail-overs.
  • Objective 3: Integrate MediaWiki with dynamic configuration or service discovery, in order to reduce the time required for a master switch from one datacenter to another
C: Improving our offering Ops
  • Tim
EOQ In progress
Cross Departmental: Structured Data on Commons.
  • Outcome 1: Store structured data within wiki pages, in particular on media file pages on Commons.
  • Outcome 2: Introduce Multi-Content Revisions (MCR)
  • Actor table (T167246) development substantially complete, ready for deployment
  • Comment table (T166732) development substantially complete, ready for deployment
  • De-globalize EditPage.php (T144366)
  • Narrow AbuseFilter interface (T170184)
  • EditPage backend (T157658)
A: Foundation level goals

C: Improving our offering

D: Tech debt

Ops (DBA)
  • Brad
  • Brion
  • Kunal
  • Tim
EOQ In progress

Performance

Speed is Wikipedia's killer feature. ("Wiki" means "quick" in Hawaiian.) As the Wikimedia Foundation’s Performance team, we want to create value for readers and editors by making it possible to retrieve and render content at the speed of thought, from anywhere in the world, on the broadest range of devices and connection profiles.

Goal-setting process owner: Gilles Dubuc

Annual Program/Outcome Quarterly objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 1: Availability, performance, and maintenance

-> Outcome 1: All production sites and services maintain current level of availability or better

--> Objective 2: Assist in the architectural design of new services and making them operate at scale

  • Follow-up bugfixes and improvements after Thumbor deployment to production - T121388
B: Features we build for others
  • Operations
  • Gilles
EOQ Done
Program 1: Availability, performance, and maintenance

-> Outcome 2: All our users consistently experience systems that perform well

--> Objective 2: Catch and address performance regressions in a timely fashion through automation

  • Test user performance from Asia to validate changes when the Asia Cache goes live - T169180
B: Features we build for others
  • Operations
  • Peter
  • Gilles
EOQ Done
Program 1: Availability, performance, and maintenance

-> Outcome 2: All our users consistently experience systems that perform well

--> Objective 3: Modernize our performance toolset. We will measure performance metrics that are closer to what users experience.

  • Rework NavigationTiming metrics to make them stackable - T104902
  • Add metrics for master queries on HTTP GET/HEAD - T166199
C: Feature
  • Aaron
  • Gilles
  • Peter
  • Timo
EOQ Done
Program 1: Availability, performance, and maintenance

-> Outcome 2: All our users consistently experience systems that perform well

D: Tech debt
  • Release Engineering
  • Timo
EOQ Done
Program 8: Multi-datacenter support

-> Outcome 1: Our audiences enjoy improved MediaWiki and REST API availability

--> Objective 1: MediaWiki support for having read-only "read" requests (GET/HEAD) be routed to other datacenters

  • Enable HTTPS for swift clients (MediaWiki) - T160616
  • Enable HTTPS for mariadb clients (MediaWiki) - T134809
  • Install and use mcrouter in deployment-prep - T151466
B: Features we build for others
  • Operations
  • Aaron
EOQ Not done

Release Engineering

Goal setting process owner: Greg Grossmeier

#releng-201718-q1 (Phabricator project) -- All Technology team Q1 goals: Wikimedia_Technology/Goals/2017-18_Q1

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team members ETA Status
Program 1: Availability, performance, and maintenance

Outcome 1: All production sites and services maintain current levels of availability or better

Objective 1: Deploy, update, configure, and maintain production services

D: Tech Debt
  • Operations
  • Services
  • Discovery
  • Cloud Services
  • Tyler
  • Chad
  • Antoine
  • Mukunda
EOQ Done
Program 1: Availability, performance, and maintenance

Outcome 5: effective and easy-to-use testing infrastructure and tooling

Milestone 1: Develop and migrate to a JavaScript-based browser testing stack

  • Migrate the majority of developers to a JavaScript-based browser test framework (webdriver.io)
C. Improve our own feature set

D: Tech Debt

  • All developers
  • Notably: Wikidata, CirrusSearch
  • Zeljko
End of Q2 In progress
Program 6: Streamlined service delivery

Outcome 2: unified pipeline towards production deployment.

Objective 2: Set up a continuous integration and deployment pipeline

  • Define functional tests for Mathoid running on the staging Kubernetes cluster for use in future gating decisions - task T170482
  • Define method for monitoring and reacting to the above functional tests - task T170483
C. Improve our own feature set

D: Tech Debt

  • Operations
  • Services
  • Lead: Tyler
  • Antoine
  • Dan
  • Jean-Rene
EOQ Not done


Research

Goal setting process owner: Dario Taraborelli [Wikimedia Research goals overview] [Wikimedia Research annual plan overview]

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 12: Grow contributor diversity.

Outcome 1: We improve Wikipedia’s contributor diversity after designing and testing potential intervention(s).

Objective 1: Identify the underlying (potential) causes of lack of representative contribution from certain demographics task T166083

Objective 2: Design frameworks to change the current socio-technical infrastructure to address at least one of the underlying causes of lack of representativeness

  • Create one or more formal collaborations for this research (T166085)
  • Perform a literature review and potentially run survey(s) to identify the self-reported causes of imbalanced representation in contribution (T175215)
  • Initiate the design of a framework to address one of the identified/hypothesized causes of imbalance in contributor demographics, if time permits. (Stretch)
  • Run a couple of quick surveys to get a better sense of where in the pipeline we start losing diversity. (Stretch)
  • External collaborators (Bob and Jerome)
  • Leila
EOQ Done

The second stretch goal is under review and may move to the next quarter depending on its engineering needs. This latter goal is a nice-to-have that we can afford to drop if needed.

Program 9: Growing Wikipedia across languages via recommendations.

Outcome 1: Surface relevant information about the articles to editors at the time of editing with the goal of helping editathon organizers

Objective 1: Build, improve, and expand algorithms that can provide more detailed recommendations to editors and editathon organizers on how to expand articles/family-of-articles

  • Clean up the category system for machine consumption (this has been the focus for the past few months and we are very close to having a solution)
  • Start surfacing recommendations to collect feedback from editathon organizers (T174738)
  • External collaborators (Bob, Michele, Tiziano)
  • Leila
Done
Cross Departmental: Structured Data on Commons. Segment 4: Programs.

Outcome 2: Develop a better understanding of existing needs for Structured Commons (task T152248)

  • Develop interview protocol
  • Interview 6–8 GLAM stakeholders
  • Share initial user stories and research themes at Structured Data offsite
  • Jonathan
EOQ In progress

Goal setting process owner: Aaron Halfaker. See the epic task: Phab:T166045

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 5: Scoring Platform (ORES).

Outcome 1: Tool developers and Product teams can innovate tools that use machine prediction to make wiki-work more efficient.

  • Objective 1: Expand vandalism & good-faith detection models to more wikis (focus on Emerging Communities)
C Features that we build to improve our technology offering
  • Operations
  • awight
  • halfak
Q1 Done
Outcome 2: Volunteers are empowered to track trends in prediction bias and other failures of AI in the wiki.
  • Objective 1: Develop best practices for using community input to improve/correct predictions
  • Design schema for meta ORES (T153152)
  • Report on implementations of meta ORES (T166053)
  • Community Engagement
In progress

Search platform

Goal setting process owner: Erika Bjune

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Product Program 1: Make knowledge more easily discoverable
  • Outcome 1: Through incremental Discovery improvements, readers are better able to discover and search for content.
  • Objective 1: Implement advanced methodologies such as “learning to rank” machine learning techniques and signals to improve search result relevance across language Wikipedias.
Quarterly Objective 1:
  • Perform load and A/B tests on new models to make sure they can be safely deployed to production
  • When ready, deploy newly automated models which match (at a minimum) current performance of manually-configured search result relevance
C. improve our own feature set Analytics, Operations, Community Engagement Erik, David, Trey, Daniel (contractor) EOQ Done
Product Program 1: Make knowledge more easily discoverable
  • Outcome 1: Through incremental Discovery improvements, readers are better able to discover and search for content.
  • Objective 2: Improve support for multiple languages by researching and deploying new language analyzers as they make sense to individual language wikis.
Quarterly Objective 2:
  • Perform research spikes to find new analyzers for different languages
    • Test new analyzers to see if they are improvements (Japanese and Vietnamese)
    • Deploy new / updated analyzers
  • Deploy analyzers in progress from last quarter (Hebrew)
C. improve our own feature set Analytics, Operations, Community Engagement Erik, David, Trey EOQ Done
Product Program 1: Make knowledge more easily discoverable
  • Outcome 1: Through incremental Discovery improvements, readers are better able to discover and search for content.
Quarterly Objective 3:
  • Work on expanding category search in the Wikidata Query Service, while also collecting SPARQL statistics.
C. improve our own feature set Analysis team, Operations, WMDE Stas, Guillaume EOQ Done
Cross-Departmental Program: Structured Data on Commons.

Segment 2: Search integration and exposure

Quarterly Objective 1:
  • Commons search extended to support search via structured data for media
C. improve our own feature set Operations, WMDE Erik, Stas, Guillaume EOQ To do
Cross-Departmental Program: Structured Data on Commons.

Segment 2: Search integration and exposure

Quarterly Objective 2:
  • Meet with SDoC project team to discuss how advanced search will be updated to support more specific media search filters.
C. improve our own feature set Readers, WMDE Erik, Stas EOQ In progress

Goal setting process owner: Darian Patrick

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team members ETA Status
Cross Departmental: Privacy, Security And Data Management.

Segment 2: Technology
Outcome 1: Through improvements to our organizational security posture, the Foundation ensures the high-quality protection and security of our infrastructure and data
Objective 2: Update tools and processes to keep pace with industry-wide security developments

Update MediaWiki security release process and tooling
  • Implement continuous integration of security patches
  • Transfer knowledge from Release Engineering
  • Develop regular release regimen
  • D: Tech Debt
  • TBD
  • Darian
  • Brian
  • Sam
EOQ To do


Goal setting process owner: Gabriel Wicke

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 7: Smart tools for better data

Outcome 2: access to Wikimedia content and data with scalable APIs

Objective 1: Revision storage scaling

Start gradual roll-out of Cassandra 3 & new schema to resolve storage scaling issues and OOM errors. B: Serving our audiences; D: Tech debt
  • Operations
EOQ Draft
Program 8: Multi-datacenter support

Outcome 2: Backend infrastructure works reliably across data centers

Objective 1: Reliable, multi-DC job processing

Begin migrating job queue processing to multi-DC enabled eventbus infrastructure.
  • Implement ChangeProp deduplication and rate limiting.
  • Start migrating job queue use cases.
D: Tech debt
  • Operations
  • Analytics
End of Q1 Draft

Goal setting process owner: Mark Bergsma

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 1: Availability, performance, and maintenance.

Outcome 1: All production sites and services maintain current levels of availability or better.

Continue Asia Cache PoP procurement, installation, and configuration tasks
  • Finish up trailing purchasing tasks from previous quarter (DC, hardware, network links, etc).
  • Procure at least one transit or peering link to help advance address space issue
  • Physically install all hardware
  • Acquire address space & communicate it to Wikipedia Zero partners (via the Zero team)
  • Turn up network links (stretch)
  • Configure network devices and hosts (stretch)
B. Serving our audiences

C. Improving our offering

Finance, Legal, Partnerships EOQ Done
Program 6: Streamlined service delivery

Outcome 1: We have seamless productization and operation of (micro)services.

  • Objective 1: Set up production-ready Kubernetes cluster(s) with adequate capacity
  • Objective 2: Create a standardized application environment for running applications in Kubernetes
  • Implement a pod networking policy approach
  • Upgrade to Kubernetes >= 1.5
  • Standardize on a "default" pod setup
  • Experiment with ingress solutions (stretch)
D. Improve our own feature set Release Engineering, Services EOQ Done
Program 1: Availability, performance, and maintenance.

Outcome 1: All production sites and services maintain current levels of availability or better.

Remove Salt from our infrastructure
  • Port debdeploy to Cumin
  • Migrate the reimage script to Cumin
  • Remove support for the Trebuchet deployment system
  • Remove Salt from production & WMCS
C. Improve our own feature set

D. Technical debt

Release Engineering, WMCS EOQ Done
Program 1: Availability, performance, and maintenance.

Outcome 1: All production sites and services maintain current levels of availability or better.

Prepare for Puppet 4
  • Support directory environments in our Puppet infrastructure and add an environment that uses Puppet's future parser
  • Switch at least 3 node groups to the future parser environment
  • Force both current and future parser for every test in the puppet-compiler
  • Integrate puppet-compiler with the Continuous Integration infrastructure (task T166066) (stretch) Postponed
  • Speed up CI for operations/puppet (task T166888) and add future parser validation
C. Improve our own feature set

D. Technical debt

Release Engineering EOQ Done
Program 1: Availability, performance, and maintenance.

Outcome 1: All production sites and services maintain current levels of availability or better.

Improve database backups' coverage, monitoring and data recovery time (part 1)
  • Adjust configuration management manifests to support MariaDB multi-instances
  • Migrate at least 2 instances on 1 dbstore host to the new multi-instance setup
  • Research backup storage options and prepare a design document
  • Investigate and experiment with replacements of mysqldump
D. Technical debt EOQ Done