Wikimedia Technology/Goals/2017-18 Q1

From MediaWiki.org
Wikimedia Technology Goals, FY2017–18, Q1 (July – September)

Introduction

Purpose of this document

Goals for the Wikimedia Technology department, for the first quarter of fiscal year 2017–18 (July 2017 – September 2017). The goal setting process owner in each section is the person responsible for coordinating completion of the section, in partnership with the team and relevant stakeholders.

Goals for the Audiences department are available on their own page.

Legend

Annual Program/Outcome refers to items in the annual plan (draft).

Tech Goal categorizes work into one or more of these quadrants:

  • A: Foundation level goals
  • B: Features we build for others
  • C: Features that we build to improve our technology offering
  • D: Modernization, renewal and tech debt goals

ETA fields may use the initialism EOQ (End of Quarter).

Status fields can use the following templates: In progress, To do, Postponed, Done, or Partially done.

CTO

Goal setting process owner: Victoria Coleman

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 4: Technical community building

Outcome 5: Organize Wikimedia Developer Summit

Objective 1: Developer Summit web page published four months before the event

Decide on Dev Summit event location, dates, theme, deadlines, etc. and publicize the information B Technical Collaboration EOQ To do

Analytics Engineering

Goal setting process owner: Nuria Ruiz

The Analytics Engineering team makes Wikimedia-related data available for querying and analysis to both WMF and the different wiki communities and stakeholders. We develop infrastructure so all our users, both within the Foundation and within the different communities, can access data in a self-service fashion that is consistent with the values of the movement. Next quarter we will have fewer resources (about two fewer people for most of the quarter), so our commitments are smaller. The second tier of priorities is marked as STRETCH; we will start working on those once the primary goals are almost completed.


Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 7: Smart Tools for Better Data.

Outcome 1: Foundation staff and community have better tools to access data.

Objective 1: Wikistats 2.0 redesign.

  • (carried over from last quarter) Initial deployment of Wikistats 2.0 UI task T160370
  • Start development of a new backend and API on top of the Data Lake edit data in Hadoop. task T156384
A. Org Level Priority. B. Serving our Audiences. Cloud Services To do
Program 7: Smart Tools for Better Data.

Outcome 1: Foundation staff and community have better tools to access data.

Objective 3: Experiments with real-time data and community support for new datasets available.

Deprecate socket.IO RC feed and help clients migrate (by July 7th) task T156919 B. Serving our Audiences. Ops Done
Program 7: Smart Tools for Better Data.

Outcome 1: Foundation staff and community have better tools to access data.

Objective 2: Better visual access to EventLogging data

Spike: load data in Superset, test, and productionize if applicable. task T166689 To do
Cross Departmental: Privacy, Security And Data Management. Track 2: Technology.

Outcome 2: To protect user data and uphold movement values, the Wikimedia Foundation continues compliance with best practices for data management

Objective 2 & 3: Better offboarding / onboarding for data access. Ensure retention guidelines are being followed

  • Audit existing stat/analytics shell accounts and incorporate expiration dates, make all accounts compliant with "time restricted access"
  • stat1002/3 replacement task T152712 To do
  • Revamp docs for access to data
  • (carried over from last quarter) Data purging for EventLogging task T156933
C. Improve our own feature set

D. Tech Debt

Ops To do
Program 1: Availability, performance, and maintenance.

Outcome 3: Scalable, reliable and secure systems for data transport.

Objective 1: Consolidation of Kafka infrastructure to tier-1 requirements, including TLS encryption

  • Bootstrap the new Kafka cluster (hardware refresh) with an upgraded version and support for TLS encryption and access control lists. task T152015
  • STRETCH: switch the Varnishkafka cache misc configuration to the new cluster and enable TLS authentication for it.
C. Improve our own feature set

D. Tech Debt

Ops To do
Program 1: Availability, performance, and maintenance.

Outcome 3: Scalable, reliable and secure systems for data transport.

Objective 3: Software, hardware upgrades, and maintenance on analytics stack to maintain current level of service

  • Hadoop health monitoring (add new checks such as default rack, corrupted blocks, etc.): task T166140
  • Prep work for EventLogging databases refresh (OS install, racking, etc.) task T156844
  • STRETCH: Stats for Varnishkafka errors: task T164259
C. Improve our own feature set

D. Tech Debt

Ops To do

Cloud Services

Goal-setting process owner: Bryan Davis

Work is in progress on choosing which tasks to designate as goals on Phabricator.

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 1: Availability, performance, and maintenance

Outcome 4: VPS hosting

Objective 2: Pay down tech debt by deploying OpenStack Neutron

Create a detailed migration plan for implementing Neutron as our OpenStack SDN layer C: Improving our offering

D: Tech debt

EOQ In progress
Program 1: Availability, performance, and maintenance

Outcome 4: VPS hosting

Objective 1: Maintain existing OpenStack infrastructure and services

Define a metric to track OpenStack system availability B: Serving our audiences EOQ In progress
Program 4: Technical community building

Outcome 1: Improve documentation

Objective 2: Create tutorial content

Plan contract documentation work B: Serving our audiences

C: Improving our offering

EOQ
Program 7: Smart tools for better data

Outcome 3: Data services

Objective 1: Provide reliable and available access to dumps

Begin migrating customer-facing Dumps endpoints to Cloud Services C: Improving our offering EOQ In progress
Program 10: Public cloud services & support

Outcome 1: PaaS is easy to use

Objective 2: Migrate workflows to Striker

Manage shared tool accounts via Striker C: Improving our offering EOQ In progress
Program 10: Public cloud services & support

Outcome 2: Rebranding

Objective 1: Complete core rebranding

Perform initial Cloud Services rebranding C: Improving our offering EOQ In progress
Program 10: Public cloud services & support

Outcome 3: Outreach

Objective 1: Promote services and products

Attend Wikimania to support and promote Cloud Services products with the Wikimedia communities B: Serving our audiences August Done
Program 10: Public cloud services & support

Outcome 4: First line tech support

Objective 1: Provide first line technical support resources

Hire first line technical support contractor B: Serving our audiences EOQ

Fundraising Tech

Goal setting process owner: Katie Horn

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Product Program 7: Payment processor investigation and long-term strategy
  • Outcome 1: Advancement and fr-tech find a solution that lowers or does not increase current maintenance costs.
Continue Ingenico integration Some MVP of Ingenico Ingenico (external vendor) EOQ To do
Product Program 8: Donor retention
  • Outcome 1: We hope to reduce the hours spent manually deleting duplicate records during Big English.
Time box work on Civi duplicate records Time boxed effort to resolve merge conflicts Major Gifts feedback EOQ To do

MediaWiki Platform

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 2: MediaWiki
  • Outcome 1: Stakeholders in MediaWiki development will have a sense of progress and direction in MediaWiki.
  • Hire product manager
  • Develop MediaWiki roadmap
D: Modernization, tech debt
  • PM
EOQ In progress
Program 2: MediaWiki
  • Outcome 2: MediaWiki code quality will be improved
  • MimeAnalyzer improvements (T155320)
  • Namespaceization (T166010)
C: Improving our offering

D: Tech debt

  • Brion
  • Tim
EOQ To do
Program 8: Multi-datacenter support
  • Outcome 1: Our audiences enjoy improved MediaWiki and REST API availability and reduced wiki read-only impact from data center fail-overs.
  • Objective 3: Integrate MediaWiki with dynamic configuration or service discovery, in order to reduce the time required for a master switch from one datacenter to another
  • etcd MW config (T156924)
C: Improving our offering Ops
  • Tim
EOQ In progress
Cross Departmental: Structured Data on Commons.
  • Outcome 1: Store structured data within wiki pages, in particular on media file pages on Commons.
  • Outcome 2: Introduce Multi-Content Revisions (MCR)
  • Actor table (T167246) development substantially complete, ready for deployment
  • Comment table (T166732) development substantially complete, ready for deployment
  • De-globalize EditPage.php (T144366)
  • Narrow AbuseFilter interface
  • EditPage backend (T157658)
A: Foundation level goals

C: Improving our offering

D: Tech debt

Ops (DBA)
  • Brad
  • Brion
  • Kunal
  • Tim
EOQ In progress

Performance

Speed is Wikipedia's killer feature. ("Wiki" means "quick" in Hawaiian.) As the Wikimedia Foundation’s Performance team, we want to create value for readers and editors by making it possible to retrieve and render content at the speed of thought, from anywhere in the world, on the broadest range of devices and connection profiles.

Goal-setting process owner: Gilles Dubuc

Annual Program/Outcome Quarterly objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 1: Availability, performance, and maintenance

-> Outcome 1: All production sites and services maintain current level of availability or better

--> Objective 2: Assist in the architectural design of new services and making them operate at scale

  • Follow-up bugfixes and improvements after Thumbor deployment to production - T121388
B: Features we build for others
  • Operations
  • Gilles
EOQ To do
Program 1: Availability, performance, and maintenance

-> Outcome 2: All our users consistently experience systems that perform well

--> Objective 2: Catch and address performance regressions in a timely fashion through automation

  • Test user performance from Asia to validate changes when the Asia Cache goes live - T169180
B: Features we build for others
  • Operations
  • Peter
  • Gilles
EOQ To do
Program 1: Availability, performance, and maintenance

-> Outcome 2: All our users consistently experience systems that perform well

--> Objective 3: Modernize our performance toolset. We will measure performance metrics that are closer to what users experience.

  • Rework NavigationTiming metrics to make them stackable - T104902
  • Add metrics for master queries on HTTP GET/HEAD - T166199
C: Feature
  • Aaron
  • Gilles
  • Peter
  • Timo
EOQ To do
Program 1: Availability, performance, and maintenance

-> Outcome 2: All our users consistently experience systems that perform well

D: Tech debt
  • Release Engineering
  • Timo
EOQ To do
Program 8: Multi-datacenter support

-> Outcome 1: Our audiences enjoy improved MediaWiki and REST API availability

--> Objective 1: MediaWiki support for having read-only "read" requests (GET/HEAD) be routed to other datacenters

  • Enable HTTPS for swift clients (MediaWiki) - T160616
  • Enable HTTPS for mariadb clients (MediaWiki) - T134809
  • Install and use mcrouter in deployment-prep - T151466
B: Features we build for others
  • Operations
  • Aaron
EOQ In progress

Release Engineering

Goal setting process owner: Greg Grossmeier

All tracked in: #releng-201718-q1 - More details at Wikimedia Release Engineering Team/Goals/201718Q1

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team members ETA Status
Program 6: Streamlined service delivery

Outcome 2: unified pipeline towards production deployment.

Objective 2: Set up a continuous integration and deployment pipeline

  • Define functional tests for Mathoid running on the staging Kubernetes cluster for use in future gating decisions - task T170482
  • Define method for monitoring and reacting to the above functional tests - task T170483
C. Improve our own feature set

D: Tech Debt

  • Operations
  • Services
  • Lead: Tyler
  • Antoine
  • Dan
  • Jean-Rene
EOQ To do
Program 1: Availability, performance, and maintenance

Outcome 1: All production sites and services maintain current levels of availability or better

Objective 1: Deploy, update, configure, and maintain production services

  • Deprecate use of Trebuchet across production - task T129290
D: Tech Debt
  • Operations
  • Services
  • Discovery
  • Cloud Services
  • Tyler
  • Chad
  • Antoine
  • Mukunda
EOQ In progress
Program 1: Availability, performance, and maintenance

Outcome 5: effective and easy-to-use testing infrastructure and tooling

Milestone 1: Develop and migrate to a JavaScript-based browser testing stack

  • Migrate the majority of developers to the JavaScript-based browser test framework (webdriver.io)
C. Improve our own feature set

D: Tech Debt

  • All developers
  • Notably: Wikidata, CirrusSearch
  • Zeljko
End of Q2 In progress


Research

In Wikimedia Research we use qualitative and quantitative methods to provide strategic insights and technological solutions to the movement and the Foundation, to foster innovation and to inform the development of new products.

Goal setting process owner: Dario Taraborelli

Note: This section can change significantly until the end of June when all goals for Q1 are finalized.

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 12: Grow contributor diversity.

Outcome 1: We improve Wikipedia’s contributor diversity after designing and testing potential intervention(s).

Objective 1: Identify the underlying (potential) causes of lack of representative contribution from certain demographics task T166083

Objective 2: Design frameworks to change the current socio-technical infrastructure to address at least one of the underlying causes of lack of representativeness

  • Create one or more formal collaborations for this research (T166085)
  • Perform a literature review and potentially run survey(s) to identify the self-reported causes of imbalanced representation in contribution
  • Initiate the design of a framework to address one of the identified/hypothesized causes of imbalance in contributor demographics, if time permits. (Stretch)
  • Run a couple of quick surveys to get a better sense of where in the pipeline we start losing diversity. (Stretch)
  • External collaborators (Bob and Jerome)
  • Leila
EOQ In progress
Program 9: Growing Wikipedia across languages via recommendations.

Outcome 1: Surface relevant information about the articles to editors at the time of editing with the goal of helping editathon organizers

Objective 1: Build, improve, and expand algorithms that can provide more detailed recommendations to editors and editathon organizers on how to expand articles/family-of-articles

  • Clean up the category system for machine consumption (this has been the focus for the past few months and we are very close to having a solution)
  • Start surfacing recommendations to collect feedback from editathon organizers
  • External collaborators (Bob, Michele, Tiziano)
  • Leila
In progress
Cross Departmental: Structured Data on Commons. Segment 4: Programs.

Outcome 2: Develop a better understanding of existing needs for Structured Commons (task T152248)

  • Develop interview protocol
  • Interview 6-8 GLAM stakeholders
  • Share initial user stories and research themes at Structured Data offsite

Scoring platform

Goal setting process owner: Aaron Halfaker

See the epic task: Phab:T166045

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 5: Scoring Platform (ORES).

Outcome 1: Tool developers and Product teams can innovate tools that use machine prediction to make wiki-work more efficient.

  • Objective 1: Expand vandalism & good-faith detection models to more wikis (focus on Emerging Communities)

Outcome 2: Volunteers are empowered to track trends in prediction bias and other failures of AI in the wiki.

  • Objective 1: Develop best practices for using community input to improve/correct predictions
  • Deploy thresholds selection system (T162217)
  • Deploy advanced support for Albanian and Romanian Wikipedias and basic support for Greek and Tamil Wikipedias
  • Design schema for meta ORES (T153152)
  • Report on implementations of meta ORES (T166053)
C Features that we build to improve our technology offering
  • Operations
  • Community Engagement
  • awight
  • halfak
Q1 In progress

Search platform

Goal setting process owner: Erika Bjune

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Product Program 1: Make knowledge more easily discoverable
  • Outcome 1: Through incremental Discovery improvements, readers are better able to discover and search for content.
  • Objective 1: Implement advanced methodologies such as “learning to rank” machine learning techniques and signals to improve search result relevance across language Wikipedias.
  • Objective 2: Improve support for multiple languages by researching and deploying new language analyzers as they make sense to individual language wikis.
Quarterly Objective 1:
  • Perform load and A/B tests on new models to make sure they can be safely deployed to production
  • When ready, deploy newly automated models which match (at a minimum) current performance of manually-configured search result relevance

Quarterly Objective 2:

  • Perform research spikes to find new analyzers for different languages
    • Test new analyzers to see if they are improvements (Japanese and Vietnamese)
    • Deploy new / updated analyzers
  • Deploy analyzers in progress from last quarter (Hebrew)
C. Improve our own feature set Analytics, Operations, Community Engagement Erik, David, Trey, Daniel (contractor) EOQ In progress
Product Program 1: Make knowledge more easily discoverable
  • Outcome 1: Through incremental Discovery improvements, readers are better able to discover and search for content.
Quarterly Objective 3:
  • Work on expanding category search in the Wikidata Query Service, while also collecting SPARQL statistics.
C. Improve our own feature set Analysis team, Operations, WMDE Stas, Guillaume EOQ In progress
Cross-Departmental Program: Structured Data on Commons.

Segment 2: Search integration and exposure

Quarterly Objective 1:
  • Commons search extended to support search via structured data for media

Quarterly Objective 2:

  • Advanced search will be updated to support more specific media search filters
C. Improve our own feature set Operations, WMDE Erik, Stas, Guillaume EOQ To do

Security

Goal setting process owner: Darian Patrick

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team members ETA Status
Cross Departmental: Privacy, Security And Data Management.

Segment 2: Technology
Outcome 1: Through improvements to our organizational security posture, the Foundation ensures the high-quality protection and security of our infrastructure and data
Objective 2: Update tools and processes to keep pace with industry-wide security developments

Update MediaWiki security release process and tooling
  • Implement continuous integration of security patches
  • Transfer knowledge from Release Engineering
  • Develop regular release regimen
  • D: Tech Debt
  • TBD
  • Darian
  • Brian
  • Sam
EOQ To do


Services

Goal setting process owner: Gabriel Wicke

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 7: Smart tools for better data

Outcome 2: access to Wikimedia content and data with scalable APIs

Objective 1: Revision storage scaling

Start gradual roll-out of Cassandra 3 & new schema to resolve storage scaling issues and OOM errors. B: Serving our audiences; D: Tech debt
  • Operations
EOQ Draft
Program 8: Multi-datacenter support

Outcome 2: Backend infrastructure works reliably across data centers

Objective 1: Reliable, multi-DC job processing

Begin migrating job queue processing to multi-DC enabled eventbus infrastructure.
  • Implement ChangeProp deduplication and rate limiting.
  • Start migrating job queue use cases.
D: Tech debt
  • Operations
  • Analytics
End of Q1 Draft

Technical Operations

Goal setting process owner: Mark Bergsma

Annual Program/Outcome Quarterly Objective Tech Goal Team Goal Dependencies Team Members ETA Status
Program 1: Availability, performance, and maintenance.

Outcome 1: All production sites and services maintain current levels of availability or better.

Continue Asia Cache PoP procurement, installation, and configuration tasks
  • Finish up trailing purchasing tasks from previous quarter (DC, hardware, network links, etc).
  • Procure at least one transit or peering link to help advance the address space issue
  • Physically install all hardware
  • Acquire address space & communicate it to Wikipedia Zero partners (via the Zero team)
  • Turn up network links (stretch)
  • Configure network devices and hosts (stretch)
B. Serving our audiences

C. Improving our offering

Finance, Legal, Partnerships EOQ To do
Program 6: Streamlined service delivery

Outcome 1: We have seamless productization and operation of (micro)services.

  • Objective 1: Set up production-ready Kubernetes cluster(s) with adequate capacity
  • Objective 2: Create a standardized application environment for running applications in Kubernetes
  • Implement a pod networking policy approach
  • Upgrade to Kubernetes >= 1.5
  • Standardize on a "default" pod setup
  • Experiment with ingress solutions (stretch)
C. Improve our own feature set Release Engineering, Services EOQ To do
Program 1: Availability, performance, and maintenance.

Outcome 1: All production sites and services maintain current levels of availability or better.

Remove Salt from our infrastructure
  • Port debdeploy to Cumin
  • Migrate the reimage script to Cumin
  • Remove support for the Trebuchet deployment system
  • Remove Salt from production & WMCS
C. Improve our own feature set

D. Technical debt

Release Engineering, WMCS EOQ To do
Program 1: Availability, performance, and maintenance.

Outcome 1: All production sites and services maintain current levels of availability or better.

Prepare for Puppet 4
  • Support directory environments in our Puppet infrastructure and add an environment that uses Puppet's future parser
  • Switch at least 3 node groups to the future parser environment
  • Force both current and future parser for every test in the puppet-compiler
  • Integrate puppet-compiler with the Continuous Integration infrastructure (task T166066) (stretch)
  • Speed up CI for operations/puppet (task T166888) and add future parser validation
C. Improve our own feature set

D. Technical debt

Release Engineering EOQ To do
Program 1: Availability, performance, and maintenance.

Outcome 1: All production sites and services maintain current levels of availability or better.

Improve database backups' coverage, monitoring and data recovery time (part 1)
  • Adjust configuration management manifests to support MariaDB multi-instances
  • Migrate at least 2 instances on 1 dbstore host to the new multi-instance setup
  • Research backup storage options and prepare a design document
  • Investigate and experiment with replacements of mysqldump
D. Technical debt EOQ To do