Wikimedia Engineering/2012-13 Goals

'''This page is the beginning of a template for the 2012-13 engineering goals of the Wikimedia Foundation's tech department. It's still in the early developmental stages.'''

Each section should include the following:


 * Goal statement (what are we trying to achieve)
 * Rationale (how does this project relate to our overall strategy)
 * Resources (broad outline of current and additional resources assigned to this project, including % allocation for partially allocated or part-time staff)
 * Activities (what concrete work will be undertaken through this project)
 * Outputs and outcomes (what will the delta be to current state as a result of this project, in terms of metrics, new functionality, new process, and ultimately towards our strategic goals)
 * Timeline (when will major milestones of this project be hit, with focus on the 2012-13 fiscal year)
 * Interdependencies (do we need help from other departments/individuals to successfully reach this goal)

Big Goals

 * Keeping the lights on
 * Improving the engineering process
 * Reversing the editor decline
 * Increasing mobile access and mobile contribution
 * Strengthening Community-Engineering communication

Site operations
Goals/Budget team: CT Woo (budget owner), project owners (for CapEx), Mark Bergsma (architect, CapEx plan), ops team

Key Goals

 * Goal #1
 * 'Keep the lights on': an uptime metric of 99.85% for *.wikipedia.org and *.m.wikipedia.org for Wikipedia readers, and 99.8% for editors.


 * Goal #2
 * Ensure sufficient computing and network capacity to meet the demands of new projects, 30% growth in user traffic, and 200% growth in multimedia content
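
To make the uptime targets in Goal #1 concrete, the following sketch converts an availability percentage into a downtime budget. The function name and the 365-day measurement window are assumptions made for illustration; the plan does not state how the metric is measured.

```python
def downtime_budget_hours(uptime_percent: float, period_hours: float = 365 * 24) -> float:
    """Return the maximum downtime (in hours) permitted by an uptime target.

    period_hours defaults to a 365-day year, which is an assumption.
    """
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # Targets taken from Goal #1 above.
    for audience, target in (("readers", 99.85), ("editors", 99.8)):
        hours = downtime_budget_hours(target)
        print(f"{audience}: {target}% uptime allows about {hours:.1f} hours of downtime per year")
```

Under these assumptions, the reader target allows roughly 13 hours of downtime per year and the editor target roughly 17.5 hours.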

Resources

 * Existing team & Existing Contractors
 * New Ops Engineer to focus on non-core projects like CRM, Survey Tools, Backups & Archives
 * New Storage / DBA to help Asher (who will be spending more time on the Application and Site Performance Project)

Key Activities

 * Operational Support (on-going)
 * Shell requests/extension deployment and review
 * RT support
 * Data Center duties


 * Capacity Growth (on-going)
 * storage growth expected to be 100-200%
 * site traffic (page views) growth expected to be 30%
 * network traffic growth expected to be 50%
 * existing projects (e.g., Analytics & Labs)


 * Site Availability (on-going)
 * server refresh
 * deployment process improvement
 * disaster recovery and backups


 * Site Security (on-going)
 * infrastructure security vulnerabilities (e.g. patches)
 * applications maintenance


 * New Projects & non-core (non-mediawiki) services support
 * mobile projects
 * Community projects (e.g. CRM)
 * Engineering tools (e.g. Mingle, Bugzilla, Git, TestSwarm)
 * WikiData
 * Analytics
 * Fundraising
 * OSM
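
The Capacity Growth estimates listed under Key Activities above can be turned into rough provisioning targets with a calculation like the following. This is a minimal sketch: the baseline of 100 units and the names GROWTH_ESTIMATES and projected_range are hypothetical, and only the growth percentages come from the plan.

```python
from typing import Dict, Tuple

# Growth ranges from the plan, as fractions (1.00 means +100%).
GROWTH_ESTIMATES: Dict[str, Tuple[float, float]] = {
    "storage": (1.00, 2.00),
    "site traffic (page views)": (0.30, 0.30),
    "network traffic": (0.50, 0.50),
}


def projected_range(baseline: float, low: float, high: float) -> Tuple[float, float]:
    """Return the (low, high) capacity needed once growth is applied to baseline."""
    return baseline * (1 + low), baseline * (1 + high)


if __name__ == "__main__":
    baseline = 100.0  # hypothetical current capacity, in arbitrary units
    for name, (low, high) in GROWTH_ESTIMATES.items():
        low_cap, high_cap = projected_range(baseline, low, high)
        print(f"{name}: {low_cap:.0f} to {high_cap:.0f} units")
```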

Site architecture
Goals/Budget team: CT, RobLa, Security engineer, Tim, MarkB, Terry, Roan

Key Goals

 * Goal #1
 * Transition over primary Data Center from Tampa to Ashburn (EQIAD)


 * Goal #2
 * Reduce response times by 70ms for users/readers in Asia and on the US West Coast (by redirecting users to the nearest server cluster)


 * Goal #3
 * Develop and execute a plan to reduce the operating cost of the data centers without compromising quality and availability

Resources

 * Current Staff members & contractors
 * New Contractor to work on new data center transition (3 months)

Key Activities

 * provide a cost-effective hot-failover/disaster-recovery site (West Coast) (Q4)


 * Failover Site / Site Move (TPA to IAD) (Q2)
 * network
 * applications
 * platform
 * ensure better response times for users/readers accessing our web properties by redirecting users to the nearest server cluster
 * AMS, IAD, TPA & SJC (Q2)
 * Reach out to get sponsorship for more Caching Centers around the world (on-going)
 * Asia
 * S. America


 * build a data archive service to ensure backups are performed, archived and tested for correctness (Q3)
 * setup and deploy backup infrastructure and process for archives and offsite
 * Archives for Community (e.g. XML Dumps)
 * simplify and strengthen our infrastructure security practices and technologies (Q2)


 * leverage the various Application stack instances (load-balancing)
 * Seamless cross-site failover of the application stack at the MediaWiki application level, the infrastructure level, and the system level


 * Develop and implement 'get well' plan for current 'Search' (Q2)


 * Intersection with features (e.g., messaging, notifications)?


 * infrastructure security architecture (on-going)
 * access (sudo, ssh open ports, more access control segregation & EOL process etc)

Rationale

 * Site performance and responsiveness are related to overall editor engagement and activity

Interdependencies

 * Site Performance
 * Site Operations

Wikimedia Labs
Goals/Budget team: CT, RyanL, Sumana, RobLa?, team

Key Goals

 * Goal #1:
 * Double the number of users and projects on Labs infrastructure
 * Goal #2:
 * Deploy Tools Labs
 * Goal #3:
 * Significant and meaningful pre-deployment testing of MediaWiki

Resources

 * Ryan Lane (lead)
 * Andrew Bogott (DevOps)
 * Faidon L (Ops Engr)
 * Sara S (Ops Engr - part time)
 * Chris McMahon (QA Lead)
 * Antoine Musso (Continuous Integration engineer)
 * Program Manager (new requisition - share with RobLa)

Q1

 * Upgrade and stabilize current infrastructure and software
 * Upgrade to Ubuntu 12.04
 * Upgrade to Nova Essex release
 * Change OpenStackManager to use Nova API, rather than EC2
 * Push OpenStackManager changes to filter projects
 * Push OpenStackManager changes to automatically reformat or reject SSH keys in the incorrect format
 * Push OpenStackManager changes to show SSH fingerprints for instances
 * Add Gluster support to Nova
 * Add DNS support to Nova (in Essex, but we must migrate to using it)

Q2

 * Move to Cisco hardware in pmtpa
 * Upgrade and stabilize current infrastructure and software
 * Add Puppet support to Nova
 * Add MediaWiki support to Nova
 * Create a second zone in eqiad
 * Create an HTTP reverse proxy service that acts like an OpenStack service, to lessen public IP usage
 * Create a turnkey MediaWiki puppet class

Q3

 * Add Tool Labs features
 * Database replication from production
 * User databases
 * Puppetize deployment-prep (likely more of a community goal)

Q4 or beyond

 * Development support function ("cross functional support") vs. community experiments
 * Enable automated testing for infrastructure, using Jenkins
 * Wishlist

Editor engagement features
Goals/Budget team: Howie, Fabrice, Ian, Terry, team

This team is focused on medium-sized infrastructure improvements which help us engage new and retain existing contributors.

Goal
Reverse the editor decline by making foundational improvements on Wikipedia.

Rationale

 * The Editor Trends Study
 * Board feels that editor retention is the most important focus for the Foundation.

Resources

 * Existing team (Fabrice, Ian B., Kaldari, Benny S., Andrew G. [part time], Brandon, Dario T.)


 * 2012-13:
 * Additional Development Team:
 * Front-end engineer
 * Back-end engineer
 * UX (shared)
 * Other
 * Data Analyst (if Dario works with E3 team)

Activities

 * Page Triage (in process)
 * Article Feedback (in process)
 * Article Creation Workflow (in process)
 * Notifications
 * Profile
 * Affiliation
 * Messaging
 * Curation processes

Outputs and Outcomes

 * Quantitative metrics TBD; but aim to have an active editors target for this team

Timeline

 * Ongoing
 * Second Team: Terry, can you let me know when you think this second team would be in place?

Interdependencies

 * Mobile
 * E3 team

Editor engagement experiments
This team is focused on smaller, daily or weekly experiments which demonstrate measurable impact on editor numbers and can be productized.

Goals/Budget team: Karyn, Howie, Alolita, Zack, community team ..

Goal
Reverse the editor decline by experimenting with smaller features. The output of these experiments has to be measurable and needs supporting metrics gathering.

Rationale

 * The rationale for editor engagement in general also applies here.
 * There is the opportunity to grab a bunch of "low hanging fruit" to move the needle on editor engagement with small changes.

Activities

 * Lots of small fun stuff. Some examples…
 * User warning templates
 * Size, wording, location, color of buttons
 * Microtasks as new engagement strategy

Output and Outcomes
There was no such project last fiscal year.

Interdependencies

 * Analytics to be able to "close the loop" on testing

Multimedia participation
Goals/Budget team: Erik, Howie, Terry, Alolita

Goals:
 * Enable multimedia contributions in a more user-friendly and seamless manner
 * Improve display of multimedia content
 * Support curation resulting from new contribution streams
 * currently almost "orphaned" (very little development going on)

Rationale

 * For editor engagement (above), Wikimedia Commons is an area of actual editor growth
 * With 1.19 and Swift, there is finally infrastructure to act aggressively on this space

Resources

 * Product Manager
 * Front-end engineer
 * Back-end engineer

Activities

 * WP/Commons integration
 * WP displays of MM content
 * Mobile photo upload
 * Associated curation
 * Potential: support for WikiLovesMonuments (?)
 * support for different file types (?)

Output and Outcomes

 * Target: xx commons contributors/month

Timeline

 * Begin work Q4 (calendar quarter) 2012

Interdependencies

 * Mobile drives a lot of need
 * Will drive a lot of storage load on Site Operations

Visual Editor
Goals/Budget team: Howie, Terry, Trevor, Gabriel, Roan, Rob

Goal Statement
To create a reliable rich-text editor that allows for editing underlying wikitext on multiple platforms (including mobile) and facilitate a possible future implementation of real-time collaborative editing.

Rationale
This relates directly to the editor retention issue:
 * The decline in new contributor growth is the single most serious challenge facing the Wikimedia movement since 2007.
 * Board feels that editor retention is the most important focus for the Foundation.

How the visual editor addresses the above problem:
 * Removing the avoidable technical impediments associated with Wikimedia's editing interface is a necessary precondition for increasing the number of Wikimedia contributors.
 * Many key features for Editor Engagement are dependent on a working two-way parser that the project is building (e.g. GoogleDoc-like annotation collaboration/talk page replacement).
 * Other key features for Editor Engagement are dependent on a working visual editor (e.g. messages in an abbreviated wikitext)

Activities

 * 1) Make the new parser feature-complete
 * 2) Make the user interface of the editor configurable on the wiki
 * 3) Research how to handle difficult parts of pages like images and tables
 * 4) Possible further research into collaboration-based features (through 3rd party or GSoC)

Outputs and outcomes

 * A feature complete parser that passes all the unit tests:
 * Should not break existing content
 * Some obscure wikitext patterns may need to be renormalized and converted to meet the above goal, but the target behavior is a parser that does not require this
 * Should mark output content so bi-directional roundtripping does not modify the original wikitext
 * Will hopefully become the canonical description for the underlying wikitext (folded into platform)
 * A working parser allows for two-way interaction between the user interface and the underlying wikitext
 * Ability to load and save an entire wiki page using the visual editor
 * Ability to extend the user interface with user-created and/or wiki-specific features

Timeline

 * By the start of FY 2012-2013:
 * Complete migration from EditSurface to ContentEditable.
 * Most unit tests of the parser pass.
 * First demo of editor able to edit and save a subset of real-life articles (open-edit-save)
 * Late Summer to early Fall 2012 (Difficult to specify because team is currently operating without a product analyst):
 * First opt-in user-facing production site (many parts will not be editable in visual mode)
 * Spec and complete other in-between deployment steps: new page creation, sentence editing, mobile editing
 * Begin work on other Editor Engagement touchpoints: "Edit-in-place" capability, readable diffs, …
 * End of FY 2012-2013 (Difficult to specify because team is currently operating without a product analyst): Some of the following visual editor features may be addressed:
 * table editing
 * lists editing
 * images and captions editing?
 * visual editing of templates?
 * visual diffs (change playback)
 * integration of collaborative editing work from GSoC student, if applicable

Resources

 * Trevor Parscal
 * Brion and Neil originally assigned to this, but were re-assigned elsewhere
 * Gabriel Wicke (parser - 20 hours starting late Oct 2011)
 * After February 1, 2012, most of the full team was established:
 * Rob Moen (moved from editor engagement as of Feb 1, 2012)
 * Roan Kattouw (moved to the US in Feb 2012; 65ish%, other time split with other projects)
 * Audrey Tang (parser - 5 hrs since Feb 1, 2012)
 * Open reqs filled by FY 2011-2012:
 * Parser engineer
 * Product Analyst to be hired in FY2011-12

System resources:
 * Currently no need for additional resources
 * Once in production, some node.js infrastructure may be needed (this is not finalized), though this is likely to be contained by repurposing existing parser infrastructure for the more efficient parser.
 * If collaboration feature is added, there might be a need for additional resource infrastructure

Interdependencies
Cooperation with Wikia:
 * Inez Korczynski
 * Christian Williams (started around Feb 1st)

Editor Engagement Team. At some point there needs to be a sync-up:
 * Assist in UI problems (how to edit tables)
 * Social aspect of visual editor (multiple contributors)
 * Editor engagement usage of the parser and the visual editor

Ops assistance:
 * Review node.js infrastructure developed in Labs

Mobile (VE acts as a provider of resources to mobile):
 * Current design should support mobile editing (Android ICS and iOS 5) which should facilitate adoption by the mobile team

Mobile Contribs
This team is under the Mobile team below.

Site performance
Goals/Budget team: RobLa, CT, Tim Starling, Asher, Preilly, MarkB, Terry

Goal statement
For 2012-13, we would like to professionalize our process of performance engineering. In so doing, we plan to appreciably improve the user experience for editors and readers, and achieve savings in hardware requisitions.

Rationale
It is a well-studied phenomenon that even small delays in response time (e.g. half a second) can result in sharp declines in web user retention. Popular websites such as Google and Facebook therefore invest heavily in site performance initiatives and, partly as a result, remain popular. Formerly popular sites (such as Friendster) suffered due to lack of attention to these issues. Wikipedia must remain usable and responsive in order for the movement to sustain its mission.

Currently, complicated and popular articles (e.g. "Barack Obama") often take 30 or more seconds to render when the cache is invalidated for an article (e.g. when the article is edited, or when an included template is edited). While article rendering is possibly an extreme example, we have several other pockets of our systems that have similar problems.

Our volunteer and paid developers have few tools to understand how their work impacts performance (for good or for bad). Furthermore, even editors have an impact on performance, but they have few tools to understand what that impact is.

We need to invest in tools that make it possible for developers and editors to know what impact they are having, both so that we can accurately assess when a feature is creating a performance problem for us, and so that we can better understand the impact of our investment in performance. That will give us the visibility we need to address the most important issues, instead of relying on gut feel and lore to decide what is "good" or "bad" for site responsiveness. This will allow us to focus our effort on the most meaningful of improvements. And of course, we need to use this information to improve site performance.

We made some progress in 2011-12. Asher Feldman deployed many performance measurement tools such as Graphite (publicly available at gdash.wikimedia.org, with more tools available to staff). Tim Starling finished our disk-backed object cache project, which by one measure decreased average response time by 80-100 milliseconds. Tim also intends to make significant progress on introducing Lua as a new, faster template language alternative. However, neither Asher nor Tim has been able to focus sufficiently on performance to make the kind of progress that we need, since both play critical roles in the day-to-day operation of the website.

Resources

 * Tim Starling (75%)
 * Asher Feldman (50%) - contingent on filling a new DBA role in Operations
 * Performance engineer (100%)

Activities

 * MediaWiki improvements
 * Lua
 * Apache CPU/memory (parsing, other PHP-intensive work)
 * Cache utilization
 * JavaScript and client-side loading issues
 * Perceived performance
 * Other software improvements
 * Remote services (search)
 * HipHop(VM)
 * Operational improvements
 * I/O performance (Swift, NFS)
 * Squid/Varnish/other middleware
 * Network buffers
 * Turn up caching centers to bring data closer to users and reduce latency.
 * New technologies (e.g. flash drives, Varnish)
 * Tools and measurement
 * Profiling, Analysing and Targeting
 * Tools for PHP developers
 * Tools for template authors

Outputs and outcomes
At the end of 2012-13, we would like to achieve the following:
 * Markedly better article rendering performance, rendering "slow" (>30 second) pages 4x faster with comparable functionality and editor workflow
 * More informed template authors with the tools necessary to avoid performance pitfalls and the ability to keep page rendering time at acceptable levels.
 * More informed developers with the tools necessary to avoid performance pitfalls.
 * Reduce ping times from all worldwide locations to <150ms according to Watchmouse and site24x7.com. Reduce ping times to all European and North American locations to <80ms according to the above resources.
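
The numeric targets above lend themselves to a simple automated check. The sketch below is illustrative only: the region labels, function names, and any measurements fed into it are hypothetical, and it does not reflect how Watchmouse or site24x7.com report their data.

```python
from typing import Dict

# Targets from the outcomes list above; the region keys are hypothetical labels.
PING_TARGET_MS: Dict[str, float] = {
    "worldwide": 150.0,
    "europe_north_america": 80.0,
}
RENDER_SPEEDUP = 4.0  # "4x faster" for slow pages


def meets_ping_target(region: str, measured_ms: float) -> bool:
    """Return True if a measured ping time is strictly below the region's target."""
    return measured_ms < PING_TARGET_MS[region]


def target_render_seconds(current_seconds: float, speedup: float = RENDER_SPEEDUP) -> float:
    """Return the render time implied by applying the speedup to the current time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return current_seconds / speedup


if __name__ == "__main__":
    # A page that takes 30 seconds to render today would need to render in 7.5 seconds.
    print(target_render_seconds(30.0))
    # A hypothetical 95 ms measurement passes the worldwide target but not the stricter one.
    print(meets_ping_target("worldwide", 95.0), meets_ping_target("europe_north_america", 95.0))
```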

Timeline

 * July-September 2012
 * Experimental, limited deployment of Lua to the production cluster.
 * Detailed survey of performance landscape
 * Iterative improvement of performance tooling for developers
 * October-December 2012
 * Ramping up Lua to play gradually larger role on site
 * Page rendering performance tools to help assess impact of Lua
 * Iterative improvement of performance tooling for developers
 * More widespread improvement based on lessons learned during detailed survey
 * January-March 2013
 * Full deployment of Lua to the production cluster (contingent upon October assessment)
 * Iterative improvement of performance tooling for developers
 * More widespread improvement based on lessons learned during detailed survey
 * April-June 2013
 * Iterative improvement of performance tooling for developers
 * More widespread improvement based on lessons learned during detailed survey
 * Deploy HipHop(VM) to cluster (subject to availability of underlying tech)

Mobile
Goals/Budget Team: Tomasz (budget owner), Phil Chang (product manager), Patrick Reilly (sr engineer), Jon Robson (front end dev), mobile team

Increasing Contributions
Overall narrative note:

Upload is our first contribution feature. It will be completed for an experimental community trial by July. At that point we'll shift to experimenting with ways to engage casual users in multimedia contributions. This will interface with and depend on the multimedia team proposed in the plan. The multimedia team will develop desktop-focused curation tools, while the mobile team will develop mobile-focused curation tools for multimedia.

The mobile-focused curation tools may be our first microtask experiment on mobile devices. If we find that mobile multimedia contributions are not taking off, we will likely shift contribution efforts to other initiatives.

With regard to mobile editing, we are not making any assumptions about what forms of editing are likely to be used. However, all forms of editing require baseline support within the mobile infrastructure for text parsing and text manipulation. This is an infrastructure project that will likely not pay off immediately. We can target early text contribution efforts, such as block-level editing and new page creation, but ultimately our priorities will depend on where we see productive user adoption.

For microtasks, the mobile team will likely need to interface closely with the experimentation team, which already has a list of microtasks it would like to try, but will bias towards desktop experiments if we do not have proper orientation towards the mobile UI and APIs.

Goal

 * To facilitate contributions on mobile devices

Rationale

 * Our mobile page growth continues to be 5-15% every month, but these users can't contribute. To reverse the editor decline, we need to capture new users who are coming online primarily (and sometimes only) on mobile devices
 * Board supports growth in areas that support mobile contributions
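
To illustrate what sustained monthly growth of 5-15% would mean over a year, the following sketch compounds a monthly rate. The function name is hypothetical, and the assumption that growth stays constant for twelve months is ours, not a claim made in the plan.

```python
def annual_growth_factor(monthly_rate: float, months: int = 12) -> float:
    """Return the multiplier after compounding monthly_rate for the given number of months."""
    if monthly_rate <= -1:
        raise ValueError("monthly_rate must be greater than -1")
    return (1 + monthly_rate) ** months


if __name__ == "__main__":
    # Low and high ends of the monthly growth range quoted above.
    for rate in (0.05, 0.15):
        print(f"{rate:.0%} per month compounds to {annual_growth_factor(rate):.2f}x per year")
```

If the rate held steady, the low end of the range would compound to about 1.8x in a year and the high end to more than 5x.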

Resources

 * Existing Team (Phil, Patrick, Jon, Arthur, Lindsey, SDE)
 * 2012-13
 * 2 front-end engineers due to heavy user-facing features
 * Yuvi (part time)

Activities

 * Media (Upload Wizard)
 * Commons continues to grow as a user community and mobile can help to accelerate and simplify media contributions.
 * Image Curation - If our mobile projects succeed then we'll have to address how we're going to review the wealth of content coming into our projects
 * Editing
 * We have to start supporting editing functionality on mobile devices. This is the key source of contribution and our mobile users can't be treated as second class contributors.
 * Micro Tasks - We need to reach out past edits if we want to tap into the full potential of mobile devices.
 * GPS
 * Article drafts
 * Actions on red links
 * Article Feedback
 * Tablet Support
 * Create a third prototype layout that is geared for large touch devices
 * Extend mobile visual editor work to tablets
 * Ongoing app releases to reach a broader contributor base
 * Mobile Commons
 * Build a custom mobile experience for Commons to better attract contributions

Output

 * Mobile will finally provide its 2 billion page views a month with a simple and easy set of contributory pipelines. Since mobile is seeing the biggest rise in readership, it makes sense to start turning those readers into contributors. This could help immensely with our editor retention problem.

Timeline

 * Upload Wizard - Q1 Fiscal 2012
 * Micro Contributions - Q2 Fiscal 2012
 * Simple Editing - Q3 Fiscal 2013

Interdependencies

 * Visual Editor
 * Editor Engagement Team

Developing Alternate Access Methods
Goals/Budget Team: Tomasz (budget owner), Phil Chang (product manager), Patrick Reilly (sr engineer), Kul Wadhwa, mobile team

Goal

 * Lower the global barrier for access to our projects by providing low cost and low tech solutions to access Wikipedia.

Rationale
Data charges and technological barriers should not impede access to our projects. There are easy ways that we can reach a significant number of people if we are innovative with how they can access Wikipedia. Through partnerships, SMS/USSD, S40 J2ME we have

Resources

 * Partner Support Engineer (PSE)
 * Additional PSE to support double digit partner growth
 * Praekelt Foundation (contractor)
 * Emmanuel (contractor)

Activities

 * Zero Partnerships
 * Remove Technical Obstacles
 * Vumi (SMS/USSD)
 * Reach millions of new users who don't have data plans
 * J2ME
 * Reach many billions more users who are not upgrading to smart phones
 * Reach
 * OpenZIM + PhoneGap
 * Support downloading of offline collections within the official Wikipedia app

Outputs

 * Broadening global reach
 * Capacity support for Zero build out

Timeline

 * 5 - 10 new country launches every two months

Interdependencies

 * Operations
 * Tech Support Manager (Global Dev)

Goal

 * To broaden the reach of the Wikimedia Mobile projects through developer training, conference outreach, user testing of betas, and general evangelism

Rationale

 * Our mobile projects have the potential to attract a lot of developer interaction. Currently we're engaging with a small developer base, but we could be doing much more. With someone focused on outreach, we could increase the number of mobile developers, increase knowledge of the open mobile web, and include more non-developers in our projects

Resources

 * Community Liaison
 * Tomasz

Activities

 * Crowd Sourcing organization for beta test
 * App developer outreach to better understand what our API can't do
 * Brazil Hackathon
 * Conferences
 * Open Mobile Web evangelism

Outputs

 * More volunteer participation within mobile

Interdependencies

 * Global Dev
 * TL;DR

Improving the internal infrastructure
Goal: Evolving the infrastructure that powers mobile

Rationale

 * Our mobile projects have been extremely successful thus far, but they can't continue to scale at our growth rate unless we better integrate them into the core of MediaWiki, simplify our data sets, build a stronger API, and get better analytics. We have to merge the common functions of MobileFrontend into core so that we can work with non-mobile developers to develop mobile solutions for our organizational goals.

Resources

 * Patrick
 * Arthur
 * Jon
 * Max
 * Brion (consulting)

Activities

 * Integrating MobileFrontend (MF) into core
 * Allows any WMF developer to create mobile interfaces
 * API
 * Extend for better data re-use
 * Increase performance for a better user experience
 * Migrate to resource loader
 * Remove the need to reparse page content
 * Open Street Maps
 * Deploy production service for internal use
 * Integrate into mobile web experience
 * WikiData/GPS
 * Deploy GeoHack replacement to store GPS data for mobile re-use
 * Expand Mobile Analytics
 * Migrate away from WURFL to a better licensed solution
 * Internationalization
 * Web fonts
 * Language Selection

Output

 * Mobile will be a core part of MediaWiki
 * Developers outside of the mobile department will be able to easily develop mobile solutions
 * Decrease the time for each mobile project due to improved QA
 * Faster API due to optimizations
 * GPS read/write API

Interdependencies

 * Platform
 * Features
 * Analytics

Internationalization
Goals/Budget team: Alolita (budget owner), Siebrand (product manager), team, +consultation with engineering/product leads


 * Input methods / Onscreen keymaps
 * Fonts
 * Language settings/selection
 * Translation
 * Search?
 * Dictionaries for translation tools, Wikisource
 * Mobile I18n support
 * Visual Editor I18n support/integration
 * Metrics/measurement stats
 * Language support tool APIs (for 3rd party use)
 * (data on impact?)
 * Improve QA / testing
 * Incorporate community feedback loop for I18n tools
 * Improve RTL support

Analytics
Goals/Budget team: Diederik, RobLa, CT, Dave, Howie, PMs, Terry

Key Goals

 * Data Services Platform: construct a compute cluster to store, analyze, and query all incoming data of interest to the community, including traffic data, application instrumentation, and edit/editor data.
 * Intelligence for Institutional Goals, with intermediate solutions for:
 * Supporting the new Editor Engagement Experiments team
 * Editors by Geography
 * Pageviews by Mobile Carrier
 * Instrumentation: WMF Mobile apps, Click Tracking, RevTagging (Account Creation, Edits)
 * Improvements to the high-level metrics Report Card

Rationale
The Wiki Movement has a chronic need for analytics. We need it to understand our editors, to encourage growth, to engender diversity, to focus our resources, to improve our engineering efforts, and to measure our success. It permeates nearly all our goals, yet our current analytics capabilities are underdeveloped: we lack infrastructure to capture editor, visitor, clickstream, and device data in a way that is easily accessible; our efforts are distributed among different departments; our data is fragmented over different systems and databases; our tools are ad-hoc.

Rather than merely improve existing jobs and data pipelines, the Analytics Team aims to construct a Data Services Platform capable of mining intelligence from all datastreams of interest, providing this insight in real time, and exposing it via an API to power applications, mash up into websites, and stream to devices.

Planning Documents

 * Analytics/2012-2013 Roadmap
 * Analytics/2012-2013 Roadmap/Hardware

Resources

 * Capital expenditure for cluster hardware.
 * Product Manager: Diederik van Liere
 * Engineers: David Schoonover, Andrew Otto.

Q1

 * DSP Planning. Finalize and document:
 * Overall system architecture
 * Data collection design
 * Pipeline integration points (web servers, MediaWiki core + extensions, application instrumentation)
 * Batch job design for core indices and materialized views (especially typical web traffic metrics, editor and edit metrics)
 * Query and API design
 * Stand up core processing cluster
 * Components:
 * Databases
 * Batch processing system
 * Query components
 * Support servers (ZK, monitoring, etc)
 * Puppetize configurations
 * Successfully process and query an ad-hoc job

Q2

 * Offline Batch Data Imports
 * Import ad-hoc dataset
 * Perform ad-hoc analysis
 * Pixel Service (REST tracking endpoint)
 * Code development and testing
 * ETL Stream Processing components
 * Code for jobs (Storm topologies) performing:
 * IP lookup for geo, mobile carrier
 * Anonymization
 * Canonicalize data
 * Systems integration (pull from DBs, push to other standing queries/batch)
 * Core Batch Jobs
 * Support for top-k(*), sum(cardinality estimators, counters, etc)
 * Create indices on request traffic (by timeseries, URL, geo, mobile)
 * Create indices on edits and editor data (by timeseries, wiki page, geo, editor-activity, editor-age)
 * Internal Query Dashboard
 * Integration with Hive/Pig
 * Service some internal queries on ad-hoc data
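
The top-k and cardinality items under Core Batch Jobs above can be illustrated on a small in-memory sample. This is a minimal sketch using exact counting from the Python standard library; the function names and sample request log are hypothetical, and a production cluster would more plausibly use approximate estimators over streaming data, which this example does not attempt.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple


def top_k(items: Iterable[Hashable], k: int) -> List[Tuple[Hashable, int]]:
    """Return the k most frequent items with their counts, most frequent first."""
    return Counter(items).most_common(k)


def distinct_count(items: Iterable[Hashable]) -> int:
    """Return the exact number of distinct items (a stand-in for a cardinality estimator)."""
    return len(set(items))


if __name__ == "__main__":
    # Hypothetical request log: one URL path per page view.
    requests = ["/wiki/A", "/wiki/B", "/wiki/A", "/wiki/C", "/wiki/A", "/wiki/B"]
    print(top_k(requests, 2))
    print(distinct_count(requests))
```

Exact counting keeps every distinct key in memory, which is the cost that approximate estimators are designed to avoid at larger scale.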

Q3

 * Server Agent (request data import streams)
 * Integration with web servers & caches
 * Non-production load testing of pixel service and server agents
 * Buffering and streaming (Scribe/Flume?)
 * Zero-downtime/system integration testing
 * Core Batch Jobs
 * Create traffic funnel indices
 * Add referrer data to by-URL indices

Q4

 * Batch Import
 * Editor userdata from SQL production servers
 * Internal Query Dashboard
 * Job control functions
 * Job tuning; internal stats about runtimes

Unscheduled (Q3+ or FY+1)
Aimed at Q3/Q4 if possible, but FY 2013-2014 otherwise.


 * External Query Gateway
 * Design batch jobs for performance-sensitive (API-exposed) metrics
 * Create API for select metrics
 * Developer Accounts
 * API keys, OAuth
 * QoS Controls
 * Job resource usage & execution time monitoring, controls, throttling
 * API rate-limiting
 * Abuse monitoring and filtering
 * UI and signup workflows
 * External documentation
 * Generic MediaWiki Tracking Extension
 * Rewrite of Clicktrack and CentralNotice for new A/B testing extension.
 * Internal PHP API for core
 * Internal PHP API for extensions
 * JS API for on-page tracking
 * Pixel Service (REST tracking endpoint)
 * Integration with mobile app instrumentation
 * Integration with click-tracking
 * Integration with rev-tagging
 * Integration with MediaWiki core
 * Core Batch Jobs
 * Session analysis (views per visit, bounces, per-page {top in-pages, top out-pages})
 * Search analysis (internal & external search referrers, keyword top-k)
 * Other Batch Imports
 * Wiki content via SQL mirror or XML dump
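
The API rate-limiting item under QoS Controls above could be approached in several ways. One common technique is a token bucket, sketched below. The class name, parameters, and injectable clock are all hypothetical, and the plan does not say which algorithm the External Query Gateway would use.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self.rate = rate
        self.capacity = capacity
        self._clock = clock  # injectable so the bucket can be tested without sleeping
        self._tokens = capacity
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits within the budget."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=5.0, capacity=10.0)  # hypothetical per-API-key limits
    allowed = sum(bucket.allow() for _ in range(20))
    print(f"{allowed} of 20 back-to-back requests allowed")
```

In a gateway, one bucket would typically be kept per API key, so that a single heavy consumer cannot exhaust capacity for others.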

Interdependencies

 * Biggest unknown is the availability of hardware. This planning assumes that the 10-node cluster is available from April 1 and that subsequent hardware is available from August 1.
 * Significant work with ops in migrating the cluster from pre-production setup to production use
 * Close work with data consumers in and outside of the Foundation to ensure solutions meet their needs

Goal statement
To maintain and improve donation pipeline reliability and privacy and security compliance, and to begin providing the business with the better analytics it needs to reach the Foundation’s fundraising goals.

Rationale
Fundraising is the source to the rest of Wikimedia's sink.

Resources

 * Katie Horn - Technical Lead
 * Jeremy P. - Fundraising Engineer
 * Jeff Green - 50% - Operations Engineer
 * + 2 full-time reqs TBD by end of FY 2012
 * (non-Engineering, Peter Gehres - Fundraiser Production)

[ current system resources allocation to be provided by ops ]

Activities

 * Maintain all pieces of the current donation pipeline
 * DonationInterface extension
 * Anti-fraud measures
 * Additional payment gateways
 * Maintain integration with third parties
 * Ensure security (in progress)
 * Improve redundancy and logging to aid in the event of unforeseen circumstances
 * CentralNotice extension
 * FundraiserStatistics extension
 * FundraiserLandingPage extension
 * ContributionTracking extension
 * CiviCRM (customer relationship database) maintenance and improvements for bare usability. Currently there are major bugs and scalability issues that WMF has run up against.
 * ActiveMQ (queuing)
 * Research into alternatives for ActiveMQ
 * Code changes in most other parts of the pipeline once an alternative is found
 * Improvements to reliability of donation pipeline
 * CentralNotice extension: needs better A/B testing support within the same geolocation bucket
 * Third-party service automatic health check system
 * Upgrade payments cluster to a more recent version of MediaWiki (try to keep it near-ish to the version on the cluster)
 * Code coverage (unit tests)
 * Documentation of the existing systems.
 * Bulk mailing infrastructure (CiviCRM can't handle it, current structure was done with ad hoc scripts)
 * Greater level of PCI (Data privacy and security) compliance
 * Improve the current analytics system to allow for real-time reporting as well as increase scalability, security and transparency
 * Donation auditing - Many numbers are required by finance and global dev throughout the year. Tools to get these numbers must be written and maintained.
 * Improve the methods used to build "missing" data from our payment processors, logs, and contribution tracking data.

Outputs and outcomes

 * Payment metrics
 * New payment methods
 * Additional payment processors (required for some new payment methods as well as redundancy)
 * Redundant payments clusters between pmtpa and eqiad with failover
 * CiviCRM metrics
 * Whether it is still running without crashing or frustrating business (too much).
 * A higher certification level measures PCI compliance
 * Code coverage for testing is measurable against the total code base
 * Payments cluster version lag versus current MediaWiki is currently 2 behind 1.19.
 * Better caching configuration for donatewiki
 * Health and alert system that monitors the health (or lack thereof) of the pipeline during high-traffic times and notifies us when immediate action is necessary
 * API for CentralNotice allowing for automated checking of banner allocations and alerts
 * Automated disabling of payment methods for down payment processors (currently very time consuming and error-prone)
 * Selenium testing for banners and landing pages to ensure that changes to templates and JavaScript do not have negative effects
 * Improved transparency to donors with regard to donations and other fundraiser metrics
 * Improved analytics system that allows fundraising creative and production teams to iterate rapidly with near real-time information
 * Fix CiviCRM or improve mass-mailing infrastructure to send emails to current and past donors
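
The automated disabling of payment methods for down payment processors, listed above, could follow logic along these lines. This is a sketch under stated assumptions only: `ProcessorHealth`, `route_methods`, the processor and method names, and the threshold of three consecutive failed checks are all hypothetical, and nothing here reflects the actual DonationInterface code.

```python
from collections import defaultdict, deque


class ProcessorHealth:
    """Track recent health-check results per payment processor."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        # Keep only the most recent `failure_threshold` results per processor.
        self._recent = defaultdict(lambda: deque(maxlen=failure_threshold))

    def record(self, processor, ok):
        self._recent[processor].append(bool(ok))

    def is_down(self, processor):
        # Down means: enough checks recorded, and every recent one failed.
        recent = self._recent[processor]
        return len(recent) == self.failure_threshold and not any(recent)


def route_methods(methods, health):
    """Map each payment method to its first healthy processor.

    `methods` maps a method name to processors in order of preference.
    Methods with no healthy processor are omitted, i.e. disabled.
    """
    routing = {}
    for method, processors in methods.items():
        for processor in processors:
            if not health.is_down(processor):
                routing[method] = processor
                break
    return routing
```

Requiring several consecutive failures before disabling a method avoids flapping on a single failed check; a single successful check re-enables the processor in this sketch.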

Timeline

 * Start of FY 2012-2013 - fundraising tech fully staffed
 * Late summer/early fall 2012 - Freeze for new payment providers
 * Fall 2012 - Freeze for new payment methods for existing payment providers
 * Nov/Dec 2012 - Annual fundraiser
 * Jan/Feb 2013 - Wrap up of 2012 fundraiser
 * Feb/March 2013 - Kick-off of 2013 fundraiser campaign and planning

Interdependencies
The fundraising tech team works closely with the fundraising production and fundraising creative teams that are part of the Community Department. The team also works with LCA to ensure compliance with various privacy policies, execution of contracts with payment providers, as well as community support.

Any security engineer that we hire will work closely with the FR-tech team to ensure that our systems are secure.

Analytics - There is currently nobody dedicated to fundraiser analytics.

MediaWiki platform
Goals/Budget team: RobLa, TimS, Sumana, Chad, Brion, Roan, team

Rationale
The MediaWiki software at the heart of our software infrastructure needs continuous modernization in order to support our ambitious initiatives and in order to improve site stability.

In 2012-13, there are core technologies we need in order to support new innovation. We need support for new MediaWiki revision types and some level of data transclusion to support Wikimedia Deutschland's Wikidata effort. A number of parts of the core software will need to be reworked in order to support flexible methods of user notifications. OAuth will let users securely grant new tools the ability to take actions on their behalf (such as transferring images from other websites), without needing to share their password with anyone.

Our software also needs many improvements to increase our operational efficiency and stabilize our infrastructure. The way we configure our software (global variables) hasn't changed since the very early days of the project, despite enormous problems in maintainability, ability to test components, and ability to flexibly configure our systems. We require shell access (and often staff time) to configure many things that site admins should be empowered to change. We need to more fully support our new ability to serve from multiple datacenters, by making it possible to seamlessly switch between data centers without noticeable glitches (such as loss of session data). We need to continue to improve MediaWiki's ability to handle different storage techniques such as Swift, so that we can expand our media storage, remove upload limits, and prevent the imminent exhaustion of our existing media storage. Our search infrastructure also needs improvement.

Resources

 * Aaron Schulz (100%)
 * Chad Horohoe (40%)
 * Software Security Engineer (40%)
 * Software Engineer (80%)

Activities

 * Notification framework — Add a common framework for storing/sending/reading talk, watch, and other notifications
 * MediaWiki core features for Wikidata
 * Media types on revisions
 * Raw data transclusion
 * Other core architecture work
 * Configuration database - Revamp our configuration management storage: remove global variables, stop proliferation of new ones, give extensions a means of not adding to technical debt, expand unit-testability
 * Media infrastructure - file storage improvements, limitation removal, and maintenance; transcoding support infrastructure
 * Search maintenance and incremental improvements to MediaWiki/Lucene interchange layer.
 * Super-protection
 * Replicated session handling
 * OAuth/OpenID - possible work on AcademicAccess

MediaWiki development process
Goals/Budget team: RobLa, Erik, Tim, Chad, Antoine, Terry


 * Ongoing activities
 * Continued code maintenance and review of new changes
 * MediaWiki release management and installer maintenance
 * Shell bug requests and MediaWiki-specific operations support
 * Support for outside initiatives (e.g. Wikidata)
 * Security bugfixes and emergency releases
 * Large reviews (security, performance, architecture) of new system components
 * Developer training
 * Manage 20% time and other ways to keep code review in check
 * Git/Gerrit/Gitorious/... tools & process improvements; further usability improvements, GitHub integration
 * Continuous integration improvements

QA
Goals/Budget team: RobLa, Chris, Tomasz, Alolita, Antoine, MarkH ...

Goal
Our goal with our Quality Assurance activity is to accurately assess the quality of new software (and address major problems) prior to imposing that software on editors and readers of our site.

Rationale
In 2011-12, Wikimedia Foundation established a beginning for quality assurance activities. We hired Chris McMahon as our QA Lead, and brought in Antoine Musso to help with our test automation infrastructure. We also have a Bugmeister who helps prioritize and assign the bugs submitted through our public bug tracker.

We don't believe our current hiring is sufficient. WMF today has 22 full-time developers and approximately 20 more part-time or contract developers, in addition to a large contingent of volunteer developers. While there is no standard ratio of developers to testers, and that ratio varies widely across the software development landscape, one commonly quoted figure in practice in the industry is three developers to one tester. Unfortunately, we can't afford that luxury, so we're going to have to be very strategic about the hiring we do in this area.

In 2012-13, we plan to dedicate resources to streamlining test automation, rallying community support for test efforts, providing infrastructure for better developer collaboration with testers, and providing burst testing capacity when needed. This all will help ensure that our site remains stable.

Resources

 * Chris McMahon (100%)
 * Antoine Musso (80%)
 * Bugmeister (100%)
 * QA Engineer (100%)
 * Volunteer QA Coordinator (100%)
 * Contractor budget

Activities

 * Test automation
 * API
 * Desktop browser-based UI automation (e.g. Selenium, TestSwarm)
 * Mobile automated testing (emulation/on-device)
 * Unit testing/CI/phpUnit/QUnit
 * Public manual test environment
 * Beta Labs
 * Labs instances for feature development
 * Community testing/Crowdsourcing
 * Building volunteer base through tester recruitment and bug bashes
 * Crowdsourcing services
 * Test documentation and management
 * Test plan writing
 * Test plan standards/cleanup and oversight
 * Doc improvements - Install, automated suite docs, labs, etc.
 * Manual testing
 * Supporting the MediaWiki release cycle
 * Cross browser testing
 * Editor engagement features, i18n, mobile devices, core maintenance/misc, multimedia
 * Manage contractor capacity, especially when burst activity is needed.

Wikimedia technical community
We aim to support volunteers and companies/orgs who work on Wikimedia technology, enable them to achieve more with each other and with WMF, and (when possible) align them with Wikimedia movement goals (especially new editor engagement and the visual editor).

Our volunteer encouragement, mentoring, and alignment “funnels” have holes at various points for operations, documentation/project management, testing/bug reporting/bug triage (QA), and software development activities. We have a steady stream of new software developers interested in joining MediaWiki development and of new system administrators interested in using Wikimedia Labs. However, we are not as strong at finding or coordinating project management or QA activities, at mentoring the new developers, or at providing a compelling and usable development environment in Wikimedia Labs that helps sysadmins and developers more easily write, test, and puppetize their changes. We have also identified the strategic weak points in these processes and aim to strengthen them.

Given that, we aim to reduce our emphasis on initial software development outreach and to improve our mentorship of the existing development community, except in partnering with Global Development and local organizations where we have a strong interest in growing the local Wikimedia community (Brazil and India). We instead aim to support staff and volunteers in mentoring developer volunteers who come in via existing intake processes. And we will partner with QA and with Ops to strengthen our volunteering pipelines in those areas.

Goals/Budget team: Sumana, RobLa, Tomasz, Alolita, Guillaume

Rationale
With more and better focused technical volunteers, we can do more work towards Wikimedia movement goals.

Resources

 * Sumana Harihareswara: 100% as volunteer development coordinator
 * Bugmeister: 50%
 * Guillaume Paumier: 100% as Technical Communications Manager
 * Technical event marketer: 100%
 * Chris McMahon as QA Lead: 25%
 * Volunteer QA Coordinator: 50%

Activities

 * General engineering communications, infrastructure and mentorship
 * Technical communication from engineering
 * Patch review, extensions review, and code review facilitation
 * Mentorship/training programs (UCOSP, GSoC)
 * Process/product improvements in Git workflow and Labs environment
 * Events
 * In-person developer outreach events (tutorials)
 * In-person developer inreach events (expert collaboration)
 * Encouraging small regional events
 * QA support
 * Bug squad formation and leadership
 * QA training
 * MediaWiki documentation
 * Strategic collaboration with other large orgs and communities; developer engagement/evangelism
 * Teaching/leading interested nontechnical activists to do product management work

Outputs & outcomes

 * Software development volunteering
 * Currently: Software developers can contribute but sometimes wait for weeks or months before seeing their changes merged; new volunteers don't get aligned to movement priorities.
 * Goal: Facilitate the intake and growth of volunteer developers and their ability to contribute software, and align them to the movement priorities. All new changesets from volunteers reviewed within one week of the merge request.


 * Ops volunteering
 * Currently: We have very few volunteers who can lead system administration projects or contribute to ops.
 * Goal: Many people can puppetize packages, and do so. They don’t need to whine as much to get things done.


 * Engineering project documentation and product management volunteering
 * Currently: A few volunteers make occasional edits to update project documentation, or make efforts to write specs, gather feature requirements, or do other product management work.
 * Goal: There is a team of on-call documenters, and multiple volunteers reliably update documentation about projects they care about. At least one volunteer consistently contributes to product management work.


 * MediaWiki documentation volunteering
 * Currently: sparse, uncoordinated documentation on the MediaWiki software, with occasional uncoordinated updates
 * Goal: a team of on-call MediaWiki documenters who can sprint on specific areas, and up-to-date documentation for the MediaWiki API and for the extensions that Wikimedia Foundation deploys


 * QA volunteering
 * Currently: many people find bugs and report bugs, but we have no systematic volunteer testing and nearly no systematic bug triage by volunteers.
 * Goal: A Bug Squad of good testers whom we can call upon to test at particular moments or on particular components/tools, so we can do strategic outreach; ongoing training to improve testers' skills.

Timeline
(when will major milestones of this project be hit, with focus on the 2012-13 fiscal year)

Interdependencies

 * Ryan Lane as Wikimedia Labs technical lead and Platform Engineering's DevOps Program Manager: assistance in Labs
 * QA Lead and Volunteer QA Coordinator: assistance in QA
 * Platform Engineering: assistance with Git/Gerrit/developer platform and workflow