Wikimedia Services/Roadmap

2014 / 15
The high-level roadmap from the goals page:

Interdependencies

 * Parsoid depends on Rashomon revision storage & content API
 * VE, Flow, Mobile & platform depend on HTML content & page metadata end points
 * Lots of stakeholders on storage service (platform, features, mobile, dev community)
 * Lots of stakeholders on HTML templating (community, platform, features, mobile)
 * We depend on ops for provisioning, deployment & monitoring

Details on individual projects

REST API front-end (working title: restface)

 * Goal: support high volume with low latency
 * Varnish caching & reliable purging
 * Usually a thin wrapper around back-end services; in the normal case it just loads from the storage service
 * If missing, ask other services to create data on demand & save back to storage service
 * Consistent REST API with structured API docs
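
The load-or-generate flow described above can be sketched as follows. This is only an illustration under stated assumptions: `StorageService`, `FrontEnd`, `get_html` and the `html/` key prefix are hypothetical names, and the in-memory dictionary stands in for the real storage service.

```python
from typing import Callable, Dict, Optional


class StorageService:
    """Hypothetical in-memory stand-in for the storage service."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class FrontEnd:
    """Thin wrapper: serve from storage, generate on a miss, then save back."""

    def __init__(self, storage: StorageService, generate: Callable[[str], str]) -> None:
        self.storage = storage
        # Stands in for a back-end service that can create the content on demand.
        self.generate = generate

    def get_html(self, title: str) -> str:
        key = f"html/{title}"  # hypothetical key layout
        stored = self.storage.get(key)
        if stored is not None:
            return stored  # normal case: just load from storage
        html = self.generate(title)  # miss: ask the back-end to create the data
        self.storage.put(key, html)  # save back so the next request is a plain load
        return html
```

With this shape the back-end service is only contacted on the first request for a page; later requests are answered from storage, which is what keeps the front-end thin.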

Enable move to native Parsoid HTML5 storage & page views

 * Use static Parsoid HTML5 for all page views
 * HTML5 load / save entry point for use by desktop and Mobile page views, VE, content translation and others
 * Power the Mobile skin and apps
 * Improve desktop page view latency for editors (currently 50+% higher median page load times)
 * Page metadata entry point for rendering of red links and other bits currently implemented as server-side content transformations
 * Facilitate additional content derivative end points (e.g. Mobile: section loading, citations, section image urls)

Miscellaneous service end points

 * Citation expansion service entry point for VE & others: expand a URL to full citation data using Zotero data extractors
 * CentralNotice banner service

API end point design and prototyping support for other teams

 * Example: Help Flow team in the development of a REST API for use by rich front-end, mobile

Storage service

 * See RFC for background
 * Aiming to be able to use this for regular page views in Q2
 * Improved page view performance for editors (currently 50+% slower)
 * Reduce load on PHP cluster (HW cost and energy savings)
 * Enables seamless and fast switching from page view to VE, async saving
 * Support for cross-datacenter replication, compression and even load distribution across storage cluster
 * Helps to solve scaling problems in MySQL (revision table, link tables)

Generalization of storage service to support different bucket types

 * Candidate bucket types, roughly by priority: versioned blob, queue, key-value, ordered key-value, counter
 * Features like authentication, TTL
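
To make two of the candidate bucket types more concrete, here is a minimal sketch of a versioned blob bucket and a key-value bucket with TTL. The class names, method signatures and the in-memory dictionaries are assumptions made for illustration; they do not describe the storage service's actual interface.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class VersionedBlobBucket:
    """Keeps every revision of a blob; reads default to the latest one."""

    def __init__(self) -> None:
        self._revisions: Dict[str, List[bytes]] = {}

    def put(self, key: str, blob: bytes) -> int:
        revs = self._revisions.setdefault(key, [])
        revs.append(blob)
        return len(revs) - 1  # revision number of the blob just stored

    def get(self, key: str, revision: Optional[int] = None) -> Optional[bytes]:
        revs = self._revisions.get(key)
        if not revs:
            return None
        if revision is None:
            return revs[-1]
        if 0 <= revision < len(revs):
            return revs[revision]
        return None


class KeyValueBucket:
    """Plain key-value bucket with an optional per-entry TTL in seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[bytes, Optional[float]]] = {}

    def put(self, key: str, value: bytes, ttl: Optional[float] = None) -> None:
        expires = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires)

    def get(self, key: str) -> Optional[bytes]:
        item = self._items.get(key)
        if item is None:
            return None
        value, expires = item
        if expires is not None and self._clock() >= expires:
            del self._items[key]  # lazily drop expired entries on read
            return None
        return value
```

Both buckets share a `put` / `get` shape, which hints at how a generalized service could expose different bucket types behind one consistent API.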

Update & invalidation jobs

 * Ensure that stored data is kept up to date with changes, and front-end caches are invalidated
 * Possibly look into simple HTTP job runner using queue in storage service
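
A simple queue-driven update job could look roughly like the sketch below. It assumes a queue of changed page titles, a `render` callable that regenerates content, a dictionary standing in for the storage service, and a `purge` callable standing in for front-end cache invalidation; all of these names, and the `/page/{title}` URL shape, are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict


def run_update_jobs(
    queue: Deque[str],
    render: Callable[[str], str],
    storage: Dict[str, str],
    purge: Callable[[str], None],
) -> int:
    """Drain the queue: re-render each changed title, store it, then purge the cache."""
    processed = 0
    while queue:
        title = queue.popleft()
        storage[title] = render(title)  # refresh the stored data first
        purge(f"/page/{title}")  # then invalidate the front-end cache entry
        processed += 1
    return processed
```

Storing before purging matters in this sketch: if the cache were purged first, a request arriving in between could refill it with the stale stored copy.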

Misc backend services

 * Deploy & maintain PDF render service
 * Maintain Math render service (Mathoid)

Structured API documentation

 * Goals: machine-readable API specs, browsable documentation & sandbox, auto-generated mock APIs
 * Help establish best practices in declarative API documentation using tools like Swagger
 * See this section in the content API RFC
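
As a rough illustration of how a machine-readable spec can drive an auto-generated mock API, the sketch below holds a small Swagger 2.0-style fragment as a Python dictionary and answers requests from the examples documented in it. The path `/page/{title}/html`, the example body and the `mock_response` helper are assumptions made up for this sketch, not part of any existing API.

```python
from typing import Any, Dict, Tuple

# Illustrative Swagger 2.0-style fragment; the path and example are hypothetical.
SPEC: Dict[str, Any] = {
    "swagger": "2.0",
    "info": {"title": "Content API (illustrative)", "version": "0.0.1"},
    "paths": {
        "/page/{title}/html": {
            "get": {
                "produces": ["text/html"],
                "responses": {
                    "200": {
                        "description": "Stored HTML for the page",
                        "examples": {"text/html": "<p>Mock content</p>"},
                    }
                },
            }
        }
    },
}


def mock_response(spec: Dict[str, Any], method: str, path_template: str) -> Tuple[int, str, str]:
    """Return (status, content type, body) from the first documented example."""
    operation = spec["paths"][path_template][method.lower()]
    for status, response in sorted(operation["responses"].items()):
        if not status.isdigit():
            continue  # skip non-numeric entries such as "default"
        for content_type, body in response.get("examples", {}).items():
            return int(status), content_type, body
    raise LookupError(f"no example documented for {method.upper()} {path_template}")
```

Because the mock reads only from the spec, the same document could serve as browsable documentation, sandbox input and test fixture, which is the appeal of keeping it declarative.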

Drive automated service testing

 * Mocking
 * Work with QA & Antoine on containerization
 * Try to leverage API specs

Evolve authentication in collaboration with platform

 * Develop security & authentication / authorization architecture in collaboration with platform
 * Least privilege
 * Isolation
 * Efficient for high request volumes
 * Using standards (OAuth2, OpenID Connect)
 * Document authentication requirements clearly in API spec

Deployment and packaging in collaboration with platform and ops

 * Drive packaging of services for practical third-party and internal use
 * Leverage packages as much as possible for deployment, DRY
 * Use Puppet for configuration management

HTML content

 * Continue work on HTML content & i18n message templating in collaboration with Parsoid & other teams
 * Build on TAssembly, KnockOff
 * Stretch goal: Look into stand-alone HTML diffing service independent from Parsoid