Wikimedia Technical Conference/2018/Session notes/Architecting Core: stand-alone services

Note: Slides for this session will be attached.

Goals for the session (see slide for full text)
Produce a specific set of criteria for determining whether functionality goes in MediaWiki (MW) or in a standalone service. In essence, the outcome of this session will be an RFC comprising the criteria, requirements, and expectations for MediaWiki functionality that is provided in the form of a standalone service.

Definition of a standalone service (see slide for full text)
For purposes of this session, a standalone service has the following properties:
 * Business logic runs in a separate runtime from MW
 * Interacts with MW via some remote mechanism:
   * API
   * Queue
   * XHR
 * Does not directly access MediaWiki's data store
 * May utilize MW extension(s) to call an external service, provided that the business logic is in the external service and not the extension
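
The properties above can be read as a simple predicate. The following is a minimal sketch, not something produced in the session: the `ServiceProfile` type, its field names, and the `is_standalone` function are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Remote mechanisms named in the session's definition.
REMOTE_MECHANISMS = {"api", "queue", "xhr"}


@dataclass(frozen=True)
class ServiceProfile:
    """Hypothetical description of a piece of functionality."""
    name: str
    separate_runtime: bool           # business logic runs outside the MW runtime
    remote_mechanism: Optional[str]  # "api", "queue", "xhr", or None
    direct_db_access: bool           # touches MediaWiki's data store directly
    logic_in_extension: bool         # business logic lives in an MW extension


def is_standalone(profile: ServiceProfile) -> bool:
    """Return True if the profile meets every property in the definition."""
    return (
        profile.separate_runtime
        and profile.remote_mechanism in REMOTE_MECHANISMS
        and not profile.direct_db_access
        and not profile.logic_in_extension
    )


if __name__ == "__main__":
    thin_client = ServiceProfile("example-renderer", True, "api", False, False)
    print(is_standalone(thin_client))
```

Note that an extension acting only as a thin client still passes this check, because the last field records where the business logic lives, not whether an extension exists.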

Exercise 1
Question 1: What properties make functionality a candidate for separation into a separate service?
 * Async
 * Elevated security need
 * State/context independence
 * 3rd party library exists (potentially in another language or in another form that makes integration into MediaWiki difficult)
 * Excessive resource needs
 * Independently useful and/or can be replaced with something off the shelf
 * Better language or framework exists for solving the problem
 * Independent scalability concerns
 * Different ownership models/autonomy/rate of change
 * Used to triage MW or fix it
 * Need to ship quickly

Question 2: What properties disqualify functionality from separation into a separate service?
 * Require direct MediaWiki DB access
 * Easy to do in the context of MediaWiki (using existing classes, for example), difficult to do outside of the context of MediaWiki.
 * Too small to justify separation overhead
 * Chattiness with the MediaWiki API
 * Synchronous
 * Needs extensibility by MW features/extensions
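
The two lists from Exercise 1 could be turned into a checklist helper along the following lines. This is a sketch under stated assumptions: the slug strings and the `assess` function are hypothetical shorthand for the bullets above, and no weighting is applied because the session did not rank the criteria.

```python
from typing import Dict, Iterable, List

# Hypothetical slugs standing in for the "candidate" bullets (Question 1).
CANDIDATE_PROPERTIES = {
    "async",
    "elevated-security-need",
    "state-context-independent",
    "third-party-library-exists",
    "excessive-resource-needs",
    "independently-useful",
    "better-language-or-framework",
    "independent-scalability",
    "different-ownership",
    "triages-or-fixes-mw",
    "needs-to-ship-quickly",
}

# Hypothetical slugs standing in for the "disqualifying" bullets (Question 2).
DISQUALIFYING_PROPERTIES = {
    "requires-direct-db-access",
    "easy-inside-mw-hard-outside",
    "too-small",
    "chatty-with-mw-api",
    "synchronous",
    "needs-mw-extensibility",
}


def assess(properties: Iterable[str]) -> Dict[str, List[str]]:
    """Sort the given properties into arguments for and against separation."""
    props = set(properties)
    known = CANDIDATE_PROPERTIES | DISQUALIFYING_PROPERTIES
    return {
        "for_separation": sorted(props & CANDIDATE_PROPERTIES),
        "against_separation": sorted(props & DISQUALIFYING_PROPERTIES),
        "unrecognized": sorted(props - known),
    }


if __name__ == "__main__":
    print(assess(["async", "independent-scalability", "chatty-with-mw-api"]))
```

The helper only groups the arguments; deciding how to weigh them is left to whoever reviews the proposal.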

Exercise 2
Question 1: What existing MediaWiki functionality is provided by standalone services?
 * Parsing (Potentially quick/large wins available through re-integration into MediaWiki.)
 * Thumbnailing
 * ORES
 * cp-jobqueue
 * PDF
 * EventStreams
 * Map tiles
 * Recommendation
 * Search
 * MCS (Mobile Content Service)
 * RESTBase (caching / routing)
 * Citations
 * Mathoid
 * Graph rendering
 * Translations
 * WDQS
 * Analytics
 * Routing (Potentially quick/large wins available through re-integration into MediaWiki.)
 * CDN

Question 2: What existing MediaWiki functionality could be provided by standalone services?
 * A/B Testing
 * Job Queue
 * Server Side Rendering
 * Maps
 * Inter-Service Discovery/Routing
 * Users and Auth
 * Echo Notification
 * URL Routing
 * l10n/i18n
 * (Some) special pages
 * Edge purger
 * Media handling & transcoding
 * URL shortening
 * Reading lists
 * Watchlists
 * Revision service
 * Parser

Exercise 3
Question 1: What technical/architecture requirements should apply to all standalone services?
 * Minimize data collection
 * OSI licensed
 * Respect GDPR and other applicable data privacy frameworks
 * Must do a thing
 * Should not be redundant with other services

Question 2: What additional requirements should apply to standalone services in Wikimedia production?
 * SLIs/SLOs
 * WMF-compatible monitoring
 * Has a privacy policy and policy practices that are compatible with WMF privacy policy
 * Uses Wikimedia deploy tooling
 * Has passed WMF Security review
 * Uses a language and toolset that have been approved by TechCom
 * Has an owner
 * Has Runbooks
 * Is licensed under an OSI-approved Open Source license
 * Has WMF-compatible structured logging
 * Swagger specs
 * Fault tolerant
 * Multi data center
 * Backups
 * Pinned/Pinnable dependencies
 * Horizontal scalability
 * Documentation
 * Trusted upstream asset chain
 * Performs sufficiently for Wikipedia use cases
 * Has users (or a plan to get users)
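
To illustrate what the structured logging requirement above might look like in practice, here is a minimal sketch that emits one JSON object per log line using only the Python standard library. The field names (`timestamp`, `level`, `logger`, `message`) and the `build_logger` helper are assumptions made for this example; the notes do not specify the format that WMF logging infrastructure expects.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each record as a single JSON object (field names are assumed)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    logger.handlers.clear()   # keep repeated calls from stacking handlers
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger("example-service", sys.stdout)
    log.info("service started on port %d", 8080)
```

Emitting one machine-parseable object per line is a common way to make logs easy to ingest and to feed monitoring; a real service would match whatever schema the production logging pipeline requires.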

Question 3: What additional requirements should apply to standalone services distributed for 3rd party use?
 * Easy to install
 * Versioned (semver); compatible with supported MW releases (LTS)
 * Easy to upgrade and to extend
 * Public docs on install, upgrade
 * Config outside of code
 * Operationally independent of wikis
 * Open source: usable, accepts patches, etc.
 * Small footprint
 * Public security advisories
 * Support channel
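
As one way to picture the "config outside of code" item in the list above, the sketch below reads a service's settings from environment variables, so that third-party installers can change them without editing source. The variable names (`EXAMPLE_MW_API_URL`, `EXAMPLE_LISTEN_PORT`, `EXAMPLE_LOG_LEVEL`), the defaults, and the `ServiceConfig` type are hypothetical and were not discussed in the session.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical settings for a standalone service."""
    mw_api_url: str   # where the service reaches MediaWiki's API
    listen_port: int  # port the service itself listens on
    log_level: str


def load_config(env: Optional[Mapping[str, str]] = None) -> ServiceConfig:
    """Build the config from environment variables (names are assumptions)."""
    env = os.environ if env is None else env
    if "EXAMPLE_MW_API_URL" not in env:
        raise ValueError("EXAMPLE_MW_API_URL must be set")
    return ServiceConfig(
        mw_api_url=env["EXAMPLE_MW_API_URL"],
        listen_port=int(env.get("EXAMPLE_LISTEN_PORT", "8080")),
        log_level=env.get("EXAMPLE_LOG_LEVEL", "INFO"),
    )


if __name__ == "__main__":
    demo_env = {"EXAMPLE_MW_API_URL": "https://wiki.example/w/api.php"}
    print(load_config(demo_env))
```

Passing the environment in as a mapping keeps the loader easy to test and makes the required settings explicit, which also helps with the "easy to install" and "public docs" items.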