Wikimedia Technical Conference/2018/Session notes/Architecting Core: stand-alone services

From mediawiki.org

Slides that were used to guide this session are available on Commons.

Goals for the session (see slide for full text)

The goal is a specific set of criteria for determining whether functionality belongs in MediaWiki (MW) or in a standalone service. In essence, the outcome of this session will be an RFC comprising the criteria, requirements, and expectations for MediaWiki functionality that is provided in the form of a standalone service.

Definition of a standalone service (see slide for full text)

For the purposes of this session, a standalone service has the following properties:

  • Business logic in separate runtime from MW
  • Interacts with MW via some remote mechanism
    • API
    • Queue
    • XHR
    • ?
  • Does not directly access MediaWiki's data store
  • May utilize MW extension(s) to call an external service, provided that the business logic is in the external service and not the extension
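
As a rough illustration of "interacts with MW via some remote mechanism", the sketch below builds a request to the MediaWiki Action API instead of reading MediaWiki's data store. The function name and the example endpoint are assumptions made for illustration, not something from the session. Only the request URL is constructed, so nothing here depends on a live wiki.

```python
from urllib.parse import urlencode


def build_revision_query(api_url: str, title: str) -> str:
    """Return an Action API URL that asks for the latest wikitext of a page.

    A standalone service would fetch this URL over HTTP instead of
    querying MediaWiki's database tables directly.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "content",
        "rvslots": "main",
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


# Hypothetical endpoint; substitute the api.php URL of the wiki in question.
url = build_revision_query("https://wiki.example.org/w/api.php", "Main Page")
print(url)
```

Keeping all access behind a call like this is what lets the service run in its own runtime: the only contract with MediaWiki is the API response.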

Exercise 1

Question 1: What properties make functionality a candidate for extraction into a separate service?

Output of exercise 1, question 1
  • Async
  • Elevated security need
  • Independence from state/context
  • 3rd party library exists (potentially in another language or in another form that makes integration into MediaWiki difficult)
  • Excessive resource needs
  • Independently useful and/or can be replaced with something off the shelf
  • Better language or framework exists for solving the problem
  • Independent scalability concerns
  • Different ownership models/autonomy/rate of change
  • Used to triage MW or fix it
  • Need to ship quickly

Question 2: What properties disqualify functionality from extraction into a separate service?

Output of exercise 1, question 2
  • Requires direct MediaWiki DB access
  • Easy to do in the context of MediaWiki (using existing classes, for example), difficult to do outside of the context of MediaWiki
  • Too small to justify separation overhead
  • Chattiness with the MediaWiki API
  • Synchronous
  • Needs extensibility by MW features/extensions
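
One way to make the two lists above usable in a review is to record them as data and sort a feature's properties into arguments for and against separation. This is only a sketch: the session did not define any weighting or decision rule, the labels are paraphrased, only a subset of the properties is included, and the `review` helper is hypothetical.

```python
from __future__ import annotations

# Property labels paraphrased from the exercise 1 outputs (subset only).
CANDIDATE_PROPERTIES = {
    "async",
    "elevated security need",
    "excessive resource needs",
    "independent scalability concerns",
    "better language or framework exists",
}
DISQUALIFYING_PROPERTIES = {
    "requires direct MediaWiki DB access",
    "too small to justify separation overhead",
    "chatty with the MediaWiki API",
    "synchronous",
    "needs extensibility by MW features/extensions",
}


def review(properties: set[str]) -> dict[str, list[str]]:
    """Sort a feature's properties into arguments for and against separation.

    No verdict is computed; the lists are input for a human decision.
    """
    return {
        "for": sorted(properties & CANDIDATE_PROPERTIES),
        "against": sorted(properties & DISQUALIFYING_PROPERTIES),
    }


# Hypothetical feature description, for illustration only.
result = review({"async", "excessive resource needs", "chatty with the MediaWiki API"})
print(result)
```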

Exercise 2

Question 1: What existing MediaWiki functionality is provided by standalone services?

Functionality already provided to MediaWiki by standalone services. Hearts denote services that are candidates for reintegration into MediaWiki.
  • Parsing (Potentially quick/large wins available through re-integration into MediaWiki.)
  • Thumbnailing
  • ORES
  • cp-jobqueue
  • PDF
  • Eventstreams
  • Map tiles
  • Recommendation
  • Search
  • MCS (Mobile content service)
  • Restbase (caching / routing)
  • Citations
  • Mathoid
  • Graph rendering
  • Translations
  • WDQS
  • Analytics
  • Routing (Potentially quick/large wins available through re-integration into MediaWiki.)
  • CDN

Question 2: What existing MediaWiki functionality could be provided by standalone services?

Grid estimating the difficulty and the scale of "the win" from extracting this functionality into a standalone service
  • A/B Testing
  • Job Queue
  • Server Side Rendering
  • Maps
  • Inter-Service Discovery/Routing
  • Users and Auth
  • Echo Notification
  • URL Routing
  • l10n/i18n
  • (Some) special pages
  • Edge purger
  • Media handling & transcoding
  • URL shortening
  • Reading lists
  • Watchlists
  • Revision service
  • Parser

Exercise 3

Question 1: What technical/architecture requirements should apply to all standalone services?

  • Minimize data collection
  • OSI licensed
  • Respect GDPR and other applicable data privacy frameworks
  • Must do a thing
  • Should not be redundant with other services

Question 2: What additional requirements should apply to standalone services in Wikimedia production?

  • SLIs/SLOs
  • WMF-compatible monitoring
  • Has a privacy policy and privacy practices that are compatible with the WMF privacy policy
  • Uses Wikimedia deploy tooling
  • Has passed WMF Security review
  • Uses a language and toolset that have been approved by TechCom
  • Has an owner
  • Has Runbooks
  • Is licensed under an OSI-approved Open Source license
  • Has WMF-compatible structured logging
  • Swagger specs
  • Fault tolerant
  • Multi data center
  • Backups
  • Pinned/Pinnable dependencies
  • Horizontal scalability
  • Documentation
  • Trusted upstream asset chain
  • Performs sufficiently for Wikipedia use cases
  • Has users (or a plan to get users)

Question 3: What additional requirements should apply to standalone services distributed for 3rd party use?

Output of exercise 3, question 3 at WMTC 2018
  • Easy to install
  • Versioned (semver) - Compatible with supported MW releases (LTS)
  • Easy to upgrade and to extend
  • Public docs on installation and upgrade
  • Config outside of code
  • Operationally independent of wikis
  • Open source - usable, accepts patches, etc.
  • Small footprint
  • Public security advisories
  • Support channel
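
The "config outside of code" item can be illustrated with a minimal sketch in which a service reads its settings from environment variables, so a third party can deploy it without editing source files. The variable names (`MW_API_URL`, `SERVICE_PORT`, `LOG_LEVEL`), the defaults, and the `ServiceConfig` class are assumptions made for this example, not conventions agreed in the session.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    mediawiki_api_url: str
    listen_port: int
    log_level: str


def load_config(env=None) -> ServiceConfig:
    """Build the configuration from environment variables.

    The variable names used here are hypothetical. Passing a plain dict
    as `env` makes the function easy to exercise without touching the
    real process environment.
    """
    env = os.environ if env is None else env
    return ServiceConfig(
        mediawiki_api_url=env["MW_API_URL"],  # required; KeyError if unset
        listen_port=int(env.get("SERVICE_PORT", "8080")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


# Example with an explicit mapping instead of the real environment.
config = load_config({"MW_API_URL": "https://wiki.example.org/w/api.php"})
print(config)
```

Failing fast on a missing required value, while defaulting optional ones, keeps installation simple without hiding misconfiguration.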