Wikimedia Technology/Annual Plans/FY2019/TEC2: Modern Event Platform/Goals

From mediawiki.org

Program Goals and Status for FY18/19

  • Goal Owner: Nuria Ruiz
  • Program Goals for FY18/19: A modern event data platform will make it easier for engineers to build infrastructure for Knowledge as a Service. It will enable measuring the effectiveness of engineering projects, and also provide a base for smart reactive services, such as dependency tracking.
  • Annual Plan: TEC2: Modern Event Platform

Outcome 1 / Output 1.1 - 1.4

Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.

Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics, Services, SRE

Goal

  • TechCom RFCs underway and technical decisions made. (more)

Status

Note: July 2018

Discussed JSONSchema versus Avro; the decision was taken to use JSONSchema. Done (task T198256)

Note: August 2018

Discussed schema registry and metadata service. Done (task T201643)
Discussed scalable event intake service. Done (task T201963)

Note: September 18, 2018

Done. The RFC for the schema registry is closing soon. We will leave the metadata/config service out of the MVP. We will work on scalable event intake as part of next quarter's goals.


Outcome 1 / Output 1.1

Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.

Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Core Platform Team, SRE

Goal

  • Development of an intake service for events whose transport is JSONSchema/HTTP. Done (task T201068)

Status

Note: October 19, 2018

Planning which language/platform we are going to build the intake service on. Done

Note: November 14, 2018

The intake service prototype is being built in Node.js. Done (task T206815)

Note: December 12, 2018

Done. Code can be found here: https://github.com/wikimedia/eventgate
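
A minimal sketch of what the intake step could look like, assuming events arrive as JSON request bodies and are checked against a JSONSchema before being accepted. This is not EventGate's code (the prototype described above is written in Node); the function names `validate_event` and `intake` and the example schema are hypothetical, and only the `required` and `properties`/`type` keywords are handled.

```python
import json

# Maps JSON Schema type names to Python types (only a small subset is handled).
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_event(event, schema):
    """Return a list of error strings; an empty list means the event is valid."""
    errors = []
    for name in schema.get("required", []):
        if name not in event:
            errors.append(f"missing required field: {name}")
    for name, prop in schema.get("properties", {}).items():
        if name not in event:
            continue
        expected = _JSON_TYPES.get(prop.get("type"))
        if expected is None:
            continue  # unknown or missing type keyword: not checked in this sketch
        value = event[name]
        # bool is a subclass of int in Python, so reject it unless boolean is expected.
        if isinstance(value, bool) and expected is not bool:
            errors.append(f"field {name}: expected {prop['type']}")
        elif not isinstance(value, expected):
            errors.append(f"field {name}: expected {prop['type']}")
    return errors


def intake(raw_body, schema):
    """Parse a JSON request body and decide whether to accept the event."""
    try:
        event = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        return {"accepted": False, "errors": [f"invalid JSON: {exc.msg}"]}
    if not isinstance(event, dict):
        return {"accepted": False, "errors": ["event must be a JSON object"]}
    errors = validate_event(event, schema)
    return {"accepted": not errors, "errors": errors}


if __name__ == "__main__":
    # Hypothetical schema and event, for illustration only.
    example_schema = {
        "required": ["page_id"],
        "properties": {"page_id": {"type": "integer"}, "title": {"type": "string"}},
    }
    print(intake('{"page_id": 1, "title": "Main Page"}', example_schema))
```

A real intake service would also need to pick the schema for each incoming event and forward accepted events to a stream; both are left out here.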


Outcome 1 / Output 1.1

Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.

Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics Engineering, Core Platform Team, SRE, Release Engineering

Goal(s)

  • Deployment of Stream Intake Service (AKA EventGate) using TEC3: Deployment Pipeline. Done
  • MediaWiki Monolog+Kafka usage migrated to EventGate. In progress (task T214080, task T216163)
  • STRETCH GOAL: Migration of some MediaWiki 'Eventbus' events to EventGate.
  • STRETCH GOAL: Decommission old 'analytics' Kafka cluster.

Status

To do: January 2019

Deployment via Docker & Kubernetes in beta and then production. Done (task T211247)

Note: February 25, 2019

Produce Monolog-based events from MediaWiki to EventGate and create Hive tables in Hadoop. In progress

To do: March 2019 (proposed):

  • Get users of Monolog-based events to use the new tables.
  • Begin migrating some MediaWiki 'Eventbus' events to EventGate.
  • If possible, decommission the 'analytics' Kafka cluster.

Marked Done on March 14, 2019:

  • Deployment of Stream Intake Service

Outcome 1 / Output 1.3

It is clear to engineers how to design event schemas to support analytics and production features to ease future maintenance and evolution of those systems.

Dependencies on: Analytics Engineering, Core Platform Team

Goal(s)

  • Backwards compatibility checks in schema repository CI. In progress (task T206814)

Status

To do: January 2019

JSONSchema backwards compatibility library implemented. In progress (task T206889)

Note: February 25, 2019

mediawiki/event-schemas repository Jenkins CI with backwards compatibility checks. In progress

To do: March 2019

Code for this task is ready to go; the checks are not yet enabled because there are currently only two schemas that can benefit from them.
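
To illustrate the kind of check such a library might run in CI, here is a minimal sketch. The actual rules implemented for this task are not described on this page, so the rules below are assumptions: a new schema version is treated as backwards compatible if it keeps every existing field with the same type and adds no new required fields. The function name `compatibility_errors` is hypothetical.

```python
def compatibility_errors(old_schema, new_schema):
    """Return a list of reasons why new_schema breaks compatibility with old_schema.

    Assumed rules (illustrative only): existing fields may not be removed
    or change type, and no new required fields may be added.
    """
    errors = []
    old_props = old_schema.get("properties", {})
    new_props = new_schema.get("properties", {})

    for name, old_prop in old_props.items():
        if name not in new_props:
            errors.append(f"field removed: {name}")
            continue
        old_type = old_prop.get("type")
        new_type = new_props[name].get("type")
        if old_type != new_type:
            errors.append(f"type changed for {name}: {old_type} -> {new_type}")

    # Fields required only in the new version would reject previously valid events.
    added_required = set(new_schema.get("required", [])) - set(old_schema.get("required", []))
    for name in sorted(added_required):
        errors.append(f"new required field: {name}")

    return errors


if __name__ == "__main__":
    # Hypothetical schema versions, for illustration only.
    v1 = {"properties": {"page_id": {"type": "integer"}}, "required": ["page_id"]}
    v2 = {
        "properties": {"page_id": {"type": "integer"}, "comment": {"type": "string"}},
        "required": ["page_id"],
    }
    print(compatibility_errors(v1, v2))
```

A CI job could run a check like this against the previous version of each changed schema file and fail the build when the returned list is not empty.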


Outcome / Output

Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.

Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics Engineering, Core Platform Team, SRE, Release Engineering

Goal(s)

  • Decommission old 'analytics' Kafka cluster. Done (T183303)
  • Deploy an instance of EventGate that processes events sent to the Kafka main cluster. Done (T218346)
  • Schema Repository CI for convention and backwards compatibility enforcement

Status

To do: May 2019

In progress: in order to decommission the cluster we need to turn off the old Avro workflow; we hope to do that in June.

Also, the Kafka main EventGate deployment is in progress.

To do: June 2019. Some of the CI work will move to next quarter.