Wikimedia Technology/Annual Plans/FY2019/TEC2: Modern Event Platform/Goals

=Program Goals and Status for FY18/19=

TEC2: Modern Event Platform
 * Goal Owner: Nuria Ruiz
 * Program Goals for FY18/19: A modern event data platform will make it easier for engineers to build infrastructure for Knowledge as a Service. It will enable measuring the effectiveness of engineering projects, and also provide a base for smart reactive services, such as dependency tracking.
 * Annual Plan: TEC2: Modern Event Platform
 * Primary Goal is Knowledge as a Service: Evolve our systems and structures
 * Tech Goal: Sustaining



= Q1 Goals =

Outcome 1 / Output 1.1 - 1.4
Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.


 * Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics, Services, SRE

Goal

 * TechCom RFCs underway and technical decisions made.

Status
July 2018


 * Discussed JSONSchema versus Avro; the decision was taken to use JSONSchema. ✅

August 2018


 * Discussed schema registry and metadata service ✅
 * Discussed scalable event intake service ✅

September 18, 2018


 * ✅ RFC for schema registry closing soon. We will leave metadata/config service out of MVP. We will work on scalable event intake as part of next quarter goals.

= Q2 Goals =

Outcome 1 / Output 1.1
Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.


 * Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Core Platform Team, SRE

Goal

 * Development of an intake service for events whose transport is JSONSchema/HTTP ✅

Status
October 19, 2018


 * Planning which language/platform we will use to build the intake service ✅

November 14, 2018


 * Intake service prototype is being built in Node.js ✅

December 12, 2018


 * Code can be found here: https://github.com/wikimedia/eventgate ✅
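
The intake step described in this goal (accept a JSON event over HTTP, validate it against its JSONSchema, then hand it to a producer) can be sketched roughly as follows. This is a minimal illustration in Python, not the EventGate implementation: the schema URI, the field names, the `SCHEMAS` registry, and the `intake` and `validate` helpers are all hypothetical, and the validator only covers top-level `required` and `type` keywords.

```python
import json

# Hypothetical in-memory schema registry keyed by schema URI (assumption:
# events name their schema in a "$schema" field).
SCHEMAS = {
    "/test/event/1.0.0": {
        "type": "object",
        "required": ["$schema", "meta", "message"],
        "properties": {
            "$schema": {"type": "string"},
            "meta": {"type": "object"},
            "message": {"type": "string"},
        },
    }
}

# Only a few JSONSchema types are mapped in this sketch.
JSON_TYPES = {"string": str, "object": dict, "array": list, "boolean": bool}


def validate(event, schema):
    """Return a list of error strings; an empty list means the event passed."""
    errors = []
    for name in schema.get("required", []):
        if name not in event:
            errors.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(spec.get("type"))
        if name in event and expected and not isinstance(event[name], expected):
            errors.append(f"wrong type for field: {name}")
    return errors


def intake(body, produce):
    """Validate one JSON-encoded event and pass it to produce() if valid.

    Returns an (HTTP-like status code, errors) pair.
    """
    try:
        event = json.loads(body)
    except ValueError:
        return 400, ["body is not valid JSON"]
    if not isinstance(event, dict):
        return 400, ["event must be a JSON object"]
    schema = SCHEMAS.get(event.get("$schema"))
    if schema is None:
        return 400, ["unknown or missing $schema"]
    errors = validate(event, schema)
    if errors:
        return 400, errors
    produce(event)  # stand-in for producing to a message bus such as Kafka
    return 201, []
```

Here `produce` is any callable, for example `list.append` in a test, so the validation logic can be exercised without a running message bus.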

= Q3 Goals =

Outcome 1 / Output 1.1
Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.


 * Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics Engineering, Core Platform Team, SRE, Release Engineering

Goal(s)

 * Deployment of Stream Intake Service (AKA EventGate) using TEC3: Deployment Pipeline ✅
 * MediaWiki Monolog+Kafka usage migrated to EventGate.
 * STRETCH GOAL: Migration of some MediaWiki 'Eventbus' events to EventGate.
 * STRETCH GOAL: Decommission old 'analytics' Kafka cluster.

Status
January 2019


 * Deployment via Docker & Kubernetes in beta and then production. ✅

February 25, 2019


 * Produce Monolog-based events from MediaWiki to EventGate and create Hive tables in Hadoop.

March 2019 Proposed:
 * Get users of Monolog-based events to use the new tables.
 * Begin migrating some MediaWiki 'Eventbus' events to EventGate.
 * If possible, decommission the 'analytics' Kafka cluster.

Marked ✅ on March 14, 2019:
 * Deployment of Stream Intake Service

Outcome 1 / Output 1.3
It is clear to engineers how to design event schemas to support analytics and production features to ease future maintenance and evolution of those systems.

Dependencies on: Analytics Engineering, Core Platform Team

Goal(s)

 * Backwards compatibility checks in schema repository CI

Status
January 2019


 * JSONSchema backwards compatibility library implemented

February 25, 2019


 * mediawiki/event-schemas repository Jenkins CI with backwards compatibility checks

March 2019


 * Code for this task is ready to go; the checks are not enabled because there are currently only two schemas that can benefit from them.
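
To illustrate what a backwards compatibility check between two schema versions might look for, here is a minimal sketch in Python. It is not the library mentioned above. The function name `compatibility_errors` and the three rules it applies (no field removed, no field type changed, no field newly required) are assumptions chosen for illustration; the actual rules enforced in CI may differ.

```python
def compatibility_errors(old, new, path=""):
    """Compare two JSONSchema-style dicts and list backwards-incompatible changes.

    Assumed rules (hypothetical, for illustration only):
      - a field present in the old schema must still exist,
      - its "type" must be unchanged,
      - no field may become required that was not required before.
    Nested "object" fields are checked recursively.
    """
    errors = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    for name, old_spec in old_props.items():
        field = f"{path}{name}"
        if name not in new_props:
            errors.append(f"{field}: field removed")
            continue
        new_spec = new_props[name]
        old_type, new_type = old_spec.get("type"), new_spec.get("type")
        if old_type != new_type:
            errors.append(f"{field}: type changed from {old_type} to {new_type}")
        elif old_type == "object":
            errors.extend(compatibility_errors(old_spec, new_spec, field + "."))

    added_required = set(new.get("required", [])) - set(old.get("required", []))
    for name in sorted(added_required):
        errors.append(f"{path}{name}: newly required")
    return errors
```

A CI job could run such a function on the previous and proposed version of each schema and fail the build when the returned list is non-empty. Adding a new optional field produces no errors under these rules.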



= Q4 Goals =

Outcome / Output
Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.


 * Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics Engineering, Core Platform Team, SRE, Release Engineering

Goal(s)

 * Decommission old 'analytics' Kafka cluster. T183303 ✅
 * Deploy an instance of EventGate that processes events sent to the Kafka main cluster. T218346 ✅
 * Schema Repository CI for convention and backwards compatibility enforcement

Status
May 2019


 * To decommission the cluster we need to turn off the old Avro workflow; we hope to do that in June.

Also, the Kafka main EventGate deployment is

June 2019

 * Some of the CI work will move to next quarter.