Wikimedia Technology/Annual Plans/FY2019/TEC2: Modern Event Platform/Goals

=Program Goals and Status for FY18/19=

TEC2: Modern Event Platform
 * Goal Owner: Nuria Ruiz
 * Program Goals for FY18/19: A modern event data platform will make it easier for engineers to build infrastructure for Knowledge as a Service. It will enable measuring the effectiveness of engineering projects, and also provide a base for smart reactive services, such as dependency tracking.
 * Annual Plan: TEC2: Modern Event Platform
 * Primary Goal is Knowledge as a Service: Evolve our systems and structures
 * Tech Goal: Sustaining



= Q1 Goals =

Outcome 1 / Output 1.1 - 1.4
Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.


 * Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics, Services, SRE

Goal

 * TechCom RFCs underway and technical decisions made.

Status
July 2018


 * Discussed JSONSchema versus Avro; the decision was made to use JSONSchema. ✅

August 2018


 * Discussed schema registry and metadata service ✅
 * Discussed scalable event intake service ✅

September 18, 2018


 * ✅ RFC for schema registry closing soon. We will leave metadata/config service out of MVP. We will work on scalable event intake as part of next quarter goals.

= Q2 Goals =

Outcome 1 / Output 1.1
Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.


 * Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Core Platform Team, SRE

Goal

 * Development of intake service for events whose transport is JSONSchema/HTTP ✅

Status
October 19, 2018


 * Planning which language/platform we will build the intake service on ✅

November 14, 2018


 * Intake service prototype is being built in Node.js ✅

December 12, 2018


 * Code can be found here: https://github.com/wikimedia/eventgate ✅
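
As a rough illustration of what an intake service does with JSONSchema-described events, the sketch below checks an incoming event against a schema before it would be accepted. This is not EventGate's code: the `validate_event` helper, the example schema, and the field names are hypothetical, and only the `type`, `required`, and `properties` keywords are handled.

```python
# Hypothetical, minimal event validation; not EventGate's implementation.
# Only the "type", "required" and "properties" keywords are supported.

JSON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def validate_event(event, schema):
    """Return a list of error strings; an empty list means the event is valid."""
    errors = []
    expected = schema.get("type")
    py_type = JSON_TYPES.get(expected)  # unknown or absent types are not checked
    if py_type is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(event, bool) and expected in ("integer", "number")
        if not isinstance(event, py_type) or bool_as_number:
            return [f"expected {expected}, got {type(event).__name__}"]
    if isinstance(event, dict):
        for name in schema.get("required", []):
            if name not in event:
                errors.append(f"missing required field '{name}'")
        for name, subschema in schema.get("properties", {}).items():
            if name in event:
                nested = validate_event(event[name], subschema)
                errors.extend(f"{name}: {e}" for e in nested)
    return errors


# Illustrative schema; the field names are assumptions, not a real Wikimedia schema.
example_schema = {
    "type": "object",
    "required": ["meta", "message"],
    "properties": {
        "meta": {
            "type": "object",
            "required": ["stream"],
            "properties": {"stream": {"type": "string"}},
        },
        "message": {"type": "string"},
    },
}
```

An intake service would typically reject the request when the returned list is non-empty and produce the event to Kafka otherwise.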

= Q3 Goals =

Outcome 1 / Output 1.1
Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.


 * Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics Engineering, Core Platform Team, SRE, Release Engineering

Goal(s)

 * Deployment of Stream Intake Service (AKA EventGate) using TEC3: Deployment Pipeline ✅
 * MediaWiki Monolog+Kafka usage migrated to EventGate.
 * STRETCH GOAL: Migration of some MediaWiki 'EventBus' events to EventGate.
 * STRETCH GOAL: Decommission old 'analytics' Kafka cluster.

Status
January 2019


 * Deployment via Docker & Kubernetes in beta and then production. ✅

February 25, 2019


 * Produce Monolog-based events from MediaWiki to EventGate and create Hive tables in Hadoop.

March 2019 Proposed:
 * Get users of Monolog-based events to use the new tables.
 * Begin migrating some MediaWiki 'EventBus' events to EventGate.
 * If possible, decommission the 'analytics' Kafka cluster.

Marked ✅ on March 14, 2019:
 * Deployment of Stream Intake Service

Outcome 1 / Output 1.3
It is clear to engineers how to design event schemas to support analytics and production features to ease future maintenance and evolution of those systems.

Dependencies on: Analytics Engineering, Core Platform Team

Goal(s)

 * Backwards compatibility checks in schema repository CI

Status
January 2019


 * JSONSchema backwards compatibility library implemented

February 25, 2019


 * mediawiki/event-schemas repository Jenkins CI with backwards compatibility checks

March 2019


 * Code for this task is ready to go. The checks are not enabled yet, as there are currently only two schemas that can benefit from them.
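
A backwards compatibility check of the kind described in this goal could look roughly like the following sketch. The rules it applies (no removed fields, no changed types, no newly required fields) are assumptions chosen for illustration, and `compatibility_errors` is a hypothetical helper, not the library used in the repository's CI.

```python
# Hypothetical sketch of a JSONSchema backwards compatibility check.
# Assumed rules: fields may be added, but existing fields may not be
# removed, change type, or become required.

def compatibility_errors(old, new, path=""):
    """Compare two versions of an object schema; return a list of violations."""
    errors = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for name, old_sub in old_props.items():
        field = f"{path}{name}"
        if name not in new_props:
            errors.append(f"{field}: field removed")
            continue
        new_sub = new_props[name]
        old_type, new_type = old_sub.get("type"), new_sub.get("type")
        if old_type != new_type:
            errors.append(f"{field}: type changed from {old_type} to {new_type}")
        elif old_type == "object":
            # Recurse into nested objects, extending the dotted field path.
            errors.extend(compatibility_errors(old_sub, new_sub, path=f"{field}."))
    newly_required = set(new.get("required", [])) - set(old.get("required", []))
    for name in sorted(newly_required):
        errors.append(f"{path}{name}: newly required")
    return errors


# Illustrative schema versions; the field names are made up.
v1 = {
    "type": "object",
    "required": ["id"],
    "properties": {"id": {"type": "integer"}, "title": {"type": "string"}},
}
v2 = {
    "type": "object",
    "required": ["id"],
    "properties": {
        "id": {"type": "integer"},
        "title": {"type": "string"},
        "comment": {"type": "string"},  # new optional field: allowed
    },
}
v3 = {
    "type": "object",
    "required": ["id", "title"],  # "title" becomes required: not allowed
    "properties": {"id": {"type": "string"}, "title": {"type": "string"}},
}
```

In a CI job, a non-empty result from comparing the previous and proposed schema versions would fail the build.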



= Q4 Goals =

Outcome / Output
Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.


 * Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics Engineering, Core Platform Team, SRE, Release Engineering

Goal(s)

 * Decommission old 'analytics' Kafka cluster (T183303)
 * Deploy an instance of EventGate that processes events sent to the Kafka main cluster (T218346)
 * Schema Repository CI for convention and backwards compatibility enforcement

Status
May 2019


 * To decommission the cluster we need to turn off the old Avro workflow; we hope to do that in June.

Also, kafka main event gate deployment is

June 2019


 * Discussed...