Platform Engineering Team/Event Platform Value Stream

From mediawiki.org

What is it?

Engineers from across Technology (Platform, Data Engineering, Search and Enterprise) will collaborate on a shared event streaming platform capability that is beneficial to each group and the overall foundation.

Existing event streams signal a change of state but lack many of the details required to make sense of that change (see T291120). The event platform will enable us to build enriched data streams that allow the Foundation and the community to build and share better knowledge experiences.

What do we aim to achieve?

  • Evaluation of event streaming platforms
  • Implementation of the chosen event streaming solution as a proof of concept (no SLOs)
  • Implementation of the following services/stream processors:
    • Simple Enrichment - transform a single stream by enriching it with calls to MediaWiki APIs
    • Research Use Case - transform a single stream to provide data for a Research Use Case
    • Data Integration - integrating streams and databases
  • Understanding the pathway and considerations to take the chosen solution to production
  • Creating tooling and pathways for other engineering groups to build streaming services/processors
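The "Simple Enrichment" processor listed above can be pictured with a minimal sketch. It is not the team's implementation: the `page_id` field, the `enrichment` key, and the `fake_lookup` function (a stand-in for a real MediaWiki API call) are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Event = Dict[str, Any]


def enrich_event(event: Event, lookup: Callable[[int], Dict[str, Any]]) -> Event:
    """Return a copy of `event` with the lookup result attached.

    `lookup` stands in for a MediaWiki API call; `page_id` is a
    hypothetical field name, not taken from a real event schema.
    """
    enriched = dict(event)  # copy so the input event is not mutated
    enriched["enrichment"] = lookup(event["page_id"])
    return enriched


def enrich_stream(
    events: Iterable[Event], lookup: Callable[[int], Dict[str, Any]]
) -> Iterator[Event]:
    """Lazily enrich every event of a single input stream."""
    for event in events:
        yield enrich_event(event, lookup)


def fake_lookup(page_id: int) -> Dict[str, Any]:
    """Hypothetical replacement for an API request, used only for the demo."""
    return {"page_title": f"Page_{page_id}"}


if __name__ == "__main__":
    for enriched in enrich_stream([{"page_id": 1, "type": "edit"}], fake_lookup):
        print(enriched)
```

Passing the lookup in as a parameter keeps the transformation testable without network access; a real processor would also need to handle API failures and latency.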

How does this benefit the movement?

  • Knowledge as a service - Publishing enriched event streams to the world will allow anyone to build on them to create new knowledge experiences
  • Knowledge equity - By publishing enriched streams we break down technical barriers in navigating and accessing data that could be used to build new knowledge experiences

Creating the first service - T307959

Now that we are moving forward with Flink as the solution, the first service will consolidate existing streams, enrich messages with page content (wikitext, JSON, etc.) and output the result to a new topic.
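The consolidate, enrich and output steps can be sketched in plain Python, independent of Flink. This is only an illustration under stated assumptions: the `ts` and `page_id` fields, the `fetch_content` callable and the list used as the output "topic" are hypothetical, and each input stream is assumed to be already ordered by `ts`.

```python
import heapq
from typing import Any, Callable, Dict, Iterable, Iterator, List

Event = Dict[str, Any]


def consolidate(streams: Iterable[Iterable[Event]]) -> Iterator[Event]:
    """Merge several time-ordered streams into one stream ordered by `ts`.

    Assumes every input stream is itself sorted by the hypothetical
    `ts` field; heapq.merge then yields a globally sorted sequence.
    """
    return heapq.merge(*streams, key=lambda event: event["ts"])


def run_pipeline(
    streams: Iterable[Iterable[Event]],
    fetch_content: Callable[[int], str],
    sink: List[Event],
) -> None:
    """Consolidate the streams, attach page content, append to the sink.

    `fetch_content` stands in for retrieving page content and `sink`
    stands in for the new output topic.
    """
    for event in consolidate(streams):
        sink.append({**event, "content": fetch_content(event["page_id"])})


if __name__ == "__main__":
    creates = [{"ts": 1, "page_id": 10, "kind": "create"}]
    edits = [{"ts": 2, "page_id": 10, "kind": "edit"}]
    topic: List[Event] = []
    run_pipeline([creates, edits], lambda pid: f"wikitext for {pid}", topic)
    print(topic)
```

In a real deployment the ordering guarantee is the hard part, since events from different sources can arrive late or out of order; the sketch sidesteps that by assuming sorted inputs.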

More details can be found here

As part of the POC work, we also built tooling that makes it easy to consume existing Event Platform streams; see here.

MILESTONE: Demo see video here

(In Progress): Building on Flink Learnings and the POC Service

To be groomed and defined:

| Ticket | Title | Description | Lead/Backup | Timebox | Status |
| --- | --- | --- | --- | --- | --- |
| T310082 | State Changelog schema design | Streaming event data represents changes to an entity, e.g. a page. If we can represent these changes in a way that can be used to update 'current state', such that after consuming all past events the current state is materializable via events alone, then the event stream is a changelog. Flink has support for automatically consuming changelog streams and presenting them as materialized views of current state. We should research designing our event streams as changelogs so they can be consumed by Flink in this way. | Andrew Otto/David Causse | 1 week | Planned/Needs ticket grooming |
| T309784 | Consolidated and Ordered Page Change Stream | POC service to mimic what it would be like to have a single consolidated stream with ordered events. | David Causse/Gabriele Modena | 4 weeks? | Planned |
| T306627 | Integrate Image Suggestions Feedback with Cassandra | Design, implement and deploy a service that listens for image suggestions feedback and writes the data to the Cassandra schema so that the feedback can be persisted. | Thomas Chin/Group | Unbounded | In Progress |
| To do | Research Use Case | Demo and explain event streams to Research, and discuss potential use cases or useful streams - diffs, enrichments. Work with Research to implement a POC using events - TBD | Group for now | 4 weeks? TBD | Planned |
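The changelog idea in T310082 (current state should be recoverable from events alone) can be illustrated with a small fold over a list of changes. This is a sketch, not a proposed schema: the `op`, `page_id` and `row` fields and the operation names are hypothetical and are not Flink's own changelog encoding.

```python
from typing import Any, Dict, Iterable

Change = Dict[str, Any]


def materialize(changelog: Iterable[Change]) -> Dict[int, Dict[str, Any]]:
    """Replay a changelog from the beginning and return the current state.

    Each change carries a hypothetical `op` ("insert", "update" or
    "delete"), a `page_id` key and, for inserts and updates, the full
    new `row` for that key.
    """
    state: Dict[int, Dict[str, Any]] = {}
    for change in changelog:
        key = change["page_id"]
        op = change["op"]
        if op in ("insert", "update"):
            state[key] = change["row"]  # latest row wins
        elif op == "delete":
            state.pop(key, None)  # deleting an unknown key is a no-op
        else:
            raise ValueError(f"unknown op: {op!r}")
    return state


if __name__ == "__main__":
    log = [
        {"op": "insert", "page_id": 1, "row": {"title": "A"}},
        {"op": "update", "page_id": 1, "row": {"title": "B"}},
        {"op": "insert", "page_id": 2, "row": {"title": "C"}},
        {"op": "delete", "page_id": 2},
    ]
    print(materialize(log))
```

The design point is that every event carries enough information to update state on its own; an event that only says "page 1 changed" could not be replayed this way.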

Future Phases: Tooling and abstractions

To be groomed and defined:

| Ticket | Title | Description | Lead/Backup | Timebox | Status |
| --- | --- | --- | --- | --- | --- |
| T310218 | Flink output support for Event Platform | We now have Table API abstractions for Event Platform streams as a Table source. We should automate emitting events too, likely wrapping JsonEventGenerator. | Andrew Otto/David Causse? | 4 weeks? TBD | Planned/Needs ticket grooming |
| To do | AsyncLookupTable for the MW API | Can/Should we make an AsyncLookupTable for the MW API? This could wrap handling of retries, etc., and would make using the MW API in Flink quite nice. | Andrew Otto/? | 4 weeks? TBD | Planned/Needs ticket grooming |
| T309699 | Retry Logic/Error Handling | | | | |
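To make the "wrap handling of retries" idea concrete, here is a minimal asyncio sketch of an asynchronous lookup with bounded retries and exponential backoff. It does not use any Flink API: `lookup_with_retry`, its parameters, and the `flaky` fetch function are hypothetical names chosen for illustration, and the choice of `ConnectionError` as the retryable error is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable, Tuple, Type


async def lookup_with_retry(
    fetch: Callable[[str], Awaitable[Any]],
    key: str,
    attempts: int = 3,
    base_delay: float = 0.1,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
) -> Any:
    """Await `fetch(key)`, retrying on the given errors with backoff.

    `fetch` stands in for an asynchronous MW API request. After the
    final failed attempt the last error is re-raised to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await fetch(key)
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error
            # back off: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    calls = {"n": 0}

    async def flaky(key: str) -> dict:
        """Hypothetical fetch that fails twice before succeeding."""
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("temporary failure")
        return {"key": key}

    print(asyncio.run(lookup_with_retry(flaky, "Main_Page", base_delay=0)))
```

Errors outside `retry_on` propagate immediately, which keeps permanent failures (for example a malformed request) from being retried pointlessly.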