Reading/Web/Desktop Improvements/Changes to EventLogging and how they will impact our work


I've tried to simplify this analysis as much as possible from a Reading Web point of view, so I have purposely omitted details about the underlying technologies of the new system, which don't seem important for our purposes. In future we will meet with Jason Linehan and the team to understand these changes better, but at a high level here's what we need to know.

How is the eventlogging pipeline changing?

Schemas are going to be moved to git. We will set the schema up in a single repo, and a config change will then be needed to activate it. We will continue to write event-producing application code (using provided APIs).

There will be new JS inside the EventLogging extension: a client library will be provided as part of phab:T228175.

We won't need to write our own config - that will be built in.

We will also be able to define timeframes for EventLogging to run.

This feels very similar to our current approach, just using different repos. However, it's exciting to see that we'll have to worry less about things such as sampling and enabling experiments, which will be handled by the platform.

Would this affect the way we write schemas and the things we are capable of measuring?

It sounds like we'd be able to measure the same things, but the platform will become more reliable because we will not have to implement our own sampling. I don't think it impacts the way we write schemas.

What impact would these changes have on our current instrumentation?

The only change I would advise is that, when writing anything new, we take care to maintain a clear separation between the code enabling the schema (and dealing with sampling) and the code producing events, so that in future it will be easier for us to migrate to the new system. We currently use mw.track in our event-producing code and it seems like we should continue to do so.
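As a rough illustration of that separation, here is a minimal TypeScript sketch. It is not our actual instrumentation: TrackBus is a hypothetical stand-in for the mw.track / mw.trackSubscribe pair so the example runs outside MediaWiki, and the topic name event.HypotheticalSchema, the config shape, and both function names are assumptions made up for the example.

```typescript
// Hypothetical stand-in for a topic-based track/subscribe bus.
// Unlike the real mw.track, it does not queue events for late subscribers.
type Handler = (topic: string, data: unknown) => void;

class TrackBus {
  private handlers: Record<string, Handler[]> = {};

  // Publish an event; a no-op if nobody subscribed to the topic.
  track(topic: string, data: unknown): void {
    const list = this.handlers[topic] || [];
    list.forEach((handler) => handler(topic, data));
  }

  subscribe(topic: string, handler: Handler): void {
    const list = this.handlers[topic] || [];
    list.push(handler);
    this.handlers[topic] = list;
  }
}

// Assumed topic name, for illustration only.
const TOPIC = 'event.HypotheticalSchema';

// Event-producing code: knows nothing about sampling or whether the
// schema is enabled. It only describes what happened.
function onSidebarToggle(bus: TrackBus, isCollapsed: boolean): void {
  bus.track(TOPIC, { action: isCollapsed ? 'collapse' : 'expand' });
}

// Enabling code: the only place that knows about config and sampling.
// Returns true when this session is in the sample and events will be sent.
function enableInstrumentation(
  bus: TrackBus,
  config: { enabled: boolean; samplingRate: number },
  random: () => number,
  send: (event: unknown) => void
): boolean {
  const inSample = config.enabled && random() < config.samplingRate;
  if (inSample) {
    bus.subscribe(TOPIC, (_topic, data) => send(data));
  }
  return inSample;
}
```

With this split, migrating to a platform that handles sampling itself would in principle mean deleting or replacing enableInstrumentation only; onSidebarToggle and other producers would stay as they are.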

We will likely want to monitor progress in phab:T228175 and write our application code in WikimediaEvents so that we keep in sync with the latest changes to the repo. If we were to work inside our own repos, we would be at greater risk of deviating from current best practices.

See also

Event Platform - Product Usage design document and its subpages.