Wikimedia Product/Analytics Infrastructure/Event Platform Client

Stream configuration
Streams are configured in mediawiki-config/wmf-config/InitialiseSettings.php. The stream names also need to be registered with EventLogging. Registration makes those streams' configs available inside EventLogging so that it can check sampling and other configuration. Streams that are used exclusively by instrumentation on other platforms do not need to be registered with EventLogging.

Note: The setup is slightly different when developing, since MediaWiki Vagrant doesn't use wmf-config. Refer to https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Event_stream_configuration for more information.

MediaWiki (EventLogging)
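To log an event to a stream, pass the stream name and the event data to the EventLogging logging function.

The stream name should match a stream that is in the stream configuration (see above).

The event data needs to contain:

 * the schema title and version (e.g. "/analytics/link_hover/1.0.0")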

That is, your instrumentation needs to declare which version of the schema it adheres to.

For additional information, refer to the Event Platform instrumentation how-to.

iOS
To log an event to a stream, call the library's logging function with the following arguments:

 * The stream name should match a stream that is in the stream configuration (see above)
 * The schema (e.g. "/analytics/link_hover/1.0.0") declares which schema (and which particular version) the instrumentation conforms to
 * The event data is a Dictionary of specific event data
 * The domain/wiki argument is optional, since the concept of domain/wiki is fuzzy on the apps
 * Include it in cases where it makes sense, e.g. the user is engaging with a specific language's content or contributing to a particular wiki
 * Omit it in cases where it doesn't make sense to attribute the interactions to a particular wiki, e.g. the user is managing reading lists or engaging with a map of nearby articles

Generator
If your instrumentation needs to make multiple calls to the same stream (or to different streams) with the same schema, you can generate logging functions that take fewer arguments.
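The generator idea amounts to partial application: the stream name and schema are bound once, and the returned function only needs the event data. The following is a minimal sketch in TypeScript, not the library's actual API. The names `logEvent`, `makeLogger`, `EventFields` and `sent`, as well as the stream name `link_hover_example`, are hypothetical and exist only for illustration.

```typescript
// Hypothetical shape of the event-specific data.
type EventFields = { [key: string]: string | number | boolean };

interface LoggedEvent {
  stream: string;
  schema: string;
  data: EventFields;
}

// Stand-in for the real transport: collects events so the sketch is observable.
const sent: LoggedEvent[] = [];

// Hypothetical full-arity logging function (assumed signature, not the EPC one).
function logEvent(stream: string, schema: string, data: EventFields): void {
  sent.push({ stream: stream, schema: schema, data: data });
}

// Generator: fixes the stream and schema, returns a function taking only the data.
function makeLogger(stream: string, schema: string): (data: EventFields) => void {
  return (data: EventFields): void => logEvent(stream, schema, data);
}

// The schema is declared once; each call site only supplies the event data.
const logLinkHover = makeLogger("link_hover_example", "/analytics/link_hover/1.0.0");
logLinkHover({ action: "hover_start" });
logLinkHover({ action: "hover_end" });
```

Binding the schema in one place also means a schema version bump touches a single line rather than every call site.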



Integration requirements
EPC needs two main components from the rest of the application for it to function: a storage manager and a network manager.

Storage manager
The storage manager enables EPC to persist data, recall persisted data, and delete persisted data.
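The three responsibilities (persist, recall, delete) can be captured in a small interface. The sketch below is illustrative only: the interface name `EpcStorage`, its method names, and the in-memory implementation are assumptions made for this example. A real integration would delegate to the platform's own persistence layer.

```typescript
// Hypothetical contract between EPC and the host application's storage.
interface EpcStorage {
  persist(key: string, value: string): void;
  recall(key: string): string | null;
  remove(key: string): void;
}

// In-memory stand-in; a real implementation would write to disk or a database.
class InMemoryEpcStorage implements EpcStorage {
  private entries: { [key: string]: string } = {};

  persist(key: string, value: string): void {
    this.entries[key] = value;
  }

  recall(key: string): string | null {
    const value = this.entries[key];
    // Anything that was never persisted is reported as null.
    return typeof value === "string" ? value : null;
  }

  remove(key: string): void {
    delete this.entries[key];
  }
}

const epcStorage: EpcStorage = new InMemoryEpcStorage();
epcStorage.persist("input_queue", "event-1,event-2");
```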

Network manager
The NetworkManager enables EPC to HTTP POST event data to the EventGate service located at https://intake-analytics.wikimedia.org/v1/events and to download the stream configuration from the MediaWiki API via, for example, https://www.mediawiki.org/w/api.php?action=streamconfigs&format=json.

Event queuing and buffering
On the web, specifically within MediaWiki, EventLogging and ResourceLoader take care of this to some extent. Before the EventLogging extension is loaded, events are tracked and then logged once logging functionality is available.

Input buffering
If an event is logged before the stream configuration is available – for example if it is downloaded asynchronously on a supported mobile app platform from the MediaWiki API – the library records the date-time of when the event was logged (and the session's ID at the time) and stores it in an internal queue. Once the stream configuration is available, the library processes those queued-up events by re-logging them.

If the application closes while there are still events in this internal queue, the events need to be persisted. When the application opens, if there are events persisted from the previous session, those events need to be recalled.
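The input buffering behaviour described above can be sketched as a small queue that holds events until the stream configuration arrives. This is a simplified illustration under stated assumptions: the class `InputBuffer` and its method names are hypothetical, the date-time and session ID are passed in by the caller to keep the example deterministic, and "processing" is reduced to appending to a list.

```typescript
type EventFields = { [key: string]: string | number | boolean };

interface QueuedEvent {
  stream: string;
  data: EventFields;
  dt: string;        // date-time at which the event was originally logged
  sessionId: string; // session ID at that time
}

class InputBuffer {
  private configAvailable = false;
  private queue: QueuedEvent[] = [];
  private processed: QueuedEvent[] = [];

  log(stream: string, data: EventFields, dt: string, sessionId: string): void {
    const event: QueuedEvent = { stream: stream, data: data, dt: dt, sessionId: sessionId };
    if (this.configAvailable) {
      this.process(event);
    } else {
      this.queue.push(event); // hold until the stream configuration is available
    }
  }

  // Called once the stream configuration has been downloaded.
  onStreamConfigAvailable(): void {
    this.configAvailable = true;
    let next = this.queue.shift();
    while (next !== undefined) {
      this.process(next); // re-log with the original dt and session ID
      next = this.queue.shift();
    }
  }

  // Events to persist if the application closes before the queue is drained.
  pending(): QueuedEvent[] { return this.queue.slice(); }

  // Events recalled from the previous session go ahead of newer ones.
  restore(events: QueuedEvent[]): void { this.queue = events.concat(this.queue); }

  processedEvents(): QueuedEvent[] { return this.processed.slice(); }

  private process(event: QueuedEvent): void { this.processed.push(event); }
}

// First launch: the app closes before the configuration arrives.
const inputBuffer = new InputBuffer();
inputBuffer.log("example_stream", { action: "open" }, "2020-01-01T00:00:00Z", "session-a");
const persisted = inputBuffer.pending();

// Next launch: recall the persisted events, then receive the configuration.
const nextLaunch = new InputBuffer();
nextLaunch.restore(persisted);
nextLaunch.onStreamConfigAvailable();
```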

Output buffering
The code which HTTP POSTs events handles output buffering, which we define as: maintaining a FIFO queue of HTTP requests, which are sent when there is network connectivity and when the application decides it is a good time to wake up the radio antenna, and which are held when there is no network connectivity. It is up to that same code to persist the queued-up HTTP requests if the application closes completely and to recall those requests when the application launches again.
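A minimal sketch of such an output buffer follows. It assumes a hypothetical `OutputBuffer` class into which the application injects a connectivity check and a send function. The actual HTTP call is replaced by a callback that records the request bodies, so the example runs without a network.

```typescript
interface HttpRequest {
  url: string;
  body: string;
}

class OutputBuffer {
  private queue: HttpRequest[] = [];

  constructor(
    private isConnected: () => boolean,
    private send: (request: HttpRequest) => void
  ) {}

  enqueue(request: HttpRequest): void {
    this.queue.push(request);
  }

  // Called when the application decides it is a good time to send.
  // Returns the number of requests sent; holds everything while offline.
  flush(): number {
    let sentCount = 0;
    while (this.isConnected() && this.queue.length > 0) {
      const request = this.queue.shift(); // FIFO: oldest request first
      if (request === undefined) { break; }
      this.send(request);
      sentCount += 1;
    }
    return sentCount;
  }

  // Requests to persist if the application closes completely.
  pending(): HttpRequest[] { return this.queue.slice(); }

  // Requests recalled on the next launch go ahead of newer ones.
  restore(requests: HttpRequest[]): void { this.queue = requests.concat(this.queue); }
}

const INTAKE_URL = "https://intake-analytics.wikimedia.org/v1/events";
let online = false;
const delivered: string[] = [];
const outputBuffer = new OutputBuffer(
  (): boolean => online,
  (request: HttpRequest): void => { delivered.push(request.body); }
);

outputBuffer.enqueue({ url: INTAKE_URL, body: "first" });
outputBuffer.enqueue({ url: INTAKE_URL, body: "second" });
const sentWhileOffline = outputBuffer.flush(); // nothing leaves the queue
const heldWhileOffline = outputBuffer.pending().length;
online = true;
const sentWhenOnline = outputBuffer.flush();   // both requests go out in order
```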

Session expiration
With EPC-iOS and EPC-Android, where the application can enter a background state and potentially return to foreground, sessions are allowed a maximum of 15 minutes of inactivity. If the application comes back to foreground, EPC is notified and performs a check of whether a new session ID needs to be generated.
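The foreground check can be sketched as a comparison between the time of last activity and the 15-minute limit. The class and method names below are hypothetical, timestamps are passed in as milliseconds to keep the example deterministic, and the ID generator is a simple counter standing in for whatever random ID scheme the library actually uses.

```typescript
const SESSION_TIMEOUT_MS = 15 * 60 * 1000; // 15 minutes of inactivity

class SessionManager {
  private sessionId: string;
  private lastActiveMs: number;

  constructor(private generateId: () => string, nowMs: number) {
    this.sessionId = generateId();
    this.lastActiveMs = nowMs;
  }

  getSessionId(): string {
    return this.sessionId;
  }

  // Called when the application returns to the foreground.
  // Returns true if the old session expired and a new ID was generated.
  onForeground(nowMs: number): boolean {
    const expired = nowMs - this.lastActiveMs > SESSION_TIMEOUT_MS;
    if (expired) {
      this.sessionId = this.generateId();
    }
    this.lastActiveMs = nowMs;
    return expired;
  }
}

// Placeholder ID generator (assumption): a counter instead of a random ID.
let idCounter = 0;
const nextSessionId = (): string => {
  idCounter += 1;
  return "session-" + idCounter;
};

const sessionManager = new SessionManager(nextSessionId, 0);
```

In this sketch exactly 15 minutes of inactivity still counts as the same session; only a longer gap triggers a new ID.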

Stream configuration
For more information, refer to the stream configuration specification.

Determination
[Algorithm]

Caching
With EPC-iOS and EPC-Android, since the stream configuration cannot change after it has been downloaded on app launch, we cache each stream's determination. The full evaluation of whether the identifier associated with the stream is in sample happens only once: the first time an event is logged to that stream. Every subsequent time, the determination is retrieved from the cache.

The cache is cleared when the session expires and a new session ID is generated.
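The caching scheme is ordinary memoization keyed by stream name, with a reset on session change. The sketch below assumes a hypothetical `DeterminationCache` class. The evaluation function is injected and is only a placeholder here, since the determination algorithm itself is outside the scope of this example.

```typescript
class DeterminationCache {
  private determinations: { [stream: string]: boolean } = {};

  constructor(private evaluate: (stream: string) => boolean) {}

  // Full evaluation happens on the first call per stream; later calls hit the cache.
  determine(stream: string): boolean {
    const cached = this.determinations[stream];
    if (typeof cached === "boolean") {
      return cached;
    }
    const result = this.evaluate(stream);
    this.determinations[stream] = result;
    return result;
  }

  // Called when the session expired and a new session ID was generated.
  clear(): void {
    this.determinations = {};
  }
}

// Placeholder evaluation (assumption): records each full evaluation and
// treats every non-empty stream name as in sample.
const evaluated: string[] = [];
const determinationCache = new DeterminationCache((stream: string): boolean => {
  evaluated.push(stream);
  return stream !== "";
});

determinationCache.determine("example_stream"); // full evaluation
determinationCache.determine("example_stream"); // served from the cache
```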

Additional platforms
Eventually we would like to offer a version of the EPC library for use in the Wikipedia KaiOS app.

Detailed targeting
Refer to Wikimedia Product/Analytics Infrastructure/Stream configuration for more details.