Wikimedia Product/Analytics Infrastructure/Standard

The instrumentation platform is a set of interfaces for processing instrumentation events. It specializes the more general event platform, and the two should not be confused: the event platform also carries other kinds of event traffic, such as events used to drive application behavior. The goal of the instrumentation platform is to allow many different platforms to access a common set of instrumentation capabilities that are consistent, protective, and flexible. TODO: Come up with an easier to remember hierarchy of needs.

(Consistent) Events from the same instrument on different platforms should be statistically comparable. We must use exactly the same algorithms for sampling, randomness, timestamp application, HTTP request handling, and so on, on every platform, and compensate for the always-changing quirks of each platform and programming language.
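As an illustration of what "the exact same algorithm" can mean for sampling, here is a minimal sketch in Python. It is not the platform's actual sampling algorithm: the function name `in_sample`, the use of SHA-256, and the 32-bit bucket are all assumptions chosen only because they are easy to reproduce identically in any language.

```python
import hashlib


def in_sample(identifier: str, rate: float) -> bool:
    """Return True if `identifier` falls inside the sampled fraction `rate`.

    Hypothetical helper: the decision depends only on the identifier and the
    rate, so any platform that implements the same steps gets the same answer.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    # Hash the identifier so that buckets are spread evenly.
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an unsigned big-endian integer
    # in the range [0, 2**32).
    bucket = int.from_bytes(digest[:4], "big")
    # The identifier is sampled when its bucket is below the cutoff.
    return bucket < rate * 2**32
```

Because the decision is a pure function of its inputs, two client libraries written in different languages can be checked against each other with a shared list of identifiers and expected results.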

(Protective) Like consistency, performance and privacy can be challenging even for seemingly simple tasks. We prioritize protecting page load time, memory, CPU, storage, radio, and battery usage, and we design our core algorithms so that the privacy of the subject can be protected.

(Flexible) Instrumentation is always churning and often moves fast in response to new data or new features. By ensuring the properties of the system are fundamentally consistent and protective, we can grant more flexibility to iterate quickly, knowing that we are safe when we build our instruments with this system. TODO Something something dynamic configuration, stream configuration, schema, something something.

The instrumentation platform builds its core algorithms with a small but well-chosen set of primitives that can be implemented in a transparent style, avoiding the use of language-specific abstractions. This makes it easier to verify critical behavior in a new target language. This core is then wrapped by an integration layer that implements platform-specific bookkeeping.

Each target platform receives one such client library, and all share a common set of interfaces (producer, stream config, schema). This commonality allows for documentation, code, and techniques to be shared between developers, and makes it easier for non-platform experts to understand instrumentation.

Introduction
Description of the software goes here.

Conformance
How conformance is determined. RFC 2119

Event
An event is a unit of structured data containing at least:
 * a time (at which the event occurred)
 * a type (specifying its intended data structure).
The event will usually also contain additional data describing the event. Such data are specified by the type.

Instrumentation Event
An instrumentation event is an event carrying an observation about the software under instrumentation. Instrumentation events are strictly observational, and must follow a "prime directive": the software under instrumentation should behave identically whether or not the event is fired. This property allows instrumentation to be enabled or disabled at will, and ensures that regressions or interruptions in instrumentation do not degrade the software under instrumentation. It also makes it safer for instrumentation to be managed independently, and allows systems that carry instrumentation events to be held to a lower service tier.
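One way to picture the prime directive in code is a submission wrapper that never lets a failure reach the caller. The sketch below is illustrative only: `submit_safely` and the `send` callable are hypothetical names, not part of any existing client library.

```python
import logging

logger = logging.getLogger("instrumentation")


def submit_safely(send, event_data: dict) -> None:
    """Hand `event_data` to `send`, swallowing any failure.

    `send` is a hypothetical callable that delivers the event. Nothing is
    returned and nothing is raised, so the calling application cannot branch
    on whether the event was actually delivered.
    """
    try:
        send(event_data)
    except Exception:
        # Record the problem for debugging, but never propagate it to the
        # software under instrumentation.
        logger.debug("instrumentation event dropped", exc_info=True)
```

With this shape, disabling instrumentation is equivalent to passing a `send` that does nothing, and a broken transport has the same visible effect on the application as a working one.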

Instrument
An instrument is the unit of application code responsible for submitting the event data to the instrumentation platform library. The event data alone is not yet an event, as it does not have a time. The time, and other additional fields, will be added by the instrumentation platform library before the event is produced.
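The division of labor between an instrument and the library can be sketched as follows. This is a minimal illustration under stated assumptions: the class `InstrumentationLibrary`, its `submit` method, the stream name `search.click`, and the field names `dt` and `stream` are all hypothetical and are not taken from the specification.

```python
from datetime import datetime, timezone


class InstrumentationLibrary:
    """Hypothetical stand-in for the instrumentation platform library."""

    def __init__(self, clock=None):
        # The clock is injectable so that timestamps are testable.
        self._clock = clock or (lambda: datetime.now(timezone.utc))
        self.produced = []  # events that would be handed to the transport

    def submit(self, stream_name: str, event_data: dict) -> None:
        # Copy so the instrument's own dictionary is never mutated.
        event = dict(event_data)
        # The library, not the instrument, adds the time and other fields.
        event["dt"] = self._clock().isoformat()  # assumed field name
        event["stream"] = stream_name            # assumed field name
        self.produced.append(event)


def search_click_instrument(library: InstrumentationLibrary, position: int) -> None:
    """An instrument: application code that submits event data only."""
    library.submit("search.click", {"action": "click", "position": position})
```

The instrument supplies only what it observed. Until `submit` adds the time, the dictionary is event data rather than an event.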

Streams
.

Event
The event is formatted as a JSON string that can be validated against a corresponding JSON Schema event schema. The properties of an event must match a schema, and the event identifies which schema it claims to match. Properties can be added at any time prior to schema validation. Some properties are added by the instrumentation platform library and are reserved for its use.
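The relationship between an event and its schema can be sketched as below. This is a deliberately tiny stand-in and not a JSON Schema implementation: it checks only required properties and simple types, whereas a real deployment would use a complete JSON Schema validator. The `$schema` property name, the schema identifier `/example/click/1.0.0`, and the field names are assumptions made for the example.

```python
import json

# Hypothetical schema registry, keyed by the identifier the event carries.
SCHEMAS = {
    "/example/click/1.0.0": {
        "type": "object",
        "required": ["$schema", "dt", "action"],
        "properties": {
            "$schema": {"type": "string"},
            "dt": {"type": "string"},
            "action": {"type": "string"},
            "position": {"type": "integer"},
        },
    }
}

_PYTHON_TYPES = {"string": str, "integer": int, "boolean": bool}


def validate_event(event_json: str) -> list:
    """Return a list of problems; an empty list means the event passed."""
    event = json.loads(event_json)
    if not isinstance(event, dict):
        return ["event is not a JSON object"]
    # The event itself names the schema it claims to match.
    schema = SCHEMAS.get(event.get("$schema"))
    if schema is None:
        return ["unknown or missing schema identifier"]
    errors = []
    for name in schema["required"]:
        if name not in event:
            errors.append("missing required property: " + name)
    for name, rule in schema["properties"].items():
        if name not in event:
            continue
        value = event[name]
        expected = _PYTHON_TYPES[rule["type"]]
        # bool is a subclass of int in Python, so reject it for integers.
        is_bool_as_int = rule["type"] == "integer" and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_as_int:
            errors.append("wrong type for property: " + name)
    return errors
```

For example, an event serialized with `json.dumps` that carries `$schema`, `dt`, and `action` yields an empty list, while one that omits `action` yields a single "missing required property" entry.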

Instrumentation event schemas descend from the instrumentation common schema. The fields of this common schema are filled in automatically by the instrumentation platform library, as a convenience for the engineer.