UserMetrics/FAQ

From mediawiki.org

What is the UserMetrics API?

UserMetrics is the name of a platform developed by the Wikimedia Foundation to measure user activity based on a set of standardized metrics. Using this platform, a set of key metrics can be selected and applied to a cohort of users to measure their overall productivity. The platform is designed for extensibility (creating new metrics, modifying metric parameters) and to support various types of cohort analysis and program evaluation in a user-friendly way. It accepts requests via a RESTful API and returns responses in JSON format.

How can I access the UserMetrics API?

Although the UserMetrics API home page is publicly accessible, only authenticated users may make requests, view cohorts, or retrieve the response generated by a request. Currently, authentication credentials are available to internal Wikimedia Foundation staff, early beta testers, and a number of trusted individuals from chapters and other organizations affiliated with the Foundation. To obtain credentials, please contact usermetrics@wikimedia.org.

What is a metric?

Metrics are well-defined values or sets of values that can be computed for any user registered in Wikimedia projects, and are typically used in aggregate to compare different user groups (i.e., cohorts) against each other. The metrics computed by the UserMetrics API help us understand user activity and behavior, from the quality, quantity and type of user contribution to how well our editors are retained. All metrics are standardized and clearly defined so that we can easily understand what their values mean and consistently use the same standards to evaluate the efficacy of programs and initiatives over time. Note that metrics depend on the context in which they are measured and only make sense in that context. For example, an editor with a high revert rate could be a vandal or an advanced user removing vandalized text. The set of metrics supported by the UserMetrics API is in no way exhaustive. The system has been designed to be easily extensible, so that new metrics can be added and parameterized in different ways. For a list of currently supported metrics, please see Available metrics in the UserMetrics API Guide.

What is a cohort?

A cohort is a set of users sharing one or more properties or attributes, such as the time of account creation or participation in an outreach event or experimental group. At its most basic, a cohort is defined by a single usertag. For more information about cohorts and usertags, please see the UserMetrics API Guide.

How can I create a new cohort?

If you would like to add a new cohort of users, you can do so via the UserMetrics API’s cohort upload feature. Cohorts can consist of users of a single project (e.g., a list of enwiki users) or of multiple projects (e.g., a list of users of either enwiki or arwiki projects). For step-by-step instructions on creating a new cohort, please see Adding a custom cohort in the UserMetrics API Guide.

Can I combine existing cohorts?

Yes. A cohort can be based on a combination of two or more existing cohorts. We refer to such cohorts as 'multi-tag cohorts', and they are defined in the request string using Boolean operators, which combine single-tag cohorts in the specified way (either ‘union’ or ‘intersection’, though other operators may be supported in the future). For more information about combining cohorts, please see Multiple-tag cohorts in the UserMetrics API Guide.
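
The two operators behave like ordinary set operations on cohort membership. The sketch below only illustrates that idea with Python sets; the cohort names and user IDs are made up, and it does not show the request-string syntax, which is described in the API Guide.

```python
# Hypothetical single-tag cohorts, each represented as a set of user IDs.
workshop_attendees = {101, 102, 103}
registered_in_january = {102, 103, 104}

# 'union': users who belong to at least one of the cohorts.
either = workshop_attendees | registered_in_january

# 'intersection': users who belong to every one of the cohorts.
both = workshop_attendees & registered_in_january

print(sorted(either))  # [101, 102, 103, 104]
print(sorted(both))    # [102, 103]
```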

How do I run a request?

Requests are built and passed to the UserMetrics API as HTTP requests. At its most basic, a request consists of a cohort of users and a metric of interest. Both can be selected via drop-down menus on the UserMetrics API home page. Once a cohort and metric have been selected, the UserMetrics API will automatically process the request using the default parameter values associated with each metric. For more information about creating requests, please see Understanding different types of requests in the UserMetrics API Guide.
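
For readers who prefer to build the request by hand, a basic request can be sketched as a URL that names one cohort and one metric. The host name, the "/cohorts/" path layout, and the cohort and metric names below are assumptions for illustration only; consult the API Guide for the actual URL format.

```python
from urllib.parse import quote

# Assumption: placeholder host, not the real endpoint.
BASE_URL = "https://usermetrics.example.org"


def build_request_url(cohort, metric):
    """Return the URL for a basic request: one cohort and one metric.

    No parameters are added, so the API would apply the metric's
    default parameter values.
    """
    # Percent-encode each name so it is safe to use as a path segment.
    cohort_part = quote(cohort, safe="")
    metric_part = quote(metric, safe="")
    # Assumption: hypothetical path layout of /cohorts/<cohort>/<metric>.
    return f"{BASE_URL}/cohorts/{cohort_part}/{metric_part}"


print(build_request_url("example_cohort", "threshold"))
```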

What is returned when I run a request?

The UserMetrics API receives each submitted request and returns either the cached response (for requests that have been previously processed and that do not specify a ‘refresh’ parameter) or a new response generated by the UserMetrics engine (for new requests). In either case, the data is returned as JSON objects. Each JSON object contains the metric data as well as information about the request itself (i.e., request metadata). For more information, please see Understanding the response in the UserMetrics API Guide.
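
As a rough illustration of working with such a response, the sketch below separates the metric data from the request metadata. The key names ("cohort", "metric", "data") and the sample values are invented for this example and may not match the real response layout.

```python
import json

# Hypothetical response body; the real keys and layout may differ.
SAMPLE_RESPONSE = """
{
  "cohort": "example_cohort",
  "metric": "threshold",
  "data": {"1001": 1, "1002": 0}
}
"""


def split_response(text, data_key="data"):
    """Split a JSON response into (request metadata, metric data)."""
    payload = json.loads(text)
    data = payload.get(data_key, {})
    # Everything other than the metric data is treated as metadata.
    metadata = {k: v for k, v in payload.items() if k != data_key}
    return metadata, data


metadata, data = split_response(SAMPLE_RESPONSE)
print(metadata)
print(data)
```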

How can I override the default request parameters?

To override default parameters, use a global or metric-specific parameter in the request URL. For more information about using parameters, please see Global parameters and Metric-specific parameters and aggregators in the UserMetrics API Guide.
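
In practice this means adding a query string to the request URL. The helper below is a minimal sketch that adds or replaces query parameters on an existing URL; the host and path are placeholders, and the value "true" for the 'refresh' parameter is an assumption, since the accepted values are documented in the API Guide.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url, **overrides):
    """Return url with the given query parameters added or replaced."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    params.update({name: str(value) for name, value in overrides.items()})
    return urlunsplit(parts._replace(query=urlencode(params)))


# Assumption: placeholder URL, not the real endpoint.
base = "https://usermetrics.example.org/cohorts/example_cohort/threshold"
print(with_params(base, refresh="true"))
```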

Can the UserMetrics API generate an aggregate metric value for an entire cohort?

Yes! By default, the API returns the metric value for each user in the cohort. To return an aggregate value for the cohort, simply specify an aggregator in the request URL. Note that each metric supports only its own set of implemented aggregators. Please see Metric-specific parameters and aggregators in the UserMetrics API Guide for more information.

Where can I find more information about the supported metrics?

For more information about the currently supported metrics, their default parameters, and implemented aggregators, please see Available metrics in the UserMetrics API Guide.

What happens if a metric is undefined for a particular user?

The API returns a ‘-1’ value each time a metric is undefined. For example, if a threshold metric is used to measure the number of new users who make 1 edit within 24 hours of registration, the API will return ‘-1’ for any user who registered less than 24 hours ago. The API will never return a NULL response.
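
When post-processing results yourself, it is usually worth dropping the undefined values before computing any summary. The sketch below assumes a simple mapping of user ID to value (an invented layout) and a metric that cannot legitimately take the value -1.

```python
UNDEFINED = -1  # value the API returns when a metric is undefined


def defined_values(values_by_user):
    """Return only the entries whose metric value is defined."""
    return {user: value for user, value in values_by_user.items()
            if value != UNDEFINED}


# Hypothetical per-user threshold results: 1 = reached, 0 = not reached.
sample = {"1001": 1, "1002": 0, "1003": -1}

usable = defined_values(sample)
# Share of users with a defined value who reached the threshold.
rate = sum(usable.values()) / len(usable) if usable else None
print(usable, rate)
```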

How do I run a time series?

A time series is used to look at the behavior of users over time. A time-series request returns an aggregate metric value for each slice of a specified time interval. The interval is specified with the ‘start’ and ‘end’ date parameters, and the length of each slice is specified with the ‘slice’ parameter. A time-series request can be run with a ‘group’ parameter to return either data that reflects all user activity for each slice (group=‘activity’), or only the activity of users who registered within that slice (group=‘registered’). By default, all activity for each slice is returned. For more information about building and working with time-series requests, please see Time series in the UserMetrics API Guide.
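
The four parameters named above can be assembled into a query string as in the sketch below. The parameter names and the two ‘group’ values come from this FAQ; the date format and the unit of the slice value are assumptions, so check the API Guide before relying on them.

```python
from urllib.parse import urlencode

VALID_GROUPS = ("activity", "registered")


def time_series_query(start, end, slice_size, group="activity"):
    """Build the query string for a time-series request."""
    if group not in VALID_GROUPS:
        raise ValueError(f"group must be one of {VALID_GROUPS}")
    return urlencode({
        "start": start,       # assumed date format: YYYY-MM-DD
        "end": end,
        "slice": slice_size,  # assumed unit: hours
        "group": group,
    })


print(time_series_query("2013-01-01", "2013-02-01", 24, group="registered"))
# start=2013-01-01&end=2013-02-01&slice=24&group=registered
```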

How do I export the results?

How do I report issues/send feedback?

To reach us or obtain help, please write to usermetrics@wikimedia.org. Bugs and feature suggestions should be reported via Bugzilla: https://bugzilla.wikimedia.org/buglist.cgi?component=User%20Metrics&product=Analytics


Do metrics include activity on deleted pages?

No. By default, metrics do not reflect the activity stored in the archive tables.