Analytics/Metric definitions

Background: We use a number of metrics to keep track of the health of our projects. This page lists commonly used metrics and their definitions. Our goal is to standardize these definitions and to make sure that they are applied consistently. It is also important to use different terms for different metrics, so as to avoid confusion. We hope to make the distinction between related metrics clear and easy to comprehend.

Note: unless otherwise indicated, these definitions apply to Wikistats.

See also the short introduction to key usage metrics in layman's terms.

Edits
In the context of Wikistats, edits are updates to countable pages.

Content namespaces
Namespace 0 is called the main namespace (aka mainspace) because it is always the main content namespace of a wiki. In addition, Wikistats dynamically establishes extra content namespaces per wiki via the API (namespaces|namespacealiases|statistics). This has been done since July 2013 and is applied to all history.
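As an illustration, the lookup of content namespaces could be sketched as follows. This is a minimal sketch, not the Wikistats implementation. It assumes the API referred to above is MediaWiki's siteinfo module (`action=query&meta=siteinfo`, whose `siprop` parameter accepts `namespaces|namespacealiases|statistics`) and that a namespace counts as a content namespace when its entry carries the `content` flag. The function name and the sample reply are hypothetical.

```python
# Query parameters for MediaWiki's siteinfo module (assumed to be the API meant above).
SITEINFO_PARAMS = {
    "action": "query",
    "meta": "siteinfo",
    "siprop": "namespaces|namespacealiases|statistics",
    "format": "json",
}


def content_namespaces(siteinfo):
    """Return the sorted ids of namespaces flagged as content in a siteinfo reply.

    In the classic JSON format the flag is present as an empty string,
    in formatversion=2 it is a boolean; both shapes are handled.
    """
    ids = []
    for ns in siteinfo["query"]["namespaces"].values():
        flag = ns.get("content")
        if flag is not None and flag is not False:
            ids.append(int(ns["id"]))
    return sorted(ids)


# Hypothetical, heavily trimmed reply for a wiki with one extra content namespace.
sample_reply = {
    "query": {
        "namespaces": {
            "0": {"id": 0, "*": "", "content": ""},
            "1": {"id": 1, "*": "Talk"},
            "6": {"id": 6, "*": "File"},
            "100": {"id": 100, "*": "Portal", "content": ""},
        }
    }
}
extra = [ns for ns in content_namespaces(sample_reply) if ns != 0]
```

Here `extra` holds the content namespaces beyond mainspace for the sample wiki.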

Countable pages
In the context of Wikistats, countable pages are pages that contain at least one internal link (aka wikilink) or category link and are not redirect pages. This conforms to the traditional definition of an 'article' within the Wikimedia community.
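The countability test can be sketched in a few lines. This is an illustration under stated assumptions, not the actual Wikistats code: it assumes redirects start with the English `#REDIRECT` keyword (localized keywords are ignored) and that any `[[...]]` construct counts as an internal or category link. The names `is_countable`, `REDIRECT_RE` and `LINK_RE` are hypothetical.

```python
import re

# Assumption: only the English redirect keyword is recognized.
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\b", re.IGNORECASE)
# Internal links and category links both use the [[...]] syntax.
LINK_RE = re.compile(r"\[\[[^\[\]]+\]\]")


def is_countable(wikitext):
    """Return True if the page text has a wikilink or category link and is not a redirect."""
    if REDIRECT_RE.match(wikitext):
        return False
    return LINK_RE.search(wikitext) is not None
```

Note that this check needs the page content, which is why it can only be applied to full archive dumps.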

In Wikistats, article and editor counts vary by a few percent depending on which type of dump has been processed.

In Wikistats, pages can only be screened against the criteria for countable content (does the page contain an internal or category link?) when a full archive dump has been processed, since only that dump contains the full content of every revision of every page. Due to time constraints, most Wikistats reports are based on so-called stub dumps (all metadata, such as editor name and edit time, but no article content). The article and editor counts will be a few percent higher when a stub dump served as input. This holds for all months, as almost all history is rebuilt on every run of Wikistats.

Over time the countability criteria on Wikimedia evolved, and they still do. Different Wikimedia projects can now have different definitions of what constitutes an article.

Uploads
File uploads create a new page in namespace 6. The first revision of this page describes the original upload. For Commons, uploader activity is based only on this first revision of each namespace 6 page.
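A minimal sketch of this rule, assuming revisions are available as `(page_id, namespace, timestamp, user)` tuples with sortable ISO timestamps. The tuple layout and the function name are hypothetical and chosen for illustration only.

```python
from collections import Counter

FILE_NAMESPACE = 6


def uploads_per_user(revisions):
    """Count uploads per user, using only the first revision of each namespace 6 page."""
    first = {}  # page_id -> (timestamp, user) of the earliest revision seen so far
    for page_id, namespace, timestamp, user in revisions:
        if namespace != FILE_NAMESPACE:
            continue
        if page_id not in first or timestamp < first[page_id][0]:
            first[page_id] = (timestamp, user)
    return Counter(user for _, user in first.values())


# Hypothetical sample: Bob's later edit to page 1 does not count as an upload.
sample_revisions = [
    (1, 6, "2014-01-01T00:00:00Z", "Alice"),
    (1, 6, "2014-02-01T00:00:00Z", "Bob"),
    (2, 6, "2014-01-05T00:00:00Z", "Alice"),
    (3, 0, "2014-01-02T00:00:00Z", "Carol"),
    (4, 6, "2014-03-01T00:00:00Z", "Bob"),
]
upload_counts = uploads_per_user(sample_revisions)
```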

The metric 'uploads via Upload Wizard' for Commons can only be collected from full archive dumps (from the category tag in the page content).

Content served
We need to distinguish between page views and file requests. Reports for both concepts are featured on stats.wikimedia.org.

Page views
See meta:Research:Page view

Historical definition (used on Wikistats and elsewhere until 2015):

Derived from webstatscollector, which produces hourly aggregates based on incoming squid logs.


Relevant filter criteria:

 * The URL in a log line contains /wiki/. This excludes /w/index.php? and special pages.
 * Not all public Wikimedia projects are counted (e.g. the Foundation wiki).
 * Any article namespace qualifies, unlike in the dump-based reports (see 'Content namespaces' above).
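The first criterion and the hourly aggregation can be sketched as follows. This is an illustration only, not the webstatscollector code: it models just the literal /wiki/ substring test and leaves out the per-project and special-page handling. The `(hour, url)` input shape and the function names are hypothetical.

```python
from collections import Counter


def is_page_view(url):
    """Historical criterion: the URL in the log line contains /wiki/."""
    return "/wiki/" in url


def hourly_page_views(log_lines):
    """Aggregate page views per hour from (hour, url) pairs."""
    return Counter(hour for hour, url in log_lines if is_page_view(url))


# Hypothetical log sample; the index.php request is not counted as a page view.
sample_log = [
    ("2014-05-01T13", "http://en.wikipedia.org/wiki/Paris"),
    ("2014-05-01T13", "http://en.wikipedia.org/w/index.php?title=Paris&action=history"),
    ("2014-05-01T14", "http://de.wikipedia.org/wiki/Berlin"),
    ("2014-05-01T14", "http://en.wikipedia.org/wiki/London"),
]
views = hourly_page_views(sample_log)
```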

File requests (historical)
Many reports provide different breakdowns of files served by the Wikimedia squids: by MIME type, origin, destination, browser, OS, etc. There are also breakdowns of edits or views by region, country, period, etc.


Relevant filter criteria:

 * Any HTML request is counted in the relevant reports, based on the supplied MIME type 'text/html'. This includes special pages and pages requested using /w/index.php?.


Known issues:

 * Error pages are counted as HTML requests.
 * Mobile traffic does not yet supply a MIME type; it is inferred from the URL in the log line.
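The MIME type test for file requests can be sketched as below. This is an illustration under assumptions: parameters such as `; charset=UTF-8` are stripped before comparison, which is our choice and not confirmed by the reports, and the URL-based inference for mobile traffic is not modeled. All function names are hypothetical.

```python
from collections import Counter


def base_mime_type(mime_type):
    """Normalize a MIME type: drop parameters, trim whitespace, lowercase."""
    return mime_type.split(";", 1)[0].strip().lower()


def is_html_request(mime_type):
    """An HTML request is any request whose supplied MIME type is text/html."""
    return base_mime_type(mime_type) == "text/html"


def breakdown_by_mime_type(mime_types):
    """Count requests per normalized MIME type."""
    return Counter(base_mime_type(m) for m in mime_types)
```

Because the test looks only at the MIME type, error pages served as text/html are counted too, which matches the known issue listed above.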

Actors

 * See also Contributors/Metrics

A person can be an active editor on one wiki and a very active editor on another wiki in the same month, or an active editor in one month and a very active editor on the same wiki in a different month.

Active editor
An 'active editor' is a registered (and signed in) person (not known as a bot) who makes 5 or more edits in any month in mainspace on countable pages.

(See also some further discussion of it. Note: This definition differs from that of "active users" in Special:Statistics, which counts any account with an action recorded in the RC table during the last 30 days - see the source code of SpecialActiveusers.php for more details.)

As of 2016, there is an attempt to make the definition more detailed: m:Research:Active editor.

Very active editor
A 'very active editor' is a registered (and signed in) person (not known as a bot) who makes 100 or more edits per month in mainspace on countable pages.
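The two thresholds can be sketched as a per-month count. This is a minimal sketch, assuming the input already contains only edits by registered, signed-in users to countable mainspace pages on one wiki, given as `(user, month)` pairs. The input shape, the `bots` parameter and the function name are hypothetical.

```python
from collections import Counter

ACTIVE_THRESHOLD = 5         # edits per month for an 'active editor'
VERY_ACTIVE_THRESHOLD = 100  # edits per month for a 'very active editor'


def monthly_activity(edits, bots=frozenset()):
    """Return (active, very_active) sets of (user, month) pairs, skipping known bots."""
    counts = Counter((user, month) for user, month in edits if user not in bots)
    active = {key for key, n in counts.items() if n >= ACTIVE_THRESHOLD}
    very_active = {key for key, n in counts.items() if n >= VERY_ACTIVE_THRESHOLD}
    return active, very_active


# Hypothetical sample month on a single wiki.
sample_edits = (
    [("Alice", "2015-01")] * 5
    + [("Bob", "2015-01")] * 4
    + [("Carol", "2015-01")] * 100
    + [("FixBot", "2015-01")] * 200
)
active, very_active = monthly_activity(sample_edits, bots={"FixBot"})
```

In this sketch anyone who reaches 100 edits also appears in the 5+ set; whether reports show the two groups as nested or separate is a presentation choice.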

Contributor

 * Nicknamed "Wikipedian", "Wiktionarian", "Wikiquoter" etc. in stats.wikimedia.org.

A contributor is a person who has made 10 or more edits while logged in, over all time, on one wiki. This is a cumulative count: once a person counts as a contributor, they always will, regardless of how long ago their last edit was.

Important note: the "birth" as a contributor is considered to be in the month when the 10th edit is made, not when the user registers or makes the first edit. (This has been the case more or less since the "editors leaving or joining in droves" controversy.)
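The "birth month" rule can be sketched as follows, assuming each logged-in edit of one person on one wiki is represented by its month as a 'YYYY-MM' string, which sorts chronologically. The function name and input shape are hypothetical.

```python
CONTRIBUTOR_THRESHOLD = 10


def contributor_birth_month(edit_months, threshold=CONTRIBUTOR_THRESHOLD):
    """Return the month of the threshold-th edit, or None if it has not been reached."""
    ordered = sorted(edit_months)
    if len(ordered) < threshold:
        return None
    return ordered[threshold - 1]


# Hypothetical history: first edit in 2010-01, 10th edit in 2011-07.
history = ["2010-01"] * 4 + ["2010-03"] * 5 + ["2011-07"] * 3
birth = contributor_birth_month(history)
```

For this history the person counts as a contributor from 2011-07 onward, not from the month of registration or first edit.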

Newbie editor
A 'newbie editor' is a person who has made fewer than 10 edits on the specific Wikimedia wiki in total, similar to the newly registered user group according to the English Wikipedia configuration.

Bot
Bots are recognized by Wikistats in two ways. Ideally, bots have been registered with the 'bot flag', but many are not registered, or are registered only on larger wikis. Wikistats therefore treats any name that is registered as a bot on 10 or more wikis as a bot on all wikis. (This rule actually predates unified login, after which name collisions mostly became a thing of the past.) In addition, all user names that contain a segment ending in 'bot' are treated as bots, with only a handful of explicit exceptions for verified real persons.
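These rules can be sketched as below. This is an illustration, not the Wikistats code: how a name is split into segments is not specified above, so the sketch assumes segments are runs of ASCII letters and digits, and it compares case-insensitively. The function and parameter names are hypothetical.

```python
import re

BOT_WIKI_THRESHOLD = 10
# Assumption: segments are separated by anything that is not an ASCII letter or digit.
SEGMENT_SPLIT = re.compile(r"[^A-Za-z0-9]+")


def looks_like_bot(name, flagged_here=False, flagged_wikis=0, exceptions=frozenset()):
    """Apply the flag rule first, then the name rule with its explicit exceptions."""
    if flagged_here or flagged_wikis >= BOT_WIKI_THRESHOLD:
        return True
    if name in exceptions:
        return False
    segments = (seg for seg in SEGMENT_SPLIT.split(name) if seg)
    return any(seg.lower().endswith("bot") for seg in segments)
```

For example, `looks_like_bot("SieBot")` is true through the name rule, while a real person with a hypothetical name such as "Talbot" would have to be listed in `exceptions` to be counted as an editor.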