Analytics/Metric definitions

Background: Both the community and WMF have been using a number of metrics to keep track of the health of our projects. This page lists commonly used metrics and their definitions. Our goal is to standardize these definitions and to make sure that they are applied consistently. It is also important to use different terms for different metrics to avoid confusion. We hope to make the distinction between related metrics clear and easy to comprehend.

Note: unless otherwise indicated, these definitions apply to Wikistats and report card reports alike.

See also the short introduction to key usage metrics in layman's terms.

=Events=

Edits

 * In the context of Wikistats, edits are updates to countable pages in countable namespaces (aka mainspace).

=Content=

Mainspace

 * For all Wikimedia wikis, namespace 0 is considered the main namespace (aka mainspace). In addition, these namespaces are counted in Wikistats:


 * Commons: namespace 6 (file/image upload info), 14 (category pages)
 * Strategy Wiki: namespace 106
 * All wikisource wikis: namespace 102 (author), 104 (page), 106 (index)
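
The namespace rules above can be written down as a small lookup. This is only a sketch: the wiki keys ("commonswiki", "strategywiki") and the "wikisource" suffix test are illustrative assumptions, not identifiers taken from Wikistats itself.

```python
# Namespace 0 counts for every wiki; some wikis add extra namespaces.
MAIN_NAMESPACE = 0

# Hypothetical wiki keys, chosen for illustration only.
EXTRA_NAMESPACES = {
    "commonswiki": {6, 14},   # file/image upload info, category pages
    "strategywiki": {106},
}
WIKISOURCE_NAMESPACES = {102, 104, 106}  # author, page, index


def countable_namespaces(wiki: str) -> set:
    """Return the set of namespaces counted for the given wiki."""
    namespaces = {MAIN_NAMESPACE}
    # Assumption: Wikisource wikis can be recognized by their name suffix.
    if wiki.endswith("wikisource"):
        namespaces |= WIKISOURCE_NAMESPACES
    namespaces |= EXTRA_NAMESPACES.get(wiki, set())
    return namespaces
```

For example, `countable_namespaces("commonswiki")` yields namespaces 0, 6 and 14, while any wiki without special rules yields only namespace 0.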

Countable Pages

 * In the context of Wikistats, countable pages are pages that contain an internal link (aka wikilink) or category link and are not redirect pages. This conforms to the traditional definition of an 'article' within the Wikimedia community.


 * In Wikistats, article and editor counts vary by a few percent depending on which type of dump has been processed.


 * In Wikistats, the screening of pages that do not fit the criteria of countable content (does the page contain an internal or category link?) can only be applied when a full archive dump has been processed, which contains all content of every revision of every page. Due to time constraints, most Wikistats reports are based on so-called stub dumps (all metadata, like editor name and edit time, but no article content). The article and editor counts will be a few percent higher when a stub dump served as input (for all months, as almost all history is rebuilt on every run of Wikistats).


 * Over time the countability criteria on Wikimedia have evolved, and they continue to do so. Different Wikimedia projects can now have different definitions of what constitutes an article. Some day Wikistats may dynamically establish countable namespaces per wiki via the |namespaces|namespacealiases|statistics API.
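
The countable-page test described above (an internal or category link is present, and the page is not a redirect) might be sketched as follows. This is not the Wikistats implementation. The regular expressions are assumptions: the redirect check only knows the English `#REDIRECT` keyword, and the link check accepts any `[[...]]` construct, so it also matches file and interwiki links.

```python
import re

# Assumption: a redirect page starts with "#REDIRECT" (case-insensitive).
# Localized redirect keywords are not handled in this sketch.
REDIRECT_RE = re.compile(r"^\s*#redirect\b", re.IGNORECASE)

# Any "[[...]]" construct. Category links such as "[[Category:Foo]]"
# match as well, which is intended here.
LINK_RE = re.compile(r"\[\[[^\[\]]+\]\]")


def is_countable(wikitext: str) -> bool:
    """True if the page has an internal or category link and is no redirect."""
    if REDIRECT_RE.match(wikitext):
        return False
    return LINK_RE.search(wikitext) is not None
```

Because the test needs the page text, it can only run against a full archive dump. A stub dump carries no content, which is why counts based on stub dumps come out a few percent higher.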

Uploads

 * File uploads create a new page in namespace 6. The first revision of this page describes the original upload. For Commons, uploader activity is based only on this first revision of each namespace 6 page.


 * For Commons, the metric 'uploads via UploadWizard' can only be collected from full archive dumps (from a category tag in the page content).
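
The "first revision per namespace 6 page" rule for uploader activity can be illustrated with a short sketch. The `Revision` record and its field names are hypothetical. Real dump records look different, and timestamps are assumed to be ISO 8601 strings so that string order equals time order.

```python
from collections import Counter
from typing import Iterable, NamedTuple

FILE_NAMESPACE = 6


class Revision(NamedTuple):
    """Hypothetical revision record, for illustration only."""
    page_id: int
    namespace: int
    timestamp: str  # ISO 8601, so string comparison follows time order
    user: str


def uploads_per_user(revisions: Iterable[Revision]) -> Counter:
    """Count uploads per user, using only the first revision of each file page."""
    first = {}
    for rev in revisions:
        if rev.namespace != FILE_NAMESPACE:
            continue
        current = first.get(rev.page_id)
        if current is None or rev.timestamp < current.timestamp:
            first[rev.page_id] = rev
    return Counter(rev.user for rev in first.values())
```

Later revisions of a file page (for example a description fix by someone else) do not count as uploads in this sketch. Only the user of the earliest revision is credited.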

=Content served=

We need to make a distinction between page views and file requests. Reports on both concepts are featured on stats.wikimedia.org.

Page views
Page views are derived from webstatscollector, which produces hourly aggregates based on incoming squid logs.


 * Relevant filter criteria:
 * The URL in a log line contains /wiki/. This excludes /w/index.php? and special pages.
 * Not all public Wikimedia projects are counted (e.g. the Foundation wiki).
 * Any article namespace qualifies, unlike in the dump-based reports (see 'Content' above).
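
As a rough illustration of the URL criterion and the hourly aggregation, the sketch below counts requests whose URL contains /wiki/. It is not webstatscollector code. The `(hour, url)` input pairs are an assumed, simplified stand-in for squid log lines, and the exclusion of uncounted projects is left out.

```python
from collections import Counter
from typing import Iterable, Tuple


def is_page_view(url: str) -> bool:
    """True if the URL passes the /wiki/ criterion for page views."""
    return "/wiki/" in url


def hourly_page_views(requests: Iterable[Tuple[str, str]]) -> Counter:
    """Aggregate page views per hour from (hour, url) pairs."""
    return Counter(hour for hour, url in requests if is_page_view(url))
```

A request such as `/w/index.php?title=Main_Page&action=edit` contains no /wiki/ segment and is therefore not counted as a page view here.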

File requests
Many reports provide different breakdowns of files served by Wikimedia squids: by MIME type, origin, destination, browser, OS, etc. There are also breakdowns of edits or views by region, country, period, etc.


 * Relevant filter criteria
 * Any HTML request is counted in relevant reports, based on the supplied MIME type 'text/html'. This includes special pages and pages requested using /w/index.php?.


 * Known issues:
 * Error pages are counted as html requests.
 * Mobile traffic does not yet supply a MIME type; it is inferred from the URL in the log line.

=Actors=

A person can be an active editor on one wiki and a very active editor on another wiki in the same month, or an active editor in one month and a very active editor on the same wiki in a different month.

Active editor
An 'active editor' is a registered (and signed in) person (not known as a bot) who makes 5 or more edits in any month in mainspace on countable pages.

Very active editor
A 'very active editor' is a registered (and signed in) person (not known as a bot) who makes 100 or more edits per month in mainspace on countable pages.
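
The two thresholds can be applied per wiki and per month, as in the sketch below. It assumes the input has already been reduced to edits by registered, non-bot users on countable mainspace pages. The `(wiki, month, user)` tuple layout is an assumption made for this example.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

ACTIVE_THRESHOLD = 5         # edits per month
VERY_ACTIVE_THRESHOLD = 100  # edits per month

Edit = Tuple[str, str, str]  # (wiki, month, user), one tuple per edit


def active_editor_sets(edits: Iterable[Edit]) -> Tuple[Set[Edit], Set[Edit]]:
    """Return (active, very_active) sets of (wiki, month, user) keys."""
    counts = Counter(edits)
    active = {key for key, n in counts.items() if n >= ACTIVE_THRESHOLD}
    very_active = {key for key, n in counts.items() if n >= VERY_ACTIVE_THRESHOLD}
    return active, very_active
```

Since counts are kept per wiki and month, the same person can appear as very active on one wiki and merely active on another in the same month. In this sketch every very active editor is also in the active set, because 100 or more edits also satisfies "5 or more".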

Contributor
A contributor is a person who has made 10 or more edits over all time on one wiki. This is a cumulative count. Once a person counts as a contributor, they always will, regardless of how long ago their last edit was.

Newbie editor
A 'newbie editor' is a person who has made fewer than 10 edits on some Wikimedia wiki in total.
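
The cumulative nature of the contributor count can be shown with a small sketch that finds the month in which a person crosses the threshold of 10 edits on one wiki. The input format, a chronologically ordered list of `(month, edit_count)` pairs for one person on one wiki, is an assumption made for this example.

```python
from typing import List, Optional, Tuple

CONTRIBUTOR_THRESHOLD = 10  # edits over all time on one wiki


def contributor_since(monthly_edits: List[Tuple[str, int]]) -> Optional[str]:
    """Return the first month in which cumulative edits reach the threshold.

    Returns None if the person has fewer than 10 edits in total, in which
    case they still count as a newbie editor on that wiki.
    """
    total = 0
    for month, count in monthly_edits:
        total += count
        if total >= CONTRIBUTOR_THRESHOLD:
            return month
    return None
```

From the returned month onward the person counts as a contributor in every later month, even if no further edits follow.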

Bot
Bots are recognized by Wikistats in two ways. Ideally bots have been registered with the 'bot flag', but many are not registered, or only on larger wikis. Wikistats treats any name that is registered as a bot on 10 or more wikis as a bot on all wikis. (This rule predates unified login, after which name collisions mostly became a thing of the past.) In addition, all user names that contain a segment ending in 'bot' are seen as bots (with a handful of explicit exceptions, which are verified real persons).
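
The bot rules above might be combined as in the following sketch. It is not the Wikistats code. The `bot_flags` mapping and the `exceptions` set are hypothetical inputs, and "segment" is interpreted here as a run of letters or digits separated by spaces, underscores or punctuation, which is an assumption.

```python
import re
from typing import Dict, FrozenSet, Set

BOT_WIKI_THRESHOLD = 10  # flagged on this many wikis means bot everywhere


def is_bot(
    name: str,
    wiki: str,
    bot_flags: Dict[str, Set[str]],
    exceptions: FrozenSet[str] = frozenset(),
) -> bool:
    """Decide whether a user name is treated as a bot on the given wiki.

    bot_flags maps a user name to the set of wikis where it has the bot flag.
    exceptions holds names of verified real persons exempt from the name rule.
    """
    flagged_on = bot_flags.get(name, set())
    # Rule 1: bot flag on this wiki, or on 10 or more wikis overall.
    if wiki in flagged_on or len(flagged_on) >= BOT_WIKI_THRESHOLD:
        return True
    # Rule 2: a name segment ends in 'bot', unless explicitly exempted.
    if name in exceptions:
        return False
    segments = re.split(r"[\W_]+", name)
    return any(seg.lower().endswith("bot") for seg in segments if seg)
```

With this interpretation a name like "Example bot 2" is treated as a bot because of its middle segment, while a name on the exception list is not, even if it happens to end in 'bot'.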