Analytics/Metric definitions


Background: Both the community and WMF use a number of metrics to keep track of the health of our projects. This page lists commonly used metrics and their definitions. Our goal is to standardize these definitions and to make sure that they are applied consistently. It is also important to use different terms for different metrics, to avoid confusion. We hope to make the distinction between related metrics clear and easy to comprehend.

Note: unless otherwise indicated, these definitions apply to Wikistats and report card reports alike.

See also the short introduction to key usage metrics in layman's terms.

Events

Edits

In the context of Wikistats, edits are updates to countable pages in content namespaces (see 'Content namespaces' below).

Content

Content namespaces

Namespace 0 is called the main namespace (aka mainspace) because it is always the main content namespace of a wiki. In addition, Wikistats dynamically establishes extra content namespaces per wiki via the API (since July 2013, for all history).

Countable pages

In the context of Wikistats, countable pages are pages that contain at least one internal link (aka wikilink) or category link and are not redirect pages. This conforms to the traditional definition of an 'article' within the Wikimedia community.
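The countability rule can be sketched as a small check on the wikitext of a page's revision. This is only an illustration, not the Wikistats implementation: the function name, the regular expressions, and the handling of redirects (only the English "#REDIRECT" keyword) are assumptions.

```python
import re

# Assumption: a redirect page starts with "#REDIRECT" (case-insensitive).
# Localized redirect keywords are not handled in this sketch.
REDIRECT_RE = re.compile(r"^\s*#redirect\b", re.IGNORECASE)

# Matches any double-bracket link, which covers both [[Article]] and
# [[Category:Foo]]. Other link types such as [[File:...]] also match;
# the definition above does not say whether those should count.
LINK_RE = re.compile(r"\[\[[^\[\]]+\]\]")


def is_countable(wikitext: str) -> bool:
    """Return True if the page text has an internal or category link
    and is not a redirect."""
    if REDIRECT_RE.match(wikitext):
        return False
    return LINK_RE.search(wikitext) is not None
```

A check like this needs the page content, so it cannot be applied to input that carries only metadata.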

In Wikistats, article and editor counts vary by a few percent depending on which type of dump has been processed.

In Wikistats, the screening of pages that do not fit the criteria for countable content (does the page contain an internal or category link?) can only be applied when a full archive dump has been processed, which contains the full content of every revision of every page. Due to time constraints, most Wikistats reports are based on so-called stub dumps (all metadata, such as editor name and edit time, but no article content). The article and editor counts will be a few percent higher when a stub dump served as input (for all months, as almost all history is rebuilt on every run of Wikistats).

Over time the countability criteria on Wikimedia have evolved, and they continue to do so. Different Wikimedia projects can now have different definitions of what constitutes an article.

Uploads

File uploads create a new page in namespace 6. The first revision of this page describes the original upload. For Commons, uploader activity is based only on this first revision of each namespace 6 page.

The metric 'uploads via Upload Wizard' for Commons can only be collected from full archive dumps (from the category tag in the page content).

Content served

We need to distinguish between page views and file requests. Reports for both concepts are featured on stats.wikimedia.org.

Page views

Page views are derived from webstatscollector, which produces hourly aggregates based on incoming squid logs.

Relevant filter criteria
  • The URL in a log line contains /wiki/. This excludes /w/index.php? and special pages.
  • Not all public Wikimedia projects are counted (e.g. the foundation wiki is not).
  • Any article namespace qualifies (unlike in the dump-based reports; see 'Content' above).
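
The first two criteria can be illustrated with a minimal filter over request URLs. This is a sketch under stated assumptions, not the webstatscollector code: the function name and the EXCLUDED_HOSTS set are hypothetical, and the host listed in it is only an example standing in for the projects that are not counted. No further handling of special pages is attempted here.

```python
from urllib.parse import urlsplit

# Hypothetical set of hosts that are left out of the counts. The criteria
# above only name the foundation wiki as an example.
EXCLUDED_HOSTS = {"wikimediafoundation.org"}


def is_page_view(url: str) -> bool:
    """Return True if the logged URL would count as a page view
    under the /wiki/ criterion."""
    if urlsplit(url).hostname in EXCLUDED_HOSTS:
        return False
    # Criterion from the list above: the URL contains /wiki/.
    # Requests via /w/index.php? do not contain it and are dropped.
    return "/wiki/" in url
```

Note that no namespace check appears in the sketch, in line with the third criterion.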

File requests

Many reports provide different breakdowns of files served by Wikimedia squids, such as by mime type, origin, destination, browser, OS, etc. Other reports break down edits or views by region, country, period, etc.

Relevant filter criteria
  • Any html request is counted in relevant reports, based on the supplied mime type 'text/html'. This includes special pages and pages requested via /w/index.php?.

Known issues
  • Error pages are counted as html requests.
  • Mobile traffic does not yet supply a mime type; it is inferred from the URL in the log line.

Actors

A person can be an active editor on one wiki and a very active editor on another wiki in the same month, or an active editor in one month and a very active editor on the same wiki in a different month.

Active editor

An 'active editor' is a registered (and signed in) person (not known as a bot) who makes 5 or more edits in a given month in mainspace on countable pages. (Note: This definition differs from that of "active users" in Special:Statistics; see also some further discussion of it.)

Total active editors is a key metric for WMF and appears in the report card at http://reportcard.wmflabs.org/ as "Active Wikimedia Editors for all Wikimedia Projects (5+ edits per month)".

Very active editor

A 'very active editor' is a registered (and signed in) person (not known as a bot) who makes 100 or more edits per month in mainspace on countable pages.
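
The two thresholds can be sketched as a simple classification of one person's edits on one wiki in one month. The function name, parameters, and return labels are hypothetical; the input count is assumed to cover only edits to countable pages in mainspace.

```python
ACTIVE_THRESHOLD = 5         # 5 or more edits in the month
VERY_ACTIVE_THRESHOLD = 100  # 100 or more edits in the month


def activity_level(monthly_edits: int, is_registered: bool, is_bot: bool) -> str:
    """Classify one person on one wiki for one month.

    monthly_edits is assumed to count only edits to countable pages
    in mainspace made while signed in.
    """
    if not is_registered or is_bot:
        return "not counted"
    if monthly_edits >= VERY_ACTIVE_THRESHOLD:
        # Also meets the 5+ threshold; only the highest tier is reported here.
        return "very active"
    if monthly_edits >= ACTIVE_THRESHOLD:
        return "active"
    return "below threshold"
```

Because the classification is per wiki and per month, the same person can get different labels for different wikis or months.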

Contributor

A contributor is a person who has made 10 or more edits while logged in, over all time, on one wiki. This is a cumulative count: once a person counts as a contributor, they always will, regardless of how long ago their last edit was.

Important note: the "birth" as a contributor is considered to be in the month when the 10th edit is made, not when the user registers or makes the first edit. (This has been the case more or less since the "editors leaving or joining in droves" controversy.)
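
The "birth" rule can be illustrated by locating the 10th edit in a person's edit history on one wiki. This is a minimal sketch; the function name and the "YYYY-MM" output format are assumptions, and the input is assumed to hold only edits made while logged in.

```python
from datetime import datetime
from typing import List, Optional

CONTRIBUTOR_THRESHOLD = 10  # the 10th edit makes a person a contributor


def contributor_birth_month(edit_times: List[datetime]) -> Optional[str]:
    """Return the month ("YYYY-MM") of the 10th edit on one wiki,
    or None if the person has made fewer than 10 edits."""
    if len(edit_times) < CONTRIBUTOR_THRESHOLD:
        return None
    # Sort chronologically and pick the 10th edit (index 9).
    tenth_edit = sorted(edit_times)[CONTRIBUTOR_THRESHOLD - 1]
    return tenth_edit.strftime("%Y-%m")
```

For example, a person who makes nine edits in January 2013 and a tenth in March 2013 is born as a contributor in March, not in January.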

Newbie editor

A 'newbie editor' is a person who has made fewer than 10 edits on the specific Wikimedia wiki in total, similar to the newly registered user group according to the English Wikipedia configuration.

Bot

Bots are recognized by Wikistats in two ways. Ideally, bots have been registered with the 'bot flag', but many aren't registered, or are registered only on larger wikis. Wikistats therefore treats any name that is registered as a bot on 10 or more wikis as a bot on all wikis (this actually predates unified login, after which name collisions mostly became a thing of the past). In addition, all user names that contain a segment ending in 'bot' are seen as bots (with a handful of explicit exceptions, which are verified real persons).
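
The rules above can be sketched as a single predicate. This is an illustration, not the Wikistats code: the function and parameter names are hypothetical, and two details are assumptions because the text does not spell them out. A "segment" is taken to be a run of letters and digits, and the match on 'bot' is taken to be case-insensitive.

```python
import re

BOT_WIKI_THRESHOLD = 10  # flagged as bot on this many wikis: bot everywhere


def is_bot(name, has_bot_flag, wikis_with_bot_flag, exceptions=frozenset()):
    """Return True if the user name is treated as a bot.

    has_bot_flag: the name carries the bot flag on the wiki at hand.
    wikis_with_bot_flag: number of wikis where the name is flagged as bot.
    exceptions: names of verified real persons exempt from the name rule.
    """
    if has_bot_flag or wikis_with_bot_flag >= BOT_WIKI_THRESHOLD:
        return True
    if name in exceptions:
        return False
    # Assumption: segments are separated by anything that is not a
    # letter or digit (spaces, hyphens, underscores, ...).
    segments = re.split(r"[\W_]+", name)
    return any(segment.lower().endswith("bot") for segment in segments)
```

The exception list is consulted only for the name rule, so an explicitly flagged account is still treated as a bot.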