Wikimedia Product/Data dictionary/pageviews_daily

From mediawiki.org


This page describes the pageviews_daily data set, which is stored as a Druid datasource and can be accessed via Superset or Turnilo. pageviews_daily on Druid is generated by aggregating the Hive table wmf.pageview_hourly by day; wmf.pageview_hourly is in turn extracted from wmf.pageview_actor.
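The daily aggregation described above can be illustrated with a minimal Python sketch. This is not the actual ingestion job: the row layout, the ISO timestamp strings, the reduced set of dimensions, and the function name `aggregate_by_day` are all assumptions made for illustration. It only shows the idea of dropping the hour and summing view counts per day and dimension combination.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical hourly rows (layout assumed for illustration):
# (hour timestamp, project, access_method, agent_type, view_count)
hourly_rows = [
    ("2024-01-01T00:00", "en.wikipedia", "desktop", "user", 5),
    ("2024-01-01T13:00", "en.wikipedia", "desktop", "user", 7),
    ("2024-01-01T13:00", "en.wikipedia", "mobile web", "user", 3),
    ("2024-01-02T01:00", "en.wikipedia", "desktop", "user", 2),
]


def aggregate_by_day(rows):
    """Sum view_count per (day, dimensions) key, discarding the hour."""
    daily = defaultdict(int)
    for ts, project, access_method, agent_type, view_count in rows:
        # Truncate the hourly timestamp to its calendar day.
        day = datetime.fromisoformat(ts).date().isoformat()
        daily[(day, project, access_method, agent_type)] += view_count
    return dict(daily)


daily_rows = aggregate_by_day(hourly_rows)
for key, views in sorted(daily_rows.items()):
    print(key, views)
```

In this sketch the two desktop rows for 2024-01-01 collapse into a single daily row, while rows that differ in any dimension (here, access_method) stay separate.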

Schema

| Field name | Data type | Description | Example | Source schema | Source field |
|---|---|---|---|---|---|
| project | string | Project name from the request's hostname | aa.wikibooks | wmf.pageview_actor | pageview_info['project'] |
| agent_type | string | Agent accessing the pages; can be spider, user or automated (see BotDetection) | user | wmf.pageview_actor | agent_type |
| ua_browser_family | string | Name of the web browser (if not using an official Wikipedia mobile app), extracted from the client device's User-Agent | Opera Mini | wmf.pageview_actor | user_agent_map['browser_family'] |
| ua_wmf_app_version | string | Version of the official Wikipedia mobile app (for iOS, Android, and KaiOS), extracted from the client device's User-Agent | - | wmf.pageview_actor | user_agent_map['wmf_app_version'] |
| country | string | Country name (text) of the accessing agents, computed using the MaxMind GeoIP database | Iran | wmf.pageview_actor | geocoded_data['country'] |
| country_code | string | ISO country code of the accessing agents, computed using the MaxMind GeoIP database | IR | wmf.pageview_actor | geocoded_data['country_code'] |
| ua_os_major | string | Major version of the Operating System used by the client device, extracted from the User-Agent | - | wmf.pageview_actor | user_agent_map['os_major'] |
| continent | string | Continent of the accessing agents, computed using the MaxMind GeoIP database | Europe | wmf.pageview_actor | geocoded_data['continent'] |
| ua_os_family | string | Operating System family used by the client device, extracted from the User-Agent | Other | wmf.pageview_actor | user_agent_map['os_family'] |
| language_variant | string | Language variant from the request's path (not set if present in the project name) | default | wmf.pageview_actor | pageview_info['language_variant'] |
| ua_os_minor | string | Minor version of the Operating System used by the client device, extracted from the User-Agent | - | wmf.pageview_actor | user_agent_map['os_minor'] |
| referer_class | string | One of: none (referer is null, empty or '-'), unknown (domain extraction failed), internal (domain is a Wikimedia project), external (search engine) (domain is one of google, yahoo, bing, yandex, baidu, duckduckgo), external (any other domain) | internal | wmf.pageview_actor | referer_class |
| zero_carrier | string | Always NULL, since the Wikipedia Zero program has ended | NULL | wmf.pageview_actor | NULL |
| access_method | string | Method used to access the pages; can be desktop, mobile web, or mobile app | desktop | wmf.pageview_actor | access_method |
| ua_browser_major | string | Major version of the client browser, extracted from the client device's User-Agent | 4 | wmf.pageview_actor | user_agent_map['browser_major'] |
| project_family | string | Project family | wikipedia | canonical_data.wikis | database_group |
| view_count | bigint | Number of views | 1 | wmf.pageview_actor | count(1), then aggregated by day |
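The referer_class description in the schema reads as a small decision procedure. The following Python sketch shows one way it could be expressed. It is not the production classifier: the function name `classify_referer`, the short list of Wikimedia domain suffixes, and the label-matching rule for search engines are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumed, incomplete list of Wikimedia project domains (illustration only).
WIKIMEDIA_SUFFIXES = ("wikipedia.org", "wikimedia.org", "wikibooks.org")
# Search engine names taken from the referer_class description.
SEARCH_ENGINES = {"google", "yahoo", "bing", "yandex", "baidu", "duckduckgo"}


def classify_referer(referer):
    """Map a raw referer string to a referer_class value (hypothetical sketch)."""
    # none: referer is null, empty or '-'
    if referer is None or referer in ("", "-"):
        return "none"
    # unknown: domain extraction failed
    try:
        host = urlparse(referer).hostname
    except ValueError:
        host = None
    if not host:
        return "unknown"
    # internal: domain belongs to a Wikimedia project
    if any(host == s or host.endswith("." + s) for s in WIKIMEDIA_SUFFIXES):
        return "internal"
    # external (search engine): one domain label matches a known engine
    if SEARCH_ENGINES & set(host.split(".")):
        return "external (search engine)"
    # external: any other domain
    return "external"


for ref in [None, "-", "not a url", "https://en.wikipedia.org/wiki/Foo",
            "https://www.google.com/search?q=x", "https://example.com/"]:
    print(repr(ref), "->", classify_referer(ref))
```

The order of the checks matters in this sketch: the empty cases are handled before any parsing, and the internal check runs before the search engine check.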

Dashboards which use this table

Pageviews Dashboard

Known issues and changes