Wikimedia Product/Data dictionary/content edit daily

From mediawiki.org

This page describes the data set content_edit_daily, which is loaded from jiawang2.content_edit_daily on Hive through Presto and can be accessed via Superset. jiawang2.content_edit_daily is a derived, aggregated table built from the raw event table wmf.mediawiki_history and the topic table isaacj.article_topics_outlinks_2021_07.
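
As a rough illustration of what "derived and aggregated" means here, the sketch below groups a handful of hypothetical raw edit events by day and user flags and counts them, in the spirit of a count(1) aggregation. The input rows, the simplified boolean flag names, and the function name aggregate_daily are assumptions made for this example; the actual job that builds the table is not documented on this page, and the join to the topic table is left out.

```python
from collections import Counter

# Hypothetical, simplified raw edit events. Real wmf.mediawiki_history rows
# carry many more fields; the boolean flags here are stand-ins for the
# source fields and are an assumption of this sketch.
raw_edits = [
    {"event_timestamp": "2021-05-02 10:15:00", "wiki_db": "afwiki",
     "user_is_anonymous": False, "user_is_bot": False},
    {"event_timestamp": "2021-05-02 18:40:12", "wiki_db": "afwiki",
     "user_is_anonymous": False, "user_is_bot": False},
    {"event_timestamp": "2021-05-02 19:01:55", "wiki_db": "afwiki",
     "user_is_anonymous": True, "user_is_bot": False},
    {"event_timestamp": "2021-05-03 07:30:00", "wiki_db": "enwiki",
     "user_is_anonymous": False, "user_is_bot": True},
]


def aggregate_daily(edits):
    """Count edits per (date, wiki_db, user_is_anonymous, user_is_bot)."""
    counts = Counter()
    for edit in edits:
        day = edit["event_timestamp"][:10]  # keep only YYYY-MM-DD
        key = (day, edit["wiki_db"],
               edit["user_is_anonymous"], edit["user_is_bot"])
        counts[key] += 1  # one raw event contributes one edit
    return [
        {
            "date": day,
            "month": day[:7],  # assumed to be the YYYY-MM prefix of the date
            "wiki_db": wiki_db,
            "user_is_anonymous": anonymous,
            "user_is_bot": bot,
            "edit_count": n,
        }
        for (day, wiki_db, anonymous, bot), n in sorted(counts.items())
    ]


daily = aggregate_daily(raw_edits)
for row in daily:
    print(row)
```

Each output row corresponds to one combination of dimensions, so the edit counts of all output rows add up to the number of raw events that went in.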

Schema

| Field name | Data type | Description | Data example | Source schema | Source field |
| --- | --- | --- | --- | --- | --- |
| `date` | date | Date on which the edits were made | 2021-05-02 | wmf.mediawiki_history | event_timestamp |
| project | string | Project on which the edits were made | af.wikipedia | canonical_data.wikis | domain_name |
| wiki_db | string | Project code | enwiki | wmf.mediawiki_history | wiki_db |
| topic | string | Topic assigned to the article by the outlink-based model (refer to the taxonomy for detailed article topics) | Geography.Regions.Asia.East_Asia | isaacj.article_topics_outlinks | topic |
| main_topic | string | Main topic tagged on the page | Geography | cchen.topic_component | main_topic |
| sub_topic | string | Subtopic tagged on the page | East_Asia | cchen.topic_component | sub_topic |
| user_is_anonymous | boolean | Whether the user is anonymous | FALSE | wmf.mediawiki_history | event_user_is_anonymous |
| user_is_bot | boolean | Whether the user is a bot | TRUE | wmf.mediawiki_history | event_user_is_bot_by_historical |
| edit_count | bigint | Number of edits | 12 | wmf.mediawiki_history | count(1) |
| month | string | Month in which the edits were made | 2021-05 | | |
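
Because each row already holds a count, roll-ups over this table should sum edit_count rather than count rows. The sketch below shows that idea on a few hypothetical rows shaped like the schema above; the sample values and the helper name monthly_edits_by_main_topic are assumptions made for illustration. In SQL terms this corresponds roughly to SUM(edit_count) grouped by month and main_topic.

```python
from collections import defaultdict

# Hypothetical rows shaped like content_edit_daily (only the fields used here).
rows = [
    {"date": "2021-05-02", "month": "2021-05", "main_topic": "Geography",
     "user_is_bot": False, "edit_count": 12},
    {"date": "2021-05-02", "month": "2021-05", "main_topic": "Geography",
     "user_is_bot": True, "edit_count": 30},
    {"date": "2021-05-02", "month": "2021-05", "main_topic": "Culture",
     "user_is_bot": False, "edit_count": 5},
    {"date": "2021-05-03", "month": "2021-05", "main_topic": "Geography",
     "user_is_bot": False, "edit_count": 3},
]


def monthly_edits_by_main_topic(rows, include_bots=False):
    """Sum edit_count per (month, main_topic), optionally keeping bot edits."""
    totals = defaultdict(int)
    for row in rows:
        if row["user_is_bot"] and not include_bots:
            continue  # skip bot edits unless explicitly requested
        totals[(row["month"], row["main_topic"])] += row["edit_count"]
    return dict(totals)


human_totals = monthly_edits_by_main_topic(rows)
all_totals = monthly_edits_by_main_topic(rows, include_bots=True)
print(human_totals)
print(all_totals)
```

Filtering on user_is_bot before summing keeps bot and human activity separate, which is usually what a topic-level comparison needs.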


Dashboards that use this table

Edit Topic Dashboard

Known issues and changes