SQL/XML Dumps

This document is for people adding features or new datasets to the SQL/XML dumps. It is focused on the implementation as deployed at the Wikimedia Foundation; if you are dumping a large number of wikis elsewhere, you will need to make appropriate adjustments.

A few documents cover adding new dump items to the non-SQL/XML dumps. These are extremely specific to the Wikimedia Foundation, but they might prove interesting to third-party users.

Background reading

 * End-user documentation: Data_dumps
 * Maintainer documentation: Dumps
 * How to write a dumps maintenance script: SQL/XML_Dumps/Writing_maintenance_scripts
 * Other dumps-related pages on this wiki: Category:Import/Export

Workshop documents

 * August 2020: "SQL/XML Dumps/Daily life with the dumps"
 * August 2020: "SQL/XML Dumps/Anatomy of a dumps job"
 * September 2020: "SQL/XML Dumps/A dump job using an existing MediaWiki script"
 * September 2020: "SQL/XML Dumps/Command management walkthrough"
 * September 2020: "SQL/XML Dumps/Stubs, page logs, abstracts"
 * September 2020: "SQL/XML Dumps/Wikibase dumps via cron"
 * October 2020: "SQL/XML Dumps/Running a dump job"
 * October 2020: "SQL/XML Dumps/"Other" dump jobs via cron"
 * December 2020: "SQL/XML Dumps/Puppet for dumps maintainers" plus slides, speaker notes

Becoming a dumps co-maintainer

 * February 2021: "Setup"
 * Unknown (WIP): "Deployment-prep"
 * March 2021: "Access"
 * March 2021: File: Wikimedia dumps high level overview.pdf (6 slides)

General talks

 * November 2020: "Dumps are not backups": slides, speaker notes

Getting around the code
This is currently a stub. Nag the editor(s) if a few weeks pass with no activity.