Content translation/Deletion statistics comparison

Data on the deletion ratio of Wikipedia articles created with and without Content Translation.

The query
To examine the queries used to create this data and to run them yourself, see query 53775 in Quarry. To look at different languages and dates, replace the database name and the timestamp value.

Wikis with higher deletion ratios for CX created articles
As part of T286636, we reviewed wikis where the deletion rate of articles created with Content Translation during a specified timeframe is higher than the deletion rate of articles created with other tools. This data is collected quarterly (every three months) to assess how deletion rates evolve as improvements are made. This timespan was selected to capture sufficient time for editors to review content and to avoid seasonality effects. Note: Articles created within a certain time period may be deleted at any point in the future. As a result, the deletion ratios for a given time period may change depending on when the data is queried.
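To illustrate why the ratio for a fixed cohort depends on the query date, here is a minimal Python sketch. The function name `deletion_ratio_as_of` and the sample dates are hypothetical and are not taken from the actual query or dataset. It only shows that the same set of articles yields a higher ratio when inspected later.

```python
from datetime import date


def deletion_ratio_as_of(articles, snapshot):
    """Return the share of a cohort that is deleted as of `snapshot`.

    `articles` is a list of (created, deleted) date pairs, where
    `deleted` is None for articles that were never deleted.
    Hypothetical helper for illustration only.
    """
    # Only articles that already existed at the snapshot belong to the cohort.
    cohort = [(c, d) for c, d in articles if c <= snapshot]
    if not cohort:
        return None
    # A deletion counts only if it happened on or before the snapshot.
    deleted = sum(1 for _, d in cohort if d is not None and d <= snapshot)
    return deleted / len(cohort)


# Made-up cohort of four articles created in the same quarter.
cohort = [
    (date(2021, 1, 5), None),
    (date(2021, 1, 20), date(2021, 2, 10)),
    (date(2021, 2, 14), date(2021, 9, 1)),
    (date(2021, 3, 2), None),
]

early = deletion_ratio_as_of(cohort, date(2021, 4, 1))    # one deletion known so far
late = deletion_ratio_as_of(cohort, date(2021, 12, 31))   # a later deletion is now included
```

With this made-up cohort, querying right after the quarter ends gives a lower ratio than querying at the end of the year, because one article was deleted in between.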

Data comes from the [mediawiki_history](https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/MediaWiki_history) table and compares the deletion ratio of main namespace articles created using Content Translation with the deletion ratio of main namespace articles created without the tool. Bots were excluded. We also removed wikis where 15 or fewer articles were created with Content Translation during the reviewed period, to reduce noise in the data and focus on wikis with more representative data.
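The comparison described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the query actually used: the `Article` record and its field names are hypothetical and do not correspond to mediawiki_history columns, and skipping wikis with no non-CX articles is an added assumption to avoid dividing by zero.

```python
from collections import defaultdict
from dataclasses import dataclass

# Wikis with 15 or fewer CX-created articles are dropped, per the text above.
MIN_CX_ARTICLES = 16


@dataclass
class Article:
    # Hypothetical record; field names are illustrative only.
    wiki: str
    created_with_cx: bool
    created_by_bot: bool
    is_deleted: bool


def deletion_ratios(articles):
    """Return {wiki: {"cx_ratio": ..., "other_ratio": ...}} for qualifying wikis."""
    # counts[wiki][created_with_cx] = [created, deleted]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for a in articles:
        if a.created_by_bot:
            continue  # bots are excluded
        bucket = counts[a.wiki][a.created_with_cx]
        bucket[0] += 1
        bucket[1] += int(a.is_deleted)

    result = {}
    for wiki, by_tool in counts.items():
        cx_created, cx_deleted = by_tool[True]
        other_created, other_deleted = by_tool[False]
        # Assumption: also skip wikis with no non-CX articles (no baseline).
        if cx_created < MIN_CX_ARTICLES or other_created == 0:
            continue
        result[wiki] = {
            "cx_ratio": cx_deleted / cx_created,
            "other_ratio": other_deleted / other_created,
        }
    return result


def wikis_with_higher_cx_ratio(ratios):
    """List wikis where CX-created articles are deleted at a higher rate."""
    return sorted(w for w, r in ratios.items() if r["cx_ratio"] > r["other_ratio"])
```

Each input record is assumed to already be restricted to the main namespace and to the reviewed period. That filtering would happen in the query itself.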

Monthly deletion ratios for representative wikis (Jan 2016 - Jan 2019)
Monthly data about the deletion of articles created with and without Content Translation on several Wikipedias was prepared as part of T215397. This data is for whole months, not quarters, from January 2016 until January 2019.

The results can be found in a Google spreadsheet.

This spreadsheet is publicly shared and can be filtered, copied, etc. Note that only pages in the main namespace are counted. This may lead to discrepancies between this data and the data at Special:CXStats, which includes all namespaces.