Content translation/Published translations

Information about published translations is generally helpful to machine translation developers and others for a variety of purposes. Content translation aims to provide as much of this data as possible, in several ways. The amount and detail of the data will improve over time. This page captures the current state.

List of published source and target titles
Content translation has an API that lists all published translations across languages. Currently the API returns the following (illustrated with examples):
 * List all published translations across all languages. Example: https://ca.wikipedia.org/w/api.php?action=query&list=cxpublishedtranslations&limit=50&offset=5
 * List all published translations between two languages. Example: https://es.wikipedia.org/w/api.php?action=query&list=cxpublishedtranslations&limit=50&offset=5&from=en&to=es
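
The two example queries above differ only in their host and in the optional from and to parameters, so they can be built by one small helper. The following is a minimal Python sketch, not an official client. The function names, the User-Agent string, and the use of the standard MediaWiki format=json parameter are assumptions added for illustration. The remaining parameters are taken from the example URLs.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_published_translations_url(host, limit=50, offset=0, source=None, target=None):
    """Build a cxpublishedtranslations query URL for the given wiki host.

    `source` and `target` map to the `from` and `to` parameters seen in the
    example URLs; leaving them as None lists translations across all languages.
    """
    params = {
        "action": "query",
        "list": "cxpublishedtranslations",
        "format": "json",  # assumption: request JSON output explicitly
        "limit": limit,
        "offset": offset,
    }
    if source is not None:
        params["from"] = source
    if target is not None:
        params["to"] = target
    return f"https://{host}/w/api.php?{urlencode(params)}"


def fetch_published_translations(host, **kwargs):
    """Fetch one page of results and return the decoded JSON as-is.

    The response structure is not interpreted here, because this sketch makes
    no assumption about the exact shape of the output.
    """
    url = build_published_translations_url(host, **kwargs)
    # Hypothetical User-Agent value; replace it with one that identifies your tool.
    request = Request(url, headers={"User-Agent": "cx-published-translations-example/0.1"})
    with urlopen(request) as response:
        return json.load(response)
```

For instance, build_published_translations_url("es.wikipedia.org", offset=5, source="en", target="es") produces a query equivalent to the second example above, and passing increasing offset values pages through the list.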

The stats data shows how much of the translation is complete. The human value indicates the percentage translated manually, and the mt value indicates the percentage translated by machine. Any edit made on top of machine translation counts as a manual edit. The percentages are calculated at the section level. The any value indicates the total translation (any = human + mt). Content translation does not require the whole source article to be translated, so users are free to translate only the sections they choose. mtSectionsCount shows the number of sections. These stats are also used for abuse prevention (read more about the percentage calculation on that page).
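
The relation any = human + mt can be checked mechanically when processing the data. The sketch below is illustrative only: the function name is hypothetical, the sample values are made up, and representing the stats as a flat dictionary of numbers is an assumption about how a consumer might hold them.

```python
import math


def stats_are_consistent(stats, tolerance=1e-6):
    """Return True when `any` equals `human` + `mt` within a small tolerance."""
    return math.isclose(stats["any"], stats["human"] + stats["mt"], abs_tol=tolerance)


# Hypothetical sample values, not taken from a real API response.
sample = {"any": 0.75, "human": 0.5, "mt": 0.25, "mtSectionsCount": 3}
print(stats_are_consistent(sample))
```

A tolerance is used rather than strict equality because the values are likely to be floating-point numbers.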

Future plan
We do not collect more than this information right now, but we are aware that others have requested the following details:
 * Revision IDs for the source and the published translation, to identify the exact revision of each
 * Machine translation provider, if any
 * Section level alignment information about source and target content
 * Timestamps of translation

Phabricator tasks

 * https://phabricator.wikimedia.org/T111905