Extension talk:Translate/Mass migration tools

Feedback
OK, this draft looks good enough to seek more comments, especially as you need to write the "use cases" section, which can only be done by hearing from users. So far you have only had the point of view of one translation admin, me; I can "represent" most of the concerns of MediaWiki.org, Meta-Wiki and Commons translation admins, but not everything. Stakeholders/venues to contact, where you can post an invitation to come and check your proposal, comment on it and watchlist it: --Nemo 12:45, 2 March 2014 (UTC)
 * Project talk:Language policy, Project:Current issues
 * OSM
 * userbase.kde.org and other wikis of the family (whatever place they use to discuss)
 * m:Meta talk:Babylon
 * commons:Commons:Translators' noticeboard
 * wikidata:Wikidata:Translators' noticeboard
 * mediawiki-i18n, extension talk:Translate
 * Other wikis, among the 100+ using Translate, which have a lot of old translations

Feedback plus suggestions
Having read your proposal, I think it is worth being implemented.

Question: How to implement it?
 * As an extension inside MediaWiki?
 * Somewhat independent, such as with the Pywikibot framework?

A workflow question: Importing existing translations (i.e. step 2) likely often needs to be done by people who can read and understand these translations well enough. It may take a long time to find these experts. What happens meanwhile, so as to not hamper translating to other languages? Is it possible to have a consistent mix of unsplit translations while other language pages and the source page are split already?

Ideas and suggestions: --Purodha Blissenbach (talk) 11:37, 10 March 2014 (UTC)
 * Splitting a source page into translatable units for the first time is language dependent. At the very least it depends on the language type and writing system. I would suggest creating some very basic code for it and implementing only English to begin with, which would likely also cover some other Latin-script European languages. English would be the predominant use case anyway.
 * Splitting strategies vary by text type, so allow users to choose the best one. At the least, I would suggest offering "by sentence" and "by paragraph", plus, of course, whatever the existing markup may suggest.
 * Hi, thank you very much for your feedback and suggestions.
 * It would be implemented as something independent, not as an extension inside MediaWiki. I am fine with both PHP and Python, but the "Skills required" part of the project listed "PHP" as a required skill. Basically, it would be a bot that asks for confirmation when needed.
 * I am not sure the splitting strategy is relevant here, as splitting is already the job of the Translate extension: it splits the page into units once the page has been prepared and marked for translation, and that preparation is what the first part of this project is about. Step 2 would then import the translations that were already present before Fuzzy Bot's edit (example). Right now this import is tedious copy-paste work, and the person doing it need not know the languages involved: having the source text (English) and the translated page as it was before Fuzzy Bot's edit is sufficient. The second part of the project is about automating this work. I hope this clarifies the workflow.
 * BPositive (talk) 14:37, 10 March 2014 (UTC)
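
To make the import step discussed above a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the proposed tool: it assumes that units correspond to blank-line-separated paragraphs (the real units are whatever the Translate extension produces after the page is marked for translation), and the names `split_paragraphs` and `pair_units` are hypothetical. It pairs each source paragraph with the paragraph at the same position in the old translation and flags the page for human confirmation when the counts differ.

```python
import re
from typing import List, Optional, Tuple


def split_paragraphs(text: str) -> List[str]:
    """Split text into paragraphs on blank lines (an assumed stand-in for real units)."""
    parts = re.split(r"\n\s*\n", text.strip())
    return [p.strip() for p in parts if p.strip()]


def pair_units(
    source: str, old_translation: str
) -> Tuple[List[Tuple[str, Optional[str]]], bool]:
    """Pair source paragraphs with old-translation paragraphs by position.

    Returns the pairs and a flag telling the caller to ask a human for
    confirmation, which is set when the paragraph counts do not match.
    """
    src = split_paragraphs(source)
    old = split_paragraphs(old_translation)
    needs_confirmation = len(src) != len(old)
    # A source paragraph with no counterpart is paired with None.
    pairs = [(unit, old[i] if i < len(old) else None) for i, unit in enumerate(src)]
    return pairs, needs_confirmation


if __name__ == "__main__":
    english = "Hello.\n\nSecond paragraph."
    german = "Hallo.\n\nZweiter Absatz."
    for src_unit, old_unit in pair_units(english, german)[0]:
        print(repr(src_unit), "<-", repr(old_unit))
```

Positional pairing is the simplest possible choice and fails as soon as a translation has merged, dropped or reordered paragraphs, which is why the sketch reports a mismatch instead of guessing.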