Content translation/Development Plan/Roadmap

Content Translation Minimum Viable Product (MVP) release
See below for the detailed development plan for each of these features.

Feature Set: Translation View

 * 1) Entry points
 * 2) Entry point: Red interlanguage link
 * 3) Editor
 * 4) Basic editor to support side-by-side and section-by-section editing
 * 5) Progress bar
 * 6) Tools column
 * 7) Limited machine translation with support to detect lack of post-editing (correction of machine translation output)
 * 8) Reference adaptation - unchanged copying and manual editing
 * 9) Rich-text manipulation with LinearDoc (partially flattened tree structure)
 * 10) Publish the content as a user sub-page after HTML to wikitext conversion with Parsoid
 * 11) Link Adaptation
 * 12) Support for link adaptation (applied to source text only) (wikidata support)
 * 13) Support for link adaptation (applied to MT target text)
 * 14) Machine translation support (MT)
 * 15) Beta deployment using Apertium MT: Spanish-Catalan
 * 16) MT translation warning and progress
 * 17) Store MT unchanged percentage on save (can use a category)
 * 18) Dictionary support
 * 19) At least one bilingual dictionary based on dictd files from Freedict or similar #3639.
 * 20) Templates support
 * 21) Copy other templates unchanged
 * 22) Do not allow editing templates, references or other alienated content. Editability must be whitelisted not blacklisted.
 * 23) Support for template “adaptation” in at least one language pair
 * 24) Architecture (technical feature)
 * 25) Use aggressively cacheable architecture
 * 26) Server testing infrastructure
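
Items 7 and 17 in the list above both depend on measuring how much machine-translated text is left unedited. The following is a minimal sketch of one way to estimate that figure, assuming a simple whitespace-token comparison with Python's standard `difflib`. The function names and the 75% threshold are hypothetical, chosen only for illustration; they are not taken from the CX codebase, which may measure this differently.

```python
from difflib import SequenceMatcher


def unchanged_percentage(mt_output: str, published: str) -> float:
    """Estimate what share (0-100) of the MT output tokens survive in the published text.

    Assumption: tokens are whitespace-separated words. A real implementation
    would likely need language-aware tokenization.
    """
    mt_tokens = mt_output.split()
    final_tokens = published.split()
    if not mt_tokens:
        # No MT output means there is nothing that could remain unchanged.
        return 0.0
    matcher = SequenceMatcher(a=mt_tokens, b=final_tokens, autojunk=False)
    # Sum the sizes of all token runs that appear, in order, in both texts.
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * kept / len(mt_tokens)


def needs_post_editing_warning(mt_output: str, published: str,
                               threshold: float = 75.0) -> bool:
    """Return True when more than `threshold` percent of the MT output is unedited.

    The default threshold is a hypothetical value, not a project decision.
    """
    return unchanged_percentage(mt_output, published) > threshold
```

The percentage could be stored on save, for example by mapping it to a tracking category as item 17 suggests, while the boolean could drive the warning described in item 16.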

Production Deployment - Resources & Provisioning
WIP

Development Plan
Mingle Story Board

CX Deployment Plan for Beta Feature July 2014
Deployment date: TBD (July 1-15 2014)

Project: Content Translation Framework

Initial Release: Beta Feature release July 2014

What is targeted for Beta Feature release July 2014

Minimum Viable Product (MVP) for v1.0: https://www.mediawiki.org/wiki/Content_translation/Roadmap#Content_Translation_Minimum_Viable_Product_.28MVP.29_release

Long-term project roadmap: https://www.mediawiki.org/wiki/Content_translation/Roadmap

Language Pair to be supported for MVP: Spanish - Catalan

Release as: Beta Feature

Overall Plan
We plan to deploy the CX framework only on the Spanish and Catalan language wikis, to support translating article content from Spanish to Catalan where there are red links for parallel articles in Catalan.

System Architecture
See: https://www.mediawiki.org/wiki/Content_translation/Technical_Architecture

https://www.mediawiki.org/wiki/Content_translation#Workflow_and_Technical_Architecture

https://www.mediawiki.org/wiki/Content_translation

Caching Architecture
The following diagram includes the caching requirements for the CX framework.

https://www.mediawiki.org/wiki/Content_translation/Server_communications_workflow

https://commons.wikimedia.org/wiki/File:CX_ArchitectureV1.svg

Components to be provisioned for production
CX server installation and configuration: https://git.wikimedia.org/markdown/mediawiki%2Fservices%2Fcxserver.git/HEAD/README.md

See Setup: https://www.mediawiki.org/wiki/Content_translation/Setup for detailed information about the components and for installation and configuration instructions.


 * Node.js


 * Dictd server (Also see: https://www.mediawiki.org/wiki/Content_translation/Dictionaries)


 * Apertium


 * Extension dependencies:
 * BetaFeatures
 * CLDR
 * EventLogging

Varnish:
 * Backend Services


 * External APIs called by CX
 * Wikidata
 * Parsoid API

Upstart and Systemd scripts are at: https://www.mediawiki.org/wiki/Content_translation/Setup
 * Configuration Scripts

Provisioning Plan
a. Storage Requirements: To be determined from discussion with ops

b. Hardware Requirements: To be determined from discussion with ops

c. Bandwidth Requirements: To be determined from discussion with ops

d. Performance expectations: https://www.mediawiki.org/wiki/Performance_guidelines
 * MT TPS (Transactions per second)
 * User responsiveness
 * MT Round trip
 * General guidelines

https://www.mediawiki.org/wiki/Performance_profiling_for_Wikimedia_code

Monitoring and metrics

 * EventLogging activity for CX
 * Number of users enabling the feature
 * Performance of S:CX, backend calls?
 * Check for node and varnish? Who to page?
 * Graph showing requests or timings for the Wikidata API(s) we are calling
 * Graph showing requests or timings for the Parsoid API(s) we are calling
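
The request counts and timings mentioned in the last two items could be collected along the following lines. This is only an illustrative sketch: the `BackendTimings` class and the backend labels are hypothetical, and an actual deployment would presumably report through the existing monitoring infrastructure rather than keep samples in memory.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import median


class BackendTimings:
    """Collect per-backend call durations (hypothetical helper, not part of CX)."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the class can be exercised without real delays.
        self._clock = clock
        self._samples = defaultdict(list)

    @contextmanager
    def measure(self, backend: str):
        """Time the enclosed block and record the duration under `backend`."""
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the wrapped call raised an exception.
            self._samples[backend].append(self._clock() - start)

    def summary(self) -> dict:
        """Return request count, median, and maximum duration (seconds) per backend."""
        return {
            name: {
                "count": len(durations),
                "median_s": median(durations),
                "max_s": max(durations),
            }
            for name, durations in self._samples.items()
        }
```

A caller would wrap each outgoing request, for example `with timings.measure("parsoid"): ...`, and periodically hand the output of `summary()` to whatever graphing tool is agreed with ops.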

External Signoffs Required

 * Faidon - Ops
 * Gabriel - Infrastructure architecture
 * Ori - Performance
 * Chris Steipp - Security
 * Greg G - Release engineering
 * Mark - Ops
 * Tim - Platform

LE Team responsibilities

 * Kartik - Deployment, Engineer
 * Niklas - Engineer, Code Reviewer
 * Santhosh - Engineer, Code Reviewer
 * David - Engineer, Code Reviewer
 * Runa - Team Scrum-Ninja / testing and communications
 * Pau - Feature UX reviewer, designer
 * Amir - Feature signoff
 * Alolita - Engineering coordination, Eng Manager

Updates

 * cxserver deployment repository: https://gerrit.wikimedia.org/r/mediawiki/services/cxserver/deploy ✅
 * Create cxserver instance on beta labs
 * Puppetize cxserver: https://gerrit.wikimedia.org/r/#/c/139095/ ✅