Content translation/Development Plan/Roadmap/CX02Release

From mediawiki.org

Content Translation 0.02 release


The goal of this release is to make the translation process more fluent and to give users more flexibility in how they start translating. See below for the detailed development plan for each of these features.

Increase language support (Labs->Beta->Prod)

  1. Languages with high-quality support through Machine Translation Engines
    1. Define criteria for enabling new language pairs. Done
    2. Selection is waiting on preliminary user testing of production-ready language pairs in Apertium.
    3. Blocked due to technical issues in the infrastructure setup on Wikimedia Beta Labs.

Feature Set

  1. New entry points
    1. Translation dashboard to initiate and continue translations.
      1. Auto-saving translation drafts as users translate.
      2. Initiate translations from dashboard
      3. Notifications (pointing to the dashboard) about relevant translation-related events.
    2. Entry point to the dashboard from the contributions page.
  2. Editor: improved language tools
    1. Editing
      1. Keep focus on content for fluent editing.
      2. Warnings and options for existing translations.
      3. Avoid adding formatting when pasting content.
    2. Exploration and basic support for the Yandex, Google or Bing API
    3. Category adaptation
    4. Better support for links:
      1. Red links support
      2. Handle link adaptation for disambiguation pages
      3. Creating links and editing their target
  3. Infrastructure improvements
    1. Make it ready to be deployed.
  4. Analytics:
    1. Content Translation publishing data
    2. Visualization (basic)

Auto-saving translation drafts


From gerrit:172528: this feature covers translation drafts. A translator can save a translation and resume it later. The draft content is annotated HTML with segmented sections and sentences (plus a lot of other data in the DOM that represents the state of the translation workflow). These drafts won't be available as articles, but they can be opened in the translation editor, resumed, and published.

Drafts can be resumed from the central Content Translation dashboard on any OS, browser, wiki, or machine, and by any other translator (this is futuristic).
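To make the draft model described above more concrete, here is a minimal sketch. It assumes a hypothetical central store that keeps one draft per translator, language pair, and source title. The names `Draft`, `DraftStore`, `save`, and `resume` are illustrative only and are not taken from the CX code or from gerrit:172528.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple

# Hypothetical key: one draft per translator, language pair and source title.
DraftKey = Tuple[str, str, str, str]


@dataclass
class Draft:
    translator: str
    source_language: str
    target_language: str
    source_title: str
    # Annotated HTML with segmented sections and sentences plus workflow state.
    html: str
    saved_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def key(self) -> DraftKey:
        return (self.translator, self.source_language,
                self.target_language, self.source_title)


class DraftStore:
    """In-memory stand-in for central draft storage (an assumption, not the CX design)."""

    def __init__(self) -> None:
        self._drafts: Dict[DraftKey, Draft] = {}

    def save(self, draft: Draft) -> None:
        # A later save for the same key overwrites the earlier draft.
        self._drafts[draft.key] = draft

    def resume(self, key: DraftKey) -> Optional[Draft]:
        # Returns None when no draft exists for this key.
        return self._drafts.get(key)
```

Because the key in this sketch depends only on the translator and the translation, not on the device, a draft saved from one machine or browser could be looked up from another. Drafts live apart from articles, which matches the note that they are not available as articles until published.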

Production Deployment - Resources & Provisioning



Completion Date/Milestones Features Sprints
October 8 - October 21 2014
October 22 - November 4 2014
November 5 - November 18 2014

Development Plan


Mingle Story Board

Feature Details
Entry Points
  • Translation Dashboard (more below)
  • Entry point: New translation from Contributions page
  • "New translation" dialog improvements
  • Notifications (pointing to the dashboard) about relevant translation-related events.
  • Layout and Design
    • Top navigation bar adjustments
    • Keep text focus on content for fluent editing.
  • Editing
    • Handle red links in the source column
    • Adapt red links in the translation
    • Existing translation: warning and options
    • Link highlighting: distinguish active from connected links
    • Auto-save translations
  • Publishing
    • Mark articles published with a high amount of automatic translation
    • Warnings about existing articles and options to deal with them
Link and Category Adaptation
  • Auto-adapt categories
  • A keyboard shortcut for link adaptation
  • Support link adding with disambiguation pages
  • Link adaptation - edit links
  • Red link adaptation.
Translation Dashboard
  • Create a new translation from the Translation Center
  • Add content to existing articles
  • Translation-related notifications infrastructure
  • Continue an existing translation from the Translation Dashboard list
Machine Translation Support (mt)
  • Support for one additional translation service.
Dictionary Support
Templates Support
Architecture (technical feature)
Research and preliminary development
  • Expose Content Translation publishing data
  • Update publishing data collection
  • Set up Limn instance in labs

CX Deployment Plan for 0.02 Release November 2014


Deployment date: TBD

Project: Content Translation Framework

Release: 0.02 (third release)

Long-term project roadmap: Content_translation/Roadmap

Language Pairs to be supported:

Release as: Beta Feature

Overall Plan


System Architecture


See: https://www.mediawiki.org/wiki/Content_translation/Technical_Architecture



Caching Architecture


The following diagram includes the caching requirements for the CX framework.



Components to be provisioned for production


CX server installation and configuration: https://git.wikimedia.org/markdown/mediawiki%2Fservices%2Fcxserver.git/HEAD/README.md

See Setup (https://www.mediawiki.org/wiki/Content_translation/Setup) for detailed information about components, installation, and configuration instructions.

  • Node.js
  • Apertium
  • Extension dependencies:
    • BetaFeatures
    • CLDR
    • EventLogging
  • Backend Services


  • External APIs called by CX
    • Wikidata
    • Parsoid API
  • Configuration Scripts

Upstart and Systemd scripts are at: https://www.mediawiki.org/wiki/Content_translation/Setup

Provisioning Plan

  • Storage Requirements

To be determined from discussion with ops

  • Hardware Requirements

To be determined from discussion with ops

  • Bandwidth Requirements

To be determined from discussion with ops

  • Performance expectations
    • MT TPS (Transactions per second)
    • User responsiveness
    • MT Round trip
    • General guidelines

Monitoring and metrics

  • EventLogging activity for CX
  • Number of users enabling the feature
  • Performance of S:CX, backend calls?
  • Check for Node.js and Varnish? Who to page?
  • Graph showing requests or timings for the Wikidata API(s) we are calling
  • Graph showing requests or timings for the Parsoid API(s) we are calling
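
The request and timing graphs listed above need per-backend samples to draw from. The following is a minimal sketch of how such samples might be collected, assuming a hypothetical in-process `CallTimings` helper. It is not part of CX or EventLogging, and the backend labels are arbitrary strings chosen by the caller.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, Iterator, List


class CallTimings:
    """Collects wall-clock durations of backend calls, grouped by backend name."""

    def __init__(self) -> None:
        self._durations: Dict[str, List[float]] = defaultdict(list)

    @contextmanager
    def measure(self, backend: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the wrapped call raised.
            self._durations[backend].append(time.perf_counter() - start)

    def summary(self, backend: str) -> Dict[str, float]:
        samples = self._durations.get(backend, [])
        if not samples:
            return {"requests": 0, "mean_seconds": 0.0, "max_seconds": 0.0}
        return {
            "requests": len(samples),
            "mean_seconds": sum(samples) / len(samples),
            "max_seconds": max(samples),
        }
```

A caller would wrap each outgoing request in `with timings.measure("parsoid"):` (or `"wikidata"`) and periodically export `summary()` to whatever graphing system ops selects.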

External Signoffs Required

  • Faidon - Ops
  • Gabriel - Infrastructure architecture
  • Ori - Performance
  • Chris Steipp - Security
  • Greg G - Release engineering
  • Mark - Ops
  • Tim - Platform

LE Team responsibilities

  • Kartik - Deployment, Engineer
  • Niklas - Engineer, Code Reviewer
  • Santhosh - Engineer, Code Reviewer
  • David - Engineer, Code Reviewer
  • Joel - Engineer, Code Reviewer
  • Runa - Team Scrum-Ninja / testing and communications
  • Pau - Feature UX reviewer, designer
  • Amir - Feature signoff
  • Alolita - Engineering coordination, Eng Manager