Content translation/status

Last update on: 2014-08-monthly

2013-07-01
Project started.

2014-01-monthly
The language engineering team kicked off development of a prototype of the content translation workflow. This functionality aims to create a workspace that helps editors bootstrap new articles in non-Latin language Wikipedias. In the prototype, Russian and Welsh are being used for initial concept verification.

2014-02-monthly
The prototype ContentTranslation server was created in Node.js, mostly by Santhosh Thottingal and David Chan. The server will be responsible for syncing translations between all the languages, storing translated parallel texts (using Redis), and retrieving and caching the results of language tool queries (machine translation, translation memory, dictionaries, segmentation, etc.).

Some front-end components for the translation interface were built, mostly by Sucheta Ghoshal and Amir Aharoni.

2014-03-monthly
Santhosh Thottingal and David Chan continued development and technology research on the Content Translation project. Development was focused specifically on updates to the side-by-side translation editor and section alignment of translated text. Kartik Mistry and Santhosh Thottingal worked on infrastructure for testing the Content Translation server. David Chan continued his technology research on sentence segmentation.

Pau Giner updated the Content Translation UI design specification incorporating review comments from UX and product reviews. The team also participated in a review of the Content Translation project with the product team leadership.

2014-04-monthly
ContentTranslation was the team's main effort this month. Source text segmentation was further improved and stabilized. Other developed features include:
* A beta feature that shows a red interlanguage link when the article has not been translated into the user's language;
* Basic handling of templates and images;
* Basic publishing of the translation as a formatted article;
* Testing infrastructure for the server.

2014-05-monthly
Most of the team met in Valencia to complete the ContentTranslation architecture and roadmap. The dictionary feature is now up for limited testing.

2014-06-monthly
The team added support for link adaptation and worked on the infrastructure for machine translation support using Apertium. They also worked on hiding templates, images and references that cannot be easily translated, prepared for deployment on beta wikis, and made multiple bug fixes and design tweaks.

2014-07-monthly
An initial version was released on Beta Labs; it supports machine translation between Spanish and Catalan. The machine translation API uses Apertium, an open source machine translation engine. The tool supports experimental template adaptation between languages. Numerous bug fixes were made based on testing and user feedback. The team worked on matching the Apertium version to the cluster, and planning for the next round of development has started.

2014-08-monthly
<section begin="2014-08-monthly"/>* Redid the machine translation abuse algorithm.
* Improved reference adaptation.
* Refactored the frontend event architecture.
* Rewrote the cxserver registry to support multiple machine translation engines.<section end="2014-08-monthly"/>