Content translation/Machine Translation/Configuration

From mediawiki.org

Machine translation configuration for ContentTranslation is spread across several repositories. This article lists them as a reference for service maintainers in WMF infrastructure.

Add new languages to an existing machine translation service

Update the cxserver/config/MTSERVICE.yaml file (for example, Google.yaml) to add the new language. First check the service's web API documentation to confirm that the language is supported there. For example, the languages supported by Google Translate are listed with their codes at: https://cloud.google.com/translate/docs/languages

Default machine translation for language pairs

If a language pair has more than one machine translation service, cxserver picks the first one in alphabetical order as the default MT.

To change the default MT for a language pair, update cxserver/config/mt-defaults.wikimedia.yaml. This file affects production. For local testing or other purposes, use cxserver/config/mt-defaults.yaml.

Examples:

'en-zh': Google
'sv-da': Apertium
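
The selection rule described above can be sketched as follows. This is only an illustration under stated assumptions, not cxserver's actual code: the function name pickDefaultMT and the shape of its arguments are hypothetical, and the sketch does not check whether an overriding service is actually available for the pair.

```javascript
// Hypothetical helper, not part of cxserver. It illustrates the rule that an
// explicit entry in the defaults file wins; otherwise the alphabetically
// first available service is used.
function pickDefaultMT(pair, availableServices, defaults) {
	if (Object.prototype.hasOwnProperty.call(defaults, pair)) {
		return defaults[pair];
	}
	if (availableServices.length === 0) {
		return null;
	}
	// Copy before sorting so the caller's array is not modified.
	return [...availableServices].sort()[0];
}

// Entries taken from the examples above.
const defaults = { 'en-zh': 'Google', 'sv-da': 'Apertium' };

// Explicit default for the pair.
console.log(pickDefaultMT('en-zh', ['Apertium', 'Google'], defaults)); // Google
// No explicit default: alphabetical order decides.
console.log(pickDefaultMT('en-fr', ['Google', 'Apertium'], defaults)); // Apertium
```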

Adding machine translation for closely related languages

  • Update lib/mt/MTSERVICE.js as needed.

For example, update googleLanguageNameMap in lib/mt/Google.js to add a new mapping for closely related languages:

wuu: 'zh', // T258919 
gan: 'zh-TW' // T258919
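
A minimal sketch of how such a map might be applied when preparing a request to the service. The map entries are copied from the example above; the helper name toServiceCode is hypothetical, and the actual lookup in lib/mt/Google.js may differ.

```javascript
// Entries copied from the example above; the real map in lib/mt/Google.js
// may contain more entries.
const googleLanguageNameMap = {
	wuu: 'zh', // T258919
	gan: 'zh-TW' // T258919
};

// Hypothetical helper: return the code the MT service understands, falling
// back to the wiki language code when no mapping exists.
function toServiceCode(wikiCode, map) {
	return Object.prototype.hasOwnProperty.call(map, wikiCode) ? map[wikiCode] : wikiCode;
}

console.log(toServiceCode('wuu', googleLanguageNameMap)); // zh
console.log(toServiceCode('fr', googleLanguageNameMap)); // fr
```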

See also

Removing a language as a target

  • Update cxserver/config/MTSERVICE.yaml and add a 'notAsTarget' block listing each language that is not needed as a target in machine translation.

Example: if the 'xy' language is not needed as a target:

notAsTarget:
- xy
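
The effect of this block can be sketched as a simple filter. This is an assumption-laden illustration rather than cxserver's implementation: only the 'notAsTarget' key comes from the text above, while the 'languages' key, the language codes, and the helper name targetLanguages are hypothetical.

```javascript
// Hypothetical parsed form of a cxserver/config/MTSERVICE.yaml file.
// Only 'notAsTarget' comes from the text above; the 'languages' key and
// the language codes are assumptions for this example.
const serviceConfig = {
	languages: ['en', 'es', 'xy'],
	notAsTarget: ['xy']
};

// Hypothetical helper: languages that may be used as a translation target.
function targetLanguages(config) {
	const excluded = new Set(config.notAsTarget || []);
	return config.languages.filter((code) => !excluded.has(code));
}

console.log(targetLanguages(serviceConfig)); // [ 'en', 'es' ]
```

In this sketch 'xy' remains available as a source language; it is only dropped from the list of targets.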

Update the machine translation threshold for a wiki

  • Update mediawiki-config/wmf-config/InitialiseSettings.php, setting wmgContentTranslationUnmodifiedMTThresholdForPublish for the wiki.

Disable or enable machine translation for a wiki

  • Update mediawiki-config/wmf-config/InitialiseSettings.php, setting wgContentTranslationEnableMT to true or false for the wiki.

Add a new machine translation service

Add the code as described in the page above. Then update the configuration in the deployment-charts repository. This step requires coordination with the SRE team to deploy the private API key, and it may also require a security review.