Content translation/Deployments/How-to

From mediawiki.org

This document describes the deployment procedure for ContentTranslation, cxserver, Apertium, and MinT.

ContentTranslation

ContentTranslation is updated via the regular MediaWiki deployment train. If a manual update is needed:

  1. Use a backport window to cherry-pick the desired changes.
  2. Make sure that the Gerrit patch is merged only after the deployment server has been updated to the branch we want to deploy.

See also

  1. Branches at Gerrit interface: https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/ContentTranslation,branches
  2. To manually update the extension branch: https://wikitech.wikimedia.org/wiki/How_to_deploy_code#Updating_the_submodule (You'll need a clean copy of MediaWiki/core)

Services

The status of all services deployed by the Language team can be retrieved from this Grafana dashboard.

The deployment procedure is the same for all services, apart from a few minor differences. The following procedure applies to Apertium, cxserver, and MinT.

Testing locally

Note the image tag of the Gerrit patch to be deployed. For example, gerrit:502964 has the tag 2019-04-11-112002-production.

Run it:

docker run -p 4000:8080 docker-registry.wikimedia.org/wikimedia/mediawiki-services-cxserver:2019-04-11-112002-production --it --entrypoint /bin/bash -c config.dev.yaml

Here, config.dev.yaml is the local cxserver config file.

Endpoints can be tested at: http://localhost:4000

For example, MinT can be tested using:

docker run -p 8989:8989 wikipedia-mt:2023-06-16-042302-production

Testing on staging

To test whether MinT translation works on staging, supply the source language, the target language, and the model to use.

curl 'https://machinetranslation.k8s-staging.discovery.wmnet:30443/api/translate' -X POST -H 'Content-Type: application/json' --data-raw '{"source_language": "en", "target_language": "wuu", "model": "madlad-400", "format": "text", "content":"Jazz is a music genre"}'
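
When several language pairs or models need checking, the request above can be wrapped in small shell functions. The following is a minimal sketch, not an official tool: the function names mint_payload and mint_translate_staging are hypothetical, and the payload builder assumes that none of its arguments contain double quotes or backslashes, because it does no JSON escaping.

```shell
# Build the JSON body for a MinT /api/translate request.
# Assumption: no argument contains double quotes or backslashes,
# since this sketch performs no JSON escaping.
mint_payload() {
    # $1 = source language, $2 = target language, $3 = model, $4 = text
    printf '{"source_language": "%s", "target_language": "%s", "model": "%s", "format": "text", "content": "%s"}\n' \
        "$1" "$2" "$3" "$4"
}

# Send the request to the staging endpoint used in this document.
mint_translate_staging() {
    curl 'https://machinetranslation.k8s-staging.discovery.wmnet:30443/api/translate' \
        -X POST -H 'Content-Type: application/json' \
        --data-raw "$(mint_payload "$@")"
}
```

Usage: mint_translate_staging en wuu madlad-400 'Jazz is a music genre'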

Testing MT services

Apertium and MinT

Example: curl -vk 'https://staging.svc.eqiad.wmnet:4002/v2/translate/en/es/Apertium' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{ "html": "<p>Water is cold</p>"}'

Services with proxy

Services such as LingoCloud and Yandex require a cxtoken, which can be generated using: https://en.wikipedia.org/wiki/Special:ApiSandbox#action=cxtoken&format=json

Once the token is generated, pass it in the Authorization header in place of CXTOKEN:

curl -vk 'https://staging.svc.eqiad.wmnet:4002/v2/translate/en/zh/LingoCloud' -H 'accept: application/json' -H 'Content-Type: application/json' -H 'Authorization: CXTOKEN' -d '{ "html": "<p>Water is cold</p>"}'
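
The two cxserver requests in this section differ only in the language pair, the provider, and whether an Authorization header is sent. A rough sketch of a helper that covers both cases follows. The names CX_STAGING_URL, cx_translate_url, and cx_translate are hypothetical, as is the convention of reading the token from a CXTOKEN environment variable; the HTML argument is assumed to contain no double quotes.

```shell
# Base URL of the staging cxserver instance, as used in the examples above.
CX_STAGING_URL='https://staging.svc.eqiad.wmnet:4002'

# Print the translation endpoint for a language pair and provider.
cx_translate_url() {
    # $1 = source language, $2 = target language, $3 = provider
    printf '%s/v2/translate/%s/%s/%s\n' "$CX_STAGING_URL" "$1" "$2" "$3"
}

# Translate an HTML fragment. If the (hypothetical) CXTOKEN environment
# variable is non-empty, it is sent in the Authorization header.
cx_translate() {
    # $1 = source, $2 = target, $3 = provider, $4 = HTML without double quotes
    if [ -n "${CXTOKEN:-}" ]; then
        curl -vk "$(cx_translate_url "$1" "$2" "$3")" \
            -H 'accept: application/json' -H 'Content-Type: application/json' \
            -H "Authorization: $CXTOKEN" \
            -d "{ \"html\": \"$4\"}"
    else
        curl -vk "$(cx_translate_url "$1" "$2" "$3")" \
            -H 'accept: application/json' -H 'Content-Type: application/json' \
            -d "{ \"html\": \"$4\"}"
    fi
}
```

Usage: cx_translate en es Apertium '<p>Water is cold</p>'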

Testing APIs

Config files

Production config lives in:

helmfile.d/services/SERVICE/values.yaml

WMF-specific config lives in:

deployment-charts/charts/SERVICE/templates/_config.yaml

When anything under the chart directory is updated, bump the chart version in deployment-charts/charts/SERVICE/Chart.yaml.
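
The version bump can be done by hand, but a small helper makes it harder to forget. The sketch below is an illustration only: the function names are hypothetical, and it assumes that Chart.yaml has a single top-level line of the form "version: MAJOR.MINOR.PATCH" (unquoted) and that incrementing the patch component is the desired bump.

```shell
# Print the next patch version for a MAJOR.MINOR.PATCH string.
next_patch_version() {
    printf '%s\n' "$1" | awk -F. '{ printf "%d.%d.%d\n", $1, $2, $3 + 1 }'
}

# Rewrite the first top-level "version:" line of a Chart.yaml in place.
# Assumption: the line looks like "version: 0.1.2" with no quotes.
bump_chart_version() {
    # $1 = path to Chart.yaml
    chart_file=$1
    current=$(awk '/^version:/ { print $2; exit }' "$chart_file")
    bumped=$(next_patch_version "$current")
    tmp=$(mktemp)
    awk -v v="$bumped" '/^version:/ && !seen { print "version: " v; seen = 1; next } { print }' \
        "$chart_file" > "$tmp"
    # Copy the content back so the original file keeps its permissions.
    cat "$tmp" > "$chart_file"
    rm -f "$tmp"
    printf '%s -> %s\n' "$current" "$bumped"
}
```

Usage: bump_chart_version deployment-charts/charts/SERVICE/Chart.yaml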

Deployment

  1. Clone the deployment-charts repository (first time only).
  2. Make the necessary config changes (update the image tag or other configuration as needed).
  3. Submit the change for review (example: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/623475) and merge it after a successful review.
  4. After the merge, log in to a deployment server (e.g. deploy1001). A cron job that runs every minute updates the /srv/deployment-charts directory with the contents from git.
  5. Go to /srv/deployment-charts/helmfile.d/services/SERVICE, for example /srv/deployment-charts/helmfile.d/services/cxserver.
  6. Execute: helmfile -e ${CLUSTER} diff --context 5. This shows the changes that will be applied to the cluster.
  7. Execute: helmfile -e ${CLUSTER} -i apply. This applies the previous diff to the cluster and also logs the change to SAL.
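
Steps 5 to 7 above can be sketched as a shell function that is run once per cluster. This is a minimal illustration rather than an established script: the names run and deploy_to_cluster and the DRY_RUN variable are hypothetical. With DRY_RUN=1 the commands are only printed, which is useful for checking what would be executed.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Show the pending diff for one cluster, then apply it interactively.
# The body runs in a subshell so that the caller's working directory
# is left unchanged.
deploy_to_cluster() (
    # $1 = service name (e.g. cxserver), $2 = cluster (e.g. staging)
    run cd "/srv/deployment-charts/helmfile.d/services/$1" || exit 1
    run helmfile -e "$2" diff --context 5 || exit 1
    run helmfile -e "$2" -i apply
)
```

Usage: DRY_RUN=1 deploy_to_cluster cxserver staging prints the three commands; without DRY_RUN they are executed, and helmfile asks for confirmation before applying because of the -i flag.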

Status

The status of a deployment can be checked using helmfile:

  1. Change directory to /srv/deployment-charts/helmfile.d/services/SERVICE on a deployment server.
  2. Unless there are un-applied changes, the current values files should reflect the deployed values.
  3. You can check for un-applied changes with: helmfile -e ${CLUSTER} diff --context 5
  4. You can see the status with: helmfile -e ${CLUSTER} status

Logs

  • Service logs are available in Logstash, in dashboards such as 'cxserver-last-24-hours' and other similar ones.
  • Logs can also be accessed from deploy1002 if needed:

kube_env cxserver eqiad

kubectl logs cxserver-production-6c4f65bc-z6hcb cxserver-production

Here, cxserver-production-6c4f65bc-z6hcb is the pod name and cxserver-production is the container name.

To see all logs:

kubectl logs -l app=cxserver -c cxserver-production
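
When a label selector (-l) is used, kubectl logs shows only the last 10 lines per pod by default, so it helps to pass --tail explicitly and to filter the output. The following sketch assumes that a plain-text match on "error" or "warn" is good enough for a first look; the function name only_problems is hypothetical.

```shell
# Keep only the log lines that mention an error or a warning
# (case-insensitive plain-text match, read from standard input).
only_problems() {
    grep -iE 'error|warn'
}

# Example, on a deployment server after running: kube_env cxserver eqiad
#   kubectl logs -l app=cxserver -c cxserver-production --tail=200 | only_problems
# Use --tail=-1 instead to fetch the complete logs.
```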

Rolling back changes

If you need to roll back a change because something went wrong:

Reverting patch

  1. Revert the git commit in the deployment-charts repository.
  2. Merge the revert (with review if needed).
  3. Wait one minute for the cron job to pull the change to the deployment server.
  4. Change directory to /srv/deployment-charts/helmfile.d/services/SERVICE.
  5. Execute helmfile -e ${CLUSTER} diff --context 5 to see what you'll be changing.
  6. Execute helmfile -e ${CLUSTER} -i apply, where CLUSTER is one of staging, eqiad, or codfw.

When the reverted patch included a chart version bump, helmfile will still pick the highest chart version present. Reverting such a change therefore requires pinning the desired chart version in the configuration after the revert. Example: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/803873

Reverting to particular release

Sometimes the helm release gets stuck in the 'pending-upgrade' status.

First, switch to the deploy user's credentials:

export KUBECONFIG=/etc/kubernetes/${SERVICE}-deploy-${CLUSTER}.config

Find the REVISION of the last deployed release using the status command, then roll back with:

helm -n ${SERVICE} rollback production ${REVISION}
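
Because a rollback is easy to mistype under pressure, it can help to generate the two commands above from their three inputs and review them before running anything. The sketch below only prints the commands and never executes them; the function name rollback_commands is hypothetical.

```shell
# Print the commands for rolling the "production" release of a service
# back to a given revision. Nothing is executed; review the output and
# run it by hand.
rollback_commands() {
    # $1 = service (e.g. cxserver), $2 = cluster (e.g. eqiad), $3 = revision
    case "$3" in
        ''|*[!0-9]*)
            echo "revision must be a number" >&2
            return 1
            ;;
    esac
    printf 'export KUBECONFIG=/etc/kubernetes/%s-deploy-%s.config\n' "$1" "$2"
    printf 'helm -n %s rollback production %s\n' "$1" "$3"
}
```

Usage: rollback_commands cxserver eqiad 12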

Updating Template Parameter Alignment database

Cxserver Database

  • x1: wikishared.cx_*: cxserver main database.
  • m5-master: titles: section title mapping database.

Access

x1

x1 can be accessed via the sql.php script on the mwmaint server. See: https://wikitech.wikimedia.org/wiki/Debugging_in_production#Debugging_databases

Note that ContentTranslation on testwiki does not use wikishared; it uses a separate database, testwiki.

m5-master

Accessing m5-master requires the password of the cxserver user.

% mysql --skip-ssl -h m5-master.eqiad.wmnet -ucxserver -p
Enter password:

Secrets

If secrets such as API keys or tokens need to be updated, this must be done by SRE in the private Puppet repository.

To update existing keys or add new ones, open a Phabricator task with the details and subscribe the SRE on clinic duty. Example: task T284887
