Help:Extension:Translate/Validators

Translatable strings often contain markup that should be retained as-is in the translation. Typing that markup can be slow and difficult because special characters are common. The Translate extension can provide translators with a button that inserts the piece of markup into the translation at the current cursor position. In addition, if a translation is missing that specific markup, the Translate extension can either warn the translator or reject the translation, since such markup is usually mandatory for displaying the message properly to the end user.

For example, in the string "Adapted by %{name} from a work by %{original}" there are two insertables: %{name} and %{original}. If the translator does not include them in the translation, the end user of the software will not see a proper message.

The validation framework was added to help validate translations. Validators run on the translated message and, depending on the configuration, show a warning or an error message to the translator. Translations with warnings can still be saved, but ones that have errors cannot. Only a user with permission can save translations that have errors.

When configuring a validator, a regex is defined to identify markup that is mandatory. The validator can also be marked as insertable, in which case a button will be displayed to the translator to add that markup into the translation.

Adding custom validators is still possible and is needed for more specialized validations.

Configuration
The following is a summarized validator configuration:

In the example above,


 * 1)   is a bundled validator that can accept a custom regex and run validations.
 * 2)   is a custom validator class.
 * 3)   is another bundled validator.

uses an array format. Let's look at the parameters used in each array item:

Pre-provided / Bundled validators
The following is a list of bundled validators:

BraceBalance
ID:

Ensures that the number of open braces / brackets matches the number of closed braces / brackets in the translation.

For example, the following translations would pass,



whereas, the following would fail,



This validator cannot be marked as insertable.
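The counting rule described above can be sketched as follows. This is a minimal illustration in Python, not the extension's own implementation; the function name `braces_balanced` and the set of bracket pairs checked are assumptions made for the example.

```python
# Hypothetical sketch: a translation passes when each opening bracket
# character occurs as often as its closing counterpart.
PAIRS = [("(", ")"), ("[", "]"), ("{", "}")]  # assumed set of bracket pairs


def braces_balanced(translation: str) -> bool:
    """Return True when opening and closing counts match for every pair."""
    return all(
        translation.count(opening) == translation.count(closing)
        for opening, closing in PAIRS
    )
```

Note that this compares counts only, as the description states; it does not check that the brackets are correctly nested.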

EscapeCharacter
ID:

The validator ensures that only the specified escape characters are present in a translation.

The allowed escape characters can be specified when adding the validator, and can only include,



This validator is not insertable.

GettextNewline
ID:

This works specifically for Gettext-based message groups.

Ensures that the translation has the same number of newlines as the source message at the beginning and end of the string.
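As a rough illustration of the rule above, the following Python sketch compares the number of leading and trailing newlines in the source and the translation. The helper names `edge_newlines` and `newlines_match` are hypothetical, and the extension's real check may differ in detail.

```python
from typing import Tuple


def edge_newlines(text: str) -> Tuple[int, int]:
    """Count newline characters at the start and at the end of text."""
    leading = len(text) - len(text.lstrip("\n"))
    trailing = len(text) - len(text.rstrip("\n"))
    return leading, trailing


def newlines_match(source: str, translation: str) -> bool:
    """Return True when both strings start and end with the same number of newlines."""
    return edge_newlines(source) == edge_newlines(translation)
```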

GettextPlural
ID:

This works specifically on Gettext-based message groups.

Ensures that if the source / definition contains a plural in the format -, the translation must contain it as well. Based on the language this also checks if the translation has the correct number of plural forms. For example, English has two, but Hebrew has four.

InsertableRegex
ID:

A generic reusable validator that can be used to specify custom validations and insertables.

For example, take the following configuration where the validator is marked as insertable and enforced,

Given the source message ''Hello $name. My name is $myName.'', the translation must contain the parameters $name and $myName. They are also displayed as insertables to make it easier for translators to use them in the translation. If any of these parameters is missing, an error is displayed to the translator.
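The behaviour in this example can be sketched in Python as follows. This is not the extension's code: the pattern `PARAM_RE` and the function name `missing_parameters` are assumptions chosen to match `$name`-style parameters.

```python
import re

# Assumed pattern for parameters such as $name or $myName.
PARAM_RE = re.compile(r"\$[a-zA-Z0-9]+")


def missing_parameters(source: str, translation: str) -> list:
    """Return parameters found in the source but absent from the translation."""
    present = set(PARAM_RE.findall(translation))
    missing = []
    for param in PARAM_RE.findall(source):
        if param not in present and param not in missing:
            missing.append(param)
    return missing
```

The same list of matches from the source message is what a translation editor could offer as insertable buttons; an empty result from `missing_parameters` means the translation passes this check.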

InsertableRubyVariable
ID:

This is a validator that matches Ruby variables in the translations. Internally it extends and uses the following regex -. This validator is insertable.

Example:

IosVariable
ID:

An insertable variable validator for iOS. The regex is taken from [https://github.com/dcordero/Rubustrings/blob/61d477bffbb318ca3ffed9c2afc49ec301931d93/lib/rubustrings/action.rb#L91 here]. This validator is insertable.

Example:

MatchSet
ID:

Ensures that the translation is present in the list of values. Also takes a parameter - that can be either true (default) or false.

For example, with the following configuration, the validator validates the message with key - and ensures that its value is either ltr or rtl. Note that LTR or RTL will not be valid values, since is true by default.
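A minimal Python sketch of this membership check follows. It assumes that the boolean parameter controls case sensitivity, as the remark about LTR and RTL suggests; the names `in_match_set` and `case_sensitive` are hypothetical and are not taken from the extension.

```python
def in_match_set(translation: str, allowed: list, case_sensitive: bool = True) -> bool:
    """Return True when the translation is one of the allowed values."""
    if case_sensitive:
        return translation in allowed
    # Case-insensitive comparison: normalize both sides before comparing.
    return translation.lower() in {value.lower() for value in allowed}
```

With the default setting, `in_match_set("LTR", ["ltr", "rtl"])` is rejected, while passing `case_sensitive=False` accepts it.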

MediaWikiLink
ID:

Checks if the translation uses links that are discouraged. Valid links are those that link to Special: pages, or project pages through MediaWiki messages like. Links that appear in the definition are also allowed.

MediaWikiPageName
ID:

Ensures that if the source / definition contains a namespace, the translation does not translate the namespace itself.

MediaWikiParameter
ID:

This is a validator that matches wiki parameters in the translations. Internally it extends and uses the following regex -. This validator is insertable.

Example: ,.

MediaWikiPlural
ID:

Ensures that if the source / definition contains a, the translation should also have it. It can also be used as an insertable. Based on the language this also checks if the translation has the correct number of plural forms. For example, English has two, but Hebrew has three.

MediaWikiTimeList
ID:

Provides validations for expiry options and IP block options specified in the MediaWiki core. These are usually in the format,

The validations ensure that the translations have the exact same number of key-value pairs. These validations are run only on messages with keys,


 * 1) protect-expiry-options
 * 2) ipboptions

Newline
ID:

Ensures that the translation has the same number of newlines as the source / definition message at the beginning of the string. This validator is not insertable.

NotEmpty
ID:

Ensures that the translation has some content, and that content is not just whitespace. This validator is not insertable.
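In Python terms, the check amounts to something like the following sketch; the function name `is_not_empty` is hypothetical.

```python
def is_not_empty(translation: str) -> bool:
    """Return True when the translation contains something other than whitespace."""
    return translation.strip() != ""
```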

NumericalParameter
ID:

This validator matches numerical parameters by using the following regex:. This validator is insertable.

Example: ,  etc.

Printf
ID:

This validator checks for missing and unknown printf formatting characters in translations. This validator is insertable.

Example: ,  etc.
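The idea of reporting missing and unknown formatting characters can be sketched in Python as below. The pattern `PRINTF_RE` is a simplified assumption that covers common specifiers such as `%s`, `%d` and `%1$s`; it is not the regex used by the extension, and the function name `printf_diff` is hypothetical.

```python
import re
from collections import Counter

# Simplified, assumed pattern: optional position (1$), flags, width,
# precision, then a conversion letter.
PRINTF_RE = re.compile(r"%(?:\d+\$)?[-+0#]*\d*(?:\.\d+)?[a-zA-Z]")


def printf_diff(source: str, translation: str):
    """Return (missing, unknown) printf specifiers of the translation."""
    src = Counter(PRINTF_RE.findall(source))
    dst = Counter(PRINTF_RE.findall(translation))
    missing = sorted((src - dst).elements())  # in source, not in translation
    unknown = sorted((dst - src).elements())  # in translation, not in source
    return missing, unknown
```

A translation passes when both returned lists are empty.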

PythonInterpolation
ID:

This validator matches Python string interpolation variables by using the following regex:. This validator is insertable.

Example: ,

Replacement
ID:

Checks if a translation is using the string, and instead suggests the translator to use the string mentioned under. This validator is not insertable.

SmartFormatPlural
ID:

This works specifically on SmartFormat-based message groups.

Ensures that if the source / definition contains a plural in the format -, the translation must contain it as well. Based on the language this also checks if the translation has the correct number of plural forms. For example, English has two, but Hebrew has four.

UnicodePlural
ID:

Ensures that if the source / definition contains a plural in the format -, the translation must contain it as well. Based on the language this also checks if the translation has the correct number of plural forms. For example, English has two, but Hebrew has three.

User interface
The user interface has been updated to differentiate between errors and warnings.

During translation, if the translation has an error, the Save translation button is disabled unless the translating user has permission.

Additionally, validation is done on the server when the user saves the translation. Users who have the permission can still save the translation even if it has errors.
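The save rule can be summarized in a short Python sketch: warnings never block saving, as noted earlier on this page, while errors block it unless the user has permission. The `Issue` class, the `can_save` function and the `user_can_override` flag are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Issue:
    message: str
    is_error: bool  # True for an error, False for a warning


def can_save(issues: List[Issue], user_can_override: bool) -> bool:
    """Return True when the translation may be saved."""
    has_errors = any(issue.is_error for issue in issues)
    return (not has_errors) or user_can_override
```

Because the check also runs on the server, the same decision would apply even if the disabled button were bypassed in the browser.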

Custom validators
Certain complicated validations might still require a custom validator to be written. Custom validators must implement the interface.

Below is an example of a custom validator,

Also see the following classes,


 * 1)   - https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Translate/+/master/src/Validation/ValidationIssues.php
 * 2)   - https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Translate/+/master/src/Validation/ValidationIssue.php

Then add the custom validator to the configuration file: