User:Seb35/Terminology

From mediawiki.org

Random notes for adding a terminology module inside the Translate extension.

Big picture

  • Terminology: a set of keywords that carry a special meaning within a small context (a document, a specific topic area); in the context of translation, each term has one or more fixed translations
  • The terminology database has to be created manually, automatically, or semi-automatically, or it can be imported from other terminology databases
  • The translator should be offered term suggestions, possibly with a definition, sorted by likelihood (e.g., as a long-term goal, correct masculine/feminine/plural form and grammatical case, correct letter case, etc.)
  • Handle the inflections (grammatical forms) of words both on input, when matching terms in the source text, and on output, when suggesting translations
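
To make the points above more concrete, here is a minimal sketch of what a terminology entry and a suggestion function could look like. Everything in it is an assumption made for illustration: `TermEntry`, `match_case`, `suggest` and the sample translations are hypothetical names and data, not part of the Translate extension. Only the "correct letter case" part is shown; gender, number and grammatical case are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TermEntry:
    """One terminology entry (hypothetical structure, for illustration only)."""
    source: str                    # the term in the source language
    definition: str = ""           # optional definition shown to the translator
    # language code -> fixed translation(s) for that language
    translations: Dict[str, List[str]] = field(default_factory=dict)


def match_case(occurrence: str, translation: str) -> str:
    """Copy the letter case of the occurrence found in the text onto the translation."""
    if len(occurrence) > 1 and occurrence.isupper():
        return translation.upper()
    if occurrence[:1].isupper():
        return translation[:1].upper() + translation[1:]
    return translation


def suggest(entry: TermEntry, language: str, occurrence: str) -> List[str]:
    """Return the fixed translations for one language, in stored order, case-adjusted."""
    return [match_case(occurrence, t) for t in entry.translations.get(language, [])]


# Illustrative data only.
entry = TermEntry(
    source="namespace",
    definition="A group of wiki pages sharing a title prefix.",
    translations={"fr": ["espace de noms"], "de": ["Namensraum"]},
)
print(suggest(entry, "fr", "Namespace"))  # ['Espace de noms']
```

Sorting by likelihood would replace the "stored order" used here once some scoring of the candidates is available.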

Backend:

  • Efficiently match terms in existing and new messages, either online or offline (e.g. during the import or creation of a new document to be translated); the matching should only have to be done once per document
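
One possible way to do this matching in a single pass over a message is sketched below. It is only an illustration under stated assumptions: the function names `build_index` and `find_terms` are hypothetical, terms are compared case-insensitively word by word, and inflections are not handled at all.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"\w+")


def build_index(terms):
    """Group lower-cased, tokenized terms by their first word."""
    index = defaultdict(list)
    for term in terms:
        words = tuple(w.lower() for w in TOKEN.findall(term))
        if words:
            index[words[0]].append(words)
    # Try longer terms first so "translation memory" wins over "translation".
    for candidates in index.values():
        candidates.sort(key=len, reverse=True)
    return index


def find_terms(message, index):
    """Return (start, end, matched text) for each term occurrence in the message."""
    tokens = [(m.group().lower(), m.start(), m.end()) for m in TOKEN.finditer(message)]
    matches = []
    i = 0
    while i < len(tokens):
        for words in index.get(tokens[i][0], ()):
            window = tuple(t[0] for t in tokens[i:i + len(words)])
            if window == words:
                start, end = tokens[i][1], tokens[i + len(words) - 1][2]
                matches.append((start, end, message[start:end]))
                i += len(words) - 1  # skip the tokens consumed by this term
                break
        i += 1
    return matches


index = build_index(["translation memory", "translation", "namespace"])
message = "The Translation memory is stored per namespace."
for start, end, text in find_terms(message, index):
    print(start, end, text)
```

The index is built once for the whole terminology, and each message is then scanned once; the resulting offsets could be stored with the document so that the matching does not have to be repeated.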

PoC

  • gerrit:132384
    • Introduce a Terminology: namespace in which the content of each page is a term to be searched for
    • If a term is found in the Terminology: namespace, the translation used is the content of the page Terminology:$term/$language
    • Assumptions: the terminology database already exists in the Terminology: namespace; there is no interface to add it
    • Problems: interference between languages (terms are searched across the whole namespace), inflections are not handled, and the approach probably does not scale as the number of terms grows
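
The lookup rule of the PoC can be illustrated with a small sketch. This is not the code of the Gerrit change: a plain dictionary stands in for the wiki's page store, and `translation_title`, `lookup_translation` and the sample pages are hypothetical.

```python
from typing import Dict, Optional

NAMESPACE = "Terminology"


def translation_title(term: str, language: str) -> str:
    """Build the page title Terminology:$term/$language."""
    return f"{NAMESPACE}:{term}/{language}"


def lookup_translation(pages: Dict[str, str], term: str, language: str) -> Optional[str]:
    """Return the content of the translation page, or None if the page does not exist."""
    return pages.get(translation_title(term, language))


# Hypothetical page store: page title -> page content.
pages = {
    "Terminology:namespace": "namespace",
    "Terminology:namespace/fr": "espace de noms",
}

print(lookup_translation(pages, "namespace", "fr"))  # espace de noms
print(lookup_translation(pages, "namespace", "de"))  # None
```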