User:Jeblad/NLP

Tag functions for dictionary
A base dictionary will be defined, but it might be incomplete. This base dictionary will be defined as one or several pages, and it can be protected against further changes. To make it possible to extend the dictionary for each article, a few tag functions are necessary.

Parser functions for morphing
 * This form inherits the language of the word from the site-wide language, or from the language specified within the lexical context.
 * This form overrides the language of the word with the locally specified one, while the destination language is still the site-wide language or the one specified within the lexical context.

The initial word will always be analyzed.

The word can be a phrase, in which case the positional parameters map to each word in turn. Alternatively, the positional parameters might be replaced with patterns. A pattern will use a best match (?).

Directives
The directives are pairs of operators and part of speech tags. The classes are noun (N), verb (vbmod, vbser, vbhaver), adjective (Adj) and so forth. Note that the tags differ somewhat between tools. Part of speech tags can also be clustered with parentheses, and this happens implicitly if an example word is used: when the word is analyzed, the resulting tags are clustered. All words that are not recognized as part of speech tags will be analyzed to produce tags, as will all words enclosed within string delimiters.

The operators are:
 * + tag : Add the following part of speech tags unconditionally to the set during synthesis.
 * - tag : Remove the following part of speech tags unconditionally from the set during synthesis.
 * ~ tag : The following part of speech tags will be preferred during synthesis.

The last observed operator takes precedence, except for the fuzzy operator (~), which is sticky: it remains set on a tag even if the tag is added once more, but the tag can still be removed.
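The precedence rules for directives can be sketched in a few lines of Python. This is only one possible reading of the rules above, not an implementation of the parser function: the name `apply_directives`, the representation of directives as `(operator, tag)` pairs, and the tags `Sg`, `Loc` and `Prop` are assumptions made for illustration. It is also assumed that removing a tag clears its fuzzy mark.

```python
def apply_directives(analyzed_tags, directives):
    """Apply (operator, tag) directives, in order, to the tags from analysis.

    Returns a dict mapping each remaining tag to a flag that is True when
    the tag is only preferred (fuzzy) and False when it is required.
    Hypothetical helper: all names here are illustrative assumptions.
    """
    tags = {tag: False for tag in analyzed_tags}
    for op, tag in directives:
        if op == "+":
            # Add unconditionally; an existing fuzzy mark is sticky and stays.
            tags.setdefault(tag, False)
        elif op == "-":
            # Remove unconditionally; assumed to clear the fuzzy mark as well.
            tags.pop(tag, None)
        elif op == "~":
            # Mark the tag as preferred during synthesis.
            tags[tag] = True
        else:
            raise ValueError(f"unknown operator: {op!r}")
    return tags


# Tags as they might come out of analyzing an example word (illustrative).
analysis = ["N", "Sg", "Loc"]
result = apply_directives(
    analysis, [("~", "Loc"), ("+", "Loc"), ("-", "Sg"), ("+", "Prop")]
)
# "Loc" stays fuzzy although it was added again, "Sg" is gone, "Prop" is required.
```

Keeping the fuzzy mark as a flag on the tag, instead of in a separate set, makes the sticky behaviour a natural consequence of the add operation not touching existing entries.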

Example
If the following is evaluated inside Northern Sami Wikipedia, then the following results should be produced.

If we write a call like → → Aurdalas, we simply say "translate the Norwegian Aurdal like the Northern Sami form Alvdalas". The word Alvdalas is then analyzed to produce a more complete form, and this form is used to produce Aurdalas. Sometimes the results of the analysis are insufficient, and we have to refine them by adding or removing switches. This can be done like this → → Aurdala, whereby Aurdal becomes Aurdala and not Aurdalas. In addition, there are times when we do not know whether we have a complete match. In those circumstances we want to get as close as possible to a given form, which could be given by an example word. We can write this as → → Aurdalas.

Patterns
The patterns are also pairs of operators and part of speech tags.

The operators are:
 * + tag : The following part of speech tags must match the word.
 * - tag : The following part of speech tags must not match the word.
 * ~ tag : The following part of speech tags will be preferred if it is possible to match the word with them.
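As a rough illustration of how the three pattern operators could interact, the sketch below scores a word's tag set against a pattern and picks the best-scoring candidate. The function names `match_score` and `best_match`, the `(operator, tag)` pair representation, and the tags `Sg`, `Pl`, `Loc` and `Gen` are assumptions made for this example. Counting the preferred tags that match is only one way to interpret "best match".

```python
def match_score(word_tags, pattern):
    """Score a word's tags against a pattern of (operator, tag) pairs.

    Returns None if a '+' tag is missing or a '-' tag is present,
    otherwise the number of '~' tags that the word carries.
    Hypothetical helper: all names here are illustrative assumptions.
    """
    word_tags = set(word_tags)
    score = 0
    for op, tag in pattern:
        present = tag in word_tags
        if op == "+":
            if not present:
                return None  # a required tag is missing
        elif op == "-":
            if present:
                return None  # a forbidden tag is present
        elif op == "~":
            if present:
                score += 1  # a preferred tag matched
        else:
            raise ValueError(f"unknown operator: {op!r}")
    return score


def best_match(analyses, pattern):
    """Return the first analysis with the highest score, or None if none match."""
    best, best_score = None, -1
    for tags in analyses:
        score = match_score(tags, pattern)
        if score is not None and score > best_score:
            best, best_score = tags, score
    return best


# Must be a noun, must not be plural, preferably locative (illustrative tags).
pattern = [("+", "N"), ("-", "Pl"), ("~", "Loc")]
candidates = [["N", "Sg", "Gen"], ["N", "Sg", "Loc"], ["N", "Pl", "Loc"]]
chosen = best_match(candidates, pattern)
```

Here the plural candidate is rejected outright, and of the two remaining candidates the locative one is chosen because it matches the preferred tag.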