Writing systems/Syntax

From mediawiki.org
Translated version of zh:Help:高级字词转换语法

This page describes the special markup used by LanguageConverter, a system that converts between language variants by means of character and word replacement. In all examples below, lowercase characters represent Simplified Chinese, and UPPERCASE characters represent Traditional Chinese.

Concept

LanguageConverter markup takes one of the following forms:

  • -{ text }-
  • -{ flag | variant1 : text1 ; variant2 : text2 ; }-
  • -{ flag1 ; flag2 | from => variant : to ; }-

Flags are described below. Variant names are language codes (like zh-cn or zh-tw).
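
To make the three forms above concrete, here is a minimal Python sketch that splits one markup block into its flags and rules. The function name parse_rule and the layout of the returned dictionary are hypothetical, and the splitting is deliberately naive (plain text containing ":" or ";" would be misread). It illustrates the grammar only and is not how LanguageConverter itself parses markup.

```python
import re

# Hypothetical helper for illustration; the real parser handles many more cases.
MARKUP = re.compile(r"-\{(.*?)\}-", re.DOTALL)


def parse_rule(markup):
    """Split one -{ ... }- block into flags, two-way rules, and one-way rules."""
    match = MARKUP.fullmatch(markup.strip())
    if match is None:
        raise ValueError("not a -{ ... }- block")
    body = match.group(1)

    # Everything before the first "|" is a ";"-separated list of flags.
    flags = []
    if "|" in body:
        flag_part, body = body.split("|", 1)
        flags = [flag.strip() for flag in flag_part.split(";") if flag.strip()]

    rules = {}    # variant -> text (bidirectional form)
    one_way = []  # (source text, variant, replacement) for the "=>" form
    for piece in body.split(";"):
        piece = piece.strip()
        if not piece:
            continue
        if "=>" in piece and ":" in piece.split("=>", 1)[1]:
            source, target = piece.split("=>", 1)
            variant, text = target.split(":", 1)
            one_way.append((source.strip(), variant.strip(), text.strip()))
        elif ":" in piece:
            variant, text = piece.split(":", 1)
            rules[variant.strip()] = text.strip()
        else:
            # No "variant:" prefix, so treat the whole body as plain text.
            return {"flags": flags, "text": body.strip(), "rules": {}, "one_way": []}

    return {"flags": flags, "text": None, "rules": rules, "one_way": one_way}
```

For instance, parse_rule("-{H|HUGEBLOCK=>zh-cn:macro;}-") yields the flag list ["H"] and a single one-way rule from HUGEBLOCK to macro for zh-cn.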

Fallback among variants is available. In the following examples, designed for Chinese, zh-cn and zh-sg are variants written in the zh-hans script, while zh-hk, zh-mo, and zh-tw are variants written in the zh-hant script. For example, according to the $variantfallbacks definition in LanguageZh.php, if no rule is found for zh-hk, the converter tries the definitions for zh-hant, zh-mo, and zh-tw, in that order.

You may notice that zh-mo and zh-sg are absent from most examples on this page. This is due to their high similarity to other variants, zh-hk and zh-cn respectively.
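
The fallback lookup described above can be sketched in Python as follows. The table name VARIANT_FALLBACKS and the function resolve are hypothetical; only the zh-hk entry is taken from this page, and a real wiki defines the full table in its own language converter code.

```python
# Hypothetical fallback table. Only the zh-hk entry comes from the text above.
VARIANT_FALLBACKS = {
    "zh-hk": ["zh-hant", "zh-mo", "zh-tw"],
}


def resolve(rules, variant, fallbacks=VARIANT_FALLBACKS):
    """Return the rule text for `variant`, trying its fallbacks in order."""
    for candidate in [variant, *fallbacks.get(variant, [])]:
        if candidate in rules:
            return rules[candidate]
    return None  # no applicable rule; the caller would leave the text as-is


# A rule that defines only zh-cn and zh-tw still produces output for zh-hk.
print(resolve({"zh-cn": "blog", "zh-tw": "WEBLOG"}, "zh-hk"))  # WEBLOG
```

The sketch returns None when neither the variant nor any of its fallbacks has a rule, which stands in for leaving the text to ordinary conversion.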

Syntax

Basic syntax

Name: Bidirectional

Description: Bidirectional conversion with flag support. This is the most common syntax in manual conversion.

Wikitext:

  -{zh-hans:computer; zh-hant:ELECTRONICBRAIN;}-

Output:

  • zh, zh-hans, zh-cn, zh-sg
    computer
  • zh-hant, zh-tw, zh-hk, zh-mo
    ELECTRONICBRAIN

Name: Unidirectional

Description: Unidirectional conversion with flag support. Primarily used for adding new conversion rules. Runs faster than bidirectional rules. Mainly useful for unifying language variance and preventing erratic conversion.

Wikitext:

  -{H|HUGEBLOCK=>zh-cn:macro;}-
  Test: HUGEBLOCK, macro

Output:

  • zh
    Test: HUGEBLOCK, macro
  • zh-hans, zh-sg
    test: hugeblock, macro
  • zh-hant, zh-hk, zh-tw
    TEST: HUGEBLOCK, MACRO
  • zh-cn
    test: macro, macro

Note: Fallback is not supported in unidirectional conversion. In the example above, zh-hans and zh-sg won't use zh-cn's rules. Conversely, if the rule were inserted into zh-hans, it would apply only to zh-hans, not to zh-cn.

Name: Disabled

Description: Disable language conversion.

Wikitext:

  -{SimpTrad}-
  <!--Alternatively, with the "R" flag: -{R|SimpTrad}--->

Output:

  • (all variants)
    SimpTrad

Note: This disables conversion completely.

Name: (Semi-)disabled

Description: Disable "word" conversion by breaking words apart; character transcription is not affected. Useful for stopping erratic conversion by splitting words correctly.

Wikitext (assuming there's a system-wide conversion rule between HANGUK and chosun):

  HAN-{}-GUK, cho-{}-sun

Output:

  • zh
    HANGUK, chosun
  • zh-hans, zh-cn, zh-sg
    hanguk, chosun
  • zh-hant, zh-hk, zh-tw
    HANGUK, CHOSUN

Note: There's another way to achieve this, which is more convenient in templates. See "Combined conversion flag" below. For example: -{zh;zh-hans;zh-hant|HANGUK}- -{zh;zh-hans;zh-hant|chosun}-
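
The effect of the empty -{}- marker in the (semi-)disabled case can be illustrated with a toy Python model. It follows this page's convention that lowercase stands for Simplified and UPPERCASE for Traditional. WORD_RULES, TRANSCRIBE, and both functions are hypothetical names, and the model assumes the system-wide HANGUK/chosun rule mentioned in the example. It is a sketch of the idea, not the converter's actual algorithm.

```python
# Hypothetical word-level rules, assumed to exist system-wide for this example.
WORD_RULES = {
    "zh-hans": {"HANGUK": "chosun"},
    "zh-hant": {"chosun": "HANGUK"},
}
# Stand-ins for character-level transcription between the two scripts.
TRANSCRIBE = {"zh-hans": str.lower, "zh-hant": str.upper}


def convert_piece(piece, variant):
    """Apply word rules first, then character-level transcription."""
    for source, target in WORD_RULES[variant].items():
        piece = piece.replace(source, target)
    return TRANSCRIBE[variant](piece)


def convert(wikitext, variant):
    """Convert each piece between empty -{}- markers independently."""
    pieces = wikitext.split("-{}-")
    return "".join(convert_piece(piece, variant) for piece in pieces)


print(convert("HANGUK, chosun", "zh-hans"))          # chosun, chosun
print(convert("HAN-{}-GUK, cho-{}-sun", "zh-hans"))  # hanguk, chosun
print(convert("HAN-{}-GUK, cho-{}-sun", "zh-hant"))  # HANGUK, CHOSUN
```

Because the marker splits HANGUK into HAN and GUK, neither piece matches the word rule, so only the character-level step applies.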

Conversion flags

Common flags

Flag: H

Description: Insert a conversion rule without output.

Wikitext:

  -{H|zh-cn:blog; zh-hk:WEBJOURNAL; zh-tw:WEBLOG;}-
  Test: blog, WEBJOURNAL, WEBLOG

Output:

  • zh
    Test: blog, WEBJOURNAL, WEBLOG
  • zh-hans
    test: blog, webjournal, weblog
  • zh-hant
    TEST: BLOG, WEBJOURNAL, WEBLOG
  • zh-cn, zh-sg
    test: blog, blog, blog
  • zh-hk
    TEST: WEBJOURNAL, WEBJOURNAL, WEBJOURNAL
  • zh-tw
    TEST: WEBLOG, WEBLOG, WEBLOG

Note: zh-hans and zh-hant are simply scripts and won't apply bidirectional rules. See also the $manualLevel parameter in LanguageConverter.

Flag: A

Description: Insert a conversion rule with a result in the current language.

Wikitext:

  -{A|zh-cn:blog; zh-hk:WEBJOURNAL; zh-tw:WEBLOG;}-
  -{zh-hans:→; zh-hant:⇒}-
  blog, WEBJOURNAL, WEBLOG

Output:

  • zh
    blog → blog, WEBJOURNAL, WEBLOG
  • zh-hans
    blog → blog, webjournal, weblog
  • zh-hant
    WEBLOG ⇒ BLOG, WEBJOURNAL, WEBLOG
  • zh-cn, zh-sg
    blog → blog, blog, blog
  • zh-hk
    WEBJOURNAL ⇒ WEBJOURNAL, WEBJOURNAL, WEBJOURNAL
  • zh-tw
    WEBLOG ⇒ WEBLOG, WEBLOG, WEBLOG

Note: Compare the first (A) output with the one from H and flag-free operation.

Flag: -

Description: Remove existing conversion rule.

Wikitext:

  -{H|zh-cn:blog; zh-hk:WEBJOURNAL; zh-tw:WEBLOG;}- <!-- Add rule -->
  + blog, WEBJOURNAL, WEBLOG

  -{-|zh-cn:blog; zh-hk:WEBJOURNAL; zh-tw:WEBLOG;}- <!-- Remove rule -->
  - blog, WEBJOURNAL, WEBLOG

Output:

  • zh
    + blog, WEBJOURNAL, WEBLOG
    - blog, WEBJOURNAL, WEBLOG
  • zh-hans
    + blog, webjournal, weblog
    - blog, webjournal, weblog
  • zh-hant
    + BLOG, WEBJOURNAL, WEBLOG
    - BLOG, WEBJOURNAL, WEBLOG
  • zh-cn, zh-sg
    + blog, blog, blog
    - blog, webjournal, weblog
  • zh-hk
    + WEBJOURNAL, WEBJOURNAL, WEBJOURNAL
    - BLOG, WEBJOURNAL, WEBLOG
  • zh-tw
    + WEBLOG, WEBLOG, WEBLOG
    - BLOG, WEBJOURNAL, WEBLOG

Note: Compare test (+), test (-), and basic hans/hant.

Flag: T

Description: Override page title.

Wikitext (assuming the original title is "TomHanks"):

  -{T|zh-cn:tom hanks; zh-hk:SOUP HANS; zh-tw:TOM HANKS;}-

Output (shown as page title):

  • zh (gerrit:19746)
    TomHanks
  • zh-hans, zh-cn, zh-sg
    tom hanks
  • zh-hk
    SOUP HANS
  • zh-tw
    TOM HANKS

Flag: D

Description: Describe conversion rule.

Wikitext:

  -{D|zh-cn:tom hanks; zh-hk:SOUP HANS; zh-tw:TOM HANKS}-

Output:

  • (all variants)
    China: tom hanks; Hong Kong: SOUP HANS; Taiwan: TOM HANKS

Note: Good for generating overviews in public conversion rule-groups.
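
The interplay between the H and - flags shown above can be modelled with a small Python sketch. The RuleTable class, its method names, and the TRANSCRIBE table are all hypothetical, and lowercase/UPPERCASE again stand in for Simplified/Traditional. It reproduces the zh-cn, zh-hk, and zh-tw outputs from the examples under those assumptions and does not reflect LanguageConverter's real data model.

```python
import re

# Stand-ins for character-level transcription, following this page's convention.
TRANSCRIBE = {"zh-cn": str.lower, "zh-hk": str.upper, "zh-tw": str.upper}


class RuleTable:
    """Hypothetical per-page rule table used only for illustration."""

    def __init__(self):
        self.words = {}  # variant -> {source word: replacement word}

    def add(self, rule):
        """Model -{H|...}-: every listed word maps to the variant's own word."""
        for variant, target in rule.items():
            table = self.words.setdefault(variant, {})
            for source in rule.values():
                if source != target:
                    table[source] = target

    def remove(self, rule):
        """Model -{-|...}-: drop the mappings that the same rule would add."""
        for variant in rule:
            table = self.words.get(variant, {})
            for source in rule.values():
                table.pop(source, None)

    def convert(self, text, variant):
        """Apply word rules in one pass (longest word first), then transcribe."""
        table = self.words.get(variant, {})
        if table:
            words = sorted(table, key=len, reverse=True)
            pattern = re.compile("|".join(re.escape(word) for word in words))
            text = pattern.sub(lambda match: table[match.group(0)], text)
        return TRANSCRIBE[variant](text)


rule = {"zh-cn": "blog", "zh-hk": "WEBJOURNAL", "zh-tw": "WEBLOG"}
page = RuleTable()
page.add(rule)
with_rule = page.convert("blog, WEBJOURNAL, WEBLOG", "zh-hk")
page.remove(rule)
without_rule = page.convert("blog, WEBJOURNAL, WEBLOG", "zh-hk")
```

Here with_rule is "WEBJOURNAL, WEBJOURNAL, WEBJOURNAL" and without_rule is "BLOG, WEBJOURNAL, WEBLOG", matching the + and - lines for zh-hk in the example.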

Combined conversion flag

Flag: variant name

Description: Only consider certain variants for conversion.

Wikitext (assuming this runs on a Chinese wiki):

  -{H|zh-cn:blog; zh-hk:WEBJOURNAL; zh-tw:WEBLOG;}-
  Test 1: -{zh;zh-hans;zh-hant|blog, WEBJOURNAL, WEBLOG}-

  Test 2: -{zh;zh-cn;zh-hk|blog, WEBJOURNAL, WEBLOG}-

Output:

  • zh
    Test 1: blog, WEBJOURNAL, WEBLOG
    Test 2: blog, WEBJOURNAL, WEBLOG
  • zh-hans
    test 1: blog, webjournal, weblog
    test 2: blog, blog, blog
  • zh-hant, zh-tw
    TEST 1: BLOG, WEBJOURNAL, WEBLOG
    TEST 2: WEBLOG, WEBLOG, WEBLOG
  • zh-cn, zh-sg
    test 1: blog, webjournal, weblog
    test 2: blog, blog, blog
  • zh-hk
    TEST 1: BLOG, WEBJOURNAL, WEBLOG
    TEST 2: WEBJOURNAL, WEBJOURNAL, WEBJOURNAL

Note: Compare test 1, test 2, and the usage case in the H flag above.

Exceptions

Language converter avoids converting anything found in "code" blocks like <pre>...</pre> and <code>...</code>, as well as the <script>...</script> tag used for carrying executable JavaScript. Putting an empty conversion rule block -{}- inside these tags will function as a "force convert" switch for the converter. This hack can be useful for code samples nested in these tags.

A caveat, however, is that this switch doesn't seem to work for the extension-provided <syntaxhighlight> tag, which eventually generates a <pre>...</pre> block with nested elements (T34943). The switch also won't work with scripts that are not part of the page's original HTML source, which is what LC is designed to operate on.

MediaWiki messages are not processed by LC. This inconvenience is tracked in T170916.
