Manual:$wgCategoryCollation
Category: $wgCategoryCollation | |
---|---|
Which collation to use for sorting categories | |
Introduced in version: | 1.17.0 (r72308) |
Removed in version: | still in use |
Allowed values: | (string) |
Default value: | 'uppercase' |
Other settings: Alphabetical | By function |
Details
For example, to use the Spanish collation, set $wgCategoryCollation = 'uca-es'; in LocalSettings.php, then run updateCollation.php for the change to take effect.
The following collations are currently supported:
Collation algorithm | MW version | Description |
---|---|---|
uppercase | default | Convert everything to uppercase, then sort by the binary value of the string as stored in UTF-8. Essentially a case-insensitive sort by code point. |
numeric | MW 1.28+ | Same as uppercase, but with numeric sorting. |
identity | MW 1.18+ | Sort by the binary value of the string as stored in UTF-8 (without converting to uppercase). Essentially a sort by code point. |
uca-default | MW 1.17+ | Unicode collation algorithm: a more complex, much more multilingual-friendly category collation. |
uca-default-u-kn | MW 1.28+ | uca-default with numeric sorting. |
uca-<langcode> | MW 1.21+ | uca-default with language-specific adjustments. See below. |
uca-<langcode>-u-kn | MW 1.28+ | uca-<langcode> with numeric sorting. |
xx-uca-ckb | MW 1.23+ | Central Kurdish |
xx-uca-et | MW 1.24-1.31 (removed in 1.32) | Estonian, but with W and V considered separate letters. |
xx-uca-fa | MW 1.30-1.31 (removed in 1.32) | Persian |
uppercase-ab | MW 1.31+ | Abkhaz |
uppercase-ba | MW 1.30+ | Bashkir |
uppercase-se | MW 1.31 (removed in 1.32) | Northern Sami |
Since MediaWiki 1.18, extensions can add extra collations via the Collation::factory hook.
The value of this setting is also stored in the categorylinks table, where it is used to determine which rows need updating when the collation algorithm changes.
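As a rough illustration of the hook mentioned above, the following LocalSettings.php sketch registers a custom collation. The collation name xx-ignore-article and the class IgnoreArticleCollation are hypothetical, invented for this example. The handler signature (collation name, object passed by reference) and the two overridden methods are assumptions based on the Collation::factory hook and the Collation base class, so check them against your MediaWiki version before relying on this.

```php
<?php
// LocalSettings.php (sketch only, not a tested extension).
// Hypothetical collation: sorts titles while ignoring a leading "The ".
class IgnoreArticleCollation extends Collation {
	// Drop a leading "The " (case-insensitive) before building keys.
	private function strip( $string ) {
		return preg_replace( '/^the\s+/iu', '', $string );
	}

	// Sort key: the stripped title, uppercased for case-insensitive order.
	public function getSortKey( $string ) {
		return mb_strtoupper( $this->strip( $string ) );
	}

	// Heading under which the page is listed on the category page.
	public function getFirstLetter( $string ) {
		return mb_strtoupper( mb_substr( $this->strip( $string ), 0, 1 ) );
	}
}

// Assumed handler signature: the hook runs for collation names that
// MediaWiki does not recognise itself.
$wgHooks['Collation::factory'][] = function ( $collationName, &$collationObject ) {
	if ( $collationName === 'xx-ignore-article' ) {
		$collationObject = new IgnoreArticleCollation();
	}
	return true;
};

$wgCategoryCollation = 'xx-ignore-article';
```

As with any change to this setting, updateCollation.php would need to be run afterwards so that existing sort keys are rebuilt.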
Installation instructions
- After changing this option, you must run updateCollation.php to recompute sort keys for all pages, or your categories will be sorted inconsistently.
- Updating collations is slow and may take several hours on large wikis.
- The uca-default and uca-<langcode> collations require the PHP intl extension.
- If you are using Varnish, Squid or file cache, you may have to purge category pages after running updateCollation.php to see the results.
- If you update or recompile your version of PHP, you must run updateCollation.php --force.
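The notes above can be combined into a small LocalSettings.php sketch. The fallback to uppercase when intl is missing is an illustrative choice rather than something MediaWiki does for you, and the script path assumes the standard maintenance/ directory of a MediaWiki installation.

```php
<?php
// LocalSettings.php (sketch only).
// Use a UCA collation only when the PHP intl extension is loaded;
// otherwise stay on the default collation. This fallback is an assumption
// made for illustration.
if ( extension_loaded( 'intl' ) ) {
	$wgCategoryCollation = 'uca-es';
} else {
	$wgCategoryCollation = 'uppercase';
}

// After changing the value, recompute the sort keys from the MediaWiki
// installation directory (assumed standard layout):
//   php maintenance/updateCollation.php
// After updating or recompiling PHP:
//   php maintenance/updateCollation.php --force
```

Note that if the effective collation ever switches between the two branches, the stored sort keys no longer match the setting, so updateCollation.php has to be run again.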
Language-specific collations
MediaWiki also supports many collations designed for specific languages.
These are based on the Unicode collation algorithm (UCA) collation uca-default and have the same requirements; they are named uca-<langcode>, where <langcode> is one of:
af, am, ar, as, ast, az, be, be-tarask, bg, bn, bn@collation=traditional, bo, br, bs, bs-Cyrl, ca, chr, co, cs, cy, da, de, de-AT@collation=phonebook, dsb, ee, el, en, eo, es, et, eu, fa, fi, fil, fo, fr, fr-CA, fur, fy, ga, gd, gl, gu, ha, haw, he, hi, hr, hsb, hu, hy, id, ig, is, it, ka, kk, kl, km, kn, kok, ku, ky, la, lb, lkt, ln, lo, lt, lv, mk, ml, mn, mo, mr, ms, mt, nb, ne, nl, nn, no, oc, om, or, pa, pl, pt, rm, ro, ru, rup, sco, se, si, sk, sl, smn, sq, sr, sr-Latn, sv, sv@collation=standard, sw, ta, te, th, tk, tl, to, tr, tt, uk, uz, vi, vo, yi, yo, zu
For example, to use a collation for Spanish, one would use the uca-es collation.
Using these collations provides both the correct sort order for the given language and proper first-letter headings for article titles. Earlier versions of MediaWiki might not support all of these language codes.
Getting new collations added
Supporting a new language takes two steps:
- The language must be supported by the International Components for Unicode (ICU) library (the list of language codes it supports is available at [1]).
Note, however, that Wikimedia's production servers do not use the latest version of the ICU library. As of 2016, they use version 52.1, which supports a significantly smaller set of languages.
- The language must additionally be supported by MediaWiki itself. This basically requires listing the additional characters, or character groups, that are considered separate letters in the given language, in addition to the basic alphabet. The always up-to-date list of currently supported languages is in includes/collation/IcuCollation.php.
It may also be that the default ICU ordering (the uca-default collation) orders titles correctly but does not separate the letters correctly; in that case it can be used as a first step.
Sometimes the letter ordering of a related language fits yours; in such a case a custom collation can sometimes be provided (there is already one for Sorani Kurdish / Central Kurdish ('ckb'), called xx-uca-ckb; see includes/collation/Collation.php).
Numeric sorting
With numeric sorting, pages are ordered as follows: 1, 2, 9, 10, 11, 20, 21, 99, 100. With regular (non-numeric) sorting, pages are ordered as plain text: 1, 10, 100, 11, 2, 20, 21, 9, 99.
With numeric sorting, all pages starting with a digit are grouped together under the heading "0–9". With regular sorting, pages starting with a digit are split into subheadings by leading digit: "0", "1", "2", and so on.
Note that numeric sorting only applies within a contiguous sequence of digits. Digits separated by commas, periods, or spaces are interpreted as separate numbers. For more information on numeric sorting, see Unicode Technical Standard #10. To test numeric sorting, see the ICU Collation Demo.
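The two orderings described above can be reproduced with a short standalone PHP script. This is only an illustration: it uses PHP's built-in sort flags as a stand-in, whereas the numeric and -u-kn collations do their own sorting inside MediaWiki.

```php
<?php
// Illustration only: SORT_STRING and SORT_NATURAL stand in for the
// non-numeric and numeric collations; MediaWiki does not sort categories
// with these functions.
$titles = [ '100', '21', '9', '1', '20', '10', '99', '2', '11' ];

$plain = $titles;
sort( $plain, SORT_STRING );    // compare as plain text, character by character

$numeric = $titles;
sort( $numeric, SORT_NATURAL ); // compare runs of digits as whole numbers

echo implode( ', ', $plain ), "\n";   // 1, 10, 100, 11, 2, 20, 21, 9, 99
echo implode( ', ', $numeric ), "\n"; // 1, 2, 9, 10, 11, 20, 21, 99, 100
```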
See also
- MediaWiki 1.17/Category sorting - release notes about this feature from MediaWiki 1.17.
Notes
- Collation refers to how data is sorted according to its set of characters, applying defined sort criteria (e.g. alphabetic or reverse order, case-sensitive or not).