User:TJones (WMF)/Notes/Crimean Tatar Transliteration/Wiki tokens

Below is a very large table of words found in a sample of 500 Crimean Tatar Wikipedia pages. The sample contained 13,920 tokens (individual word occurrences) and 5,685 types (distinct words). Most Crimean Tatar words in Wikipedia are written in the Latin alphabet. There are also names and other words written in Greek, Cyrillic, Georgian, Armenian, Arabic, Devanagari, and Chinese.

The table is sorted alphabetically by default; sort by the Freq column to see the most common words and their transliterations.
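
To make the token/type distinction and the Freq ordering concrete, here is a minimal sketch of how such counts could be computed. It is not the script used to build the table: the `\w+` tokenizer, the function name `count_tokens_and_types`, and the sample sentence are all assumptions made for illustration, and case is left as-is because the table's case handling is not described here.

```python
import re
from collections import Counter


def count_tokens_and_types(text):
    """Return (token count, type count, per-word frequencies) for text.

    Assumption: a token is any run of Unicode word characters. The real
    tokenizer behind the table may split words differently.
    """
    tokens = re.findall(r"\w+", text)
    freq = Counter(tokens)
    return len(tokens), len(freq), freq


# Hypothetical sample text, not taken from the 500-page sample.
sample = "Qırım ve Qırım tatarları ve til"
n_tokens, n_types, freq = count_tokens_and_types(sample)

# Most frequent first, ties broken alphabetically (like sorting by Freq).
by_freq = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))

print(n_tokens, n_types)
print(by_freq)
```

In this toy input, every occurrence of a word counts as a token, while repeated words count only once as a type, which is why the type count in the table is so much smaller than the token count.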

If either the Latin or Cyrillic Transliteration is blank for a word, that word is not changed by that kind of transliteration. See What is Subject to Transliteration on the parent page for more on why some words are not transliterated.