Extension talk:Narayam/archive

top
I didn't find typing tools (script) for Nepali. Could you manage it from this? --Bhawani Gautam 01:50, 12 June 2011 (UTC)
 * Done --Junaidpv 19:00, 22 July 2011 (UTC)

Changes for Nepali Extension in Inscript
Hello everyone, thanks for the nice extension. A few changes are needed for Nepali, though. The "V" in Nepali should be "भ" instead of "व". People write investment as इन्भेष्टमेन्ट rather than इन्वेष्टमेन्ट, so they expect the "भ" sound for "V". The keyboard implemented there is more like a standard keyboard that is used in Nepali. It would be really nice if you could incorporate the keyboard layout that is listed here.

It would be nice if we could have a "traditional" keyboard and a "romanized" keyboard. The romanized keyboard is more or less already available, except that the V needs to be changed. The website lists both the "romanized" and the "traditional" keyboards. Most people used the traditional keyboard in the past. The younger generation likes the "romanized" one because it's easy to write in English and they don't have to learn the Nepali keyboard, which they call the traditional keyboard layout. Inscript is used in India, whereas "Traditional" used to be used in Nepal.

Thanks.
 * Rajesh Pandey
 * --RajeshPandey 17:51, 20 August 2011 (UTC)


 * So, you say 'v' should be mapped to 'भ', but the website you pointed out also maps it to 'व'! What should the name be when I add your 'traditional' keyboard to Narayam, like we say 'InScript'?


 * Please also give feedback if any. Thanks --Junaidpv 20:19, 12 September 2011 (UTC)
 * Hi, the keyboard layouts should be called: Nepali Traditional Keyboard Layout and Nepali Romanized Keyboard Layout.
 * 'v' should be 'भ' but I don't want to argue about it. That's okay if we could have the Nepali Traditional Keyboard Layout. I have enabled 'भ' for 'v' in Nepali Wikipedia. It's a typo, and officially people write इनभेष्टमेन्ट for Investment. For example, this is the website of an investment bank, and भिषा is Visa. Example:  where they talk about Visa and Mastercards. So v is pronounced as 'भ' in Nepali. Virus is भाइरस and Violet is भायोलेट। व is W in Nepali, and they make a clear distinction between 'v' and 'w' while pronouncing. 'v' is pronounced by biting or touching the lips and forcing the air out; 'w' is not. W is pronounced by releasing air while the mouth is round. Thanks --RajeshPandey 20:37, 12 September 2011 (UTC)


 * OK, I have updated the romanized version according to Nepali Wikipedia: r96943. The update will be effective within one day on translatewiki.net; you can test it there.
 * Can you give a link that will be useful for me to create the traditional layout? I think this site can also be used. --Junaidpv 06:51, 13 September 2011 (UTC)
 * Traditional Screenshot
 * Traditional
 * Software Download link for Traditional Keyboard
 * Nepali Sabdakosh You can also click on "किवोर्ड देखाउ" button for both traditional and Romanized. There are buttons on ट्रेडिसनल and रोमनाइज्ड on that page. You will have to play around the keyboard for a while. Or I can send you some screenshots if you want.
 * A lot of images will come in google image search Click to Search for Nepali traditional keyboard layout--RajeshPandey 10:03, 14 September 2011 (UTC)

Enable by Default for everyone
For my site, I wished to enable Narayam for Hindi for everyone, including those whose language is not set to Hindi. Unfortunately, this is not an option so far, but I changed the following to get the effect:

In Narayam.hooks.php replace >> $userlangCode = $wgLang->getCode(); << with >> $userlangCode = 'hi'; <<

Hope this becomes an option in the next release.


 * A configuration option, $wgNarayamAlwaysLoadForLanguages, has been added for this purpose.
 * You can put this line to LocalSettings.php to always load Hindi typing schemes. --Junaidpv 17:24, 8 October 2011 (UTC)

-- did not realize there is no transliteration for hindi yet? or is there one i can use? Thanks!!!

Hindi Transliteration
Hi,

Is there a Hindi transliteration in the works/available? Thanks!!

Amharic language available?
Hello there, can someone develop or show us how to make up the scheme to develop transliteration for Amharic language?
 * Do you get any clue by reading Extension:Narayam? Or should we develop one for you? If you have any other tools that can be used for this purpose please let us know. --Junaidpv 06:47, 8 October 2011 (UTC)
 * Fast response. I saw the description but I didn't understand it. If it's not too much to ask, it'd be great to see how it works, and I will troubleshoot for you on any improvement that can be done. Here is a website that has Amharic input settings: Amharic Input, Amharic Input 2. I'm not sure how else I can help.
 * Thanks for the link. I will take the first link as the model for implementation. Once I have developed it, I will inform you. You can test it then. You can email me if you want. I am also available on the IRC channel #mediawiki (irc.freenode.net), and my IRC nickname is Junaidpv. --Junaidpv 07:04, 8 October 2011 (UTC)

Gadgets Support
Will this extension add support for various gadgets used in the WMF wikis which require text input, like HotCat for instance? Or Twinkle? Or will input support for these gadgets require separate scripts? Thanks.--Siddhartha Ghai 14:00, 6 November 2011 (UTC)


 * Yes, it supports gadgets like HotCat. You can verify it at http://translatewiki.net.--Junaidpv 14:12, 6 November 2011 (UTC)


 * What about Twinkle, which uses a jQuery dialog with (I think) HTML forms in it? And basically any gadgets that don't use the MW interface (are there others?). Thanks again.--Siddhartha Ghai 22:53, 6 November 2011 (UTC)


 * It works in XFD at least. Some parts of Twinkle use the browser's native input box rather than one created with JavaScript/HTML; Narayam can do nothing there, so it will be inactive. Narayam is designed to work with all input boxes within the HTML of the page. --Junaidpv 03:48, 7 November 2011 (UTC)


 * What about WikiLove, which also seems to use windows similar to Twinkle (maybe they're jQuery based too)?--Siddhartha Ghai 18:27, 15 December 2011 (UTC)


 * Ok, just checked Twinkle in sa-wp. Narayam works with it (even in the jQuery dialogs), so I'm hoping it will work with WikiLove too :) Siddhartha Ghai 10:22, 16 December 2011 (UTC)


 * Yes it works with wikilove and it should work with all jquery dialog boxes too. --117.206.13.149 10:29, 16 December 2011 (UTC)

Thoughts about Narayam architecture
Hi,

After the successful trip to India and learning a lot about the Indic scripts, i made more fixes to the Gujarati phonetic Narayam mapping. (Tests and reviews are welcome.)

While doing it i realized that more structure may be needed in developing these keymaps. Here's a brain dump of things i've been thinking of.

First, some facts and assumptions (i'm not completely sure which is which):


 * 1) Most scripts of India, as well as many scripts of South-East Asia, Indonesia and Philippines share a very similar structure. The shapes of the letters are different and the characters are encoded separately in Unicode, but the basic ideas of inherent vowels, virama, phonetic order (ka kha ga gha nga etc.), separate characters for initial and medial vowels, unusual behavior of the character that represents the sound of /r/, etc. are similar. (For details see Brahmic family of alphabets ).
 * 2) The Indian standard InScript keyboard tries to map the keys similarly in all the scripts of India according to this structure.
 * 3) Some of these scripts are used to write several languages - for example, the Devanagari script is used for Hindi, Marathi, Sanskrit and other languages; the Burmese script is used to write Burmese, Mon, Shan and other languages etc.
 * 4) The same script can be used slightly differently in different languages. For example, Devanagari is used differently in Hindi, Marathi, Sanskrit and Nepali. (This happens in many other scripts, too - Cyrillic is used differently in Russian and Ukrainian.)
 * 5) The main problem with the Gujarati keyboard as it was submitted to us was the confusion between Devanagari virama and the Gujarati virama. They have similar appearance and function and probably the creator of the mapping thought that it's the same character.
 * 6) Some of the changes that the Marathi speakers made to the Hindi transliteration keymap to adapt it to their language are probably useful for Hindi, too.
 * 7) A lot of people that write in the languages that use these scripts would probably love to use a phonetic keymap, although i don't know how many exactly.
 * 8) C-DAC, Red Hat and Google have created phonetic typing schemes.

Now, some ideas about what we can do about it. They are very half-baked and possibly silly; it is also quite hard to put them in writing and it would be better to discuss them in person, but i prefer not to wait until the next time it happens.


 * 1) A phonetic keyboard for different language versions of the same script (e.g. Devanagari-Hindi and Devanagari-Marathi) should be based as much as possible on one copy of the code. (Code reuse FTW.)
 * 2) It can be something like an object-oriented approach. For example, there would be a parent Devanagari class that implements the common functionality from which different schemes for different languages inherit (Hindi, Marathi etc.). The same for the Bengali script, from which Bengali and Assamese would inherit, etc. And maybe there can also be a parent Brahmi class from which Devanagari, Bengali, Malayalam etc. would inherit, replacing the needed characters according to Unicode range. Unfortunately, i'm not sure that it would actually be easy to implement in JavaScript.
 * 3) This approach, if at all feasible, can also be useful for completely different script systems - Arabic, Cyrillic, Latin etc.
 * 4) The schemes must be based as much as possible on C-DAC, Red Hat and Google mappings, unless they are really bad. Is it possible at all to get the mappings for C-DAC and Google or is it only possible to reverse-engineer them?
 * 5) I don't have much to say about InScript, because as far as i understand, an InScript mapping is easier to implement than a transliteration mapping, and my feeling is that transliteration mappings are in higher demand, as much as we want to love InScript.

That's it, more or less. Any ideas about this are welcome. --Amir E. Aharoni 23:04, 28 November 2011 (UTC)
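The code-reuse idea in points 1 and 2 above can be sketched in JavaScript without classes, using plain objects. This is only an illustration under stated assumptions: `extendScheme`, `lookup`, the scheme objects, and the simplified `[input, output]` rule shape are all hypothetical and are not Narayam's real API, and the Marathi overrides are an assumed example of a language-specific difference.

```javascript
// Hypothetical sketch: none of these names come from Narayam itself.
// A scheme is a plain object with an ordered list of [input, output] pairs.
function extendScheme(parent, overrides) {
  // Child rules come first so they win over the parent's rules on lookup.
  return {
    name: overrides.name,
    rules: (overrides.rules || []).concat(parent.rules)
  };
}

// Return the output of the first rule whose input equals the typed key.
function lookup(scheme, key) {
  for (const [input, output] of scheme.rules) {
    if (input === key) {
      return output;
    }
  }
  return key; // no rule: pass the key through unchanged
}

// Shared Devanagari base (a tiny illustrative subset).
const devanagari = {
  name: 'devanagari',
  rules: [
    ['k', '\u0915'], // क
    ['g', '\u0917'], // ग
    ['.', '\u0964']  // । (danda)
  ]
};

// Language-specific schemes only state what differs from the base.
const hindi = extendScheme(devanagari, { name: 'hi' });
const marathi = extendScheme(devanagari, {
  name: 'mr',
  rules: [
    ['.', '.'],      // assumed preference: keep the full stop as typed
    ['|', '\u0964']  // assumed preference: pipe gives the danda
  ]
});
```

With this shape, a fix made in the shared base reaches every language scheme, while each language file stays limited to its own differences.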

Commons feedback
This tool is currently being discussed at commons:Commons:Village_pump, following the recent activation of the extension there. Rd232 13:58, 7 December 2011 (UTC)

Numerals
Currently, while trying out Narayam at sa-wp, I find that if hindi transliteration is enabled, Narayam converts all numerals to the hindi numerals automatically. However, on hi-wp, there is currently no consensus on the use of hindi or arabic numerals. So, I think there should be an option in the preferences to allow use of the arabic numerals(123...) with narayam enabled. This will also be useful wherever template addition is required (and the templates use parser functions), since the template names and parameter names can be translated, but parser functions do not accept hindi numerals. Alternately, this option could be kept as a checkbox in the dropdown menu.--Siddhartha Ghai 19:29, 29 December 2011 (UTC)
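The requested numeral option could look roughly like the sketch below. It is an assumption-laden illustration: `convertDigits` and the `keepAsciiDigits` flag are hypothetical names, not an existing Narayam preference, and the sketch handles digits only.

```javascript
// Hypothetical sketch: `keepAsciiDigits` is not a real Narayam preference.
const DEVANAGARI_ZERO = 0x0966; // code point of ० (DEVANAGARI DIGIT ZERO)

function convertDigits(text, keepAsciiDigits) {
  if (keepAsciiDigits) {
    // Leave 0-9 untouched, e.g. for parser function arguments.
    return text;
  }
  // Map each ASCII digit to the Devanagari digit with the same value.
  return text.replace(/[0-9]/g, function (d) {
    return String.fromCharCode(DEVANAGARI_ZERO + Number(d));
  });
}
```

Calling `convertDigits('123', false)` yields the Devanagari digits, while `convertDigits('123', true)` returns the input unchanged, which is the behaviour the checkbox or preference would toggle.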

Enabling Narayam / WebFonts
Of late, I found a few bugs related to enabling / disabling. I made the matrix below and am looking for your views. This needs to be done irrespective of the UI change which is on the cards. Shall we agree to a scheme like this and fix the enabling logic in these extensions? We could also post a link to commons (where a few bugs have been raised) / meta / mediawiki (other multilingual wiki village pumps) for their feedback too. I know this adds slightly more complex logic, but we do live in a complex world :)

Logicwiki 17:48, 1 February 2012 (UTC)

PS: Let's assume for a moment we have WebFonts in Tamil :) PPS: We could also optionally think of the outside world (not just WMF wikis or even MediaWiki); the functionality could be agnostic of these.

Marathi
Hi All, we have enabled this extension on Marathi Wikipedia. We want the following things to be changed:

"." (पूर्णविराम) is transliterated to "।" (स्तंभ), but in Marathi "।" is used very rarely. "." should not be transliterated and should remain "." (पूर्णविराम). I suggest that we transliterate "|" (pipe) to "।" to serve the rare cases.

- कोल्हापुरी (talk) 04:47, 22 February 2012 (UTC)


 * Please report features/issues through Wikimedia Bugzilla (https://bugzilla.wikimedia.org/), so the developers can follow up on them well. I think somebody has already reported this issue. Thank you. --Junaidpv (talk) 04:32, 1 March 2012 (UTC)

Hindi transliteration suggestion
As of now, the following happens in Hindi transliteration:
 * "g" followed by space gives "ग्". Similarly, this applies to other alphabets. In normal situations this is fine, but when it comes to word endings, it's a problem. The ending "Schwa" sound (the a in about) is sometimes omitted at the end of words in Hindi (see w:Schwa deletion in Indo-Aryan languages), but even in these cases, it is very rare to see words ending with a halant (्). However, to input anything without a halant at the end, Narayam currently requires an "a" to be input at the end to remove the halant. This is cumbersome to use.
 * I suggest that if any alphabet is followed by a space, it be converted to a full one (without a halant). For example: "g" followed by space should give "ग". The intended result is something like: "Raam" --> "राम" instead of "Raama" input being needed. For ending words with a halant in hindi, we could have something like "g" + shift + space --> "ग्".--Siddhartha Ghai (talk) 20:24, 5 March 2012 (UTC)
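The word-ending suggestion above can be sketched as a small post-processing step. This is a hypothetical illustration and not how Narayam's rule set is actually organised: `finishWord` and its `keepHalant` flag are invented names, with the flag standing in for the proposed shift + space input.

```javascript
// Hypothetical post-processing step; not part of Narayam's real rule set.
const HALANT = '\u094D'; // ् DEVANAGARI SIGN VIRAMA

// Remove a halant that ends a word (before whitespace or at end of text).
// Pass keepHalant = true to model the proposed shift + space input.
function finishWord(text, keepHalant) {
  if (keepHalant) {
    return text;
  }
  return text.replace(new RegExp(HALANT + '(?=\\s|$)', 'g'), '');
}
```

A halant inside a word (as in पक्का) is not followed by whitespace, so conjuncts are left alone; only the trailing halant that "Raam" plus space would otherwise produce is dropped.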


 * Point 2:
 * "D" gives "ड"(U+0921) + "्"(U+094D)
 * "D" + "`" gives "ड़्" i.e. "ड"(U+0921) + "़"(U+093C) + "्"(U+094D) but this (at least in my view) should be "ड"(U+0921) + "़"(U+093C) or rather (and more preferably) "ड़"(U+095C) (without the halant, since I have not heard ड़ being used anywhere with the halant).

Similarly:
 * "D" + "h" gives "ढ"(U+0922) + "्"(U+094D)
 * "D" + "h" + "`" gives "ढ़्" i.e "ढ"(U+0922) + "़"(U+093C) + "्"(U+094D). As I see it, it should be either "ढ"(U+0922) + "़"(U+093C) or rather (and more preferably) "ढ़"(U+095D) (again without halant).--Siddhartha Ghai (talk) 20:50, 5 March 2012 (UTC)

Similarly:
 * Certain alphabets, when combined with ़, have their own code points in Unicode. I think it would be better if the single code point is used instead of two code points. These include:
 * "फ़" (U+095E): Currently, "F" gives "फ"(U+092B) + "़"(U+093C) + "्"(U+094D). It should give "फ़" (U+095E) + " ्"(U+094D) instead. Also, "f" + "`" should also give the same result, i.e "फ़" (U+095E) + " ्"(U+094D).
 * "ज़":"j" + "`" gives "ज"(U+091C) + "़"(U+093C) + "्"(U+094D). It should give "ज़"(U+095B) + "्"(U+094D) instead.
 * "क़" (U+0958): "k" + "`" gives "क" + " ़" + " ्". It should give "क़" + " ्" instead.
 * "ख़" (U+0959): "k" + "h" + "`" gives "ख" + " ़" + " ्". It should give "ख़" + " ्" instead.
 * "ग़" (U+095A): "g" +"`" gives "ग" + " ़" + " ्". It should give "ग़" + " ्" instead.
 * "य़" (U+095F): "y" + "`" gives "य" + " ़" + " ्". It should give "य़" + " ्" instead.

As of now, "z" and "Z" both give "." (a dot). And there is no direct input for "ज़"(U+095B). The sound of ज़ is the same as z. So it will be better if "z" gave "ज़" while "Z" gave "."
 * Point 3:

"o" and "O" both give "ओ". I think it was supposed to be: "o" --> "ओ" and "O" --> "ऑ" just like "e" --> "ए" while "E" --> "ऍ" currently. (Note: this same error was recently found and fixed in the script currently used at hi-wp)
 * Point 4:

"c" gives "च्" while "C" gives "क्क्" unlike other alphabets which give forms of themselves. Is this one intended? Shouldn't it have been "च्च्" instead?--Siddhartha Ghai (talk) 22:03, 5 March 2012 (UTC)
 * Point 5:

"o" + "o" gives "ओओ" i.e the same letter twice, but in practical usage it is used to refer to "ऊ".--Siddhartha Ghai (talk) 22:28, 5 March 2012 (UTC)
 * Point 6:

Similarly, "e" + "e" gives "एए", but is used to refer to "ई".--Siddhartha Ghai (talk) 22:29, 5 March 2012 (UTC)

These also mean that "h" + "e" + "e" should give "ही" (i.e "ह" + " ी") instead of the current result of "हेए" while "h" + "o" + "o" should give "हू" (i.e "ह" + "ू") instead of the current result of "होओ". Here "ह" could be replaced with any other consonant (not vowel).--Siddhartha Ghai (talk) 23:16, 5 March 2012 (UTC)

"q" currently gives "॑" (U+0951). However, I have never seen any use of this in hindi (I don't know what it sounds like either, so I could be wrong). Maybe it should be replaced by "क़" (U+0958) which is closer to "q"? (Quran is written with क़ in hindi). Or if we wish to retain "॑", maybe move it to "Q" which is currently unused, and make "q" give "क़".--Siddhartha Ghai (talk) 23:03, 5 March 2012 (UTC)
 * Point 7:
 * Note:Unicode describes "॑" as Udatta, a stress symbol used in vedic texts whereas "क़" is called "Qa", so "q" should definitely give output of "क़" (U+0958).--Siddhartha Ghai (talk) 23:21, 5 March 2012 (UTC)

"ज्ञ": Currently, to type this, we need to enter "j" + "Y". However, for practical purposes, it is pronounced like "ग्य" (see the ITRANS transliteration in w:Devanagari_transliteration) and so its input should be "g" + "Y" (the "j" + "Y" input could be kept, if needed).--Siddhartha Ghai (talk) 03:31, 7 March 2012 (UTC)
 * Point 8:

"़" (U+093C) nukta: It should not be possible to add two nuktas to the same character. This is incorrect. Also, nukta added after a matra should automatically convert to be before the matra and after the alphabet.--Siddhartha Ghai (talk) 10:47, 21 March 2012 (UTC)
 * Point 9:


 * The struck-out portions are redundant since the individual code points are excluded from composition in Normalization Form C.--Siddhartha Ghai (talk) 01:15, 21 March 2012 (UTC)
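The nukta points above (no doubled nukta, nukta placed before the matra, and the Normalization Form C note) can be illustrated together. This is a minimal sketch, assuming a hypothetical helper named `canonicalizeNukta`; the vowel-sign range U+093E to U+094C covers the common matras only, and the only library call used is the standard `String.prototype.normalize`.

```javascript
// Hypothetical helper; the name and the matra range are assumptions.
function canonicalizeNukta(text) {
  return text
    // NFC turns precomposed nukta letters (U+0958 to U+095F) into
    // base letter + nukta (U+093C), because they are composition exclusions.
    .normalize('NFC')
    // Move a nukta typed after a vowel sign to directly after the consonant.
    .replace(/([\u093E-\u094C])\u093C/g, '\u093C$1')
    // Collapse repeated nuktas into a single one.
    .replace(/\u093C{2,}/g, '\u093C');
}

// Format a string as a list of code points, for inspection.
function codePoints(s) {
  return Array.from(s, function (c) {
    return 'U+' + c.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
  });
}
```

For example, `codePoints(canonicalizeNukta('\u095C'))` gives `['U+0921', 'U+093C']`, which is why emitting the precomposed letter from a rule does not survive normalization on save.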

Formalization of rules
I'll try to formalize the rules for the above points at sa-wp (I'm not sure if I'll get them all correctly).--Siddhartha Ghai (talk) 03:38, 7 March 2012 (UTC)


 * Formalized rules for all above points except point 5 at w:sa:User:Siddhartha Ghai/vector.js (the current rule is probably better considering pucca should give पक्का). However, I wasn't able to test them due to the following javascript error:

Uncaught TypeError: Cannot read property 'rules' of undefined


 * This error is in reference to my script file and probably refers to

hi_scheme.rules.unshift.apply(hi_scheme.rules, custom_rules);


 * Narayam still works, just the rules in my userscript don't. Have I done something wrong?--Siddhartha Ghai (talk) 12:08, 16 March 2012 (UTC)


 * The updated documentation has made it possible to test the rules.--Siddhartha Ghai (talk) 01:15, 21 March 2012 (UTC)


 * Submitted patch for updating rules.--Siddhartha Ghai (talk) 19:01, 24 March 2012 (UTC)
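The TypeError quoted above is what JavaScript raises when `hi_scheme` itself is undefined at the moment the line runs, for instance if the user script executes before the scheme is available. A guarded version of the same prepend might look like the sketch below; `prependRules` is a hypothetical helper, and how and when Narayam exposes its scheme objects is not shown here.

```javascript
// Hypothetical helper around the unshift.apply line quoted above.
function prependRules(scheme, customRules) {
  // If the scheme is not loaded (yet), reading scheme.rules would throw
  // "Cannot read property 'rules' of undefined", so bail out instead.
  if (!scheme || !Array.isArray(scheme.rules)) {
    return false;
  }
  // unshift.apply inserts all custom rules at the front, keeping their order.
  Array.prototype.unshift.apply(scheme.rules, customRules);
  return true;
}
```

A caller can then retry later or log a message when `prependRules` returns `false`, rather than having the whole user script stop on the exception.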

Customisation
The customisation notes given in Extension:Narayam are a bit incomplete. Most importantly, in the rule  what is the second character?

Also, does this mean complex assignments can be done (like saving particularly annoying spelling to easier input methods)?--Siddhartha Ghai (talk) 22:26, 5 March 2012 (UTC)

Just to clarify, I am referring to the empty character, i.e  .--Siddhartha Ghai (talk) 01:34, 6 March 2012 (UTC)
 * Explanation for all three items in a rule is given at Extension:Narayam. I hope that clarifies your question.--Santhosh.thottingal (talk) 04:05, 6 March 2012 (UTC)
 * To a limited extent, it does. Thanks--Siddhartha Ghai (talk) 12:01, 6 March 2012 (UTC)
 * That section reads "The transliteration algorithm processes the rules in the order they are specified, and applies the first rule that matches." So if I add key mappings to my userscript file, which will take precedence? Meaning that the custom rules are added to the top or bottom of the default rules?
 * Asking this because some of the rules to the problems I specified in the above section would work only if they take precedence above the default rules.
 * Also, is there any way to increase the keystroke buffer size via userscripts? That, if possible, would enable me to store a few particular spellings that I might want to save, like "google" always giving me "गूगल", "software" always giving me "सॉफ़्टवेयर" etc.--Siddhartha Ghai (talk) 04:11, 7 March 2012 (UTC)
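The precedence question above follows directly from the quoted "first rule that matches" behaviour. The sketch below is a simplified model, not Narayam's actual code: `transliterate` is a hypothetical function, the three-item rule shape (input pattern, key-buffer pattern, replacement) is assumed from the documentation being discussed, and the `O` rules mirror the behaviour described earlier on this page rather than the real rule file.

```javascript
// Simplified first-match engine; an assumption, not Narayam's source.
function transliterate(input, keyBuffer, rules) {
  for (const [pattern, lookback, replacement] of rules) {
    const re = new RegExp(pattern + '$');
    if (!re.test(input)) {
      continue; // the end of the input does not match this rule
    }
    if (lookback.length > 0 && !new RegExp(lookback + '$').test(keyBuffer)) {
      continue; // the previously typed keys do not match this rule
    }
    return input.replace(re, replacement); // first match wins
  }
  return input; // no rule matched
}

// Assumed defaults, mirroring the "o and O both give ओ" report.
const defaultRules = [
  ['o', '', '\u0913'], // ओ
  ['O', '', '\u0913']  // ओ
];
// Custom rule a user might want: O gives ऑ.
const customRules = [['O', '', '\u0911']];

const prepended = customRules.concat(defaultRules); // custom rule wins
const appended = defaultRules.concat(customRules);  // default rule wins
```

In this model a custom rule only takes effect when it sits before the default rule that matches the same input, which is why prepending (as with `unshift`) matters for overrides.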

Sorry for the late reply; I was unable to get involved in this. I think customizing the existing ruleset will be less effective since the order of rules is important. So I suggest overriding the entire ruleset rather than modifying it. You can start by copying the ruleset from the Narayam code.

I haven't tried this, so don't know whether it will work or not. --Junaidpv (talk) 03:19, 21 March 2012 (UTC)


 * Thanks for the response. I was able to test the changes and submit a patch (without having to copy the entire ruleset, primarily because the rules I had to apply were higher in priority than other existing rules.)--Siddhartha Ghai (talk) 19:11, 24 March 2012 (UTC)

Method to enable transliteration for Specific language without dependency on Language of Wiki.
I was helping Junaid develop a language I use for my wiki. In the initial stages of Narayam, there was a configuration option for setting the language intended for transliteration. However, that was removed in the later versions of Narayam. I was hoping this was for development purposes, and I have asked Junaid about this issue. For many smaller languages the main site language is kept as English because users are more familiar with English for preference names and structural info. However, the content is in local languages. It is in this situation that I manage my website. So, is there a way to allow input in a specific language while the main wiki language is set to 'en' (English)? I do believe that configuration options empower users with choice, and I am hopeful that this will always be considered in upcoming versions of Narayam.

Thank You for this essential Extension. Wikimanz (talk) 22:19, 19 March 2012 (UTC)

Hi, if your content is not in English, I suggest you set $wgLanguageCode to the content language instead of English. For example, if your wiki content is Malayalam (ml), put this in your LocalSettings.php:

$wgLanguageCode = 'ml';

And since you want the interface language to be English by default, also set this in LocalSettings.php:

$wgDefaultUserOptions['language'] = 'en';

With the above two settings, your wiki's interface will be English, while the content will be in the local language. If Narayam is installed, you will see that it is available at the top of the page even in the English interface. You may enable it by default or not using the $wgNarayamEnabledByDefault setting. Even if the content is in multiple languages, Narayam allows you to select an input method from multiple languages. --Santhosh.thottingal (talk) 04:11, 20 March 2012 (UTC)