Topic on Talk:Universal Language Selector

Several problems with web fonts

Olaf Studt (talkcontribs)

Since the German Wikipedia has only recently started supplying web fonts, I added missing {{lang}} templates to a lot of articles over the last few days, and I encountered several problems:

The thing I saw last is perhaps the easiest to fix: The mechanism is case sensitive. {{Lang|km|}} works, {{Lang|Km|}} doesn't.

Script suffixes, as in {{lang|bo-Tibt|}} or {{lang|am-Ethi}}, do not work, so I had to change {{lang|bo-Tibt|}} to {{lang|bo|}}.

For the Ge'ez script, only {{lang|am}} and {{lang|ti}} work; {{lang|gez}} (Ge'ez language) and {{lang|tig}} (Tigre language) don't, so I left de:Ge’ez (Sprache) and de:Tigre (Sprache) without {{lang}}. None of the smaller Ethiopian Semitic languages work either (for these, maybe the suffix -Ethi should be mandatory, because some of them are also written in Arabic script or hardly written at all).

For the Tibetan alphabet, only {{lang|bo|}} and {{lang|dz|}} work, not the other languages in Category:Languages written in Tibetan script.
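To illustrate the script-suffix point, here is a minimal Python sketch of a font lookup that honours an explicit script subtag before falling back to the language code. It is not how the Universal Language Selector is implemented; the tables, the font names, and the function `pick_font` are all hypothetical and exist only for illustration.

```python
# Hypothetical tables; the font names are placeholders, not real ULS fonts.
LANGUAGE_FONTS = {
    "bo": "ExampleTibetanFont",
    "dz": "ExampleTibetanFont",
    "am": "ExampleEthiopicFont",
    "ti": "ExampleEthiopicFont",
}
SCRIPT_FONTS = {
    "tibt": "ExampleTibetanFont",
    "ethi": "ExampleEthiopicFont",
}


def pick_font(tag):
    """Return a font name for a language tag, or None if nothing matches."""
    subtags = tag.casefold().split("-")
    # Treat the first four-letter alphabetic subtag after the language
    # code as the script subtag.
    for subtag in subtags[1:]:
        if len(subtag) == 4 and subtag.isalpha():
            font = SCRIPT_FONTS.get(subtag)
            if font is not None:
                return font
            break
    # No usable script subtag: fall back to the primary language subtag.
    return LANGUAGE_FONTS.get(subtags[0])
```

Under these assumptions, `pick_font("gez-Ethi")` finds the Ethiopic placeholder font through the script table even though `gez` has no entry of its own, while a bare `pick_font("gez")` still finds nothing.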

Nemo bis (talkcontribs)

As for case sensitivity, you should just make the template convert to lowercase with the {{lc:}} magic word.

Mps (talkcontribs)

Converting the annotations as a whole to lowercase goes against the recommendations of RFC 4646 and the ISO standards mentioned therein regarding the format of each annotation part. It is the responsibility of the implementations that make use of the annotations (here, the Universal Language Selector) to treat them internally as case-insensitive. Additionally, your proposal would only address this single issue in this one template; it would cover neither other templates that use language annotation nor language annotation done without templates, and thus would not solve the general issue.

Nemo bis (talkcontribs)

What general issue? I'm not aware of a general issue. People seem to be mostly using language codes correctly.

The RFC says:

  The format of the tags and subtags in the registry is RECOMMENDED.
  In this format, all non-initial two-letter subtags are uppercase, all
  non-initial four-letter subtags are titlecase, and all other subtags
  are lowercase.

Hence {{Lang|Km|}} is incorrect, and I don't see why the template shouldn't fix it. Sure, the RFC also states "All comparisons MUST be performed in a case-insensitive manner" (which could be an issue for the extension or for core); I was merely giving a practical suggestion about that specific case.
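The RECOMMENDED casing quoted above can be sketched in a few lines of Python. This is only an illustration of the quoted rule applied literally; it does not validate tags and ignores edge cases such as private-use subtags. The function name `canonical_case` is made up for this example.

```python
def canonical_case(tag):
    """Apply the quoted casing rule to a hyphen-separated language tag."""
    result = []
    for index, subtag in enumerate(tag.split("-")):
        if index > 0 and len(subtag) == 2:
            # Non-initial two-letter subtags (regions) are uppercase.
            result.append(subtag.upper())
        elif index > 0 and len(subtag) == 4:
            # Non-initial four-letter subtags (scripts) are titlecase.
            result.append(subtag.capitalize())
        else:
            # Everything else, including the initial subtag, is lowercase.
            result.append(subtag.lower())
    return "-".join(result)
```

With this rule, `Km` becomes `km` and `bo-tibt` becomes `bo-Tibt`, so the script subtag keeps its recommended form.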

Mps (talkcontribs)

The general issue is that the Universal Language Selector is not performing its comparisons in a case-insensitive manner, otherwise {{Lang|km|}} and {{Lang|Km|}} wouldn't behave differently.

Besides, if every input were just filtered through an "{{lc:}}", something like "{{lang|bo-Tibt|}}" would simply become "{{lang|bo-tibt|}}". And as you say, "People seem to be mostly using language codes correctly", so lower-casing everything would turn mostly recommended inputs into non-recommended outputs, just for the sake of removing rare non-recommended (but nevertheless allowed) cases like "{{Lang|Km|}}". Of course one could do some parameter parsing (though I doubt that the advantage of producing well-formed output, which is not even required, for a few malformed inputs would outweigh the performance cost), but this does not change the fact that an "{{lc:}}" would only treat the symptoms, not the cause.
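The difference between the two approaches can be shown with a small Python sketch. It assumes a hypothetical font table keyed by case-folded tags; `FONT_TABLE`, `find_font`, and the font names are invented for illustration and do not reflect the actual Universal Language Selector code.

```python
# Hypothetical font table keyed by case-folded tags (placeholder names).
FONT_TABLE = {"km": "ExampleKhmerFont", "bo": "ExampleTibetanFont"}


def find_font(annotation):
    """Compare case-insensitively; the annotation itself is left untouched."""
    return FONT_TABLE.get(annotation.casefold())


def lowercase_everything(annotation):
    """What a blanket lowercase filter would do to the annotation."""
    return annotation.lower()
```

Here `find_font("Km")` and `find_font("km")` give the same result without rewriting what the editor typed, whereas `lowercase_everything("bo-Tibt")` yields `bo-tibt`, which is no longer in the recommended format.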

Nemo bis (talkcontribs)

Again, I never said "lowercase everything". Your point is very clear and I'm unable to comment on it; there is no purpose in repeating it.

Santhosh.thottingal (talkcontribs)