Talk:Universal Language Selector/Compact Language Links

About this board

This is the feedback page for the Compact Language Links feature.

See the Frequently Asked Questions.

All feedback is welcome. You can write in any language.

You don't have to do this, but it will be helpful if you mention the following things:

  • The browser and the operating system you are using, including version numbers.
  • The language of your operating system.
  • The country from which you are reading the site.

Od1n (talkcontribs)

When this thing is enabled, it now produces the following JavaScript warning:

This page is using the deprecated ResourceLoader module "es5-shim".
Use of the "es5-shim" module is deprecated since MediaWiki 1.29.0
Nikerabbit (talkcontribs)

It's harmless, though annoying. It's being discussed in Phab:T162590.

Reply to "JavaScript warning"
Pepparkaksgubbe (talkcontribs)

This new idea is strange. Links to lots of languages are hidden away, instead of being shown in the list where they are easily found.

If someone likes this idea, let them have it as an option. For all others, including people who are not logged in, the default should be the real, full list of languages for every article, so you can find them. ~~~~

~~~~

83.134.162.97 (talkcontribs)

I support your remarks 100%, but is there still hope for a multilingual list as the default choice?

DigitalHamster (talkcontribs)

Yes, I completely agree. For instance, if you want to see how many languages a page is in, or you want to translate a page and are seeing if it already exists, this setting would make it difficult to check.

Whatamidoing (WMF) (talkcontribs)

I think it's the other way around. Compact Language Links gives you a short list of languages that are most likely to interest you, plus an actual, numeric count of the others. For example, https://simple.wikipedia.org/wiki/Bergen lists nine languages for me and tells me that there are 86 more. I can quickly determine that 9+86=95 (plus the page that I'm on). https://en.wikipedia.org/wiki/Bergen (where I have CLL turned off) gives me a long list with no count. With CLL, I can tell you that 96 Wikipedias have an article on that subject. Without CLL, I will tell you that the answer is "lots".

If you are translating a page, you've probably been reading in that language, so CLL will put the target language in the short list (not somewhere in the long list of 95 other languages).

Even if it's not in the short list, with CLL, you can search for that language by name and ISO language code just by clicking the button that says "85 more". For example, if you want to know whether that article exists in Finnish, you can find it by searching for fi (language code), Suomi (Finnish name), Finnish (English), ffinneg (Welsh), fiński (Polish), or its name in many other languages.

Without CLL, you have to know that the Finnish word for that language is "Suomi", and you have to scan through about 85% of the long list until you find that word (because the fi language code is alphabetized in the S's). This does not sound like an improvement to me.
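
The search behaviour described above (matching a language by its code, its own name, or its name in other languages) can be illustrated with a minimal sketch. This is not the actual Compact Language Links code: the `languages` sample data and the `searchLanguages` helper are hypothetical, and prefix matching is an assumption made for the illustration.

```javascript
// Hypothetical sample data: language code, autonym, and a few names of the
// language in other languages (the Finnish values come from the comment above).
const languages = [
  { code: "fi", autonym: "suomi", names: ["Finnish", "ffinneg", "fiński"] },
  { code: "sv", autonym: "svenska", names: ["Swedish"] },
  { code: "da", autonym: "dansk", names: ["Danish"] },
];

// Return the codes of all languages whose code, autonym, or any known name
// starts with the query (case-insensitive). An empty query matches nothing.
function searchLanguages(query, list) {
  const q = query.trim().toLowerCase();
  if (q === "") return [];
  return list
    .filter((lang) =>
      [lang.code, lang.autonym, ...lang.names].some((label) =>
        label.toLowerCase().startsWith(q)
      )
    )
    .map((lang) => lang.code);
}

console.log(searchLanguages("Suomi", languages));
console.log(searchLanguages("ffinneg", languages));
```

With this kind of lookup, a reader does not need to know where "Suomi" sorts in the long list; any of the known labels leads to the same entry.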

DigitalHamster (talkcontribs)

Actually, thinking back on what has been said and seeing other comments, if this list is expandable just by clicking the "x others" button, that is actually a very nice feature.

Whatamidoing (WMF) (talkcontribs)

If you haven't used it for a while, then I encourage you to try it out for a few days. There's nothing more effective for discovering its strengths and weaknesses than trying to live with it.

As I said, I've got it turned off at the English Wikipedia, and that's not an accident. I need to do things like open every language's equivalent of the Village Pump (technical) in tabs more often than I need to find articles in languages that I can actually read. It's good for finding languages that I visit often; it's not so great for opening 62 tabs. (Also: Oh, how I miss Linky.)

For power users like us, whether CLL is more efficient or less efficient is a pretty personal thing, based on the kind of work that you personally do. You'll know whether it fits your personal work patterns after interacting with it a few days.

Madglad (talkcontribs)

I really miss an understanding that "power users like us" and registered, experienced users are a very small minority. Wikipedia is an open encyclopedia for everybody, and should also work for IP users all over the world. CLL should be rolled back until acceptable algorithms are developed. The algorithms should take into consideration dialect continua and neighbouring countries. The current algorithm seems to have absolutely no understanding of this.

Whatamidoing (WMF) (talkcontribs)

The current algorithm is based on an internationally recognized database (whose name escapes me at the moment). I believe that the Language team would be perfectly happy to see that database updated to recognize that several Scandinavian languages are divided more by politics and history than by actual linguistic elements. If you've got the data that says, for example, that most Norwegians and Swedes and Danes can read each other's "separate" languages, then I'll find someone who can help you figure out how to propose a correction to the database.

Madglad (talkcontribs)

It seems that the purpose of the mentioned database is not to cover mutually intelligible languages, but to cover official languages. Mutual intelligibility is not restricted to Scandinavian languages, see .

Mutual intelligibility is not the only issue. Schools also tend to teach other important languages of the region, e.g. many schools in western Europe teach German and French.

DigitalHamster (talkcontribs)

Good point actually, and I didn't know that it tells you the number. Nevertheless, the proposed feature makes searching for a language "irrelevant to you" more complicated and cumbersome, and I still think it should be optional for added customisability.

Pepparkaksgubbe (talkcontribs)

This was written by me at ~~~~

Pepparkaksgubbe (talkcontribs)

Why can't I sign my messages here? ~~~~

Amire80 (talkcontribs)

You don't need to sign the messages here, because the signature is automatic :)

Making this feature on by default makes it easier for most people to find and click the languages that they need. This was proven by experimenting with real users during the design stage, and also by the data about clicks that has been collected since the deployment of this feature started in June 2016. As you can see in the data, the number of clicks in all languages went up, and in many languages it more than doubled.

People who prefer to see the full list all the time can turn it off in the preferences.

Pepparkaksgubbe (talkcontribs)

(There is no full automatic signature, since there is no time stamp next to it. Now I see there is another kind of time stamp to the right.)

This feature hides most of the language links, making them harder to find and harder to go to. I don't believe in the proof; it is certainly skewed, because the changed appearance alone made people click more. With this feature it is harder to find the languages you are specifically looking for. The old, full list should be the default.

Whatamidoing (WMF) (talkcontribs)

If it was just the changed appearance, then we should see a one-time spike followed by a decline later. I don't think that the data supports that theory; it seems to support a sustained higher level. The fact that it learns the languages you're most likely to visit probably affects this; instead of seeing a list of 100 languages, it shows a short list of the languages that you're most likely to visit.

That said, I've got it turned off on several wikis, because I have an unusual need to get between many dozens of language editions of Wikipedia. But I recognize that my work pattern is a bit different from typical editors. Most editors are only interested in languages that they can understand; they rarely need to leave a message or check a feedback page on wikis that they can't read. I also imagine that one of my colleagues, who can read about 20 languages, would find it a bit limiting. But we're the outliers, and the average editor seems to be better served by a tool that focuses on what you need (and lets you search by name or language code for anything that's not shown).

Madglad (talkcontribs)

This thing doesn't show the languages of our neighbouring countries, which everybody understands.

On the other hand it shows us languages, which may be big on the other side of the world, but languages almost nobody in my country understands.

Its algorithms simply aren't professional, and neither is its data. Please remove it from da-wiki; it's not suitable for our country.

Whatamidoing (WMF) (talkcontribs)

What language do you have your user interface set to (in Special:Preferences)?

Madglad (talkcontribs)

da - dansk

81.230.62.175 (talkcontribs)

The algorithms for language selection are indeed very unprofessional; we see the same problem on the Swedish Wikipedia. It seems to be based on this

http://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html

list - Interlingua is set there as having zero speakers in Sweden - still, it's one of the default languages on the Swedish Wikipedia, simply because it's in the list.

Bandy Hoppsan (talkcontribs)

I agree, make this optional. And since this is all new (it was implemented just days ago), you can't have statistics for any significant amount of time yet to know whether the spike is a spike or not.

Amire80 (talkcontribs)

@Bandy Hoppsan, this was enabled in most languages in June–August 2016, so we do have data about almost all languages for more than nine months by now. See Universal Language Selector/Compact Language Links/metrics/data.

Madglad (talkcontribs)

So when neighbouring mutually intelligible languages are not shown and not easily accessible for the average reader, this is recorded in the statistics as a lack of interest in reading the articles in the mutually intelligible neighbouring language. Strange logic.

The best way to collect statistics is first turn the system off, collect the statistics, then implement the system (after testing).

Amire80 (talkcontribs)

That's exactly what was done, as you can see on the statistics page. Statistics collection started before the system was deployed.

In June 2016, clicks on the interlanguage links in the Danish Wikipedia amounted to 0.48% of pageviews. In January 2017 the figure was 1.84%, and in February 2017 it was 1.41%. Here's the full data:

June     July     August   September  October  November  December  January  February
0.4801%  0.6607%  0.4995%  0.4867%    0.9838%  0.9653%   2.2078%   1.8352%  1.4091%

The deployment to the Danish Wikipedia was done on July 14. It took a couple of months until the percentage actually started growing, but it jumped up in October, and even further in December.

The numbers for most other languages in the same range of pageviews as the Danish Wikipedia have a similar growth trend. Here is Greek, for example:

June     July     August   September  October  November  December  January  February
0.2812%  0.2803%  0.2444%  0.2914%    0.6072%  0.8320%   1.4014%   1.2741%  0.8271%

(And here is the pageviews data.)

Both the user testing and the subsequent clicks data show that the compact list makes language links more easily accessible for most people. People who find it inconvenient can disable it.
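
For readers who want to check the size of the change in the two tables above, here is a minimal sketch that computes how each series compares with its June baseline. The `months`, `rates` and `growthSummary` names are hypothetical and introduced only for this illustration; the figures are copied from the tables.

```javascript
// Click-through rates (percent of pageviews) copied from the two tables above,
// June 2016 through February 2017.
const months = ["June", "July", "August", "September", "October",
  "November", "December", "January", "February"];
const rates = {
  da: [0.4801, 0.6607, 0.4995, 0.4867, 0.9838, 0.9653, 2.2078, 1.8352, 1.4091],
  el: [0.2812, 0.2803, 0.2444, 0.2914, 0.6072, 0.8320, 1.4014, 1.2741, 0.8271],
};

// Compare the peak month and the last month with the first month (the baseline).
function growthSummary(series) {
  const baseline = series[0];
  const peak = Math.max(...series);
  return {
    peakMonth: months[series.indexOf(peak)],
    peakRatio: peak / baseline,
    lastRatio: series[series.length - 1] / baseline,
  };
}

for (const [wiki, series] of Object.entries(rates)) {
  const s = growthSummary(series);
  console.log(wiki, s.peakMonth, s.peakRatio.toFixed(2), s.lastRatio.toFixed(2));
}
```

A ratio like this only describes the numbers in the tables; by itself it says nothing about causes or seasonal effects.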

Madglad (talkcontribs)

It might be that there is an increase in interlanguage clicks with fewer languages to select from.

BUT:

This is not the full statistics. A year is 12 months, and a statistical examination should contain data for at least 5 years to make a reliable test, in the multinomial distribution, of the hypothesis that the increase is due to the new system. Another hypothesis that would be interesting to test is that there are seasonal changes; your numbers might just be reflecting this. You need a professional educated in statistics at master's level. Just looking at some numbers and saying "There is a tendency" is something we should leave to tabloid journalism. In other words, the claim "the subsequent clicks data show that the compact list makes language links more easily accessible for most people" cannot be scientifically justified.

Most importantly, your data lacks information about the decrease (my assumption, for obvious reasons) in clicks on nn, nb and sv from da-wiki after these languages were removed.

The claim "People who find it inconvenient can disable it" relies on the false assumption that the majority of readers are experienced, registered users, not ip-readers.

Do you have a link to the full data set you are using?

Bandy Hoppsan (talkcontribs)

@Amire80, I have been around many language versions and never seen it anywhere until some days ago.

I also like to have different languages for different kinds of articles, if this has to be mandatory (which I still think it shouldn't be). When I write about bandy, I want to have links to Swedish, Finnish, Russian and Norwegian. If I write about Japanese manga, I want to have links to Japanese, English and some other languages. If I write about something in Switzerland, I want to be able to easily compare German, French and Italian. This seems to be impossible with this system, which is one more reason to be against it.

Amire80 (talkcontribs)

@Bandy Hoppsan, I'm not sure why you haven't seen this anywhere until recently. This existed as a beta feature since 2014. It was enabled as a default (non-beta) feature in Wikipedia in all languages except Swedish, Dutch, French, German, and English in August 2016. In February 2017 it was enabled also in French, Dutch, and Swedish. German and English will soon follow.

People who want to always see the long list can disable the compact list in the preferences (Appearance -> Languages; Utseende -> Språk).

Madglad (talkcontribs)

"People who want to always see the long list can disable the compact list in the preferences" - under the false assumption that all readers are experienced registered users and there are not many IP readers.

Bandy Hoppsan (talkcontribs)

It should be disabled by default.

Madglad (talkcontribs)

Tell me, if an IP reader wants to read about the Norwegian city Bergen, there is a good chance that there will be more extensive articles in Norwegian Bokmål (no, which is almost identical to Danish) and Norwegian Nynorsk (nn, which is mutually intelligible with Danish). These languages are not listed. When not logged in, I get this list:

Another example, the Dutch city Utrecht, as IP:

In Europe people in general understand many languages, and these languages should initially be presented based on neighbouring countries and etymological relations, not on wrongly collected statistics. It makes no sense to present languages that are big elsewhere in the world but that almost nobody in Europe understands.

This system is unusable in Europe.

Pepparkaksgubbe (talkcontribs)

So, when will this idea with compact language links be scrapped? If you don't want to make compact links optional, you shouldn't implement them at all! Forcing compact links upon everyone because of subjective and unscientific interpretations of statistics is ridiculous and insulting.

Whatamidoing (WMF) (talkcontribs)

Compact language links have always been optional for logged-in users. Go to Special:Preferences to enable or disable this tool.

Pepparkaksgubbe (talkcontribs)

So? The problem is, that this is something you have to opt out from. If it is to exist at all, it should be something you could opt in to. Compact language links should not be forced upon people who are not logged in or who haven't discovered how to get rid of it.

Madglad (talkcontribs)

Maybe, but this software obviously doesn't work in all regions, so disable it by default. What works for registered users is less important than making it work for all users.

Pepparkaksgubbe (talkcontribs)

The extensive list of languages cannot disturb anyone, as it lies in the left margin, and each of the languages is easy to find since they are in alphabetical order, and you can also see what they link to by hovering over them if you want to.

When the language links are cropped together, you have to go search for them in a square which opens over the text in the article, thus overlaying part of the text you are researching at the moment, and in this square you have to scroll and the links are not easy to find, partly because of the scrolling and partly since they seem to be in some random geographical order.

Whatamidoing (WMF) (talkcontribs)

You don't have to scroll. There's a search field at the top of the box. Start typing in the name of the language you want, and it will find it for you.

Without CLL, Bergen gives me three full screenfuls of languages in the sidebar.

Pepparkaksgubbe (talkcontribs)

So? It's still much harder to find the wanted language than to just have it in the left margin in the full list of languages, where it can be clicked immediately without searching or scrolling.

Whatamidoing (WMF) (talkcontribs)

If you can actually look at a list of 95 items in multiple scripts and "immediately without searching or scrolling" find a single item in it, then I'm impressed. Please consider uploading a screencast of how you "immediately without searching or scrolling" find items that are below the scroll; I'll send it to Design Research.

Madglad (talkcontribs)

No need to search, the language links are actually in alphabetic order. Scrolling through an alphabetic list is easier and more intuitive than searching through a pop-up list ordered first by some arbitrary geopolitical grouping, next by the alphabet of the language, and third by the name of the country.

This system should be turned off by default until it can be reviewed by people who have studied linguistics and/or computer science subjects like HCI and user interface design.

Whatamidoing (WMF) (talkcontribs)

I believe that in the actual user research (which was, in fact, conducted by people who have studied human-computer interaction and UI design), words like "more intuitive" and "alphabetical" never appeared in descriptions of the older system.

This is hardly surprising, because very few people can correctly tell you whether Í precedes or follows I, or whether the Hebrew ע‎ precedes or follows the Arabic equivalent – or, for that matter, whether ﻉ precedes or follows itself, since there are at least three different systems for alphabetizing that letter. It is alternatively the 16th, 18th, and 21st letter of the alphabet, depending upon whether you're writing Arabic or Persian, and which of the two Arabic systems you are using for ordering.

So it might "be" alphabetical, but it "seems" like a random, unsearchable, unintuitive jumble to most readers, which is doubtless why so many strongly preferred a system that let them search in their own languages.

Madglad (talkcontribs)

Is this research documented somewhere?

Pginer-WMF (talkcontribs)

We have been doing research around language selection at different stages. For example, you can check the recordings of the research we did with the initial prototypes, where we evaluated the general idea. The approach for language selection has also been part of Content Translation, where different research studies have been organised in the context of campaign support and template translation; in those studies, language selection has been a common activity.

We also used secondary research, such as the study on multilingual users, to have a better understanding of the number of languages users use. Since the initial deployments, we have been measuring cross-language navigation to better understand the effect of the change.

Overall, it seems that finding a language you are looking for is easier in a short list than in a long list, but even for the cases where the language you look for is not there (which is an edge case that will only happen the first time but not the hundreds of times you'll access that language later) users can figure out that more languages are available and search helps them to find the language they look for. Data confirms that the changes not only do not introduce any barrier for users to navigate across languages but that such navigations increased after the change.

As mentioned above, the pattern of language access normally involves accessing a few languages many times, so the main time-saver is that after the initial selection, you will have the languages you use in a short and convenient list. Trying to optimise the first-time experience is beneficial, but I don't think it should be the focus of the discussion about the feature, since the purpose is to improve the much more common recurrent use.

Reply to "Make this optional, not default"

Artificial Intelligence for choosing languages - really?!

Martin Josefsson (talkcontribs)

On 2014-05-10 I wrote "Automation is overkill in a situation where there are only a few hundred languages to choose from". And still somebody is working on it! Why waste time on the AI in this project? Why not simply let the user choose which languages should appear in the list, and in which order?

For about a year or so I have been using a Firefox/Greasemonkey extension called Wikipedia rearrange other languages, instead of the built-in Compact Language Links function. It is a simple extension (31 or 50 rows of code), but it works exactly as I want it to. In the code of the extension you can insert the languages that you want to appear at the very top of the list, and then you can choose whether you also want the other languages to show up. My customization of the code looks like this:

// set your languages here

var myLangs = ["en", "sv", "fi", "da", "no", "nn", "de", "et", "nl", "es", "simple"];

// setting false will leave other languages in the list

var removeOthers = false;
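
To make the effect of these two settings concrete, here is a minimal sketch of the reordering logic. It is not the code of the extension mentioned above: the `rearrange` helper and the sample page languages are hypothetical, and the sketch works on plain arrays of language codes rather than on the sidebar itself.

```javascript
// Settings in the style of the userscript quoted above.
const myLangs = ["en", "sv", "fi", "da", "no", "nn", "de", "et", "nl", "es", "simple"];
const removeOthers = false;

// Hypothetical helper: given the language codes linked from a page (in their
// original order), return them with the preferred languages first, in the
// order given by `preferred`. If `dropOthers` is true, the rest is discarded.
function rearrange(pageLangs, preferred, dropOthers) {
  const available = new Set(pageLangs);
  const top = preferred.filter((code) => available.has(code));
  if (dropOthers) return top;
  const topSet = new Set(top);
  const rest = pageLangs.filter((code) => !topSet.has(code));
  return top.concat(rest);
}

// Example page with six interlanguage links (made-up sample).
console.log(rearrange(["ar", "da", "de", "en", "ja", "sv"], myLangs, removeOthers));
// → [ 'en', 'sv', 'da', 'de', 'ar', 'ja' ]
```

Preferred languages that the page does not link to are simply skipped, so the same settings work for every article.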

Reply to "Artificial Intelligence for choosing languages - really?!"
Nouill (talkcontribs)

Some people complain (https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Le_Bistro/7_f%C3%A9vrier_2017#Langues_disponibles_sur_la_page_d.27accueil) that wp:fr should offer languages like German and Spanish on the default/short interwiki menu, instead of regional languages like Picard. I largely agree with that complaint. I take German and Spanish as examples because they are the predominant foreign languages that the general population learns as a third language in school.

Trizek (WMF) (talkcontribs)

@Amire80 told me that a possible lead is to find data concerning the main languages spoken in France. That way, it would be possible to change the CLDR data used by CLL. But I've found nothing so far (INSEE, DGLFLF...). @Xenophôn, can you help?

Nouill (talkcontribs)

http://www.ined.fr/lili_efl2010/cahier_ined_156/ci_156_partie_8.30.pdf => The "Institut national d'études démographiques" (Ined) found that the main languages (other than French) in Île-de-France (12 million inhabitants => Paris and its surroundings) in 1999 were:

  • English (18 %, not 39 %, and this is surely the most English-speaking region)
  • Arabic (5 %)
  • Spanish (5 %)
  • Portuguese (4 %)
  • German (~2 %)
  • Italian (~2 %)
  • Regional languages (1 %) => (Île-de-France doesn't have a local/regional language)

http://www.ined.fr/fichier/s_rubrique/18724/pop_et_soc_francais_376.fr.pdf => shows that the main languages (other than French) transmitted to children in France in 1999 (so the figures for regional languages are surely lower now) were:

  • Arabic (500 k)
  • Alsatian (400 k)
  • Portuguese (350 k)
  • Oïl language (300 k)
  • Spanish (290k)
  • English (280 K)
  • Occitan language (250 k)
  • Italian (200k)
  • German (150k)

http://www.onisep.fr/Parents/Cartographie-des-principales-langues-vivantes-enseignees-au-college-a-la-rentree-2016 and http://ec.europa.eu/eurostat/documents/2995521/5177349/3-25092014-AP-FR.PDF/fce15e33-b870-4f68-9c06-0bb906186ec9?version=1.0 => show that the three languages studied in school in France are English, Spanish and German.

http://ec.europa.eu/public_opinion/archives/ebs/ebs_386_en.pdf (page 23 of the PDF, or page 21 of the document) => shows that the foreign languages understood in France are:

  • English (39 %)
  • Spanish (13 %)
  • German (6 %)

http://languageknowledge.eu/countries/france (which reports the previous source) indicates that the main languages spoken in France are:

  • English (24%)
  • Spanish (9 %)
  • German (5 %)
  • Italian (3 %)
  • Arabic (2 %)
  • Portuguese (2 %)

http://www.culturecommunication.gouv.fr/content/download/93537/841041/version/4/file/lc_10_occitan_def.pdf (page 5) shows that in Provence, in the same 1999 survey, the shares of people who speak these languages with other people are (the text says that the values are underestimated):

  • English 4,4 %
  • Italian 2,6 %
  • Spanish 2,4 %
  • Arabic 2,2 %
  • Occitan 2,2 %
Trizek (WMF) (talkcontribs)

Wow. Great job!

131.175.28.130 (talkcontribs)

This is explained in ULS/FAQ#language-territory. English, Spanish and German have already been fixed: http://unicode.org/cldr/trac/ticket/9680 . Perhaps ULS needs to be updated to the latest CLDR release.

The INED data is not directly usable since it's about education (we already discarded the Eurydice data, which is similar), while the figures for Italian, Arabic and Portuguese seem usable given that http://languageknowledge.eu/about claims to have the raw Eurostat data as its source.

Nouill (talkcontribs)

The Ined data isn't about education. The two links (and others that I didn't find) summarise a survey of 380 000 people => "l'enquête famille de l'Insee-Ined de 1999", which I think is one of the main references on the subject. Figure 5 of http://www.ined.fr/lili_efl2010/cahier_ined_156/ci_156_partie_8.30.pdf speaks explicitly about speakers. http://www.ined.fr/fichier/s_rubrique/18724/pop_et_soc_francais_376.fr.pdf is about language transmission within families; it is incomplete because it doesn't include immigrants' languages or languages learned in school, but it clearly shows that the CLDR is very incomplete.

More generally, I'm not very happy with the CLDR, because I see that it highlights regional languages and English. I also have the feeling that modifying the CLDR takes a very long time. I don't think the French community (which can be strongly against the developers when it wants to be) will wait six months or more to have German, Spanish, Arabic, etc. in the default setting, and I think they will quickly ask to remove the feature if it is not improved.

Trizek (WMF) (talkcontribs)

It is possible to roll back the change made to display a list of languages on the Main page, until we find the relevant data and fix it. @Amire80, what do you think?

Nouill (talkcontribs)

I created a ticket: http://unicode.org/cldr/trac/ticket/10056. You can see similar tickets at http://unicode.org/cldr/trac/search?q=%22Add+language+to%22&noquickjump=1&changeset=on&milestone=on&ticket=on&wiki=on

Trizek (WMF) (talkcontribs)

Okay, apparently Nemo-bis has forgotten to update the data. :)

Everything is ready http://unicode.org/cldr/trac/ticket/9680#comment:1 and France will have Spanish and German languages as inter-wiki links.

Reply to "Most common languages of France"
193.163.131.133 (talkcontribs)

Why is this feature turned off on the English-language Wikipedia? And on the German and French ones?

Amire80 (talkcontribs)

Also Swedish and Dutch.

These are bigger projects with more users, so it requires a bit more planning, but it will definitely be enabled in the coming weeks. Announcements with dates will be published soon.

In the meantime, it can be enabled as a beta feature, and we'll be very happy to listen to the feedback.

Liuxinyu970226 (talkcontribs)

@Amire80: And how about Meta-Wiki? I can't even try it out?

Amire80 (talkcontribs)

It's not on Meta because Meta doesn't have different language versions in the same way that Wikipedia does.

Technically, the preference doesn't appear anywhere at Meta, because $wgInterwikiMagic is set to false there. If I understand correctly, pre-Wikidata interlanguage links like [[fr:Accueil principal]] won't go to the sidebar without $wgInterwikiMagic. Meta can have Wikidata sitelinks, however, which is why the Meta Main page has interlanguage links, but where else does Meta have them?

I guess that Compact Language Links could be enabled on Meta if the use case on Meta is comparable to Wikipedia. (And on a more technical note, maybe $wgInterwikiMagic could be set to true now that we've had Wikidata for years, but that would be a separate discussion.)

Reply to "On english language wikipedia"
Madglad (talkcontribs)

Please deactivate this unsolicited and misguided feature for dawiki.

The languages selected are not very relevant to Danish speakers. Several relevant languages are absent, and in their place are languages understood by only a minuscule fraction of Danish speakers.

Note that major changes to da.wiki are not to be implemented prior to achieving consensus on , or, if need be, a vote on the issue.

Rodejong (talkcontribs)

If it can't be deactivated, it would be nice to have the following relevant languages connected to Danish:

  • DE German
  • EN English
  • FO Faeroe
  • FR French
  • KL Greenlandic
  • IS Icelandic
  • NL Dutch
  • NN Norwegian nynorsk
  • NO Norwegian bokmål
  • SV Swedish

Kind regards Rodejong

Pginer-WMF (talkcontribs)

Thanks for the feedback @Madglad and @Rodejong,

Multiple criteria are used to determine the likely languages for a user, and users can customise the languages shown at different levels. The easiest way is just to navigate through the languages you are interested in, and those will be remembered for the next time. Users can also adjust their language settings in the browser, or contribute to CLDR so that it has more accurate information about the languages in their region, which is used as fallback information when direct information from the user is not available. Communities can also customize the order of the languages, which is also taken into account in the language selection.

Note also that the languages shown are selected only from those in which the article is available, so they can differ from article to article. The lack of global settings means that the system learns previous choices on a per-wiki basis rather than globally; as a result, it may take more time until your usual options are totally personalised.
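
The priority order described above can be pictured with a minimal sketch. This is not the actual Universal Language Selector code: the `compactList` helper, its parameters and the sample data are hypothetical, the order of the sources (previous choices, then browser languages, then territory languages) is an assumption based on the description, and the default limit of nine follows the number mentioned elsewhere on this page.

```javascript
// Hypothetical sketch: fill the short list from several sources in priority
// order, keeping only languages in which the article actually exists.
function compactList(available, sources, limit = 9) {
  const availableSet = new Set(available);
  const shown = [];
  for (const source of sources) {
    for (const code of source) {
      if (shown.length >= limit) break;
      if (availableSet.has(code) && !shown.includes(code)) shown.push(code);
    }
  }
  // `more` is the count that a button like "X more languages" would display.
  return { shown, more: available.length - shown.length };
}

// Made-up sample: the article exists in seven languages.
const available = ["de", "en", "fr", "ja", "nn", "no", "sv"];
const previousChoices = ["no", "sv"];       // languages the reader clicked before
const browserLanguages = ["da", "en"];      // from the browser settings
const territoryLanguages = ["da", "de", "fo"]; // fallback data for the region

console.log(compactList(available, [previousChoices, browserLanguages, territoryLanguages], 3));
// → { shown: [ 'no', 'sv', 'en' ], more: 4 }
```

In this picture, a first-time visitor has no previous choices, so the list is filled from the browser and territory data only.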

Our observations suggest that the new links make it easier to switch across languages, and the data we have collected so far indicates that cross-language navigation has increased for the projects where the compact language links are available. If this is not your case, we are definitely interested in knowing more details about your particular cases (which languages were shown for which article, which ones were expected, whether the results were better over time or not, etc.).

Thanks!

Reply to "Request for deactivation for dawiki"
Gamliel Fishkin (talkcontribs)

It is not a feature, it is a bug! It discriminates against many languages. These are languages with not many speakers (tens of languages in Russia, hundreds of languages in Africa, etc.), and also languages most of whose speakers are not familiar with computers and the Internet. For example, I know that the Saam language exists thanks to links to the Wikipedia in it. But the new bug will hide links to articles in "small" languages, and Wikipedia readers will think that those languages do not exist. So, please do not enable this bug and do not develop it any further!

DidiWeidmann (talkcontribs)

Dear Gamliel Fishkin, I strongly support what you say: the new policy is a big discrimination against small languages and is completely unacceptable. It is against the principles of human rights and lacks respect for small cultures! ~~~~

Amire80 (talkcontribs)

This feature will make these languages more prominent. Now languages of Russia, such as Tatar, Bashkir and Udmurt, will be shown prominently to people who connect from Russia. Earlier, you had to look for them in a list of more than 100 languages. Same for Saami—it will be shown prominently to people who connect from Norway or Finland.

DidiWeidmann (talkcontribs)

It gives the impression that this new feature is especially and expressly designed with the intention of discriminating against several languages, like Esperanto or Yiddish! There was no real need for such a system. I ask that the old system be restored!

Eugrus (talkcontribs)

This is simply not true. The minor languages of Russia are not shown to me in the compact list on the Russian Wikipedia. What is shown are just the wikis I use frequently. See w:ru:Земля, for instance, which has interwikis in a dozen minor languages of Russia, but none are shown.

Amire80 (talkcontribs)

@Eugrus, from which country are you connecting?

Which languages do you see? If you see languages that you use frequently, then it works as it is supposed to. Languages that you use frequently are probably the languages that you need the most. Languages of your country are shown if languages that you use frequently are not known, which will be true for all people the first time they see compact interlanguage links.

Gamliel Fishkin (talkcontribs)

So human beings outside of Russia will think that the Tatar language does not exist, etc. It is simply discrimination. The final result of such discrimination is that almost every human being in the world will think that only two languages exist in the world: his or her native language and English.

Amire80 (talkcontribs)

The user interface shows a list of languages that is customized for every user and helps people find information in their language. In articles with a lot of languages the list will have nine languages, and not two, and there will be a button that says "X more languages", where X is the number.
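To make the "nine languages plus a button" behaviour concrete, here is a minimal sketch of the split. It is only an illustration, not the actual Universal Language Selector code: the function name `split_for_display` and the `limit` parameter are hypothetical, and the input is assumed to be a list of language codes that is already sorted by relevance for the reader.

```python
def split_for_display(ordered_languages, limit=9):
    """Split an already-ordered list of language codes into the short
    visible list and the label for the button that reveals the rest.

    `limit=9` follows the "nine languages" mentioned in this thread; the
    function itself is a hypothetical illustration.
    """
    shown = ordered_languages[:limit]
    remaining = len(ordered_languages) - len(shown)
    # No button is needed when every language already fits in the short list.
    label = f"{remaining} more languages" if remaining > 0 else None
    return shown, label
```

For an article available in 95 other languages, this gives nine visible links and a button reading "86 more languages", so the total count stays visible even though the full list is collapsed.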

Holder (talkcontribs)

To me this feature also looks like discrimination, especially against small languages. It does not help people find information in their language; it helps people find information only in big, dominant languages!

Amire80 (talkcontribs)

Hi @Holder,

Thanks a lot for your comment.

As I explained above, this feature doesn't discriminate against minor languages, but actually helps them by showing them more prominently to the users who are most likely to know them.

I noticed on your user page that you are writing in the Alemannic Wikipedia. I checked the CLDR territory-language information table, and this language is supposed to be shown prominently to people who are connecting from France, Liechtenstein and Switzerland (search that table for "gsw"). At the moment, however, there is a particular bug for this language because of which it is not actually shown. I filed this as a task with high priority, and it will be fixed very soon. Once it's fixed, it will be shown prominently to everybody who is connecting from these countries.

Holder (talkcontribs)

@Amire80, that's indeed interesting news, thank you very much.

This long known problem is much more complicated: Alemannic language (and therefore also Alemannic Wikipedia) covers gsw, swg, wae and gct. That's why it hasn't been solved over the last ten years.

How will this be fixed in this case? It would be nice if als:wp were also shown to readers in Germany, where Alemannic is also spoken by about five million people.

Amire80 (talkcontribs)

Hi,

The data that we use can be found at http://www.unicode.org/cldr/charts/29/supplemental/territory_language_information.html

I can see that swg is listed in Germany and wae is listed under Liechtenstein and Switzerland. gct is not listed anywhere, but you can ask to add it by clicking "add new" under the relevant country and supplying information about the number of speakers of this language in that country.

Technically, we can probably redirect all these codes to als, but I'll have to discuss it with the team. I added a comment at the bug report: https://phabricator.wikimedia.org/T139949
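Just to illustrate what "redirecting all these codes to als" could look like, here is a rough sketch. Nothing here is the real implementation: the table `ALEMANNIC_ALIASES` and the function `to_wiki_codes` are hypothetical names, and the four codes are simply the ones mentioned in this thread.

```python
# Hypothetical alias table: the four codes named in this thread, all pointing
# at "als", the code used by the Alemannic Wikipedia.
ALEMANNIC_ALIASES = {"gsw": "als", "swg": "als", "wae": "als", "gct": "als"}


def to_wiki_codes(cldr_codes, aliases=ALEMANNIC_ALIASES):
    """Map territory language codes to wiki codes.

    Order is preserved, and a wiki that several codes redirect to is
    listed only once.
    """
    result = []
    for code in cldr_codes:
        wiki_code = aliases.get(code, code)
        if wiki_code not in result:
            result.append(wiki_code)
    return result
```

With such a table, a territory list like `["de", "gsw", "fr", "wae"]` would come out as `["de", "als", "fr"]`, so the Alemannic Wikipedia gets one slot regardless of which of the codes CLDR lists for the country.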

C933103 (talkcontribs)

It relies on CLDR, and CLDR relies on official figures, so if a country refuses to recognize that a language is spoken in it, the data could be skewed. Different countries also have different standards for which languages are common enough to be listed: for instance, some languages spoken by only 0.x% of the population are listed for some countries, while they are not listed for other regions.

Amire80 (talkcontribs)

From my experience, CLDR is fairly flexible with sources, and they listen to people who send reasonable bugs. If you have data that a language is spoken by a certain number of people, I encourage you to submit a bug there.

C933103 (talkcontribs)

And so we need people who know enough about the situation in an individual country and are willing to put effort into searching for unbiased information about language usage there. They must also understand that some varieties that are traditionally not considered languages in themselves actually are languages, be neutral enough on the matter to avoid intentionally overlooking some languages, and be willing to spend time reporting the problem to CLDR.

Gamliel Fishkin (talkcontribs)

Firstly, some human beings speaking the Alemannic language may live outside of the countries where most of its speakers live. Secondly, as a result of this universal language selector, human beings outside of these countries will not know that the Alemannic language exists.

It was in the first years of the twentieth century in the Russian Empire. One day, a little Russian girl saw a nameplate in Yiddish or Hebrew on the door of a Jewish family. She was not Jewish, just Russian, but these letters interested her; she learned much and became a Soviet semitologist. Similarly, someone can become interested in the language of another nation thanks to seeing the language's name in the interwikis; but that universal language selector destroys such a chance.

Amire80 (talkcontribs)

@Gamliel Fishkin, I understand, but there is also another possibility: That somebody who lives in Russia and thinks that there is no Wikipedia in the Tatar language, will find out that there is one. Compact Language Links make this more likely.

C933103 (talkcontribs)

@Amire80 When most major languages are displayed outside the panel, the need to look for an interlanguage link in the panel is minimized. This reduces the chance that users discover what they might want if they don't already know that such a Wikipedia exists. Even in a huge list, users have a higher chance of discovering a small language familiar to them than in such a panel, because they readily recognize language names written in their native script and language; but if they never click into the panel, the chance of discovering the Wikipedia in their language becomes 0.

Amire80 (talkcontribs)

The languages are tailored for each user, and they are not necessarily major. A minor language of the user's country will be preferred to a major language spoken outside of the user's country.

C933103 (talkcontribs)

In countries like Russia, India or China, far more than 10 languages are spoken, so inevitably the native language of some users can only be found in the expanded panel.

Amire80 (talkcontribs)

This is indeed an issue: https://phabricator.wikimedia.org/T133029

There's no easy solution for it, but we'll definitely get there.

Gamliel Fishkin (talkcontribs)

The only solution for such an issue is to turn this "feature" off and forget it.

C933103 (talkcontribs)

Even if you enable subregion-based filtering, there will always be regions like Moscow or Shanghai, where people from every community in the country move for economic reasons, resulting in more than 10 languages spoken in the same subregion.

Jørgen (talkcontribs)

I can see great possibilities in this feature, if it is changed a little bit. It is impossible to satisfy everyone with a uniform solution. Let the user decide! Have a list in 'preferences' where you can tick all the languages you want shown, and a button below to show the full list. As a Dane, I see English, Spanish and German, but need French, Swedish and Norwegian too. I have Arabic, Urdu, Chinese and Hindi. These languages are probably spoken by some immigrants in my country, but are useless to the vast majority. Føroysk and Kalaallisut are languages from the former North Atlantic possessions. I have no idea what to do with them; most Danes cannot understand them, let alone write them.

Holder (talkcontribs)

@Jørgen, the problem is that there has to be a decision what is shown to readers.

Jørgen (talkcontribs)

Yes, let the readers decide for themselves by ticking a list. And for IP readers, let the list default to 'all', as it used to.

Amire80 (talkcontribs)

You can pre-select the languages according to instructions at Universal Language Selector/Compact Language Links.

Also, every language that you select simply by clicking is remembered, so this feature adapts itself to every user (including anonymous readers).

Madglad (talkcontribs)

Agree with Jørgen. The list is unusable, because it shows "big languages", many of which nobody in a faraway region understands (like Indonesian languages in Denmark, on the other side of the planet). It does not show the languages of the neighbouring countries, which most people understand. Note also that most users are not registered and cannot change their settings.

Amire80 (talkcontribs)

Hi @Madglad,

Are you connecting from Denmark? May I ask on which article you see Indonesian?

Madglad (talkcontribs)

I saw Indonesian on several articles as far as I remember. Logged in from Denmark, and visiting da-wiki. But the languages are changing, depending on how I search around. But on another clean browser (tor) and logged out, I see

(still visiting da-wiki). This list of languages is not a good default starting point for Danish speakers. It should be assumed that most people visiting da-wiki also know the neighbouring languages, and that almost no Danish speakers understand Indonesian and Indian languages. The algorithm shouldn't try to guess known languages, but should pick them from a list when visiting a language-specific Wikipedia. When connecting to da-wiki from Australia, it should be considered more likely that the user understands Norwegian than some aboriginal language. I don't understand why things like these are rolled out without prior discussion in the Wikipedias.

Amire80 (talkcontribs)

In a usual working scenario, your previously selected languages are supposed to be remembered. If you are using Tor or other anonymizers or proxies, the system cannot know anything about you, so it shows the biggest global languages, and Indonesian happens to be one of them. If you are using a private browser window, you also won't see any of your previous selections.

You can configure your own preferred languages in the browser according to the instructions in Universal Language Selector/Compact Language Links.

Configuring preferred languages per specific project, as you suggest, will be possible very soon. See https://phabricator.wikimedia.org/T138973 .

Madglad (talkcontribs)

What is "a usual working scenario"?

I guess a typical scenario is a user who is not logged in, visiting one of the versions of Wikipedia, possibly not the English one. The IP is located somewhere on the planet in a region where the Indonesian languages etc. are not known, but the languages of the region are.

Quote: "Configuring preferred languages per specific project ... will be possible" - this gives me the impression, that this change is designed for en-wiki, and is not ready for implementation in the other wikipedias yet. Roll it out when it is developed and tested. Roll back for now.

Höyhens (talkcontribs)

Yes. This is a change that should not be made.

Gamliel Fishkin (talkcontribs)

There is one more topic. I see no problem if the system uses IP-address and other current information about an unregistered visitor. But if the system not only uses current information, but also remembers pages visited by this human being, it is a privacy gap.

Amire80 (talkcontribs)

No, the Compact Language Links feature doesn't remember this information.

Madglad (talkcontribs)

If a user visits the Danish language Wikipedia from a Danish IP address it will be reasonable to assume that interesting language versions to the person would be:

*sv=Swedish (neighbour country, mutually intelligible with Danish)

*no=Norwegian Bokmål (neighbour country, mutually intelligible with Danish)

*nn=Norwegian Nynorsk (neighbour country, mutually intelligible with Danish)

*de=German (minority language in part of Denmark, neighbour country, language taught in Danish schools)

*en=English (language taught in Danish schools)

Languages spoken in overseas countries of The Danish Realm:

*fo=Faroese

*kl=Kalaallisut

Languages taught in some schools:

*fr=French

*es=Spanish

Now, this example focused on Danish (and Denmark proper) can probably be generalised to most languages; languages, which have no contact with Indian, Indonesian, Chinese languages, but have contact with a lot of neighbour languages.

Amire80 (talkcontribs)

If the IP is identified as Denmark, then German, Faroese and Kalaallisut are supposed to be shown in the initial list (if, of course, a corresponding article in these languages is available).

Your IP probably wasn't identified as Denmark, which is quite possible if you used something like Tor, so the world's largest languages were shown.

If you don't see a language that interests you in the initial list, you can click "X more" and select the language that you need, and the next time it will be shown in the short list.

I know that Danish is similar to Norwegian and Swedish, but do you have data about the number of people in Denmark who are actually reading in these languages?

C933103 (talkcontribs)
  1. IIRC, Wikipedia has data about the percentage of visits per language version per country? It should be possible to use the data in reverse to find the percentage of visits to a specific language version from a specific country or subregion.
  2. You can also check the MediaWiki language fallback tree?
Madglad (talkcontribs)

Many times more people read Swedish and Norwegian than distant languages like Chinese, especially if the article is better than the Danish one. Exact numbers are unknown, but almost nobody in Denmark is able to read Chinese, while almost everybody is able to read Swedish.

But I think Danish/Denmark is just an example, the problem is general for language-wikipedias. The solution is usable for en-wiki, not for the Wikipedias of other languages, and should not be implemented in these, in the current form.

Höyhens (talkcontribs)

I must admit that I am extremely worried and sad about this attack against Wikipedia as a free dictionary. Cancel it as soon as possible, please.

Madglad (talkcontribs)

I now see that part of the problem is that the guess is based on the number of native speakers in a country, not the number of readers, which is a very big mistake.

An important issue is the assumption that everybody is logged in and has set up their browser languages etc. Setting up browser languages manually is what nerds did in the Netscape days. It is now 2016. And most users, by the way, do not have registered Wikipedia accounts.

And finally, the assumption should be made on the basis of the language of the Wikipedia being visited; the IP solution was developed for en-wiki.

This experiment should be rolled back on all language Wikipedias except en-wiki, until an acceptable algorithm is found.

Amire80 (talkcontribs)

The guess is not based on the number of native speakers. We use the data from CLDR, which clearly doesn't refer only to native speakers—for example, the entry for Denmark puts English at 86%, which is obviously not the number of native English speakers in Denmark, but probably the number who know it in one way or another. If you can cite data about the number of people in Denmark who can read Swedish or any other language, you should add it there by clicking "add new" in the table.

Also, the software really doesn't assume that everybody is logged in. Obviously, the vast majority of readers are not registered. The languages that they click in the "more" panel are automatically added to their preferred languages, and the research that we conducted showed that it works for casual readers.

As the FAQ says, the languages defined in the browser and the languages identified by geolocation are secondary to what users actually click. Once you click Swedish for example, you will see it in the compact list.
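The priority order can be pictured roughly like this. This is only a sketch under assumptions, not the real code: the function and parameter names are hypothetical, the relative order of browser languages and geolocation is my simplification, and so is the idea of filling leftover slots from a list of large global languages. What the sketch does take from this discussion is that clicked languages come first and that only languages in which the article exists are eligible.

```python
def pick_languages(available, clicked, browser, territory, fallback, limit=9):
    """Build the short list from several sources, in priority order.

    available -- set of language codes in which the article exists
    clicked   -- languages the reader selected before (highest priority)
    browser   -- languages configured in the browser
    territory -- languages listed for the reader's country (geolocation)
    fallback  -- large global languages, used when little else is known
    """
    picked = []
    for source in (clicked, browser, territory, fallback):
        for code in source:
            if len(picked) >= limit:
                return picked
            # Skip languages without an article and avoid duplicates.
            if code in available and code not in picked:
                picked.append(code)
    return picked
```

For example, a reader who once clicked Swedish, has Danish and English set in the browser, and connects from Denmark would get Swedish first, then English, then the languages of Denmark for which the article exists.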

Madglad (talkcontribs)

Technical question: How are »The languages that they click in the "more" panel« added? Cookie? IP-address? Or?

Nikerabbit (talkcontribs)

Using LocalStorage.
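To give an idea of the mechanism, here is a minimal sketch of remembering clicked languages in a string-to-string key-value store. It is not the actual Universal Language Selector code: a plain Python dict stands in for the browser's LocalStorage, and the key name, the function names and the `max_items` cap are all hypothetical.

```python
import json

# Hypothetical key name; the real feature may store its data differently.
STORAGE_KEY = "previous-languages"


def remember_language(storage, code, max_items=9):
    """Record a clicked language, most recent first, without duplicates.

    `storage` is any dict-like object holding string values, standing in
    for the browser's LocalStorage.
    """
    previous = json.loads(storage.get(STORAGE_KEY, "[]"))
    updated = [code] + [c for c in previous if c != code]
    storage[STORAGE_KEY] = json.dumps(updated[:max_items])


def remembered_languages(storage):
    """Return the remembered language codes, most recent first."""
    return json.loads(storage.get(STORAGE_KEY, "[]"))
```

Browser LocalStorage is kept per site and stays on the reader's device, which fits the points made elsewhere on this page: choices are learned per wiki rather than globally, and a private browser window starts without any previous selections.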

Amire80 (talkcontribs)

Our research shows that Swedish (and any other language) is less accessible when it is part of a long list than it is through the panel that opens when you click the "more" button.

The Nynorsk Wikipedia defined other Scandinavian languages as preferred in , so they would appear at the top of the long list. The same can be done in the Danish Wikipedia, and Compact Language Links will pick it up (not yet today, but soon).