Talk:Universal Language Selector/Compact Language Links


About this board

This is the feedback page for the Compact Language Links feature.

See the Frequently Asked Questions.

All feedback is welcome. You can write in any language.

You don't have to do this, but it will be helpful if you mention the following things:

  • The browser and the operating system you are using, including version numbers.
  • The language of your operating system.
  • The country from which you are reading the site.

Make this optional, not default

Pepparkaksgubbe (talkcontribs)

This new idea is strange. Links to lots of languages are hidden away, instead of being shown in the list where they are easily found.

If someone likes this idea, let them have it as an option. For all others, including people who are not logged in, the default should be the real, full list of languages for every article, so you can find them. ~~~~

~~~~

83.134.162.97 (talkcontribs)

I support your remarks 100%, but is there still hope for a multilingual list as the default choice?

DigitalHamster (talkcontribs)

Yes, I completely agree. For instance, if you want to see how many languages a page is in, or you want to translate a page and are seeing if it already exists, this setting would make it difficult to check.

Whatamidoing (WMF) (talkcontribs)

I think it's the other way around. Compact Language Links gives you a short list of languages that are most likely to interest you, plus an actual, numeric count of the others. For example, https://simple.wikipedia.org/wiki/Bergen lists nine languages for me and tells me that there are 86 more. I can quickly determine that 9+86=95 (plus the page that I'm on). https://en.wikipedia.org/wiki/Bergen (where I have CLL turned off) gives me a long list with no count. With CLL, I can tell you that 96 Wikipedias have an article on that subject. Without CLL, I will tell you that the answer is "lots".

If you are translating a page, you've probably been reading in that language, so CLL will put the target language in the short list (not somewhere in the long list of 95 other languages).

Even if it's not in the short list, with CLL, you can search for that language by name and ISO language code just by clicking the button that says "86 more". For example, if you want to know whether that article exists in Finnish, you can find it by searching for fi (language code), Suomi (Finnish name), Finnish (English), ffinneg (Welsh), fiński (Polish), or its name in many other languages.

Without CLL, you have to know that the Finnish word for that language is "Suomi", and you have to scan through about 85% of the long list until you find that word (because the fi language code is alphabetized in the S's). This does not sound like an improvement to me.
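The search behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual Universal Language Selector code: the `LANGUAGE_NAMES` table and the `find_languages` function are hypothetical, and the table holds just a few hand-written names, whereas the real search draws on a much larger set of language names.

```python
# Hypothetical sample data: language code -> names by which it can be found.
# The real selector uses a far larger language-name database.
LANGUAGE_NAMES = {
    "fi": {"suomi", "finnish", "ffinneg", "fiński"},
    "sv": {"svenska", "swedish"},
    "da": {"dansk", "danish"},
}


def find_languages(query, available):
    """Return the codes in `available` whose code or any known name starts with `query`."""
    q = query.strip().casefold()
    if not q:
        return []
    matches = []
    for code in available:
        names = LANGUAGE_NAMES.get(code, set())
        if code.startswith(q) or any(n.casefold().startswith(q) for n in names):
            matches.append(code)
    return matches


available = ["da", "fi", "sv"]
print(find_languages("Suomi", available))    # autonym
print(find_languages("ffinneg", available))  # Welsh name
print(find_languages("fi", available))       # language code
```

The point of the sketch is that the reader does not need to know under which letter the language is filed in the long list; any of the names, or the code, leads to the same entry.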

DigitalHamster (talkcontribs)

Actually, thinking back on what has been said and seeing other comments, if this list is expandable just by clicking the "x others" button, that is actually a very nice feature.

Whatamidoing (WMF) (talkcontribs)

If you haven't used it for a while, then I encourage you to try it out for a few days. There's nothing more effective for discovering its strengths and weaknesses than trying to live with it.

As I said, I've got it turned off at the English Wikipedia, and that's not an accident. I need to do things like open every language's equivalent of the Village Pump (technical) in tabs more often than I need to find articles in languages that I can actually read. It's good for finding languages that I visit often; it's not so great for opening 62 tabs. (Also: Oh, how I miss Linky.)

For power users like us, whether CLL is more efficient or less efficient is a pretty personal thing, based on the kind of work that you personally do. You'll know whether it fits your personal work patterns after interacting with it a few days.

Madglad (talkcontribs)

I really miss an understanding that "power users like us" and registered, experienced users are a very small minority. Wikipedia is an open encyclopedia for everybody, and should also work for IP users all over the world. CLL should be rolled back until acceptable algorithms are developed. The algorithms should take into consideration dialect continua and neighbouring countries. The current algorithm seems to have absolutely no understanding of this.

Whatamidoing (WMF) (talkcontribs)

The current algorithm is based on an internationally recognized database (whose name escapes me at the moment). I believe that the Language team would be perfectly happy to see that database updated to recognize that several Scandinavian languages are divided more by politics and history than by actual linguistic elements. If you've got the data that says, for example, that most Norwegians and Swedes and Danes can read each other's "separate" languages, then I'll find someone who can help you figure out how to propose a correction to the database.

Madglad (talkcontribs)

It seems that the purpose of the mentioned database is not to cover mutually intelligible languages, but to cover official languages. Mutual intelligibility is not restricted to Scandinavian languages, see .

Mutual intelligibility is not the only issue. Schools also tend to teach other important languages of the region, e.g. many schools in western Europe teach German and French.

Pepparkaksgubbe (talkcontribs)

Whatamidoing, I am never only interested in just a couple of languages which "most interest" me. I am interested in all languages and in how many languages there are where a subject has an article.

Whatamidoing (WMF) (talkcontribs)

I understand that you are interested in all the languages. Do you think that you are a perfectly typical reader?

Pepparkaksgubbe (talkcontribs)

The typical reader will be interested in different languages for different articles. If you read about a subject endemic to another country, you would probably be interested in seeing the article in that country's language, whether you know the language or not, because it will probably have more images, maps, tables etc. You will not be interested in exactly the same foreign languages in every article you read.

DigitalHamster (talkcontribs)

Yes, that's a good point - the list should say what the original language was, and make this appear in the list of top languages anyway.

Leofil2 (talkcontribs)

Hem... So Wikipedia does have the answer to what a perfectly typical reader is?

Why doesn't it offer, at least, the option of a list of all available languages? That would be user-friendly for everybody. A list, nothing more, accessible with one click. You choose the pre-digested "my small world as designed by W" selection, or you choose the list. Instead we have a multi-click procedure which allows us to look for nothing but what we already know.

I don't want a Facebook-like Wikipedia that shows me what its algorithm thinks I was made to see. This is the opposite of encyclopedic spirit.

Serendipity, guys, for goodness' sake. Your ULS kills it.

Madglad (talkcontribs)

It is common here in Europe that you learn several languages besides your native language. In Denmark, e.g., these are often English, German and French. Being a native Danish speaker means that you will also understand Norwegian Bokmål, Norwegian Nynorsk and Swedish. That totals 6 languages besides Danish. That is perfectly typical.

And: Faroese and Kalaallisut (Greenlandic) have some status as minority languages in Denmark and are relevant to show always in dawiki, even though the languages are only understood by a minority of Danish speakers.

People having learned French will possibly also be able to understand some Spanish, possibly also Italian and Portuguese.

Amire80 (talkcontribs)

Faroese and Kalaallisut are supposed to be shown by default. Is there an article that has a corresponding article in Faroese or Kalaallisut, and which doesn't show links to them?

Pepparkaksgubbe (talkcontribs)

They are not shown by default in articles in Swedish, but I'd like to see them when reading about things to do with the Faroe Islands or Greenland. Why do you want to stop me from finding them easily in the extensive language list in the left margin? They do not bother anyone there.

Madglad (talkcontribs)

I guess not. It was just a supplementary comment when counting languages, but of course it is not relevant to which languages a random Danish person will understand.

Whatamidoing (WMF) (talkcontribs)

I freely grant that some people speak multiple languages. However:

  1. The ability to read and write a given language doesn't make you interested in reading that language's Wikipedia. (See, e.g., the many editors who speak a non-English language natively, but exclusively edit the English Wikipedia.)
  2. The fact that some people can do this does not mean that typical people can do this. IMO it makes more sense for the default to work for the typical reader, rather than the unusually accomplished linguist.

Pepparkaksgubbe (talkcontribs)

How can you be so sure about what the 'typical' person is interested in?

  1. It is not the ability to read and write the language which makes you want to seek out information in that language's version of Wikipedia. It is the interest in information. The interest which took you to an encyclopedia in the first place.
  2. Typical people will likely want to find information from any language where something is written on the subject, whether they know the language or not.

Madglad (talkcontribs)

As explained above, Danish people will very often understand 7 languages, possibly more; some will understand only 5 or 6, or in rare cases fewer. And yes, the article about, for example, Stockholm will probably be more elaborate in Swedish than in Danish, so it would be relevant to read the Swedish article if you want a more thorough coverage of the subject. But that is not limited to cities; it is sometimes a little bit random how well a given subject is covered in a given language. Replace 'Danish' with another Scandinavian language and you get the same result.

Why does the WMF enforce this language selector on the Scandinavian Wikipedias, when it is not suitable for use in this region? I think this issue does not only relate to Scandinavian languages; similar issues probably apply to Romance and Slavic languages. It's the combination of mutual intelligibility and language education in Europe that makes this solution undesirable. Why is this selector not enforced on dewiki, BTW?

To put it short: the algorithms used in this selector are not suitable for North Germanic languages, due to the circumstances in Northern Europe.

DigitalHamster (talkcontribs)

Good point actually, and I didn't know that it tells you the number. Nevertheless, the proposed feature makes searching for a language "irrelevant to you" more complicated and cumbersome, and I still think it should be optional for added customisability.

Pepparkaksgubbe (talkcontribs)

This was written by me at ~~~~

Pepparkaksgubbe (talkcontribs)

Why can't I sign my messages here? ~~~~

Amire80 (talkcontribs)

You don't need to sign the messages here, because the signature is automatic :)

Making this feature on by default makes it easier for most people to find and click the languages that they need. This was proven by experimenting with real users during the design stage, and also by the data about clicks that has been collected since the deployment of this feature started in June 2016. As you can see in the data, the number of clicks in all languages went up, and in many languages it more than doubled.

People who prefer to see the full list all the time can turn it off in the preferences.

Pepparkaksgubbe (talkcontribs)

(There is no full automatic signature, since there is no time stamp next to it. Now I see there is another kind of time stamp to the right.)

This feature hides most of the language links, making them harder to find and harder to go to. I don't believe in the proof; it is certainly skewed, because just the changed appearance made people click more. With this feature it is harder to find the languages you are specifically looking for. The old, full list should be the default.

Whatamidoing (WMF) (talkcontribs)

If it was just the changed appearance, then we should see a one-time spike followed by a decline later. I don't think that the data supports that theory; it seems to support a sustained higher level. The fact that it learns the languages you're most likely to visit probably affects this; instead of seeing a list of 100 languages, it shows a short list of the languages that you're most likely to visit.

That said, I've got it turned off on several wikis, because I have an unusual need to get between many dozens of language editions of Wikipedia. But I recognize that my work pattern is a bit different from typical editors. Most editors are only interested in languages that they can understand; they rarely need to leave a message or check a feedback page on wikis that they can't read. I also imagine that one of my colleagues, who can read about 20 languages, would find it a bit limiting. But we're the outliers, and the average editor seems to be better served by a tool that focuses on what you need (and lets you search by name or language code for anything that's not shown).

Madglad (talkcontribs)

This thing doesn't show the languages of our neighbouring countries, which everybody understands.

On the other hand, it shows us languages which may be big on the other side of the world, but which almost nobody in my country understands.

Its algorithms simply aren't professional, and neither is its data. Please remove it from da-wiki; it's not suitable for our country.

Whatamidoing (WMF) (talkcontribs)

What language do you have your user interface set to (in Special:Preferences)?

Madglad (talkcontribs)

da - dansk

81.230.62.175 (talkcontribs)
Bandy Hoppsan (talkcontribs)

I agree, make this optional. And since this is all new (it was implemented just days ago), you can't yet have statistics for any significant amount of time to know whether the spike is a spike or not.

Amire80 (talkcontribs)
Madglad (talkcontribs)

So when neighbouring, mutually intelligible languages are not shown and not easily accessible for the average reader, this is recorded in the statistics as if there were no interest in reading the articles in the mutually intelligible neighbouring language. Strange logic.

The best way to collect statistics is first turn the system off, collect the statistics, then implement the system (after testing).

Amire80 (talkcontribs)

That's exactly what was done, as you can see on the statistics page. Statistics collection started before the system was deployed.

In June 2016, the number of clicks on the interlanguage links in the Danish Wikipedia out of the number of pageviews was 0.48%. In January 2017 it was 1.84%. Here's the full data:

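Danish Wikipedia, interlanguage link clicks as a share of pageviews:

Month             Clicks / pageviews
June 2016         0.4801%
July 2016         0.6607%
August 2016       0.4995%
September 2016    0.4867%
October 2016      0.9838%
November 2016     0.9653%
December 2016     2.2078%
January 2017      1.8352%
February 2017     1.4091%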

The deployment to the Danish Wikipedia was done on July 14. It took a couple of months until the percent actually started growing, but it jumped up in October, and even further in December.

The numbers for most other languages in the same range of pageviews as the Danish Wikipedia have a similar growth trend. Here is Greek, for example:

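Greek Wikipedia, interlanguage link clicks as a share of pageviews:

Month             Clicks / pageviews
June 2016         0.2812%
July 2016         0.2803%
August 2016       0.2444%
September 2016    0.2914%
October 2016      0.6072%
November 2016     0.8320%
December 2016     1.4014%
January 2017      1.2741%
February 2017     0.8271%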

(And here is the pageviews data.)

Both the user testing and the subsequent clicks data show that the compact list makes language links more easily accessible for most people. People who find it inconvenient can disable it.
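For anyone who wants to check the arithmetic, the monthly figures quoted above can be summarised with a few lines of Python. This is a purely descriptive sketch: the `summarize` helper is made up for illustration, the only inputs are the percentages from the two tables, and it is not a statistical test of anything.

```python
MONTHS = ["June", "July", "August", "September", "October",
          "November", "December", "January", "February"]

# Interlanguage link clicks as a percentage of pageviews, copied from the tables above.
DANISH = [0.4801, 0.6607, 0.4995, 0.4867, 0.9838, 0.9653, 2.2078, 1.8352, 1.4091]
GREEK = [0.2812, 0.2803, 0.2444, 0.2914, 0.6072, 0.8320, 1.4014, 1.2741, 0.8271]


def summarize(rates):
    """Compare the peak and the last month against the first month (June 2016)."""
    baseline = rates[0]
    peak = max(rates)
    return {
        "peak_month": MONTHS[rates.index(peak)],
        "peak_ratio": peak / baseline,      # how many times higher the peak is than June
        "last_ratio": rates[-1] / baseline,  # how many times higher February is than June
    }


for name, rates in (("Danish", DANISH), ("Greek", GREEK)):
    s = summarize(rates)
    print(f"{name}: peak in {s['peak_month']} at {s['peak_ratio']:.2f}x June, "
          f"February at {s['last_ratio']:.2f}x June")
```

Using June as the baseline is a choice made here because it is the last full month before the July deployment; a different baseline would give different ratios.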

Madglad (talkcontribs)

It might be that there is an increase in interlanguage clicks with fewer languages to select from.

BUT:

This is not the full statistics. A year is 12 months, and a statistical examination should contain data for at least 5 years to make a reliable test, in the multinomial distribution, of the hypothesis that the increase is due to the new system. Another test that could be interesting is of the hypothesis that there are seasonal changes; it might be that your numbers are just reflecting this. You need a professional educated in statistics at master's level. Just looking at some numbers and saying "There is a tendency" is something we should leave to tabloid journalism. In other words, the claim "the subsequent clicks data show that the compact list makes language links more easily accessible for most people" cannot be scientifically justified.

Most importantly: your data lacks information about the decrease (my assumption, for obvious reasons) in clicks on nn, nb and sv from da-wiki after these languages were removed.

The claim "People who find it inconvenient can disable it" relies on the false assumption that the majority of readers are experienced, registered users, not IP readers.

Do you have a link to the full data set you are using?

Bandy Hoppsan (talkcontribs)

@Amire80, I have been around many language versions and never seen it anywhere until some days ago.

I also like to have different languages for different kinds of articles, if this has to be mandatory (which I still think it shouldn't be). When I write about bandy, I want to have links to Swedish, Finnish, Russian and Norwegian. If I write about Japanese manga, I want to have links to Japanese, English and some other languages. If I write about something in Switzerland, I want to be able to easily compare German, French and Italian. This seems to be impossible with this system, which is one more reason to be against it.

Amire80 (talkcontribs)

@Bandy Hoppsan, I'm not sure why you haven't seen this anywhere until recently. This existed as a beta feature since 2014. It was enabled as a default (non-beta) feature in Wikipedia in all languages except Swedish, Dutch, French, German, and English in August 2016. In February 2017 it was enabled also in French, Dutch, and Swedish. German and English will soon follow.

People who want to always see the long list can disable the compact list in the preferences (Appearance -> Languages; Utseende -> Språk).

Madglad (talkcontribs)

"People who want to always see the long list can disable the compact list in the preferences" - under the false assumption that all readers are experienced, registered users and that there are not many IP readers.

Bandy Hoppsan (talkcontribs)

It should be disabled by default.

Madglad (talkcontribs)

Tell me: if an IP reader wants to read about the Norwegian city Bergen, there is a good chance that there will be more extensive articles in Norwegian Bokmål (no, which is almost identical to Danish) and Norwegian Nynorsk (nn, which is mutually intelligible with Danish). These languages are not listed. When not logged in, I get this list:

Another example, the Dutch city Utrecht, as IP:

In Europe, people in general understand many languages, and these languages should initially be presented based on neighbouring countries and etymological relations, not on wrongly collected statistics. It makes no sense to present languages that are big elsewhere in the world but that almost nobody in Europe understands.

This system is unusable in Europe.

Pepparkaksgubbe (talkcontribs)

So, when will this idea of compact language links be scrapped? If you don't want to make compact links optional, you shouldn't implement them at all! Forcing compact links upon everyone because of subjective and unscientific interpretations of statistics is ridiculous and insulting.

Whatamidoing (WMF) (talkcontribs)

Compact language links have always been optional for logged-in users. Go to Special:Preferences to enable or disable this tool.

Pepparkaksgubbe (talkcontribs)

So? The problem is, that this is something you have to opt out from. If it is to exist at all, it should be something you could opt in to. Compact language links should not be forced upon people who are not logged in or who haven't discovered how to get rid of it.

Madglad (talkcontribs)

Maybe, but this software obviously doesn't work in all regions, so disable it by default. What works for registered users is less important than making it work for all users.

Pepparkaksgubbe (talkcontribs)

The extensive list of languages cannot disturb anyone, as it lies in the left margin, and each of the languages is easy to find, since they are in alphabetical order and you can also see what they link to by hovering over them if you want to.

When the language links are cropped together, you have to go search for them in a square which opens over the text in the article, thus overlaying part of the text you are researching at the moment, and in this square you have to scroll and the links are not easy to find, partly because of the scrolling and partly since they seem to be in some random geographical order.

Whatamidoing (WMF) (talkcontribs)

You don't have to scroll. There's a search field at the top of the box. Start typing in the name of the language you want, and it will find it for you.

Without CLL, Bergen gives me three full screenfuls of languages in the sidebar.

Pepparkaksgubbe (talkcontribs)

So? It's still much harder to find the wanted language than to just have it in the left margin in the full list of languages, where it can be clicked immediately without searching or scrolling.

Whatamidoing (WMF) (talkcontribs)

If you can actually look at a list of 95 items in multiple scripts and "immediately without searching or scrolling" find a single item in it, then I'm impressed. Please consider uploading a screencast of how you "immediately without searching or scrolling" find items that are below the scroll; I'll send it to Design Research.

Madglad (talkcontribs)

No need to search; the language links are actually in alphabetical order. Scrolling through an alphabetical list is easier and more intuitive than searching through a pop-up list ordered first by some arbitrary geopolitical grouping, next by the alphabet of the language, and third by the name of the country.

This system should be turned off by default until it can be reviewed by people having studied linguistics and/or computer science subject like HCI and user interface design.

Whatamidoing (WMF) (talkcontribs)

I believe that in the actual user research (which was, in fact, conducted by people who have studied human-computer interaction and UI design), words like "more intuitive" and "alphabetical" never appeared in descriptions of the older system.

This is hardly surprising, because very few people can correctly tell you whether Í precedes or follows I, or whether the Hebrew ע‎ precedes or follows the Arabic equivalent – or, for that matter, whether ﻉ precedes or follows itself, since there are at least three different systems for alphabetizing that letter. It is alternatively the 16th, 18th, and 21st letter of the alphabet, depending upon whether you're writing Arabic or Persian, and which of the two Arabic systems you are using for ordering.

So it might "be" alphabetical, but it "seems" like a random, unsearchable, unintuitive jumble to most readers, which is doubtless why so many strongly preferred a system that let them search in their own languages.
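The point that "alphabetical" is not a single well-defined order can be illustrated with a small Python sketch. It is not how MediaWiki sorts the sidebar; the two orderings below (raw Unicode code points versus a hypothetical `strip_accents_key` that ignores accents) are only assumptions chosen to show that the same three names land in different places depending on the rule.

```python
import unicodedata


def strip_accents_key(name):
    """Sort key that ignores accents and case, so that 'Í' is treated like 'I'."""
    decomposed = unicodedata.normalize("NFD", name)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


# "\u00cdslenska" is "Íslenska", written with an escape so the precomposed letter is unambiguous.
names = ["Suomi", "\u00cdslenska", "Italiano"]

by_code_point = sorted(names)                        # plain code point order
by_base_letter = sorted(names, key=strip_accents_key)  # accents ignored

print(by_code_point)   # Íslenska ends up last, after Suomi
print(by_base_letter)  # Íslenska ends up first, before Italiano
```

Neither order is wrong; they simply follow different conventions, which is why a reader scanning the list cannot always predict where a given name will appear.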

Madglad (talkcontribs)

Is this research documented somewhere?

Pginer-WMF (talkcontribs)

We have been doing research around language selection at different stages. For example, you can check the recordings of the research we did with the initial prototypes, where we evaluated the general idea. The approach to language selection has also been part of Content Translation, where different research studies have been organised in the context of campaign support and template translation; in those studies, language selection has been a common activity.

We also used secondary research, such as the study on multilingual users, to get a better understanding of the number of languages users use. After the initial deployments, we are measuring data to better understand the effects on cross-language navigation.

Overall, it seems that finding a language you are looking for is easier in a short list than in a long list, but even for the cases where the language you look for is not there (which is an edge case that will only happen the first time but not the hundreds of times you'll access that language later) users can figure out that more languages are available and search helps them to find the language they look for. Data confirms that the changes not only do not introduce any barrier for users to navigate across languages but that such navigations increased after the change.

As mentioned above, the pattern of language access normally involves accessing a few languages many times. The main time-saver is the fact that, after the initial selection, you will have the languages you use in a short and convenient list. Trying to optimise the first-time experience is beneficial, but I don't think it should be the focus of the discussion about the feature, since the purpose is to improve the much more common recurrent use.

Madglad (talkcontribs)

It seems that Danish Wikipedia has now been set up to automatically include Norwegian Bokmål and Swedish, which is reasonably enough.

But some questions:

Who has taken this decision? I haven't seen it discussed on dawiki.

Why isn't Norwegian Nynorsk included in this list of default languages? Norwegian Bokmål is just one of the two Norwegian writing standards.

German is a minority language in Denmark according to the European Charter for Regional or Minority Languages, and besides that, most pupils in Denmark learn German in school. Same question: why isn't German in the list of default languages?

Amire80 (talkcontribs)

German is definitely supposed to be included in the initial list, because it's defined as one of Denmark's languages in CLDR, along with Faroese and Kalaallisut.

I'm not sure what you mean by this part: "Danish Wikipedia has now been set up to automatically include Norwegian Bokmål and Swedish". It has been possible for a long time to define preferred languages for a wiki. This existed before the Compact Language Links feature, and Compact Language Links respects this setting. To see the current definitions, go to https://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php and search for "wgInterwikiSortingSortPrepend". I don't see a definition for the Danish Wikipedia there. It can be added if the Danish Wikipedia community requests it, although I suggest adding no more than three languages there, because it will take over users' selections.

Dyveldi (talkcontribs)

CLL was turned on at Norwegian WP almost a year ago. I did not turn it off. I wanted to see what happened. The system has never been able to figure out which languages I visit most.

The system still tries to convince me that I visit Arabic, Chinese and Russian. Which I might have done due to the fact that the languages are presented first. The system influences my choice, but I do not seem to be able to influence the system. The system in this way is self confirming, it makes choices for me and then assumes it is my choice.

The system has figured out that I visit English, but I guess I have that in common with almost everybody reading Norwegian, so it is not a personalized choice. It is everybody's choice.

The system has not been able to figure out that I visit Italian at least as often as French. The system "thinks" I visit Spanish and Portuguese in spite of my more frequent visits to Italian.

The system has not been able to figure out that second to English I visit German. If I look at an article in another language I most often look at the English article and second I visit the German article. I visit German almost as often as I visit English.

The system does not reflect my personal work pattern. The system is not personalized to me. Calling this personalization is a sham and a shame. I have no idea who it is personalized to, but it most certainly is not me. It seems to be automated guesses, based on centralized assumptions about what I might do, made without consulting me, and they have nothing to do with being personal.

If I read without logging in, it gets worse. It most certainly does not reflect the languages that are often used by persons reading from Norway. The system cannot possibly reflect work patterns from IP addresses in Norway.

Amire80 (talkcontribs)

It doesn't guess your most frequent choices, but your last choices. It doesn't keep a count. So if you visit Italian frequently, but you clicked Spanish recently, Spanish will be remembered, and Italian may be removed from the list of recent languages. It's actually very simple, and there's no super-smart AI behind it, the behavior is documented in the FAQ, and of course all the code is open. It's possible that it can start counting the frequency, too, but doing it in a way that respects users' privacy may be tricky.
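The difference between "last choices" and "most frequent choices" can be shown with a minimal sketch. This is not the actual ULS code (which is JavaScript); the `remember_choice` function and the limit of three remembered languages are assumptions made up for the illustration.

```python
def remember_choice(recent, code, limit=3):
    """Return a new most-recent-first list after the user clicks `code`.

    No counts are kept: a language clicked once just now outranks a language
    clicked many times earlier. `limit` is a hypothetical cap on the list size.
    """
    updated = [code] + [c for c in recent if c != code]
    return updated[:limit]


recent = []
# Italian is clicked three times, then English, German and Spanish once each.
for click in ["it", "it", "it", "en", "de", "es"]:
    recent = remember_choice(recent, click)

print(recent)  # ['es', 'de', 'en']
```

In this run Italian has dropped out of the list even though it was clicked most often, which is the behaviour described above. Keeping per-language counts instead would mean storing more about each reader's history, which is where the privacy concern comes in.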

Do you have data about what languages people in Norway use? Currently the CLDR data lists only Bokmål, Nynorsk, and Northern Sami, and I agree that this list is definitely too short. It should probably have at least English; for example, Denmark has English, even though it's clearly a second language for most people there. If you have a census with information about languages that are spoken in Norway, it should be submitted as a fix suggestion. (Nemo bis, didn't you do something like this? Or maybe Norway wasn't included because it's not in the EU?)

Dyveldi (talkcontribs)

I have looked at the list since this was introduced on Norwegian WP. It has never reflected my last choices. It keeps including languages I have not used recently, some of the languages included I might have visited once or twice the last year. Languages I have used recently are to a large extent not included. I have kept an eye on the list for almost a year and it has never had anything very much to do with what I do.

Today I even had a list of languages where English was excluded. I visited Mozilla Firefox, and English is not one of the "chosen" languages on today's "menu". Since English is by far the language I visit the most, this goes to show how little the links have to do with my usage pattern.

Pepparkaksgubbe (talkcontribs)

When will the compact list be done away with? I see absolutely no good reasons given above for this to be forced upon the readers of Wikipedia. Wrongly interpreted statistics and unreferenced assumptions about what the "typical" reader wants are no good reasons. If some people still want this, then make it an opt-in option. Don't force it upon everyone.

Eduarodi (talkcontribs)

I use Wiktionary because I love languages, and I am interested in seeing how words are said in different languages. So I'm finding it rather tiresome having to deselect this option in every language Wikipedia I visit. I wish it could at least be possible to switch this option on and off for all languages at the same time.

Pepparkaksgubbe (talkcontribs)

So, when will you make this optional across all Wikipedia language editions? We are waiting. No one but a few people likes the extra clicking this forces on you.

Fano (talkcontribs)

I strongly support making this feature optional, not default. For reading, it prevents a good overview and hinders cross-wiki comparison. For writing, it makes small cross-wiki improvements (where no language skills are needed, e.g. pictures, numbers,...) and the search for good sources (for improving the articles in the languages we write in) more complicated. --~~~

Amire80 (talkcontribs)

Statistics show that when this feature is enabled by default for anonymous readers, they click the links more frequently in all languages.

If you find this inconvenient, you may disable it in the preferences.

Madglad (talkcontribs)

What is the formal procedure to start a referendum about this question?

Mautpreller (talkcontribs)

Same question from me. I think that this feature is counter-productive, especially for new users. It may be useful as an opt-in for people who know how to deal with such an algorithm, but it is definitely patronizing for people who don't. A "Meinungsbild" (RfC) should be started in all Wikipedias to find out whether the communities really want this.

Pginer-WMF (talkcontribs)

One of the main aspects we wanted to check during the user research we did for this feature was whether users were able to find the languages they are interested in even when those were not initially surfaced in the short list. We found that there was no problem in discovery, and that features such as the flexible search (where you can just type "español" or "spanish" without having to figure out whether your language is next to those starting with "S" or with "E" in a long list) were helpful.
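As a rough illustration of the flexible search described above, here is a minimal sketch, not the actual ULS implementation. The `LANGUAGES` table and the `search_languages` function are hypothetical names invented for this example, and prefix matching on the code, the autonym, and the English name is an assumption about how such a search could work.

```python
# Hypothetical sample data: language code -> (autonym, English name).
LANGUAGES = {
    "es": ("español", "Spanish"),
    "de": ("Deutsch", "German"),
    "af": ("Afrikaans", "Afrikaans"),
    "it": ("italiano", "Italian"),
}


def search_languages(query, languages=LANGUAGES):
    """Return the codes whose code, autonym, or English name starts with the query."""
    q = query.strip().casefold()
    if not q:
        return []
    return sorted(
        code
        for code, names in languages.items()
        if code.casefold().startswith(q)
        or any(name.casefold().startswith(q) for name in names)
    )
```

With this table, typing either "español" or "spanish" leads to the same entry, so the reader does not need to know under which letter the language would be filed in a long list.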

Users often look for the small set of languages they know. These users know which are those languages, so the feature is unlikely to drive them to a different set of languages. For example, users not speaking Italian won't be likely to go and read the Italian Wikipedia version of an article even if it appeared in a list (in the same way that non-Afrikaans readers rarely feel tempted to read an article in Afrikaans just because it appears on top of the alphabetical list).

What the feature does is acknowledge that multilingual users often switch across the small set of languages they know, and make those languages easier to find, without having to visually scan a potentially really long list of languages ordered by their ISO code and combining scripts they may not know, every single time.

The feature also includes other fallback approaches for when the user has not made previous choices, to make the list more meaningful than a plain alphabetical, ISO code-based one, but optimising for repetition is the priority and is what makes the whole cross-language navigation more fluent. The old list ignores that, and this is harmful for small languages. When articles are available in small languages, they are also available in many others. So most of the time users have to scan a long list only to discover that their language is not there, which discourages them from trying again when the article does happen to be available in their language.

C933103 (talkcontribs)

But it would hinder users' ability to reach languages that they were not initially interested in.

Mautpreller (talkcontribs)

"One of the main aspects we wanted to check during the user research we did for this feature was whether users were able to find the languages they are interested in even when those were not initially surfaced in the short list." How do you know which languages the users were interested in? Did you ask them?

"For example, users not speaking Italian won't be likely to go and read the Italian Wikipedia version of an article even if it appeared in a list." This is a misguided assumption. Why shouldn't they want to see if there are other sources, or pictures, than in their own language?

Instead of simply improving the search for the language versions, you prefer to tell the users what "they want". Top-down instead of bottom-up, patronizing instead of empowerment.

Pginer-WMF (talkcontribs)

Thanks for sharing your thoughts, @Mautpreller. I'll provide more detail about the different points you mention:

How do you know which languages the users were interested in? Did you ask them?

Yes. We have tested the feature in different stages and contexts: we tested initial prototypes, the developed feature supporting interlanguage links and also its use for language selection in Content Translation. When recruiting users for our research sessions we asked for the languages they know to get a diverse group of testers. In some tests we proposed some specific languages including some of those they know and in other tests we just asked them to look for their languages in general.

This is a misguided assumption. Why shouldn't they want to see if there are other sources, or pictures, than in their own language?

Note that I wrote that "users not speaking Italian won't be likely to go and read the Italian Wikipedia version". I'm aware that there are many other activities where it makes sense to visit versions of an article in a language you don't understand. However, I think that cross-language navigation to read content is an important enough scenario to provide some better support for it.

Instead of simply improving the search for the language versions, you prefer to tell the users what "they want". Top-down instead of bottom-up, patronizing instead of empowerment.

I don't understand this point. The languages that are surfaced do not come from "us"; they come mainly from the users themselves in different ways:

  • The main criterion for surfacing languages is the explicit previous choices the user makes. Each time the user navigates across languages, the user is selecting the languages that will appear next time.
  • The languages of the user's browser are also considered. Those languages are selected by the user directly (through configuration) or implicitly (by installing a specific language version of their browser and keeping the defaults).
  • The user's Babel box is also considered as an indicator of the languages the user knows. This is added by users to their user pages to communicate the languages they speak.
  • Each wiki can configure some relevant languages to be considered. This is a community decision, and it was already happening with the old system where some communities showed related languages on top of the list.
  • When there are no other clues, geolocation is used based on the language information in CLDR, which is a crowdsourced language-related repository to which everyone can contribute.
  • Finally, there are several other low-priority criteria that come from the article content (e.g., being a featured article) or, lacking all of the above, from statistics on the most spoken languages.

In contrast to the previous situation where languages that appear at the top of a long list are based on the ISO code, I think the current approach takes much more user input into account for them to make a choice.
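The criteria listed above can be read as a priority-ordered merge of several language sources. The following is a minimal sketch under that reading, not the extension's actual code: the function name `compact_list`, its parameter names, and the strict source-by-source ordering are assumptions made for illustration. The default limit of nine reflects the "up to nine languages" figure mentioned elsewhere in this discussion.

```python
def compact_list(
    available,
    previous_choices=(),
    browser_languages=(),
    babel_languages=(),
    wiki_configured=(),
    geo_languages=(),
    fallback=(),
    limit=9,
):
    """Pick up to `limit` language codes, consulting the sources in priority order.

    `available` is the set of languages the article actually exists in.
    The ordering of the sources is an assumption taken from the list above.
    """
    picked = []
    sources = (
        previous_choices,   # explicit earlier clicks by the user
        browser_languages,  # browser language settings
        babel_languages,    # Babel box on the user page
        wiki_configured,    # languages configured by the wiki community
        geo_languages,      # geolocation-based guesses (CLDR data)
        fallback,           # e.g. featured articles, most spoken languages
    )
    for source in sources:
        for code in source:
            if code in available and code not in picked:
                picked.append(code)
                if len(picked) == limit:
                    return picked
    return picked
```

For example, `compact_list({"en", "de", "nb", "fr", "es", "ja"}, previous_choices=["de", "en"], browser_languages=["nb", "en"], fallback=["en", "es", "fr"], limit=4)` yields `["de", "en", "nb", "es"]`: earlier choices come first, and lower-priority sources only fill the remaining slots.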

Dyveldi (talkcontribs)

Pginer-WMF claims that "The main criterion for surfacing languages is the explicit previous choices the user makes. Each time the user navigates across languages, the user is selecting the languages that will appear next time."

This function does not work. I have tested your system for over a year and the system has never been able to understand that I visit German second to English. Some days I visit German several times a day. In spite of this, German does not show up in the sidebar (with a few exceptions, i.e. less than once a month). The system does in no way reflect which languages I have visited. It does, however, show several languages which I do not visit at all.

Amire80 (talkcontribs)

This might be a bug, but I need more information about this. Did you try testing it according to the instructions that I posted in another reply?

Mautpreller (talkcontribs)

Last point first: The crucial point is not which data you are using. They may "come" from the users but the users lose the autonomy to do what they want with them. The crucial point is that the readers themselves should be able to make a free decision (empowerment). To do this they need all information available. Which means: they absolutely must have access to a complete list of language versions. My problem is not that you try to make this choice easier by making the list easier to survey or by improving the search; my problem is that you deny them access to a simple list. This is patronizing, pure and simple.

As to the testing: I still do not understand. Did you try to find out what the users want to do with the interwiki links? Did you ask them: how do you use these links, for what purpose? Did you ask them whether this purpose was better served with a complete list or an automated suggestion? In short: did you use users' explicit interests as the starting point for your testing, or did you use your assumptions about what they might want? I gather that it was the second alternative. Did you ever try something like en:Action Research?

Another question: if the initial list is unsatisfactory (which will usually be the case), why don't you offer a choice "see complete list", alphabetic, in the language of the wiki they are just using, without any categorizations as to continent, in addition to a search function? You seem to think that users are always "looking for" a special set of languages. However, this assumption is not justified. They simply could want to see in which languages an article exists. They might like to discover articles and language versions they didn't know about. They could do this in a simple way in the current layout; now they are forced to adapt to your assumptions about what they "want to do".

Pginer-WMF (talkcontribs)

I agree that readers should be able to make a free decision, but I don't see how we are limiting that. One of the most effective ways to hide something is to surround it by hundreds of other things. With the previous approach, many users had the few languages they are interested in buried among hundreds of other options that are not relevant to them. I don't think that a language in the middle of a flat list of hundreds of words gives users the chance to make a free decision.

Research showed that users are able to understand that the article was available in more languages than the ones surfaced initially and they were able to reach the full list. So they can reach the path to any language they decided to select.

Regarding your research questions, we focused on content consumption, which is a common need for most of our users, but we also tested it with translators, who are often advanced editors. We learnt from their needs, but given the diversity of our users we are open to hearing about new use cases and needs to adapt to.

I don't think that "users are always 'looking for' a special set of languages", but we know that multilingual users move across languages frequently to read and edit articles (this is an interesting study on the subject), and helping them to make their repeated use easier facilitates and encourages that navigation, as data shows.

People who "want to see in which languages an article exists" can easily tell how many languages it is available in, because they now have a counter, whereas they had to count the links manually in the previous design. They can still access all languages through the panel, and knowing that "Wolof" is in the "Africa" group does not seem to hurt that curiosity; rather, it provides context if the user has not heard about the language before.

In any case, there is still a way to get a flat list of language links by clicking on "edit links", which can serve other use cases.

Mautpreller (talkcontribs)

Somewhat strange, your answer. I hope I understood correctly that you did not ask the users what they want to do with the interwiki links, or, in terms of action research, you did not permit them to do their own relevancy settings. You took your own assumptions as to the "use cases" as a starting point for your research-and-design activities.

It goes without saying that users know that there are more interwiki links if you write "68 more languages". But it is not true that there is a simple way to get a complete list. "Edit links" is about the least intuitive command to get a full list, let alone that you are directed to a totally different project (viz. Wikidata) with totally different features. Why not simply write "complete list" and show this complete list in the sidebar? In this case especially new users have the choice. They don't have this choice if you hide it from them.

For experienced users there are many possibilities to deal or cope with the feature, to find workarounds, to switch it off, etc. New and inexperienced users are forced to take what they get. The simple interest to know in which languages (not: in how many languages) an article exists is not addressed, you make it as difficult as possible to get the answer.

I confess I do not like the translation tool either. Not because I have problems with translation but because it is a source of international homogenization. It is good (and not bad) that different language versions provide different solutions to the task of writing an article. But that is not important for my argument.

Pginer-WMF (talkcontribs)

Sorry if I mixed several concepts in my answer and created confusion. Just to add some clarity:

- We asked users about their different use cases and how those were supported. In fact, this specific project started from ULS, a more general project to support language-related settings and support. Accessing content for reading and editing was a very significant use case to focus on and improve.

- When I mentioned that users can easily access all languages the article was available in, I was referring to accessing them through the panel that is available through the "X more" button. All languages are there for users to look at, and research shows that people interested in it can reach the list. What I wanted to clarify at the end was that even if there are some use cases for which a flat list is preferred, such a list still exists through Wikidata.

I think that in order to understand problematic use cases that we are not solving well, it would be helpful to have some specific examples, illustrating which options were provided, what a user tried to achieve, and in which way that was not possible or was problematic to do. If you have experienced some of those, it would be useful to hear the details.

Thanks

Mautpreller (talkcontribs)

I can tell you some "use cases". One "use case" is that you want to know if any language (any!) has interesting pictures, possibly locally stored, that could be of use. Another "use case" is that you want to know about sources. Are there really sources about Mozart in Azerbaijani? What might they be? Or did they simply use the English ones? Or do they read German? Is there anything like a language bias? Or: I know that one source I read is bullshit. But it is easily accessible. In which languages, I wonder, will this source be used? Another, very realistic use case: if you do translations in your job, you often want to know what a word (lemma) is called in another language. Interwiki links are a fantastic source for this. However, they are not if you limit their use beforehand.

For all these "use cases" and for most others, your initial selection is simply useless. I, for my part, was born in Bavaria and I am living and working in Bavaria now, but I have no interest whatsoever in Bavarian articles. But, depending on the subject, in a great mass of other languages.

However, by far the most important "use case" is sheer simple idle curiosity (which is, I want to add, the primary source of all knowledge acquisition). If I see a long list of Mozart articles in languages that I never even heard of, I want to have them on my display. I never knew that there are two variations of Belarusian. Will they be different from each other? In what respect? There are obviously articles about the Second World War in all languages of all states that were involved in this war. This is interesting. How might they tell the story? Now, any new user sees this at first glance. He can browse, he can go on a discovery trip. With your tool, he won't see this at all. If he comes from my country, he will be reduced to the dull selection your algorithm gives him: Bavarian, Plattdütsch, English and so on. Idle curiosity is, I admit, not really a "use case", but I can't imagine anything that is more important to the Wikipedia projects. I, for my part, am totally capable of switching your patronizing feature off, working around it, using Wikidata. But no one who does not have intimate knowledge of the Wikipedia system can do this. You take this simple possibility away from him. I call this a disgrace. If you put your program at the users' disposal for those who want it, everything is fine. If you force new users to use it, nothing is fine.

I understand that this feature will be activated no matter what anybody says. Well, why do you not provide a button "see complete list", in alphabetic order, without preconceived "categories"? In this case, it would still be relatively simple to take a look at the wealth the worldwide Wikipedias have in store.

Amire80 (talkcontribs)

OK, this turned out to be a very long reply, because I wanted to take the proper time to articulate it, as I appreciated your words and your tone. Here’s a summary:

  • All these use cases are relevant, but they are not hindered by Compact Language Links, at least for most readers.
  • Logged-in users, who have advanced needs about languages and who feel that this is inconvenient can disable the feature.
  • Your suggestions towards the end are valid, though probably only useful for some particular groups of users. I filed them as a feature suggestion to explore later.

For all of the use cases in the first paragraph—the languages are all accessible. I agree with you when you say “Interwiki links are a fantastic source for this.” This is why they are just compacted, certainly not limited.

Moreover, for most people they are more accessible. Let me explain why. We can compare three ways of showing interlanguage links, and this is backed by data:

  1. Showing them as they are now: the whole list immediately when the article is opened.
  2. Compact Language Links: Showing them as a compact list of up to nine languages, with a button that shows the rest with a search box, in which you can search in your language.
  3. Hiding all of them until the user clicks a button to show the whole list.

If option #3 doesn’t sound familiar, it’s because it doesn’t exist any longer. It was used for several weeks in 2010. Analysis proved that showing no languages immediately when the page is loaded causes a ~75% drop in the number of clicks. This drew severe, and justified, criticism, and after several weeks the code was changed so that by default the whole list would be shown, with the option to collapse it. Later the option to collapse the whole list was removed completely.

At the same discussion that caused the reverting of the full collapsing feature in 2010, a proposal was made to show a compact list. Very briefly, the hypothesis was that not showing any languages initially makes it harder for readers to realize that any other languages are available at all, but showing a short list of languages makes the readers aware that some languages are available. It’s impossible to guess the perfect languages to show initially without any user input, but the point is not to guess them correctly; the point is to make the user aware that some other languages are available and to make it easier for the user to find the language they need.

In 2017 this is no longer a hypothesis. Now we know from the data that we collected that showing a compact list makes more people click on links. This applies to all the languages: "small" ones and "big" ones. All languages now receive more clicks, whether they are shown in the initial compact list or not. In some languages, the percent of people that clicked on language links went from ~0.3% to above 1% or even 2%. (FWIW data is open and everyone can freely analyze it again.)

So, of the three options above, we know that option 3 is the worst, option 1 is in the middle, and option 2, the compact list, is the best, because it's the one that is the most convenient for millions of readers. We know this; it's not a guess, an impression, an opinion, or a hypothesis. It's data.

Hence, I am saying, that for most people the languages are not just accessible, but more accessible.

Of course, if you feel that this is less convenient for you, you don’t have to use it, which is why it is a preference that can be disabled.

Showing languages in alphabetical order is impossible because they are written in different alphabets. Showing all of them in German will make it possible to show them in alphabetical order and it will possibly be convenient to somebody who knows German, but it will make it completely unusable for people who don't. I filed this suggestion as a feature request, because clearly it will be useful for at least some people, for example the case for a German Wikipedia reader or editor that searches for images, or checks if the article is long or has sources.
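To make the sorting problem above concrete, here is a small sketch. It is only an illustration with a hypothetical `NAMES` table and `sort_codes` helper, and it uses plain Unicode code-point ordering rather than proper locale-aware collation. Sorting by autonym clusters the entries by script, while sorting by the German names gives one alphabetical sequence that is only readable for someone who knows German.

```python
# Hypothetical sample data: each language's own name and its German name.
NAMES = {
    "de": {"autonym": "Deutsch", "in_german": "Deutsch"},
    "ru": {"autonym": "русский", "in_german": "Russisch"},
    "ja": {"autonym": "日本語", "in_german": "Japanisch"},
    "el": {"autonym": "Ελληνικά", "in_german": "Griechisch"},
    "en": {"autonym": "English", "in_german": "Englisch"},
}


def sort_codes(codes, field):
    """Sort language codes by the chosen name field (simple code-point order)."""
    return sorted(codes, key=lambda code: NAMES[code][field].casefold())


codes = list(NAMES)
by_autonym = sort_codes(codes, "autonym")    # Latin, then Greek, Cyrillic, CJK
by_german = sort_codes(codes, "in_german")   # one alphabet, German names only
```

Here `by_autonym` is `["de", "en", "el", "ru", "ja"]` and `by_german` is `["de", "en", "el", "ja", "ru"]`; the two orders differ only because the names come from different scripts in the first case.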

Mautpreller (talkcontribs)

Hi Amir, three points: First, you say that a rising number of clicks proves that people find the language they are looking for more easily, that this is more "convenient" to them. I don't agree. This might also mean that they more often click on languages they are not interested in, only to find that these language versions are not interesting. To measure successful search means to ask people: did you find what you wanted? Was this way to present the languages more convenient to you than the former one?

Second, I can't see that you tested the combination of search function and plain list, or else the combination of compact list, search function and plain list. The problem with Wikidata is that you have to understand it (which is hardly possible for newcomers) and that the list given there does not even provide mouse-over information but only a cryptic abbreviation. Your argument about the alphabets ignores that there are transcriptions and transcription rules. Every library catalogue has used them for many years. Why shouldn't a plain list in alphabetical order be offered?

Third, the initial selection is hardly useful and the presentation according to continents next to useless. There are extremely few persons who really want to read articles in a German dialect - a dialect that is usually not written at all but only spoken. Of course you can say: let's help small languages, but this is in conflict with the usability issue. Moreover, the continent classification does not help at all. You find English in every continent division - but why should you look for English under "Middle East"? You got it already under "Worldwide". This is only confusing, much more confusing than a plain list. Of course you can always say: no matter how good the initial list is, it's only important that there is a small list. But you can do those things better or worse. You could include the "featured articles" in a first selection. You could use the article categories. You could use the languages that are really most spoken in this country. And most of all, you could throw away that "continent" display and simply show all other languages. Maybe like in Wikidata but with mouse-over to show their names.

The search function, however, is the best thing of this program.

Amire80 (talkcontribs)

About the first point: On one hand, I have to disagree with the general premise: Why would people click more frequently on languages that don't interest them? Common sense says that people click on links that do interest them, especially when we're talking about languages. People usually have no interest in clicking on languages they don't know.

On the other hand, I do agree that more qualitative research has to be done, and in fact, the search engineers are already experimenting with such things, as you can see in this recent blog post: Admittedly loopy but not entirely absurd—Understanding our Search Relevance Survey. This is about search results and not about interlanguage links, but yes, we should do something similar with language links, too. There are always things to improve.

About the second point: testing the search function with a plain list is a valid idea and we may do it some day. In general, however, a plain list has several problems, which were already mentioned in several places, for example that it is sometimes longer than the article itself, and that it mixes many different writing systems.

Another problem with the long plain list, and this is related to your third point, is that it's generally hard for most people to perceive and process, and this is something that was, in fact, tested in live sessions with a selection of users. The primary idea of the compact list is not to show the most likely languages; we do try our best to do this, but it's just impossible. The primary idea is to make the reader aware that the article is available in some other languages. The compact list has a more-or-less predictable size and location on the page, and it self-adapts to the user after the user clicks on languages that interest them.

The same goes for the division to continents: it's not so much for finding a language under a particular continent, but for dividing the list into smaller parts that will be convenient to read, and most people don't scroll through the list anyway, and instead use the search box.

Mautpreller (talkcontribs)

You didn't get my first point right. Users may not be "interested" in clicking on a link, but they might do it all the same because it is there or because they are confused about the continents thing. More clicks cannot be equated with "interest better served". It is only a very weak indicator.

The division to continents is not simply a "division". The big languages are several times repeated! This is a virtually unusable feature. Again: why don't you give the choice to show a complete list? As an IP it is impossible now to get it. And the alphabet can be used also if there are multiple writing systems. Every library catalogue has been doing that for many years. There is a thing called transcription.

Amire80 (talkcontribs)

I have to stand by what I wrote earlier: Why would people click on languages that don't interest them? I get it that Wikipedians love clicking curious things (I certainly do), but most people aren't Wikipedians. Most people just want to read what interests them, as quickly as possible, and most people aren't in the business of randomly clicking languages they don't know. And people who are curious about different languages can still click on all of them; the compact list makes it easier, not harder. So, we cannot know this for sure, but it makes much more sense to me that if more people click on language links, then it's because it's easier for them to find the language that they need. Yes, correlation doesn't imply causation, but the difference between what happened with the complete hiding of the list in 2010 and with compact language links is too striking to be dismissed: complete hiding brought the number of clicks down 70%, and compacting brought them up.

I disagree that the fact that languages can be repeated in several regions makes the feature unusable. The language name is seen, and it can be clicked. It does look quite ugly, however, so I agree that it should be fixed to make it nicer. See https://phabricator.wikimedia.org/T41921

Making it possible for anonymous users to see the long list is also filed as a request. We should consider it, but an important point here is that "just adding a button" will cause visual clutter, and clutter is a problem, especially in a feature for people who may not know the language of the wiki at which they are looking. Despite this, I do plan to do design research about this, and maybe make it possible if a good design solution is found.

Finally, about alphabetical sorting. Yes, library catalogues do it, in very large and multilingual libraries, and there are transliteration systems used by linguists. But the vast majority of Wikipedia readers, and web users in general, are neither librarians nor linguists. The goal of interlanguage links is to make wikis in all languages more accessible, and compacting the lists achieves this without teaching hundreds of millions of people about transliteration and cataloguing.

Madglad (talkcontribs)

Yes: I am sure that, for example, the two Indonesian Wikipedias get more hits from Denmark now, because people hit the wrong line. But people will hit the back button next, as almost nobody in Denmark understands Indonesian.

And: I know I can etc... - but I am talking about IP readers, so I will still not use this feature; I want to see what trouble it causes the average reader.

The search box: If I need another language version, I go via the English or German Wikipedia. No one has dared to install this tool there.

Let's take a look at de:Deutschland: there is a good list of links there, but not in the right alphabetical order. They seem to be sorted by the English name of the country, but the order should be based on the language of the Wikipedia.