User:Henning (WMDE)/Wikibase/Concepts/Aliases

Essence: The term "alias" as used in Wikibase is wrong, or at least misleading. The concept of Label and Alias(es) should be changed, and the term "alias" should be removed.

What is an "alias"?
In general, an alias is defined as a "false or assumed identity", as the following examples demonstrate:
 * Being an alias of Eric Arthur Blair, George Orwell may be regarded as such a false or assumed identity.
 * The country is officially named Federal Republic of Germany. Calling it Germany uses an assumed identity, since Germany may, depending on the context, refer not to the current political entity but to some historic one.

This definition, however, has its limitations when referring to subjects that do not have a documented identity. How would one specify some kind of identity for the Earth, which may also be called the world, Gaia, etc.? Earth may be the most common name for the planet -- but why should the other names of the Earth be regarded as aliases in the sense of "false or assumed identity"?

Moreover, to qualify the initial examples: who decides what is a real or "official" identity, and who gets to decide what the "most common" name is?

In computing, the term "alias" is of a generic nature: an alias is an "alternative name". In that generic sense, there is no actual difference between the original name and an alias. Both are equally valid. Evidently, this is the origin of the term "alias" in Wikibase.
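As a minimal illustration of that generic sense (Python is used here only as an example, and the variable names are made up), two names bound to the same object have equal standing; neither is the "original":

```python
# Two names bound to one and the same list object.
names = ["Earth"]
also_names = names  # an alias in the computing sense

# A change made through either name is visible through both.
also_names.append("Gaia")

print(names)                # ['Earth', 'Gaia']
print(names is also_names)  # True
```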

Wikibase and aliases
To ease searching for and identifying Entities, a Label is intended to be the most common name under which the subject represented by the Entity is known. This, however, results in a discrepancy between real-world perception and the digital model, because real-world aliases are not meant to map to Wikibase Aliases:
 * Since George Orwell is the name under which Eric Arthur Blair is most commonly known, an Item representing the author would, following that intention, receive the Label George Orwell while Eric Arthur Blair would be listed as an Alias. This, however, contradicts the common perception that an author's pen name is an alias of the author's birth name.
 * In the same sense, Germany would be the Label of an Item representing the Federal Republic of Germany, with the latter being listed as an Alias.

The result of the discrepancy between reality and digital model is confusion: Users of the software are forced to understand the concept of aliases as it is perceived in the limited context of computing.

Oddly, the technical implementation does not resemble the term's technical origin. If it did, a Label would be a pointer to a map of one or more names, or every Alias would be technically equal to the Label. Instead, the Label points to a dedicated object while the concept of Aliases points to another single object that contains a list of alternative names.
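The contrast can be sketched as follows. This is a simplified, hypothetical model of the terms of one Item in a single language, not the actual Wikibase data structures; the keys "label" and "aliases" and the helper all_names are chosen for illustration only.

```python
# Current shape (simplified): one dedicated value plus a separate list.
current = {
    "label": "George Orwell",          # the single emphasized name
    "aliases": ["Eric Arthur Blair"],  # container of alternative names
}

# Shape suggested by the computing sense of "alias": all names are
# peers in one collection, none of them structurally special.
computing_sense = {"George Orwell", "Eric Arthur Blair"}


def all_names(terms):
    """Collect every name of the current model into one set."""
    return {terms["label"], *terms["aliases"]}


# Both shapes carry the same names; only the structure differs.
print(all_names(current) == computing_sense)  # True
```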

Why aliases?
Since Aliases are just a container for alternative Labels that are not the most common name of a subject, the intention behind the concept of Aliases is obvious: instead of presenting a whole list (maybe even a randomly ordered map) of Label alternatives, emphasizing the one name most commonly known eases finding and identifying an Entity in the user interface. Usability benefits considerably from having one particular name stand out.

Avoiding the term "alias"
The current status of the term "alias" in Wikibase is rather awkward. The actual code makes extensive use of the term, as Aliases are a dedicated concept (see Alias Group). Aware that the term is problematic, the user interface tries to, well, use aliases for the term "alias", most prominently the phrase "also known as". Since such wording is used exclusively in the user interface, however, the discrepancy between user interface and, for example, API usage remains.

Struggle for consistency
'''The current implementation of aliases in Wikibase not only foils its own intention. It also conflicts with user expectation.''' There should be no need to use the term "alias" or "also known as", since the names captured in that concept may be assumed to be Labels as well -- just (probably) not the one most commonly used (but, again, who defines "most common use" anyway?). It is wishful, probably even naive, to allow only a single term as Label, as this contradicts not only the technical intention of aliases but also the real-world perception of aliases -- each on a different logical level of the software.

Technical alternative
The technical solution is actually a simplification. Instead of having a dedicated object for the Label and a dedicated object for a list of Aliases, the concept of Label should point to a simple list of names. This structure alone would demonstrate that the term "alias" is by no means about real-world aliases, and it would force the term "alias" to disappear. With Label pointing to a list of names, emphasis may still be put on the first name. Emphasis in that context is solely a user interface matter.
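A minimal sketch of that idea, assuming a hypothetical Label class that is not part of Wikibase; the method names emphasized and others are invented for illustration:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Label:
    """Hypothetical replacement for Label plus Aliases: one list of names."""

    names: List[str] = field(default_factory=list)

    def emphasized(self) -> Optional[str]:
        # By convention, the first name is the one the user interface emphasizes.
        return self.names[0] if self.names else None

    def others(self) -> List[str]:
        # All remaining names; structurally they are Labels like the first one.
        return self.names[1:]


label = Label(["Germany", "Federal Republic of Germany"])
print(label.emphasized())  # Germany
print(label.others())      # ['Federal Republic of Germany']
```

In this sketch nothing in the data distinguishes the first name except its position, which matches the point that emphasis is only a matter of presentation.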

Imposing an order on the set of names, however, may result in conflicts. To address that, the concept of Label may instead point to a map with one item flagged by some neutral "emphasize" marker. However, rendering multiple names will always result in some visual order (even though it may be different every time the page is loaded). And since the sole purpose of Label is to ease finding and identifying Entities, the implementation of Label should be oriented toward the visualization, which can only be a list.
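The map variant, and the reason it still ends up as a list, can be sketched like this. The function render and the boolean flag are assumptions made for illustration, and sorting the remaining names alphabetically is an arbitrary choice standing in for whatever order a user interface would impose:

```python
def render(names):
    """Return the names in display order: flagged name first, rest sorted."""
    flagged = [name for name, emphasize in names.items() if emphasize]
    rest = sorted(name for name, emphasize in names.items() if not emphasize)
    return flagged + rest


# A map of names with exactly one entry flagged for emphasis.
earth = {"world": False, "Earth": True, "Gaia": False}

# Whatever the storage, the output is an ordered list.
print(render(earth))  # ['Earth', 'Gaia', 'world']
```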