User:Yurik/Wikidata OSM questions

From MediaWiki.org

This page outlines why OSM objects should be tagged with Wikipedia and Wikidata tags, and what benefits this brings to data quality and to data consumers. It then lists the open questions, with the aim of establishing best practices for using those tags.

Data quality benefits

  • Get all Wikipedia links in all languages for a given object
  • If a Wikipedia article is renamed or deleted, the Wikidata item is more likely to stay intact. It will still link to other languages, or simply serve as a placeholder for future articles.
  • Easier to validate Wikipedia links, e.g. detect that a link points to a disambiguation page instead of a place page after the original wiki page was renamed
  • Easier to validate administrative boundaries and admin_level values, to check that the Wikidata and OSM area hierarchies match
  • Add multilingual names to any OSM object
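As a rough illustration of the first and last points above, the sketch below pulls Wikipedia sitelinks and multilingual labels out of a Wikidata item. It assumes the caller has already fetched the item in the standard Wikidata entity JSON format; the function names and the exclusion list are hypothetical, and mapping site codes to OSM language prefixes (for example for be-tarask) is deliberately left out.

```python
# Sketch only: `entity` is assumed to be a single item taken from the
# Wikidata entity JSON (the object stored under "entities" -> "Q...").

# Sitelink keys that end in "wiki" but are not language editions of Wikipedia.
# This list is an assumption and may be incomplete.
NON_WIKIPEDIA_SITES = {
    "commonswiki", "specieswiki", "metawiki",
    "wikidatawiki", "mediawikiwiki", "sourceswiki",
}


def wikipedia_links(entity):
    """Return {site_code: title} for every Wikipedia sitelink on the item."""
    links = {}
    for site, link in entity.get("sitelinks", {}).items():
        if site.endswith("wiki") and site not in NON_WIKIPEDIA_SITES:
            # "enwiki" -> "en"; the site code is not always the language code.
            links[site[: -len("wiki")]] = link["title"]
    return links


def labels(entity):
    """Return {language_code: label} for every label on the item."""
    return {lang: label["value"]
            for lang, label in entity.get("labels", {}).items()}
```

A data consumer could use `labels()` as a fallback source of names when the OSM object has no `name:xx` tag for the requested language.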

Usage benefits

  • Wikipedia can show an object's outline in various articles based on its Wikidata ID
  • Data consumers can rely on Wikidata to get more international labels

Questions

General

  • Same WP/WD tags are set on both the relation and on the member way(s)
  • Same WP/WD tags are set on both the relation and on member nodes
    • e.g. a city boundary relation contains an admin_center or label node with the same tag(s)
  • Same WP/WD tags are set on both the relation and on the super relation
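To get a feel for how often these cases occur, duplicated tags can be found mechanically. The following is a minimal sketch that assumes the relation and its members have already been loaded into plain dictionaries; the data layout, the member ids and the function name are hypothetical and not tied to any particular OSM library.

```python
LINK_KEYS = ("wikipedia", "wikidata")


def duplicated_link_tags(relation_tags, members):
    """Return (member_id, key, value) for each member repeating a relation tag.

    `relation_tags` is the tag dict of the relation.
    `members` maps a hypothetical member id such as "way/1" to its tag dict.
    """
    found = []
    for member_id, tags in members.items():
        for key in LINK_KEYS:
            value = relation_tags.get(key)
            if value is not None and tags.get(key) == value:
                found.append((member_id, key, value))
    return found


relation = {"type": "boundary", "wikidata": "Q1", "wikipedia": "en:Example"}
members = {
    "node/1": {"place": "city", "wikidata": "Q1"},  # e.g. a label node
    "way/2": {"wikidata": "Q2"},                    # different item, not flagged
}
print(duplicated_link_tags(relation, members))
```

The same function can be applied to a super relation by passing its tags as `relation_tags` and the child relations as `members`.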

Rivers

  • A river can consist of multiple relations (multipolygon water and waterways) and simple ways

Maritime borders

  • Some objects have both maritime and coastal borders. Which should have WP/WD tags?

Tunnels

  • Some tunnels are tagged with tunnel:wikipedia and also carry a wikipedia tag. Sometimes the two values match, in which case I feel the wikipedia tag should be deleted. When they don't match, it is because the main tag describes the whole route rather than the way portion; I suspect that case should use a relation instead.

Disambiguation pages

  • OSM objects should never link to Wikipedia disambiguation pages. If there is no good Wikipedia page specifically about the OSM object, no Wikipedia link should be added to it. Note that there might be a Wikidata item that matches the object instead, and it is usually quick and easy to create one if none exists.

List pages

  • OSM objects should not link to Wikipedia List pages, especially if a more specific article exists. Lists tend to contain a lot of unrelated information. If no good article exists, we might want to use wikipedia_list or wikipedia_section to indicate that the link describes many similar objects.
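Both the disambiguation rule and the list rule can be checked through Wikidata, since the linked item usually carries an "instance of" (P31) statement. The sketch below assumes an item in the standard Wikidata entity JSON format that the caller has already fetched. Q4167410 (Wikimedia disambiguation page) and Q13406463 (Wikimedia list article) are real item ids, while the function names are hypothetical and subclasses of these two items are not followed.

```python
DISAMBIGUATION = "Q4167410"  # Wikimedia disambiguation page
LIST_ARTICLE = "Q13406463"   # Wikimedia list article


def instance_of(entity):
    """Return the set of item ids the entity is an 'instance of' (P31)."""
    ids = set()
    for statement in entity.get("claims", {}).get("P31", []):
        snak = statement.get("mainsnak", {})
        # Skip "novalue" / "somevalue" snaks, which carry no datavalue.
        if snak.get("snaktype") == "value":
            ids.add(snak["datavalue"]["value"]["id"])
    return ids


def link_problem(entity):
    """Return "disambiguation", "list", or None if neither applies."""
    kinds = instance_of(entity)
    if DISAMBIGUATION in kinds:
        return "disambiguation"
    if LIST_ARTICLE in kinds:
        return "list"
    return None
```

A validator could run `link_problem()` on the item behind every wikidata tag and report the OSM objects for which it returns something other than `None`.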

Dealing with "wikipedia" tags when there is no perfect 1:1 match

There is a pending proposal for the wikipedia_section and wikipedia_list tags.

Sometimes a Wikipedia tag does not point to a perfect 1:1 match: a Wikipedia page often describes more than one OSM object. OSM has subject:wikipedia and other tags to indicate that the article is not about the object itself but only relates to it. We may need to expand on that for cases such as:

  • Wikipedia tag points to a list of locations, e.g. list of subway yards or list of cathedrals
  • Wikipedia tag points to a redirect that in turn points to a list, e.g. Concourse Yard
  • Wikipedia tag points to an anchor within an article (similar to above, but the "#" is explicit in the tag value)
  • Wikipedia tag points to a concept larger than the object, e.g. there is a "historical" (area + city center) and "administrative" (area without city center) boundary of a parish, but both use the same article about the parish.
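The anchor case in the list above is the only one that can be detected from the tag value alone; the others need a lookup against Wikipedia or Wikidata. A minimal sketch of splitting a wikipedia tag value into its parts is shown below, assuming the common "lang:Title#Anchor" layout. The function and type names are hypothetical, and a value with no language prefix whose title itself contains a colon would be misparsed.

```python
from typing import NamedTuple, Optional


class WikipediaTag(NamedTuple):
    lang: Optional[str]    # e.g. "en", or None when no prefix is present
    title: str             # page title with underscores turned into spaces
    anchor: Optional[str]  # text after "#", or None


def parse_wikipedia_tag(value):
    """Split a wikipedia tag value of the assumed form lang:Title#Anchor."""
    page, _, anchor = value.partition("#")
    lang, sep, title = page.partition(":")
    if not sep:
        # No colon at all: treat the whole value as the title.
        lang, title = None, page
    return WikipediaTag(lang, title.replace("_", " ").strip(), anchor or None)


tag = parse_wikipedia_tag("en:Concourse Yard")
print(tag.lang, tag.title, tag.anchor)
```

Values for which `anchor` is not `None` are the candidates for a section-style tag such as the proposed wikipedia_section.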

Interesting cases

  • Disney parks around the world have the same ride, Soarin' (in OSM: 404349958)
  • A building has two Wikipedia articles - one for the building itself, one for the organization it contains 404349958
  • A city tagged wikipedia:be-tarask:Сяхновічы: the page is a disambiguation page in the be-tarask wiki but has a full article in the pl wiki. Relation 6983524 includes two villages, each of which points to its own wikipedia/wikidata pages. wikidata:Q9336328 indicates that it is indeed a disambiguation page.