MediaWiki Developer Meet-Up 2009/Notes/Mapping

From mediawiki.org

What we need to get embedded maps on Wikimedia sites

  • A database server that mirrors the Planet.osm file
  • A tile rendering infrastructure that works with the database server
  • A way to embed OSM maps in Wikipedia pages

Written notes

Here are notes written down during the session as-is:

Wikipedia & OpenStreetMap Map project

  • Maintainer: Ævar Arnfjörð Bjarmason
  • WMDE buying servers (JensFrank)
  • Loading OpenStreetMap data
  • Rendering map tiles
  • Map (e.g. Slippy map or Static map) extension for MediaWiki
  • Coordination needed!

Here's a blog posting about this part of the project, and a follow-up discussion on osm-talk.

Geographic data from OpenStreetMap / Wikipedia and other sources

  • Maintainer: Jochen Topf
  • How to bring them together
  • Databases
  • APIs

Wiki projects to improve geodata quality

  • Maintainer: LA2
  • Improve coord templates, standardize & internationalize
  • Exchange experience between language communities

This session was held from 12:00 to 13:00 on Saturday, April 4, 2009, just outside c-base.

How should coordinate improvement be organized?
Participants brought experience with geographic coordinates from the English, German, Dutch, Swedish and Icelandic Wikipedias. The German Wikipedia has a "WikiProjekt Georeferenzierung" which seems to be in active use, but most other languages don't have anything similar. Since coordinates for the same place should (hopefully) be the same in all languages of Wikipedia, perhaps a more standardized international approach would be useful?
Action: LA2 intends to extract coordinates from all languages, and present statistics on quantity and quality.
How many articles have coordinates?
We know very well how many articles Wikipedia contains in each language, but how many articles contain coordinates? We know that 28 % of all articles in the German Wikipedia are biographies, because they are categorized as men/women. But what percentage of articles can be connected to a geographic location? Some 80,000 articles in the German Wikipedia have coordinates, roughly 10 % of all articles. But is that a reasonable number? It turns out that 160,000 articles in the Dutch Wikipedia, or 32 % of all articles, have coordinates. Maybe that is an upper limit, or maybe the percentage can be even higher. The Swedish Wikipedia has coordinates in 9,000 articles, or merely 3 % of all articles.
Action: We should survey all languages of Wikipedia regularly, to compare languages to each other and follow the improvement over time.
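As a rough illustration of what such a recurring survey could report, here is a minimal sketch that works only from the figures quoted above. The total article counts are not stated in the notes; they are derived here from the quoted counts and percentages, so treat them as approximations. The function names are made up for this example.

```python
# Figures quoted in the notes: (articles with coordinates, share of all articles).
REPORTED = {
    "de": (80_000, 0.10),
    "nl": (160_000, 0.32),
    "sv": (9_000, 0.03),
}


def implied_total(with_coords, share):
    """Total article count implied by a coordinate count and its share."""
    return round(with_coords / share)


def coverage_percent(with_coords, total):
    """Share of articles that carry coordinates, as a percentage."""
    return 100.0 * with_coords / total


# Print one line per language, highest coverage first.
ranked = sorted(REPORTED.items(), key=lambda item: item[1][1], reverse=True)
for lang, (count, share) in ranked:
    total = implied_total(count, share)
    print(f"{lang}: {count} of about {total} articles "
          f"({coverage_percent(count, total):.0f} %)")
```

Running the same calculation against each new extraction would give the comparison over time that the action item asks for.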
How do we find articles that have coordinates?
Coordinates are typically entered through a template. But the design of these templates can vary between languages, and parsing template syntax from the database dumps can be complicated. user:Kolossos showed how he had parsed the database dump for all calls to templates that belong in the category:coordinate templates. But this method only found 96,000 coordinates in the Dutch Wikipedia, which is known to have 160,000. The method missed templates that belonged in a subcategory, and perhaps also infobox templates that called a coordinate template. Parsing template calls in database dumps also suffers from the lack of regular database dumps. Independent of template design, however, all coordinates are links to the Geo Hack page. These links can be found through the external links table, and all relevant parameters can be extracted from the URL syntax. This is a more universal and reliable method to find articles with coordinates.
Action: Parse Geo Hack URLs in the external links table on the toolserver.
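The URL parsing step could look roughly like the sketch below. It assumes the common Geo Hack layout in which the `params` query argument holds underscore-separated latitude parts, a hemisphere letter, longitude parts, a second hemisphere letter, and then optional `key:value` fields. The sample URL and the function name are illustrative only, and other parameter layouts would need extra handling. Reading rows from the external links table is left out.

```python
from urllib.parse import urlparse, parse_qs

# Illustrative link only; real rows would come from the external links table.
SAMPLE_URL = ("http://toolserver.org/~geohack/geohack.php"
              "?pagename=Paris&params=48_51_24_N_2_21_03_E_type:city_region:FR")


def parse_geohack_url(url):
    """Return (lat, lon, extras) in signed decimal degrees, or None."""
    query = parse_qs(urlparse(url).query)
    if "params" not in query:
        return None
    coords, parts, extras = [], [], {}
    for token in query["params"][0].split("_"):
        if not token:
            continue  # tolerate empty fields such as "48_51__N"
        if len(coords) == 2:
            # Everything after the longitude is treated as key:value metadata.
            key, sep, value = token.partition(":")
            if sep:
                extras[key] = value
            continue
        letter = token.upper()
        if letter in ("N", "S", "E", "W"):
            expected = "NS" if not coords else "EW"
            if letter not in expected or not parts or len(parts) > 3:
                return None
            # parts are degrees, then optional minutes, then optional seconds.
            degrees = sum(p / 60 ** i for i, p in enumerate(parts))
            coords.append(-degrees if letter in ("S", "W") else degrees)
            parts = []
        else:
            try:
                parts.append(float(token))
            except ValueError:
                return None
    if len(coords) != 2:
        return None
    return coords[0], coords[1], extras


print(parse_geohack_url(SAMPLE_URL))
```

Returning `None` for anything unexpected keeps malformed links out of the statistics instead of guessing at their meaning.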
Interwiki bots for coordinates?
Once coordinates are extracted from all languages of Wikipedia, they can be compared via the interwiki link relationship. Two languages should (hopefully) have the same coordinate for Paris. If coordinates differ or are missing in one language, perhaps a bot can fix this. Apparently, user:The Anomebot2 already does this on the English Wikipedia? Perhaps each language should form a Georeferencing WikiProject that can decide what bots should do?
Action: ?
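To make the comparison idea concrete, here is a small sketch of what a bot or report could check for one set of interwiki-linked articles. It assumes coordinates have already been extracted per language as decimal degrees. The 1 km tolerance, the function names and the sample values are assumptions for illustration, not agreed policy or real extracted data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) pairs."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def compare_languages(coords_by_lang, tolerance_km=1.0):
    """Report languages without a coordinate and pairs that disagree.

    coords_by_lang maps a language code to (lat, lon), or to None when
    the article in that language has no coordinate.
    """
    known = {lang: c for lang, c in coords_by_lang.items() if c is not None}
    missing = sorted(lang for lang, c in coords_by_lang.items() if c is None)
    conflicts = []
    langs = sorted(known)
    for i, first in enumerate(langs):
        for second in langs[i + 1:]:
            gap = distance_km(known[first], known[second])
            if gap > tolerance_km:
                conflicts.append((first, second, round(gap, 1)))
    return missing, conflicts


# Made-up sample: three languages roughly agree on Paris, one is off, one is empty.
paris = {
    "de": (48.8567, 2.3508),
    "en": (48.8566, 2.3522),
    "nl": (48.0, 2.35),
    "sv": None,
}
missing, conflicts = compare_languages(paris)
print("missing:", missing)
print("conflicts:", conflicts)
```

A report like this only flags candidates. Whether a bot may copy a coordinate into another language would still be up to each community.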
Finding coordinates for more articles
When existing article titles are shown on a map, it's easier to see which articles are missing from the map. If a place name is missing from the map, it could be because an article is missing or because an existing article doesn't have a coordinate yet. It should be made easier to add coordinates to articles. One tool for this is available at http://toolserver.org/~multichill/coordinates.php. However, automatic interwiki bots that add coordinates to existing articles should probably run first.
Action: ?
How should the coordinate templates be designed?
Some German Wikipedians are strongly convinced that they have the best coordinate template design. user:Prolineserver observed that the Swedish template (coord) doesn't use a span+class tag, which is needed by his Javascript link enhancements. Maybe we should write a specification for coordinate templates that all languages can follow? Should we make a survey of existing coordinate templates?
Action: ?
Future development
Could MediaWiki have a separate table for coordinates, similar to the external links table? This is not necessary in the near term, but could be useful for instant geographic proximity search.
Action: Nothing for now.
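For discussion purposes only, this is a sketch of how a proximity search over such a table might work. It uses an in-memory SQLite database as a stand-in; the table name `geo_coords`, its columns and the sample rows (approximate coordinates) are hypothetical and not an existing or proposed MediaWiki schema. The query first narrows candidates with an indexed bounding box, then filters by exact great-circle distance. It ignores the poles and the 180° meridian.

```python
import sqlite3
from math import asin, cos, radians, sin, sqrt


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in kilometres."""
    p1, p2 = radians(lat1), radians(lat2)
    h = (sin((p2 - p1) / 2) ** 2
         + cos(p1) * cos(p2) * sin(radians(lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def open_db():
    """Create the hypothetical coordinates table with a (lat, lon) index."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE geo_coords (page_title TEXT, lat REAL, lon REAL)")
    db.execute("CREATE INDEX geo_lat_lon ON geo_coords (lat, lon)")
    return db


def nearby(db, lat, lon, radius_km):
    """Titles within radius_km of (lat, lon), nearest first."""
    # Slightly generous bounding box: about 111 km per degree of latitude,
    # longitude span widened for the latitude farthest from the equator.
    dlat = radius_km / 111.0
    dlon = dlat / cos(radians(min(abs(lat) + dlat, 89.0)))
    rows = db.execute(
        "SELECT page_title, lat, lon FROM geo_coords "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?",
        (lat - dlat, lat + dlat, lon - dlon, lon + dlon),
    )
    hits = [(distance_km(lat, lon, r_lat, r_lon), title)
            for title, r_lat, r_lon in rows]
    return [title for gap, title in sorted(hits) if gap <= radius_km]


db = open_db()
db.executemany(
    "INSERT INTO geo_coords VALUES (?, ?, ?)",
    [("Berlin", 52.52, 13.405), ("Potsdam", 52.4, 13.0667), ("Hamburg", 53.55, 10.0)],
)
print(nearby(db, 52.513, 13.42, 40))
```

The bounding box is what makes the lookup fast, because the database can answer it from the index; the exact distance check then runs only on the few rows that survive.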

Mailing list