Extension:SlippyMap/Static map generation

Here are some notes on the static map generation that'll have to be done by the Slippy Map extension before it can be deployed on Wikimedia wikis. There's also a MediaZilla bug tracking the issue.

How maps get to us

 * 1) We get a dump of the OpenStreetMap Planet.osm export
 * 2) We import it into a PostGIS database with osm2pgsql
 * 3) A program that embeds the Mapnik library reads from the PostGIS database and generates a static map
 * 4) We embed the static map into articles & make sure it's easily cacheable like other image content.
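As a rough illustration of step 2, the import could be driven from a small script. This is only a sketch: the file name planet.osm.bz2 and the database name gis are assumptions, and the helper only builds the command line rather than running it.

```python
import shlex


def osm2pgsql_command(planet_path, database="gis"):
    """Build the osm2pgsql invocation for step 2 (import into PostGIS).

    The database name and planet path are assumptions for illustration.
    """
    # --create loads into fresh tables, --slim keeps intermediate data in
    # the database rather than in RAM, --database names the target database.
    return ["osm2pgsql", "--create", "--slim", "--database", database, planet_path]


if __name__ == "__main__":
    # Print the command instead of executing it; a real import takes hours.
    print(shlex.join(osm2pgsql_command("planet.osm.bz2")))
```

Building the command as a list keeps it safe to hand to subprocess.run later without shell quoting problems.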

What if the map changes?
What if we import a new version of Planet.osm? How do we do cache invalidation to make sure all our users get the new map right away, up to the minute, like we expect of article content?

Do we even need that? We could:

Care
We could come up with some elaborate system for cataloging which static maps we've generated and which bounding boxes of the PostGIS database they read data from, and automatically purging them if that data changes. mod_tile and renderd theoretically do this for tiles, but they seem to sometimes miss updates, which are then only picked up in the weekly OSM rendering.
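A minimal sketch of what such a catalog lookup might look like, assuming we keep an in-memory mapping from generated map name to the bounding box it was rendered from (the names catalog, intersects and maps_to_purge are made up for illustration, and a real system would presumably keep this in a database):

```python
from typing import Dict, List, Tuple

# (min_lon, min_lat, max_lon, max_lat)
BBox = Tuple[float, float, float, float]


def intersects(a: BBox, b: BBox) -> bool:
    """True if the two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def maps_to_purge(catalog: Dict[str, BBox], changed: BBox) -> List[str]:
    """Names of generated maps whose source area overlaps the changed area."""
    return sorted(name for name, bbox in catalog.items() if intersects(bbox, changed))


if __name__ == "__main__":
    catalog = {
        "umd.png": (-76.9445, 38.9829, -76.9411, 38.9849),
        "berlin.png": (13.3, 52.4, 13.5, 52.6),
    }
    # An edit somewhere around the University of Maryland campus.
    print(maps_to_purge(catalog, (-76.95, 38.98, -76.94, 38.99)))
```

The hard part this sketch skips is learning which bounding boxes changed in a new import in the first place.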

Or we could:

Don't care
We don't make it a design requirement that if the OpenStreetMap project data changes we immediately need to start generating new static maps based on that data. Maps usually don't need to be as up to the minute as encyclopedia articles.

Instead we could decree that static maps will be regenerated if they're e.g. more than a week old. Such a system would look something like what's outlined in the section below.
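The age check itself would be trivial. A sketch, assuming we know when each map was generated (e.g. from the file's modification time) and taking the one-week limit from the example above:

```python
from datetime import datetime, timedelta

# One week, per the example above; the real limit would be configurable.
MAX_AGE = timedelta(weeks=1)


def is_stale(generated_at, now, max_age=MAX_AGE):
    """True if the map was generated more than max_age before now."""
    return now - generated_at > max_age


if __name__ == "__main__":
    generated = datetime(2009, 7, 1, 12, 0)
    print(is_stale(generated, datetime(2009, 7, 5, 12, 0)))
    print(is_stale(generated, datetime(2009, 7, 9, 12, 0)))
```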

Saving maps in $wgUploadDirectory
One approach would be to save images in the image upload directory:

We'd generate them with a mapnik rasterizer that looks something like OSM's export script. I've already hacked it so it can be run as a shell program instead of a CGI.

The name of the generated map will include all the bits that make it unique, e.g. for the [[Media:University of Maryland on OpenStreetMap.png|map embedded at the top of this article]] that would be:

We'd then save it under a URL like:

http://upload.wikimedia.org/wikipedia/mediawiki/slippymap/a/a8/-76.9445462396,38.9829102195,-76.9411400435,38.9848960048,scale=3385.5001276,layer,mapnik.png

(The /a/a8/ bit being )
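To make the naming concrete, here's a sketch of how such a name and path could be built. The name format is copied from the example URL above (including the "layer,mapnik" part exactly as it appears there). The directory part assumes MediaWiki's usual hashed upload layout, which uses the first one and first two hex digits of the MD5 of the file name; whether the example name really hashes to /a/a8/ has not been checked here. The function names are made up.

```python
import hashlib


def map_file_name(bbox, scale, layer="mapnik"):
    """File name containing everything that makes the map unique.

    bbox is (min_lon, min_lat, max_lon, max_lat); the format mirrors the
    example URL above.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    return f"{min_lon},{min_lat},{max_lon},{max_lat},scale={scale},layer,{layer}.png"


def hashed_path(name):
    """Relative path in the style of MediaWiki's hashed upload directories.

    Assumption: first hex digit and first two hex digits of md5(name).
    """
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return f"{digest[0]}/{digest[:2]}/{name}"


if __name__ == "__main__":
    name = map_file_name(
        (-76.9445462396, 38.9829102195, -76.9411400435, 38.9848960048),
        3385.5001276,
    )
    print(hashed_path(name))
```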

In other words, exactly like normal images are saved. Then, when we decide that the map is too old, we regenerate it the next time someone visits the page it's on and purge the old URL from the squids:
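A purge of this kind could be sketched as an HTTP request with the PURGE method, which Squid understands when configured to allow it. This is only an illustration: the helper names and the example.org URL are made up, and in practice the request would go to each cache server rather than to the public host name.

```python
import urllib.request


def build_purge_request(url):
    """Build an HTTP PURGE request for a cached URL (nothing is sent yet)."""
    return urllib.request.Request(url, method="PURGE")


def purge(url, timeout=5):
    """Send the PURGE request and return the HTTP status code."""
    with urllib.request.urlopen(build_purge_request(url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    req = build_purge_request("http://upload.example.org/slippymap/a/a8/map.png")
    print(req.get_method(), req.full_url)
```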

Caveats
If we don't change the URL each time the object changes we can't guarantee that the old version of the map won't linger in someone's cache. This solution assumes that that isn't that big a deal.

That problem could be easily corrected by e.g. prefixing the generated map name with  (if generated weekly). But that would create the additional problem of having to purge all those old images.
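The exact prefix isn't spelled out above. Purely as an assumption, a prefix built from the ISO year and week number would change once a week:

```python
from datetime import date


def weekly_prefix(day):
    """Prefix that changes once per ISO week, e.g. for cache busting.

    The "YYYY-Www-" format is an assumption for illustration.
    """
    year, week, _weekday = day.isocalendar()
    return f"{year}-W{week:02d}-"


if __name__ == "__main__":
    print(weekly_prefix(date(2009, 7, 1)) + "map.png")
```

Every map would then get a new URL each week, which is exactly what leaves a trail of old images behind to clean up.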

I don't think it's a necessity to aim for our mirrored OSM database and the generated map data (both static images & tiles) being in sync at all times.

Everyone who runs a tileserver with OSM data copes with the limitation that the generated data isn't a mirror of the database without some sort of catastrophe :)

Links

 * MediaZilla bug 19654, tracking static map generation
 * Post to Maps-l about static map generation