Wikimedia Maps/2015-2017/Tile server implementation



At this point, this is an incomplete list of ideas and technologies that the Maps team is considering.

Introduction
Our current prototype implementation imports data from OpenStreetMap into a PostgreSQL database with the PostGIS 2.1+ spatial extensions. From there, vector tiles will be created using open source components from Mapbox and the Mapnik 3 engine, and kept in local storage (TBD). Upon request, a vector tile will be converted to a PNG image on the fly and served by the Kartotherian service via Varnish caching.

Storage
The table below summarizes our tile storage needs for one style. For the first iteration, we plan to store up to level 14 (~360 million tiles in total). There are two ways to significantly reduce this number: de-duplication and over-zooming.
 * Over-zooming is when we know that a tile does not contain more information than the corresponding piece of a tile in the lower zoom level, so instead of storing a tile, we simply extract a part of the tile above when needed.
 * De-duplication is when all tiles with the same content are stored once. Zoom level 9 contains 70% duplicates, and that share is likely to grow at higher zooms. We have not yet found any off-the-shelf storage that can de-duplicate data. Even if one were available, storing only level 15 (2^30 tiles) would theoretically need a 32-bit number per tile for identification (tile position -> unique tile ID), which takes 4 GB (4 bytes * 2^30). Assuming 99% duplicates and an average of 1.5 KB per tile, we would need another 16+ GB (2^30 / 100 * 1024 * 1.5 bytes) to store the actual tiles. Once we add the per-item storage overhead, the numbers will be significantly higher.
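
The figures above can be reproduced with a few lines of arithmetic. The following is only a sketch: the function names are made up for illustration, and it assumes a full pyramid where zoom level z has 4^z tiles, a 4-byte ID per tile position, and 1% unique tiles of 1.5 KB (1536 bytes) each.

```python
def tiles_at_zoom(zoom):
    """Number of tiles at one zoom level of a full pyramid (2^zoom x 2^zoom)."""
    return 4 ** zoom


def total_tiles(max_zoom):
    """Number of tiles at all zoom levels from 0 to max_zoom inclusive."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


def dedup_estimate(zoom, unique_percent=1, avg_tile_bytes=1536, id_bytes=4):
    """Estimate de-duplicated storage for a single zoom level.

    Returns (index_bytes, data_bytes): the position -> tile ID index,
    and the space taken by the unique tiles themselves.
    """
    tiles = tiles_at_zoom(zoom)
    index_bytes = tiles * id_bytes
    unique_tiles = tiles * unique_percent // 100
    data_bytes = unique_tiles * avg_tile_bytes
    return index_bytes, data_bytes


if __name__ == "__main__":
    print("tiles up to level 14:", total_tiles(14))
    index_bytes, data_bytes = dedup_estimate(15)
    print("level 15 index bytes:", index_bytes)
    print("level 15 unique tile bytes:", data_bytes)
```

Per-item storage overhead is deliberately left out, so these numbers are lower bounds.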

Tile States
In theory, each tile can be in one of four states: uninitialized, stored, stored-but-dirty, and use-over-zoom. When a tile is requested, the server has to decide whether to use the stored vector tile, or to go through the lower zooms until it finds a tile that exists and extract a piece of it (over-zoom). To decide this, the server could attempt to get the tile from storage and, if it is missing, proceed to over-zoom. Alternatively, it could keep a large binary blob with one bit per tile showing whether the tile exists. It could even store 2 or 3 bits per tile to minimize the subsequent checks, with the value (0-3 for 2 bits, 0-7 for 3 bits) indicating how many levels to zoom out (0 = tile exists, 1 = zoom out one level, ...).

The no-extra-storage approach makes the system simpler, but requires progressive zoom-out one level at a time: each tile request could result in up to N requests to the storage system (one per zoom level checked), each of which could be relatively expensive.
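
A minimal sketch of this progressive zoom-out, assuming a hypothetical `storage` object that behaves like a dict keyed by `(zoom, x, y)`. Each loop iteration stands for one storage request. The returned sub-tile offsets say which piece of the found tile covers the requested one.

```python
def find_tile(storage, zoom, x, y):
    """Look for a stored tile, zooming out one level per storage request.

    Returns (data, levels_up, sub_x, sub_y), or None if nothing is stored.
    The found tile is split into 2^levels_up pieces per side, and
    (sub_x, sub_y) is the piece covering the requested tile.
    """
    for levels_up in range(zoom + 1):
        # The ancestor of (x, y) that is levels_up levels lower in zoom.
        key = (zoom - levels_up, x >> levels_up, y >> levels_up)
        data = storage.get(key)  # one request to the storage system
        if data is not None:
            mask = (1 << levels_up) - 1
            return data, levels_up, x & mask, y & mask
    return None


if __name__ == "__main__":
    storage = {(2, 1, 2): b"vector tile bytes"}
    print(find_tile(storage, 4, 6, 9))
```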

Storing over-zoom bits in Redis, which supports bit operations and is considerably faster, would reduce the number of storage requests to just one.
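
To illustrate the 3-bits-per-tile idea, here is an in-memory sketch that packs the values into a `bytearray`. It is not Redis code: the `OverzoomMap` class and its row-major layout are assumptions made for illustration. Bits are numbered most-significant first, the convention Redis uses for bit offsets, so a Redis string could in principle hold the same layout.

```python
class OverzoomMap:
    """Packed array with 3 bits per tile for a single zoom level.

    Each value (0-7) is the number of levels to zoom out; 0 means the
    tile itself is stored.
    """

    BITS = 3

    def __init__(self, zoom):
        self.side = 1 << zoom
        total_bits = self.side * self.side * self.BITS
        self.blob = bytearray((total_bits + 7) // 8)

    def _bit_offset(self, x, y):
        # Row-major tile order; this layout is an illustrative assumption.
        return (y * self.side + x) * self.BITS

    def set(self, x, y, levels):
        if not 0 <= levels < (1 << self.BITS):
            raise ValueError("levels must fit in 3 bits")
        offset = self._bit_offset(x, y)
        for i in range(self.BITS):
            byte, bit = divmod(offset + i, 8)
            flag = 0x80 >> bit  # bit 0 is the most significant bit
            if (levels >> (self.BITS - 1 - i)) & 1:
                self.blob[byte] |= flag
            else:
                self.blob[byte] &= ~flag & 0xFF

    def get(self, x, y):
        offset = self._bit_offset(x, y)
        value = 0
        for i in range(self.BITS):
            byte, bit = divmod(offset + i, 8)
            value = (value << 1) | ((self.blob[byte] >> (7 - bit)) & 1)
        return value


if __name__ == "__main__":
    overzoom = OverzoomMap(zoom=4)
    overzoom.set(5, 7, 3)  # tile (5, 7) must zoom out three levels
    print(overzoom.get(5, 7), len(overzoom.blob))
```

With such a map, a single lookup tells the server which zoom level to fetch, so only one request goes to the tile storage itself.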

Dirty Tiles
Some tiles could get marked as dirty during the OSM import, or as part of the data layer SQL adjustments. There should be a background job going through all dirty tiles and regenerating them. Additionally, the server could regenerate dirty tiles on the fly if a user requests them. This is especially relevant to OSM itself, where the result of editing should be immediately visible, but could be highly beneficial to WMF as we increase the OSM pull frequency.

We could use Redis flags (bits) to indicate whether a tile is dirty. This way the server could request tile regeneration ahead of the job queue and, if regeneration does not finish within a time limit, return the stale result.
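
The regenerate-or-return-stale behaviour could look roughly like the sketch below. All names here are hypothetical: `storage` is a dict of tiles, `dirty` is a set standing in for the Redis flags, and `regenerate` is whatever function renders a fresh tile.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared worker pool for on-the-fly regeneration (size chosen arbitrarily).
_executor = ThreadPoolExecutor(max_workers=4)


def serve_tile(key, storage, dirty, regenerate, time_limit=0.5):
    """Return the tile for key, regenerating it first if it is dirty.

    If regeneration takes longer than time_limit seconds, the stale
    tile is returned and the dirty flag stays set, so the background
    job can still pick the tile up later.
    """
    if key not in dirty:
        return storage[key]
    future = _executor.submit(regenerate, key)
    try:
        fresh = future.result(timeout=time_limit)
    except FutureTimeout:
        return storage[key]  # stale result
    storage[key] = fresh
    dirty.discard(key)
    return fresh
```

In this sketch a timed-out regeneration keeps running in its worker thread and its result is discarded. A real service would more likely let it finish and store the tile.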

Tile ID
Each tile is identified by four parameters: style version, zoom level, and x-y coordinates. For storage, one 32-bit value should be enough to encode 14 zoom levels and 8 different styles. If we reduce the number of available styles to just 2, we could encode level 15 as well (for more styles, we could create a separate storage instance).

 zoom  zoomId  styleId 0-7  X-coordinate       Y-coordinate       extra bits
 14:   1       000          0000 0000 0000 00  0000 0000 0000 00
 13:   01      000          0000 0000 0000 0   0000 0000 0000 0   0
 12:   001     000          0000 0000 0000     0000 0000 0000     00
 11:   0001    000          0000 0000 000      0000 0000 000      000
 10:   00001   000          0000 0000 00       0000 0000 00       0000
 ...
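
The layout above can be expressed as a pair of encode/decode functions. This is a sketch of one reading of the table, not settled code: the zoom is stored as a run of (14 - zoom) zero bits followed by a single 1 bit, then come 3 style bits and zoom bits each for X and Y, and the remaining (14 - zoom) low bits are left as zero padding. The function names are illustrative.

```python
MAX_ZOOM = 14
STYLE_BITS = 3


def encode_tile_id(zoom, style, x, y):
    """Pack (zoom, style, x, y) into one 32-bit integer."""
    if not 0 <= zoom <= MAX_ZOOM:
        raise ValueError("zoom out of range")
    if not 0 <= style < (1 << STYLE_BITS):
        raise ValueError("style out of range")
    if not (0 <= x < (1 << zoom) and 0 <= y < (1 << zoom)):
        raise ValueError("coordinates out of range for this zoom")
    value = 1  # marker bit that terminates the zoomId prefix
    value = (value << STYLE_BITS) | style
    value = (value << zoom) | x
    value = (value << zoom) | y
    return value << (MAX_ZOOM - zoom)  # unused extra bits stay zero


def decode_tile_id(value):
    """Inverse of encode_tile_id; returns (zoom, style, x, y)."""
    # The marker bit sits at position 17 + zoom, counting from bit 0.
    zoom = value.bit_length() - 1 - (STYLE_BITS + MAX_ZOOM)
    if not 0 <= zoom <= MAX_ZOOM:
        raise ValueError("not a valid tile id")
    value >>= MAX_ZOOM - zoom  # drop the extra bits
    mask = (1 << zoom) - 1
    y = value & mask
    value >>= zoom
    x = value & mask
    value >>= zoom
    style = value & ((1 << STYLE_BITS) - 1)
    return zoom, style, x, y


if __name__ == "__main__":
    tile_id = encode_tile_id(13, 5, 3, 7)
    print(hex(tile_id), decode_tile_id(tile_id))
```

Because the position of the highest set bit gives the zoom level, IDs from different zoom levels never collide, and every ID fits in 32 bits.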

Approach: Store One Tile per File
This is the simplest approach: each tile is stored as an individual file. We have been told of an existing implementation that stores tiles as files up to level 14 (max 358 million) by increasing the inode count. By default, an 800 GB partition has 51 million inodes. We are worried about the overhead here.

Approach: Store Multiple Tiles per File
There is an existing implementation called MBTiles that stores multiple vector tiles together in one SQLite file. From what we heard, even Mapbox itself sees this as a dead-end approach for server-side tile storage, even though it is used heavily by Mapbox Studio for storage and uploading.
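
For reference, the sketch below shows what storing tiles in such a file looks like, using Python's built-in `sqlite3` module. The `tiles` table and its column names follow the MBTiles specification, which numbers rows in the TMS scheme (row 0 at the bottom), hence the flip. The helper functions are illustrative, and the `metadata` table the specification also defines is left out.

```python
import sqlite3


def open_mbtiles(path=":memory:"):
    """Open (or create) an SQLite file with an MBTiles-style tiles table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tiles ("
        "zoom_level INTEGER, tile_column INTEGER, "
        "tile_row INTEGER, tile_data BLOB)"
    )
    db.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS tile_index "
        "ON tiles (zoom_level, tile_column, tile_row)"
    )
    return db


def tms_row(zoom, y):
    """Convert a top-origin (XYZ) row number to a bottom-origin (TMS) one."""
    return (1 << zoom) - 1 - y


def put_tile(db, zoom, x, y, data):
    db.execute(
        "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)",
        (zoom, x, tms_row(zoom, y), data),
    )


def get_tile(db, zoom, x, y):
    row = db.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (zoom, x, tms_row(zoom, y)),
    ).fetchone()
    return row[0] if row else None
```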

Approach: Store Tiles in NoSQL
This approach might offer the most beneficial path forward. Depending on future performance studies, we could use Cassandra, Redis, or one of a large number of other NoSQL implementations.

External links: nosql review, Cassandra vs Redis.

Tile Invalidation Approaches
During the OSM DB import, the importer generates a list of obsolete tiles.

TODO: discuss possible approaches.