Readers/Web/Team/Etherpad/WMF OSM Hack Session 2013

IRC: #osm-wmf-hacksession https://www.mediawiki.org/wiki/Events/Wikimedia_Mapping_Event_2013

(note: this is mostly about the OSM tileserver to be hosted by Wikimedia in production, and not the OSM database to live in Tool Labs)

OSM request flow/rendering:
 * a few osm subdomains, intended for bypassing browser limitations for request limits
 * caching layer (usually squid)
 * apache, with mod_tile (load from cache, or render on the fly; http://wiki.openstreetmap.org/wiki/Mod_tile)
 * rendering from 8x8 metatiles from geographic data in postgres - rendering daemon (tirex [more flexible, combo of c and perl; multi process; http://wiki.openstreetmap.org/wiki/Tirex] or renderd [multithreaded]) called from mod_tile, which then uses mapnik
 * rendering daemons manage request queues. renderd can be used in a distributed way, but requires shared fs for tiles (not sure about tirex)
 * mapnik queries db; how is failover handled?
 * renderd queueing layer collapses duplicate requests
 * OSM publishes diff files, which get fed into local databases to update data. Granularity goes down to every minute, but updates can be applied less frequently. WMF probably does not need to be as up to date
 * Update will invalidate stale resource on the backend, but not front-end cache - invalidating front-end cache can be very costly
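
A minimal sketch of how 8x8 metatiles and the collapsing of duplicate requests fit together. This is not renderd's or tirex's actual code; `RenderQueue` and `metatile_key` are hypothetical names used only for illustration. Every tile request is mapped to the metatile that contains it, and the queue keeps a single pending job per metatile:

```python
from collections import OrderedDict

METATILE = 8  # tiles per metatile side, per the 8x8 figure in the notes


def metatile_key(z, x, y, size=METATILE):
    """Return (z, mx, my): the top-left tile of the metatile containing (x, y)."""
    return (z, x - x % size, y - y % size)


class RenderQueue:
    """Hypothetical FIFO queue that keeps one pending job per metatile."""

    def __init__(self):
        self._pending = OrderedDict()  # metatile key -> list of waiting tiles

    def request(self, z, x, y):
        """Queue a tile; return True only if this created a new render job."""
        key = metatile_key(z, x, y)
        is_new = key not in self._pending
        self._pending.setdefault(key, []).append((z, x, y))
        return is_new

    def next_job(self):
        """Pop the oldest metatile job and the tile requests waiting on it."""
        return self._pending.popitem(last=False)

    def __len__(self):
        return len(self._pending)
```

Two requests for neighbouring tiles, e.g. `request(12, 2048, 1360)` and `request(12, 2051, 1365)`, land in the same metatile and so produce one render job instead of two.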


 * How difficult/possible is it to use MySQL rather than Postgres?

proof-of-concept OSM rendering stack on WMF Labs using the puppet manifests
 * http://osm.wmflabs.org/osm/slippymap.html

Tirex
 * http://wiki.openstreetmap.org/wiki/Tirex

TileLite
 * https://bitbucket.org/springmeyer/tilelite/wiki/Home

Ceph librados
 * http://ceph.com/docs/master/rados/api/librados/


 * Puppet work in progress: https://gerrit.wikimedia.org/r/#/c/36222/


 * Zoom level <=12 should all be pre-rendered
 * Potentially pre render zoom level <=12 everywhere, use varnish hash director for everything else
 * Current database size, ~350GB with about 60% growth per annum
 * osm-db* boxes currently have 2x300GB Intel 320 SSDs; run them in RAID0?
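
To put the RAID0 question in numbers, here is a rough projection using only the figures above (~350GB today, ~60% growth per annum, 2x300GB SSDs). It assumes steady compound growth and counts raw capacity only, with no headroom for the filesystem or temporary space, so treat it as a back-of-the-envelope sketch:

```python
import math

DB_SIZE_GB = 350             # current size, from the notes
GROWTH_PER_YEAR = 0.60       # ~60% per annum, from the notes
RAID0_CAPACITY_GB = 2 * 300  # raw capacity of two 300GB SSDs striped together


def projected_size_gb(years, start=DB_SIZE_GB, growth=GROWTH_PER_YEAR):
    """Database size after `years`, assuming compound annual growth."""
    return start * (1 + growth) ** years


def years_until_full(capacity_gb, start=DB_SIZE_GB, growth=GROWTH_PER_YEAR):
    """Years until the projected size reaches `capacity_gb`."""
    return math.log(capacity_gb / start) / math.log(1 + growth)


if __name__ == "__main__":
    for year in range(4):
        print(f"year {year}: ~{projected_size_gb(year):.0f} GB")
    print(f"600 GB reached after ~{years_until_full(RAID0_CAPACITY_GB):.1f} years")
```

Under these assumptions the database is at roughly 560GB after one year and passes the 600GB of a two-disk RAID0 shortly after that.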

Tile storage
 * Ballpark figure for all tiles ~50TB (with one set of stylesheets - eg not including label overlay tiles)
 * osm-web* boxes only have 2x250GB SATA; osm-cp* have 6x600GB SSD
 * mod_tile caches and can be I/O intensive
 * mod_tile cache is smarter than plain HTTP proxy caching due to meta-tile caching
 * steal e.g. 2 SSDs from cp->web?
 * plug librados into mod_tile and let it store files into the main media storage cluster (Kai says it should be relatively easy)? what about pmtpa? needs alignment with our media storage strategy/future.
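
For a feel of the scale behind these storage figures, the sketch below counts tiles and 8x8 metatiles per zoom level (a zoom level z has 4^z tiles in the standard slippy-map scheme). The maximum zoom of 18 and reading ~50TB as 50 * 10^12 bytes are assumptions made here for illustration; the notes do not say which zoom range the ballpark covers:

```python
METATILE = 8  # tiles per metatile side, per the 8x8 figure in the notes


def tiles_at_zoom(z):
    """Number of tiles in the world at zoom level z (2^z by 2^z grid)."""
    return 4 ** z


def metatiles_at_zoom(z, size=METATILE):
    """Number of metatiles at zoom z; low zooms still need one metatile."""
    per_side = -(-(2 ** z) // size)  # ceiling division
    return per_side ** 2


def tiles_through_zoom(max_zoom):
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


def metatiles_through_zoom(max_zoom):
    return sum(metatiles_at_zoom(z) for z in range(max_zoom + 1))


if __name__ == "__main__":
    print("tiles, zoom 0-12:    ", tiles_through_zoom(12))
    print("metatiles, zoom 0-12:", metatiles_through_zoom(12))
    assumed_max_zoom = 18       # assumption, not stated in the notes
    assumed_total_bytes = 50e12  # ~50TB ballpark, read as decimal terabytes
    avg = assumed_total_bytes / tiles_through_zoom(assumed_max_zoom)
    print(f"implied average bytes per tile: ~{avg:.0f}")
```

Pre-rendering zoom levels 0 to 12 covers about 22 million tiles (about 350 thousand metatiles), which is a tiny fraction of the full pyramid; almost all of the tile count sits in the highest zoom levels.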

Load balancing
 * How to load balance/failover render->database? pgpool/slony (eww)?

Labs database
 * Toolserver vs labs/etc for OSM analytics/non-prod stuff
 * We need an OSM Postgres database accessible from labs
 * Setting it up in a Labs instance isn't possible from a resource PoV
 * Potentially use new dedicated box for postgres database
 * Stopgap to use new toolserver hardware

Sunday agenda items
 * Translate wiki
 * JSON tiles
 * Projection
 * OSM edit link
 * Tile mill
 * Labs WMA instances
 * Style contest for maps
 * Finalize architectural plan

Architectural plan:
 * Kai working on abstracting the file backend from mod_tile/renderd
 * openstreetmap.org uses renderd, a lot of other people use tirex
 * tirex has support for more file formats
 * renderd in C, tirex a combination of C/Perl/Ruby/Python http://mlm.jochentopf.com/
 * mod_tile and renderd on same box, write to local filesystem; have varnish consistently hash - but problematic because of metatiles
 * shared FS needed for real scalability
 * renderd vs. tirex
 * multilingual - render tiles with no labels, then render transparent tiles with labels (overlays); overlays get overlaid client side
 * one implementation is one style for each language
 * new implementation on toolserver supports parameterization of styles


 * styles: basic OSM + overlays for languages? Tim says hikebike is perhaps the most popularly requested style on toolserver. We should choose one to start with (talk to design/UX folks) - probably default osm tiles for now
 * support for JSON tiles (for vector rendering) would be awesome - but currently not supported by renderd (it is by tirex). however, there are other issues with vector rendering (particularly for older mobile devices). it would be great to get support for vector rendering in the future (perhaps by adding support to renderd), but for now, we should start with PNG rendering

Open Questions
 * prerendering tiles e.g. up to zoom level 12
 * cleaning up stale tiles
 * TTL for Varnish? HTCP purges? Cache coherency issues between changed tiles