User:Tfinc/GPS

Draft Spec for programmatic retrieval/storage of GPS data in MediaWiki

= Use Cases =


 * The Wikipedia Android and iPhone apps need a simple interface to retrieve articles near a given longitude and latitude. Currently, web services like GeoNames.org provide this but charge us for their use, which makes us dependent on a third-party service for a core tool.


 * Our mobile contribution projects will benefit from clear and quick calls to action for our users. To provide them, we need a way for any application or mobile website to quickly identify missing components based on GPS coordinates, article name, etc.

= Requirements =


 * Ability to specify a latitude and longitude and get back a list of articles near them
 * "near" can be tough when places of different scale co-exist. Should 'San Francisco' or 'California' or 'United States' show up on a search for near me since I'm inside them? Or do we usually mean things which are localized, where the entire bounding box is nearby?
 * Ability to retrieve a list of articles missing photos or coordinates
 * There's always going to be helluva lot of articles without coordinates. Max Semenik 20:05, 19 December 2011 (UTC)
 * Ability to specify a radius for returned results
 * Ability to specify the number of results (capped) to return
 * Responses should be available in a number of formats: XML, JSON, etc.
 * returning data within MediaWiki API would take care of this
 * Ability to specify by postal code and/or country
 * this is geocoding; a whole other sort of problem which is perhaps best left to third-party services specializing in it... What about OpenStreetMap? They must have something of the sort for their searches.
 * Ability to subdivide by type (landmark, city, etc.)
 * semantic data?
 * RSS feed of GPS changes
 * What does this mean?
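
The core of the first, radius, and result-count requirements above can be sketched in a few lines. This is only an illustration under stated assumptions, not a proposed implementation: the `Article` record, the `nearby` function, and the `MAX_LIMIT` cap of 50 are hypothetical names and values, and a real service would use a spatial index rather than scanning every article. It treats each article as a single point, so it does not address the question of large places such as 'California'.

```python
import math
from typing import List, NamedTuple, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
MAX_LIMIT = 50            # hypothetical server-side cap on the result count


class Article(NamedTuple):
    title: str
    lat: float  # degrees
    lon: float  # degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(articles: List[Article], lat: float, lon: float,
           radius_km: float, limit: int) -> List[Tuple[float, str]]:
    """Return (distance_km, title) pairs within radius_km, nearest first."""
    limit = max(0, min(limit, MAX_LIMIT))  # enforce the cap
    hits = [(haversine_km(lat, lon, a.lat, a.lon), a.title) for a in articles]
    hits = [h for h in hits if h[0] <= radius_km]
    hits.sort()
    return hits[:limit]
```

Sorting by distance means the capped result set always contains the closest articles, which is probably what a "near me" view wants.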

= Notes =

GeoNames
Find nearby Wikipedia Entries / reverse geocoding
This service comes in two flavors. You can either pass the lat/long or a postalcode/placename.

Parameters

 * postalcode, country, radius (in km), maxRows (default = 5)
 * Result: returns a list of Wikipedia entries as an XML document

Examples:

 * http://api.geonames.org/findNearbyWikipedia?lat=47&lng=9&username=demo
 * http://api.geonames.org/findNearbyWikipedia?postalcode=8775&country=CH&radius=10&username=demo
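
For reference, the two flavors of the GeoNames call can be built programmatically. The sketch below only constructs the request URLs from the parameters listed above and does not perform any network request; the helper names `by_coordinates` and `by_postal_code` are made up for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.geonames.org/findNearbyWikipedia"


def by_coordinates(lat, lng, username="demo"):
    """Build the lat/long flavor of the request URL."""
    return BASE_URL + "?" + urlencode(
        {"lat": lat, "lng": lng, "username": username})


def by_postal_code(postalcode, country, radius_km,
                   username="demo", max_rows=None):
    """Build the postal code flavor of the request URL."""
    params = {"postalcode": postalcode, "country": country,
              "radius": radius_km}
    if max_rows is not None:
        params["maxRows"] = max_rows  # omitted means the service default
    params["username"] = username
    return BASE_URL + "?" + urlencode(params)


print(by_coordinates(47, 9))
print(by_postal_code("8775", "CH", 10))
```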

GeoLoqi
(this was taken from a chat with GeoLoqi)

Articles with/without Images
We'd like a flag of whether an article has an image or not, without having to scan the articles themselves. This would help us to notify users that a given article needed an image. We'd probably push the message to users based on speed, so that those who were walking would be encouraged to take a picture, whereas those who were driving would be able to get the content without worrying about getting the image. We'd send the item to their activity stream with an "image needed" notification, and then they could go back to the place to take a photo later.


 * What this really needs is some semantic data -- you really want to check for existence of files with 'is-photo-of' relation or something. Presence of images in an article doesn't mean that they're images of the place; they may be icons, graphs, charts, maps, or all sorts of things other than photographs of the thing/place. --brion 03:53, 19 December 2011 (UTC)
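
The speed-based behavior described above could look roughly like the following. This is a sketch under assumptions: the `has_photo` flag is the per-article flag requested above (however it ends up being derived), and the 7 km/h cutoff between walking and driving is a hypothetical value chosen only for illustration.

```python
WALKING_MAX_KMH = 7.0  # assumed cutoff between walking and driving


def photo_prompt(has_photo, speed_kmh):
    """Decide how to surface an "image needed" notice for a nearby article.

    Returns None when the article already has a photo, "prompt_now" for
    users moving at walking speed, and "activity_stream" for faster users,
    who can come back later to take the picture.
    """
    if has_photo:
        return None
    if speed_kmh <= WALKING_MAX_KMH:
        return "prompt_now"
    return "activity_stream"
```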

Avoiding unnecessary querying
What we'd like to see added is a recent changes feed filtered by articles that contain coordinates. There's no way to get new geocoded articles without re-querying the entire Wikipedia database. We know that InfoChimps probably doesn't want to do that.

To make this more accessible to developers dealing with more than just geodata, it would be ideal to implement a recent changes feed that can be filtered by the templates an article contains, since the geocoordinates are stored in a template.
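
The filtering idea can be illustrated on an in-memory list of changes. The record shape used here (a dict with `title` and `templates` keys) is purely hypothetical and is not what any existing feed returns; `coord` is used only as an example template name.

```python
def changes_with_template(changes, template):
    """Keep only the changes whose article uses the given template.

    Template names are compared case-insensitively.
    """
    wanted = template.lower()
    return [
        change for change in changes
        if wanted in (name.lower() for name in change.get("templates", []))
    ]


recent = [
    {"title": "Golden Gate Park", "templates": ["Coord", "Infobox park"]},
    {"title": "Haversine formula", "templates": ["Reflist"]},
]
print([c["title"] for c in changes_with_template(recent, "coord")])
```

A client would then only need to process the filtered entries instead of re-querying every article.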

Approximate Object Sizes
One thing that was missing, and that we had to guess, was the approximate size of the object being described, so we had to put in some general rules for geofence sizes: if it was a park, size x; if a building, size y; and so on. Then we had to detect, for each article, whether it was a park or a building and change the geofence accordingly. Having a rough size would have helped, although this was not very difficult for us to approximate after looking through the dataset.
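
The rule-based approach described above amounts to a lookup table with a fallback. In the sketch below the radii are invented placeholders for "size x" and "size y", not the values Geoloqi actually used, and `approx_radius_m` stands for the rough object size that the API does not currently provide.

```python
# Hypothetical per-type geofence radii in metres (placeholders only).
GEOFENCE_RADIUS_M = {
    "park": 300.0,
    "building": 50.0,
}
DEFAULT_RADIUS_M = 100.0  # assumed fallback for unrecognized types


def geofence_radius_m(kind, approx_radius_m=None):
    """Pick a geofence radius for an article.

    Prefer a rough size supplied with the article; otherwise fall back
    to the per-type rule, and finally to a default.
    """
    if approx_radius_m is not None:
        return float(approx_radius_m)
    return GEOFENCE_RADIUS_M.get(kind, DEFAULT_RADIUS_M)
```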

Why InfoChimps over Wikipedia API?
Here's the InfoChimps API: http://www.infochimps.com/datasets/wikipedia-articles

The reason we used the InfoChimps API over the Wikipedia API is that there was no way to query for articles near a location in the Wikipedia API. We also didn't even know there was a Wikipedia API.

We didn't look at GeoNames (http://www.geonames.org/maps/wikipedia.html) until now, but we realized it has "types", such as city or landmark, which would have made it easier for us to choose appropriate radius sizes.

ToDo
You should add coordinates to the list of API queries at API:Properties. Using this API method we can get info about a page, such as its categories and images, but it would make sense to retrieve the page's coordinates as well.
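
As an illustration of the request, the sketch below builds a MediaWiki API query URL for a page's categories and images, with an optional extra property for coordinates. In the context of this draft, the `coordinates` property name is an assumption about what such an addition might be called; the URL is only constructed, not fetched.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def page_info_url(title, with_coordinates=False):
    """Build an action=query URL for a page's categories and images."""
    props = ["categories", "images"]
    if with_coordinates:
        props.append("coordinates")  # assumed name for the proposed property
    params = {
        "action": "query",
        "prop": "|".join(props),  # the API separates values with "|"
        "titles": title,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


print(page_info_url("Golden Gate Bridge", with_coordinates=True))
```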

Historical Events
We noticed that GeoNames also has Wikipedia articles that are "events", for instance the Battle of Jutland. What would be incredible, both for real-time data and for education, would be to have an events layer of Wikipedia articles in Geoloqi. We'd love to see events as an item in the GeoAPI: geocoded event articles that we could push into Geoloqi so that people could learn about history as they walk around. InfoChimps is missing this, and GeoNames does a great job of it.

Event Formatting
For an event, we'd need the date of the event easily readable. The most important things for this would be a date, the article text, and the geocode. The place would have both the coordinate and the link to the pagename of the place it happened. We'd love to encourage people to geotag events as well, for purposes of education. We'd love to enable ambient education for people walking around town.
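
One possible shape for such an event record is sketched below. The field names are hypothetical, and the coordinates and place page in the example are rough illustrative values rather than authoritative data.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class GeoEvent:
    title: str        # page name of the event article
    event_date: date  # date of the event, kept machine-readable
    lat: float        # geocode of the place
    lon: float
    place_page: str   # page name of the place where it happened
    text: str         # article text or extract


def to_json(event):
    """Serialize an event, writing the date in ISO 8601 form."""
    record = asdict(event)
    record["event_date"] = event.event_date.isoformat()
    return json.dumps(record)


jutland = GeoEvent(
    title="Battle of Jutland",
    event_date=date(1916, 5, 31),
    lat=56.7, lon=5.9,       # approximate, for illustration only
    place_page="North Sea",  # illustrative choice of place page
    text="...",
)
print(to_json(jutland))
```

Keeping the date as a structured field rather than free text is what makes it "easily readable" for a client that wants to sort or filter events.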

Good Stuff
Something that people are expecting that will be a problem (with the Geoloqi Wikipedia layer as is) is that they won't want to subscribe to all of Wikipedia, but subscribe by topic when walking around. We probably have everything we need for that because there are categories of Wikipedia articles we can query. We'd like to split them into topics people can subscribe to. This probably won't require changes.

Looking at the Wikipedia API as-is, we were able to find structured categories and images for each page, so that's great.
 * Wikipedia's category system is insane, it would be hard to squeeze something intelligible out of it both from developer's and end user's POV. Max Semenik 16:07, 21 December 2011 (UTC)