Mobile web projects/Collections Extract Api Spike

Collection Overview (a collection page list)

Each listed page is shown as a summary card with the following information:

Title and extract. The extract is limited to two sentences or 140 characters, with an ellipsis.

See https://trello.com/c/gyQmr588/7-spike-extracts-api for spike deliverables.

Example Query:

http://en.wikipedia.org/w/api.php?action=query&prop=extracts&format=jsonfm&exsentences=2&exintro=1&exsectionformat=plain&indexpageids=&pageids=18483349|9105985|4030456|39532875|24283208&continue=&exlimit=max

Does the existing API support what we want to do?
Yes. Supplying a list of page IDs returns the page titles and a configured extract of each page. The exintro param must be set.
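As a rough illustration of the request described above, the sketch below builds the same query from a list of page IDs and pulls the title and extract out of a decoded response. The function names are hypothetical. It uses format=json instead of the jsonfm debug format from the example query. It also assumes the response carries pages under query.pages, keyed by page ID, which this spike does not confirm.

```python
from urllib.parse import urlencode

# Endpoint taken from the example query in this page.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_extracts_url(page_ids, sentences=2):
    """Build an extracts query URL for a list of page IDs (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "extracts",
        "format": "json",  # assumption: machine-readable variant of jsonfm
        "exsentences": sentences,
        "exintro": 1,  # must be set, per the findings above
        "exsectionformat": "plain",
        "indexpageids": "",
        "pageids": "|".join(str(pid) for pid in page_ids),
        "continue": "",
        "exlimit": "max",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def cards_from_response(data):
    """Turn a decoded JSON response into card dicts.

    Assumes data["query"]["pages"] maps page ID strings to page objects
    with "title" and "extract" keys; missing keys fall back to "".
    """
    pages = data.get("query", {}).get("pages", {})
    return [
        {
            "pageid": page.get("pageid"),
            "title": page.get("title", ""),
            "extract": page.get("extract", ""),
        }
        for page in pages.values()
    ]
```

Keeping URL construction separate from response handling means either half can be checked without touching the network.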

If we clip to 2 sentences how useful are the extracts generated?
The random sample selected above seems to give reasonable results. Requesting plain format effectively strips wikitext. In some cases HTML is still present, though it would not be difficult to strip. The API will not add an ellipsis, but one can be added manually.
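The clean-up mentioned above could look roughly like the following. This is a minimal sketch, not the project's implementation: the helper name is hypothetical, the tag stripping is a simple regular expression rather than a real HTML parser, and it assumes the ellipsis is added only when the text is actually clipped to the 140-character limit.

```python
import html
import re

# Rough tag matcher; good enough for simple markup, not a full HTML parser.
TAG_RE = re.compile(r"<[^>]+>")


def clean_extract(raw, limit=140):
    """Strip leftover HTML and clip to `limit` characters (hypothetical helper)."""
    text = TAG_RE.sub("", raw)      # drop tags such as <p> or <b>
    text = html.unescape(text)      # decode entities such as &amp;
    text = " ".join(text.split())   # collapse newlines and repeated spaces
    if len(text) <= limit:
        return text
    # Reserve one character for the ellipsis, then back up to the previous
    # space so the cut does not land in the middle of a word.
    cut = text[: limit - 1].rsplit(" ", 1)[0]
    return cut.rstrip(" ,;:") + "…"
```

The result never exceeds the limit, because the ellipsis is counted inside it.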

Are there any performance problems with this? How many extracts can we safely retrieve on a list page view?
It seems not, provided we make a single API request with either a set of page titles or a set of page IDs. According to MaxSem, as long as we set the exintro param to 1 there should be no performance issues.
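The single-request approach only works while the ID list fits in one call. If a collection ever grows past whatever per-request cap applies, the list could be split as sketched below. The helper names are hypothetical, and the default size of 20 is a placeholder assumption, not a figure from this spike; the real cap should be checked against the API itself.

```python
def batches(page_ids, size=20):
    """Yield consecutive slices of page_ids, each at most `size` long.

    The default of 20 is a placeholder, not a documented limit.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(page_ids), size):
        yield page_ids[start:start + size]


def pageids_param(batch):
    """Join one batch into the pipe-separated form used by the pageids param."""
    return "|".join(str(pid) for pid in batch)
```

A collection that fits in one batch still produces exactly one request, which matches the recommendation above.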