Evaluating and Improving MediaWiki web API client libraries/Status updates/API:Tutorial notes

=Tutorial for MediaWiki's RESTful web service API=

Speaker: Roan Kattouw, who maintained the MediaWiki API from 2007 to 2009. Event: 2012 hackathon.

API: the RESTful web API. "API" is an overloaded term and there are other APIs; this is the web API.

Why should you use the web API? Bots (automated edits), AJAX (programmatically looking stuff up), JavaScript features such as gadgets, and other things.

Roan says: generally any Ajax feature is going to use the api.php entry point. But right now the easiest thing to do is to write a bot or to use the API clients.

Definitions

 * REST API for MediaWiki
 * exposes things MediaWiki has in the database or otherwise understands
 * does not include semantic stuff like "definition of a word in Wiktionary" or even "lead paragraph of an article"
 * usage: send HTTP requests (GET or POST) to the api.php URL, receive XML or JSON or other formats. You'll usually want JSON or XML.
 * JSON and XML and Representational state transfer (RESTful)

There are other things that also get casually called the MediaWiki API, like the internal interfaces that extensions and special pages can hook into. We're not talking about that right now, just the web API.


 * (possibly talk about how it works from the back end, if people ask)...

Everything in the database that's not private is exposed (public userpages too). Data and metadata; links between pages, images used on pages, history metadata, and more.

The API doesn't have semantic stuff like "what's the definition of this word in Wiktionary?" It does retrieve page text, the history of a page, and so on. (It doesn't parse pages, because pages are freely structured.) Includes geodata.

So, you send GET or POST HTTP requests [elaborate here] and get JSON or XML back (XML is to be deprecated). You can ask for other formats too.
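As a rough illustration of the request/response cycle described above, here is a minimal Python sketch using only the standard library. The parameter values are borrowed from the San Francisco example query that appears later in these notes; the function names and the User-Agent string are made up for this sketch and are not part of MediaWiki.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"  # entry point used in this talk


def build_url(params, api_url=API_URL):
    """Return the full GET URL for a dict of API parameters."""
    return api_url + "?" + urlencode(params)


def api_get(params, api_url=API_URL):
    """Send a GET request and decode the JSON reply (needs network access)."""
    # Placeholder User-Agent; use one that identifies your own tool.
    request = Request(build_url(params, api_url),
                      headers={"User-Agent": "api-tutorial-sketch/0.1"})
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


params = {"action": "query", "titles": "San_Francisco",
          "prop": "images", "imlimit": 20, "format": "json"}
print(build_url(params))
```

Only `build_url` runs when the block is executed; calling `api_get(params)` would perform the actual request.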

(Roan says JSONP needs to be documented better.)

There is no such thing as plaintext, just wikitext.

For this, using w:en:Special:ApiSandbox {make a redirect? Consistently name these across all wikis???}

"not optimized for educational use just yet"

Basic way things work: [blah.api.php][query string]


 * there are a bunch of wikis running MediaWiki software (the thing that runs wikipedia, etc. etc.)
 * each of these will have its own blah.api.php page
 * in this talk, we're using en.wikipedia.org/w/api.php.


 * MediaWiki developers write code, deploy it to the WMF sites, and release it as a tarball after problems have been found.
 * So different wikis _will_ be running different versions of the MediaWiki software and different versions of the API, so use each wiki's api.php as the authoritative documentation.
 * But there are not usually changes that break existing usage; any such changes will be announced on the api-announce list.

The api.php page is a GIANT YET TERSE AUTODOCUMENTED API PAGE. Useful bits: which parameters and which usages are accepted.

But you don't have to read that! In this case we will be using the API sandbox. It has things like example queries, though it is not great.

Format: usually want json. Demo uses xml, but this will be deprecated!

Action parameter is most often 'query'. There are tons of other ones but when in doubt look in query first. "Asking" for data from database.

Most of the parameters you will never use! Generally ok to leave them blank.

can click "examples" and it will fill out example for you.

fill it out, click "make request"

If you max out the limit, the query continue element will tell you how to get more.

imlimit will take "max", though it should otherwise be a number. The maximum depends on the account's limit.

How many requests can you make? There are no limits on reads per second, but they reserve the right to block you. There are generally some limits on edits per second (not API-specific). The community will probably block "editing rampages".

Searching for all images: even requesting the maximum is not enough. The query-continue element tells you how to continue; in this case, set "aifrom" to "[imagename].jpg". (Stopped at 21 min.)
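The continuation step can be sketched roughly as follows. This assumes the JSON reply carries a "query-continue" object that maps a module name to the parameters to add to the next request (such as "aifrom"); that shape, the sample reply, and the function names are assumptions for illustration. Newer MediaWiki versions may report continuation differently, so check the wiki's api.php.

```python
def next_params(params, response):
    """Return the parameters for the follow-up request, or None when done."""
    cont = response.get("query-continue")  # assumed name of the element
    if not cont:
        return None
    merged = dict(params)
    # Each module (e.g. "allimages") contributes its own continue values.
    for module_values in cont.values():
        merged.update(module_values)
    return merged


def fetch_all(params, fetch):
    """Call fetch(params) repeatedly, yielding replies until nothing is left."""
    while params is not None:
        response = fetch(params)
        yield response
        params = next_params(params, response)


# Hypothetical reply, shaped like the case described in the notes.
sample = {"query-continue": {"allimages": {"aifrom": "Example.jpg"}},
          "query": {"allimages": []}}
first = {"action": "query", "list": "allimages",
         "ailimit": "max", "format": "json"}
print(next_params(first, sample))
```

Here `fetch` is any function that takes a parameter dict and returns the decoded reply, which keeps the loop independent of how the HTTP request is made.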

How to use it
Follow along by using w:en:Special:ApiSandbox -- query is what you will usually want.
 * Entry point: http://en.wikipedia.org/w/api.php (see API:Main page)
 * or any other wiki
 * Talk about versioning and how non-WMF wikis might have different version of MediaWiki and thus the API
 * https works too!
 * Parameters are passed in query string. Not passing any will give you the help page with the autogenerated documentation.
 * Example query: ?action=query&titles=San_Francisco&prop=images&imlimit=20&format=jsonfm
 * action=query is used for most read actions; separate action= modules exist for write actions
 * titles= takes one or more titles for the query to operate on
 * prop=images lists the images on a page; lots of other stuff in prop=, list=, meta=
 * imlimit= sets the max # of results. Default is 10, 'max' works
 * popular values for format= : xml, json, xmlfm (default), jsonfm
 * If you want to find sections from the table of contents, use  using the   property, and you can call 0 for the wikitext that comes before the first section header.
 * State-changing actions (e.g. editing)
 * POST requests only
 * two-step process involving token
 * details of individual actions are complex, read the docs
 * example: ?action=query&titles=Foo&prop=info&intoken=edit for obtaining edit token, then POST to
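The two-step token process for state-changing actions might look roughly like this in Python. The token request mirrors the example in the list above. The reply shape (an "edittoken" field under query.pages) and the action=edit parameter names (title, text, summary, token) are assumptions based on the edit module as documented at the time, not something these notes spell out; confirm them against the wiki's api.php before relying on them.

```python
from urllib.parse import urlencode


def token_request_params(title):
    """Step 1: parameters that ask for an edit token for one page."""
    return {"action": "query", "titles": title, "prop": "info",
            "intoken": "edit", "format": "json"}


def extract_edit_token(response):
    """Pull the token out of the step 1 reply (assumed reply shape)."""
    for page in response["query"]["pages"].values():
        if "edittoken" in page:
            return page["edittoken"]
    raise KeyError("no edit token in response")


def edit_post_body(title, text, token, summary=""):
    """Step 2: URL-encoded body for the POST request (assumed parameters)."""
    fields = {"action": "edit", "title": title, "text": text,
              "summary": summary, "token": token, "format": "json"}
    return urlencode(fields).encode("ascii")


# Hypothetical step 1 reply for illustration only.
sample = {"query": {"pages": {"-1": {"title": "Foo", "edittoken": "+\\"}}}}
token = extract_edit_token(sample)
print(edit_post_body("Foo", "Hello", token, summary="test edit"))
```

The token contains characters such as "+" that must be URL-encoded, which is why the body is built with `urlencode` instead of by string concatenation.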

If you want to make a lot of API calls, for example by running very busy and active bots, please talk to the admins of that wiki ahead of time so they do not block you. Also run your requests in serial, not in parallel. ''resource for contacting them to go here. TODO''
 * There are limits in the software on how many edits per second you can make.
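One simple way to keep requests serial is to loop over them one at a time with a pause in between. The sketch below is only an illustration: the one-second default delay and the function name are arbitrary choices made here, not a value or helper recommended by the notes.

```python
import time


def run_serially(requests, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch on each request in turn, pausing between calls."""
    results = []
    for index, params in enumerate(requests):
        if index > 0:
            sleep(delay_seconds)  # wait before every request after the first
        results.append(fetch(params))
    return results


# Demo with a stand-in fetch function and no real waiting.
titles = ["Kanichar", "MS_Riverdance"]
demo = run_serially([{"titles": t} for t in titles],
                    fetch=lambda p: p["titles"].lower(),
                    delay_seconds=0)
print(demo)
```

Passing `sleep` as a parameter makes the pacing easy to test or replace without changing the loop.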


 * Example nouns to look up:
 * Kanichar
 * Kolar_Gold_Fields
 * Cooperative_principle
 * MS_Riverdance

Magic recipes
Nonobvious and very useful.
 * Things you'll definitely need:
 * for basic page info
 * for page history
 * for page wikitext
 * for page HTML
 * Doing crazy stuff
 * multiple titles with  (This will make multiple calls count as one for the purpose of rate limiting)
 * This works for pages but not revisions. Read the documentation via the Sandbox or via   autodocs.
 * multiple modules with
 * generators (kind of like UNIX pipes) with
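The multiple-titles recipe can be sketched as below, assuming titles are joined with the pipe character in the titles= parameter. The batch size of 50 and the helper names are assumptions for illustration; the real per-request limit depends on the wiki and the account, so check api.php.

```python
def title_batches(titles, batch_size=50):
    """Yield pipe-joined groups of titles, one group per request."""
    for start in range(0, len(titles), batch_size):
        yield "|".join(titles[start:start + batch_size])


def multi_title_params(titles, prop="info", batch_size=50):
    """Build one parameter dict per batch of titles."""
    return [{"action": "query", "titles": batch, "prop": prop,
             "format": "json"}
            for batch in title_batches(titles, batch_size)]


# The example nouns from these notes, split into batches of three.
nouns = ["Kanichar", "Kolar_Gold_Fields",
         "Cooperative_principle", "MS_Riverdance"]
for request in multi_title_params(nouns, batch_size=3):
    print(request["titles"])
```

Each resulting dict describes a single request covering several pages, which is what lets multiple lookups count as one call.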

Resources

 * Getting help
 * Autogenerated documentation: api.php with no parameters, such as https://en.wikipedia.org/w/api.php
 * Documentation on mediawiki.org: API:Main page (details about specific modules/parameters often outdated, autogenerated docs are authoritative)
 * The API Sandbox -- example w:en:Special:ApiSandbox
 * mediawiki-api -- mediawiki-api@lists.wikimedia.org
 * mediawiki-api-announce -- mediawiki-api-announce@lists.wikimedia.org - PLEASE subscribe because we tell you about breaking changes, which happen a few times every year.
 * #mediawiki on irc.freenode.net
 * me! (Roan Kattouw)

You may actually want

 * The dumps of all of Wikipedia so you can work with them locally - http://dumps.wikimedia.org/
 * Offline Wikipedia readers such as Kiwix http://www.kiwix.org