Evaluating and Improving MediaWiki web API client libraries/Status updates

19 May 2014
Today I officially start my OPW internship!


 * Things To Do:
 * put Evaluating_and_Improving_MediaWiki_web_API_client_libraries/Status_updates/Search_results into API:Client Code
 * begin evaluating libraries against the following criteria:
 * Has it been updated in the last 12 mo?
 * Does it have a lot of open bugs/pull requests, especially compared to the number closed?
 * Does it have documentation, code samples, and tests provided?
 * Does it, at a minimum, handle logins/cookies/continuations? (even "syntactic sugar" libraries should do these things)
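
To make the "continuations" criterion concrete, here is a minimal sketch of what a library would have to do on the user's behalf. It assumes the newer `continue`-style responses, where the values under `continue` are merged into the next request. The `fetch` callable and the canned data in `fake_fetch` are hypothetical stand-ins for a real HTTP call, not part of any library listed here.

```python
def query_all(fetch, params):
    """Yield each batch of results, following 'continue' blocks until none remain."""
    request = dict(params)
    while True:
        response = fetch(request)
        if "query" in response:
            yield response["query"]
        if "continue" not in response:
            break
        # Start again from the original parameters and add the continuation values.
        request = dict(params)
        request.update(response["continue"])


def fake_fetch(request):
    """Hypothetical stand-in for an HTTP request: serves two pages of results."""
    if "apcontinue" not in request:
        return {"query": {"allpages": [{"title": "A"}]},
                "continue": {"apcontinue": "B", "continue": "-||"}}
    return {"query": {"allpages": [{"title": "B"}]}}


titles = [page["title"]
          for batch in query_all(fake_fetch, {"action": "query", "list": "allpages"})
          for page in batch["allpages"]]
```

A library that does not do something like this leaves the caller with only the first batch of results, which is why the criterion is on the list.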


 * Results, resources, misc from an IRC meeting with Sumana, Tollef, et al.:
 * Reminder that github graphs exist, like: https://github.com/dreamwidth/dw-free/graphs/contributors
 * Data for Wikimedia traffic:
 * https://stats.wikimedia.org/wikimedia/squids/SquidReportClients.htm
 * http://stats.wikimedia.org/wikimedia/squids/SquidReportCrawlers.htm
 * Breaking changes to the API (and therefore a timeline of changes that API client library developers should have taken note of) are, or at least very much should be, mentioned in the release notes (e.g. Release_notes/1.22), on the http://lists.wikimedia.org/pipermail/mediawiki-api-announce/ list, and in HISTORY: https://git.wikimedia.org/blob/mediawiki%2Fcore.git/master/HISTORY
 * the support-matrix on Wikia (http://api.wikia.com/wiki/Client_libraries#Notes) was last updated in 2011 (http://api.wikia.com/wiki/Client_libraries?action=history), which I believe is after Wikidata was started
 * Python's requests library handles cookies: http://docs.python-requests.org/en/latest/user/quickstart/#cookies
 * IRC:
 * http://en.flossmanuals.net/GSoCStudentGuide/ch014_communication-best-practices/
 * pastebin for sharing multiline code/error/results things: http://tools.wmflabs.org/paste/
 * http://www.harihareswara.net/sumana/2014/02/26/0
 * Wikimedia mailing lists: Mailing_lists/Overview
 * commentary on localization: http://aharoni.wordpress.com/2011/08/24/the-software-localization-paradox/ (came up when discussing the [lack of] API localization)
 * from commentary on pywikibot, Python 2 vs. 3 as another slow deprecation process:
 * http://python-notes.curiousefficiency.org/en/latest/python3/questions_and_answers.html
 * https://wiki.python.org/moin/Python2orPython3
 * http://python-notes.curiousefficiency.org/en/latest/python3/questions_and_answers.html#why-is-python-3-considered-a-better-language-to-teach-beginning-programmers
 * http://python-notes.curiousefficiency.org/en/latest/python3/questions_and_answers.html#slow-uptake

2 April 2014
Currently reading: http://aosabook.org/en/index.html.

Evaluating and Improving MediaWiki web API client libraries/Status updates/Search results

http://wikiconferenceusa.org/wiki/Submissions:Using_web_API_client_libraries_to_play_with_and_learn_from_our_%28meta%29data

http://notabilia.net/

http://journal.code4lib.org/articles/8962

http://blog.hatnote.com/

http://seealso.hatnote.com/

Wikimedia research hub: m:Research:Resources. List of tools: http://wikipapers.referata.com/wiki/List_of_cross-platform_tools.

12-19 March 2014

 * Starting out:
 * Learned what APIs are and discussed with Sumana what the point of an API library is
 * Ideally, it provides affordances that let you access the deeper wiki structure in an intuitive and functional manner
 * Asked around for well-documented APIs; other people suggested:
 * Ruby/S3 SDK
 * Google Drive
 * Google Android
 * Mailchimp
 * Looked at the code and the documentation for the Python libraries listed on API:Client Code
 * Noticed that some of the libraries created layers of abstraction around the MediaWiki API, and others were very simple wrappers over the MediaWiki API
 * Compared the three simple libraries on whether they are maintained, how good their documentation is, and whether they include unit tests (see the early revision).
 * Attempted to start testing the simplemediawiki library...
 * ...but flailed very hard at setting up my tools for it. My portable computer only has Windows working on it right now, so, lessons learned:
 * I already had Python 2.7 installed, but it turned out that I didn't have a package manager. It additionally turned out that pip is ironically difficult to install on Windows.
 * I tried installing setuptools with the installer it came with and then installing pip with setuptools. When I then tried to use pip to install the simplemediawiki library, I got error messages referencing "egg_info failed", which is usually associated with a bad package installer.
 * A recommended Windows .tar.gz unzipper is http://www.7-zip.org/. Note that you have to run it twice, once for the .tar and once for the .gz. This is apparently a.
 * What finally worked: setup.py to install setuptools, setuptools to install pip, and pip to install simplemediawiki, mwclient, and requests. Success!
 * Writing test scripts
 * Started trying to use simplemediawiki to make API calls, initially trying those suggested in the API sandbox.
 * Problems along the way:
 * Figured out that the call function was very close to the actual API calls. At first I wasn't clear that 'action' is a parameter name to keep (with e.g. 'wbsearchentities' as its value) rather than something to be replaced, but once I looked at the API documentation I could see that the same arguments that the API normally took were simply passed in as a dict.
 * Figured out not to try Wikidata API calls with the MediaWiki page!
 * Tested queries of various sorts, including ones that returned data on missing pages
 * See representative tests here, with their results
 * API calls with get seemed to be working ok, so I started testing page-editing capabilities
 * Created an account for User:fhocutt bot
 * Tokens were confusing (remembering Python syntax helps; they're not fetched as JSON; also see: http://stackoverflow.com/questions/17730144/getting-a-python-error-attributeerror-dict-object-has-no-attribute-read-t)
 * The documentation on tokens and bots was somewhat helpful: Manual:Edit token, API:Tokens, User-Agent policy, Bot_policy
 * but: http://www.mediawiki.org/w/api.php?action=tokens and http://www.mediawiki.org/w/api.php?action=tokens&type=edit both give me an empty string for tokens, and I can't get sandbox API calls with &action=edit to work because I don't have a token. Trying to use the ones that the script gives User:fhocutt bot yields a badtoken error.
 * I got tokens and 'edit' working with simplemediawiki! See this pastebin and API:Client Code/Access Library Comparison for details.
 * Conclusion based on current work:
 * simplemediawiki makes it easy to call the API fairly directly from a simple Python bot. So far, it works whenever I pass it the arguments it expects.
 * To do: I haven't tested any POST calls besides edit, so I don't know whether login/cookies/tokens work with those.
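
The point above about arguments being "simply passed in as a dict" can be illustrated without any client library. This is only a sketch using the standard library, not simplemediawiki's own code: `build_api_url` is a hypothetical helper, and the search term is an arbitrary example. It also shows why Wikidata calls need Wikidata's own endpoint.

```python
from urllib.parse import urlencode

# Wikidata modules such as wbsearchentities live on Wikidata's endpoint,
# not on the mediawiki.org one.
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_api_url(params, api_url=WIKIDATA_API):
    """Turn a dict of API arguments into a GET URL (hypothetical helper)."""
    query = dict(params)
    query.setdefault("format", "json")  # ask for JSON unless the caller chose otherwise
    # Sort the items so the resulting URL is stable and easy to compare.
    return api_url + "?" + urlencode(sorted(query.items()))


# 'action' stays as the key; the module name is its value.
url = build_api_url({
    "action": "wbsearchentities",
    "search": "Douglas Adams",
    "language": "en",
})
print(url)
```

The dict keys are exactly the parameter names from the API documentation, which is what makes a thin wrapper of this kind easy to reason about.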


 * Started mwclient tests
 * Once it was installed (also fine once I had pip), I looked at the documentation and pretty easily got it working for GET calls, though you have to take care with capitalization or you get errors similar to this. Keeping the variable names in the sample code distinct from the available method names would help users new to Python avoid this. (See: https://wiki.python.org/moin/BeginnerErrorsWithPythonProgramming.)
 * See API:Client Code/Access Library Comparison for details

Resources

 * MediaWiki collaboration tools
 * Wikimedia pastebin
 * Example, shared on IRC with Sumana: https://tools.wmflabs.org/paste/view/1394197e
 * MediaWiki code
 * Bugzilla list of open API bugs
 * Using this search page and searching for "API" yielded no results, but the search textbox in the upper right corner does return results
 * Submit a bug


 * Learning styles resources for engineers/scientists
 * Learning styles as used at Hacker School
 * I love that Mel addresses the "but I don't fit into either of these options!" objection, because I thought precisely that at several points on the quiz
 * Quiz to figure out your own
 * my results and reflections on them
 * Description of 4 learning-style spectra


 * MediaWiki API resources
 * Special:APISandbox not Special:API Sandbox
 * API:Client code
 * Project:Sandbox
 * API
 * API:Tutorial
 * the Wikidata API sandbox
 * Extension:Wikibase/API


 * Other MediaWiki resources
 * Manual:Coding conventions/Python


 * Other API resources
 * Google, Ruby, S3 APIs
 * Ch. 1-2 of RESTful Web Services
 * Beginner's guide for journalists who want to understand API documentation: a short guide to the idea of APIs and their usual documentation; assumes no previous experience with them


 * Test pages/wikis, ok to use for trial edits
 * https://test.wikipedia.org/wiki/Main_Page
 * https://test2.wikipedia.org/wiki/Main_Page
 * Project:Sandbox
 * Not on my bot's talk page...