Evaluating and Improving MediaWiki web API client libraries

This is Frances Hocutt's project for May-August 2014, in which she is evaluating MediaWiki Web API Client Libraries as an OPW intern for Wikimedia.


Evaluating MediaWiki web API client libraries

 * Public URL: https://www.mediawiki.org/wiki/Evaluating_MediaWiki_web_API_client_libraries
 * Bugzilla report:
 * https://bugzilla.wikimedia.org/show_bug.cgi?id=62806 (Java)
 * https://bugzilla.wikimedia.org/show_bug.cgi?id=62808 (Perl)
 * https://bugzilla.wikimedia.org/show_bug.cgi?id=62809 (Python)
 * https://bugzilla.wikimedia.org/show_bug.cgi?id=62810 (Ruby)
 * Announcement: http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/76046
 * Output summary
 * API:Client code/Evaluations
 * API:Client code/Gold standard
 * API:Client code

Name and contact information

 * Name: Frances Hocutt
 * Email: franceshocutt@gmail.com
 * IRC or IM networks/handle(s): fhocutt
 * Web Page / Blog / Microblog / Portfolio: http://franceshocutt.com / http://twitter.com/franceshocutt
 * Resume (optional): http://franceshocutt.com/cv/
 * Location: Seattle, WA, USA
 * Typical working hours: between 1 pm and 2 am PDT (can move earlier if more convenient)

Synopsis
This project will select the best MediaWiki API client libraries in four languages. The existing actively maintained libraries will be evaluated on functionality, usability, presence/quality of documentation, level of abstraction, and any other relevant criteria. The best of these will be evaluated in more depth against a "gold standard" ideal, and will be provided with TODOs that will lay out the next steps to reach it along with testing files and documentation improvements. This project will then focus on substantially improving one of these libraries through the submission of bug reports, bugfixes, and expanded documentation.

This project will improve API:Client code and take the guesswork out of choosing a useful and appropriate library. It will provide a clear target standard for MediaWiki API client library developers to aim for, hopefully resulting in increased functionality and usability of available libraries. Additionally, the functionality and documentation of one MediaWiki API client library will be significantly improved.


 * Mentors: Sumana Harihareswara, Co-mentor Tollef Fog Heen, Technical Advisers Merlijn van Deen and Brad Jorsch.

Deliverables

 * Community Integration Period (8-10 working days):
 * Set up computer: install necessary languages and interpreters/compilers, install git, get everything working (3 days)
 * Computer is up and dual-booting Linux/Windows; after much trying (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1239578) the wireless is semi-functional
 * Python virtualenv is set up, Perl, Ruby, and possibly Java are installed, git is installed and configured
 * TODO: set up dev environments for Perl, Ruby, JS, Java; agreed with mentors that this could happen on an as-needed basis
 * Search for the existing API client libraries, add these to API:Client code
 * https://www.mediawiki.org/wiki/Evaluating_and_Improving_MediaWiki_web_API_client_libraries/Status_updates/Search_results
 * Added/sorted libraries from search results, 19 May 2014
 * Communicate with mentors and MediaWiki community
 * Communicated over IRC and email with my mentors, asked questions in Wikimedia IRC channels, introduced myself to wikitech-l as a pending intern
 * Work on a mini-project to explore the MediaWiki API
 * Wrote a Twitter bot that tweets obscure facts (sourced from Wikipedia, Wikivoyage, Wikibooks, and Wikidata)
 * https://twitter.com/AutoWikiFacts
 * source at https://github.com/fhocutt/obscure-enwiki-fact
 * Submit first pull request to a git repository
 * submitted patches for Dreamwidth and had them merged: https://github.com/dreamwidth/dw-free/pulls/fhocutt?direction=desc&page=1&sort=created&state=closed
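
For context, a bot like the mini-project above needs only a few unauthenticated calls to the MediaWiki web API. The following is a minimal Python sketch, not the code from the linked repository: it builds an `action=query&list=random` request URL and pulls page titles out of a JSON response. The function names and the sample response are illustrative assumptions; the endpoint shown is English Wikipedia's public `api.php`.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: English Wikipedia's public API entry point.
API_URL = "https://en.wikipedia.org/w/api.php"


def random_pages_url(limit=1, api_url=API_URL):
    """Build a URL asking the API for `limit` random main-namespace pages."""
    params = {
        "action": "query",
        "list": "random",
        "rnnamespace": 0,  # 0 = main (article) namespace
        "rnlimit": limit,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def titles_from_response(body):
    """Extract page titles from the JSON text of a list=random response."""
    data = json.loads(body)
    return [page["title"] for page in data.get("query", {}).get("random", [])]


# Hand-written sample in the shape the API returns; not real output.
SAMPLE = '{"query": {"random": [{"id": 1, "ns": 0, "title": "Example page"}]}}'

if __name__ == "__main__":
    print(random_pages_url(limit=2))
    print(titles_from_response(SAMPLE))
```

Keeping URL construction and response parsing separate from the HTTP call makes both easy to test without network access; an actual bot would fetch the URL with any HTTP client and pass the response body to `titles_from_response`.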


 * Week 1:
 * Select the best library/libraries in Python, Perl, JavaScript, Ruby, and Java. (required)
 * Done: API talk:Client code
 * Research and decide on criteria to evaluate these libraries in more depth.


 * Week 2:
 * Finish writing up evaluation criteria/library "gold standard" (required)
 * Evaluate Python libraries (building on work from microtask)
 * Post "gold standard" and useful resources found while writing it to Data & Developer Hub (required)
 * Data & Developer Hub

Continue evaluating the best libraries in each language. Submit to the library maintainers: documentation of tests and results, praise for what they get right, documentation written while evaluating (optional), and bug reports for areas that can use improvement.

Update resource lists (such as http://wikipapers.referata.com/wiki/List_of_tools and Data & Developer Hub) with the best libraries I find. Write an "About web API client libraries" page for Data & Developer Hub. If time permits, write a better API:Tutorial page.


 * Week 5:
 * Finish evaluating/writing TODOs for libraries in the third language. (required)


 * Week 7:
 * Finish writing TODOs for Python libraries
 * wikitools
 * mwclient
 * pywikibot
 * Check these with Merlijn and other mentors
 * Write an evaluation email template to make sending evaluations to individual maintainers easier
 * Email evaluations to maintainers
 * File issues/bug reports
 * simplemediawiki
 * mwclient
 * pywikibot
 * Add links to Data & Developer Hub
 * Email mediawiki-api with links to Python evaluations


 * Fill in evaluations for Perl libraries (MediaWiki::API and MediaWiki::Bot)
 * Look at the libraries myself, observing the 20-minute rule for asking for help
 * Pair with Brad and Tollef to evaluate/review the code
 * Evaluate the project history (pull requests/issues filed/etc.)
 * Write TODOs for both Perl libraries
 * Check these with Brad, Tollef, and other mentors
 * Email evaluations/TODOs to maintainers
 * File issues/bug reports


 * Start looking at Ruby libraries, following the 20-minute rule.
 * Go through the initial evaluation on the Sunflower gem


 * Week 8:
 * Set up pairing IRC meetings


 * Fill in evaluation for Ruby librar(y/ies)
 * Mediawiki::Gateway
 * Sunflower (do initial evaluation)
 * Pair with a Ruby dev to evaluate/review the code
 * Evaluate the project history
 * Write TODOs
 * Check TODOs with mentors
 * Email maintainers with broad evaluation and link to TODOs
 * File issues/bug reports


 * Evaluate Java libraries
 * JWBF
 * Wiki.java
 * Pair with a Java dev (Tollef?) to evaluate/review the code
 * Evaluate the project history
 * Write TODOs
 * Check TODOs with mentors
 * Email maintainers with broad evaluation and link to TODOs
 * File issues/bug reports


 * Select library to make in-depth contributions to (required). It will be one of the Python ones unless there is a very obvious other choice; the most likely options are improving mwclient or pulling bits out of pywikibot to make a lighter-weight, easier-to-use library that doesn't require logging in to run queries.

Generally:
 * Finish evaluating/writing TODOs for libraries in all four languages. (required)
 * On consultation with my mentors and due to time constraints, I will not be evaluating the available libraries in JavaScript.


 * Week 9:
 * Begin contributing to chosen library; file bug reports. (required)
 * https://github.com/eldur/jwbf/issues/24
 * https://github.com/eldur/jwbf/issues/21
 * https://github.com/eldur/jwbf/issues/22
 * https://github.com/eldur/jwbf/issues/23
 * https://github.com/eldur/jwbf/issues/25
 * Determine goals for Weeks 10-12. They should include filing bug reports, submitting bugfixes, and improving the library's documentation. (required)


 * Weeks 10-13:
 * Carry out goals set in Week 9:
 * Documentation:
 * Revise the README to make it clearer, and friendlier to new developers
 * Write a document to provide an overview of the library for developers and suggestions on where to start when writing an API client
 * Submit pull requests for both of these.
 * Bugfixes:
 * Spoke with the maintainer, who suggested I work on this: https://github.com/eldur/jwbf/issues/18
 * Get set up for development on jwbf
 * Look at how the API implements search and think about how to implement it in the context of this library
 * Write a patch and tests to implement search as described in https://github.com/eldur/jwbf/issues/18
 * Debug, test, document, etc.
 * Follow-up and administrative work:
 * Write wrap-up blog post
 * Send wrap-up email to mediawiki-api
 * Post wrap-up on MediaWiki
 * Complete final evaluation
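
As background for the search work on jwbf issue 18 listed above: the web API exposes full-text search through `action=query&list=search`, and long result sets are paged with continuation values that the client echoes back on the next request. The sketch below shows that loop in Python rather than Java, purely as an illustration; it is not jwbf's design. `fetch` is a hypothetical callable standing in for the HTTP layer, `fake_fetch` serves canned data, and the `continue` block follows the current API format (older installations use a `query-continue` form instead).

```python
def search_params(term, limit=10):
    """Base parameters for a full-text search request."""
    return {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srlimit": limit,
        "format": "json",
    }


def iter_search_titles(fetch, term, limit=10):
    """Yield titles of all search hits, following API continuation.

    `fetch` is a hypothetical callable: it takes a dict of request
    parameters and returns the decoded JSON response as a dict.
    """
    params = search_params(term, limit)
    while True:
        data = fetch(params)
        for hit in data.get("query", {}).get("search", []):
            yield hit["title"]
        cont = data.get("continue")
        if not cont:
            break
        # Echo the continuation values back on the next request.
        params = {**search_params(term, limit), **cont}


def fake_fetch(params):
    """Stand-in for the HTTP layer: serves two canned pages of results."""
    if "sroffset" not in params:
        return {
            "query": {"search": [{"title": "A"}, {"title": "B"}]},
            "continue": {"sroffset": 2, "continue": "-||"},
        }
    return {"query": {"search": [{"title": "C"}]}}


if __name__ == "__main__":
    print(list(iter_search_titles(fake_fetch, "example", limit=2)))
```

Exposing the results as a generator hides the paging from callers, which is the kind of abstraction a client library can offer over the raw API.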

Participation
I plan to publish my weekly status updates on Evaluating and Improving MediaWiki web API client libraries/Progress Reports, including links to any place outside of MediaWiki where I have contributed code or documentation (i.e. the repositories for the various client libraries). I will also stay in contact with my mentors through regular videochats and over IRC. I plan to publish my in-progress work on the project page (or sub-pages). I plan to ask for help on IRC, wikitech-l, or personal chat/email with my mentors. Less formal notes will be kept on Evaluating and Improving MediaWiki web API client libraries/Status updates.

About you

 * Education completed or in progress:


 * University of Washington, MS, Chemistry, 2012–June 2014 (planned graduation date).
 * University of Washington, MS, Materials Science and Engineering, 2010–2012.
 * Harvey Mudd College, BS with distinction, Chemistry, 2003–2007.


 * How did you hear about this program?

I first heard about the OPW through Twitter—@callbackwomen, @ashedryden, and @hypatiadotca all promoted it. Later, Sumana Harihareswara reached out to me with encouragement and a few project suggestions.


 * Will you have any other time commitments, such as school work, another job, planned vacation, etc., during the duration of the program?

I will be attending OSBridge June 24–27 and volunteering there. I will definitely be traveling June 19–22; I may leave as early as June 15.


 * We advise all candidates eligible to Google Summer of Code and FOSS Outreach Program for Women to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?

I will not be enrolled during the Spring 2014 quarter, so I will not be eligible for Google Summer of Code.

 * Why I want to make this project happen:

I am passionate about tools and I define "tools" broadly. No matter what discipline, I usually find myself making tools, sharing them, making them better, and teaching other people how to use them.

As a chemist, I developed a new reaction—a new tool for forming chemical bonds—optimized the reaction conditions, and tested it with a range of chemical inputs to find what types of chemicals would lead to high yields of product and what types would lead to none. As a handspinner, I've started modeling the mechanical relationships that let my spinning wheel vary the twist and tension on the forming thread. As a hackerspace organizer, I notice where tools for communication and collaboration can fit into our workflow, and along with the other members I try to find what works best for us.

The MediaWiki platform is another tool. Wikis enable collaboration, easy sharing of information, and user-built databases that have the potential to continually improve through the small contributions of many. By improving the MediaWiki API client libraries, I would make it easier for users and maintainers to access these wikis and use the data in them for their own purposes. By evaluating the libraries and selecting the best, I would inform potential users which of these tools are the most effective and the easiest to use. As I completed my evaluation I would offer a clear roadmap for improvement. In the final portion of this project I would improve one of these libraries with bugfixes and more documentation. I would, finally, make one of these tools as easy and effective to use as I could.

Past experience

 * Please describe your experience with any other FOSS projects as a user and as a contributor:

I've been using free/open source software for nearly a decade now. All of my computers run Ubuntu, which makes it easy for me to install and use other open source programs. I use Audacity for audio processing, Inkscape for image creation and manipulation, VLC for media, Chromium as a web browser, and more. In my academic work, I used LaTeX and its open source editors for document management and creation, and I ported existing Matlab scripts to Python so that I could expand on them to model the dynamics of a system I was studying. Like many students, I used Wikipedia as a jumping-off point for further research, and eventually I started making my own contributions (mostly anonymously). I've used LiveJournal and Dreamwidth as online journaling platforms for years now. I've recently started contributing to Dreamwidth as a developer, and to do so I've been learning Perl, vi, and git.


 * Please describe any relevant projects that you have worked on previously and what knowledge you gained from working on them (include links):

I have recently started development work for the Dreamwidth journaling platform. I checked out my first bug recently, and my first pull request is here. My second is here. When I'm more familiar with the codebase I plan to contribute to Dreamwidth's ongoing and extensive project of converting sections of the codebase from a custom markup language to Template Toolkit.
 * Dreamwidth

Last May, a small group of us started the Seattle Attic, the first hacker/makerspace explicitly founded on feminist principles. This project is not explicitly F/OSS related (although it has already fostered in-person collaboration) but it is a space founded on open culture. I wrote our code of conduct, our bylaws, and our statement of values. All of these are essentially documentation for our in-person community. These involved persuasively describing expected behavior; taking what I needed from a variety of sources and creating a coherent and consistent whole; and incorporating community feedback to write a statement of values that we all agreed was accurate. I would draw on these skills when approaching standards, TODOs, and documentation.
 * Seattle Attic Community Workshop

I've described what I've done for my microtask at User:fhocutt. The results can be seen here: API:Client Code/Access Library Comparison.
 * MediaWiki


 * What project(s) are you interested in (these can be in the same or different organizations)?

I am most interested in the Evaluating MediaWiki web API client libraries project with the Wikimedia Foundation.

Any other info

 * My preferred learning styles

This came out of a discussion with Sumana Harihareswara, based on her recent experience at Hacker School. I took this quiz, which gives an idea of where you fall in these learning types.

I ended up with a mild preference (3 pts) for Reflective, Intuitive, and Global, and a very mild preference (1 pt) for Visual.

Verbal-Visual-Active are all linked for me. Taking notes (longhand or typed) helps me retain information more than reading, watching, or listening alone. Talking out a concept/plan with someone else, or discussing what I understand and where I see it connecting to other things, is often very useful. I absorb information quickly from written text.

I can't stand informational videos. Put it in text and include any necessary illustrations and I'll learn it faster, retain it better, and be able to refer to it more easily. That said, I learn well from lectures, especially when I take notes while the instructor writes/draws on the board. I learn well in a traditional classroom.

I need both the parts and the whole, and if either is missing I'm going to be frustrated. I love learning details and getting meticulous work just right, but without context it's busywork. "Trust me; this is how it needs to be done, and I will explain later" counts as context. I like to be told, "The details will make this needlessly complicated; let's zoom out;" it lets me know I can go back for the details later without missing anything too important. "This is where I blatantly lie to you for your own good" (when someone is explaining fractally complicated concepts) lets me know (a) that there is a subtlety there and (b) that I don't need to worry about it right now.

I'm not sure whether it's possible to separate my reflective and global learning tendencies. Once I get enough information I see the patterns and then I can make the leaps I need to and the patterns just make sense. I do need some time to let the information/connections settle. I love interdisciplinary work and applying techniques/information/theories in unexpected ways.

This is a common workflow for me: read voraciously and mainline information. Goof off, procrastinate, clean the house, update my computer, read more things that are tangentially related, start to make an outline, decide it's WRONG ALL WRONG, reshuffle it and start writing bits of it (blocking out the not-quite-there bits with [not-quite-the-right-thing]), figure out what my thesis actually is, iterate, reshuffle, repeat. Write title/abstract/introduction. Edit severely and BAM it's beautiful. It requires lots of copy/paste and highlighting and maybe some capslock and I don't even know how I'd do it in longhand.

Some of my favorite learning experiences have been extremely intense but also supportive. I like diving in and new material rarely intimidates me. Unfamiliar communication and social structures are more likely to trip me up and it is helpful when people explain social norms to me, though I can pick them up eventually and get through until I do. Tight deadlines can be motivating as long as I don't have any other crises to deal with, but I need downtime after them.

I love detailed to-do lists. I mostly keep them on paper or in a .txt on my desktop but I'll be keeping some of the ones for this project on User:Fhocutt.