User:Mvolz/OPW proposal round 8

Improving URL citations on Wikimedia

 * Public URL: Improving URL citations on Wikimedia
 * Extension:CiteURLEngine


 * Bugzilla report: 57804
 * Announcement: Unannounced as of yet.

Name and contact information

 * Name: Marielle Volz
 * Email: marielle.volz@gmail.com
 * IRC: mvolz
 * Location: London, England U.K.
 * Typical working hours: 9-5 UTC+0 on 3 weekdays per week and 10-6 UTC+0 on Sat/Sun (tentative)

Synopsis
Incomplete and missing citations of web resources using the tag are an endemic problem on Wikipedia and other Wikimedia installations. There are two major problems to be addressed:


 * The relative difficulty and tedium of including citations provides a barrier to editors properly citing works.
 * As a consequence, there are many existing citations which are currently incomplete.

I propose to address the first issue and improve the process of including citations by automatically generating a citation from a user-submitted URL on Wikimedia wikis.

Such a feature has already been planned: adding references to the transclusion dialog is part of the VisualEditor roadmap, and there are mock-ups demonstrating the process of adding references. However, there is no back-end yet to supply a wiki-formatted citation in response to a user-entered URL.

A MediaWiki extension, CiteURLEngine, could be developed to return wiki markup citations in response to a submitted URL. The goal is to eventually produce a fully featured extension that could return citations both to the VisualEditor extension and to the RefToolbar in the WikiEditor extension.

A final version of this extension might:
 * 1) Take a URL
 * 2) Detect if the URL points to a resource that has any other identifier such as a DOI or ISBN
 * 3) Return the appropriate citation if such an identifier exists
 * 4) If not, return a default citation.
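The four steps above can be sketched roughly as follows. This is only an illustration in Python, not the proposed extension itself: the function names `find_doi` and `cite_url` are hypothetical, the DOI pattern is a simplified heuristic, and the `{{cite journal}}` and `{{cite web}}` templates are assumed here as plausible output formats. ISBN detection and escaping of characters such as `|` are left out.

```python
import re

# Simplified DOI heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix.
# Real DOIs can contain characters this pattern stops at; this is an assumption.
DOI_RE = re.compile(r"10\.\d{4,9}/[^\s?#&]+")


def find_doi(url):
    """Step 2: return a DOI embedded in the URL string, or None if none is found."""
    match = DOI_RE.search(url)
    return match.group(0) if match else None


def cite_url(url):
    """Steps 1-4: take a URL and return a wiki markup citation for it."""
    doi = find_doi(url)
    if doi is not None:
        # Step 3: an identifier exists, so return the identifier-based citation.
        return "{{cite journal |doi=%s |url=%s}}" % (doi, url)
    # Step 4: no identifier found, so fall back to a plain web citation.
    return "{{cite web |url=%s}}" % url
```

For example, `cite_url("http://example.com/page")` falls through to the default web citation, while a URL containing a DOI such as `https://doi.org/10.1000/xyz123` takes the identifier branch.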


 * Possible mentors: James Forrester, Trevor Parscal

Deliverables
A rough timeline of deliverables is as follows:


 * Get familiar with the Wikimedia code base and commit a Hello World version of the extension.
 * Simple extension that accepts a URL and scrapes the title to return a citation.
 * Work with the VisualEditor team/Trevor Parscal to incorporate it into the transclusion dialog and make sure they can interact appropriately.
 * Potentially do the same with RefToolbar (there is no mentor for this one, so it may have to be tabled).
 * Improve functionality of extension to provide better citations with more fields populated.
 * Investigate the possibility of saving citations to a database, instead of scraping in real time, to improve scalability.
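To make the second and last deliverables more concrete, here is a minimal Python sketch of title scraping with a simple cache. It is an assumption-laden illustration, not the planned implementation: the names `TitleParser`, `scrape_title`, `cite_web` and `cached_citation` are hypothetical, the `{{cite web}}` output format is assumed, an in-memory dictionary stands in for a real database, and the page HTML is passed in by the caller so that no particular fetching method is implied.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def scrape_title(html):
    """Return the page title with whitespace collapsed, or None if absent."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.parts).split())
    return title or None


def cite_web(url, html):
    """Build a web citation, adding the title field only when one was found."""
    title = scrape_title(html)
    if title is None:
        return "{{cite web |url=%s}}" % url
    return "{{cite web |url=%s |title=%s}}" % (url, title)


# In-memory stand-in for a citation database (assumption for illustration).
_cache = {}


def cached_citation(url, fetch_html):
    """Return a stored citation if present; otherwise scrape once and store it."""
    if url not in _cache:
        _cache[url] = cite_web(url, fetch_html(url))
    return _cache[url]
```

Here `fetch_html` is any caller-supplied function that returns the HTML for a URL; with the cache in place it is called at most once per URL. A real version would also need to escape characters such as `|` in scraped titles.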

Participation
In terms of documenting work, I will probably document much of the overarching structure on MediaWiki itself, as well as in comments in the code. When coding, I often write an empty function and describe it inline before writing the function itself; once the function is written, I make the description more concise.

In terms of commits, I will probably make several local commits and then upload them to Gerrit on a semi-regular basis (for instance, at the end of a work day). My proposal is for an independent extension that will hopefully interact with 2-3 other extensions at some point. I tend to make lots of small commits (equivalent to reflexively hitting the "save" button), but I know that can clutter a shared commit tree, so I'll attempt to keep that to a minimum when intersecting with these other extensions. In terms of asking for help, I've already found the #mediawiki-visualeditor channel, where my mentors are denizens, an extremely helpful and responsive place. One challenge is that I'm 8 hours off timezone-wise from the VisualEditor team, which means that our work days overlap by only an hour. However, I am happy to chat later in the evening outside of working hours.

About you

 * Education completed or in progress:
 * B.A. Biological Sciences Cornell University, 2008
 * M.S. Ecology and Evolutionary Biology, 2009

 * How did you hear about this program?

DevChix mailing list

 * Will you have any other time commitments, such as school work, another job, planned vacation, etc., during the duration of the program?

No, although for scheduling working hours, I do have to work around the availability of my childcare provider and my partner's conference/work schedule. I anticipate that I'll be able to schedule 40 hours of uninterrupted work a week.


 * We advise all candidates eligible to Google Summer of Code and FOSS Outreach Program for Women to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?

I'm only applying to OPW: Wikimedia and CPython.

Past experience
 * Please describe your experience with any other FOSS projects as a user and as a contributor:

I've been a registered Wikipedia user since 2005 and I've used GNU/Linux as my main OS since 2008.

My microtask is Bug 51012.


 * Please describe any relevant projects that you have worked on previously and what knowledge you gained from working on them (include links):

A few years back I wrote a web app/CMS for academics to keep track of their publications. What I learned from this is that adding papers to a database by filling out fields in a GUI is considerably worse than using the tag!

For the last two years I ran a Django/MySQL web app for a CDC-funded project called Ex-flu. I learned a lot of lessons from that, notably:
 * The two most terrifying things are sending out mass e-mails and running SQL ALTER TABLE statements on a live database.
 * Test before you commit. Never commit on a live server. Pull your commits before running management commands.

I've also done some contract work in WordPress.


 * What project(s) are you interested in (these can be in the same or different organizations)?

My project proposal comes from my interest in the following raw projects:


 * Better URL to citation conversion functionality
 * Work on RefToolbar
 * Visual Editor plugin
 * Bug 57804