User:Mvolz/OPW proposal round 8

Improving URL citations on Wikimedia

 * Public URL: Improving URL citations on Wikimedia
 * Extension:CiteURLEngine


 * Bugzilla report: 57804
 * Announcement: Unannounced as of yet.

Name and contact information

 * Name: Marielle Volz
 * Email: marielle.volz@gmail.com
 * IRC: mvolz
 * Location: London, England U.K.
 * Typical working hours: 9-5 UTC+0 on 3 weekdays per week and 10-6 UTC+0 on Sat/Sun (tentative)

Synopsis
Incomplete and missing citations of web resources using the tag are a relatively endemic problem on Wikipedia and other Wikimedia installations. There are two major problems to be addressed:


 * The relative difficulty and tedium of including citations provides a barrier to editors properly citing works.
 * As a consequence, there are many existing citations which are currently incomplete.

I propose to address the first issue by automating the creation of a citation from a user-submitted URL on Wikimedia wikis.

Such a feature has already been planned: adding references to the transclusion dialog is part of the VisualEditor roadmap, and there are mock-ups demonstrating the process of adding references. However, there is currently no back-end to supply a wiki-formatted citation in response to a user-entered URL.

A MediaWiki extension, CiteURLEngine, could be developed to return wiki mark-up citations in response to a submitted URL. The goal is to eventually produce a fully-featured extension that could return citations both to the VisualEditor extension and to the RefToolbar in the WikiEditor extension.

A final version of this extension might:
 * 1) Take a URL
 * 2) Detect if the URL points to a resource that has any other identifier such as a DOI or ISBN
 * 3) Return the appropriate citation if such an identifier exists
 * 4) If not, return a default citation.
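
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design of CiteURLEngine: the function names are hypothetical, the template names (`cite journal`, `cite book`, `cite web`) are assumed rather than taken from this proposal, and the identifier is only looked for in the URL string itself, not in the page it points to.

```python
import re
from urllib.parse import unquote

# DOI: "10.", a 4-9 digit registrant prefix, "/", then a suffix.
DOI_RE = re.compile(r"10\.\d{4,9}/[^\s?#&]+")
# ISBN-13 only: starts with 978 or 979, hyphens between digits are optional.
ISBN_RE = re.compile(r"97[89](?:-?\d){10}")


def find_identifier(url):
    """Return ("doi", value), ("isbn", value), or None for a URL string."""
    text = unquote(url)
    doi = DOI_RE.search(text)
    if doi:
        # Strip punctuation that commonly trails a DOI inside running text.
        return ("doi", doi.group(0).rstrip(".,;"))
    isbn = ISBN_RE.search(text)
    if isbn:
        return ("isbn", isbn.group(0).replace("-", ""))
    return None


def cite(url):
    """Return wiki mark-up for url; template names are assumptions."""
    ident = find_identifier(url)
    if ident and ident[0] == "doi":
        return "{{cite journal |doi=%s |url=%s}}" % (ident[1], url)
    if ident and ident[0] == "isbn":
        return "{{cite book |isbn=%s |url=%s}}" % (ident[1], url)
    # Step 4: no identifier found, so fall back to a plain web citation.
    return "{{cite web |url=%s}}" % url
```

A real implementation would probably also need to inspect the page's metadata, since many URLs for papers and books do not contain the identifier.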


 * Possible mentors: James Forrester, Trevor Parscal

Deliverables
Deliverables are as follows:


 * 1) Get familiar with the Wikimedia code base and commit a Hello World version of the extension.
 * 2) Build a simple extension that accepts a URL and scrapes the page title to return a citation.
 * 3) Work with the VisualEditor team/Trevor Parscal to incorporate it into the transclusion dialog and make sure the two can interact appropriately.
 * 4) Improve the functionality of the extension to provide better citations with more fields populated.
 * 5) Investigate the possibility of saving citations to a database instead of scraping in real time, to improve scalability.
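
As a rough illustration of deliverable 2, the sketch below pulls the title out of a page and wraps it in a citation. It is written in Python using only the standard library; the class and function names are hypothetical, the `cite web` template name is an assumption, and the actual extension may be structured quite differently. Fetching is kept separate from parsing so the parsing step can be tested without network access.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(html):
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    return " ".join("".join(parser.parts).split())


def citation_from_html(url, html):
    """Build the citation; a pipe in the title would break the template."""
    title = extract_title(html).replace("|", "&#124;")
    if title:
        return "{{cite web |url=%s |title=%s}}" % (url, title)
    return "{{cite web |url=%s}}" % url


def citation_from_url(url, timeout=10):
    """Fetch the page in real time, then build the citation from its HTML."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return citation_from_html(url, html)
```

Because `citation_from_url` fetches the page on every request, it is the part that deliverable 5 would replace or supplement with a database lookup.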

Participation
In terms of documenting work, I will probably do most of the documentation on mediawiki itself, as well as keeping the README.md and comments in the code up-to-date.

In terms of commits, at the beginning of the project, when I'll primarily be working independently on the extension, I will probably make several local commits and then upload those to Gerrit on a semi-regular basis. However, once I am (hopefully) adding the extension as a submodule of VisualEditor and making changes to the VE submodule as well, I plan to adhere to community standards with respect to committing via Gerrit, which I hopefully demonstrated with my microtask!

In terms of asking for help, I've already found the #mediawiki-visualeditor channel, where my mentors are denizens, an extremely helpful and responsive place. One challenge is that I'm 8 hours off timezone-wise from the VisualEditor team, which means that our work days overlap by only an hour. However, I am happy to chat later in the evening outside of working hours.

About you

 * Education completed or in progress:
 * B.A. Biological Sciences Cornell University, 2008
 * M.S. Ecology and Evolutionary Biology, 2009

 * How did you hear about this program?

DevChix mailing list

 * Will you have any other time commitments, such as school work, another job, planned vacation, etc., during the duration of the program?

No, although for scheduling working hours, I do have to work around the availability of my childcare provider and my partner's conference/work schedule. I anticipate that I'll be able to schedule 40 hours of uninterrupted work a week.


 * We advise all candidates eligible to Google Summer of Code and FOSS Outreach Program for Women to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?

I'm only applying to OPW: Wikimedia and CPython.

Past experience
 * Please describe your experience with any other FOSS projects as a user and as a contributor:

I've been a registered Wikipedia user since 2005 and I've used GNU/Linux as my main OS since 2008.

My microtask is Bug 51012.


 * Please describe any relevant projects that you have worked on previously and what knowledge you gained from working on them (include links):

A few years back I wrote a web app/CMS for academics to keep track of their publications. What I learned from this is that adding papers to a database by filling out fields in a GUI is considerably worse than using the tag!

For the last two years I ran a Django/MySQL web app for a CDC-funded project called Ex-flu. I learned a lot of lessons from that, notably:
 * The two most terrifying things are sending out mass e-mails and running SQL ALTER TABLE statements on a live database.
 * Test before you commit. Never commit on a live server. Pull your commits before running management commands.

I've also done some contract work in WordPress.


 * What project(s) are you interested in (these can be in the same or different organizations)?

My project proposal comes from my interest in the following raw projects:


 * Better URL to citation conversion functionality
 * Work on RefToolbar
 * Visual Editor plugin
 * Bug 57804