User:Jarry1250/GSoC 2012 roadmap

This is the application by me, Harry Burt (User:Jarry1250) to take part in the 2012 edition of Google Summer of Code.

Identity

 * Name:
 * Harry Burt


 * Email:
 * @gmail.com


 * Project title:
 * TranslateSvg: Bringing the translation revolution to Wikimedia Commons

Contact/working info

 * Timezone:
 * London (GMT/UTC+0, very shortly BST/UTC+1)


 * Typical working hours:
 * 9am to 6pm, perhaps (see below)


 * IRC or IM networks/handle(s):
 * jarry1250 on Freenode


 * Other contact details
 * Onwiki: Jarry1250 (SUL), most likely to be found hanging out at the English Wikipedia (user page) or on Wikimedia Commons (user page). Twitter: @harryaburt.

Project summary
In what few hours I managed to find over Christmas, I threw together a quick extension called Extension:TranslateSVG. This proposal, if accepted, would provide sufficient resources to turn it into a powerful tool and a viable WMF deployment. The existing extension provides both a starting point and a proof of concept, but requires fundamental improvements before it could function on the kind of production wiki where it is "sorely needed", according to one developer with whom I have corresponded.

TranslateSvg, if completed, has the potential to revolutionise the ability of Wikimedia's increasingly diverse groups of image maintainers to work together creating and improving the same communal set of SVG (vector) images. It would achieve this by allowing for the translation of the textual elements of SVG images: not, as is common at the moment, by "forking" the image (which drastically increases the maintenance burden and discourages image improvement), but by embedding extra translations of the existing textual elements inside the image file itself. The file could then be displayed in the language of a wiki, the user's preferred interface language, or any given arbitrary language. When I originally raised this idea it received the support of several Wikimedia Commons users as well as WMF developers.
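To illustrate the embedding idea, here is a minimal Python sketch; it is not taken from the proof-of-concept code. It assumes the mechanism is the SVG `<switch>` element with `systemLanguage` attributes (the technology linked under "Any other info"). The function name `add_translation` and the `id` scheme for translated copies are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialise without an "ns0:" prefix


def add_translation(svg_source: str, text_id: str, lang: str, translated: str) -> str:
    """Embed a translation of <text id=text_id> and return the new SVG source.

    Hypothetical helper: the original element is kept as the fallback and a
    language-tagged copy is placed before it inside a <switch>.
    """
    root = ET.fromstring(svg_source)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    original = next(
        el for el in root.iter(f"{{{SVG_NS}}}text") if el.get("id") == text_id
    )
    parent = parents[original]

    if parent.tag == f"{{{SVG_NS}}}switch":
        switch = parent  # already translated once: reuse the existing <switch>
    else:
        # Wrap the original in a new <switch> at the same position.
        index = list(parent).index(original)
        parent.remove(original)
        switch = ET.Element(f"{{{SVG_NS}}}switch")
        switch.tail, original.tail = original.tail, None
        switch.append(original)
        parent.insert(index, switch)

    # Copy the original so x/y positioning and styling carry over.
    translation = copy.deepcopy(original)
    translation.set("id", f"{text_id}-{lang}")  # assumed id scheme
    translation.set("systemLanguage", lang)
    for child in list(translation):  # drop nested markup in this simple sketch
        translation.remove(child)
    translation.text = translated
    # Insert before the original: <switch> renders the first matching child,
    # so the attribute-less fallback has to stay last.
    switch.insert(list(switch).index(original), translation)
    return ET.tostring(root, encoding="unicode")
```

Calling `add_translation(svg, "t1", "de", "Hallo")` on a file containing `<text id="t1">Hello</text>` would leave a single image holding both strings, rather than two forked files.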

Required deliverables
The final extension should:
 * be able to handle the translation of 99%+ of text embedded in SVGs, taking account of (for example) meaningful <tspan>s, italics, bold, superscript, subscript and other formatting using well-documented methods;
 * provide a functioning, internationalisable and polished interface able to adjust for translations (with a "native" feel and a minimum of visual clutter):
 * in different scripts,
 * with different x/y positioning,
 * and in right-to-left languages;
 * modify file description pages, to enable visitors to view files in different languages and provide them with easy access to the translation mechanism;
 * be written to cope with "evolving" SVG files, i.e. those which go through a repeated translation-modification cycle;
 * be well documented, or, even better, be sufficiently simple that it needs little in the way of official documentation;
 * implement logical and informative permissions and error-handling;
 * and do all of the above in a resource-efficient manner, even for large SVGs; or, if this proves impossible, at least implement a reasonable upper limit with regard to performance (similar to the long-time situation with large PNG files).
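The first and last deliverables above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the extension's parser: the function name `extract_strings`, the `(kind, string)` output shape and the `MAX_BYTES` cap are all hypothetical, and the cap's value is arbitrary.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
MAX_BYTES = 1_000_000  # hypothetical upper limit; the real figure would be agreed later


def extract_strings(svg_source: str) -> list[tuple[str, str]]:
    """List the translatable strings in an SVG as (kind, string) pairs.

    kind is the local tag name ("text", "tspan", ...) for text directly
    inside an element, or "tail" for text that follows a nested element
    such as a superscript <tspan>.
    """
    if len(svg_source.encode("utf-8")) > MAX_BYTES:
        raise ValueError("SVG too large to translate")

    root = ET.fromstring(svg_source)
    found = []
    for text_el in root.iter(SVG_NS + "text"):
        # iter() yields the <text> element itself, then its descendants,
        # in document order.
        for node in text_el.iter():
            if node.text and node.text.strip():
                found.append((node.tag.replace(SVG_NS, ""), node.text.strip()))
            # The tail of <text> itself lies outside the element, so skip it.
            if node is not text_el and node.tail and node.tail.strip():
                found.append(("tail", node.tail.strip()))
    return found
```

For `<text>Area in km<tspan baseline-shift="super">2</tspan> total</text>` this yields three separate strings, which shows why formatting spans need explicit handling rather than treating each `<text>` element as one string.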

If time permits
The final extension could:
 * provide quality control via a special logging action (reverting being handled natively via MediaWiki);
 * separate a new SVG-translation user right from the usual upload user right to allow for more granular permissions;
 * ensure SVG-file-parsing system is sufficiently logical and/or modular to be easily extensible by future developers.

Project schedule
I am submitting this under an unusual timetable, but please do bear with me, as I feel it's very much an achievable one:
 * 20 April to 31 May: "Ramp up period" (approximately 12 hours a week), fixing priorities, methods and designs (both for the main translation page and the additions to the file description page). This would include interviews with translators – in order to understand their needs fully – as well as a quantitative analysis of SVG files extant on Wikimedia Commons, looking at which structures will need to be accommodated. Agree performance-friendly methods with mentor and other developers. Blog enthusiastically about the project.
 * 1 June to 23 June: **temporary lull due to unavoidable university-wide examinations** (approximately 6 hours a week). It's not much time, but it's enough time to work on (and probably complete) the changes to file description pages. Additionally, there could well be some leftover time to start probing at the other interface design elements. I would probably blog far less during this period.
 * 23 June to 22 July: main bulk of development work, 45 hours per week, semi-regular blog posts and tweets.
 * 23 June to 8 July: finish new interface, open up for testing and comments, regular blog posts.
 * 8 July to 15 July: do most of work with regard to the extension handling the full plethora of SVG structures, iterative fixes to interface, more communications.
 * 15 July to 22 July: finish work on handling SVG files, including permissions and error handling.
 * 22 July to 12 August: Heavy-duty code review, documentation, and several rounds of testing with real-world translators.
 * 12 August to 20 August: Final polish, integration-focussed elements. Pencils down on official project.
 * 20 August to end of October (provisional): Continue to work on the project part-time, prepare for deployment if required, publish retrospective blog posts and plenty of time for "wind down" engagement.

About you
I'm an 18-year-old (soon to be 19-year-old) student from Colchester in Essex, England. I'm currently studying combined Philosophy, Politics and Economics (PPE) at the University of Oxford. That and my position as the top A-level examinations student in England for my year group are both characteristic of my eternal urge to take on new and exciting challenges, especially where those can make a real difference to the world. PPE isn't a degree that lends itself naturally to a sideline in computer programming, but it's certainly a challenging one that has over the past six months enabled me to perfect my time management and communications skills as well as putting to the test my problem solving and critical thinking capabilities. In my spare time I've continued to program actively, building up contributions to MediaWiki and other open-source projects (details below).

I have chosen internationalisation as the topic of my project because of its ability to engage and empower so many potential contributors so easily. Thus it seems to be an ideal area in which to gain traction even during the limited time period available for full-time development work as part of Google Summer of Code. I've felt for some time as though a great deal could be achieved in this area if only I could set aside a block of time to sit down and work on it; that's why I'm excited to be applying for a Google Summer of Code placement.

Participation
Friends rarely complain that I communicate too little and talking is a habit I'm unlikely to break during Google Summer of Code. I've got my own blog (already on Planet Wikimedia), which will be the central focus for progress reports, though I'm also an occasional tweeter. Source code will be pushed regularly (probably daily) to the extension's repository on Git. I'll be lurking in IRC throughout and I'm also very responsive to emails when awake, allowing both updates and support to flow forwards and backwards whenever necessary. Testing itself will be documented on extension subpages on MediaWiki.org, allowing for editors to engage with the development process in the mode they are most likely to be happy with. I am reasonably well-known at present, and am typically collegial and drama-intolerant, even to a fault – though the benefit of this is that I feel that I could draw on a wide and varied support network if I needed to without fear of partisanship.

Past open source experience
My contributions to MediaWiki itself include a number of patches (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, possibly others). In the past, I've also submitted patches to Mozilla, as well as putting my many other bits of code (including a successful website, a sizeable collection of Toolserver pages, a sizeable proportion of a bot framework, bot code, and patches for AutoWikiBrowser) under various libre and semi-libre licenses. I'm familiar with Git, am already set up with Gerrit, and as the regular "Technology report" writer for the English Wikipedia-based Signpost electronic newspaper, am familiar with virtually all aspects of Wikimedia technological jargon. I subscribe to the Open Knowledge Foundation's mailing list, and have participated in several themed hackdays, the product of which is always openly licensed. Finally, the Brighton Hackathon enabled me to put faces to several of the Wikimedia names with which I was so familiar.

Any other info

 * https://commons.wikimedia.org/wiki/File:TranslateSvg.ogv - Proof of concept video (also linked above)
 * https://github.com/Jarry1250/TranslateSvg - Proof of concept code
 * https://commons.wikimedia.org/w/index.php?oldid=65233694#SVG_translation - Previous engagement with media creators over design requirements, indicative of their support for the project
 * http://ultimategerardm.blogspot.co.uk/2012/01/translations-in-svg.html - Blog post by Gerard Meijssen referencing proof of concept code
 * https://developer.mozilla.org/en/SVG/Element/switch - Useful primer describing the technology that makes this whole project possible
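
For readers unfamiliar with the `<switch>` element linked above, the following Python sketch shows how a renderer chooses which translation to display. It follows my reading of the SVG 1.1 `systemLanguage` rule (exact match, or the user's language is a prefix of a listed language followed by "-"). The function names are hypothetical, and real renderers may differ in detail.

```python
import xml.etree.ElementTree as ET


def language_matches(system_language: str, user_lang: str) -> bool:
    """True if user_lang equals a listed language, or is a prefix of one
    that is immediately followed by '-' (e.g. "en" matches "en-GB")."""
    user = user_lang.strip().lower()
    for listed in system_language.split(","):
        listed = listed.strip().lower()
        if listed == user or listed.startswith(user + "-"):
            return True
    return False


def pick_rendered_child(switch: ET.Element, user_lang: str):
    """Return the first child of <switch> whose language test passes.

    A child without a systemLanguage attribute always passes, which is
    why the untranslated fallback is conventionally placed last.
    """
    for child in switch:
        attr = child.get("systemLanguage")
        if attr is None or language_matches(attr, user_lang):
            return child
    return None
```

Displaying a file "in any given arbitrary language" then amounts to evaluating this choice with the requested language in place of the viewer's own preference.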