Wikimedia Engineering/Report/2011/December

Major news in December includes:
 * The deployment of WebFonts to select Indic wikis;
 * The first developer prototype of the Visual editor;
 * A new version of Article Feedback being tested on the English Wikipedia;
 * Progress on the Swift media storage project.


Recent events

 * Judging for the October 2011 Coding Challenge continued, and winners will be announced in January.

Upcoming events

 * San Francisco hackathon (21–22 January 2012, San Francisco, California, USA) — Erik Möller and Sumana Harihareswara continued to plan and publicize this outreach-focused developers week-end. Heather Walls developed a more attractive homepage for the event. Sumana began arranging for tutorials and activities for the event, focusing on mobile, the web-accessible API and our framework for JavaScript feature development. Registration opened and more than 70 participants registered.


 * Pune hackathon (10–12 February 2012, Pune, India) — Preparation began and registration opened for an outreach-focused developers week-end in Pune, India, led by Alolita Sharma. Approximately 70 participants are expected, focusing on the gadgets framework, mobile Wikimedia access, and internationalization.


 * GLAMcamp DC (10–12 February 2012, Washington, D.C., USA) — Ryan Kaldari and Asaf Bartov plan to attend the technical track of this GLAM conference. Engineers will work on mass upload and analytics functionality.

Job openings
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.


 * Developers and engineers:
 * Interaction Designer
 * Systems Engineer (Data Analytics)
 * Software Developer (Back-end, Data Analytics)
 * Software Developer (Rich Text Editing, Features)
 * Software Developer (Front-end)
 * QA Lead
 * Software Developer (Mobile)
 * Software Security Engineer


 * Management & Product:
 * Director of Features Engineering
 * Product Manager


 * Requests for proposals:
 * Executive Dashboard - Analytics — Help us improve and centralize the dashboard summarizing the most important data about the Wikimedia movement to understand overall community health.
 * XML Dumps — Help us improve the infrastructure used to build XML dumps of Wikipedia content, for backups and reuse by third parties.
 * Mobile UX — Help us redesign our mobile platform and apps as more and more visitors access Wikipedia and its sister sites via mobile devices.

Short news

 * Yuvaraj Pandian and Max Semenik joined the mobile team as contract developers.
 * Sara Smollett joined the operations team as a part-time contractor.
 * Diederik van Liere, formerly with the Community Department, is now helping the engineering department as a contractor for analytics work.

Site infrastructure

 * Data Centers — The team deployed a new MediaWiki profiling system based on graphite, to track performance across the application stack, and to provide statistics/graphing as a service for MediaWiki within the WMF production environment. Some database servers were moved to newer hardware (including OTRS), and those in the Ashburn data center were upgraded to a new build of mysql-at-facebook. Mark Bergsma refactored our configuration tool (Puppet) to address scalability and performance issues.


 * Media Storage — As part of the preparation for the migration of our media service to Swift, a distributed storage back-end, we need to keep the current system afloat a bit longer. We reclaimed some space by purging thumbnails not newly generated and not in use on any of our projects. We also performed Swift thumbnail integration and stress testing. Read performance is about 10x what we need on the performance test cluster so we're good on that front. Write performance is only 2x what we need, but sufficient to move forward. Tests and research indicate that performance drops once a container exceeds a few million objects; the easiest path forward is to shard the Commons container using the existing hashed characters in the URL, splitting the container into 256 containers.


 * HTTPS — HTTPS support was added for mobile, for Wikipedia. After an initial testing period, we'll enable this for further mobile sites. A number of other miscellaneous services also had HTTPS set up or fixed.
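
The container sharding mentioned under Media Storage can be illustrated with a minimal sketch. It assumes the hashed characters in media URLs are the leading hexadecimal digits of the MD5 of the file name (with spaces replaced by underscores), so that two digits yield 16 × 16 = 256 shards. The function name and the "base.xx" container naming are hypothetical and are not taken from the actual Swift or MediaWiki configuration.

```python
import hashlib


def shard_container(base: str, filename: str) -> str:
    """Pick one of 256 containers from the first two MD5 hex digits.

    Assumption: the naming scheme "<base>.<two hex digits>" is
    illustrative only; the real container names may differ.
    """
    # Media file names use underscores in place of spaces.
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Two hex digits give 16 * 16 = 256 possible shards.
    return f"{base}.{digest[:2]}"


def all_shards(base: str) -> list:
    """List every possible shard name for a base container."""
    return [f"{base}.{i:02x}" for i in range(256)]
```

Because the shard is derived from the file name alone, the same file always maps to the same container, and no lookup table is needed to locate it.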

Testing environment

 * Wikimedia Labs — A server admin log was created for every project, as well as a [//labsconsole.wikimedia.org/wiki/Server_Admin_Log combined log]. OpenStackManager 1.3 and LdapAuthentication 2.0a were deployed to Labs. Live migration of instances has been enabled for the OpenStack Nova infrastructure, allowing updates and upgrade of hardware without bringing instances down. A gluster storage cluster has been ordered for use as volume storage. A number of projects were added or moved to Labs, including adminbot, nagios, Cluebot, testswarm and the reportcard service. There are now 33 projects, 52 instances, and 74 users.

Backups and data archives

 * Data Dumps — The end of the year closed out with another full dump of the English language Wikipedia on schedule. More work was done on code to allow restart of the history phase of a dump from a specified point without a long catchup delay. An experimental service was tested this month: a newly formatted file of article content and an accompanying index, more convenient for data analysts and for use with offline readers.

Editing tools

 * Visual editor — The team deployed a developer prototype of the visual editor sandbox to mediawiki.org for public feedback and testing. Trevor Parscal fixed bugs and refactored code. Inez Korczynski worked on the toolbar (text styles), the undo/redo stack, and lists (creating, deleting, and changing indentation). Gabriel Wicke worked on the parser test runner and the parser pipeline, including [//lists.wikimedia.org/pipermail/wikitext-l/2011-December/000494.html the tokenizer and its grammar] and template expansions. Neil Kandalgaonkar worked on the undo/redo feature and did a lot of refactoring.
 * Internationalization and localization tools — The WebFonts extension was deployed to select Indic languages and projects, making it possible to read content in languages using non-Latin fonts without installing fonts manually. The deployment [//lists.wikimedia.org/pipermail/wikitech-l/2011-December/056966.html uncovered bugs and issues] that were addressed by the team, like cross-site font loading. The team also improved the Narayam and Translate extensions, and the latter was enabled on mediawiki.org to facilitate the translation of software documentation (like the Help:Extension:WebFonts page).

Participation and editor retention

 * Article feedback — Aaron Halfaker, Oliver Keyes and Dario Taraborelli have finished gathering valuable data from the community about the usefulness of comments coming in from each of the three forms launched in December. A survey to get comments from readers about the effectiveness and attractiveness of each design was also introduced, and the team has been compiling the various sets of data to produce a report on the pros and cons of each form. Fabrice Florin is leading development on the next round of features, including a new feedback page, to be implemented by OmniTI, our development partner.
 * Feedback Dashboard — Rob Moen added support for wikitext in responses, preview of changes, handling of blocked users, and fixed various bugs. Benny Situ worked on feedback response, notifications and HTML e-mails. Dario Taraborelli developed two sets of charts to visualize data from the [//toolserver.org/~dartar/fd/ FeedbackDashboard] and [//toolserver.org/~dartar/fd_notify/ response notification].

Multimedia Tools

 * UploadWizard — Users can now choose a default license for all uploads in their user preferences under "Upload Wizard" (bug 24702). All license choices now also link to the legal code of a license. The built-in feedback form more prominently links to Bugzilla.

MediaWiki infrastructure

 * ResourceLoader — Roan Kattouw updated and created tests for PHPUnit. Timo Tijhof fixed layout bugs in the Gadget manager, did some code review, and tested the migration of gadgets on a prototype.

Feature support

 * Wikipedia Education Program — Jeroen De Dauw started to work on a MediaWiki extension to support the Wikipedia Education Program; the implementation of course, term and institution management has been completed, and a test wiki is available.

Mobile

 * Mobile Research — Mani Pande and Parul Vora consolidated all the research findings from Brazil, India, and the USA into one report. It's currently being converted to PDF and wikitext to facilitate its publication.


 * MobileFrontend —


 * Android Wikipedia App — Several release candidates were published over the month and we're nearing completion of the first version of the app, thanks to developers Yuvaraj Pandian and Brion Vibber. Nightly builds are available for testing.


 * WikipediaZero — We began work on the infrastructure for zero-rated Wikipedia access. Next month, we'll start testing with one of our partners to work out the kinks of giving users free data access to Wikipedia.


 * GPS Storage/Retrieval — Max Semenik joined the mobile team and began prototyping an API to store and retrieve GPS coordinates on our wikis. This will be a critical component of the mobile projects; it will replace our existing use of GeoNames.org and can also supplement GeoHack.


 * Featured Article RSS — Max Semenik built the first version of an extension to expose featured articles, In the news, and other main page content so that our partners can better re-use our data.

Fundraising support

 * 2011 Fundraiser — The DonationInterface extension underwent enhancements to tighten up security. Support was also added for monthly recurring donations for credit cards through our new payment processor, GlobalCollect, and we are working on automating the processing of recurring payments to our instance of CiviCRM. We built custom mass-mailing scripts to e-mail about 1 million past donors to encourage them to donate again. The ContributionReporting extension was enhanced by storing aggregated data in its own tables and updating them periodically, to eliminate the cache stampede problem uncovered last month. We added support for automatic notification of non-credit card payments from GlobalCollect, which allows us to automatically record donor and donation information in our donor database.
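
The approach of keeping aggregated data in its own periodically refreshed table can be sketched as follows. This is only an illustration of the general technique: the table and column names are hypothetical, SQLite stands in for the production database, and the actual ContributionReporting schema is not shown in this report. The idea is that page views read a small precomputed table, so an expired cache entry never causes many requests to recompute the same expensive aggregate at once.

```python
import sqlite3


def refresh_daily_totals(conn):
    """Rebuild the summary table from the raw rows in one transaction.

    Intended to be run periodically (for example from a scheduled job),
    not on every page view.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM daily_totals")
        conn.execute(
            "INSERT INTO daily_totals (day, total) "
            "SELECT day, SUM(amount) FROM donations GROUP BY day"
        )


def read_daily_totals(conn):
    """Cheap read path: no aggregation over the raw table."""
    query = "SELECT day, total FROM daily_totals ORDER BY day"
    return conn.execute(query).fetchall()


def demo():
    # Hypothetical schema, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE donations (day TEXT, amount REAL)")
    conn.execute(
        "CREATE TABLE daily_totals (day TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT INTO donations VALUES (?, ?)",
        [("2011-12-01", 10.0), ("2011-12-01", 5.0), ("2011-12-02", 20.0)],
    )
    refresh_daily_totals(conn)
    return read_daily_totals(conn)
```

The trade-off is that readers may see totals that are as old as the refresh interval, in exchange for a read cost that no longer depends on the size of the raw table.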

Offline

 * Kiwix UX initiative — Work continued on the 0.9 release of Kiwix. Lead developer Emmanuel Engelhart released Kiwix 0.9 beta5 for community testing, fixing lots of reported bugs and soliciting testers to get involved.

MediaWiki Core

 * The "MediaWiki Core" team was featured on the Wikimedia Tech blog this month.


 * MediaWiki 1.19 — Rob Lanphier continued to coordinate the efforts from Wikimedia engineers to review commits to the MediaWiki codebase, as part of the Wikimedia engineering 20% policy. Mark Hershberger started to send automated reminders to developers whose revisions are marked as "fixme" in Code review. Progress on unreviewed commits is tracked through the 1.19 revision report and the Code review statistics. As of December 31st, about 600 revisions remain to be reviewed in trunk. Brion Vibber proposed a feature freeze in trunk, to catch up on code review, instead of branching the code for release.
 * Continuous integration — The TestSwarm package was Debianized and its configuration almost entirely entered into Puppet. Antoine Musso and Daniel Zahn deployed TestSwarm to the production continuous integration portal.
 * Git conversion — The rules to convert  have been finished and a [//gerrit.wikimedia.org/r/gitweb?p=test/mediawiki/core.git;a=summary test repository] is now available. Chad Horohoe is now working on conversion rules for extensions, and on user permissions.
 * VipsScaler — VIPS hasn't proven as effective at saving memory on large PNG files as expected, but it has shown improvements for large TIFFs. Deployment to Wikimedia sites is deferred until MediaWiki 1.19 is deployed. Bryan Tong Minh will reach out to upstream developers to include fixes for PNG and JPG files.
 * Multimedia — Ian Baker and Neil Kandalgaonkar completed the review of all the code, including the transcoding part. They started to draft a test plan and to plan a deployment to Wikimedia Labs. Aaron Schulz merged the FileBackend branch into, and Tim Starling started to review the code. Aaron, Ben Hartshorne and Ariel Glenn discussed sharding the Swift containers at the MediaWiki level, and Aaron started to implement it.
 * HipHop deployment — After Facebook [//www.facebook.com/notes/facebook-engineering/the-hiphop-virtual-machine/10150415177928920 announced] that they were developing virtual machines for HipHop, Tim Starling [//lists.wikimedia.org/pipermail/wikitech-l/2011-December/056986.html indicated] that Wikimedia would put their current efforts on HipHop on hold, until the virtual machines can be evaluated. Other performance efforts like Wikitext scripting will take priority instead.

Wikimedia analytics

 * Wikimedia Report Card 2.0 — The reportcard 2.0 was moved to the Labs environment, and its source code centralized. The back-end and front-end code of stats.grok.se was rewritten and is being deployed to Labs as well. A renewed effort is expected as new employees come on board in January.

Technical Liaison; Developer Relations

 * Bug management — Mark Hershberger followed up on MediaWiki 1.18 bugs, and wrote a FAQ listing issues and offering solutions until 1.18.1 is released. Mark also continued to go through "highest priority" bugs, dealt with bugzilla vandalism, reviewed patches submitted in bugzilla, and held bug triages on MediaWiki 1.18 and Fundraising engineering.
 * Summer of Code 2011 — Neil Kandalgaonkar met with the Internet Archive about progress on ArchiveLinks; they are awaiting fixes and improvements in ArchiveLinks before it can go live. Yuvaraj Pandian fixed several issues with his offline article selection project, SelectionSifter, towards getting the code to a deployable state. Roan Kattouw made plans to review and merge Salvatore Ingala's gadgets project during his ResourceLoader 2 work in February 2012.
 * Engineering project documentation — Guillaume Paumier performed perennial maintenance on project pages and the Roadmap (updating, cleaning up and organizing the content as needed), and wrote this report.
 * Volunteer coordination and outreach — Sumana Harihareswara continued to follow up on contacts and recruit new contributors to the Wikimedia tech community (especially for commit and patch review), and mentor new contributors. Eleven developers got commit access, all from the non-staff MediaWiki community. Sumana also prepared for the January San Francisco hackathon and the February Pune hackathon, and recruited participants. Partly in preparation for these coding events, Sumana and Guillaume Paumier continued to consolidate training documentation to facilitate the onboarding of new developers.
 * MediaWiki architecture document — This project was mostly on hold in December, while the book's technical reviewers went through the content. They provided a second round of feedback and minor recommendations, which will be addressed in January.
 * Wikimedia blog maintenance — Guillaume Paumier continued to [//github.com/gpaumier/WP-Victor/commits/master fix bugs] in the blog's theme, and started to develop a WordPress plugin to bring functionality specific to the Wikimedia blog independently of the theme (for example, expanding the group of users who can upload files).

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.