Wikimedia Engineering/Report/2012/February

 Engineering metrics in February:
 * 67 unique committers contributed code to MediaWiki.
 * About 530 code commits were reviewed.
 * The total number of unreviewed commits went from 44 to 31.
 * About 35 shell requests were processed.
 * 13 developers got commit access, six of whom are volunteers.
 * Wikimedia Labs now hosts 59 projects, 97 instances and 126 users.

Major news in February includes:
 * The difficult deployment of our Swift infrastructure to serve image thumbnails;
 * Continued success for our Wikipedia Android app;
 * The deployment of MediaWiki 1.19 to all Wikimedia sites except for most Wikipedia languages;
 * Continued preparation for our [//blog.wikimedia.org/2012/02/15/wikimedia-engineering-moving-from-subversion-to-git/ move from Subversion to git].


Recent events

 * Pune hackathon (10–12 February 2012, Pune, India) — A few dozen participants came to this three-day developer outreach event cohosted with GNUnify. Participants focused on language support (internationalization and localization) and mobile applications. Some new translations were created and the Wikimedia Mobile team received improvements to the Wikipedia Android app.


 * GLAMcamp DC (10–12 February 2012, Washington, D.C., USA) — A Wikipedia citation tool was developed as a web browser extension that allows users to obtain a citation from any online MARC library catalog, and in that specific language version of Wikipedia. A mass upload script was also written for importing the images and metadata of the Walters Art Museum. The results of the test run can be seen on Commons. The full collection (~20,000 images) will be uploaded in March (see documentation).

Upcoming events

 * Chennai Hackathon March 2012 (17 March 2012, Chennai, India) — Yuvaraj Pandian and volunteer Srikanthlogic are hosting this one-day hackathon for experienced developers. Volunteers can work with the MediaWiki API and other Wikimedia technologies and show off their accomplishments.


 * Berlin hackathon (1–3 June 2012, Berlin, Germany) — Wikimedia Germany is hosting this three-day "inreach" hackathon for the Wikimedia technical community, including MediaWiki developers, Toolserver users, bot writers and maintainers, Gadget creators, and other Wikimedia technologists. The event will mostly involve focused sprints, bugbashing, and other coding, with a few focused tutorials and trainings on Git, Lua, Gadgets changes, or other topics of interest. Wikimedia Germany will also use this event to consult on and discuss the Wikidata structured data project. Wikimedia developers will soon get more information on travel sponsorships.


 * Wikimania hackathon (10–11 July 2012, Washington, D.C., USA) — Katie Filbert, Gregory Varnum, and Sumana Harihareswara have begun planning the hybrid inreach/outreach hackathon occurring just prior to Wikimania. Experienced Wikimedia technologists will collaborate, while interested new developers will be able to learn introductory MediaWiki development. The organizers are deciding on themes and focus topics for the event, possibly including accessibility.

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.


 * Lucene Search Operations Engineer (RFP)
 * Mobile Quality assurance (RFP)
 * Senior Software Frontend Engineer
 * Software Developer Backend
 * Software Developer Frontend
 * Software Developer Mobile
 * Software Security Engineer
 * Technical Product Analyst
 * Interaction Designer

New hires

 * David Schoonover joined the Platform engineering team as Systems Engineer for Data Analytics (announcement).
 * Jon Robson joined the Mobile engineering team as Software Developer for Mobile (announcement).
 * Terry Chay joined the Wikimedia Foundation as Director of Features Engineering (announcement).
 * Christian Aistleitner joined the Operations team as a contractor working on the XML dump infrastructure (announcement).

Site infrastructure

 * Ashburn data center — Mark Bergsma and Peter Youngmeister completed the setup and deployment of a new Squid-based caching infrastructure for text.wikimedia.org and api.wikimedia.org. Mark also added capacity and redundancy for bits.wikimedia.org. Leslie Carr upgraded and migrated ganglia.wikimedia.org out of the Tampa data center.


 * Tampa data center — The team completed a work plan on the search infrastructure, which includes short-, medium- and longer-term fixes. Short-term fixes (mainly consisting of configuration tweaking and moving data around) were implemented and brought back some stability. Medium-term fixes involve puppetizing the current configuration, upgrading some of the components and building new infrastructure in the Ashburn data center. We also supported the deployment of MediaWiki 1.19, and the associated database schema changes. Finally, new database servers were added to the core clusters, to address capacity and performance requirements, and to retire some of the older servers.


 * Media Storage — February saw Swift deployed to production to serve thumbnail requests. A few bugs were fixed, but one was serious enough to decide to revert the deployment and fall back to the legacy thumbnail infrastructure. Once the issue is fixed and Swift serves thumbnails again, the next steps will involve documentation and maintenance procedures, creating a mirror cluster in Ashburn, setting up Swift in Wikimedia Labs, and handling original media (not just thumbnails) with Swift.

Testing environment

 * Wikimedia Labs — The gluster project storage has been racked, installed, and has had   installed and peered in a cluster. Work is ongoing to automount the storage to instances. We switched the scheduler used for compute nodes to choose the compute host that has the fewest instances, rather than picking a random host (Simple scheduler). Ryan Lane gave a talk at FOSDEM on Labs entitled Infrastructure as an open-source project, with a good turnout of about 500 people. Sara Smollett replaced the   LDAP library with , to fix a bug when using TLS/SSL. Andrew Bogott has spent a long time working on adding gluster control to OpenStack, but has hit some technical issues; he's now working on fixing OpenStack's Unicode support.
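The scheduler change described above can be illustrated with a minimal sketch. This is not OpenStack's scheduler code. The function names, the host names and the dictionary of instance counts are hypothetical, and breaking ties by host name is an assumption made only to keep the example deterministic.

```python
import random
from typing import Dict, List, Optional


def pick_random_host(instances_per_host: Dict[str, int],
                     rng: Optional[random.Random] = None) -> str:
    """Earlier behaviour as described in the report: pick any host at random."""
    rng = rng or random.Random()
    return rng.choice(sorted(instances_per_host))


def pick_least_loaded_host(instances_per_host: Dict[str, int]) -> str:
    """New behaviour: pick the host currently running the fewest instances.

    Ties are broken by host name (an assumption for this sketch).
    """
    if not instances_per_host:
        raise ValueError("no compute hosts available")
    return min(instances_per_host,
               key=lambda host: (instances_per_host[host], host))


def place_instances(instances_per_host: Dict[str, int], count: int) -> List[str]:
    """Place `count` new instances one by one, updating the load after each."""
    counts = dict(instances_per_host)  # work on a copy, leave the input untouched
    placements = []
    for _ in range(count):
        host = pick_least_loaded_host(counts)
        counts[host] += 1
        placements.append(host)
    return placements
```

With hypothetical hosts `{"virt1": 5, "virt2": 2, "virt3": 4}`, the least-loaded policy selects `virt2`, whereas the random policy could select any of the three regardless of load.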

Backups and data archives

 * Data Dumps — We now have a copy of all dumps on a secondary host in another data center. We've been working with two organizations on full mirrors of the dumps, sorting out performance issues before they can go live. Christian Aistleitner has started to work on a test framework for the dumps. We've made contact with the Internet Archive, and we're working on scripts using the S3 API to push our historical dump archives to their servers. We're also checking that dumps are generated correctly after the deployment of MediaWiki 1.19.

Other news

 * Domain names — The Wikimedia Foundation has started to move its domain names from GoDaddy to MarkMonitor (announcement).
 * Squid issue — An issue with Swift thumbnails led to an accidental restart of all Squid servers, which took longer than expected and caused site issues.
 * DDoS attack — Domas Mituzas noticed a distributed denial of service attack on February 27th. It involved flooding our Squid cache servers by POSTing 1 MB files to the root directory. Mark Bergsma blocked the requests. The incident lasted for about 10 minutes and some Wikipedia users experienced slow responses or timeouts.

Editing tools

 * Visual editor — Trevor Parscal did research on cursor interaction and selection rendering for RTL (right-to-left) and support for line breaks in PRE elements. Gabriel Wicke improved template expansion and parser function support, investigated Microdata and RDFa for WikiText-in-HTML-DOM embedding and added rough support for images and other files. Rob Moen committed a working Editable Surface IME prototype (bidirectional text not fully supported). Audrey Tang joined the team and worked on the sanitizer and the testing process.
 * Internationalization and localization tools — The team is currently in the middle of a Translate extension sprint. They added new translation admin features, and are trying to get translation memory (TMX) ready for deployment to Wikimedia sites. Santhosh Thottingal used the  library to compress some WebFonts even more. The team also deployed WebFonts on the English Wikisource. The team will be moved to the Experimentation & Internationalization team going forward.

Participation and editor engagement

 * Article feedback — Fabrice Florin is working with our development partner OmniTI on version 5 of this tool, with the help of community liaison Oliver Keyes and analyst Dario Taraborelli. This month, the team created a new feedback page, with special features to be tested with oversighters and rollbackers in early March. Roan Kattouw has completed code review for compatibility with 1.19 and will deploy a new release on March 8, as well as a new working test environment in the coming weeks. Final reports for phase 1 of this project will be published in early March.
 * Article Creation Workflow — Copy updates have been finalized. All outstanding items have been resolved, including testing current code against MediaWiki 1.19. The screenshots of the landing system are a little larger than the finalized version: there is no embedded login/registration form, nor are there tooltips for mouse-over. The wizard and several of its options have been prototyped on Wikimedia Labs.

MediaWiki infrastructure

 * ResourceLoader —

Feature support

 * Wikipedia Education Program — Jeroen De Dauw implemented a lot of new features, including ambassador profiles, personalized course listing for students, instructors and ambassadors, and article listing for students.

Mobile

 * Wikipedia Mobile App — In February, the [//market.android.com/details?id=org.wikipedia Wikipedia Android app] crossed 1.8 million device installs. This is incredible growth as the app has been in the Android Market for just under two months. Yuvaraj Pandian announced a new beta version of both the [//lists.wikimedia.org/pipermail/mobile-l/2012-February/005379.html Android app] and the newly re-written iOS app. The re-written iOS version builds on our PhoneGap code base, and will allow us to deprecate our old Objective-C code. New features include: OSM integration, quick search integration, URL intents, search enhancements, bug fixes, better developer attribution, and more.


 * MobileFrontend — Patrick Reilly, Jon Robson, Max Semenik, and Arthur Richards all worked on refactoring MobileFrontend to make it [//lists.wikimedia.org/pipermail/wikitech-l/2012-February/057936.html less Wikimedia-centric]. We also expanded our API to return pages in a mobile-friendly format and cleaned up a lot of our JavaScript and CSS code.


 * Wikipedia Zero — Patrick Reilly continued work on our zero-rated MediaWiki extension. He spent the month cleaning up issues reported by our partners and prepped the extension for more scheduled partner testing in March.


 * Wikipedia over SMS/USSD — Patrick Reilly, along with the Praekelt Foundation, presented a demo instance of an SMS/USSD gateway to access Wikipedia at Mobile World Congress.


 * GPS Storage/Retrieval — Max Semenik has completed the first iteration of the GeoData extension and deployed it to a prototype wiki on Labs. Further work will wait until an app uses its data, which will show what else needs to be done.


 * Mobile design — Philip Chang and Heather Walls worked on new design mockups for full screen search, contact us, navigation (ongoing), and references.


 * MobileFrontend/Photo upload — Philip Chang started defining Photo Upload as two workflows, Basic and Advanced. Basic is modeled after the current Upload Wizard as the basis of a mobile-friendly workflow, and Advanced uses a Wikipedia article (or other site content) as the starting point and incorporates game dynamics.


 * UCOSP Spring 2012 — The UCOSP project has moved into a feature freeze for the duration of the reading break. The team is currently targeting bugs, cleaning things up and improving usability in the v0.1 Alpha release. Check their to-do list for the next steps.

Fundraising support

 * 2012 Wikimedia fundraiser — Adding support for recurring GlobalCollect donations was the primary engineering focus in February, with work on this functionality carrying over into March. Several deployments were made to the payments cluster to improve our form localization in several countries in Africa. A subset of those forms was used in a week-long banner and landing page test that also ran in February. A great deal of effort went into expanding the team by two more people; the search for new fundraising engineers is ongoing.

Offline

 * Kiwix UX initiative — The Kiwix project released a new version of Kiwix for Sugar in February. Work continues on the next major release of Kiwix for all platforms.

MediaWiki Core

 * MediaWiki 1.19/Roadmap — In February, MediaWiki 1.19 was gradually deployed to Wikimedia sites. Stages 1 through 4 of the deployment schedule have been completed; all sister projects, and a few Wikipedia wikis, are now running MediaWiki 1.19. Many new features and bug fixes brought by MediaWiki 1.19 are back-end, behind-the-scenes changes, for example infrastructure work to support our ongoing [//blog.wikimedia.org/2012/02/09/scaling-media-storage-at-wikimedia-with-swift/ move to Swift] as our media storage platform. There are also more visible improvements, like better diff readability for colorblind people, and better support of the user's gender and language in the interface. A list of all changes is available in the [//svn.wikimedia.org/viewvc/mediawiki/branches/REL1_19/phase3/RELEASE-NOTES-1.19?view=markup draft release notes].
 * Continuous integration — The focus for February has been on integration with Git and Gerrit (bug 34141). Various Ant configurations were merged into a single Ant configuration, reducing duplication and opportunity for breakage. Also new in February, the tests have been broken up into three suites: dbless, db, and parser. Only dbless tests (which run quickly) will block a commit. Work on CI slowed down due to Antoine being pulled into 1.19 bugfixing.
 * Git conversion — The migration of MediaWiki core and of extensions used on WMF sites is now tentatively scheduled for 21 March 2012. Sumana Harihareswara and Chad Horohoe wrote a blog entry to answer some common questions about the migration. WMF Engineering is making strong efforts to train the development community in using Git and Gerrit, including documentation and trainings via screensharing on February 27 and 28. We will be moving Gerrit infrastructure to Ashburn, as the Tampa server we're using can't handle the load.
 * Multimedia — Michael Dale and Jan Gerber continued testing and improving the TimedMediaHandler extension setup in Wikimedia Labs (inside the Wikimedia Commons deployment-prep wiki). It will be tested there before it is prepared for deployment to production, with the required transcoding infrastructure. After more polish by Aaron Schulz and a February deployment of SwiftMedia and Swift, Swift is now serving 100% of thumbnails on Wikimedia Commons (engineering report).
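The commit gating rule from the continuous integration item above (three suites, of which only dbless blocks a commit) can be sketched as follows. This is an illustration only, not the actual Ant or Gerrit configuration. The function names and the result dictionary are hypothetical, and treating a missing dbless result as blocking is an assumption of this sketch.

```python
from typing import Dict, List

# Suite names come from the report; only "dbless" blocks a commit.
ALL_SUITES = ("dbless", "db", "parser")
BLOCKING_SUITES = frozenset({"dbless"})


def commit_is_blocked(results: Dict[str, bool]) -> bool:
    """Return True if any blocking suite failed.

    `results` maps suite name to True (passed) or False (failed).
    A blocking suite with no recorded result counts as failed
    (a conservative assumption made for this sketch).
    """
    return any(not results.get(suite, False) for suite in BLOCKING_SUITES)


def non_blocking_failures(results: Dict[str, bool]) -> List[str]:
    """List failed suites that are reported but do not block the commit."""
    return sorted(
        suite for suite in ALL_SUITES
        if suite not in BLOCKING_SUITES and results.get(suite) is False
    )
```

Under this rule, a commit whose db and parser suites fail but whose dbless suite passes is not blocked; the slower suites still report their failures for follow-up.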

Wikimedia analytics

 * Analytics/Reportcard — Erik Zachte generated a world map with per country coloring of Wikipedia usage.

Technical Liaison; Developer Relations

 * Bug management — Mark Hershberger has been using the 1.19 deployment cycle to work with on-wiki editors to find and fix bugs as the deployment progresses. He hopes the relationships built this way will make future deployments smoother.
 * Summer of Code 2011/management — After contact from Sumana Harihareswara, GSoC students Kevin Brown and Akshay Agarwal have returned to their projects, working towards the goal of getting them deployed and in use on Wikimedia wikis.
 * Summer of Code 2012/management — MediaWiki aims to participate in Google Summer of Code 2012. Sumana Harihareswara, as MediaWiki's organizational administrator, improved the Summer of Code 2012 page, added project ideas, and communicated with prospective mentors and students to flesh out project ideas and help students learn.
 * Wikimedia Foundation engineering project documentation — Guillaume Paumier performed perennial maintenance on project pages and the Roadmap, and put together this report.
 * Volunteer coordination and outreach — Sumana Harihareswara continued to follow up on contacts and recruit new contributors to the Wikimedia tech community (especially for commit and patch review), and mentor new contributors. Sumana also prepared for the June Berlin hackathon and the Wikimania hackathon in July and recruited participants for upcoming events. 13 contributors got commit access.
 * Wikimedia blog maintenance — Rob Halsell upgraded the WordPress code and third-party plugins to their latest versions, and started to set up a test instance in Wikimedia Labs. The recent improvements to the theme and new features in the WMBlog plugin are awaiting deployment to the production site. In the meantime, Guillaume Paumier has started to rewrite the theme's layout so it can be centered.

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.