Wikimedia Engineering/Report/2012/February

 Engineering metrics in February:
 * 67 unique committers contributed code to MediaWiki.
 * About 530 code commits were reviewed.
 * The total number of unreviewed commits went from 44 to 31.
 * About 33 shell requests were processed.
 * 13 developers got commit access, six of whom are volunteers.
 * Wikimedia Labs now hosts 59 projects, 97 instances and 126 users.

Major news in February includes:
 * The difficult deployment of our Swift infrastructure to serve image thumbnails;
 * Continued success for our Wikipedia Android app;
 * The deployment of MediaWiki 1.19 to all Wikimedia sites except for most Wikipedia languages;
 * Continued preparation for our [//blog.wikimedia.org/2012/02/15/wikimedia-engineering-moving-from-subversion-to-git/ move from Subversion to Git].


Recent events

 * Pune hackathon (10–12 February 2012, Pune, India) — A few dozen participants came to this three-day developer outreach event cohosted with GNUnify. Participants focused on language support (internationalization and localization) and mobile applications. Some new translations were created and the Wikimedia Mobile team received improvements to the Wikipedia Android app.


 * GLAMcamp DC (10–12 February 2012, Washington, D.C., USA) — A Wikipedia citation tool was developed as a web browser extension. It allows users to obtain a citation from any online MARC library catalog, formatted for a specific language version of Wikipedia. A mass upload script was also written for importing the images and metadata of the Walters Art Museum. The results of the test run can be seen on Commons. The full collection (~20,000 images) will be uploaded in March (see documentation).

Upcoming events

 * Chennai Hackathon March 2012 (17 March 2012, Chennai, India) — Yuvaraj Pandian and volunteer Srikanthlogic are hosting this one-day hackathon for experienced developers. Volunteers can work with the MediaWiki API and other Wikimedia technologies and show off their accomplishments.


 * Berlin hackathon (1–3 June 2012, Berlin, Germany) — Wikimedia Germany is hosting this three-day "inreach" hackathon for the Wikimedia technical community, including MediaWiki developers, Toolserver users, bot writers and maintainers, Gadget creators, and other Wikimedia technologists. The event will mostly involve focused sprints, bugbashing, and other coding, with a few focused tutorials and trainings on Git, Lua, Gadgets changes, or other topics of interest. Wikimedia Germany will also use this event to consult on and discuss the Wikidata structured data project. Wikimedia developers will soon get more information on travel sponsorships.


 * Wikimania hackathon (10–11 July 2012, Washington, D.C., USA) — Katie Filbert, Gregory Varnum, and Sumana Harihareswara have begun planning the hybrid inreach/outreach hackathon occurring just prior to Wikimania. Experienced Wikimedia technologists will collaborate, while interested new developers will be able to learn introductory MediaWiki development. The organizers are deciding on themes and focus topics for the event, possibly including accessibility.

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.


 * Lucene Search Operations Engineer (RFP)
 * Mobile Quality Assurance (RFP)
 * Senior Software Frontend Engineer
 * Software Developer Backend
 * Software Developer Frontend
 * Software Developer Mobile
 * Software Security Engineer
 * Technical Product Analyst
 * Interaction Designer

New hires

 * David Schoonover joined the Platform engineering team as Systems Engineer for Data Analytics (announcement).
 * Jon Robson joined the Mobile engineering team as Software Developer for Mobile (announcement).
 * Terry Chay joined the Wikimedia Foundation as Director of Features Engineering (announcement).
 * Christian Aistleitner joined the Operations team as a contractor working on the XML dump infrastructure (announcement).

Site infrastructure

 * Ashburn data center — Mark Bergsma and Peter Youngmeister completed the setup and deployment of a new Squid-based caching infrastructure for text.wikimedia.org and api.wikimedia.org. Mark also added capacity and redundancy for bits.wikimedia.org. Leslie Carr upgraded and migrated ganglia.wikimedia.org out of the Tampa data center.


 * Tampa data center — The team completed a work plan for the search infrastructure, which includes short-, medium- and longer-term fixes. Short-term fixes (mainly configuration tweaks and moving data around) were implemented and restored some stability. Medium-term fixes involve puppetizing the current configuration, upgrading some of the components and building new infrastructure in the Ashburn data center. We also supported the deployment of MediaWiki 1.19 and the associated database schema changes. Finally, new database servers were added to the core clusters to address capacity and performance requirements and to retire some of the older servers.


 * Media Storage — Swift was deployed to production in February to serve thumbnail requests. A few bugs were fixed, but one was serious enough that the deployment was reverted in favor of the legacy thumbnail infrastructure. Once the issue is fixed and Swift serves thumbnails again, the next steps will involve documentation and maintenance procedures, creating a mirror cluster in Ashburn, setting up Swift in Wikimedia Labs, and handling original media (not just thumbnails) with Swift.

Testing environment

 * Wikimedia Labs — The gluster project storage has been racked and installed, and its servers have been peered in a cluster. Work is ongoing to mount the storage on instances automatically. We switched the scheduler used for compute nodes to choose the compute host that has the fewest instances, rather than picking a random host (Simple scheduler). Ryan Lane gave a talk at FOSDEM on Labs entitled Infrastructure as an open-source project, with a good turnout of about 500 people. Sara Smollett replaced the LDAP library to fix a bug when using TLS/SSL. Andrew Bogott has spent a long time working on adding gluster control to OpenStack, but has hit some technical issues; he's now working on fixing OpenStack's Unicode support.
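
The scheduler change mentioned in the Wikimedia Labs item (placing a new instance on the compute host with the fewest instances instead of on a random host) can be illustrated with a small sketch. This is not OpenStack's scheduler code; the function names, host names and instance counts below are hypothetical and only show the difference between the two selection policies.

```python
import random


def pick_random_host(instance_counts, rng=random):
    """Previous policy (illustrative): pick any compute host at random."""
    return rng.choice(sorted(instance_counts))


def pick_least_loaded_host(instance_counts):
    """New policy (illustrative): pick the host running the fewest instances.

    Ties are broken by host name so the result is deterministic.
    """
    if not instance_counts:
        raise ValueError("no compute hosts available")
    return min(instance_counts, key=lambda host: (instance_counts[host], host))


# Hypothetical host names and instance counts, for illustration only.
hosts = {"compute-a": 12, "compute-b": 7, "compute-c": 9}
chosen = pick_least_loaded_host(hosts)
hosts[chosen] += 1  # the new instance now counts against that host
```

Under the least-loaded policy, new instances tend to spread evenly across hosts over time, whereas random placement can leave some hosts much busier than others.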

Backups and data archives

 * Data Dumps — We now have a copy of all dumps on a secondary host in another data center. We've been working with two organizations on full mirrors of the dumps, sorting out performance issues before they can go live. Christian Aistleitner has started to work on a test framework for the dumps. We've made contact with the Internet Archive, and we're working on scripts using the S3 API to push our historical dump archives to their servers. We're also checking that dumps are generated correctly after the deployment of MediaWiki 1.19.

Other news

 * Domain names — The Wikimedia Foundation has started to move its domain names from GoDaddy to MarkMonitor (announcement).
 * Squid issue — An issue with Swift thumbnails led to an accidental restart of all Squid servers, which took longer than expected and caused site issues.
 * DDoS attack — Domas Mituzas noticed a distributed denial of service attack on February 27th. It involved flooding our Squid cache servers by POSTing 1 MB files to the root directory. Mark Bergsma blocked the requests. The incident lasted for about 10 minutes, and some Wikipedia users experienced slow responses or timeouts.

Mobile

 * Wikipedia Mobile App — In February, the [//market.android.com/details?id=org.wikipedia Wikipedia Android app] passed 1.8 million device installs. This is incredible growth, as the app has been in the Android Market for just under two months. Yuvaraj Pandian announced a new beta version of both the [//lists.wikimedia.org/pipermail/mobile-l/2012-February/005379.html Android app] and the newly rewritten iOS app. The rewritten iOS version builds on our PhoneGap code base, and will allow us to deprecate our old Objective-C code. New features include: OSM integration, quick search integration, URL intents, search enhancements, bug fixes, better developer attribution, and more.


 * Wikipedia Zero — Patrick Reilly continued work on our zero-rated MediaWiki extension. He spent the month cleaning up issues reported by our partners and prepped the extension for more scheduled partner testing in March.


 * Wikipedia over SMS/USSD — Patrick Reilly, along with the Praekelt Foundation, presented a demo instance of an SMS/USSD gateway to access Wikipedia at Mobile World Congress.


 * GPS Storage/Retrieval — Max Semenik has completed the first iteration of the GeoData extension and deployed it to a prototype wiki on Labs. Further work will wait until an app uses its data, which will help us figure out what else needs to be done.

Offline

 * Kiwix UX initiative — The Kiwix project released a new version of Kiwix for Sugar in February. Work continues on the next major release of Kiwix for all platforms.

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.