Wikimedia Engineering/Report/2012/February

 Engineering metrics in February:
 * 67 unique committers contributed code to MediaWiki.
 * About 530 code commits were reviewed.
 * The total number of unreviewed commits went from 44 to 31.
 * About 33 shell requests were processed.
 * Developers got commit access, some of them volunteers.
 * Wikimedia Labs now hosts 59 projects, 97 instances and 126 users.

Major news in February includes:
 * Swift deployment for thumbnails
 * MediaWiki 1.19 deployment to all Wikimedia sites except most Wikipedia language versions

Recent events

 * Pune hackathon (10–12 February 2012, Pune, India) — A few dozen participants came to this three-day developer outreach event cohosted with GNUnify. Participants focused on language support (internationalization and localization) and mobile applications. Some new translations were created and the Wikimedia Mobile team received improvements to the Android app for Wikipedia.


 * GLAMcamp DC (10–12 February 2012, Washington, D.C., USA) — A Wikipedia citation tool was developed as a web browser extension. It lets users obtain a citation from any online MARC library catalog, formatted for a specific language version of Wikipedia. A mass upload script was also written for importing the images and metadata of the Walters Art Museum. The results of the test run can be seen on Commons. The full collection (~20,000 images) will be uploaded in March (see documentation).

Upcoming events

 * Chennai Hackathon March 2012 (17 March 2012, Chennai, India) — Yuvaraj Pandian and volunteer Srikanthlogic are hosting this one-day hackathon for experienced developers. Volunteers can work with the MediaWiki API and other Wikimedia technologies and show off their accomplishments.


 * Berlin hackathon (1–3 June 2012, Berlin, Germany) — Wikimedia Germany is hosting this three-day "inreach" hackathon for the Wikimedia technical community, including MediaWiki developers, Toolserver users, bot writers and maintainers, Gadget creators, and other Wikimedia technologists. The event will mostly involve focused sprints, bugbashing, and other coding, with a few focused tutorials and trainings on Git, Lua, Gadgets changes, or other topics of interest. Wikimedia Germany will also use this event to consult on and discuss the Wikidata structured data project. Wikimedia developers will soon get more information on travel sponsorships.


 * Wikimania hackathon (10–11 July 2012, Washington, D.C., USA) — Katie Filbert, Gregory Varnum, and Sumana Harihareswara have begun planning the hybrid inreach/outreach hackathon occurring just prior to Wikimania. Experienced Wikimedia technologists will collaborate while interested new developers will be able to learn introductory MediaWiki development. The organizers are deciding on themes and focus topics for the event, possibly including accessibility.

Job openings
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.


 * Developers and engineers:
 * Senior Software Engineer Front-end
 * Interaction Designer
 * Software Developer (Back-end, Data Analytics)
 * Software Developer (Rich Text Editing, Features)
 * Software Developer (Front-end)
 * Software Developer (Mobile)
 * Software Security Engineer


 * Management & Product:
 * Technical Product Analyst


 * Requests for proposals:
 * Mobile QA — Help us set up testing and automation processes for all Wikimedia Mobile projects.
 * Lucene Search Operations Engineer — Help us maintain and improve our Search software stack and infrastructure.

Short news

 * David Schoonover joined the Platform engineering team as Systems Engineer for Data Analytics (announcement).
 * Jon Robson joined the Mobile engineering team as Software Developer for Mobile (announcement).
 * Terry Chay joined the Wikimedia Foundation as Director of Features Engineering (announcement).
 * Christian Aistleitner joined the Operations team as a contractor working on the XML dump infrastructure (announcement).

Site infrastructure

 * Ashburn data center — Mark Bergsma and Peter Youngmeister completed the setup and deployment of a new Squid-based caching infrastructure for text.wikimedia.org and api.wikimedia.org. Mark also added capacity and redundancy for bits.wikimedia.org. Leslie Carr upgraded and migrated ganglia.wikimedia.org out of the Tampa data center.


 * Tampa data center — The team completed a work plan for the search infrastructure, which includes short-, medium- and longer-term fixes. Short-term fixes (mainly configuration tweaks and moving data around) were implemented and brought back some stability. Medium-term fixes involve puppetizing the current configuration, upgrading some of the components and building new infrastructure in the Ashburn data center. We also supported the deployment of MediaWiki 1.19 and the associated database schema changes. Lastly, new database servers were added to the core clusters to address capacity and performance requirements, and to retire some of the older servers.


 * Media Storage — February saw Swift deployed to production to serve thumbnail requests. A few bugs were fixed, but one was serious enough to revert the deployment and fall back to the legacy thumbnail infrastructure. Once the issue is fixed and Swift serves thumbnails again, the next steps will involve documentation and maintenance procedures, creating a mirror cluster in Ashburn, setting up Swift in Wikimedia Labs, and handling original media (not just thumbnails) with Swift.

Testing environment

 * Wikimedia Labs — The gluster project storage has been racked and installed, and glusterd has been installed and peered in a cluster. Work is ongoing to mount the storage on instances automatically. We switched the scheduler used for compute nodes so that it chooses the compute host with the fewest instances, rather than picking a random host (Simple scheduler). Ryan Lane gave a talk on Labs at FOSDEM, entitled "Infrastructure as an open-source project", with a good turnout of about 500 people. Sara Smollett replaced the LDAP library with another one to fix a bug when using TLS/SSL. Andrew Bogott has spent a long time working on adding gluster control to OpenStack, but has hit some technical issues; he's now working on fixing OpenStack's Unicode support.
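
The scheduler change described above can be illustrated with a minimal sketch. This is not OpenStack's code: the function names, the host names and the dictionary of instance counts are hypothetical, and the only behaviour taken from the report is "pick the host with the fewest instances instead of a random one". Ties are broken by host name here, which is an assumption made to keep the example deterministic.

```python
import random
from typing import Dict, List, Optional


def pick_random_host(instances_per_host: Dict[str, int],
                     rng: Optional[random.Random] = None) -> str:
    """Old behaviour: pick any compute host, ignoring its current load."""
    chooser = rng or random
    return chooser.choice(sorted(instances_per_host))


def pick_least_loaded_host(instances_per_host: Dict[str, int]) -> str:
    """New behaviour: pick the host that currently runs the fewest instances.

    Hosts are visited in name order, so ties go to the first name
    (an assumption of this sketch, not something stated in the report).
    """
    if not instances_per_host:
        raise ValueError("no compute hosts available")
    return min(sorted(instances_per_host), key=lambda host: instances_per_host[host])


def schedule(instances_per_host: Dict[str, int], count: int) -> List[str]:
    """Place `count` new instances one by one, updating the counts in place."""
    placements = []
    for _ in range(count):
        host = pick_least_loaded_host(instances_per_host)
        instances_per_host[host] += 1
        placements.append(host)
    return placements


# Hypothetical host names and instance counts, for illustration only.
hosts = {"virt1": 3, "virt2": 1, "virt3": 2}
placements = schedule(hosts, 3)
```

With these made-up counts, new instances go to the emptier hosts until all three hosts run the same number of instances, whereas a random choice could keep adding load to an already busy host.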

Backups and data archives

 * Data Dumps — We now have a copy of all dumps on a secondary host in another data center. We've been working with two organizations on full mirrors of the dumps, sorting out performance issues before they can go live. Christian Aistleitner has started to work on a test framework for the dumps. We've made contact with the Internet Archive, and we're working on scripts using the S3 API to push our historical dump archive to their servers. With the transition to MediaWiki 1.19 under way, we're also checking that dumps are generated correctly for migrated projects.

Other news

 * Domain names — The Wikimedia Foundation has started to move its domain names away from GoDaddy.
 * Squid issue — An issue with Swift thumbnails led to an accidental restart of all Squid servers, which took longer than expected and caused site issues.
 * DDoS attack — Domas Mituzas noticed a distributed denial of service attack on February 27. It involved flooding our Squid cache servers by POSTing 1 MB files to the root directory. Mark Bergsma blocked the requests. The incident lasted for about 10 minutes, and some Wikipedia users experienced slow responses or timeouts.

Mobile

 * Wikipedia Mobile App — In February, the Wikipedia Android app passed 1.8 million device installs. This is incredible growth, as the app has been in the Android Market for just under two months, making it one of our fastest-growing applications ever. Yuvi Panda released a new beta version of both the Wikipedia Android app and the newly rewritten Wikipedia iOS app this month. The rewritten iOS version builds on our PhoneGap code base and will allow us to deprecate our old native Objective-C code base. New features include OSM integration, quick search integration, URL intents, search enhancements, bug fixes, better developer attribution, and more.


 * Mobile Frontend - Patrick Reilly, Jon Robson, and Arthur Richards all worked on a refactor of Mobile Frontend to make it less Wikimedia-centric.


 * WikipediaZero — Patrick Reilly continued work on our zero-rated MediaWiki extension. He spent the month cleaning up issues reported by our partners and prepared the extension for more scheduled partner testing in March.


 * Wikipedia over SMS/USSD — Patrick Reilly, along with the Praekelt Foundation, presented a demo instance of an SMS/USSD gateway to access Wikipedia at Mobile World Congress.


 * GPS Storage/Retrieval —


 * Mobile Designs - Phillip Chang and Heather Walls worked on new design mockups for full-screen search, the "contact us" page, navigation (ongoing), references, and more.


 * Wiktionary app —

Offline

 * Kiwix UX initiative —

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.