Wikimedia Engineering/Report/2012/August

Engineering metrics in August:
 * 97 unique committers contributed patchsets of code to MediaWiki.
 * The total number of unreviewed commits remained stable around 360.
 * About 35 shell requests were processed.
 * About 25 developers got access to Git and Wikimedia Labs.
 * Wikimedia Labs now hosts 120 projects, 214 instances and 587 users; to date 999 instances have been created.

Major news in August includes:
 * A [//blog.wikimedia.org/2012/08/06/wikimedia-site-outage-6-august-2012/ site outage] caused by a fiber cut between our two data centers;
 * Progress by the Internationalization team on the [//blog.wikimedia.org/2012/08/14/internationalisation-language-selector-milkshake/ Universal language selector & Milkshake], and [//blog.wikimedia.org/2012/08/24/webfonts-in-uls-translation-rally/ WebFonts];
 * Changes in our analytics system to [//blog.wikimedia.org/2012/08/31/improving-the-accuracy-of-the-active-editors-metric/ improve the accuracy of the active editors metric];
 * Major work on the [//blog.wikimedia.org/2012/09/01/wiki-loves-monuments-for-mobile-is-here/ Wiki Loves Monuments] app.

Recent events
Wikipedia Engineering Meetup (15 August 2012, San Francisco, USA)
 * Approximately 100 people attended the first Wikipedia Engineering Meetup in San Francisco, in a series meant to showcase Wikimedia's interesting engineering problems and products to the local developer community. Tentatively, the meetup will happen every two months at the Wikimedia offices in San Francisco, and will consist of three 15-minute engineering presentations, followed by a question & answer period bracketed by mingling. The inaugural meetup featured talks about Mobile engineering, Analytics and the VisualEditor.

Upcoming events
Wikimedia's internationalization and mobile teams are tentatively planning a volunteer outreach event in Bangalore, India, November 9–11. More information will come in September.

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.



Announcements

 * Srikanth Lakshmanan joined the Internationalization and localization team as outreach coordinator / QA engineer, contractor (announcement).
 * Daniel Zahn moved from Germany to the San Francisco office, and joined the Operations team as full-time Technical Operations Engineer.
 * Andrew Bogott was converted from contractor to full-time Dev/Ops Engineer, working on Labs development.
 * Asher Feldman was promoted to the position of Site Architect.

Technical Operations
Site infrastructure
 * Continuing from his earlier MySQL work, Asher Feldman built additional MySQL servers for each of the clusters in Ashburn, all in preparation for the primary data center migration in the coming quarter. In the Tampa datacenter, he added a new server to the English Wikipedia (en.wp) cluster and replaced the en.wp master with newer hardware. A database tree chart provides the latest information on our database clusters.


 * Thanks to Varnish Software support, we have a new build of Varnish that comes with persistent cache and the video streaming bug fix. Mark Bergsma tested the build on one of the mobile Varnish servers, and so far it has been stable. In the coming days, Mark will update the 'upload' Varnish cluster at Ashburn (Eqiad) and move traffic through it.


 * Mark has also successfully updated and deployed the NetApp storage servers and enabled replication from Tampa to Ashburn. He started migrating some of the systems that currently mount nfs1 over to the new storage. With this, Mark has resolved another critical path item on the migration to the new primary data center. In addition, Jeff Green started using nas1-a to archive the Fundraising banner logs.

Network Infrastructure
 * The usual traffic surge due to the new school year caused an increase in packet loss on our Tampa internal network. With Chris Johnson's help, Mark upgraded the links between the racks. Earlier this month, Leslie Carr and Chris installed a new passive optics (CWDM) system between the two floors of the Tampa datacenter hosting our servers, effectively giving us a 4X capacity increase.

Fundraising Infrastructure
 * Jeff continues to make progress on the Fundraising infrastructure buildup at Ashburn (Eqiad). With Leslie's help, the new firewall was set up. Jeff deployed a build host, a logging host and the application cluster, and built the pxeboot, preseed and puppet configurations. He has also enabled nagios-nsca monitoring for those new hosts.

Object Store/Swift
 * 'Originals' were successfully copied over to the Swift cluster from ms7 (an NFS filer for images). In addition to serving thumbnails (which was completed last month), Swift is now also the primary object store for images and multimedia content. In the current setup, MediaWiki reads from Swift only, but writes to both the Swift cluster and the legacy NFS servers (ms5 & ms7). In the coming months, we will disable ms5 & ms7 and run solely on Swift.
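
The transitional setup described above (read from the new store only, write to both the new and the legacy stores) can be sketched roughly as follows. This is a minimal illustration in Python, not MediaWiki's actual file backend code: `InMemoryStore`, `DualWriteStore` and `retire_legacy` are hypothetical names, and the in-memory dictionaries merely stand in for Swift and the NFS filers.

```python
class InMemoryStore:
    """Hypothetical stand-in for a real backend (a Swift cluster or an NFS filer)."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]


class DualWriteStore:
    """Reads go to the primary store only; writes go to the primary and every legacy store."""

    def __init__(self, primary, legacy):
        self.primary = primary
        self.legacy = list(legacy)

    def put(self, key, data):
        # Write to the primary first, then mirror the write to the legacy stores
        # so they stay usable as a fallback during the transition.
        self.primary.put(key, data)
        for store in self.legacy:
            store.put(key, data)

    def get(self, key):
        # Reads never touch the legacy stores.
        return self.primary.get(key)

    def retire_legacy(self):
        # Assumed final step: stop mirroring writes once the legacy stores are disabled.
        self.legacy = []


swift = InMemoryStore("swift")
ms5 = InMemoryStore("ms5")
ms7 = InMemoryStore("ms7")
store = DualWriteStore(swift, [ms5, ms7])
store.put("Example.jpg", b"original bytes")
```

Keeping the legacy stores written but unread means they can be switched off later without a further data migration, at the cost of extra write traffic in the meantime.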

Wikimedia Labs
 * This month was mostly spent on upgrading all of the Labs infrastructure. OpenStack nova and glance were upgraded to the essex release. The keystone service was added and now handles all authentication for Labs-related OpenStack services. OpenStackManager was upgraded to support keystone, to use the OpenStack API rather than the EC2 API, and to have multi-region support, in anticipation of the new region we'll be bringing up in Eqiad. Testing of ceph as a replacement for gluster for project storage continued during this month; more testing is required. A lot of puppet work has been done to start moving our spaghetti code-style repository into modules.

Data Dumps
 * We've been focusing on the media infrastructure, working on the migration to Swift, and also taking a hard look at scaled media usage and storage. Since scaled media (thumbnails) can be regenerated at will from the original, we are going to evaluate treating thumb storage as a medium-to-long term cache rather than as permanent disk storage, as we have been doing. Running the numbers on existing thumbs turned up some interesting results. We're still bringing mirrors online; we've gotten all the hardware and network issues worked out with WANSecurity and have started copying over the data. This mirror will have files most mirrors don't host: page view files, archives, and more, as well as a full copy of our media files.
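
The idea of treating thumbnail storage as a cache, with thumbs regenerated from the original on a miss, might look like the sketch below. It is only an illustration under assumptions: the report does not say which eviction policy is being evaluated, so least-recently-used eviction is chosen here for simplicity, and `ThumbCache`, `fake_render` and `max_entries` are hypothetical names.

```python
from collections import OrderedDict


class ThumbCache:
    """Bounded thumbnail cache: a miss regenerates the thumb from the original."""

    def __init__(self, render, max_entries):
        self.render = render            # callable (name, width) -> thumbnail
        self.max_entries = max_entries  # assumed capacity limit
        self.entries = OrderedDict()    # ordered from least to most recently used

    def get(self, name, width):
        key = (name, width)
        if key in self.entries:
            # Cache hit: mark the entry as recently used.
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: regenerate from the original, then store the result.
        thumb = self.render(name, width)
        self.entries[key] = thumb
        if len(self.entries) > self.max_entries:
            # Evict the least recently used thumbnail; it can be rebuilt later.
            self.entries.popitem(last=False)
        return thumb


render_calls = []


def fake_render(name, width):
    # Hypothetical renderer that records each regeneration.
    render_calls.append((name, width))
    return f"{name}@{width}px"


cache = ThumbCache(fake_render, max_entries=2)
first = cache.get("A.jpg", 120)   # miss: rendered
second = cache.get("A.jpg", 120)  # hit: served from the cache
```

Because any evicted thumbnail can be rebuilt from its original, losing a cache entry costs only rendering time, never data.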

Offline
Kiwix
 * Our work mostly focused on the 0.9 RC2 (see CHANGELOG) which should be released soon after we port kiwix-serve to MS/Windows. Kiwix UI localization was improved, thanks to the translatewiki.net Translation Rally; four new languages have been added. For the ZIM autobuild project, we have migrated the server to a datacenter in Zurich, Switzerland, and coding work is ongoing. We are planning our next projects and seeking volunteer help.

Wikidata

 * The Wikidata project is funded and executed by Wikimedia Deutschland.

The team has continued getting the code base ready for a first deployment. You can try the current state on the demo system. Work focused on diff, undo, migrating to the Universal Language Selector, and providing useful edit summaries in recent changes and article history. They also published a draft for the export to RDF.

The team published a list of getting-started tasks to make it easier to contribute to Wikidata.

Joan Creus released pywikidata, a framework for Wikidata bots.

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.