Wikimedia Engineering/Report/2012/August

 Engineering metrics in August:
 * 97 unique committers contributed patchsets of code to MediaWiki.
 * The total number of unreviewed commits remained stable around 360.
 * About 35 shell requests were processed.
 * About developers got access to Git and Wikimedia Labs.
 * Wikimedia Labs now hosts 120 projects, 204 instances and 587 users; to date 999 instances have been created.

Major news in August include:

Recent events
Wikipedia Engineering Meetup (15 August 2012, San Francisco, USA)

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.



Announcements

 * Srikanth Lakshmanan joined the Internationalization and localization team as outreach coordinator / QA engineer, contractor (announcement).
 * Daniel Zahn (irc:mutante) moved from Germany to the San Francisco office and joined us as a full-time Technical Operations Engineer.
 * Andrew Bogott (irc:andrewbogott) was converted from contractor to full-time Dev/Ops Engineer, working on Labs development.
 * Asher Feldman (irc:binasher) has been promoted to Site Architect.

Technical Operations
Site infrastructure
 * Continuing his earlier MySQL work, Asher built additional MySQL servers for each of the clusters in Ashburn, in preparation for the primary data center migration in the coming quarter. In the Tampa data center, he added a new server to the En cluster and replaced the En master with newer hardware. The latest information on our database clusters can be found here.


 * Thanks to Varnish Software support, we have a new build of Varnish that includes persistent cache and the video streaming bug fix. Mark tested the build on one of the mobile Varnish servers, and so far it has been stable. In the coming days, Mark will update the 'upload' Varnish cluster at Ashburn (Eqiad) and move traffic through it.


 * Mark successfully updated and deployed the NetApp storage servers and enabled replication from Tampa to Ashburn. He has started migrating some of the systems that mount nfs1 to the new storage. To date, nas1-a (in Tampa) serves the Tampa /home to Fenari, Hume, Spence and Srv193 (testwiki). With this, Mark has resolved another critical path item for the migration to the new primary data center. In addition, Jeff started using nas1-a to archive the Fundraising banner logs.

Network Infrastructure
 * With the start of the new school year, we saw the usual traffic surge, and the higher load caused an increase in packet loss on our Tampa internal network. With Chris' help, Mark upgraded the links between the racks to either 2x GigE aggregated or shared 2x 10G aggregated for the entire row stack, which resolved the packet loss caused by bandwidth capacity constraints. Leslie had also noticed that the network capacity between the two Tampa floors (which house our Tampa data centers) was approaching saturation. Earlier this month, Leslie and Chris installed a new passive optics (CWDM) system between the two floors, effectively giving us a 4x capacity increase.

Fundraising Infrastructure
 * Jeff continues to make progress on the Fundraising infrastructure buildout at Ashburn (EQIAD). With Leslie's help, the new firewall was set up, and Jeff deployed Boron (build host), Indium (logging host) and the application cluster, and built the pxeboot, preseed and puppet configurations. He has also enabled nagios-nsca monitoring for those new hosts.

Object Store/Swift
 * The contents of ms7 ('Originals'), an NFS filer for images, were successfully copied over to the Swift cluster. In addition to thumbnails (completed last month), Swift is now also the primary object store for images and multimedia content (aka Originals). In the current setup, MediaWiki is modified to read from Swift only, but to write to both the Swift cluster and the NFS servers (ms5 & ms7). In the coming months, we will disable ms5 & ms7 and run solely on Swift.
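The read-from-Swift, write-to-both arrangement described above can be sketched roughly as follows. This is only an illustration of the pattern, not MediaWiki's actual file backend code: the `InMemoryStore` and `DualWriteStore` classes and the in-memory dictionaries standing in for Swift and the NFS filers are all hypothetical.

```python
class InMemoryStore:
    """Hypothetical stand-in for a storage backend (Swift or an NFS filer)."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]


class DualWriteStore:
    """Reads come from the primary only; writes go to every backend."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = list(secondaries)

    def write(self, key, data):
        # Write to the primary first, then mirror to the legacy backends.
        self.primary.put(key, data)
        for backend in self.secondaries:
            backend.put(key, data)

    def read(self, key):
        # The legacy backends are never consulted on the read path.
        return self.primary.get(key)


# Assumed setup: one Swift-like primary and one NFS-like secondary.
swift = InMemoryStore("swift")
nfs = InMemoryStore("nfs")
store = DualWriteStore(swift, [nfs])
store.write("Example.jpg", b"original bytes")
```

Keeping the old backends on the write path only means they stay current as a fallback, and removing them later is a matter of emptying the list of secondaries.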

Wikimedia Labs
 * This was another stability-cycle month for Labs. Next month will start a features cycle, though we'll still be doing some stability work as well. This month was mostly spent upgrading all of the Labs infrastructure. OpenStack nova and glance were upgraded to the Essex release. The keystone service was added and now handles all authentication for Labs-related OpenStack services. OpenStackManager was upgraded to support keystone, to use the OpenStack API rather than the EC2 API, and to have multi-region support, in anticipation of the new region we'll be bringing up in eqiad. Testing of Ceph as a replacement for Gluster for project storage continued this month; more testing is required. A lot of puppet work was also done to start moving our spaghetti-code style repo into modules.

Data Dumps
 * We've been focusing on the media infrastructure this month, working on the migration to Swift and taking a hard look at scaled media usage and storage. Since scaled media (aka thumbnails) can be regenerated at will from the originals, we are going to evaluate treating thumb storage as a medium-to-long term cache rather than as permanent disk storage, as we have been doing. Running the numbers on existing thumbs turned up some interesting results; see the wikitech-l email for more. We're still bringing mirrors online; we've worked out all the hardware and network issues with WANSecurity and have started copying over the data. They'll have files most mirrors don't host: page view files, archives, and more, as well as a full copy of our media files.
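To illustrate the idea of treating thumbnail storage as a cache, here is a minimal sketch of a bounded cache that regenerates a thumbnail from its original on a miss and evicts the least recently used entry when full. It is not the design under evaluation: the `ThumbCache` class, the `fake_render` function and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict


class ThumbCache:
    """Bounded cache; a miss regenerates the thumbnail from the original."""

    def __init__(self, render, max_entries):
        self.render = render            # hypothetical scaler: (name, width) -> thumb
        self.max_entries = max_entries
        self.entries = OrderedDict()    # ordered from least to most recently used
        self.renders = 0                # how many times a thumb had to be generated

    def get(self, name, width):
        key = (name, width)
        if key in self.entries:
            self.entries.move_to_end(key)   # mark as recently used
            return self.entries[key]
        thumb = self.render(name, width)    # miss: rebuild from the original
        self.renders += 1
        self.entries[key] = thumb
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used thumb
        return thumb


def fake_render(name, width):
    # Placeholder for real image scaling.
    return f"{name}@{width}px"


cache = ThumbCache(fake_render, max_entries=2)
cache.get("A.jpg", 120)   # miss, rendered
cache.get("A.jpg", 120)   # hit
cache.get("B.jpg", 120)   # miss, rendered
cache.get("C.jpg", 120)   # miss, rendered; evicts A.jpg
cache.get("A.jpg", 120)   # miss again, re-rendered from the original
```

The trade-off is extra rendering work for evicted thumbnails in exchange for storage that no longer grows with every size ever requested.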

Other news
 * Site issue Aug 6 2012

Offline
Kiwix

We have mostly worked on the 0.9 RC2 (see CHANGELOG), which should be released soon after kiwix-serve is ported to MS Windows. Kiwix UI localisation was improved: thanks to the Translatewiki Rally, four new languages are supported. For the ZIM autobuild project, we have migrated the server to a datacenter in Zurich, Switzerland, and coding work is ongoing. Otherwise, a lot of energy is going into planning the 2013 projects; we need volunteers and ideas... Join us!

Wikidata

 * The Wikidata project is funded and executed by Wikimedia Deutschland.

The team has been working further on getting the code-base ready for a first deployment. You can try the current status on the demo system. The things that were worked on include diff, undo, migrating to using the Universal Language Selector, and providing useful edit summaries in recent changes and article history. They also published a draft for the export to RDF.

The team made it easier to contribute to Wikidata, for example by publishing tasks to get started.

Joan Creus released pywikidata, which will make it easier to write bots for Wikidata, among other things.

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.