Wikimedia Engineering/Report/2012/July

 Engineering metrics in July:
 * The total number of unreviewed commits went from about 320 to about 360.
 * About 35 shell requests were processed.
 * About 80 developers got access to Git and Wikimedia Labs.
 * Wikimedia Labs now hosts 114 projects, 211 instances and 559 users.

Major news in July includes:

Recent events
Wikimania and Pre-Wikimania hackathon (10–15 July 2012, Washington, D.C., USA)
 * This year's pre-Wikimania Hackathon was special in that it had a full track for newcomers, going beyond tutorials. The Hackathon was a collaboration with OpenHatch, an open-source teaching non-profit. The new efforts included appropriate first-time tasks to orient newcomers toward more advanced Wikipedia editing and tech contribution, a laptop setup guide that steps attendees through the process of configuring development environments, and constant in-person assistance to help people past problems they encountered. At the event, we saw many people learning more about templates, editing Wikipedia, and using and modifying bots to improve the encyclopedia and the media on it. At least 65 people signed in, and more were surely in attendance. During the main Wikimania conference, a number of volunteers and staff gave talks and led discussions about technology-related topics.

Upcoming events
Wikipedia Engineering Meetup (15 August 2012, San Francisco, USA)


 * The Engineering department of the Wikimedia Foundation has initiated a Wikipedia Engineering Meetup to showcase the interesting problems and products they work on to the local developer community. Tentatively, the meetup will happen every two months at the Wikimedia offices in San Francisco, and will consist of three 15-minute engineering presentations, followed by a question & answer period bracketed by mingling. The inaugural meetup will feature talks about Mobile engineering, Analytics and the VisualEditor.

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.



Announcements

 * Peter Youngmeister, who was working as a contractor for the Operations team, was converted to full-time Technical Operations Engineer (announcement).
 * S Page joined the Editor engagement experiments team as Software Engineer (announcement).

Operations
Site infrastructure
 * July was a relatively quiet month for Operations, and the team was working mostly behind the scenes. Mark Bergsma has successfully integrated and tested the upgraded Varnish software (with persistent cache patch) on some of our mobile caching servers. They are working very well and the plan is to roll it out widely in the coming weeks.
 * Mark also made several feature upgrades to LVS/Pybal, including IPv6 BGP support, and a DNS recursor implemented as an LVS cluster and running on Ubuntu 12.04. The package has been puppetized and deployed. This also means EQIAD servers no longer hop to Tampa to get DNS answers.
 * Peter Youngmeister has been packaging, puppetizing and testing the new application server build. This build runs on Precise (Ubuntu 12.04) and works with the Swift object store (rather than the current NFS filer). There will be further performance and scalability tests across a bigger portion of the Tampa application servers shortly.
 * Asher Feldman has deployed an upgraded version of the parser cache server and the results have been impressive. This is relevant to every page request from logged-in and cookied logged-out users, so it should have a meaningful impact on the user experience. In addition, Asher has completed and deployed his latest (Precise) MySQL build on one of the database slaves.

Object Store/Swift
 * Migration to Swift is progressing to the final stages now that the performance bottleneck identified in June has been resolved. MediaWiki is now operating fully using Swift as the primary object store for thumbnails (the NFS filer is relegated to a secondary fail-over backup). The 'originals' (uploaded images and multimedia content) have been copied over to Swift as well, setting the stage to migrate away from the NFS filer next.

Wikimedia Labs
 * This month was focused on adding new hardware, upgrading the OpenStack infrastructure, and other stability efforts. virt6–8 have been added to the cluster and about 20 instances have been migrated to these nodes so far. Another 40 instances have been created on virt6–8 since the addition. Initial instance migration efforts left 30 instances corrupted due to a KVM block migration bug. A cold migration process, with an automated script, was created as a workaround. Development is ongoing to upgrade OpenStackManager to support the Essex release of OpenStack. Keystone support is complete and OpenStack API support is currently being added. Development work on OpenStack itself continues as well. Andrew Bogott's openstack-common plugin framework has been merged. novaclient work is progressing and should be merged after some cleanup. Ryan Lane and OpenStack developer Adam Young collaborated to push in some changes needed for OpenStack Keystone's LDAP backend to work with the Essex (stable) release. A blueprint has been submitted for using Keystone to manage LDAP entries via templates, so that we can move to Keystone as an LDAP manager in the future. Work has begun on using OpenStack Nova for managing DNS entries. GlusterFS project storage has been upgraded to version 3.3. A tutorial on Using Puppet with Labs was hosted at the pre-Wikimania Hackathon by Leslie Carr and Ryan Lane, and a presentation on Labs and the State of Our Open Source Infrastructure was given at Wikimania.

Data Dumps
 * The YAS3 library for uploading to archive.org and to other s3-compatible sites, along with several command line clients, is now usable (though still under heavy development). This library handles 100 Continue correctly; this means that for large file uploads, the upload is only attempted once the client has been redirected to the right host, a great time saver. The library also supports uploads of large files in multiple chunks automatically, rather than requiring the user to split the file into separate pieces. That's a necessity for us since many of our dump files are quite large.
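
The two behaviours described above can be illustrated with a minimal Python sketch. It is not the YAS3 API: the function names `read_parts` and `part_headers` and the 10-byte part size are hypothetical, chosen only for this illustration. The sketch shows how a large file can be read in fixed-size parts without splitting it by hand, and which request headers ask the server for an interim "100 Continue" response before the body is sent. No network request is made.

```python
import io
from typing import BinaryIO, Dict, Iterator, Tuple


def read_parts(fileobj: BinaryIO, part_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (part_number, data) pairs, reading at most part_size bytes each time.

    Part numbers start at 1. Hypothetical helper, not part of YAS3.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    part_number = 1
    while True:
        chunk = fileobj.read(part_size)
        if not chunk:  # end of file reached
            break
        yield part_number, chunk
        part_number += 1


def part_headers(length: int) -> Dict[str, str]:
    """Build headers for one part upload (hypothetical helper).

    "Expect: 100-continue" is a standard HTTP/1.1 header: the client sends
    the headers first and waits for an interim response before sending the
    body, so a redirect or error can arrive before any data is transferred.
    """
    return {
        "Content-Length": str(length),
        "Expect": "100-continue",
    }


if __name__ == "__main__":
    # A 25-byte in-memory "file" stands in for a large dump file.
    dump = io.BytesIO(b"x" * 25)
    for number, data in read_parts(dump, part_size=10):
        print(number, len(data), part_headers(len(data)))
```

In a real client each part would be sent as its own request and the parts assembled on the server side; how YAS3 does this is not described here, so that step is left out of the sketch.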

Offline
Kiwix


 * We finally released Kiwix 0.9 rc1 (see the CHANGELOG). All the binary files were compiled using our new continuous integration build platform. In collaboration with Wikimedia France (for the Afripedia project), we released a first version of kiwix-plug, a standalone WiFi hotspot using cheap plug computers. The Black&White project, contracted by Wikimedia CH, was completed; a recent achievement was the introduction of Kiwix in the official Debian package repository. Also in collaboration with Wikimedia CH, we started a new project called ZIM autobuild aiming to quickly and automatically generate ZIM files of our projects.

Wikidata

 * The Wikidata project is funded and executed by Wikimedia Deutschland.

The Wikidata team has made good progress towards their first roll-out. The initial deployment plans are being made and the Hungarian Wikipedia community stepped up to be the first to use the interwiki part of Wikidata in a few weeks. This also means the demo system needs to be tested more. If you have five spare minutes, have a look at the demo system and report any bugs you might find there so they can be fixed before the initial deployment.

The team also started to collect future use cases of Wikidata that should be kept in mind during development. You are invited to refine them or add your own. Additionally, the team is looking for feedback on the third iteration of the storyboard for linking Wikipedia articles in the future.

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.