Wikimedia Engineering/Report/2012/April

 Engineering metrics in April:
 * 53 unique committers contributed code to MediaWiki.
 * The total number of unreviewed commits went from about 100 to 138.
 * About 34 shell requests were processed.
 * 63 developers got developer access to Git and Wikimedia Labs, among which are volunteers.
 * Wikimedia Labs now hosts projects,  instances and  users.

Major news in April include:


Recent events

 * OpenStack Design Summit and Conference —

Upcoming events

 * Berlin hackathon (1–3 June 2012, Berlin, Germany) —


 * Wikimania hackathon (10–11 July 2012, Washington, D.C., USA) —

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.



New hires

 * Matthias Mullie joined the Features team (announcement).
 * Faidon Liambotis joined the Operations team to work on Wikimedia Labs (announcement).
 * Chris Steipp joined the Platform team as Senior Security Engineer (announcement).
 * Tauhida Parveen joined the Platform team to work on QA and testing (announcement).

Site infrastructure

 * US data centers —
 * Added servers to bits.wikimedia.org at eqiad for capacity growth and redundancy.
 * Redeployed Swift with added capacity after addressing some initial teething problems. All thumbnails are now served from Swift.
 * After months of preparation and refactoring work on our dated Lucene search implementation in Tampa, we are glad to report that Peter (with help from Asher, rainmain_sr and Jeff) successfully built and deployed the new search infrastructure at our eqiad data center. The performance improvement is substantial: at the 99th percentile, search latency dropped from a high of 9 seconds to 1 second, and the average search now takes only 100 ms, down from 700 ms. In addition, the new infrastructure addresses some of the previous single points of failure and capacity limitations.
 * Asher completed the database schema migration/upgrade to support SHA-1 hashes in the coming MediaWiki release.
 * Varnish is now used in eqiad to serve all upload (images and media) traffic, except for traffic from Europe, which has its own servers. Mark implemented Varnish to replace our Squid instances running in Tampa. In addition to enabling consistent hashing, Mark ran half of the eqiad Varnish instances with the experimental persistent storage backend. Unfortunately, after a few days he found showstopper bugs and reverted to the stable version.
 * A secondary network transit link has been added to our eqiad network. It provides redundancy and extra capacity, and comes with IPv6 enabled.
 * Deployed a new udp2log server in eqiad, providing extra capacity to collect new data for the Analytics folks.


 * Amsterdam data center —


 * Media Storage — April saw two areas of progress: the MediaWiki code that allows original media to be stored in Swift was deployed to production (though it is not yet in use), and the investigation into old corrupted objects continued, yielding new evidence and some cleanup. During May we hope to begin migrating data from the older storage system into Swift and to deploy improved monitoring and metrics.

Testing environment

 * Wikimedia Labs —

Backups and data archives

 * Data Dumps — The gluster share with the last 5 or so good dumps for all projects is ready for use by Labs projects. A first copy of uploaded media, accessible via rsync, was announced, and some work was done on the infrastructure to generate downloadable bundles of media per project. We're working with the Internet Archive to produce media bundles that they can host for download as well. A new version of the dump scripts was deployed with some minor bug fixes. Christian Aisteitner wrapped up work on the PHPUnit tests for the dump maintenance scripts and discovered a problem with the database schema, which we will need to discuss with the user community to find a resolution that works for everyone.

Other news

 * There was a short incident at our Amsterdam site on 26 April 2012 at around 06:00 UTC. It lasted for half an hour and affected some of our European users. An unusual surge in traffic overwhelmed some resources; the problem was quickly addressed once we found the cause.

Offline Projects

 * Kiwix UX initiative —

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.