Wikimedia Engineering/Report/2012/September

Engineering metrics in September:
 * unique committers contributed patchsets of code to MediaWiki.
 * The total number of unreviewed commits went from about 360 to
 * About shell requests were processed.
 * About developers got access to Git and Wikimedia Labs.
 * Wikimedia Labs now hosts projects,  instances and  users; to date  instances have been created.

Major news in September includes:
 * https://blog.wikimedia.org/2012/09/06/language-teams-plan-translation-memory-uls/
 * https://blog.wikimedia.org/2012/09/07/recovery-of-broken-gerrit-repositories/
 * https://blog.wikimedia.org/2012/09/11/using-the-wiki-loves-monuments-app-as-a-travel-log/
 * https://blog.wikimedia.org/2012/09/17/new-e-book-export-feature-enabled-on-wikipedia/
 * https://blog.wikimedia.org/2012/09/18/server-decommissioning-donations-sept2012/
 * https://blog.wikimedia.org/2012/09/25/page-curation-launch/
 * GSoC 2012

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.

Technical Operations
Site infrastructure
 * The focus for September was getting EQIAD ready to take over as the primary data center, if possible in October. The outstanding items are setting up:
 * Varnish with persistent cache (to replace the current Squid implementation). Mark has successfully deployed it on 8 servers at EQIAD and has routed traffic through them for the last three weeks. He will add another 8 servers and fully deploy it in the coming week or two.
 * Redis as a replacement for the current Memcached implementation. Asher has built and puppetized it, and the Tampa servers have been set up. Asher will deploy it in the coming week or two and test it in parallel with the current Memcached setup, to mitigate any risk associated with the Redis implementation. Once the team is satisfied with it, Asher will replicate the Redis datastore to EQIAD. This is critical to the EQIAD migration because we would then have 'warm' caches at both data centers.
 * Apache servers to run MediaWiki and image scalers. Peter has expanded his deployment at Tampa. Unfortunately, that surfaced a blocker bug (bugzilla 40462). Tim identified the issue, and Faidon is now working on the fix, which Peter will deploy in the coming weeks. In the meantime, Peter has deployed several application servers at EQIAD for Asher and Aaron to use for testing.
 * Swift to be replicated across the data centers. While implementing Swift replication, Faidon encountered several bugs. He worked around and fixed them, but replication remained very slow: at that pace, it would take 6 to 12 months to complete. The team is currently brainstorming a suitable solution. The OpenStack Swift developers have acknowledged an inherent weakness in the current implementation and plan to rewrite the replication feature, but that is months away.
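The report does not say how the parallel Redis/Memcached test mentioned above is wired up. One common pattern for this kind of low-risk trial is a "shadow" wrapper: every write goes to both stores, reads are served only from the existing one, and disagreements or failures in the candidate are merely counted. The sketch below illustrates that pattern under stated assumptions; `DictCache` and `ShadowedCache` are hypothetical names, and the in-memory dictionaries stand in for real Memcached and Redis clients.

```python
class DictCache:
    """In-memory stand-in for a cache client (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class ShadowedCache:
    """Serve reads from the primary cache while exercising a candidate in parallel."""

    def __init__(self, primary, candidate):
        self.primary = primary        # e.g. the existing Memcached client
        self.candidate = candidate    # e.g. the new Redis client under test
        self.mismatches = 0           # reads where the two stores disagreed
        self.candidate_errors = 0     # candidate failures, hidden from callers

    def set(self, key, value):
        self.primary.set(key, value)
        try:
            self.candidate.set(key, value)
        except Exception:
            # A failing candidate must never break production writes.
            self.candidate_errors += 1

    def get(self, key):
        value = self.primary.get(key)
        try:
            if self.candidate.get(key) != value:
                self.mismatches += 1
        except Exception:
            self.candidate_errors += 1
        # Callers only ever see the primary's answer.
        return value
```

Because callers only see the primary's answers, the candidate can be dropped at any time without user-visible impact, and the two counters give a rough signal of whether it is ready to take over.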


 * Asher has reconfigured db1047 for data analysis users. That server hosts both the enwiki replica and custom user databases. The new db1047 runs MariaDB 5.5 and has an additional database called "staging" that users can write to, with 5TB of free space. This is our first use of MariaDB.


 * Jeff has been building the new Fundraising infrastructure at EQIAD for some time now. He is glad to report that the infrastructure has successfully processed live fundraising traffic. You can see the infrastructure on Ganglia.

Wikimedia Labs
 * Several key enhancements were implemented:
 * Code for updating instances' on-wiki status pages was moved from OpenStackManager to OpenStack Nova, making the updates much more reliable.
 * Salt was installed on all Labs instances, with virt0 as the master. This allows us to easily and quickly run remote execution tasks on all instances in all projects. There are plans in the works to extend Salt's capabilities to make it multi-tenant, so that we can allow remote execution rights for instances within projects.
 * A new deployment system is being written and tested in the demo project. demo-deployment1 is the deployment system; demo-web1 and demo-web2 are the app servers. demo-deployment1 can call the deployment runner on virt0 via peer permissions.
 * Work was completed to allow open registration for Labs. Specifically, shell access was split apart as a right. Shell access must be requested separately from creating an account.
 * Two-factor authentication was modified so that certain groups are required to use it when logging in if they wish to use Nova features. Any user who can modify user rights is currently required to use two-factor authentication, as they can add themselves to any project and role.
 * A new compute node was added to the pmtpa cluster. The rest of the Cisco nodes will soon be added as well.
 * Work began on replacing the home directory NFS share with Gluster shares.

Data Dumps
 * Although most of this month went to beating on Swift hardware, we found some time to find and squash the pesky bug in the bz2 multistream index generation. There's now a toy offline reader that uses the bz2 multistream XML file, a sorted index file and a Python script to grab and display the text of the English Wikipedia article of your choice on demand, without reading through the entire file.
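To illustrate why the index avoids reading the whole file, here is a minimal sketch of such a lookup. It is not the toy reader itself. It assumes each index line has the form `offset:page_id:title`, where `offset` is the byte position of the bz2 stream containing the page, and it uses a plain linear scan where a real reader would exploit the sorted index. The function names are hypothetical.

```python
import bz2


def find_stream_offset(index_lines, title):
    """Return the byte offset of the bz2 stream that holds `title`, or None."""
    for line in index_lines:
        # Assumed line format: "offset:page_id:title". Titles may contain
        # colons themselves, so split at most twice.
        offset, _page_id, page_title = line.rstrip("\n").split(":", 2)
        if page_title == title:
            return int(offset)
    return None


def read_stream(dump, offset):
    """Decompress exactly one bz2 stream of a multistream file."""
    dump.seek(offset)
    decompressor = bz2.BZ2Decompressor()
    chunks = []
    # Stop at the end of this stream; later streams are never decompressed.
    while not decompressor.eof:
        block = dump.read(64 * 1024)
        if not block:
            break
        chunks.append(decompressor.decompress(block))
    return b"".join(chunks).decode("utf-8")


def extract_page(xml_fragment, title):
    """Return the <page>...</page> element for `title`, or None.

    Simplification: plain string matching rather than XML parsing, so the
    title must appear in the dump exactly as given (no escaped characters).
    """
    marker = "<title>" + title + "</title>"
    start = xml_fragment.find(marker)
    if start == -1:
        return None
    page_start = xml_fragment.rfind("<page>", 0, start)
    page_end = xml_fragment.find("</page>", start)
    if page_start == -1 or page_end == -1:
        return None
    return xml_fragment[page_start:page_end + len("</page>")]
```

With the dump opened in binary mode, a lookup is `find_stream_offset` followed by `read_stream` and `extract_page`. Only the one stream containing the requested article is decompressed.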

Others
 * Wikimedia sites experienced 3 episodes of intermittent performance lags and brief unavailability on 16th and 17th September 2012. The first two incidents occurred on the 16th, at 13:22 UTC and 15:30 UTC. The third incident was on the 17th at 10:40 UTC. You can find more information here.

Mobile
We are preparing for another work sprint on the mobile interface! Some beta features will be graduated to the standard mobile view, such as the new navigation menu.

Preliminary support for sharper images on high-density displays (such as the iPhone 4/4S/5 and many Android phones) is in progress; this will also apply to the desktop view on suitable tablets (iPad 3, Nexus 7, Kindle HD) and laptops (Retina MacBook Pro, Windows laptops with desktop zoom at 150% or 200%).
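The arithmetic behind sharper images can be sketched as follows. This is only an illustration, not the actual implementation: it assumes thumbnails are offered at 1x, 1.5x and 2x the layout size (an assumption suggested by the 150% and 200% zoom levels above), and the function names are hypothetical.

```python
import math

# Assumed set of rendition scales; the real set may differ.
SCALES = (1.0, 1.5, 2.0)


def pick_scale(device_pixel_ratio, scales=SCALES):
    """Pick the smallest available scale that covers the display's density."""
    for scale in sorted(scales):
        if scale >= device_pixel_ratio:
            return scale
    # Denser than anything on offer: fall back to the largest rendition.
    return max(scales)


def thumbnail_width(css_width, device_pixel_ratio):
    """Pixel width of the image to request for a given layout width."""
    return math.ceil(css_width * pick_scale(device_pixel_ratio))
```

For example, an image laid out at 220 pixels wide would be requested at 440 pixels on a display with a device pixel ratio of 2, and at 330 pixels on a laptop zoomed to 150%.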

Offline
Kiwix

Nothing this month. Work on current projects continues, and planning for 2013 is almost over.

Wikidata

 * The Wikidata project is funded and executed by Wikimedia Deutschland.

The Wikidata team is working on the last parts of a first deployment, and the code is currently being reviewed by WMF engineers. Anja Jentzsch has joined the team and focuses on quality and the deployment of Wikidata. On the coding side, a lot has been done, including work on edit conflicts and permissions and a rework of the special page for creating new items. In addition, work on phase 2 of Wikidata (infoboxes) has started. This includes, for example, the ValueHandler extension, which will be used for our data values. The team has also met with a group of database experts from different projects to get their input on phases 2 and 3.

In addition, we started a page for bot discussions and coordination, published test coverage data, updated the demo system, attended a lot of events, including WikiCon and Software Freedom Day, and held another round of office hours. Oh, and if you are interested in contributing to Wikidata, there is now a new contribute page for you.

You can find more comprehensive weekly updates here, and we now also have Facebook and Google+ pages!

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.