Wikimedia Engineering/Report/2012/December

Engineering metrics in December:
 * 113 unique committers contributed patchsets of code to MediaWiki.
 * The total number of unresolved commits went from about 535 to about 648.
 * About 39 shell requests were processed.
 * As of December 2012, users can self-register on Wikimedia Labs (and get access to git/Gerrit). It is no longer necessary to request an account for developer access.
 * Wikimedia Labs now hosts 148 projects and 847 users; to date, 1378 instances have been created.

Major news in December includes:
 * https://blog.wikimedia.org/2012/12/07/inventing-as-we-go-building-a-visual-editor-for-mediawiki/ https://blog.wikimedia.org/2012/12/12/try-out-the-alpha-version-of-the-visualeditor/
 * https://blog.wikimedia.org/2012/12/20/article-feedback-new-research-and-next-steps/
 * https://blog.wikimedia.org/2012/12/10/introducing-mediawiki-community-metrics/
 * https://blog.wikimedia.org/2012/12/11/welcome-to-floss-outreach-program-for-women-interns/
 * https://blog.wikimedia.org/2012/12/12/translation-interface-makeover-in-progress/

Note: Like last month, we're proposing a shorter and simpler version of this report that does not assume specialized technical knowledge.

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.



Announcements

 * Matthew Flaschen joined the Wikimedia Features engineering team as Features Engineer (announcement).
 * Mike Wang joined the Operations team as part-time Labs Ops Engineer (consultant) (announcement).

Technical Operations
Production Site Switchover/Failover


 * TechOps continues to work on completing the outstanding migration tasks and readying our Ashburn infrastructure for the big switchover day, i.e., the failover from the Tampa datacenter to the Ashburn datacenter the week of January 22, 2013.
 * Today, most of our traffic (about 90%) is served out of the Ashburn datacenter, and we have successfully failed over that traffic between the two main datacenters over the last few months. However, the application, memcached and database systems still run only in the Tampa datacenter. We have been upgrading the technologies and setting up those systems at Ashburn. We plan to switch those remaining systems over from Tampa to Eqiad (Ashburn) in the coming weeks. This will give us some assurance of a hot standby datacenter should we encounter an irrecoverable and lengthy outage in one of the main datacenters. For those interested in the planning and status, the Countdown meeting minutes are documented at http://etherpad.wmflabs.org/pad/p/EqiadMigration.

Site Infrastructure
 * December is fundraising month, so TechOps dials down site infrastructure changes to reduce the risk of introducing outages. The lower-risk work performed this month included deploying the new Parsoid cluster to support the VisualEditor project; rolling out doc.wikimedia.org, our auto-generated puppet documentation; using a new, unified SSL certificate for *.wikipedia.org and *.m.wikipedia.org; and setting up the new Ashburn monitoring server and service, icinga.wikimedia.org.
 * Asher migrated one of the main production English Wikipedia slaves, db59, to MariaDB 5.5.28. He had previously been testing 5.5.27 on the primary research slave, and the current build on a slave in the Ashburn datacenter. Measuring the times of all queries over regular sample windows, the average query time across all enwiki slave queries is about 8% faster with MariaDB than with our production build of 5.1-fb. Some query types are 10-15% faster, some are 3% slower, and nothing looks aberrant beyond those bounds. Overall throughput, as measured by qps, has generally improved by 2-10%. Asher would not draw any conclusions from this data yet, since more is needed to filter out noise, but the results are positive. The main goal of migrating to MariaDB is not performance. Rather, Asher believes it is in the interest of WMF and the open source community to coalesce around the MariaDB Foundation as the best route to ensuring a truly open and well-supported future for MySQL-derived database technology.
 * Both Mark and Faidon have made tremendous progress in testing and deploying Ceph at our Ashburn site. We are hopeful that it will prove robust and scalable.
 * Ryan has been working on writing a new deployment system using git and Saltstack. Parsoid is currently being deployed with this system and MediaWiki is slated to use this system for its next major deployment.

Fundraising
 * There were no major changes to the fundraising infrastructure because of the fundraiser itself. We ordered and received bastion hosts, which we're in the process of deploying. Monitoring got an overhaul, and we're now sending alerts to fundraising-tech and/or techops depending on the metric.

Data Dumps
 * A tool for dump users to set up interwiki links on their local mirrors is now in alpha; details of the tool and docs on the innards of the interwiki cdb file are here.
 * Work with WanSecurity on mirroring is moving forward again; they now hold a current copy of all 'other' files, including page views and Picture of the Year bundles, among other things. More to come soon.

Wikimedia Labs
 * Labs came out of beta this month, following the opening of self-registration. Another major change this month was the migration from the shared nfs instance to per-project glusterfs volumes. A number of smaller changes were made, including:
 * Addition of puppet documentation links from classes and variables on the instance configuration pages
 * Modification of the project filter to act as a table of contents
 * Split ldap project groups into projects and posix groups - fixed a bug with group search
 * Saltstack was installed on all instances to act as a guest agent

Language engineering
Highlights of the Language Engineering team’s projects this month include:

1. Translate extension improvements - Development of the new user interface for Translate, as well as the translation editor functionality, continued at full pace throughout December with iterative feature development and user experience improvements. Santhosh Thottingal and Niklas Laxstrom are leading development, while Pau Giner is focused on optimizing user experience elements.

2. MediaWiki Language Extension Bundle: The latest version of MLEB was released by the team.

3. Universal Language Selector: Support for language variants was increased, and alternate language codes were added to ULS.

4. L10n/i18n language tools collaboration: Alolita Sharma continued to work with Red Hat’s L10n and i18n teams to evaluate localization data, translation tools as well as i18n tools and technologies.

5. Milkshake: Added more language input methods contributed by language communities to the jquery.ime library.

6. Community outreach: Pau Giner and Amir Aharoni participated in the Open Tech Chat this month to talk about best practices in multilingual user testing and internationalization. Amir Aharoni also participated in mentoring OPW’s candidate Priyanka Nag for the new LevelUp program.

7. Blog posts by the team this month:

a. Translation editor growing snazzier - http://blog.wikimedia.org/2012/12/31/translation-editor-growing-snazzier/

b. Translation interface makeover in progress - http://blog.wikimedia.org/2012/12/12/translation-interface-makeover-in-progress/

8. Srikanth Lakshmanan and Arun Ganesh’s tenure with the Language Engineering team ended in December.

Kiwix
The Kiwix project is funded and executed by Wikimedia CH.

Kiwix 0.9rc2 has been released. This version embeds our ZIM HTTP server, kiwix-serve, for Windows, OSX and Linux. Better still, this software is now integrated in the Kiwix UI, allowing anyone to share Wikipedia on a LAN in two mouse clicks. We have also revamped our audience measurement tool, a solution that could be interesting for other projects using Mirrorbrain. At the same time, we continue to increase our ZIM production throughput, with 8 new Wikipedia ZIM files in December. December was also a month of new records for Kiwix: for the first time, we had more than 70,000 downloads in a month and reached the lead position for Education software on Sourceforge.

Wikidata
The Wikidata project is funded and executed by Wikimedia Deutschland.

New code and bugfixes have been deployed (detailed changes here and here), and test2.wikipedia.org now gets language links from Wikidata. Changes on Wikidata that concern articles on test2 are shown in the recent changes of test2 as well. If there are no problems, deployment on the Hungarian Wikipedia will take place on January 14th. Other Wikipedias are going to follow later.

For the second phase of Wikidata, the representation of values is the central focus. We published a draft for this and discussions have started. We'd appreciate your feedback.

Additionally, Denny and Lydia held office hours on IRC again (logs in English and German).

More detailed summaries about what is happening around Wikidata are available here.

Future

 * The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.