Wikimedia Engineering/Report/2011/May

Major news this month includes:
 * the Berlin Hackathon, where about 70 developers and engineers met to improve our technical infrastructure;
 * the deployment of the Upload Wizard as default uploader on Wikimedia Commons;
 * the continued development, deployment and roll-out of the Article feedback tool on the English Wikipedia;
 * major progress in reducing our code review backlog.

Recent events

 * Berlin Hackathon 2011 (May 13-15, Berlin) — About 70 MediaWiki developers and engineers participated in this event, organized by Wikimedia Deutschland. A lot of coding, bug squashing and discussion happened over these three days, including on the new parser, performance improvement and infrastructure; see the dedicated blog posts for more information (Friday, Saturday, Sunday, Monday). A special effort was made on documentation and remote attendance: all the talks were recorded, photos were taken and notes were taken in real time on all three days (notes for Friday, Saturday and Sunday).


 * CiviCRM code sprint (May 24-25, San Francisco) — The Wikimedia Foundation office in San Francisco hosted a coding sprint for about 8 CiviCRM developers. Participants squashed many bugs and improved contact/contribution search performance by 15-25x. This is particularly useful for major users of CiviCRM with large databases, like the Wikimedia Foundation and its donor database. The Wikimedia Foundation, which endeavors to contribute to the free software ecosystem, has hosted meet-ups for the CiviCRM community in the past (read more about the event).

Upcoming events

 * Wikimania (August 2-7, Haifa, Israel) —


 * Check out the Software deployments page on the wikitech wiki for up-to-date information on the upcoming deployments to Wikimedia sites.

Job openings
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles. The list below covers positions that opened this month, positions that are still open, and positions we hope to post over the next few months:
 * Operations Engineer — Special projects
 * Software Developer Front-end — General
 * Software Developer Back-end — General
 * Engineering Program Manager — Data Analytics
 * Software Developer — Features
 * Systems Engineer — Data Analytics (previously Data Analytics Engineer)
 * Operations Engineer
 * Senior QA Engineer
 * Networking Contractor — Amsterdam
 * Software Developer, Rich Text Editing — Features
 * Product Manager — Features
 * Release Engineer
 * Technical Writer

Short news

 * Visitors —
 * Hires and changes
 * Andrew Shields is a new contractor working as a Technician in our Tampa data center.
 * Nimish Gautam started to transition from the Technology department to the Global Development department of the Wikimedia Foundation.
 * Katie Horn joined as Software Engineer, Community R&D.

Operations

 * Program manager: Mark Bergsma

Site operations
Virginia Data Center — Installation of a world-class primary data center for Wikimedia Foundation websites.
 * Status: Unfortunately, the delivery of our connectivity has incurred additional delays, which has prevented us from bringing services in the new data center live. The latest estimate for delivery is June 10th, after which we should be able to deploy some services running actively from the new location.

Media Storage — Improvement of our media storage architecture to accommodate expected increase in media uploads.
 * Status: The Swift cluster on the test servers was upgraded from version 1.1 to 1.3, to fix some problems we were observing, and to test with the latest released code as well. Russ Nelson also continued to develop a MediaWiki extension to integrate MediaWiki with Swift.

Testing environment
Virtualization test cluster — Environment to deploy temporary machines for testing and experimentation, for use by WMF staff and volunteers working on important projects (as capacity allows).
 * Status:

Backups and data archives
Data Dumps — Improvement of processes to create and provide public copies of public Wikimedia data.
 * Status: We incurred delays in moving to the new server, but started to puppetize the rollout of new servers. This will simplify not only the setup of new hosts, but also the maintenance of current servers. We are still trying to identify the cause of an issue with compression of files sometimes quitting partway through. We missed our target of producing new English Wikipedia dumps once a month (holding out for the new server), but data will be available in early June.

Other activities

 * Backups — This project was on hold in May, as we were still waiting for connectivity between our two data centers to be installed. We expect to have live replication or daily backups of all important data by the end of June.
 * Router upgrade — Due to multiple ongoing issues on several network devices, we decided to schedule an hour of maintenance on Tuesday May 24th, to upgrade software on nearly all devices, fixing multiple bugs that we were experiencing. This appears to have resolved the issues that we were seeing. A summary of the event is posted here.
 * HTTPS & IPv6 — Ryan Lane, Peter Youngmeister and Mark Bergsma have started working on HTTPS and IPv6 support for the wiki platforms. A new test cluster is being set up to serve these protocols, and limited testing on a subset of wikis will commence soon.
 * Tampa Data Center contractor - Rich Cole is no longer with us, and Andrew Shields has now taken his place.
 * Additional servers for m.wikimedia.org - To accommodate growth, two new servers have been racked and are ready for application installation.

Features Engineering

 * Program manager: Alolita Sharma

Editing tools
Visual editor 0.1 — Exploratory work to identify and prototype initial ideas for a visual editor for MediaWiki. WikiDom is the storage structure and functionality that takes the intermediate output from the parser and uses it for the visual editor.
 * Status: Trevor Parscal and Neil Kandalgaonkar have done exploratory work on the visual editor project. Neil worked with developers of HackPad (a custom version of real-time collaborative editing software Etherpad) on a proof of concept of integration between Etherpad and MediaWiki (read more). They're now working on turning it into a MediaWiki extension. Trevor continues work on WikiDom functionality. Work on the visual editor is also intersecting with the groundwork done on the new parser.

Content Quality and Editorial Tools
Article Feedback — A feature to collaboratively assess article quality and incorporate reader ratings on Wikipedia.
 * Status: Version 3 was deployed to the English Wikipedia on May 9 with new features, like the article feedback dashboard, a summary page showing general rating trends. The experiment was expanded to 100,000 articles and may be expanded further after analysis of the results. The next version of the Article Feedback feature is currently in development.

Discussions and Interactions
Liquid Threads — A feature that brings threaded discussion capabilities to Wikimedia projects and MediaWiki.
 * Status: Lead developer Andrew Garrett continued to work on his new object model for the back-end. Timo Tijhof will be working with Andrew on the new front-end.

WikiLove 1.0 — An extension to encourage praise and virtual gifts between users.
 * Status: Ryan Kaldari and Jan Paul Posma completed the first version of the WikiLove extension, including documentation for the API. They also added configuration parameters to the code so that new gifts can be added. The code is now pending review; once it is reviewed and feature-complete, it is expected to be deployed this month.

MoodBar 0.1 — A feature to encourage new users to provide feedback.
 * Status: Brandon Harris published initial designs for this feature, which allows new users to quickly provide feedback to the community. Erik Möller reviewed the designs with the team, and Brandon is working on a second round of designs. Andrew Garrett (Werdna) will be the lead developer.

Multimedia Tools
Upload wizard — A feature that provides an easier way of uploading files to Wikimedia Commons, the media library associated with Wikipedia.
 * Status: The Upload wizard was enabled as the default upload system on Wikimedia Commons on May 9. It was disabled shortly after that because of issues believed to come from a bug in ResourceLoader. It was re-enabled on May 17 after further investigation. The next phase is being planned, which includes changing how images are stored prior to completion of the wizard.

Other projects
 * ResourceLoader 2.0 — ResourceLoader 1.0 is now in maintenance mode. Roan Kattouw and Timo Tijhof discussed requirements during the Berlin Hackathon, and an initial specification for ResourceLoader 2.0 is available. There are currently no engineering resources available to work on ResourceLoader 2.0; a development sprint is planned for July.
 * Non-Roman character set localization — Alolita Sharma and Erik Möller are currently gathering requirements on this project with the help of possible customers, including the language committee.
 * Community feature prototyping — Katie Horn joined on May 31 to support community prototyping efforts.
 * German Wikipedia editor survey support — Wikimedia Deutschland will be running its own editor survey in mid-June, to assess community health. The survey will run on the English and German Wikipedia. Wikimedia Deutschland is handling the development work needed to integrate CentralNotice with the user profile information, and has completed initial development. The Features team of the Wikimedia tech department is helping with code review and deployment: Roan Kattouw is now reviewing the code, which should be live on June 15.
 * Mobile survey support — The Global Development department of the Wikimedia Foundation is planning to run a survey about mobile usage on the English Wikipedia in early June. Ryan Kaldari and Arthur Richards provided engineering support for the survey; Nimish Gautam took over the project as he transitioned to the Global Development department.

Wikimedia Labs
Media projects — A set of features to improve media handling and key infrastructure support tools, many developed with Kaltura, such as Metavid, MwEmbed, and the Video Editor.
 * Status: Michael Dale's code has been reviewed by Brion Vibber. Michael is currently working on code fixes for TimedMediaHandler (TMH), which will then be tested and brought to the point of deployment.
 * Program manager: Alolita Sharma

Special projects

 * Program manager: Tomasz Finc

Mobile
Mobile projects — All things Mobile and Wikimedia.
 * Status: We've continued development and research to plan our mobile efforts. As always, the majority of our thoughts can be found on the Mobile project page.

Mobile Research — A research project to help determine our Mobile strategy.
 * Status: Our India fieldwork in Bangalore and Delhi continued in May, consisting of about 30 interviews, led by Parul Vora and Mani Pande. A follow-up workshop with interview participants, as well as community members, took place in Bangalore on May 15. We continued to recruit and prepare for the parallel study in Brazil, consisting of about 20 interviews, that will be conducted in June. We also sent out an RfP for the third mobile research study in the United States. The mobile survey launch was delayed due to reallocation of resources, but is planned to go live in mid-July.

Mobile site rewrite — Port of our Ruby-based mobile gateway to PHP.
 * Status: We continued development of the extension and its integrated functionality: the new gateway will support both approaches, whether users browse through WAP or on a smartphone. We also expanded the device detection list to better support our users; next month, we plan to load the WURFL catalog. We demoed the mobile extension at the Berlin Hackathon and answered concerns about implementing it as a skin vs. an extension (see the follow-up discussion on wikitech-l). We've started to prepare an upcoming QA cycle, for which volunteer help will be very welcome. See also http://nomad.tesla.usability.wikimedia.org/index.php/Quark

Fundraising support
2011 Fundraiser — Support and development for the annual fundraiser of the Foundation.
 * Status: Arthur Richards continued work to streamline our audit framework to surface missing transactions. This has helped us find even more real transactions that were missing from our fundraising database. He also helped run a code sprint that gave the WMF instance of CiviCRM a huge performance increase. At the same time, the fundraising team started to develop the next user stories for development. For those curious, their temporary home can be found here. During the following month, we'll be doing a week of intensive tests followed by a sprint planning meeting.

Offline
Wikipedia version tools — Support and development of a series of tools to select Wikipedia content for offline use.
 * Status: GSoC student Yuvi Panda began to port the WP 1.0 Bot to a MediaWiki extension. He drafted a project plan and started to develop a way to parse and track assessment data found in articles.

OpenZim for Collections — Integration of openZim into the Collections extension.
 * Status: All the existing critical bugs were fixed. We are now talking with PediaPress and numerous others about where to take the project next.

Kiwix — Improvement of the user experience of the Kiwix app to access offline Wikimedia content.
 * Status: We're wrapping up phase 2 of development, allowing us to release our first version of the integrated downloader. We're looking for testers: if you're available the week of 6/6, please add your name to our tester list.
 * Kiwix usability study in Berlin: The usability study was a great success. Kaldari ...Emmanuel and Sumana ran a usability study with 7 participants, 6 of whom were recorded; the videos will be posted to Commons later. Even with only 7 sessions, it became clear how easy it is to get a lot of valuable input from a few short sessions. A summary of the findings is available at http://www.kiwix.org/index.php/Usability_Testing_Script#Results and the results will be incorporated into our next development sprint.

General Engineering

 * Program manager: Rob Lanphier

MediaWiki development and tools
MediaWiki 1.17 release — The upcoming MediaWiki release.
 * Status: Almost all the blockers have been dealt with.

Code review management — Review of changes made to the MediaWiki code.
 * Status: Until a few weeks ago, the number of unreviewed commits was increasing at the same rate as before the 1.17 code review sprint. The group of reviewers (Brion Vibber, Tim Starling, Chad Horohoe, Trevor Parscal, Roan Kattouw and Sam Reed) was expanded to include Timo Tijhof, Bryan Tong Minh, Alexandre Emsenhuber, Aryeh Gregor, Neil Kandalgaonkar and Andrew Garrett. Thanks to a conscious effort made by code reviewers, the trend was reversed, and the backlog of unreviewed commits slated for inclusion in the upcoming 1.18 release is now decreasing.

Bugmeistering — Management of our bug tracker.
 * Status: Mark Hershberger continued his efforts to watch, assign and resolve bugs, notably by leading the bug squashing sessions at the Berlin Hackathon. He also worked with Priyanka Dhanda to get meaningful reports and metrics out of Bugzilla.

Summer of Code 2011 — A sponsored community program allowing students to join the community as developers.
 * Status: In May, our eight Summer of Code students learned how MediaWiki works, as software and as a community. They checked out IRC and the mailing lists, talked with their mentors, asked for commit access, and started investigating the components that would include their project areas. On May 23rd, they started working on their projects full-time. WMF staffers and community members are mentoring the students, assisted by Sumana Harihareswara.

Parser & gadgets — Groundwork for the next generation visual editor of MediaWiki.
 * Status: There was activity on this front at the Berlin Hackathon, and the wikitext-l mailing list was revived.
 * Parser plan, Abstract syntax tree, Parser test cases, Gadget Studio

App-level monitoring — Implementation of application-level parameter monitors using the API.
 * Status: Sam Reed started to implement a job queue monitor, as the first of a set of application-level monitoring tools using the API.
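
As a rough illustration of what an API-based job queue monitor could look like (this is not Sam Reed's implementation), the sketch below reads the `jobs` counter that the MediaWiki API reports under `action=query&meta=siteinfo&siprop=statistics` and compares it against a threshold. The function names, the threshold values, the user agent string and the OK/WARNING/CRITICAL levels are all assumptions made for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical thresholds; sensible values depend on the wiki being monitored.
WARN_JOBS = 10_000
CRIT_JOBS = 100_000

# Hypothetical client identifier sent with each request.
USER_AGENT = "job-queue-monitor-sketch/0.1"


def build_stats_url(api_url):
    """Return the API URL that asks for site statistics as JSON."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "statistics",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def job_queue_length(response):
    """Extract the job queue length from a parsed API response."""
    return int(response["query"]["statistics"]["jobs"])


def classify(jobs, warn=WARN_JOBS, crit=CRIT_JOBS):
    """Map a queue length to a monitoring status string."""
    if jobs >= crit:
        return "CRITICAL"
    if jobs >= warn:
        return "WARNING"
    return "OK"


def check_job_queue(api_url):
    """Fetch the statistics over HTTP and return (status, jobs)."""
    request = Request(build_stats_url(api_url), headers={"User-Agent": USER_AGENT})
    with urlopen(request) as reply:
        response = json.load(reply)
    jobs = job_queue_length(response)
    return classify(jobs), jobs
```

Calling `check_job_queue("https://en.wikipedia.org/w/api.php")` would return a pair such as a status string and the current queue length. Keeping the parsing and classification separate from the HTTP call makes the logic easy to test without network access.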

Wikimedia analytics
udp2log — A custom data analytics logging system.
 * Status: Our recent router upgrade made it possible to enable multicast logging.

A/B testing — A set of tools to perform A/B testing on Wikimedia sites.
 * Status: Nimish Gautam completed this project, which has now transitioned to maintenance and bug-fixing mode.

Wikimedia Report Card 2.0 — Usability improvements and streamlining of the creation of the monthly report card.
 * Status: Erik Zachte, Nimish Gautam, Erik Möller, Mani Pande and Asher Feldman laid down the requirements and groundwork of the next version of the Report Card. Erik Zachte's scripts will be modified to enter the data into a database, which can then be accessed through a dedicated API to automatically generate the report card and other charts using a visualization framework. The API will also be publicly available for third parties to access the data.

Technical communications
Engineering project documentation — An activity to ensure that project documentation of Wikimedia engineering projects is complete and up-to-date.
 * Status: The Berlin Hackathon and Wikimedia tech days were an opportunity to start catching up on missing project pages. Stubs were created using the new, lighter format, and some existing pages were transitioned to the new format.
 * Comms support

Other activities

 * Disk-backed object cache (DBOC) — The deployment of a disk-backed object cache to increase the parser cache hit ratio was on hold in May, in favor of the 1.17 release work. It will be resumed in June.
 * API maintenance — Sam Reed continued to fix bugs and to do general maintenance on the MediaWiki API.
 * Shell bugs — The backlog of shell bugs was on the agenda for the Berlin hackathon, and the team continued to hold dedicated triage sessions. Priyanka Dhanda also started to work on a process to streamline configuration changes, especially for popular requests.
 * Access to Subversion — About 13 new developers were granted commit access in May, including 6 Summer of Code students and 2 Wikimedia Foundation employees. Volunteer development coordinator Sumana Harihareswara joined the review team, and will become the primary point of contact for commit access requests.
 * Heterogeneous deployment — Priyanka Dhanda prepared a project plan and did the initial preparatory work, including setting up a prototype.
 * HipHop support — HipHop was discussed during the Berlin hackathon, and it was agreed that HipHop support would be part of the MediaWiki 1.20 release.
 * Configuration management — Chad Horohoe committed his initial work on a configuration management system. The goal is to move from the current system (where configuration mostly happens through globals in LocalSettings.php) to one where all the configuration parameters are contained in a configuration object.
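
To illustrate the direction described in the configuration management item (this is not Chad Horohoe's actual code, and it is written in Python rather than PHP purely for illustration), the sketch below contrasts settings kept in globals with a single configuration object. The `Config` class, its methods and the setting names are hypothetical.

```python
# Old style: settings live in module-level globals, similar in spirit to
# the $wg... variables set in LocalSettings.php. Names are illustrative.
wgSitename = "Example Wiki"
wgEnableUploads = False


class Config:
    """Hypothetical configuration object holding all settings in one place."""

    def __init__(self, defaults=None):
        self._settings = dict(defaults or {})

    def get(self, name):
        """Return a setting, failing loudly if it was never defined."""
        try:
            return self._settings[name]
        except KeyError:
            raise KeyError(f"Unknown configuration setting: {name}") from None

    def set(self, name, value):
        """Define or override a setting."""
        self._settings[name] = value

    def has(self, name):
        """Tell whether a setting is defined."""
        return name in self._settings

    @classmethod
    def from_globals(cls, namespace, prefix="wg"):
        """Build a Config from legacy globals, stripping the prefix."""
        settings = {
            key[len(prefix):]: value
            for key, value in namespace.items()
            if key.startswith(prefix)
        }
        return cls(settings)


def upload_allowed(config):
    """Code receives the config object instead of reading globals."""
    return bool(config.get("EnableUploads"))
```

A caller could migrate gradually with `config = Config.from_globals(globals())` and then pass `config` to functions such as `upload_allowed`, which makes dependencies on settings explicit and easier to test.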