Wikimedia Engineering/Report/2011/June

Major news this month includes:
 * the network setup in our new data center, which opened the way to new server setup and backups;
 * progress on features to encourage and facilitate participation, like the visual editor groundwork and the WikiLove button;
 * productive community testing on our new mobile front-end and the Kiwix download manager;
 * the release of MediaWiki 1.17.0, the latest stable version;
 * the first commits by our Summer of Code students;
 * major progress on our code review backlog.

Recent events

 * 2011 Wikimedia fundraising summit (June 17-19, Vienna, Austria) —
 * Open Source Bridge conference (June 21-25, Portland, Oregon, USA) — Sumana Harihareswara presented new features in MediaWiki 1.17 and gathered offers of volunteer help (especially around database support, testing, bug triage, and right-to-left support). She also gave a talk about technology management, and recruited candidates for the Wikimedia Foundation's current job openings (read more).

Upcoming events

 * OSCON (July 25-29, Portland, Oregon, USA) — A delegation of about a dozen Wikimedia engineers will be attending the Open Source Convention in July. Many of them will give talks (see full schedule).
 * Wikimania (August 2-7, Haifa, Israel) — Another delegation of about a dozen Wikimedia engineers will be attending the Wikimania conference in August. Besides the Developer Days and the OpenZIM Developers Meeting, they will also give several talks; the full schedule is now available.
 * Check out the Software deployments page on the wikitech wiki for up-to-date information on the upcoming deployments to Wikimedia sites.

Job openings
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles. The list below combines the positions that opened this month, current RfPs, positions that are still open, and positions we hope to post over the next few months:
 * Product Manager — Features
 * Product Manager — Analytics
 * QA Lead
 * Operations Engineer — Networking
 * Director of Features Engineering
 * Internationalization and Localization Outreach
 * Internationalization and Localization Feature Development
 * Software Developer — Features
 * Systems Engineer — Data Analytics (previously Data Analytics Engineer)
 * Operations Engineer
 * Senior QA Engineer
 * Networking Contractor — Amsterdam
 * Software Developer, Rich Text Editing — Features
 * Product Manager — Features
 * Software Developer Front-end
 * Software Developer Back-end
 * Release Engineer
 * Technical Writer

Short news

 * CTO leaving WMF — Danese Cooper announced in early June that she would be leaving as Chief Technology Officer of the Wikimedia Foundation at the end of July.
 * Promotions and role changes — Erik Möller announced a reorganization of the engineering management team, promoting Rob Lanphier to Director of Platform Engineering, Tomasz Finc to Director of Mobile and Special Projects, Tim Starling to Lead Platform Architect and Mark Bergsma to Lead Operations Architect. Alolita Sharma is serving as acting Director for Features Engineering, and Erik Möller assumed the position of VP of Engineering and Product Development (read more).
 * Other changes — Guillaume Paumier's position officially transitioned from Product Manager to Technical Communications Manager.

Site operations
Virginia Data Center — Installation of a world-class primary data center for Wikimedia Foundation websites.
 * Status: In June, our network setup in EQIAD, our facility at Equinix in Ashburn, Virginia, was finished. Two independent "transport" connections between our two data centers in Tampa and Ashburn have been installed, and local "IP transit" (Internet connectivity) is now available in Ashburn as well. We started replicating our thumb server from ms5 (Tampa) to ms1004 (eqiad). All 48 of our database servers are now up and running, and database replication will start this week. Several services should come online from eqiad in July.

Media Storage — Improvement of our media storage architecture to accommodate expected increase in media uploads.
 * Status: Russ Nelson continued to develop the SwiftMedia extension to integrate MediaWiki with Swift. He also cleaned up and removed duplicate code, made performance assessments and, with the help of Neil Kandalgaonkar, got the extension to mostly work with UploadStash.

Testing environment
Virtualization test cluster — Environment to deploy temporary machines for testing and experimentation, for use by WMF staff and volunteers working on important projects (as capacity allows).
 * Status:

Backups and data archives
Data Dumps — Improvement of processes to create and provide public copies of public Wikimedia data.
 * Status: The first run of the English Wikipedia dumps on the new beefy server was somewhat disappointing. More than half of the files were truncated; clearly 32 jobs at once is a bit much for it. We'll be doing some testing to find the sweet spot this coming month. In the meantime, the truncated files are being regenerated in smaller batches. Code to detect truncated dump files has been committed and will go live shortly. This code let us find an issue with the Chinese Wikipedia (zh) history dumps that has been around for almost a year; the issue is under investigation while a good dump is being generated in smaller chunks. After fixing a glitch that made the per-project dump web page unreadable on hosts with the upgraded OS, installs are now fully puppetized, with just one host left to upgrade.
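
As an illustration of the truncation check mentioned above, here is a minimal Python sketch. It is not the code that was committed for the dumps; the function name `is_truncated` and the assumption that dump files are bzip2- or gzip-compressed are ours. It relies on the fact that Python's `bz2` and `gzip` modules raise `EOFError` when a compressed stream ends before its end-of-stream marker.

```python
import bz2
import gzip
import os


def is_truncated(path):
    """Return True if the compressed file at `path` looks truncated.

    Hypothetical helper for illustration only: it treats an empty file, or
    a stream that ends before its end-of-stream marker, as truncated.
    """
    if os.path.getsize(path) == 0:
        # An empty file contains no compressed stream at all.
        return True
    opener = gzip.open if path.endswith(".gz") else bz2.open
    try:
        with opener(path, "rb") as stream:
            # Read in 1 MiB chunks so a large dump never sits in memory.
            while stream.read(1024 * 1024):
                pass
    except EOFError:
        # Raised when the data stops before the end-of-stream marker.
        return True
    return False
```

A check like this has to decompress the whole file, so on very large dumps it is slow; a production check might inspect only the tail of the file instead.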

Other activities

 * Backups — Network connectivity to our new data center is now available. The first data is being copied and/or replicated onto the new servers and storage systems in Ashburn, and all important data will be present in our new facility in July.
 * HTTPS & IPv6 — Ryan Lane announced that Wikimedia sites would be switching to protocol-relative URLs in July, as part of the work to properly support HTTPS.


 * June 23 outage — Production wikis suffered from a 45-minute outage on June 23; a postmortem was drafted, as well as a more general incident response document.
 * Server decommission donations — Rob Halsell announced that decommissioned Wikimedia servers would be donated to non-profits, and called for applicants. Applications are now closed.
 * Summer of Research 2011 — Asher Feldman and Ryan Lane created the systems infrastructure for the Summer of Research team to perform data mining and analysis work.
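
To illustrate the protocol-relative URLs mentioned under "HTTPS & IPv6" above: such a URL drops the scheme and starts with `//`, so the browser reuses whichever protocol (HTTP or HTTPS) the current page was loaded with. The Python sketch below shows the transformation only; it is not MediaWiki's implementation, and the function name `to_protocol_relative` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_protocol_relative(url):
    """Strip the http/https scheme from an absolute URL.

    Hypothetical helper for illustration. URLs that are already relative,
    or that use another scheme (mailto:, ftp:, ...), are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return url
    # With an empty scheme and a non-empty host, urlunsplit emits "//host/path".
    return urlunsplit(("", parts.netloc, parts.path, parts.query, parts.fragment))


print(to_protocol_relative("http://en.wikipedia.org/wiki/Main_Page"))
# //en.wikipedia.org/wiki/Main_Page
```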

Editing tools
Visual editor 0.1 — Exploratory work to identify & prototype initial ideas for a visual editor for MediaWiki.
 * Status: Trevor Parscal continued to work on the front-end of the visual editor, and specifications for accessing the editing surface via the API. A hybrid rendering approach appears to be the best strategy for the visual editor. Neil Kandalgaonkar continued to work on the middleware, DOM and transactions. Neil also continued to work on a demo to integrate MediaWiki and Etherpad. With Alolita Sharma, they planned their upcoming sprints. Neil and Trevor are posting about their work to the parser email list.

FlaggedRevs — A feature to allow changes made by logged-out and new users to be reviewed before they appear as the primary version of an article.
 * Status: Aaron Schulz improved user preferences and changed the way statistics are stored in the database, among other minor improvements. Chad Horohoe helped review the backlog of unreviewed commits.
 * Program manager: Alolita Sharma

Content Quality and Editorial Tools
Article Feedback — A feature to collaboratively assess article quality and incorporate reader ratings on Wikipedia.
 * Status: New features were added in June, including a dashboard tracking articles that receive low ratings. In response to community feedback, tooltips were added to explain the meaning of the star ratings. Roan Kattouw is currently implementing the UDP back-end to provide clicktracking metrics to assess user engagement. The community provided feedback and bug reports, and the development team addressed the concerns raised, for example by implementing a user preference to hide the tool. Dario Taraborelli continued to evaluate the data provided by the articles already showing the feature. The incremental roll-out to all articles on the English Wikipedia is planned to be completed by mid-July.

Discussions and Interactions
WikiLove 1.0 — An extension to encourage expressions of appreciation between users.
 * Status: Jan Paul Posma completed the back-end work and fixed bugs. Roan Kattouw reviewed the code and enabled the extension on a private production wiki for testing. The feature was then enabled on a public prototype wiki that was used for informal user testing. The extension is planned to be deployed to production wikis on June 30.

MoodBar 0.1 — A feature to encourage new users to provide feedback.
 * Status: Brandon Harris updated the feature's design, while Andrew Garrett and Timo Tijhof have completed a first iteration of development on both front-end and back-end (see code in SVN).

StructuredProfile -- A set of features that a) make it easier for new editors to fill out their profile pages with meaningful information about their background and interests and b) surface select profile information to experienced editors within lists such as recent changes, watchlist, etc. The ideas are still in development, so please provide feedback on the StructuredProfile Talk page.

Multimedia Tools
Upload wizard — A feature that provides an easier way of uploading files to Wikimedia Commons, the media library associated with Wikipedia.
 * Status: Neil Kandalgaonkar continued to fix bugs, and added functionality to show thumbnails before upload in modern browsers.

Other projects

 * Non-Roman character set localization — Alolita Sharma published two RfPs to assemble a dedicated team to tackle localization issues from a feature development and outreach perspective. She is currently reviewing applications and contracts.
 * ResourceLoader 2.0 — This project was mostly on hold in June due to the lack of engineering resources. Work is planned to resume in July.
 * LiquidThreads 3.0 — WMF work on this project was mostly on hold in June due to limited resources, which were assigned in priority to supporting the 2011 Board Election and the MoodBar.

Wikimedia Labs
Media projects — A set of features to improve media handling and key infrastructure support tools, many developed with Kaltura, such as Metavid, MwEmbed, and the Video Editor.
 * Status: Michael Dale continued to address Brion Vibber's comments from code review, and update and fix the TimedMediaHandler code. He also started to work on a test plan to perform user experience testing on a prototype.

Parser — Groundwork for the next generation visual editor of MediaWiki.
 * Status: Brion Vibber continued to work on the parser plan, and also moved the "parser playground" gadget to an extension. He invited the developer community to use it and provide feedback (read more).

Mobile
Mobile Research — A research project to help determine our Mobile strategy.
 * Status: In June, we completed our fieldwork in Brazil, consisting of 16 interviews, led by Parul Vora and Mani Pande in São Paulo, Salvador and Porto Alegre. We conducted extensive in-home interviews with three kinds of participants: readers of Wikipedia on a mobile phone, potential mobile readers (i.e. who currently use a computer, but could become mobile readers) and editors (primarily of the Portuguese Wikipedia, and to a lesser degree the English Wikipedia). We also received about 6 proposals from US firms in response to our RfP, to conduct research in three cities in the US. The mobile survey is scheduled to be launched at the end of July.

Mobile site rewrite — Port of our Ruby-based mobile gateway to PHP.
 * Status: Tomasz Finc sent a call for testers to help test the prototype in English, Japanese and Hebrew. Feedback is now being addressed by the mobile team, which is tracking fixes and new feature requests in Bugzilla. Patrick Reilly and Asher Feldman also worked together to profile the MobileFrontend extension (formerly "PatchOutputMobile") to prep it for deployment. We're now looking at how to integrate it with our Varnish and Squid caching architecture, so that we can have the advantages of the WURFL mobile device database with acceptable performance.

Short news
Mobile issues — In June, we noticed a lot of 500 errors for image resources on our mobile platform. Asher Feldman upgraded the mobile Ruby gateway to a newer version of Ruby and Passenger, which cleared up a lot of production issues and cut our service times dramatically.

Fundraising support
2011 Fundraiser — Support and development for the annual fundraiser of the Foundation.
 * Status: Arthur Richards, Ryan Kaldari and Katie Horn are wrapping up their first development sprint, and preparing for the next one beginning on July 5. Their work focuses on new features for CentralNotice to better facilitate banner management, as well as back-end enhancements to the donation processing pipeline.

Offline
Wikipedia version tools — Support and development of a series of tools to select Wikipedia content for offline use.
 * Status: GSoC student Yuvi Panda continued to port the WP 1.0 bot to a MediaWiki extension. Mentored by Arthur Richards, and supported by WP 1.0 Bot author/maintainer User:CBM, Yuvi implemented an assessment template processing feature, and is now working on a WP 1.0 bot replacement feature that will automatically include real-time assessment statistics on project pages. A feature to filter and select articles based on assessment criteria is planned to be added in July.

Collections — General work for Collections extension.
 * Status: Tomasz Finc discussed future work with PediaPress, Wikimedia Italia and others. Possible directions include scaling the Collections extension to output much larger collections, integrating it with the new Kiwix download manager, improving general user experience and making it a general purpose solution for article selection.

Kiwix — Improvement of the user experience of the Kiwix app to access offline Wikimedia content.
 * Status: We posted the videos from our Berlin usability testing session, and released the next beta version of Kiwix. Users of Kiwix can now easily download new openZIM files right within the interface. We're looking at connecting it to the Collections extension (see above) so that anyone can easily download new book collections. If you're interested in participating, please check out our volunteer program. We're especially in need of an expert who can help us with some complex libtool issues.

MediaWiki Core
MediaWiki 1.17 — The latest MediaWiki stable release.
 * Status: Tim Starling announced the release of MediaWiki 1.17.0. Besides many bug fixes, MediaWiki 1.17 features a new install wizard, the ResourceLoader (a tool to load JavaScript and CSS assets), category sorting improvements, and improved localization (see full release notes).

MediaWiki 1.18 — The upcoming MediaWiki release.
 * Status: Thanks to the efforts of the code review team, the backlog of unreviewed commits for MediaWiki 1.18 was drastically reduced in June (see chart). Mark Hershberger started a discussion about which extensions to bundle with MediaWiki 1.18.

Code review management — Review of changes made to the MediaWiki code.
 * Status: The backlog of unreviewed commits continued to decrease in June. A long but productive discussion between developers took place on the wikitech-l list about how to further improve the code review process. It led to a proposal of a "20% policy", according to which every eligible Wikimedia engineer would spend 20% of their time doing service work that directly benefits the rest of the community.

Wikimedia analytics
Wikimedia Report Card 2.0 — Usability improvements and streamlining of the creation of the monthly report card.
 * Status: Erik Zachte and Nimish Gautam started a development sprint and worked on the back-end infrastructure, supported by Asher Feldman & Sam Reed. The information stored in a database is accessed via a new MediaWiki extension ("MetricsReporting", see in SVN), and the visualization part uses jqPlot. The team hopes to demonstrate a prototype for the next report card in early July.

Technical Liaison; Developer Relations
Bugmeistering — Management of our bug tracker.
 * Status: Mark Hershberger continued conducting bug triages to surface issues that require attention or decisions; this month these meetings switched from phone to IRC to improve transparency and accessibility to the community. After helping developers wrap up the 1.17 tarball, he started looking at 1.18 bugs, and led a triage to narrow down the list of open bugs blocking deployment of MediaWiki 1.18 on Wikimedia Foundation sites. He also worked on several concerns raised by the community, such as enabling "International" numerals on Hindi wiki with Priyanka's help, and right-to-left and extension bundling issues. He also did internal knowledge sharing on his Bugzilla API client.

Summer of Code 2011 — A sponsored community program allowing students to join the community as developers.
 * Status: Our eight Summer of Code students continued working on their projects full-time, and all are now committing code. WMF staffers and community members are mentoring the students, assisted by Sumana Harihareswara. They are preparing for their mid-term evaluations, to take place in mid-July.

Engineering project documentation — An activity to ensure that project documentation of Wikimedia engineering activities is complete and up-to-date.
 * Status: Guillaume Paumier finalized the infrastructure for project pages, using templates and transclusion. Because of the tools' limits, full automation wasn't possible. He also continued to update project pages and statuses. A sprint is planned in July to catch up with the documentation of all projects.

Other activities

 * API maintenance — Sam Reed continued to fix bugs and to add new features to the MediaWiki API.
 * Shell requests — Priyanka Dhanda continued to process shell requests, after it appeared that a harmonization of wiki configuration options wouldn't be a significant time saver.
 * Access to Subversion — Volunteer development coordinator Sumana Harihareswara is now the primary point of contact for commit access requests. About 7 new developers were granted commit access in June, including 2 Summer of Code students and 2 Wikimedia Foundation employees.
 * Heterogeneous deployment — Priyanka Dhanda and Tim Starling added features and improved the code, which is now in SVN. Priyanka used it to deploy different versions of MediaWiki on a prototype. Tim and Priyanka are now discussing edge cases and remaining tasks.
 * translatewiki.net support — The list of 500 most used MediaWiki interface messages was updated to help translators focus on the messages with the most impact. The Translate extension may be reviewed in July to be used for content translation on Wikimedia sites, e.g. on meta-wiki.
 * Academic publications authentication proxy — Chad Horohoe started work on allowing selected Wikimedians to access third-party academic publishing sites to help with verifiability. The authentication challenges this entails are not trivial; Ryan Lane is also involved, drawing on his previous experience with OpenID (which may be used as a mechanism for tying into CentralAuth).
 * HipHop support — HipHop support is still planned to be part of MediaWiki 1.20. In the meantime, we're looking for volunteers to help us package it for different distributions. Please contact Sumana Harihareswara if you have experience with packaging or would like to get involved in this area.
 * Disk-backed object cache — A deployment was attempted on June 27, but had to be rolled back. Tim Starling and Domas Mituzas are investigating what went wrong, and another deployment attempt will be scheduled soon.
 * Projects on hold — The App-level monitoring and Configuration management projects were delayed in June, in favor of other work like the 1.17 release. Some of them will be resumed in July.