Wikimedia Engineering/Report/2010/September

(to be posted on techblog.wikimedia.org by RobLa-WMF)

Below is another overview update from Wikimedia Foundation Engineering, pulled together by Alolita, Danese, Guillaume, Mark, Tomasz, Zak, and myself. This edition of the update was drafted on mediawiki.org, where you can find the complete history of everyone who contributed. We believe we've gotten better at characterizing our work, but there are almost certainly gaps (especially when it comes to ongoing activities versus projects that have clear start and end dates).

As before, each area has a program manager, who is responsible for coordinating the activity in that area. More detailed updates will come from those people as they become available.

A quick summary of the major development and operations initiatives underway this month:


 * Testing and deployment of the grant-funded improvements to media uploading. These activities must be completed by a certain date so that we can meet our contractual agreement with the funder;
 * Testing and deployment of the new ResourceLoader, which will improve performance of our sites for all users;
 * Testing and deployment of the article feedback tool, which is timed to coincide with the Public Policy Initiative's first semester;
 * Engineering and testing for the 2010 fundraiser, including testing of a new analytics framework;
 * Build-out of a new primary data center location in Virginia. The new data center will eliminate our Tampa data center as a single point of failure for all Wikimedia Foundation projects;
 * A bug-smash in October to resolve the backlog of general bug and maintenance requests;
 * Ongoing development of Pending Changes.

More detail below the fold...

(insert the article break here)

Operations
 Virginia Data Center - Setting up a world-class primary data center for Wikimedia Foundation properties.
 * Status: We're in the final phase of selecting the new data center. Rob Halsell is relocating to Virginia to be the primary on-site operations engineer for the new buildout.
 * Program manager: Mark Bergsma

 Media Storage - Revamping our media storage architecture to accommodate the expected increase in media uploads.
 * Status: (need update from Danese)
 * Program manager: Mark Bergsma

 Monitoring - Enhancing both Operations and public monitoring to a) notice potential outages sooner, b) increase transparency to the community, c) support progress tracking required in the 5-year plan.
 * Status: Last month, we improved our integration of Monitoring systems with our Configuration Management software, which helps us make sure that monitoring is automatically set up for any new machines that we deploy. In the coming month, we plan to set up SMS notifications for vital service outages and other abnormalities.  We're also investigating some third-party monitoring solutions for monitoring our uptime and site/service performance.
 * Program manager: Mark Bergsma

 Credit card server upgrade - Upgrading our current payments infrastructure to support 1-click donations.
 * Status: In progress. We're increasing our capacity and, in turn, simplifying our donation flow. We'd be happy to talk to the Wikimedia Chapters about lessons learned.
 * Program manager: Tomasz Finc

Virtualization cluster:
 * Goal: The goal of the cluster is to make it easier to deploy temporary machines for testing and experimentation. This cluster is intended not just for WMF staff; it will also be available to volunteers working on important projects, as capacity allows.
 * Status: Ryan has been working on this since he started. He's been investigating OpenStack and other open source solutions, and he's testing on a couple of machines. There's nothing production grade yet, but he's hoping to have something useful by the end of the year.

Content Quality Tools
 Article Feedback - Working on a feature to collaboratively assess article quality and incorporate reader ratings on Wikipedia.
 * Status: Phase 1 was deployed last week in a pilot experiment for the Public Policy Initiative. The goal is to gather metrics and learn about how this feature should work. We plan to use this information to guide a rewrite in the first quarter of 2011.
 * Program manager: Alolita Sharma

 Pending Changes - Pending Changes is a new review feature recently deployed to en.wikipedia.org, which allows changes made by anonymous and new users to be reviewed before they appear as the primary version of an article.
 * Status: We're planning a release of Pending Changes in November 2010. It is currently under development, as described on our roadmap (see Pending Changes roadmap). Aaron Schulz, the author of the vast majority of the code, is advising us and has mostly implemented the "reject" button. Chad Horohoe and Priyanka Dhanda are working on some of the short-term development items. Brandon Harris and Parul Vora are advising us on how we can make this feature mesh with our long-term usability strategy. We're currently tracking the list of items we intend to complete in Bugzilla. For more information, see the full update on wikitech-l.
 * Program manager: Rob Lanphier

Threaded discussions
 Liquid Threads - LiquidThreads is an extension that brings threaded discussion capabilities to Wikimedia projects and MediaWiki.
 * Status: LiquidThreads is on hiatus due to the primary developer going back to school. We're planning to perform a user interface study and, hopefully, a wider deployment in the first half of next year.
 * Program manager: Alolita Sharma

Multimedia tools
 Upload wizard - The upload wizard is an extension for MediaWiki providing an easier way of uploading files to Wikimedia Commons, the media library associated with Wikipedia.
 * Status: Neil Kandalgaonkar checked the bulk of the available code into a branch, and it's undergoing code review now. Guillaume Paumier is working on the Licensing tutorial and the Multimedia Usability project report. We're working hard to meet the October 31 deadline for the Ford Foundation grant.
 * Program manager: Alolita Sharma

 Add media wizard - The Add-media wizard is a gadget to facilitate the insertion of media files into wiki pages. Its development is supported by Kaltura (consider re-working this section to a "MediaLabs" section that can contain notes on MDale's gadgets).
 * Status:
 * Program manager: Alolita Sharma

MediaWiki Infrastructure
 Resource loader - The resource loader aims to improve the load times for JavaScript and CSS components on any wiki page. The intention of this work is to support faster loading of the Vector skin, media extensions, and anything else that makes use of JavaScript.
 * Status: ResourceLoader was checked in during September. Since then, Trevor has been debugging and modifying extensions to be compatible with the new framework.
 * Program manager: Alolita Sharma

 Central Notice - CentralNotice is a banner system used for global messaging across Wikimedia projects.
 * Status: Last month, we deployed GeoIP functionality, a new dynamic banner loader, and new testing features. The coming month is going to be about bugfixes, incorporating feedback, and documentation.
 * Program manager: Tomasz Finc

 Analytics Revamp - Incorporate an analytics solution that can grow and answer the questions that the Wikimedia movement has.
 * Status: Last month, we developed all of our required features for the fundraiser (flow tracking through the donation pipeline, categorization, and goal tracking), and now we're integrating with the WMF infrastructure and testing the result. We're working with Peter Adams, the main developer of Open Web Analytics (OWA), to make it possible to deploy OWA on WMF properties in the future. Now that we've done significant work to make it easier to install OWA with MediaWiki, we'd love for people to try this out and tell us how it works.
 * Program managers: Rob Lanphier & Tomasz Finc

General Engineering
 Test framework deployment - Building an automated test environment for MediaWiki using CruiseControl, Selenium, and PHPUnit.
 * Status: Our weekly meetings (which are public) are currently aimed at getting enough of the framework in place by the Hack-A-Ton that we can code against it there, since team members (staff and volunteers) will all be in DC. Come on over if you're interested. We're also working on per-test configuration so that tests are repeatable even when they alter the database.
 * Program manager: Rob Lanphier

 Process improvement - Increase transparency and generally organize Wikimedia Foundation's engineering efforts more efficiently.
 * Status: We've established monthly blog posts and project pages, and we're posting more information to the Tech Blog in general about what we're doing. Some of the work on selecting project management tools has been put on hold, as we've not fallen in love with anything we've encountered so far. We currently have high hopes for the release of Bugzilla 4.0, and for the ways in which we'll be able to configure it for our needs.
 * Program manager: Rob Lanphier

Code review
 * Status: We've known for a long time that our code review resources were too constrained, but it's taken time to realize that finding or growing individuals with suitable expertise and experience to perform sufficient code review may also be a suboptimal way to address the needs of the developer community. We want to create an environment where MediaWiki developers can grow into capable mentors and reviewers. In the near term, given Tim Starling's pending paternity leave and Mark Hershberger's recent accident (from which he's thankfully recovering well), we have recruited a larger team of existing and former staff (including Brion Vibber, Trevor Parscal, Roan Kattouw, and Mark as he comes back up to speed). In the longer term, we are now brainstorming how best to identify and train community members to take up some of the code review tasks, and how to provide appropriate oversight to ensure that we continue to catch potential style, architecture, performance, and security flaws in submitted code.

Technical Documentation – Improve our technical documentation by making small, incremental improvements to the docs and docs process.
 * Status: Overall, MediaWiki's technical documentation is quite useful, but varies greatly in accuracy, completeness, content, page structure, translation status, writing style and quality. This inconsistency makes the documentation hard for many people to use. Over the next few months, Zak Greant will be focused on six key things – community engagement, transparency, accountability, editing and writing, wiki gardening and planning – to help improve the technical documentation and the technical documentation process. For more details, review the Technical Documentation project page.
 * Program managers: Rob Lanphier & Zak Greant

Fundraising
2010 Fundraiser
 * Status: Weekly test runs are happening every Thursday until the fundraiser starts on November 8, with subprojects involving fraud prevention, CentralNotice, and the analytics upgrade.
 * Program manager: Tomasz Finc