Wikimedia Engineering/Report/2010/October

(to be posted on techblog.wikimedia.org by RobLa-WMF)

Here is the October monthly report from the Wikimedia Foundation, reporting on what we've been working on and what we're planning. October featured continued work on the Virginia data center migration, continued work on features such as ResourceLoader, Article Feedback and Upload Wizard, increased focus on code review, new testing infrastructure, many new job postings, and the Hack-A-Ton in Washington DC. More below...

(insert the article break here)

Operations
Virginia Data Center - Setting up a world-class primary data center for Wikimedia Foundation properties.
 * Status: Delayed. We're still in the final phase of selecting the new data center. Rob Halsell has relocated to Virginia to be the primary on-site operations engineer for the new build-out.
 * Program manager: Mark Bergsma

Media Storage - Re-vamping our media storage architecture to accommodate expected increase in media uploads.
 * Status: Hiring a contractor for this project. We've met with a few potential candidates for this position.
 * Program manager: Mark Bergsma

Monitoring - Enhancing both Operations and public monitoring to a) notice potential outages sooner, b) increase transparency to the community, c) support progress tracking required in the 5-year plan.
 * Status: In the coming month, we plan to set up SMS notifications for vital service outages and other abnormalities. We're also finalizing a deal for a third-party solution to monitor our uptime and site/service performance.
 * Program manager: Mark Bergsma

Virtualization cluster - Deploy temporary machines more easily for testing and experimentation. This cluster is intended not just for WMF staff; it will also be available to volunteers working on important projects, as capacity allows.
 * Status: Ryan Lane has investigated a few open source solutions, and has selected OpenStack. He's testing their newest beta release on a couple of machines. There's nothing production grade yet, but he's hoping to have something useful by the end of the year.
 * Program manager: Mark Bergsma

Content Quality Tools
Article Feedback - Working on a feature to collaboratively assess article quality and incorporate reader ratings on Wikipedia.
 * Status: The first version of this feature was rolled out at the end of September, and we spent the month of October collecting and analyzing the feedback from the tool. We are currently designing the second version of this feature, and will start development later this month. We plan to roll out the next phase in sync with the Public Policy Initiative Phase 2 in January.
 * Program manager: Alolita Sharma

Pending Changes - Pending Changes is a new review feature recently deployed to en.wikipedia.org that allows changes made by anonymous and new users to be reviewed before they appear as the primary version of an article.
 * Status: We recently completed two large features related to Pending Changes: speeding up the display of diff pages and adding a "Reject" button. We are currently on track to deploy a new version of Pending Changes on November 16, with further incremental improvements to be rolled out as we complete development.
 * Program manager: Rob Lanphier

Threaded discussions
LiquidThreads - LiquidThreads is an extension that brings threaded discussion capabilities to Wikimedia projects and MediaWiki.
 * Status: Development on this feature slowed in October due to lack of staff availability. However, we anticipate continued maintenance work occurring through the end of the year, with development picking back up in January.
 * Program manager: Alolita Sharma

Multimedia tools
Upload wizard - The upload wizard is an extension for MediaWiki providing an easier way of uploading files to Wikimedia Commons, the media library associated with Wikipedia.
 * Status: This project has been extended by one more month (until the end of November) to allow for a more robust implementation and time for testing before launching on Wikimedia Commons. Neil Kandalgaonkar has finished development of a temporary storage system for media files missing required metadata like copyright or source information. Roan Kattouw has started prep work for the deployment of this feature, scheduled for late November. The feature is currently in testing on the Commons prototype.
 * Program manager: Alolita Sharma

Licensing tutorial - The licensing tutorial is an illustrated educational comic strip for Wikimedia Commons, explaining the basics of copyright and free licenses.
 * Status: This project is reaching its final stage after a few months of work. Illustrator Michael Bartalos delivered the final color design, which was recently published on Wikimedia Commons under a free license. The tutorial will be integrated into the interface of the Upload Wizard and shown to new users. Translation and localization of the tutorial are underway. For more details, see the recent blog post "Illustrated licensing tutorial for Wikimedia Commons".
 * Project manager: Guillaume Paumier

Add media wizard - The Add-media wizard is a gadget to facilitate the insertion of media files into wiki pages. Development effort for this labs feature is supported by Kaltura (consider re-working this section to a "MediaLabs" section that can contain notes on Michael Dale's gadgets).
 * Status: The current phase of development is focused on integrating the existing components with ResourceLoader to decrease load time for the gadgets. Other components in development include a media sequencer, embedPlayer (timedMediaHandler, WebM integration), and SwarmPlayer.
 * Program manager: Alolita Sharma

MediaWiki Infrastructure
Resource loader - The resource loader aims to improve the load times for JavaScript and CSS components on any wiki page. The intention of this work is to support faster loading of the Vector skin, media extensions, and anything else that makes use of JavaScript.
 * Status: The feature is largely complete and checked into trunk. The team has been working on unit testing the framework and performance testing (and tuning). We are also ensuring compatibility with existing extensions and modifying extensions to use the new messaging system. We hope to have this work wrapped up some time in December, with a deployment sometime after that.
 * Program manager: Alolita Sharma

General Engineering
Analytics Revamp - Incorporate an analytics solution that can grow and answer the questions that the Wikimedia movement has. Includes udp2log revamp.
 * Status: Our primary focus has been on analytics tools for the 2010 Fundraiser, allowing that team to make data-driven decisions to refine our donation workflow. We've also kept an eye on long-term development, so that we can use these tools for studying and improving other parts of our user interface. We have made substantial progress on OWA/MediaWiki integration, as well as on scaling OWA to work in our environment.
 * Program managers: Rob Lanphier & Tomasz Finc

Test framework deployment - Building an automated test environment for MediaWiki using CruiseControl, Selenium, and PHPUnit
 * Status: We have a basic framework in place for Selenium testing, and we're now fleshing it out with more tests and helper functions to make it easier to build tests. Weekly meetings continue, alternating between voice and IRC.
 * Program manager: Rob Lanphier

Process improvement - Increase transparency and generally organize Wikimedia Foundation's engineering efforts more efficiently
 * Status: This activity will be ongoing. Blog posts like this and project pages continue to be updated. The Bugzilla reconfiguration work has been put on hold as Priyanka and Chad focus on Pending Changes work, and as we await Bugzilla 4.0. We'll institute other improvements on an as-needed basis, but this will be the last report about "process improvement" as a general area.
 * Program manager: Rob Lanphier

Code review - improving the way we provide code reviews for MediaWiki
 * Status: The new, larger team of reviewers has been making headway (or rather, has at least kept us from falling much further behind). Beyond "just get cracking" (which we're doing), we are also trying to improve visibility into the problem. Roan Kattouw is working on some changes to make it easier for others to participate in the review process. Rob Lanphier plans to start reporting on our current metrics, based on some work that Bryan Tong Minh has already started. See the code review status graph for the number of "ok" commits plotted against the total number of commits in the "phase3" branch.
 * Program manager: Rob Lanphier
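
As a rough illustration of the metric behind that graph, the share of commits marked "ok" can be computed from a list of review statuses. This is only a sketch: the `review_progress` function and the sample data are hypothetical, and any status value other than "ok" and "fixme" is an assumption rather than something taken from our actual tooling.

```python
from collections import Counter


def review_progress(statuses):
    """Return (ok_count, total_count, ok_fraction) for a list of review statuses."""
    counts = Counter(statuses)
    total = len(statuses)
    ok = counts["ok"]
    # Guard against an empty list so the fraction is always defined.
    fraction = ok / total if total else 0.0
    return ok, total, fraction


# Hypothetical sample data, not real figures from the "phase3" branch.
sample = ["ok", "new", "ok", "fixme", "new", "ok", "new", "ok"]
ok, total, fraction = review_progress(sample)
print(f"{ok} of {total} commits marked ok ({fraction:.0%})")
```

Tracking this fraction over time would show whether reviewers are gaining on the backlog or merely keeping pace with new commits.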

Technical Documentation - Improve our technical documentation by making small, incremental improvements to the docs and docs process.
 * Status: Zak Greant continues to plug away making incremental improvements on mediawiki.org, currently working on a review pass of the MediaWiki manual.
 * Program managers: Rob Lanphier & Zak Greant

wmsync - Replace our current deployment tools (e.g. "scap") with more robust software
 * Status: This project is in the very early discussion stages. The goal with this project is to make deployments more robust and easier, as well as reduce our reliance on NFS.
 * Program manager: Rob Lanphier

Fundraising
2010 Fundraiser - The engineering tasks necessary to run a successful fundraiser, with sub-projects involving fraud prevention, CentralNotice, and the analytics upgrade.
 * Status: This month has been full of content creation to allow us to optimize our donation workflow. We've seen gains as high as five percent between tests. We will be launching Monday, November 15.
 * Program manager: Tomasz Finc

Credit card server upgrade - Upgrading our current payments infrastructure to support 1-click donations.
 * Status: We've been successful in adding caching and optimizing MediaWiki. In our second stage, we added a Squid caching layer, which has given us an 8x throughput increase. We are currently preparing a new mini-cluster in order to horizontally scale our fundraising infrastructure. This will allow us to securely support the increased amount of traffic we project for the upcoming fundraiser while also providing us the agility to rapidly increase our throughput as needed.
 * Program manager: Tomasz Finc

Mobile
Mobile site rewrite - Porting our existing gateway for easier support, development, and participation
 * Status: Our current mobile gateway is very popular and a very valuable resource for many (including much of the staff). However, it has proven difficult to support operationally. We're currently contemplating a redesign of this feature. We're planning to hire an engineer to start work on the rewrite, and hope to start in earnest at the beginning of 2011.
 * Program manager: Tomasz Finc

Offline
Offline
 * Status: We are starting to ramp back up on this activity as the development team becomes available again (having wrapped up most of the development work on the Fundraiser). Ariel Glenn is working on parallelizing the generation of offline XML dumps. Additionally, we're exploring adding export formats to the collection extension. We're also exploring new clients for offline reading of Wikipedia.
 * Program manager: Tomasz Finc
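
To give a feel for what parallelizing dump generation can mean in general, here is a minimal sketch that splits a range of page IDs into chunks and processes the chunks concurrently. It does not describe the approach actually being taken: `split_ranges`, `dump_range`, and `parallel_dump` are hypothetical names, the XML output is a placeholder, and a thread pool is used only to keep the example short (a real job would more likely use separate processes or machines).

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(first_id, last_id, chunk_size):
    """Split the inclusive range [first_id, last_id] into consecutive inclusive chunks."""
    ranges = []
    start = first_id
    while start <= last_id:
        end = min(start + chunk_size - 1, last_id)
        ranges.append((start, end))
        start = end + 1
    return ranges


def dump_range(id_range):
    """Hypothetical stand-in for exporting one range of pages to XML."""
    start, end = id_range
    return f"<pages from='{start}' to='{end}'/>"


def parallel_dump(first_id, last_id, chunk_size, workers=4):
    """Process every chunk concurrently and return the pieces in page order."""
    ranges = split_ranges(first_id, last_id, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the pieces stay in page order.
        return list(pool.map(dump_range, ranges))


pieces = parallel_dump(1, 10, 4)
print("\n".join(pieces))
```

The key design point is that each chunk is independent, so the work scales with the number of workers, and the ordered results can simply be concatenated afterwards.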

Hiring
We have a lot of hiring coming up before the end of the year. Job descriptions are already posted for the following:
 * Director of Technical Operations
 * Bugmeister
 * User Interface Designer

In addition, we hope to post the following positions soon: (Note: all of these positions may change as Foundation requirements evolve)
 * Software Developer (Mobile)
 * Software Developers (Features)
 * Data Analyst
 * Senior QA Engineer
 * Release Engineer
 * Technical Writer
 * Storage Engineer (contractor)

Misc
Hack-A-Ton - Held October 22–24, 2010 just outside Washington, D.C. in the United States. This was our first developer event on the east coast of the United States. We had 27 attendees total (16 Foundation staff and 11 non-staff). The engineers in attendance were able to fix 46 bits of code marked "fixme" (roughly one-third of the total). We had lots of great conversations about release engineering and code review, and plenty of discussion and coordination of future feature work. There were many short presentations and discussions about WikiBhasha, the Selenium framework, the PHPUnit framework, and sentence-level editing, among others. See the summary blog posts (first day and second day) for more information.