Wikimedia Engineering/Report/2011/February

Major accomplishments this month include:
 * Racking party at our new datacenter in Virginia
 * Data Summit, February 4 in California
 * Release of Editor Trends study data and tooling
 * The (painful but ultimately successful) deployment of MediaWiki 1.17 to all Wikimedia wikis
 * Ward Cunningham starts year of monthly "in residence" visits to Wikimedia HQ

Recent events

 * Data Summit (February 4, California) — A lot of fruitful discussions happened during this working session. Notes are available from the working groups on parsers, structured data and analytics.
 * FOSDEM 2011 (February 5-6, Brussels, Belgium)
 * GNUnify 2011 (February 11-12, Pune, India)

Upcoming events

 * Berlin Hackathon 2011 (May 20-22, Berlin) — Mark your calendar: the Berlin Hackathon will take place on May 20-22. Participants are now listing topics to work on.
 * Wikimania (August 2-7, Haifa, Israel) — This year's Wikimania will be preceded by two days of hacking. Mark your calendar for August 2-3! You can also submit a talk or workshop for the Technology tracks of the actual conference (August 4-7).

Personnel
Are you looking to work for Wikimedia? We have a lot of hiring coming up this year, and we really love talking to active community members about these roles. The following positions are currently open:
 * Volunteer Development Coordinator
 * Performance Engineer
 * Software Developer (Features)
 * Software Developer (Mobile)
 * Data Analytics Engineer
 * Operations Engineer
 * Senior QA Engineer

In addition, we hope to post the following positions over the next few months:
 * Rich Text Editor Engineer
 * Release Engineer
 * Technical Writer
 * Network Engineer (contractor)

Short news

 * visits from contractors (find dates in engineering calendar)

Operations
Virginia Data Center — Installation of a world-class primary data center for Wikimedia Foundation websites.
 * Status: Nearly all hardware has been delivered to the data center. More than 50 pallets of equipment have been unboxed, stacked and installed in the 16 racks by a 4-person team. We are now finishing up the cabling of all equipment, and the initial setup of all devices to make them available for management on the network. In March, configuration of the first clusters of servers and services will begin, while we wait for network transport and transit services to be installed.
 * Program manager: Mark Bergsma

Media Storage — Improvement of our media storage architecture to accommodate expected increase in media uploads.
 * Status:


 * Program manager: Mark Bergsma

Virtualization test cluster — Environment to deploy temporary machines for testing and experimentation, for use by WMF staff and volunteers working on important projects (as capacity allows).
 * Status: A new OpenStack release, which contains the software features we need, has just come out. However, this project was also delayed by the build-out of the new data center. We expect to have the virtualization test cluster production-ready in March.
 * Program manager: Mark Bergsma

Backups — Improvement of backup coverage of Wikimedia-hosted data.
 * Status: We have purchased a dedicated storage solution, which will arrive in March and improve the reliability of part of our data. Once servers in the new data center are online and our private connection between Tampa and Ashburn is up, we will also be able to replicate all data between the two data centers.
 * Program manager: Mark Bergsma

Data Dumps — Improvement of processes to create and provide public copies of public Wikimedia data.
 * Status:
 * Program manager: Mark Bergsma

Content Quality and Editorial Tools
Article Feedback — A feature to collaboratively assess article quality and incorporate reader ratings on Wikipedia.
 * Status: The deployment to our prototype has surfaced additional feature requirements that we've now addressed. Now that MediaWiki 1.17 has been successfully deployed, we can release the latest version of the Article feedback tool on the English Wikipedia, as part of our pilot experiment this quarter. Requirements for the next version (3.0) are being drafted.
 * Program manager: Alolita Sharma

Pending Changes — A feature to allow changes made by logged-out and new users to be reviewed before they appear as the primary version of an article.
 * Status: Developer Aaron Schulz has focused on bug fixes. Further development is waiting for the English Wikipedia community to come to a consensus regarding what the future of the trial should be. A new Request for Comment was started for this purpose.
 * Program manager: Alolita Sharma

Controversial content management system — A feature to handle controversial content on a wiki.
 * Status: Following the 2010 Wikimedia Study of Controversial Content, Brandon Harris has created mockups of the feature, including initial UI design recommendations, in collaboration with the Community department and Board member Phoebe Ayers, who also sent an update. They will be presented to the Board of Trustees by the Strategic product team.
 * Program manager: Alolita Sharma

External review system — An interface for external reviews of Wikipedia content.
 * Status: At the request of the Strategic Product Department, Guillaume Paumier has researched and compared previous and current initiatives of quality review of Wikipedia content. He has also analyzed the goals and needs of both Wikipedians and "experts", in order to publish a set of requirements for an extensible and flexible external review system.
 * Commissioned by: Erik Möller

Discussions and Interactions
Liquid Threads — A feature that brings threaded discussion capabilities to Wikimedia projects and MediaWiki.
 * Status: Andrew Garrett has published documentation on upcoming back-end and architecture changes. New design specifications have been published by Brandon Harris as well. A discussion was started on the "gender gap" mailing list about how this new discussion system could improve interactions between participants.
 * Program manager: Alolita Sharma

SimpleSurvey/2.0 — A MediaWiki extension to create and run surveys in MediaWiki.
 * Status: In our work on the Article Feedback tool, we used some functionality from the existing SimpleSurvey extension. In order to make it more robust, Trevor Parscal has been evaluating the existing codebase, refactoring the extension, and consolidating code from other survey extensions. SimpleSurvey will also help us conduct small surveys to support strategic research.
 * Program manager: Alolita Sharma

?
Non-Roman character sets: localization and three input methods.
 * Status: We are working on plans to create a team in India to do the first Indic-related work in this area.

Community feature prototyping
Starting in February, the Community department will host engineers in a joint experiment on a more agile way to test prototype features. Trevor is taking part for February and March.

Multimedia Tools
Upload wizard — A feature that provides an easier way of uploading files to Wikimedia Commons, the media library associated with Wikipedia.
 * Status: Ryan Kaldari has joined Neil Kandalgaonkar to fix bugs and prioritize the work to be done for an UploadWizard 1.0 release. A bug-fix sprint is currently in progress.
 * Program manager: Alolita Sharma

JavaScript parsing library — A JavaScript parsing library for wikitext.
 * Status: Neil Kandalgaonkar implemented a JavaScript parser for wikitext using a parsing expression grammar (PEG). It will allow JavaScript tools to support internationalization, templating and other features; it will especially benefit multimedia and Media labs tools. Integration with ResourceLoader is underway.
 * Program manager: Alolita Sharma
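
To illustrate the general idea of a parsing expression grammar, here is a minimal sketch in Python. It is not the library described above: the function names and the tiny grammar (internal links only) are assumptions made for this example. It shows the ordered choice that characterizes a PEG, where the parser tries the link rule first and falls back to plain text if that rule fails without consuming input.

```python
from typing import List, Optional, Tuple, Union

# A node is either plain text or a ("link", target, label) tuple.
Node = Union[str, Tuple[str, str, str]]


def parse_link(src: str, pos: int) -> Optional[Tuple[Node, int]]:
    """link <- '[[' target ('|' label)? ']]'

    Returns (node, new_position) on success, or None on failure.
    On failure no input is consumed, so the caller can try another rule.
    """
    if not src.startswith("[[", pos):
        return None
    end = src.find("]]", pos + 2)
    if end == -1:
        return None
    body = src[pos + 2:end]
    if not body or "[[" in body:
        return None
    target, sep, label = body.partition("|")
    return ("link", target, label if sep else target), end + 2


def parse_inline(src: str) -> List[Node]:
    """inline <- (link / any_char)*  with ordered choice: link is tried first."""
    nodes: List[Node] = []
    pos = 0
    while pos < len(src):
        result = parse_link(src, pos)
        if result is not None:
            node, pos = result
            nodes.append(node)
            continue
        # Second alternative: consume one character as plain text,
        # merging it into the previous text node when there is one.
        if nodes and isinstance(nodes[-1], str):
            nodes[-1] += src[pos]
        else:
            nodes.append(src[pos])
        pos += 1
    return nodes
```

For example, `parse_inline("See [[Main Page|home]] and [[Help]].")` yields the text `"See "`, a link to `Main Page` labelled `home`, the text `" and "`, a link to `Help`, and a final `"."`. An unclosed `[[` is simply kept as plain text.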

MediaWiki infrastructure
Resource loader — A feature to improve the load times for JavaScript and CSS in MediaWiki, enabling faster loading of the Vector skin, media extensions, and anything else that makes extensive use of JavaScript and CSS.
 * Status: The deployment of MediaWiki 1.17 to Wikimedia sites has surfaced many bugs. Roan Kattouw and Trevor Parscal have worked on fixing them, and were also available for an IRC office hour to help JavaScript maintainers fix compatibility issues. Those office hours resulted in a couple of documents that help people through the migration.
 * Program manager: Alolita Sharma

Wikimedia Labs
HTML5 media projects — A set of features to improve media handling and key infrastructure support tools, many developed with Kaltura, such as Metavid, MwEmbed, and the Video Editor.
 * Status: Michael Dale has been working on the integration of TimedMediaHandler and the Add Media Wizard with the Resource Loader. He has been converting his gadgets to extensions to help the integration process.
 * Program manager: Alolita Sharma

MediaWiki development
MediaWiki 1.17 deployment — Deployment of the latest MediaWiki version (1.17) to Wikimedia sites.
 * Status: In preparation for the planned deployment of MediaWiki 1.17 on February 8, all outstanding revisions were reviewed. The deployment was attempted twice that day, and eventually postponed because of major performance issues that caused an outage. The problems were investigated, and another plan was published, based on heterogeneous deployment (meaning not all wikis would run the same version of the software). Tim Starling and Roan Kattouw developed wmerrors, a PHP extension that displays error pages for PHP fatal errors. On February 11, a first wave of small wikis was switched to MediaWiki 1.17. On February 16, other small and medium-sized wikis were switched. An attempt to deploy to our biggest wiki (en.wikipedia.org) resulted in a short outage. The English Wikipedia and all remaining wikis were successfully upgraded to MediaWiki 1.17 later that day. Many issues encountered this month were due to the large number of code changes since the last release. In the future, software deployments should be smaller and happen more regularly, reducing the risk of repeated outages.
 * Program manager: Rob Lanphier
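
The heterogeneous deployment idea can be sketched as a per-wiki version map that is updated one wave at a time. This is only an illustration, not the actual Wikimedia configuration: the wiki names, version labels and function names below are hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical version labels, for illustration only.
OLD_VERSION = "1.16"
NEW_VERSION = "1.17"


def switch_wave(versions: Dict[str, str], wave: Iterable[str], target: str) -> Dict[str, str]:
    """Return a new wiki -> version map with every wiki in `wave` moved to `target`.

    The input map is left untouched, so an earlier state is easy to restore.
    """
    updated = dict(versions)
    for wiki in wave:
        if wiki not in updated:
            raise KeyError(f"unknown wiki: {wiki}")
        updated[wiki] = target
    return updated


def is_heterogeneous(versions: Dict[str, str]) -> bool:
    """True when the wikis are not all running the same version."""
    return len(set(versions.values())) > 1


# Hypothetical wiki names standing in for small, medium and large sites.
wikis = {"smallwiki": OLD_VERSION, "mediumwiki": OLD_VERSION, "bigwiki": OLD_VERSION}
after_wave_1 = switch_wave(wikis, ["smallwiki"], NEW_VERSION)
after_wave_2 = switch_wave(after_wave_1, ["mediumwiki", "bigwiki"], NEW_VERSION)
```

After the first wave the map is heterogeneous (one wiki on the new version, the rest on the old one); after the last wave every wiki runs the same version again. Keeping each wave small limits how many sites are affected if a switch goes wrong.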

MediaWiki 1.17 release — The upcoming MediaWiki release.
 * Status: Now that MediaWiki 1.17 has been deployed to all Wikimedia wikis, remaining bugs are expected to surface and be fixed. We're hoping to release MediaWiki 1.17 soon for third-party users (see draft release notes), but problems related to DBMS support may delay it. Its main feature will be the Resource loader. It will also include category collation improvements. Developers are already discussing MediaWiki 1.18.
 * Program manager: Rob Lanphier

Test framework deployment — Creation of an automated test environment for MediaWiki using CruiseControl, Selenium, and PHPUnit.
 * Status: Foundation work on this was put on hold pending the 1.17 release. We're now planning on publishing an open request for proposals calling for developers to move this work forward. In the community, Markus Glaser continues to add support for database setup inside the Selenium framework.
 * Program manager: Rob Lanphier

Technical Documentation – Improvement of our technical documentation by making small, incremental improvements to the docs and docs process.
 * Status: The initial phase of this effort was wrapped up in February. We plan to put Foundation work on this on hold while we shift focus to Volunteer Developer services.
 * Program manager: Rob Lanphier

Wikimedia analytics
udp2log — A custom data analytics logging system.
 * Status: We initially attempted to deploy the multicast version of udp2log, but we discovered firmware problems in our routing infrastructure. Our plan is now to have a second machine that receives unicast logging messages that we use for secondary services.
 * Program manager: Rob Lanphier

OWA — Installation and customization of an Open Web Analytics (OWA) platform to process data to support decision making.
 * Status: We're testing OWA integration on private wikis with the goal of understanding its reporting characteristics and how sharing them publicly would work. We've begun a second engagement with OWA's author, Peter Adams, to ensure that it fully complies with our Privacy Policy while still allowing us to publish summary reports. We're evaluating public projects to run the next pilot against (since Fundraising is concluded).
 * Program managers: Rob Lanphier & Tomasz Finc

(find name in svn?) wikilytics

Mobile
Mobile site rewrite — Port of our existing gateway to another framework for easier support & collaborative development.
 * Status: We're still in hiring mode, looking for a great developer to lead our efforts. At the same time, we're also putting together a roadmap for our mobile development, and starting to coordinate research and development. We're drafting a survey now.


 * Program manager: Tomasz Finc

Offline
Wikipedia version tools — Support and development of a series of tools to select Wikipedia content for offline use.
 * Status: Currently, offline copies of Wikipedia content are generated by the Wikipedia 1.0 team. Since many in the community would like to see more options, Arthur Richards is actively assessing the codebase on the toolserver to understand the work involved in extending the current toolset.
 * Program manager: Tomasz Finc

OpenZim integration into the Collections extension — Support and development of a standard file format for offline Wikimedia content.
 * Status: PediaPress has wrapped up its first development push for adding OpenZim support to the Collections extension. Testers are invited to test the new extension on PediaPress' test wiki. We're now collecting bug reports before deploying it to the live site.
 * Program manager: Tomasz Finc

Kiwix UX study — Evaluation of the user experience of the Kiwix mobile app to access offline Wikimedia content.
 * Status: We've finished our first UX pass over Kiwix and published the recommendations on the Kiwix wiki. Emmanuel Engelhart is implementing some of these new features while we gear up for the next phase of assessment. At the same time, we're engaging with the local Wikimedia community in India to see how well the tool is working.
 * Program manager: Tomasz Finc