Wikimedia Engineering/Report/2011/November

Major news in November includes:
 * The completion of the Coding challenge, and two coding events in India and the UK;
 * Continued infrastructure work in our data centers to improve performance and reliability, and on the Labs project;
 * Progress on the Visual editor and its back-end;
 * New versions of the Feedback Dashboard and the Upload Wizard, bringing new and long-awaited features;
 * Fundraising engineering going full-swing, in parallel with the annual fundraising campaign;
 * The final release of MediaWiki 1.18.0.

Recent events

 * October 2011 Coding Challenge (20 October–7 November, online) — The coding challenge was successfully wrapped up; the submissions have been published and will be judged in December.


 * India hackathon (18–20 November, Mumbai, India) — Approximately 80 Indian participants came to this week-end hackathon, focusing on Wikimedia mobile, offline, and language issues. Participants made great strides in mobile. The internationalization effort also benefited, with new input methods for MediaWiki, readying Narayam for Wikimedia Incubator, a prototype onscreen keyboard built in Narayam, Wikimedia Mobile ready for translation, new UI prototypes for language selection, and other new features. Participants localized Kiwix into several Indic languages. Volunteer Development Coordinator Sumana Harihareswara summarized the event on wikitech-l.


 * Brighton hackathon (19–20 November, Brighton, England) — Lewis Cawte and Tom Morris organized a general MediaWiki hackathon with approximately 10 participants. Attendees fixed bugs, reviewed outstanding patches from volunteers, enhanced OpenPlaques with the MediaWiki API, and discussed Wikinews and Wikimedia Commons.

Upcoming events

 * San Francisco hackathon (21–22 January 2012) — Erik Möller and Sumana Harihareswara continued to plan and started publicizing an outreach-focused developers' week-end. Experienced staff and volunteer developers will participate, teaching new developers about MediaWiki, the API and our framework for JavaScript feature development.

Job openings
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.


 * Developers and engineers:
 * Systems Engineer (Data Analytics)
 * Software Developer (Back-end, Data Analytics)
 * Software Developer (Rich Text Editing, Features)
 * Software Developer (Front-end)
 * QA Lead
 * Software Developer (Mobile)
 * Software Security Engineer


 * Management & Product:
 * Director of Features Engineering
 * Product Manager


 * Requests for proposals:
 * API Logging Analysis — Help us analyze the query logs of our API to better understand third party application usage of Wikimedia content and services.
 * XML Dumps — Help us improve the infrastructure used to build XML dumps of Wikipedia content, for backups and reuse by third parties.
 * Mobile UX — Help us redesign our mobile platform and apps as more and more visitors access Wikipedia and its sister sites via mobile devices.

Short news

 * Benny Situ and Rob Moen joined the Features engineering team as Software Developers, to work on Editor Engagement features (announcement for Benny, announcement for Rob).
 * Andrew Bogott joined the Operations team as a developer to work on the Labs Virtualization project.
 * Aaron Halfaker, former Summer of research fellow, joined the Features team as a research contractor for data analysis on editor engagement features.

Site infrastructure

 * Data Centers — Eight new database servers for external storage were deployed in Ashburn and Tampa, to retire about 30 aging servers. The goal was to add capacity, improve performance, consume less power and provide cross data-center data recovery and redundancy (SDTPA & EQIAD). This gave us back one rack of server space, which is significant given the space constraint at SDTPA. Also in EQIAD, servers for bits.wikimedia.org and Fundraising reporting were migrated and are now in production. In Amsterdam, SSDs were added to the ESAMS Squid servers to improve read performance, and we upgraded the ESAMS core switch (csw2-esams). The KNAMS router was moved because we changed to a new hosting provider, though still within the same datacenter. In Tampa, 66 servers were retired, donated and shipped to various non-profits. More generally, our configuration management tool Puppet was upgraded and deployed to all of our lab and production servers; a Puppet dashboard was also implemented.
 * New MySQL Package — A new Debian package for mysqlatfacebook 5.1.53 was successfully built, using the latest version of the Facebook patchset. Initial production data from two hosts looks promising.


 * HTTPS — The HTTPS Everywhere project released version 1.2 of its Firefox plugin in November. This version includes the updated Wikimedia ruleset written by Roan Kattouw, which redirects users to the new HTTPS URLs rather than to the old secure gateway.


 * Site outage — There was a short outage on November 27th, caused by a combination of a surge in traffic and a cache stampede.
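
For readers unfamiliar with the term, a cache stampede occurs when a cached value is missing or expired and many concurrent requests all try to regenerate it at the same time. The sketch below illustrates one common mitigation, per-key locking, so that only one request recomputes the value while the others wait. It is a generic illustration, not the fix applied on Wikimedia's servers; the class, function and key names are all hypothetical.

```python
import threading
import time


class StampedeGuardCache:
    """Tiny in-memory cache that lets only one caller recompute a missing key."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._values = {}                  # key -> (expiry timestamp, value)
        self._locks = {}                   # key -> lock guarding recomputation
        self._meta_lock = threading.Lock()

    def _lock_for(self, key):
        with self._meta_lock:
            return self._locks.setdefault(key, threading.Lock())

    def _fresh(self, key):
        entry = self._values.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry
        return None

    def get(self, key, compute):
        entry = self._fresh(key)
        if entry is not None:
            return entry[1]
        with self._lock_for(key):
            # Re-check: another thread may have filled the cache while we waited.
            entry = self._fresh(key)
            if entry is not None:
                return entry[1]
            value = compute()
            self._values[key] = (time.monotonic() + self.ttl, value)
            return value


def simulate_burst(request_count):
    """Fire concurrent requests at a cold cache; return how many recomputations ran."""
    computations = []

    def expensive_query():
        computations.append(1)   # list.append is thread-safe in CPython
        time.sleep(0.05)         # stand-in for a slow database query
        return "totals"

    cache = StampedeGuardCache(ttl_seconds=60)
    threads = [
        threading.Thread(target=cache.get, args=("hypothetical-report", expensive_query))
        for _ in range(request_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(computations)
```

Without the lock, each of the concurrent requests in `simulate_burst` would run the slow query itself; with it, the query runs once and the remaining requests reuse the result.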

Testing environment

 * Virtualization test cluster — OpenStack Nova was upgraded from the Cactus release to Diablo. A GlusterFS filesystem was added on all compute nodes via Puppet, to act as storage for the instances. A default sudo policy was also added for instances: project members now have sudo permissions, except in global projects. Shared home directories are also available on a per-project basis. So far, 15 projects and 36 instances have been created, and 46 people have been given Labs accounts. The GlusterFS installation and the recent Puppet master and client upgrades were implemented and tested in Labs before going into production.

Backups and data archives

 * Data Dumps — While the dumps keep rolling along, we are talking with another organization interested in mirroring them. If you know someone with several terabytes of space who might be interested, please send them our way. All hosts had their kernel updated for security reasons, and new dump code was deployed as well. This month we also rolled out a new experimental service: daily adds/changes dumps for all projects. These dumps contain no information about pages deleted, undeleted or moved since previous dumps, but they do include all new content since the previous day's run. The first adds/changes run for the English-language Wikipedia took less than 30 minutes to build.
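
As a rough illustration of what an adds/changes dump contains, the sketch below selects every revision newer than the last one covered by the previous run. Using a revision-ID cutoff is an assumption made for this example, and the `Revision` type and `adds_changes_since` function are hypothetical names, not part of the dump code. As in the service described above, deletions, undeletions and moves are not represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    rev_id: int
    page_title: str
    text: str


def adds_changes_since(revisions, last_dumped_rev_id):
    """Return (new revisions in ID order, cutoff to store for the next run)."""
    new = sorted(
        (r for r in revisions if r.rev_id > last_dumped_rev_id),
        key=lambda r: r.rev_id,
    )
    # If nothing is new, keep the old cutoff so the next run starts from the same point.
    next_cutoff = new[-1].rev_id if new else last_dumped_rev_id
    return new, next_cutoff
```

Each daily run would call `adds_changes_since` with the cutoff returned by the previous run, so that consecutive dumps neither overlap nor leave gaps.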

Editing tools

 * Visual editor — Trevor Parscal fixed issues blocking the synchronization of structural edits to the user interface, refactored and cleaned up the code, and mapped out tasks and features to be supported. He also finished the document transaction functionality and made progress on an undo/redo system. Roan Kattouw added tests, rewrote some code to make the tests pass, and fixed a number of bugs and issues, notably in Internet Explorer. Inez Korczynski continued to work on content insertion, deletion and selection and fixed numerous bugs. Gabriel Wicke extended the PEG parser for robust larger-scale parsing. He converted the PEG parser into a combined wiki and HTML tokenizer that feeds to a HTML5 DOM tree builder. He implemented several wikitext features (lists, italics, bold) as token stream transformations. 139 of about 660 parser tests are now passing.
 * Internationalization and localization tools — The code of the WebFonts extension was reviewed, and deployment is planned for December. The Narayam input methods extension now remembers the five last used schemes, and a basic version of message group workflows was implemented in the Translate extension. A major focus of the team in November was the India hackathon, during which a lot of work got done. New Narayam keyboard mappings were developed, and WebFonts were tested further.
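
The "five last used schemes" behaviour mentioned for Narayam can be pictured as a small most-recently-used list. The following is a minimal sketch of that idea only; it is not the extension's code, and the function name and scheme identifiers are hypothetical.

```python
def remember_scheme(recent, scheme, limit=5):
    """Return a new list with `scheme` first, without duplicates, capped at `limit`."""
    updated = [scheme] + [s for s in recent if s != scheme]
    return updated[:limit]


# Selecting a sixth scheme drops the oldest; re-selecting one moves it to the front.
recent = []
for choice in ["scheme-a", "scheme-b", "scheme-c", "scheme-d", "scheme-e", "scheme-f", "scheme-b"]:
    recent = remember_scheme(recent, choice)
```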

Participation and editor retention

 * Article feedback — Fabrice Florin completed the feature requirements, mockups and project plan for version 5, and led development of the first beta code by OmniTI. Dario Taraborelli started to work on the version 5 data model, metrics requirements and data analysis plan. He also worked with Fabrice and Howie Fung on guidelines and requirements for the marking of AFT-generated edits and account tracking. A series of real-time dashboards for AFT data were uploaded to the Toolserver. Roan Kattouw reviewed the proposed database schema, and showed the OmniTI team how to deploy code to the prototype server.
 * Feedback Dashboard — Dario Taraborelli updated the research page dedicated to MoodBar to include responses. Benny Situ fixed bugs, and implemented and deployed the feedback response API. Rob Moen deployed in-line reply functionality, giving experienced editors the ability to respond to MoodBar feedback while staying on the dashboard, as well as bug fixes and admin action enhancements.

Multimedia Tools

 * UploadWizard — Neil Kandalgaonkar and Ian Baker deployed a set of important improvements, including multi-file selection for browsers which support it, custom wikitext licenses, an improved licensing workflow, basic support for location data extraction, and more. Support for chunked uploading (which improves reliability of large file transfers) was temporarily backed out and is still being worked on.

MediaWiki infrastructure

 * ResourceLoader — Roan Kattouw wrote a maintenance script for migrating gadgets on a wiki from the old format to the new Gadgets 2.0 format. Roan and Timo Tijhof also fixed several bugs.

Media Labs

 * Multimedia — Ian Baker and Neil Kandalgaonkar continued to review Michael Dale's code to prepare it for deployment.

Mobile

 * Mobile Research — Mani Pande and Parul Vora presented the findings of the India and Brazil report to the mobile team. They are currently consolidating the report, videos, photographs and other media into a wiki format. The cleaning of data from the mobile survey continues.


 * MobileFrontend —


 * Android Wikipedia App — We're tidying up the app now and would love to get testers. Send a mail to the mobile-l-feedback list if you want to be involved.

Fundraising support

 * 2011 Fundraiser — The annual fundraiser launched, with support for 76 new credit card currencies, including some which have been long-desired: Rupees, Russian Rubles, and Brazilian Reals. The fundraising engineering team added support to the DonationInterface extension for JCB credit card donations, BPay in Australia, 3 new real-time banking options including iDeal, direct debit for 6 countries, manual bank transfers in more than 50 countries, and Webmoney, with more on the way. DonationInterface also benefited from enhancements to the RapidHtml form templating system. From an operations perspective, databases were migrated for increased capacity and stability; the data center failover capacity was also consolidated by adding aluminium.wikimedia.org as a mirror of grosley.wikimedia.org (different data centers, identical hosts), which provide the bulk of our miscellaneous fundraising services (including CiviCRM). We also added support for translated 'thank you' emails to our ThankYou module for Drupal/CiviCRM. The ContributionReporting extension was enhanced to allow custom selection of fundraising years to display, but the feature was disabled after it caused a site outage due to cache stampeding. The FundraiserLandingPage extension was developed and deployed, making it easier to dynamically construct template calls for fundraiser landing pages depending on a potential donor's country. Last, the team fixed a number of new bugs and issues surfaced by the increased usage of the donation pipeline.

Offline

 * Kiwix UX initiative — Version 0.9 beta4 was released in November. Emmanuel Engelhart implemented a new filtering/sorting system in Kiwix's content manager. A new tool, kiwix-install, now makes it trivial to create new portable versions of Kiwix, with the ZIM file and full text search included. kiwix-install also made it possible to implement automated software and content updates. At the India hackathon, localisation for 3 new Indic languages was added, and the Kiwix team met with several organizations to discuss partnerships. Work has started on a Kiwix app for Android/ARM.

MediaWiki Core

 * MediaWiki 1.18 — Developers sprinted to fix the last blocker bugs, and the ones uncovered by testers. Sam Reed announced the first beta release and first release candidate of MediaWiki 1.18. Mark Hershberger followed up on comments on the English Wikipedia's Village pump related to the deployment of 1.18, bugs reported in bugzilla, and installation and upgrade reports. Sam announced the final 1.18.0 release on November 27th, as well as the 1.17.1 security release.
 * Code review management — Foundation engineers signed up in greater numbers for weekly slots to devote to the development community (including code review), and discussed how to speed up the code review that must precede the deployment of MediaWiki 1.19. Rob Lanphier also set code review goals and made projections for 1.19.
 * Continuous integration — Chad Horohoe and Antoine Musso created a Debian package for TestSwarm, so it can be installed in our common infrastructure with Jenkins. Antoine and Timo Tijhof wrote a script to fetch individual revisions of MediaWiki's code, in order to run tests against each one of them. PostgreSQL testing in Jenkins is planned for implementation in the coming weeks.
 * Git conversion — Chad Horohoe got a lot of help from the community on identifying unknown committers; user mapping is now complete. Gitorious was installed on a virtual machine, and a test git conversion of the MediaWiki code repository is now imminent. Meanwhile, there have also been discussions about e-mail aliases for LDAP users, to avoid disclosing private e-mail addresses. Chad and Brion Vibber worked on changes to the development workflow introduced by the move to git (e.g., when continuous integration tests get run).
 * VipsScaler — Tim Starling and Antoine Musso reviewed the VipsScaler extension, written by Bryan Tong Minh to use VIPS as an alternative to, and possible replacement for, ImageMagick as the image thumbnailing system. Bryan and Antoine also wrote a comparison tool, enabled on the test2 wiki (//test2.wikipedia.org/wiki/Special:VipsTest), to test both systems. The initial deployment is planned to be limited to images that would give an error with ImageMagick (for instance large PNG files).

Wikimedia analytics

 * Wikimedia Report Card 2.0 — Erik Zachte added new visualizations showing the geographical distribution of page views and mobile page views. Asher Feldman tweaked kernel parameters and Tim Starling made changes to the logging script (udp2log) to fix the packet loss issue on our logging servers.

Technical Liaison; Developer Relations

 * The "Technical Liaison; Developer Relations" team was featured on the Wikimedia Tech blog this month.


 * Bug management — In November, Mark Hershberger and Sumana Harihareswara led themed bug triage sessions focusing on non-MySQL databases, MediaWiki 1.18 bugs, and UploadWizard. A session was also dedicated to reviewing patches in bugzilla; volunteer Rusty Burchfield wrote a tool to check whether the patches could still be applied to trunk, and only 50 had not become obsolete due to bitrot. Mark watched for bugs and comments on local village pumps following the deployment of MediaWiki 1.18 to Wikimedia sites. He also continued to prioritize bugs and find developers to address those of highest priority.
 * Summer of Code 2011 — Sumana Harihareswara, Salvatore Ingala, Kevin Brown, and Yuvi Panda followed up on improvements and fixes, towards the goal of getting Ingala's, Brown's, and Panda's projects deployed on Wikimedia projects.
 * Engineering project documentation — Guillaume Paumier expanded the Platform engineering hub to include the list of current projects. He also created a similar hub for Features engineering and a portal for all Wikimedia Engineering. Last, he performed continuous maintenance of engineering project pages, and assembled this report.
 * Volunteer coordination and outreach — Sumana Harihareswara continued to follow up on contacts from the New Orleans hackathon and the GSoC mentor summit; she also provided support in the #mediawiki IRC channel. She did a lot of outreach for the India Hackathon 2011 and attended it to facilitate volunteer training and development, and worked on planning for the January 2012 San Francisco Hackathon. She administered the commit access review process and communicated about the improved process on wikitech-l. 12 developers received commit access in November, of whom 2 were Foundation staffers. Sumana and Guillaume Paumier started to consolidate training documentation to facilitate the onboarding of new developers.
 * MediaWiki architecture document — The Architecture of Open Source Applications book editors reviewed the first revision of the document and provided feedback. Guillaume Paumier addressed their comments, with the help of Sumana Harihareswara, who reached out to developers to research additional details. A second revision of the document was submitted to the editors, who sent it to technical reviewers for another review. The content of the second revision was integrated into the MediaWiki history and Manual:MediaWiki architecture pages.
 * Wikimedia blog maintenance — Guillaume Paumier fixed bugs and made tweaks to the theme, which is still hosted on github (until Wikimedia switches to git). There are plans to host a prototype blog in Wikimedia Labs to facilitate testing of new functionality.

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.