Wikimedia Engineering/Report/2011/December

Major news in December includes:
 * 2011/12/13/help-test-the-first-visual-editor-developer-prototype/
 * http://blog.wikimedia.org/2011/12/20/a-new-way-to-contribute-to-wikipedia/


Recent events

 * Judging for the October 2011 Coding Challenge continued and winners will be announced in January.

Upcoming events

 * San Francisco hackathon (21–22 January 2012) — Erik Möller and Sumana Harihareswara continued to plan and publicize this outreach-focused developers' weekend. Heather Walls developed a more attractive homepage for the event. Sumana began arranging tutorials and activities, focusing on mobile, the web-accessible API, and our framework for JavaScript feature development. Registration opened, and more than 70 participants registered.


 * Pune hackathon (10–12 February 2012) — Preparation began and registration opened for an outreach-focused developers' weekend in Pune, India, led by Alolita Sharma. Approximately 70 participants are expected; the event will focus on the gadgets framework, mobile Wikimedia access, and internationalisation.


 * GLAMCamp in Washington, DC (10–12 February 2012) — Ryan Kaldari and Asaf Bartov planned to attend the technical track of this GLAM conference. Engineers will work on mass upload and analytics functionality.

Job openings
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.


 * Developers and engineers:
 * Interaction Designer
 * Systems Engineer (Data Analytics)
 * Software Developer (Back-end, Data Analytics)
 * Software Developer (Rich Text Editing, Features)
 * Software Developer (Front-end)
 * QA Lead
 * Software Developer (Mobile)
 * Software Security Engineer


 * Management & Product:
 * Director of Features Engineering
 * Product Manager


 * Requests for proposals:
 * Executive Dashboard - Analytics — Help us improve and centralize the dashboard summarizing the most important data for both Wikimedia Foundation staff and projects such as Wikipedia to understand overall community health.
 * XML Dumps — Help us improve the infrastructure used to build XML dumps of Wikipedia content, for backups and reuse by third parties.
 * Mobile UX — Help us redesign our mobile platform and apps as more and more visitors access Wikipedia and its sister sites via mobile devices.

Short news

 * Yuvaraj Pandian and Max Semenik joined the mobile team as contract developers.
 * Sara Smollett joined the operations team as a part-time contractor.
 * Diederik van Liere, formerly with the Community Department, is now helping the engineering department as a contractor for analytics work.

Site infrastructure

 * Data Centers —


 * As part of our preparation for the migration of our media service to Swift, a distributed storage backend, we need to keep the current system afloat a bit longer. We noticed an abrupt uptick in the rate at which disk space for thumbnail storage was being consumed, but we traced it to its source and now have a plan for dealing with it. In the meantime, we reclaimed some space by purging thumbnails that were neither newly generated nor in use on any of our projects.
 * Performed Swift (thumbnail) integration and stress testing. Initial results helped us identify potential bottlenecks, and we are now working on mitigating those risks. On the performance test cluster, read performance is about 10x what we need, so we're good on that front. Write performance is only 2x what we need, but that is sufficient to move forward. We plan to run a test to see how adding a fourth storage node changes write throughput; if throughput increases, we'll be able to add nodes as our write needs grow. Both read and write performance slowed down a bit between 6 and 11 million objects in a single container. While they're still good enough, it's not a good sign for Commons. Reports on the web also suggest that, in general, performance drops beyond 10 million objects in a container. The easiest path forward is to shard the Commons container using the existing hashed characters in the URL, splitting it into 256 containers.
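
The sharding idea in the last item can be illustrated with a minimal sketch. It assumes that the "hashed characters" are the two-level directory pair seen in media URLs (for example "a/a9"), so the two-character directory yields 16 × 16 = 256 possible shards. The container base name `commons-thumb`, the function names, and the assumption that the directory pair comes from an MD5 hash of the file name are illustrative only and are not taken from the actual Swift configuration.

```python
import hashlib
import re
from typing import Optional

# Hypothetical base name for the sharded containers; not taken from the report.
BASE_CONTAINER = "commons-thumb"

# Matches the ".../a/a9/..." directory pair in a media URL path: one hex
# character, then a two-character directory starting with that same character.
HASH_DIRS = re.compile(r"/([0-9a-f])/(\1[0-9a-f])/")


def shard_from_url_path(path: str) -> Optional[str]:
    """Return the two-hex-character shard key embedded in a media URL path."""
    match = HASH_DIRS.search(path)
    return match.group(2) if match else None


def shard_from_name(filename: str) -> str:
    """Derive a shard key from a file name (assumes MD5-based hash directories)."""
    key = filename.replace(" ", "_")  # titles are assumed to use underscores
    return hashlib.md5(key.encode("utf-8")).hexdigest()[:2]


def container_for(shard: str) -> str:
    """Map a shard key to one of the 256 container names."""
    return f"{BASE_CONTAINER}.{shard}"


if __name__ == "__main__":
    path = "/wikipedia/commons/thumb/a/a9/Example.jpg/120px-Example.jpg"
    print(container_for(shard_from_url_path(path)))
```

Because the shard key is already present in every request URL, a front end could pick the container without an extra lookup; if the hash spreads names evenly, each container would hold roughly 1/256 of the objects.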


 * Deployed puppetmaster dashboard
 * Deployed a UDP-based profiling tool to identify potential performance issues - http://wikitech.wikimedia.org/view/UDP_based_profiling
 * DB servers refresh
 * OTRS migrated to new infrastructure
 * S6 cluster - db50


 * HTTPS
 * HTTPS support was added for the mobile version of Wikipedia. After an initial testing period, we'll enable it for further mobile sites. A number of miscellaneous services also had HTTPS either set up or fixed.

Testing environment

 * Wikimedia Labs —
 * Created a Server Admin Log (SAL) for every project, and a combined SAL (https://labsconsole.wikimedia.org/wiki/Server_Admin_Log)
 * adminbot was packaged and puppetization is underway thanks to hyperon
 * Petrb added Nagios to labs, and used Semantic MediaWiki queries to configure host service checks
 * Cluebot was deployed in labs, for bot infrastructure testing
 * testswarm was configured, tested, and puppetized in labs by krinkle
 * The reportcard service was moved from project2 to labs, for further development and testing
 * OpenStackManager 1.3 was released and deployed for labs
 * LdapAuthentication 2.0a was deployed for labs - release pending
 * Live migration of instances has been enabled for the OpenStack Nova infrastructure, allowing hardware updates and upgrades without bringing instances down
 * A gluster storage cluster has been ordered for use as volume storage
 * There are now 33 projects, 52 instances, and 74 users

Backups and data archives

 * Data Dumps — The year closed out with another full dump of the English-language Wikipedia, completed on schedule. More work was done on code to allow the history phase of a dump to be restarted from a specified point without a long catch-up delay. An experimental service was tested this month: a newly formatted file of article content with an accompanying index, which is more convenient for data analysts and for use with offline readers. The first such files, currently available only for the English Wikipedia, have been published along with a brief explanation of their contents.

Editing tools

 * Visual editor —
 * Internationalization and localization tools —

Participation and editor retention

 * Article feedback —
 * Feedback Dashboard —

Multimedia Tools

 * UploadWizard —
 * TimedMediaHandler —

MediaWiki infrastructure

 * ResourceLoader —

Mobile

 * Mobile Research — Mani Pande and Parul Vora consolidated all the research findings from Brazil, India, and the USA into one report. It's currently being converted to PDF and wikitext to facilitate its publication.


 * MobileFrontend —


 * Android Wikipedia App — Several release candidates were published over the month, and we're nearing completion of the first version of the app. Nightly builds are available for testing.


 * WikipediaZero — We began work on the infrastructure for zero-rated Wikipedia access. Next month, we'll start testing with one of our partners to work out the kinks of giving users free data access to Wikipedia.


 * GPS Storage/Retrieval — Max Semenik joined the mobile team and began prototyping an API to store and retrieve GPS coordinates on our wikis. This will be a critical component of the mobile projects and will replace our existing use of GeoNames.org.


 * Featured Article RSS — Max Semenik built the first version of an extension to expose Featured Articles, In the news, and other main page content so that our partners can better re-use our data.

Fundraising support

 * 2011 Fundraiser —

Offline

 * Kiwix UX initiative —

MediaWiki Core

 * The "MediaWiki Core" team was featured on the Wikimedia Tech blog this month.


 * MediaWiki 1.19 —
 * Continuous integration —
 * Git conversion —
 * VipsScaler —
 * SwiftMedia —
 * HipHop deployment —

Wikimedia analytics

 * Wikimedia Report Card 2.0 —

Technical Liaison; Developer Relations

 * Bug management —
 * Summer of Code 2011 —
 * Engineering project documentation —
 * Volunteer coordination and outreach —
 * MediaWiki architecture document —
 * Wikimedia blog maintenance —

Future
The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.