Wikimedia Engineering/Report/2014/September

Major news in September includes:
 * a call for candidates for the Free and Open Source Software Outreach Program for Women;
 * a roundtable discussion between the Language engineering team and editors from the Catalan language Wikipedia, focusing on the Content Translation tool.

Engineering metrics in September:
 * 151 unique committers contributed patchsets of code to MediaWiki.
 * About 27 shell requests were processed.

Upcoming events
There are many opportunities for you to get involved and contribute to MediaWiki and technical activities to improve Wikimedia sites, both for coders and contributors with other talents.

For a more complete and up-to-date list, check out the Project:Calendar.

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.

 * Senior Software Engineer - Services
 * Software Engineer - Maps & Geo - Mobile
 * Software Engineer - Mobile - iOS
 * Release Engineer
 * Technical Writer
 * Full Stack Developer - Analytics
 * Research Analyst
 * Agile Coach/ScrumMaster - Team Practices Group
 * Senior Technical Product Manager
 * Community Liaison
 * Community Liaison (PT Contract)
 * Operations Security Engineer
 * UX Senior Designer
 * UX Senior Design Researcher
 * UX Visual Design Fellowship
 * Mobile Partnerships Regional Manager

Announcements

 * Damon Sicore joined the Wikimedia Foundation as Vice President of Engineering (announcement).
 * Rachel Farrand joined the Engineering Community Team as Events Coordinator (announcement).
 * Jeff Hobson joined the Wikipedia Zero engineering team (announcement).
 * Daisy Chen joined the UX Research team as Associate Design Researcher (announcement).

Technical Operations
Dallas data center
 * In September we set up (backup) replication of most project data, including core databases and external storage. Work on Swift images and system backups was still ongoing into October. Essential system infrastructure such as an installation server, DNS, LVS and NTP has been deployed as well.

Tampa data center
 * We started the last push to get the remaining services & systems out of our Tampa data center, with a deadline for shutdown of all systems on October 1st. The remaining services included PDF generation, mail servers, noc.wikimedia.org and LDAP.

Labs metrics in September:
 * Number of projects: 146
 * Number of instances: 415
 * Amount of RAM in use (in MBs): 1,996,288
 * Amount of allocated storage (in GBs): 20,435
 * Number of virtual CPUs in use: 977
 * Number of users: 4,083

Wikimedia Labs
 * Wikitech (the Labs web interface) is now managed via the standard WMF deployment system. This should allow for more frequent MediaWiki updates and overall greater stability.
 * The last historic remaining dependencies on our old Tampa datacenter (e.g. LDAP and Labs DNS backup servers) were finally stamped out and replaced with dependencies on Dallas hardware.
 * One of the labs virtualization hosts (virt1006) was suffering intermittent problems, so all affected instances were migrated to other hosts in order to stave off possible future disaster. Consequently, Labs is a bit short on virtualization space, but new hardware procurement is under way.
 * Several long-unused instances and projects were cleaned up in order to free up more space.
 * The last of the ToolLabs replica DB servers was upgraded to MariaDB 10.

Editor retention: Editing tools
VisualEditor
In September, the team working on VisualEditor expanded browser support, improved some features, and fixed nearly 60 bugs and tickets.

Users of Internet Explorer 10, who we were previously preventing from using VisualEditor due to some major bugs, will now be able to use VisualEditor; this follows on from Internet Explorer 11 support last month. When editing a template with a required field, VisualEditor now warns you to avoid leaving it blank, and you can now create auto-numbered links using VisualEditor.

Improvements and updates were made to a number of interface messages as part of our work with translators to improve the software for all users, based on feedback from users and user testing. We made progress on table structure editing and auto-filled citations, both of which will be coming soon.

The deployed version of the code was updated five times in the regular release cycle (1.24-wmf20, 1.24-wmf21, 1.24-wmf22, 1.25-wmf1 and 1.25-wmf2).

Editing
In September, the Editing Team made substantial progress on front-end standardisation, as well as the work on VisualEditor which is reported separately. The team welcomed Bartosz "MatmaRex" Dziewoński as a new team member, and existing student member Moriel Schottlender converted to full-time status.

The team's work on front-end standardisation is focussed on improving libraries and infrastructure, and in particular, the OOjs UI library. This included the creation of a MediaWiki theme in collaboration with the Design team, which can be explored in the online demo; this will be deployed into MediaWiki's use of OOUI in the next few weeks. A number of bugs were fixed, including problems with window and popup sizing and overflow item placement, and workarounds were added for some browser bugs in Firefox and Safari. A number of minor issues in the code documentation were corrected, and the build process was extended to create a minified distribution. The OOjs library was updated to fix a minor bug, with a new version (v1.1.1) released and pushed downstream into MediaWiki, VisualEditor and OOjs UI.

The TemplateData extension now supports the " " parameter property, a wikitext value that a parameter can be set to have inserted by default if desired. Also, the specification for TemplateData was re-written to be clearer and more consistent. Next month the TemplateData GUI editor will be made available on all Wikimedia wikis.

Parsoid
In September, we continued to fix bugs, upgraded libraries, and made additional progress towards improving compatibility with PHP parser + Tidy rendering. Specifically, Parsoid's paragraph wrapping now targets the PHP parser + Tidy output rather than the PHP parser output alone. We also continued to update Parsoid's CSS and rendering to more closely match the current rendering, and improved Parsoid's robustness in edge cases (pathological backtracking, parsing of very large wikitext tables). Part of the Parsoid team was also busy with the PDF rendering service, which was successfully launched at the end of September.

Services
September saw a lot of activity on the RESTBase storage and API service. A new 'pagecontent' composite bucket type using revisioned blob buckets was introduced. This uses the by-now fairly rich table storage backend to provide functionality similar to MediaWiki's revision table, and supports any number of revisioned types of content (like HTML, wikitext, JSON metadata) associated with each revision.

Work on secondary index updates continued at full steam, and is now close to being merged.
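
To make the idea of a revisioned 'pagecontent' bucket more concrete, here is a minimal in-memory sketch in Python. It is not RESTBase code and does not reflect its API or storage schema: the class name `PageContentBucket` and its methods are hypothetical, chosen only to illustrate how several content types (HTML, wikitext, JSON metadata) can hang off each revision of a page.

```python
class PageContentBucket:
    """Toy model of a revisioned content store (illustration only)."""

    def __init__(self):
        # Layout: title -> revision id -> content type -> blob
        self._pages = {}

    def put(self, title, revision, content_type, blob):
        """Store one blob of the given content type for a page revision."""
        revisions = self._pages.setdefault(title, {})
        revisions.setdefault(revision, {})[content_type] = blob

    def get(self, title, content_type, revision=None):
        """Fetch a blob; without a revision, use the newest one that has it."""
        revisions = self._pages.get(title, {})
        if revision is None:
            candidates = [r for r, types in revisions.items()
                          if content_type in types]
            if not candidates:
                raise KeyError((title, content_type))
            revision = max(candidates)
        return revisions[revision][content_type]

    def revisions(self, title):
        """List the known revision ids of a page in ascending order."""
        return sorted(self._pages.get(title, {}))


# Hypothetical sample data
bucket = PageContentBucket()
bucket.put("Main_Page", 1, "wikitext", "'''Hello'''")
bucket.put("Main_Page", 1, "html", "<b>Hello</b>")
bucket.put("Main_Page", 2, "wikitext", "'''Hello, world'''")

latest_wikitext = bucket.get("Main_Page", "wikitext")
latest_html = bucket.get("Main_Page", "html")
```

In this sketch a lookup without an explicit revision falls back to the newest revision that actually has the requested content type, which is one possible design choice among several.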

Core Features
Flow/Project information
In September, the Flow team enabled new test pages on French WP and Hebrew WP. The French test is for the Forum des Nouveaux, a help space for new contributors (similar to the Teahouse on English WP). The Forum des Nouveaux hosts reached out to the Flow team after Wikimania, excited to try out the new discussions system. The Hebrew WP test is helping the team diagnose problems for right-to-left languages, and general i18n issues.

The team also refined the new Echo notifications functionality, with lots of feedback from contributors on Mediawiki.org and En.wp. New topic notifications are now bundled in Echo, and we fixed several bugs related to the behavior of the Alerts and Messages tabs, and getting excess mention notifications.

Growth
In September, the Growth team shut down, with workflows shifting into the mainstream of other teams.

Mobile
Wikimedia Apps
In September, the Mobile Apps Team released a new version of the iOS app containing the Nearby feature, which shows you articles about things that are near your location, and a references panel that pops up whenever you tap a reference. The team also released an iOS 8 compatibility build, and spent time on code quality improvements and refactoring in both the iOS and Android apps.

Mobile web projects
This month the mobile web team focused on the first prototype of WikiGrok, a new contribution feature that asks users who are reading Wikipedia articles to help add information about the article subject that is missing from Wikidata. Over the course of the month, we built and user-tested the first experimental interface for allowing users to contribute to Wikidata: a simple binary question mode that provides the user with a suggested occupation on biographies that are missing this information in Wikidata but contain a possible occupation in the Wikipedia article. In this early test phase we are storing the replies in a separate database, not pushing to Wikidata. We plan to add suggestions for more Wikidata fields and test this version against a slightly more complex tagging interface in beta in October.

Wikipedia Zero
During September 2014, the team built more Partners Portal architecture, including Graph extension integration components for eventual display of aggregate statistics to zero-rating partners (it already works and is being reviewed in house). The team also expanded support for dynamic zero-rating banners while enhancing JSON configuration extension code and issuing bugfixes. Additionally, the team shrank the size of the Wikipedia favicon to reduce bandwidth usage by users across the web. On the partner side, we launched Telenor Myanmar in September.

Finally, the team welcomed its newest software engineer, Jeff Hobson, to the Wikimedia Foundation!

Language Engineering
Language tools
The CLDR extension was updated to version 26 and entries identical to CLDR were removed from LocalNamesEn.php. The team made RTL fixes in core, Echo and Wikibase, and tested Flow for RTL support. Maintenance of the Translate extension continued, and the performance of translation memory was improved on ElasticSearch with the help of Nik Everett.

Language engineering communications and outreach
The Language Engineering team hosted a round-table with Catalan Wikipedia editors who use ContentTranslation to gather feedback about their experience with the tool. OPW mentors started coordinating with candidates interested in internationalization projects.

Content translation
The second version of the tool was released. This version has not yet been deployed due to technical issues in the Labs setup; this is currently being resolved with the Ops team. Notable improvements include:
 * a basic formatting toolbar (for Chrome);
 * more accurate warnings for unchanged machine translated content;
 * design improvements for the top bar and progress bar;
 * bi-directional support for Spanish-Portuguese machine translation;
 * link adaptation improvements.

The team is performing ongoing tests with users for Spanish-Portuguese and Portuguese-Spanish translations, and has started planning for the third release.

MediaWiki Core
Admin tools development
The majority of admin tools resources are currently diverted towards the SUL finalisation project. In September, minor UX enhancements were applied to the Special:CentralAuth page.

Search
In September we worked to mitigate the performance bottleneck that we found in August. We found there to be no silver bullet, but used the information we learned to pick and order appropriate hardware to handle the remaining wikis. We also implemented our significantly improved wikitext Regular Expression search.

In October we've begun rolling out the wikitext Regular Expression search and received some of the hardware we need to finish cutting over the remaining wikis. We believe we'll get it all installed in October and cut the remaining wikis over in November.

SUL finalisation
In September, the team wrapped up the feature development for SUL finalisation. One part of the work (the steward end of the rename request form) is outstanding and will be finished in October. In October, the team is planning to proceed into deployment and testing of the features.

Security auditing and response
We published the 1.23.4 security release, and completed review for the Graph and Imagemetrics extensions.

Multimedia
In September, the multimedia team developed and released a first round of new improvements to Media Viewer, based on feedback from our recent community consultation and ongoing user research.

These improvements aim to make Media Viewer easier to use by readers and casual editors, with these features: a more prominent "More Details" button, linking to the File: page; separate icons for "Download" and "Share or Embed" features; and an easier way to enlarge images by clicking on them. Next, we plan to work on an easier way to disable Media Viewer for personal use and a caption or description right below the image. We would like to thank all the community members who suggested these improvements. Our research suggests that they offer a better user experience that is both clearer and simpler.

This month, we also ramped up the Structured Data project, in collaboration with community members and the Wikidata team: in October, we will start developing a first prototype for a high-end API that can read and write machine-readable data on Wikimedia Commons, to be followed by a wider deployment in coming months. In parallel, the Foundation is also launching a file metadata cleanup drive to add machine-readable attributions and licenses on files that lack them, spearheaded by Guillaume Paumier. To learn more, join our Structured Data Q&A on Thursday, October 16 at 18:00 UTC, for an office hours chat on Freenode IRC.

We also continued our code refactoring for the UploadWizard, and started to collect metrics for an upload funnel analysis, to find out how many users drop out at each step of the upload and where failure is occurring, so we can prioritize bug fixes. For more information about our work, join the multimedia mailing list.
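
As a rough illustration of what an upload funnel analysis computes, the Python sketch below derives the share of users lost between consecutive steps. It is not the team's metrics code: the function name `funnel_dropoff`, the step names and all the counts are hypothetical and chosen only to show the calculation.

```python
def funnel_dropoff(step_counts):
    """Return (step, users, fraction lost before the next step) per step.

    step_counts is an ordered list of (step name, users reaching it).
    The last step has no successor, so its drop-off is reported as 0.0.
    """
    report = []
    for (step, users), (_, next_users) in zip(step_counts, step_counts[1:]):
        lost = (users - next_users) / users if users else 0.0
        report.append((step, users, lost))
    last_step, last_users = step_counts[-1]
    report.append((last_step, last_users, 0.0))
    return report


# Hypothetical step names and counts, for illustration only
steps = [
    ("start", 1000),
    ("select files", 800),
    ("release rights", 600),
    ("describe", 540),
    ("published", 513),
]

report = funnel_dropoff(steps)
for step, users, lost in report:
    print(f"{step:15} {users:5d} users, {lost:.0%} lost before next step")

# The step with the largest drop-off is the first candidate for bug fixing.
worst_step = max(report, key=lambda row: row[2])[0]
```

Ranking steps by drop-off in this way is one simple method of deciding where bug fixes are likely to have the most effect.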

Engineering Community Team
Bug management
The "Commons App" product was closed as no further development is planned currently. A new product was created for the new PDF renderer infrastructure and numerous components were created.

Phabricator/Migration
phabricator.wikimedia.org was set up with tickets imported from the previous Labs instance (public registration will be enabled once all remaining tasks have been sorted out). Restricting access to Phabricator tasks based on project membership was implemented. Inbound email was configured so Phabricator lets you interact with external (non-Phabricator) users via email. A certificate for phab.wmfusercontent.org for file uploads was set up, and Operations set up SNI on misc-web-lb, made it work with nginx, added IPv6 access, and fixed an error when trying to log in via HTTP. The required legal footer was set up with its license, terms of use and correct links. Reedy added phabricator to the interwiki map, and Helder started adapting the 'tracked' gadget on MediaWiki to query the Phabricator API. Many further smaller fixes took place. Furthermore, Quim improved the Phabricator documentation and help, and Andre showed the very basics of Phabricator in a video. Daniel and Yuvi set up a new Phabricator test instance on https://phab-01.wmflabs.org/ that anybody is welcome to test.

Analytics
Analytics/Wikimetrics
Work was done on the following metrics:
 * Rolling New Active Editor - Implemented
 * Rolling Surviving New Active Editor - Implemented
 * Pages Created and Edits - Updated with a reporting configuration option to include changes to deleted pages (this is the default).
 * Metrics with ‘Namespaces’ as a parameter let you specify “All Namespaces.” Leave the input field blank to do so.
 * Rolling recurring old active editor is implemented, but is not yet fast enough for us to enable it on the production servers.
 * The status of the implementation of Standardized Metrics defined by the Research Team is here: https://meta.wikimedia.org/wiki/Research:Metrics_standardization/Implementation

Analytics/Data Processing
A weekly update is posted to the Analytics mailing list, with a summary at the top of each email. Here are the links to related posts in the archives:
 * 2014-09-01--2014-09-07
 * 2014-09-08--2014-09-14
 * 2014-09-15--2014-09-21
 * 2014-09-22--2014-09-28

Analytics/Editor Engagement Vital Signs
The Vital Signs dashboard is now live. We are calling it “Vital Signs” because it will eventually display content and readership metrics, not just Editor Engagement metrics.

Vital Signs was presented at the Analytics Quarterly Review as well as the October WMF Metrics meeting.

Analytics/EventLogging
Work was performed to clean up some EventLogging tables per the privacy policy.

Analytics/Research and Data
This month we onboarded Ellery Wulczyn as the newest addition to the Research & Data team. Ellery recently finished a Computer Science Masters program at Stanford and joins us as a full-time research analyst after completing a summer fellowship with University of Chicago's Data Science for Social Good program. His focus at WMF is going to be fundraising research and analytics. Welcome, Ellery!

We completed the definitions, documentation and requirements for a new set of metrics to be implemented in Vital Signs.

We completed a first draft of a page view definition, which is currently being discussed. We supported the mobile team with baseline traffic reports for Apps and Mobile Web.

We participated in the preparatory sessions for the design of an open consultation led by the Community Liaison team as well as in regular meetings to support the strategy consultation process.

We held our Q1-2015 quarterly review, reviewed the team's progress against Q1 goals and posted our proposed Q2 goals.

Kiwix
The Kiwix project is funded and executed by Wikimedia CH.


 * We made progress in our work with our partner Bookeen to get an e-ink reader able to read Wikipedia offline. We managed to get a first version of the device firmware working, and it will be tested in the field as part of the Malebooks pilot deployment.


 * As a consequence of a bug fixing sprint with Parsoid and Wikisource developers at Wikimania, we were finally able to generate usable ZIM files from Wikisource (example with fr.wikisource.org).


 * Work on the offline project Gutenberg continued and we are now almost ready to release. A few ZIM files are in testing, for example in German and in Spanish.


 * Kiwix was represented at the Selenium conference where we held a 2-day bug hunting session: 120 bugs were reported, of which 50% were fixed.


 * mwoffliner was improved to make it easier to use for everyone, in particular to make ZIM files for only a selection of articles. As a demonstration, we prepared a ZIM file containing all Wikipedia articles about medicine.


 * After many years, a new version of a tool to generate a live CD including Kiwix and Wikipedia content was released.

Wikidata
The Wikidata project is funded and executed by Wikimedia Deutschland.


 * In September, the Wikidata team focused on improving performance, doing groundwork for the new user interface design, and making it possible to track where data from Wikidata is used. In addition, they worked on tests and prepared for a week-long meeting with the WMF multimedia team and volunteers to discuss and plan structured data support for Wikimedia Commons.

Future

 * The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.