Wikimedia Engineering/Report/2014/November

Major news in November includes:
 * the release of the second version of the Content Translation tool, which heavily relies on Apertium for machine translation;
 * updates to MediaWiki's internationalization based on new CLDR data;
 * the move from Bugzilla to Phabricator as the new collaboration platform for the Wikimedia technical community.

Upcoming events
There are many opportunities for you to get involved and contribute to MediaWiki and technical activities to improve Wikimedia sites, both for coders and contributors with other talents.

For a more complete and up-to-date list, check out Project:Calendar.

Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.

 * Director of Engineering
 * Senior Software Engineer - Services
 * Software Engineer - Mobile - Android
 * Software Engineer - Wikipedia Zero
 * Software Engineer - Flow (Front-end)
 * Release Engineer
 * Application Security Engineer
 * Full Stack Developer - Analytics
 * Agile Coach/ScrumMaster - Team Practices Group
 * Senior Technical Product Manager
 * Community Liaison
 * Community Liaison (PT Contract)
 * Operations Security Engineer
 * UX Senior Designer
 * UX Senior Design Researcher
 * UX Visual Design Fellowship
 * Mobile Partnerships Regional Manager

Announcements

 * Andrew Garrett joined the Wikimedia Foundation as a full-time Software Engineer (announcement).
 * Yuvaraj Pandian joined the Wikimedia Technical Operations team (announcement).
 * Tracy Beasley joined the Design Research Team as Participant Recruiter (announcement).
 * James Douglas joined the Platform engineering team as part of the Services group (announcement).
 * Stas Malyshev joined the Platform engineering team as part of the MediaWiki Core group (announcement).

Technical Operations
Dallas data center

Tampa data center

Labs metrics in November:
 * Number of projects: 154
 * Number of instances: 440
 * Amount of RAM in use (in MBs): 2,131,456
 * Amount of allocated storage (in GBs): 21,555
 * Number of virtual CPUs in use: 1,047
 * Number of users: 4,426

Tool metrics:
 * Number of tools: 976
 * Number of tool maintainers: 543

Wikimedia Labs
 * Yuvi has officially joined the Labs team.
 * We updated the Labs OpenStack install from version 'Havana' to version 'Icehouse'.
 * LDAP (used for sign-in on many WMF services) is now decoupled from the Labs hardware. LDAP has a dedicated server in each of eqiad and codfw.
 * Hardware to expand Labs VM capacity in eqiad is now racked. Work on the OS and OpenStack install is ongoing.
 * Trusty instances can now pull SSH keys directly from LDAP, so logins (on Trusty instances) will still work in case of a shared-storage outage.
 * We now have redirects from Toolserver to Tool Labs. This is one of the last steps in sunsetting the Toolserver.
 * Marc added a few experimental Trusty nodes to Tool Labs.

Front-end
Front-end libraries standardization

Kiwix
The Kiwix project is funded and executed by Wikimedia CH.



We have released, for the first time, a complete offline version of Project Gutenberg, an online public domain library of about 50,000 books. This new software solution makes it easy to create a complete offline snapshot offering all the books in HTML and EPUB format. We make the books accessible via a custom, easy-to-use interface. As a result, it is trivial to have this large library available anywhere: on your PC, on a local network or even on a smartphone. This is the first step of a broader effort to increase the outreach of public domain literature; further development will take place in 2015. If you want to know more, read the release announcement.

The industrialization of the dumping process for Wikimedia projects continues to progress. Besides the continuous improvement of mwoffliner, a new small tool called mwmatrixoffliner was released; it uses the MediaWiki Matrix extension API to allow dumping all language versions of a project. As a result, we have started to produce, on a monthly basis, systematic ZIM snapshots of the following projects: Wikivoyage, Wikinews, Wikiquote, Wikiversity, Wikibooks and Wikispecies. For all these projects, we make available on download.kiwix.org, via BitTorrent and HTTP, complete dumps with or without pictures, either as a raw ZIM file or pre-packaged with Kiwix in a so-called portable version. We will soon provide this service for bigger projects like the Wiktionaries and Wikipedias, and have therefore started to set up new server instances on Wikimedia Labs.

We have also made a new release of the TED talks ZIM files. This follows an effort to improve the user interface, so the updated files benefit from a slightly revised user interface.

Wikidata
The Wikidata project is funded and executed by Wikimedia Deutschland.

Wikidata won the Open Data Institute's Open Data Award in the Publisher category. Wikidata was also a big topic at the GLAM hackathon in Amsterdam and was used for many great applications like the Sum of all Paintings. An office hour about structured data on Commons was well-attended. Development was focused on:
 * performance improvements
 * introducing language fallbacks (so you will see labels in other languages you likely speak if they are not available in your language)
 * statements on properties (so you can indicate that one property is the inverse of another property or that a given property on Wikidata corresponds to another one on another website)
 * redesigning the sitelinks section.
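
As a rough illustration of the language fallback idea mentioned in the list above, here is a minimal Python sketch. It is not Wikidata's actual implementation: the function name `resolve_label`, the `FALLBACK_CHAINS` table and the chains in it are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical fallback chains for illustration only; the chains really used
# on Wikidata are not described in this report.
FALLBACK_CHAINS: Dict[str, List[str]] = {
    "de-at": ["de", "en"],
    "pt-br": ["pt", "en"],
}


def resolve_label(labels: Dict[str, str], lang: str) -> Optional[Tuple[str, str]]:
    """Return (language, label) for the first language that has a label.

    The requested language is tried first, then each language in its
    fallback chain. Languages without a chain fall back to English
    (an assumption of this sketch).
    """
    candidates = [lang] + FALLBACK_CHAINS.get(lang, ["en"])
    for candidate in candidates:
        if candidate in labels:
            return candidate, labels[candidate]
    return None  # no label available in any candidate language


if __name__ == "__main__":
    item_labels = {"de": "Wien", "en": "Vienna"}
    # An Austrian German reader sees the German label when no "de-at" label exists.
    print(resolve_label(item_labels, "de-at"))
```

Returning the language code together with the label lets an interface indicate that the label shown is a fallback rather than one in the reader's own language.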

Future

 * The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.