Final negotiations on the 3 remaining data center bids were completed in February, and the Wikimedia Operations team will make a decision in the first week of March. Expect a public announcement soon.
Labs metrics in February:
Number of projects: 129
Number of instances: 458
Amount of RAM in use (in MBs): 1,812,992
Amount of allocated storage (in GBs): 24,540
Number of virtual CPUs in use: 906
Number of users: 2,714
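As a rough illustration only, the totals above can be turned into per-instance averages. The sketch below simply divides the reported figures; the variable names are our own, and the averages are derived values, not numbers published as part of the Labs metrics.

```python
# Labs totals as reported for February (copied from the list above).
projects = 129
instances = 458
ram_mb = 1_812_992
storage_gb = 24_540
vcpus = 906

# Derived averages (illustrative; not part of the official metrics).
averages = {
    "instances_per_project": instances / projects,
    "ram_mb_per_instance": ram_mb / instances,
    "storage_gb_per_instance": storage_gb / instances,
    "vcpus_per_instance": vcpus / instances,
}

for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

On these numbers, an average instance has roughly 2 virtual CPUs and a little under 4,000 MB of RAM.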
The Wikimedia Labs infrastructure in the eqiad data center has been deployed with the OpenStack Havana release, and testing completed in February. Labs users will have 2 weeks to migrate their own projects & instances starting in March. During the last two weeks of March, the Wikimedia Operations team will handle the transfer of the remaining instances that have not been migrated by users themselves.
During a short deployment of our West Coast data center ulsfo in October 2013, several reliability problems were found with some of our network service providers, which forced us to take this site out of service until they could be resolved. We have since worked to improve reliability and increase redundancy of network transit and transport to this site. As of the week of February 3rd, ulsfo is in full production usage again, and is now serving traffic for the US West Coast, Oceania and large parts of Asia. A blog post is being prepared describing the improvements in user-perceived site performance.
eqiad data center capacity expansion
The Wikimedia Foundation has expanded the capacity of its main data center site eqiad in Ashburn, Virginia by 33%. A fourth row of racks has been added, and all power & networking infrastructure has been installed and configured in February. The added rack space is available for new equipment as of February 24th.
In February, the VisualEditor team continued their work on improving the stability and performance of the system, and added some new features and simplifications. Media item editing is now much richer, allowing the setting of position, alt text, size (or setting as default size) and type for most kinds of media item. When adding links, redirects and disambiguation pages are now highlighted to help editors select the right link, and changing the format or style of some text was tweaked to make editing clearer and more obvious. Adding and editing template usages is now a little smoother, auto-focussing on parameters and making them clearer to use. Page settings have expanded to set redirects, page indexing and new section edit link options. The extensive work to make insertion of "citation" references based on templates quick, obvious and simple neared completion. The deployed version of the code was updated four times in the regular releases (1.23-wmf13, 1.23-wmf14, 1.23-wmf15 and 1.23-wmf16).
In February, the Parsoid team continued with bug fixes and improved image support. See the deployment page for a summary of deployments and fixed bugs in February.
Part of the team has continued to mentor two Outreach Program for Women (OPW) interns. This internship ends mid-March. Others are mentoring a group of students in a Facebook Open Academy project to build a Cassandra storage back-end for the Parsoid round-trip test server.
We have a first version of a Debian package for Parsoid ready. The package has yet to find a home (a repository) from which it can be installed. Once it is hosted, installing Parsoid will be as easy as apt-get install parsoid.
This month, Flow was launched on the talk pages of two English Wikipedia WikiProjects that volunteered to be a part of the first trial, WikiProject Breakfast and WikiProject Hampshire. We've continued to iterate on the front-end design of the discussion system based on user feedback, releasing a new visual treatment during the trial and starting work on a front-end rewrite for better cross-browser and mobile compatibility (to be released sometime in March). We also spent time making sure Flow integrates better with vital MediaWiki tools and processes (e.g., suppression and checkuser) and improving the handling of permalink URLs.
In February, the Growth team first focused on releasing the new Wikipedia onboarding experience on additional projects. The GettingStarted extension was deployed to 30 Wikipedias, including all of the top 10 projects by number of page views. This marks the first time its task suggestions and guided tours were available outside English projects. The GuidedTour extension was also deployed to those projects (as a dependency of GettingStarted), as well as the Czech Wikipedia and se.wikimedia.org. Late in the month, the team also presented its work at its first Quarterly Review of the 2014 calendar year (see slides and minutes).
For the first half of the month, we focused on the current Education Program extension. We fixed many old and new bugs, including a few remaining database-related problems, and improved the UI for editing courses. Also, two Facebook Open Academy students started work on new notifications for the extension. In mid-February, we shifted our focus to creating new software for many kinds of collaborative editing, including, but not limited to, Education Program courses. The first phase of this work, called editor campaigns, is being carried out with the Growth team.
We've been working on bringing VisualEditor to tablets (currently in alpha). This is a requirement for redirecting tablets to mobile later on. Specifically, we've been working on enabling inspectors, especially the link inspector. We've also been fixing a variety of bugs to ensure that the basic editing functionality works as expected.
During the last month, the team added zero-rating for HTTPS for select carriers in cooperation with the Operations team. In collaboration with the Mobile Apps team, we integrated Wikipedia Zero into the forthcoming rebooted versions of the Android and iOS apps, including API and client-side code for zero-rating detection. We updated the legacy Firefox OS app with bugfixes from January (make spinner background opaque, remove mozmarket.js legacy JS); we also prepared other bugfixes for that app (keep last page browsed on low-memory crash, avoid text overlaying <select> dropdown, ensure 'X' clicks stop processing and do not send the user to Main Page). Discussion with the Operations team and Platform Engineering continued on the ideal portal hosting approach concurrent with sprint planning; portal work will probably be deferred until the hosting strategy is formalized. The team also started work on the core API to allow dynamic category pages based on search terms, as well as continuing the discussion on core ResourceLoader features, in support of a proof-of-concept HTML5 webapp riding atop MobileFrontend. We also started a patch to make contributory features (not just banners and rewritten URLs) present for Wikipedia Zero users on carriers supporting HTTPS zero-rating. Last but not least, Yuri Astrakhan performed extensive analytics work on pageviews and page bandwidth consumption for gzip-capable Wikipedia Zero clients across all Wikipedia Zero-scoped partner pageviews; Yuri also conducted additional analytics work on SMS/USSD data.
Wikipedia Zero (partnerships)
In February, we launched Wikipedia Zero with MTN South Africa (Opera Mini browser only). MTN South Africa responded directly to the kids of Sinenjongo High School with an open letter to the students and the youth of South Africa. They said they agree that Wikipedia could give a boost to their education system, and that offering Wikipedia Zero is a small thing that could change everything (see video on YouTube).
We also launched Wikipedia Zero with Safaricom, the largest operator in Kenya. We now have three partners in Kenya, covering 90% of all mobile subscribers. South Africa is our 23rd country to launch, and Safaricom is our 27th operator partner.
The Mobile Partnerships team attended Mobile World Congress in Barcelona, where we met with existing operator partners, prospective partners and tech companies who want to support the mission. At the conference, our Wikipedia Text pilot with Airtel Kenya and the Praekelt Foundation was nominated as a finalist for the GSMA Global Mobile awards in the education category.
Runa Bhattacharjee is setting up a Test Case Management System to facilitate manual testing inside the team and to help volunteer translators test new versions of language tools and report the results.
The prototype ContentTranslation server was created in Node.js, mostly by Santhosh Thottingal and David Chan. The server will be responsible for syncing the translations between all the languages, storing translated parallel texts (using Redis), and retrieving and caching the results of language tool queries (machine translation, translation memory, dictionaries, segmentation, etc.).
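To illustrate the caching role described above, here is a minimal sketch in Python. It is not the ContentTranslation server's code (which is written in Node.js and uses Redis); an in-memory dictionary stands in for the store, and the names ToolResultCache and fake_machine_translation are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Cache key: (tool name, source language, target language, text).
CacheKey = Tuple[str, str, str, str]


class ToolResultCache:
    """Hypothetical cache for language-tool query results.

    An in-memory dict stands in for an external store such as Redis.
    """

    def __init__(self) -> None:
        self._store: Dict[CacheKey, str] = {}
        self.misses = 0

    def get_or_compute(
        self,
        tool: str,
        source_lang: str,
        target_lang: str,
        text: str,
        compute: Callable[[str, str, str], str],
    ) -> str:
        key = (tool, source_lang, target_lang, text)
        if key not in self._store:
            # Cache miss: call the (possibly slow) language tool once.
            self.misses += 1
            self._store[key] = compute(source_lang, target_lang, text)
        return self._store[key]


def fake_machine_translation(source_lang: str, target_lang: str, text: str) -> str:
    """Placeholder for a real machine-translation backend (assumption)."""
    return f"[{source_lang}->{target_lang}] {text}"


cache = ToolResultCache()
first = cache.get_or_compute("mt", "en", "es", "Hello", fake_machine_translation)
second = cache.get_or_compute("mt", "en", "es", "Hello", fake_machine_translation)
print(first, cache.misses)
```

The second identical query is answered from the cache, so the placeholder backend is called only once. A real deployment would also need expiry and invalidation, which this sketch leaves out.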
Some front-end components for the translation interface were made, mostly by Sucheta Ghoshal and Amir Aharoni.
Work is starting back up on this project, with the goal of having at least one production service running on HipHop by the end of the quarter. Tim Starling is working with the HHVM upstream to finish off a compatibility layer for running Zend extensions (ext_zend_compat) under HipHop, with the goal of using it for our Lua module. Ori Livneh is working on packaging and deployment issues, as well as generally wrangling the overall development effort. Aaron Schulz is starting to investigate what is needed for wmferrors support.
While this workstream is still officially on hold, the related Global CSS/JS extension to provide per-user global modules was deployed to beta labs for testing. Additionally, patches were contributed by volunteer developers.
This month, almost all LuceneSearch and MWSearch bugs have either been closed as problems that are fixed in CirrusSearch, or moved to the CirrusSearch component. We then prioritized all CirrusSearch bugs. After clearing out any remaining high priority issues, engineering work for an update to the design of the search results page is due to commence on March 10.
The application automatically transitioned from the active scholarship collection period to the review-only period on 2014-02-17. No major issues were reported for February. The back-end features of the application were demoed for the IEG team as part of their information gathering process for implementing a more structured review tool for grants.
The month of February saw a lot of work on WMF deployment tooling.
To see a real-life example of what it looks like to deploy code on the WMF server cluster, watch this screencast created by Bryan Davis. It shows what the person deploying the code sees when doing a localization (translations) update. A deployment that includes new changes to the code (e.g. MediaWiki and extensions) on the servers would look different.
The suite of tools that make up the current MediaWiki deployment tooling continues to be updated and rewritten in Python. The progress of this work is visible in the repository's history.
There is now a matrix showing the requirements for deployment tooling for 3 projects (MediaWiki, Parsoid (and related), and ElasticSearch (and related)). This is not a fixed document and will grow/change as more is learned.
In February, we updated our 3rd-party Jenkins instance to use Jenkins job builder configuration rather than Jenkins templates. Our 3rd-party Jenkins builds now match the WMF Jenkins build scheme, giving us maximum flexibility for when and how these jobs are run in the future. Also, we laid the groundwork for several significant new test features to be announced in the near future.
Not much happened on the beta cluster besides the usual maintenance and the platform being used to detect nasty bugs before they land on the production cluster. It is being used successfully for staging various features, bugfixes and extensions as well as for browser tests tracking regressions.
Next month will see the beta cluster migrating from the pmtpa datacenter to the eqiad datacenter.
Our test coverage of MediaWiki extensions continues to prove itself. In February, using the automated browser tests running against beta labs and test2wiki, we found and fixed several critical errors that would have disrupted production wikis severely if they had been released.
Fabrice Florin managed product development for Media Viewer and prepared the release plan for a gradual deployment of Media Viewer out of beta in coming months, based on the team's latest development goals. We also hosted an IRC chat to discuss Media Viewer with the rest of the community and plan our next steps together. Lastly, the video RfC we started last month was closed with a community recommendation to not support the proprietary MP4 video format on our sites; as a result, we will only support open video formats like WebM and Ogg in the next version (v0.3) of Media Viewer. For more updates, we invite you to join the multimedia mailing list.
After summarizing community input into consolidated requirements, Andre Klapper and Guillaume Paumier listed the different options mentioned during the consultation process. These range from keeping the status quo, to changing a single tool, to consolidating most tools into one. They also continued to research the main candidates by reading articles and testing demo sites. Once the list of options has been shortened collaboratively, the community RFC will start.
Getting Facebook Open Academy projects up to speed is proving even more complex than expected, but we are getting there slowly. All students and mentors met at the kick-off hackathon at Facebook headquarters on February 7–9 (see Marc-André Pelletier's report).
In February, Guillaume Paumier continued to provide ongoing communications support for the engineering staff, and contributed to writing, simplifying, publishing and distributing the weekly technical newsletter. He also edited essays from Google Code-in students for publication on the Wikimedia blog.
We held several architecture meetings to review Requests for Comment on IRC, and continued discussion and implementation of work begun at the architecture summit in January. We also worked on improvements to the architecture guidelines and on a draft of performance guidelines for developers.
We continue to make progress on the Hadoop/Kafka roll-out. We've encountered some issues with cross-data center latencies with Varnish-Kafka that we are currently debugging. We are also testing the Kafka-tee component that provides backwards compatibility for udp2log subscribers. Finally, we are finishing a report for the Mobile team on browser breakdowns using Kafka-provided data on Hadoop.
Work progresses on enhancing Wikimetrics into a more flexible general tool. This month we completed work on a Vagrant deployment environment which will make it easier for the community to work on Wikimetrics. We've also made progress on the scheduler, reporting enhancements and a deployment issue.
This month, we welcomed Leila Zia as the newest addition to the team. Leila joins the Foundation as a research scientist after completing a PhD in management science and engineering at Stanford University. Her work will initially focus on modeling editor lifecycles to better understand what affects their survival and retention.
We attended the 17th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW '14) in Baltimore. Research on Wikipedia and wiki-based collaboration has been a major focus of CSCW in the past, and this year three Wikipedia research papers were presented. We hosted a session to discuss collaboration opportunities for researchers interested in tackling problems of strategic importance for Wikimedia (a detailed CSCW '14 report will follow on wiki-research-l).
We started creating public documentation for data sources and tools used by the team for research and data analysis, and porting documentation previously hosted on internal wikis (for example: analytics/geolocation).
We continued to provide ad-hoc support to various teams at the Foundation and worked closely with the Growth and Mobile teams to prepare and review results for their respective quarterly reviews.
For the first time, we have released a ZIM file of the entire Wikipedia in English with all encyclopedic articles and thumbnails (download the 90 GB file via torrent). In our announcement, we've also explained how we generate those archives and advertised the tools we've been working with, like mwoffliner and zimwriterfs. This month, a student also worked on the creation of ZIM files containing TED talks. The internship is now over and was a success; the ZIM files will be published soon. Preparation work for our Usability Hackathon has started.
Wikisource now has access to the data in Wikidata, like ISBNs and the date of birth of an author. The Lua interface for Wikidata has been extended significantly to make it more powerful and easier to use. Support for article badges has seen more work and is now mostly missing only the user interface part. Loading time of items on Wikidata has been improved drastically. Everyone is asked to provide input for the upcoming redesign of Wikidata's user interface.
The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.