Throughout July, the cabling work of all racked servers and other equipment was nearly completed. We're still awaiting the installation of the first connectivity to the rest of our US network in early August before we can begin installation of servers and services.
San Francisco data center
Due to a necessary upgrade to power & cooling infrastructure in our San Francisco data center (which we call ulsfo), our racks were migrated to a new floor within the same building on July 9. The move went very smoothly, without user impact, and the site was back online serving all user traffic in less than 24 hours.
Through the help of volunteer work and research, our staff enabled Perfect Forward Secrecy on our SSL infrastructure, significantly increasing the security of encrypted user traffic.
Labs metrics in July:
Number of projects: 173
Number of instances: 464
Amount of RAM in use (in MBs): 1,933,824
Amount of allocated storage (in GBs): 20,925
Number of virtual CPUs in use: 949
Number of users: 3,500
We've made several minor updates to Wikitech: we added OAuth support, fixed a few user interface issues, and purged the obsolete 'local-*' terminology for service groups.
OPW intern Dinu Sandaru has set up forms for structured project documentation. This should help match new volunteers with existing projects, and will make communication with project administrators more straightforward.
Sean Pringle is in the process of updating the Tool Labs replica databases to MariaDB version 10.0. This may reduce replag, and should improve performance and reliability.
We're setting up new storage hardware for the project dumps. This will resolve our ongoing problems with full drives and out-of-date dumps.
In July, the team working on VisualEditor converged the design for mobile and desktop, made it possible to see and edit HTML comments, improved access to re-using citations, and fixed over 120 bugs and tickets.
The new design, with controls focussed at the top of each window in consistent positions, was made possible due to the significant progress made in cross-platform support in the UI library, which now provides responsively-sized windows that can work on desktop, tablet and phone with the same code. HTML comments are occasionally used on a few articles to alert editors to contentious or problematic issues without disrupting articles as they are read, so making them prominently visible avoids editors accidentally stepping over expected limits. Re-using citations is now provided with its simple dialog available in the toolbar so that it is easier for users to find.
Other improvements include an array of performance fixes targeted especially at helping mobile users, fixes for a number of minor instances where VisualEditor would corrupt the page, better monitoring of corruptions if they occur, and better support for right-to-left languages, with icons displayed in the right orientation based on context.
The mobile version of VisualEditor, currently available for beta testers, moved towards stable release, fixing a number of bugs and editing issues and improving loading performance. Our work to support languages made some significant gains, nearing the completion of a major task to support IME users, and the work to support Internet Explorer uncovered some more issues as well as fixes. The deployed version of the code was updated five times in the regular release cycle (1.24-wmf12, 1.24-wmf13, 1.24-wmf14, 1.24-wmf15 and 1.24-wmf16).
In wider news, the team expanded its scope to cover all MediaWiki editing tools, becoming the new Editing Team (covered below).
In July, the newly re-named and re-scoped Editing Team was formed from the VisualEditor Team. We are responsible for extending and improving the editing tools used at Wikimedia – primarily VisualEditor and maintenance for WikiEditor. We exist to support new and existing editors alike; our current work is mostly on desktop, and we are working with Mobile to take responsibility for all editing across desktop, tablet and phone platforms, spanning approximately 50 different areas of MediaWiki and extensions related to editing. We will continue to report progress on VisualEditor separately.
The biggest Editing change this month was in the Cite extension (for footnotes) – this now automatically shows a references list at the end of the page if you forget to put in a <references /> tag, instead of displaying an ugly error message. The Math extension (for formulæ) was improved with more rigorous error handling and LaTeX formula checking, as part of the long-term volunteer-led work to introduce MathML-based display and editing. The TemplateData GUI editor was deployed to a further six wikis – the English, French, Italian, Russian, Finnish and Dutch Wikipedias.
A lot of work was done on libraries and infrastructure for the Editing Team and others. The OOjs UI library was extensively modified to bring in a new window management system for comprehensive combined desktop, tablet and phone support, as well as other updates to improve Internet Explorer compatibility and accessibility of controls. In the next few months the team will continue working on OOUI to support other teams' needs and implement a consistent look-and-feel in collaboration with the Design team. The OOjs library was updated to fix a minor bug, with a new version (v1.0.11) released and pushed downstream into MediaWiki, VisualEditor and OOjs UI. The ResourceLoader framework was extended to allow skins to set the "skinStyles" property themselves, rather than rely on faux dependencies, as part of wider efforts led jointly by a volunteer and a team member to improve MediaWiki's skin support.
In July, the Parsoid team continued with ongoing bug fixes and bi-weekly deployments.
With an eye towards supporting Parsoid-driven page views, the Parsoid team strategized on addressing Cite extension rendering differences that arise from site-messages based customizations and is considering a pure CSS-based solution for addressing the common use cases. We also finished work developing the test setup for doing mass visual diff tests between PHP parser rendering and Parsoid rendering. It was tested locally and we started preparations for deploying that on our test servers. This will go live end-July or early-August.
The GSoC 2014 LintTrap project continued to make good progress. We had productive conversations with Project WikiCheck about integrating LintTrap with WikiCheck in a couple different ways. We hope to develop this further over the coming months.
Overall, this was also a month of reduced activity, with Gabriel now officially full time in the Services team and Scott focused on the PDF service deployment that went live a couple of days ago. The full team is also spending a week at an off-site meeting, working and spending time together in person prior to Wikimania in London.
The PDF render service is now deployed in production, and can be selected as a render backend in Special:Book. The renderer does not work perfectly on all pages yet, but the hope is that this will soon be fixed in collaboration with the other primary author of this service, C. Scott Ananian.
Prototyping work on the storage service and REST API is progressing well. The storage service now has early support for bucket creation and multiple bucket types. We decided to configure the storage service as a backend for the REST API server. This means that all requests will be sent to the REST API, which will then route them to the appropriate storage service without network overhead. This design lets us keep the storage service buckets very general by adding entry point specific logic in front-end handlers. The interface is still well-defined in terms of HTTP requests, so it remains straightforward to run the storage service as a separate process. We refined the bucket design to allow us to add features very similar to Amazon DynamoDB in a future iteration. There is also an early design for light-weight HTTP transaction support.
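The layering described above can be sketched roughly as follows. This is a hypothetical illustration, not the team's prototype: the names `Request`, `Response`, `StorageService` and `RestApi`, the `/v1/pages/` entry point and the `/<bucket>/<key>` path layout are all assumptions made for the example. It shows a generic bucket store whose interface is shaped like HTTP requests, and a front-end handler that adds entry-point-specific logic and calls the store in-process.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    method: str
    path: str
    body: str = ""


@dataclass
class Response:
    status: int
    body: str = ""


@dataclass
class StorageService:
    """Hypothetical generic bucket store with an HTTP-shaped interface."""
    buckets: dict = field(default_factory=dict)

    def handle(self, req: Request) -> Response:
        # Assumed layout: PUT /<bucket> creates a bucket, /<bucket>/<key> is an item.
        parts = req.path.strip("/").split("/", 1)
        bucket = parts[0]
        if len(parts) == 1:
            if req.method == "PUT":
                self.buckets.setdefault(bucket, {})
                return Response(201)
            return Response(405)
        key = parts[1]
        if bucket not in self.buckets:
            return Response(404)
        if req.method == "PUT":
            self.buckets[bucket][key] = req.body
            return Response(201)
        if req.method == "GET":
            if key in self.buckets[bucket]:
                return Response(200, self.buckets[bucket][key])
            return Response(404)
        return Response(405)


class RestApi:
    """Front end: entry-point-specific logic lives here, not in the buckets."""

    def __init__(self, backend: StorageService):
        # The backend is called in-process, so routing adds no network hop.
        self.backend = backend

    def handle(self, req: Request) -> Response:
        # Hypothetical entry point: /v1/pages/<title> maps onto the "pages" bucket.
        prefix = "/v1/pages/"
        if not req.path.startswith(prefix):
            return Response(404)
        title = req.path[len(prefix):].replace(" ", "_")
        return self.backend.handle(
            Request(req.method, f"/pages/{title}", req.body)
        )
```

Because `StorageService.handle` only ever sees request-shaped objects, the same class could in principle sit behind a real HTTP listener in a separate process without changing the front-end logic.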
Matt Walker is sadly leaving the Foundation by the end of this month to follow his passion of building flying cars. This means that we currently have three positions open in the service group, which we hope to start filling soon.
In July, the Flow team built the ability for users to subscribe to individual Flow discussions, instead of following an entire page of conversations. Subscribing to an individual thread is automatic for users who create or reply to the thread, and users can choose to subscribe (or unsubscribe) by clicking a star icon in the conversation's header box. Users who are subscribed to a thread receive notifications about any replies or activity in that thread. To support the new subscription/notification system, the team created a new namespace, Topic, which is the new "permalink" URL for discussion threads; when a user clicks on a notification, the target link will be the Topic page, with the new messages highlighted with a color. The team is currently building a new read/unread state for Flow notifications, to help users keep track of the active discussion topics that they're subscribed to.
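The subscription rules described above can be modelled in a few lines. The sketch below is a hypothetical in-memory model, not Flow's implementation: the class `TopicSubscriptions`, its method names and the notification tuples are assumptions for illustration only.

```python
from collections import defaultdict


class TopicSubscriptions:
    """Hypothetical in-memory model of per-topic subscriptions."""

    def __init__(self):
        self.subscribers = defaultdict(set)      # topic -> set of user names
        self.notifications = defaultdict(list)   # user -> list of (topic, author)

    def toggle_star(self, user, topic):
        """Clicking the star subscribes or unsubscribes; returns the new state."""
        if user in self.subscribers[topic]:
            self.subscribers[topic].discard(user)
            return False
        self.subscribers[topic].add(user)
        return True

    def post(self, author, topic):
        """Creating or replying auto-subscribes the author and notifies the rest."""
        self.subscribers[topic].add(author)
        for user in sorted(self.subscribers[topic]):
            if user != author:
                self.notifications[user].append((topic, author))
```

In this model a user who has unstarred a topic stops receiving notifications for it, while posting in the topic again would subscribe them once more.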
Following on from the successful launch to Android, the Mobile Apps team released the new native Wikipedia app to iOS on July 31. The app is the iOS counterpart to the Android app, with many of the same features such as editing, saving pages for offline reading, and browsing history. The iOS app also contains an onboarding screen that is shown the first time the app is launched, asking users to sign up, a feature which was also launched on Android this month (see below).
On Android this month we released to production accessibility and styling features which were requested by our users, such as a night mode for reading in the dark and a font size selector. We also released an onboarding screen that asks users to sign up.
Our plan for next month is to get user feedback from Wikimania, wrap up our styling fixes, and begin work on an onboarding screen the first time that someone taps edit.
This month, the team continued to focus on wrapping up the collaboration with the Editing team to bring VisualEditor to tablet users on the mobile site. We also began working to design and prototype our first new Wikidata contribution stream, which we will build and test with users on the beta site in the coming month.
During the last month, the team worked on software architecture features that allow for expansion of the Wikipedia Zero footprint on partner networks and that get users to content faster by lowering fragmentation on the Varnish caches. Whereas the previous system supported a one-size-fits-all configuration for heterogeneous partner networks, inhibiting some zero-rated access, the new system supports multiple configurations for disparate IP addresses and connection profiles per operator. Additionally, lightweight script and GIF-ified Wikipedia Zero banner support has been added and is being tested; in time this should drastically reduce Varnish cache fragmentation, so that pages are served faster and Varnish server load is reduced. A faster landing page was introduced for "zerodot" (zero.wikipedia.org, the legacy text-only experience) when operators have multiple popular languages in their geography. After more diagnostic logging code was added to the system, the team also analyzed compression proxy traffic to check that header enrichment conforms with the official Wikipedia Zero configurations. Finally, watchlist thumbnails, although low bandwidth, were removed from the zerodot user experience, as was the higher bandwidth MediaViewer feature; mdot will keep these features, though.
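As a rough illustration of what "multiple configurations per operator" can look like, the sketch below picks a configuration for a client address from several entries belonging to one carrier. It is not the Wikipedia Zero code: the `ZeroConfig` fields, the operator name and the rule of preferring the most specific network are assumptions, and the address ranges are reserved documentation ranges.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ZeroConfig:
    operator: str
    network: str        # CIDR block this configuration applies to
    zero_rated: bool
    show_banner: bool


def find_config(client_ip, configs):
    """Return the most specific configuration whose network contains client_ip.

    Preferring the longest prefix is an assumption made for this sketch.
    """
    ip = ipaddress.ip_address(client_ip)
    matches = [c for c in configs if ip in ipaddress.ip_network(c.network)]
    if not matches:
        return None
    return max(matches, key=lambda c: ipaddress.ip_network(c.network).prefixlen)


# Hypothetical operator with two connection profiles on overlapping ranges.
CONFIGS = [
    ZeroConfig("ExampleTel", "192.0.2.0/24", zero_rated=True, show_banner=True),
    ZeroConfig("ExampleTel", "192.0.2.128/25", zero_rated=False, show_banner=False),
]
```

With a single configuration per operator, the second range would have had to share the settings of the first; keeping one entry per range lets each connection profile be handled separately.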
In side project work, the team spent time on API continuation queries, Android IP editing notices, Amazon Kindle and other non-Google Play distribution, and Google Play reviews (now that the Android launch dust has settled, mobile apps product management will be triaging the reviews). In partnerships work, the team met with Mozilla to talk about future plans for the Firefox OS HTML5 app (e.g., repurposing the existing mobile website, but without any feature reduction) and how Wikimedia search might be further integrated into Firefox OS, and also spoke with Canonical about how Wikipedia might be better integrated into the forthcoming Ubuntu Phone OS.
Routine pre- and post-launch configuration changes were made to support operator zero-rating, with routine technical assistance provided to operators and the partner management team to help add zero-rating and address anomalies. The team also continued its search for a third Partners engineering teammate.
Wikipedia Zero (partnerships)
We served an estimated 68 million free page views in July through Wikipedia Zero. We continue to bring new partners into the program, though none launched in July. Adele Vrana met with prospective partners and local Wikimedians in Brazil. We published our operating principles to increase transparency.
CLDR extension was updated to use CLDR 25; this work was mostly done by Ryan Kaldari. The team made various internationalization fixes in core, MobileFrontend, Wikipedia Android app, Flow, VisualEditor and other features. In the Translate extension, Niklas Laxström fixed ElasticSearchTTMServer to provide translation memory suggestions longer than one word; and improved translation memory suggestions for translation units containing variables (bug 67921).
An initial version of Content translation was released on Beta Labs; it supports machine translation between Spanish and Catalan. The machine translation API leverages open source machine translation with Apertium. The tool supports experimental template adaptation between languages. Numerous bug fixes were made based on testing and user feedback. We worked on matching the Apertium version to the cluster, and planning for the next round of development has started.
Most admin tools resources are currently diverted towards SUL finalisation, which will greatly help in reducing the admin tools backlog. July saw the deployment of the global rename tool (bug 14862), and core fixes including the creation of the "viewsuppressed" userright (bug 20476).
Our deployment of CirrusSearch to larger wikis as the primary search back-end turned out to be too ambitious. After encountering performance issues, we rolled back this change. We are now addressing the root of the problem by getting more servers (nearly doubling the cluster size) and putting together more optimizations to the portion of Cirrus that fell over (the working set). If everything goes as planned, the working set will shrink by about 80%, trading some indexing performance for search performance. These optimizations will slightly change result relevance; please let us know if you notice any issues.
Most work was spent on SUL Finalization tasks. Phpunit and browser tests were added for CentralAuth, global rename was deployed, and lots of small fixes were made to CentralAuth to clean up user accounts in preparation for finalization.
In July, the SUL finalisation team began the feature work needed to support the SUL finalisation.
To help users with local-only accounts that are going to be forcibly renamed due to the SUL finalisation, the team is working on a form that lets those users request a rename. These requests will be forwarded onto the stewards to handle. The SUL team is currently in consultation with the stewards about how they would like this tool to work. When this consultation is wrapped up, the team will begin design and implementation.
To help users get globally renamed without having to request renames on potentially hundreds of wikis, the team implemented and deployed GlobalRenameUser, a tool which renames users globally. As the tool is designed to work post-finalisation, it only performs renames where the current name is global, and the requested name is totally untaken (no global account and no local accounts exist with that name).
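The eligibility rule stated above (the current name must be global, and the requested name must have no global account and no local account anywhere) can be written as a small predicate. This is a sketch of the rule only, not the GlobalRenameUser code: the function name and the shape of the two lookup structures are assumptions.

```python
def can_global_rename(old_name, new_name, global_accounts, local_accounts):
    """Check the rename rule described in the text.

    global_accounts: set of names that have a global account (assumed shape)
    local_accounts:  dict mapping wiki id -> set of local account names
    """
    if old_name not in global_accounts:
        return False  # only accounts that are already global can be renamed
    if new_name in global_accounts:
        return False  # requested name is taken by a global account
    # Requested name must not exist as a local account on any wiki.
    return all(new_name not in names for names in local_accounts.values())
```

For example, with a global account `Alice` and a local-only account `Bob` on a single wiki, renaming `Alice` to `Bob` would be refused, while renaming her to an entirely unused name would be allowed.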
To help users who get renamed by the finalisation and, despite our best efforts to reach out to them, did not get the chance to request a rename before the finalisation, the team is working on a feature to let users log in with their old credentials. The feature will display an interstitial when they log in, informing them that they logged in with old credentials and that they need to use new ones. We are also considering a persistent banner for those users, so that they definitely know they need to use their new credentials. An early beta version of this feature is complete, and now needs design and product refinements to be completed.
To help users who get renamed by the finalisation and, as a result, have several accounts that were previously local-only turned into separate global accounts, the team is working on a tool to merge global accounts. We chose to merge accounts as it was the easiest way to satisfy the use case without causing further local-global account clashes that would cause us to have to perform a second finalisation. The tool is in its preliminary stages.
The team also globalised some accounts that were not globalised but had no clashes. These accounts were either created in this local-only form due to bugs, or are accounts from before CentralAuth was deployed where the user never globalised. As these accounts had no clashes, there were no repercussions to globalising these accounts, so we did this immediately.
At present, no date has been chosen for the finalisation. The team plans to have the necessary engineering work done by the end of the quarter (end of September 2014), and have a date chosen by then.
Next month the team plans to continue work on these features.
This month, the Release and QA Team became the Release Engineering Team, mostly reflecting the transition of this team from being made up of members of other distinct teams to being a coherent, (mostly) self-contained team. This will, hopefully, allow better coordination of "Release" and "QA" things (broadly speaking).
A lot of progress was made on making Phabricator suitable as a task/bug tracking system for Wikimedia projects. You can see the work to be sorted and completed at this workboard.
The Beta Cluster now runs with HHVM, bringing us much closer to full HHVM deployment. In addition, the Language Team deployed the new Content translation system on the Beta Cluster with the help of the Release Engineering team.
The second round of public RFP for third-party MediaWiki release management was conducted and concluded.
We now no longer use the third-party Cloudbees service for any of our Jenkins jobs and run all jobs locally. This will enable us to better diagnose issues with our build process, especially as it pertains to our browser tests (which still mostly run on SauceLabs).
This month, the QA team finished two significant achievements: after porting all the remaining browser tests from the browsertests repository to the repositories of the extensions being tested in June, as well as porting a significant set of tests to MediaWiki core itself, we completely retired the Jenkins instance running on a third-party host in favor of running test builds from the Wikimedia Jenkins instance, and we deleted the /qa/browsertests code repository. These moves are the result of more than two years of work. In addition, we have added more functions to the API wrapper used by browser tests, improved support for testing in Vagrant virtual machines, added new Jenkins builds for extensions, and improved the function of the beta labs test environments by preventing database locks and stopping users from being logged out by accident.
The browser tests are now all integrated with builds on the Wikimedia Jenkins host. We added browser tests for MediaWiki core that will validate the correctness of a MediaWiki installation regardless of language, or of what extensions may or may not exist on the wiki, so that the tests may be packaged with the distribution of MediaWiki itself and used on arbitrary wikis. We saw a lot of browser test activity for Flow development, and we are preparing to support even more extensions and features in the very near future.
In July, the multimedia team reviewed more feedback about Media Viewer, from three separate Requests for Comments on the English and German Wikipedias, as well as on Wikimedia Commons. Based on this community feedback, the team worked to make the tool more useful for readers, while addressing editor concerns. We are now considering a new 'minimal design', which would include: a much more visible link to the File: page; an even easier way to disable the tool; a caption or description right below the image; removing additional metadata below the image, directing users to the File: page instead.
As described in our improvements plan, these new features are being prototyped and will be carefully tested with target users in August, so we can validate their effectiveness before developing and deploying them in September. You can see some of our thinking in this presentation.
This month, we continued to work on the Structured Data project with the Wikidata team and many community members, to implement machine-readable data on Wikimedia Commons. We prepared to host a range of online and in-person discussions to plan this project with our communities, and aim to develop our first experiments in October, based on their recommendations. We also continued a major code refactoring for the UploadWizard, and fixed a number of bugs in some of our other multimedia tools.
Wikimetrics can now generate vital sign metrics for every project daily. The Rolling Monthly Active Editor metric has been implemented; the reports are in JSON format, in a logical path hosted on a file server and downloadable. The team also worked on backfilling data for the daily reports on Newly Registered and Rolling Active Editor, and made numerous optimizations to backfill the data quickly.
New nodes were added to the cluster this month and all machines were upgraded to run CDH5. The team decided not to preserve any data on the cluster during the upgrade and started fresh. The team hosted a Tech Talk on our Hadoop installation (see video and slides). Duplicate monitoring has also been implemented in Hadoop to monitor the incoming Varnish logs.
The culmination of our efforts this month can be visualized in a prototype built for Wikimania. This was made possible thanks to many back-end enhancements (optimizations) to Wikimetrics, along with research and selection of the optimal technologies to implement the stack to display a dashboard.
This month, we completed the documentation for the Active Editor Model, a set of metrics for observing sub-population trends and setting product team goals. We also engaged in further work on the new pageviews definition. An interim solution for Limited-duration Unique Client Identifiers (LUCIDs) was also developed and passed to the Analytics Engineering team for review.
We analyzed trends in mobile readership and contributions, with a particular focus on the tablet switchover and the release of the native Android app. We found that in the first half of 2014, mobile surpassed desktop in the rate at which new registered users become first-time editors and first-time active editors in many major projects, including the English Wikipedia. An update on mobile trends will be presented at the upcoming Monthly Metrics meeting on July 31.
Development of a standardised toolkit for geolocation, user agent parsing and accessing pageviews data was completed.
We supported the multimedia team in developing a research study to objectively measure the preferences of Wikipedia editors and readers.
We have pre-release binaries of the next 0.9 (final) release. Except for OSX, everything seems to work fine so far. Support for the RaspberryPi was finally merged into the kiwix-plug master branch; this offers new perspectives because the price to create a Kiwix-Plug has dropped to around USD 100. We also started an engineering collaboration with ebook reader manufacturer Bookeen (in the scope of the Malebooks project) to be able to offer an offline version of Wikipedia on e-ink devices.
We participated in the Google Serve Day at Google Zurich. The goal was to meet Google engineers for one day and have them work on open source projects. The result was a dozen fixed bugs and implemented features, mostly in Kiwix for Android, but also in Kiwix for desktop and MediaWiki.
Four developers had a one-week hackathon in Lyon, France to develop an offline version of the Gutenberg library. We're currently polishing the code and plan a release soon; our partners and sponsors plan the first deployments in Africa in Autumn.
Last but not least, a proof-of-concept of a Kiwix iOS app was made, so we might release a first app before the end of the year.
The biggest improvement around Wikidata in July is the release of the entity suggester. It makes it a lot easier to see what kind of information is missing on an item. Helen and Anjali, Wikidata's Outreach Program for Women interns, continued improving user documentation and outreach around Wikidata as well as worked on a new design for the main page. Guided Tours were published, helping newcomers find their way around the site. The developers further worked on supporting badges (like "featured article"), redirects between items, the monolingual text datatype (to be able to express things like the motto of a country) as well as the first implementation steps for the new user interface design. Additionally the first JSON dumps were published.
The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.