Wikipedia.org Portal Improvements

The Wikipedia.org Portal Improvements project is an effort to improve the http://www.wikipedia.org portal page to better serve Wikipedia users. Some of the earliest efforts in this project focus on measuring how our users use the portal page, learning what they expect from it, and working out how we can better serve them.

Rationale
The Wikimedia Discovery team wants to improve the experience for users visiting the Wikipedia.org portal page and help them get to the information they are looking for as fast as possible. The Wikipedia.org portal page gets around 10 million daily pageviews, which is roughly 1.5-2% of our total pageviews. To better serve this large audience, we want to make sure that this page is easy to understand and use. We want to increase the usability of the search text box, so that more people run a search query and get to their preferred article faster. We want to decrease the number of people who leave the page without continuing on to Wikipedia. We also want to better understand what people expect from this page so we can address their needs. In the future, we also want to serve language-, location-, and time-specific content on this page.

Initial Goals

 * Decrease bounce rate on wikipedia.org portal page
 * Increase searches and clicks on results from the search textbox
 * Increase clickthrough to non-top-10-language wikis
 * Increase clickthrough for other Wikimedia projects
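
As a rough illustration of how the first two goals could be measured, the sketch below computes a bounce rate and a search rate from a list of visit records. The `Session` record and its fields are hypothetical and are not the event logging schema actually used on the portal; a "bounce" is assumed here to mean a visit that ends without a search or a click on any link.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical per-visit record; not the real event logging schema."""
    searched: bool      # visitor submitted a search from the portal
    clicked_link: bool  # visitor clicked a link to a wiki or sister project


def bounce_rate(sessions):
    """Fraction of sessions that ended without a search or a link click."""
    if not sessions:
        return 0.0
    bounces = sum(1 for s in sessions if not (s.searched or s.clicked_link))
    return bounces / len(sessions)


def search_rate(sessions):
    """Fraction of sessions in which the visitor ran a search."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if s.searched) / len(sessions)


sessions = [
    Session(searched=True, clicked_link=False),
    Session(searched=False, clicked_link=True),
    Session(searched=False, clicked_link=False),
    Session(searched=False, clicked_link=False),
]
print(bounce_rate(sessions))  # 0.5
print(search_rate(sessions))  # 0.25
```

Clickthrough to smaller-language wikis or to other Wikimedia projects could be tracked the same way by recording which link was clicked rather than a single boolean.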

A/B tests
As part of this initiative, the Discovery team will be running a few quick A/B tests on Wikipedia.org to measure whether the proposed improvements are beneficial to our users. Here is a tentative list of the first few A/B tests, with the corresponding visual mockups of the proposed changes:
 * 1) A bigger and more prominent search box and search button
 * 2) Showing thumbnail images and Wikidata descriptions in the typeahead suggestion box
 * 3) Collapsing the Wikipedia globe and moving the search box to the top
 * 4) Collapsing non-top-10-language wiki links into a Universal Language Selector for better usability
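
One common way to split visitors between a control page and a test variant is to hash a session identifier, so that the same visitor always sees the same version. The sketch below illustrates that general technique only; the function name, the bucket labels, and the use of a session ID are assumptions, not a description of the framework the Discovery team will use.

```python
import hashlib


def assign_bucket(session_id, test_name, buckets=("control", "test")):
    """Deterministically map a session to one bucket of an A/B test.

    Hypothetical helper: including the test name in the hash keeps the
    assignments of different tests independent of each other.
    """
    key = f"{test_name}:{session_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(buckets)
    return buckets[index]


# The same session always lands in the same bucket for a given test.
first = assign_bucket("session-123", "bigger-search-box")
second = assign_bucket("session-123", "bigger-search-box")
print(first == second)  # True
```

Because the assignment depends only on the identifier and the test name, no per-visitor state has to be stored to keep the experience consistent across page loads.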

Technical issue: Shifting the portal code to gerrit
NOTE: This section is a DRAFT

Although the existing system (storing the portal content in a template on meta) has worked until now, it cannot effectively support the workflows that will be necessary to make the desired improvements. Basically, the portal itself needs to be treated more like a piece of code, and less like static content.

Working closely with the long-time portal maintainer (Minh Nguyễn), the Discovery team came up with a plan to shift the portal code and content into a git (gerrit) repository. This would allow multiple developers to work on the portal without interfering with each other, using tools and practices already in use on MediaWiki and other Wikimedia projects.

Benefits of shifting to gerrit
Easier manipulation of files. The portal consists of multiple files, including HTML, JavaScript, and CSS. In the current system, each file is a separate wiki page, so to commit a change that involves multiple files, you would have to edit and save each page separately, and each save would be treated as a separate commit. With Git, all the changes to all the affected files would be stored as a single, atomic commit, which could easily be reviewed, merged, or rolled back.

Also, it would be useful to split some of the existing files into multiple files. Git would handle that easily, whereas in the current system, it would make the commit situation even more cumbersome and risky.

Allows development on multiple branches. With the existing system, experimental branches would have to be done on separate copies of the template pages. Merging work in either direction, between the mainline and a branch, or between two branches, would be extremely painful. Git has excellent branching features, which are widely known and used.

Allows the use of standard and modern software development tools. The existing meta template system leverages the easy editing of a wiki, but has an unusual back-end pipeline that converts the meta templates into files that can be served. Moving to git would allow a more conventional deployment system that would copy files from a repository onto a server.

Development would also become easier, because there are so many tools for working with locally stored files, including code formatters and style validators, post-processors, previewers, and debuggers. Gerrit commits can automatically be run through a test suite, making it harder for bugs to get through. Most developers are familiar with git, and gerrit is the standard code review system for MediaWiki-related software.

Easier to replace manually-updated values with code snippets to update them automatically. Although this is NOT a part of the Discovery portal improvement initiative, this would be a nice side benefit. The portal maintainer believes that shifting to gerrit would make it easier for him (or others) to automate some of the content on the page.

Objections to shifting to gerrit
Limits on who will be allowed to contribute. Commit rights will be granted liberally. The goal is not to limit who can work on the page, so every effort will be made to allow interested people to contribute.

Higher barrier for user contributions. It is true that committing via gerrit is more difficult than simply editing a template on meta. However, these pages don't receive a lot of external contributions, so the actual effect shouldn't be large. When looking at the portal as software, rather than content, a higher technical barrier is not unreasonable.

Changes won't show up on the site as quickly. The current mechanism allows edits to the page to be viewed in production within about an hour. Initially, with the new mechanism, commits will not appear in production unless/until they are manually deployed, which typically would not be more than once per day. However, a new deployment tool (scap3) is being developed, which might eventually allow changes to be deployed automatically. In that case, it might deploy new portal code hourly, or at whatever interval makes sense.

"Gerrit has less monitoring and integration with Meta-Wiki". The Discovery team and the long-time portal maintainer will both be monitoring commits in gerrit. While the current integration with the Meta-Wiki community is helpful, the trade-off is that the current system is less integrated with the engineering community.

Why not add functionality through the existing template system first? The Discovery team is responsible for achieving a quarterly goal of starting to improve the wikipedia.org portal in measurable ways. Two developers are ready to add event logging, an A/B test framework, and actual UI changes to the page. All of that work would be easier (and safer) in git, so it makes sense to switch now.

Tentative Future Ideas

 * Bringing in relevant content from other Wikimedia projects on this page
 * Recommending articles to read and edit
 * Showing trending articles to read and edit
 * Localizing the content on this page depending on where the visitor is located
 * Show more timely content
 * More visual content
 * Announcements for new tools and features (wmf & community)

Portal Experiments, Research Documentation
For more information about how our data analysis team is working on this project please refer to: Research: Portal experiments