Page Previews/2016 A/B Tests

=Hovercards A/B Test Results=

To determine changes in reading behavior and evaluate the success of the hovercards beta feature, three A/B tests were launched on the Hungarian, Italian, and Russian Wikipedias. The test on Hungarian Wikipedia was launched on June 7, 2016; the tests on Italian and Russian Wikipedias were launched on Sept 23, 2016.

Our goals were to establish the success of the hovercards feature, measure its effect on user engagement, and compare the outcome to previous quantitative and qualitative results. Hovercards performed well in an earlier test on Greek and Catalan Wikipedias, with users reporting positive interactions with the feature. We wanted to verify these results on a larger scale while learning more about users' reading behavior.

From the results of these tests, we can conclude that hovercards facilitate positive changes in reading behavior by increasing the precision with which users select the pages they read, allowing users to focus on a single topic by providing context within a page, and reducing the cost of exploration. We do see a single-digit percentage drop in pageviews when hovercards are enabled, and attribute this to users answering more of their questions without needless navigation.

Disclaimer: A number of issues with the instrumentation of the hovercards tests were recorded during this period. To mitigate these issues, we list results from the Firefox browser only, while noting that open questions about the accuracy of the results remain.

Introduction
Hovercards are designed to reduce the cost of exploration of a link, as well as to promote learning by allowing readers to gain context on the article they are reading or to define an unfamiliar term, object, event, or idea without navigating away from their original topic.

Through our analysis, we wanted to study users' behavior towards the hovercards feature to answer the following questions:


 * 1) Do people use hovercards to preview other pages? Since interacting with the content of a different page no longer requires users to navigate away from the article they are currently reading, we expect users to interact with more links within a given page when hovercards are turned on.
 * 2) Do users disable hovercards? A high rate of disabling would indicate that users do not want the feature. A low rate would indicate that users like the feature and would continue using it.

We also wanted to study how hovercards change reading behavior:
 * 1) Changes in session depth - readers will only be opening the pages they are particularly interested in and viewing summaries on all other pages. Thus, we expect to see a decrease in pages viewed per browser session when hovercards are turned on.
 * 2) Changes in back-button usage - since users will be able to gauge the value of opening a new page through reading its hovercards summary, we expect users to be less likely to navigate to a page they are not interested in, and thus less likely to press the back button when hovercards are turned on. Additionally, since users may open other pages to gain context on the pages they are currently reading, we predict sufficient context will be provided by the hovercards in some cases, making the opening of another page for this purpose unnecessary.
 * 3) Impact on fundraising - we hypothesize that enabling hovercards will result in a decrease in pageviews, so we wanted to measure the impact of hovercards on donations.

Methodology
Hungarian Wikipedia was selected for the first experiment, as it was a mid-sized Wikipedia where the community was receptive to testing hovercards. For the later experiments, we wanted to replicate the results on larger Wikipedias, and selected Russian and Italian based on community response.

For Hungarian Wikipedia, hovercards were enabled for 50% of anonymous and logged-in browser sessions on June 7. For Italian Wikipedia, hovercards were enabled for 20% of anonymous browser sessions on Sept 23. For Russian Wikipedia, hovercards were enabled for 10% of anonymous browser sessions on Sept 23. The sample rates for each Wikipedia were chosen so that sufficient data could be collected without burdening our EventLogging servers. The hovercards test is scheduled to end on Oct 31, with the possibility of extending it to account for technical issues in the instrumentation observed while running the tests.

Data collection: Due to instrumentation issues in our data collection, we have applied the following restrictions to the results presented below:
 * Removal of duplicates - more info: https://phabricator.wikimedia.org/T140485
 * Restricting data to Firefox - more info: https://phabricator.wikimedia.org/T146840
 * Other concerns we are currently considering:
 * https://phabricator.wikimedia.org/T145665
 * https://phabricator.wikimedia.org/T141922
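
The two restrictions listed above (removing duplicates and keeping Firefox only) can be sketched as a small filtering step. This is a minimal illustration and not the pipeline actually used for the analysis: the event dictionaries, the `uuid` key used to recognise duplicates, and the helper names are assumptions made for the example. Only the Firefox/SeaMonkey user-agent check mirrors the query listed under Data sources.

```python
def is_firefox(user_agent):
    """True for Firefox user agents, excluding SeaMonkey.

    SeaMonkey also includes 'Firefox' in its user-agent string, which is
    why the query under Data sources excludes it explicitly.
    """
    ua = user_agent.lower()
    return "firefox" in ua and "seamonkey" not in ua


def clean_events(events):
    """Drop duplicate events and keep only those sent from Firefox.

    Each event is assumed to be a dict with a unique 'uuid' (hypothetical
    key name) and a 'userAgent' string.
    """
    seen = set()
    cleaned = []
    for event in events:
        if event["uuid"] in seen:
            continue  # same event logged more than once
        seen.add(event["uuid"])
        if is_firefox(event["userAgent"]):
            cleaned.append(event)
    return cleaned
```

Deduplicating before filtering keeps the two steps independent, so either restriction could be relaxed later without touching the other.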

Frequency of Hovercards usage
We examined how often readers intentionally looked at hovercards (defined as a card being shown for at least one second without the reader either clicking through to the linked article or dismissing the card by moving the mouse away). The average number of cards looked at per pageview was 0.28 on the Hungarian Wikipedia, 0.30 on the Italian Wikipedia, and 0.42 on the Russian Wikipedia. Per browser session, 0.63 cards were viewed on the Hungarian Wikipedia, 0.87 on the Italian Wikipedia, and 1.42 on the Russian Wikipedia. [speculate a bit about the inter-wiki differences?]
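
As a rough illustration of how the per-pageview and per-session averages are derived, the sketch below re-expresses the logic of the query under Data sources in Python. The dictionary keys reuse the column names from that query. The function names and the in-memory event list are assumptions for the example, and the sketch assumes at least one page load is present.

```python
def card_was_seen(event):
    """A card counts as viewed if it stayed visible for at least one second.

    Same condition as in the query: total interaction time must cover the
    popup delay, the perceived wait, and a further 1000 ms.
    """
    return (
        event["event_totalInteractionTime"]
        >= event["event_popupDelay"] + event["event_perceivedWait"] + 1000
    )


def usage_rates(events):
    """Return (cards viewed per pageview, cards viewed per session)."""
    cards, pages, sessions = set(), set(), set()
    for event in events:
        is_page_load = event["event_action"] == "pageLoaded"
        if not (is_page_load or card_was_seen(event)):
            continue  # card was shown for less than one second
        pages.add(event["event_pageToken"])
        sessions.add(event["event_sessionToken"])
        if not is_page_load:
            cards.add(event["event_linkInteractionToken"])
    return len(cards) / len(pages), len(cards) / len(sessions)
```

Counting distinct tokens rather than rows means that several events belonging to the same card, page, or session are only counted once.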

Interactions within a given page
All three Wikipedias showed increased interactions with links per pageview, with an x% increase for Hungarian Wikipedia (Dates), a y% increase for Italian Wikipedia, and a z% increase for Russian Wikipedia (Dates), counting only interactions longer than 300 ms.

No significant changes were observed in the number of link interactions per session, with an x% difference for Hungarian Wikipedia (Dates), a y% difference for Italian Wikipedia, and a z% difference for Russian Wikipedia (Dates).

The clickthrough ratio (the share of hovercard views after which users clicked through to the linked page) was X% for Hungarian Wikipedia (dates), Y% for Italian Wikipedia, and Z% for Russian Wikipedia (dates).
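
For reference, the percentages reported in this section are of two kinds: a relative change between the hovercards-off and hovercards-on groups, and a ratio of clicks to card views. The sketch below shows one plausible way to compute both. The function names are hypothetical, and the numbers in the usage lines are invented purely to show the arithmetic; they are not test results.

```python
def relative_change(control, treatment):
    """Percent change of the hovercards-on value relative to hovercards-off."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (treatment - control) / control * 100.0


def clickthrough_ratio(cards_viewed, cards_clicked):
    """Percentage of viewed hovercards that were followed by a click."""
    if cards_viewed == 0:
        return 0.0
    return cards_clicked / cards_viewed * 100.0


# Invented inputs, for illustration only.
change = relative_change(control=2.0, treatment=2.5)
ratio = clickthrough_ratio(cards_viewed=200, cards_clicked=50)
```

A positive `relative_change` corresponds to an increase with hovercards enabled, a negative one to a decrease.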

Changes in session depth
Session depth (average number of pages viewed per session) decreased for all three Wikipedias, with an x% decrease for Hungarian Wikipedia (Dates), a y% decrease for Italian Wikipedia, and a z% decrease for Russian Wikipedia (Dates).

A decrease was also observed in the number of links clicked per page (number of pages opened from a single page), with an x% decrease for Hungarian Wikipedia (Dates), a y% decrease for Italian Wikipedia, and a z% decrease for Russian Wikipedia (Dates).
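
Session depth as defined above is the number of distinct pages divided by the number of distinct sessions, computed separately for the hovercards-on and hovercards-off groups. The following is a minimal sketch under assumptions: the input shape (one tuple per page load) and the function name are hypothetical. The last lines apply the same division to the pageview and session counts from the table under Data sources, which cover only the hovercards-enabled Firefox group, so on their own they say nothing about the size of the decrease.

```python
from collections import defaultdict


def session_depth_by_group(page_loads):
    """Average pages viewed per session, keyed by hovercards on/off.

    `page_loads` is an iterable of (popup_enabled, session_token, page_token)
    tuples, one per page load (a hypothetical input shape).
    """
    pages = defaultdict(set)
    sessions = defaultdict(set)
    for popup_enabled, session_token, page_token in page_loads:
        pages[popup_enabled].add(page_token)
        sessions[popup_enabled].add(session_token)
    return {group: len(pages[group]) / len(sessions[group]) for group in pages}


# Pageviews / sessions for the enabled group, taken from the Data sources table.
enabled_depth = {
    "huwiki": 209548 / 91726,
    "itwiki": 87720 / 30381,
    "ruwiki": 47114 / 13960,
}
```

Comparing these values against the control group would require the corresponding hovercards-off counts, which are not listed on this page.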

Impact on fundraising
Results from the fundraising test on Hungarian Wikipedia: https://www.mediawiki.org/wiki/Extension:Popups/Fundraising_test_1

Fundraising tests on the Italian and Russian Wikipedias are in progress.

Rate of disabling hovercards
The rate of disabling hovercards was very low: x% for Hungarian (dates), y% for Italian, and z% for Russian (dates).

Frequency of Hovercards usage:
An increase in link interactions per page suggests that users are more likely to interact with other pages through the hovercards feature when they are not required to navigate away from the article they are reading. This is consistent with the hypothesis that hovercards lower the cost of exploring a link, making users more likely to explore multiple pages while reading without the cost of navigating to a new page. However, the observed flat rate of link interactions per session indicates that the feature may not be prompting readers to interact with more pages overall. The discrepancy between the two suggests that readers may be reading further into each selected page. However, this hypothesis will have to be confirmed by measuring the amount of time users spend on individual hovercards in the hovercards off and on groups.

Note: we are planning on adding this metric for future analysis. For more info, see: https://phabricator.wikimedia.org/T147314

Changes in session depth
The overall decrease in session depth and in the number of links clicked per page suggests that the information provided by hovercards is often sufficient for users. We can hypothesize that users were previously opening new pages to view the information that the hovercard now provides. With that information readily available, users no longer need to open as many pages as they did before hovercards were enabled.

Decreased back-button usage
{note: Confirm first before writing this section}

Rate of disabling hovercards
The rate of disabling hovercards was extremely low for all three tested Wikipedias, with x% for Hungarian, y% for Italian, and z% for Russian. To confirm that this rate was not due to usability issues, we cross-checked our qualitative test results, where all tested users disabled the feature without difficulty. This suggests that users keep the feature enabled because they like it, not because they are unable to turn it off.

For more info on our qualitative data: https://www.mediawiki.org/wiki/Wikimedia_Research/Design_Research/Reading_Team_UX_Research/Hovercards_Usability

Conclusion
Hovercards lower the cost of exploration: pages which users would have clicked through to in the past are now opened less often, indicating that users are getting the value they need from these pages from the summaries shown in the hovercards. A decrease in back-button usage indicates that users were able to judge from its hovercard summary whether an article was useful or interesting. In the past, users would have to open the new article, decide they were not interested, and click the back button to return to their previous location. Now, users can selectively target the articles they are particularly interested in.

Hovercards save users time and allow them to explore topics in depth without being distracted by other topics or confused by unfamiliar references.

The high rates of usage and low rates of disabling also indicate that users like the feature and find it valuable.

Next Steps
Due to the positive results above, we plan on promoting hovercards out of beta, beginning with smaller Wikipedias and working up to all appropriate Wikimedia projects.

Data sources:
SELECT wiki,
  COUNT(DISTINCT event_linkInteractionToken)/COUNT(DISTINCT event_sessionToken) AS cards_viewed_per_session,
  COUNT(DISTINCT event_linkInteractionToken)/COUNT(DISTINCT event_pageToken) AS cards_viewed_per_page,
  COUNT(DISTINCT event_sessionToken) AS sessions,
  COUNT(DISTINCT event_pageToken) AS pageviews
FROM log.Popups_15906495
WHERE wiki IN ('huwiki', 'itwiki', 'ruwiki')
  AND event_isAnon = 1
  AND event_popupEnabled = 1
  AND LEFT(timestamp, 8) >= '20160925'
  AND LEFT(timestamp, 8) < '20161023'
  AND (event_action = 'pageLoaded'
    OR (event_totalInteractionTime >= event_popupDelay + event_perceivedWait + 1000)) # card seen for at least one second
  AND INSTR(userAgent,'Firefox')
  AND NOT INSTR(userAgent,'Seamonkey')
GROUP BY wiki, event_popupEnabled
ORDER BY wiki, event_popupEnabled;

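+--------+--------------------------+-----------------------+----------+-----------+
| wiki   | cards_viewed_per_session | cards_viewed_per_page | sessions | pageviews |
+--------+--------------------------+-----------------------+----------+-----------+
| huwiki |                   0.6343 |                0.2777 |    91726 |    209548 |
| itwiki |                   0.8708 |                0.3016 |    30381 |     87720 |
| ruwiki |                   1.4167 |                0.4198 |    13960 |     47114 |
+--------+--------------------------+-----------------------+----------+-----------+
3 rows in set (10 min 11.54 sec)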