Page Previews/2016 A/B Tests

To determine changes in reading behavior and evaluate the success of the Hovercards beta feature, three A/B tests were conducted on the Hungarian, Italian, and Russian Wikipedias. The A/B test on the Hungarian Wikipedia started on 7 June 2016. The A/B tests on Italian and Russian Wikipedia started on 23 September 2016.

The team's goals were to establish the success of the Hovercards feature, measure the effect on user engagement, and compare these tests to previous quantitative and qualitative results. Hovercards performed well in an earlier test on Greek and Catalan Wikipedias, with users reporting positive interactions with the feature. We wanted to double-check these results on a larger scale, while learning more about users' reading behavior.

From the results of these tests, we can conclude that the Hovercards feature is employed by users on all tested Wikipedias and that Hovercards facilitate positive changes in reading behavior in three ways:
 * By increasing the precision with which users select the pages they read.
 * By reducing the cost of exploring other pages.
 * By allowing users to selectively focus on a single topic by providing context within a page.

Disclaimer: A number of issues with the instrumentation of the Hovercards tests were recorded during this period. To mitigate these issues, we list results from the Firefox browser only, while noting that open questions about the accuracy of the results remain.

Introduction
Hovercards are designed to reduce the cost of exploration of a link, as well as to promote learning by allowing readers to gain context on the article they are reading or to define an unfamiliar event, idea, object, or term without navigating away from their original topic.

Through our analysis, we wanted to study user behavior towards the Hovercards feature to answer the following questions:


 * 1) Do people use Hovercards to preview other pages? Since interacting with the content of a different page no longer requires users to navigate away from the article they are currently reading, we expect users to interact with more links within a given page when Hovercards are turned on.
 * 2) How often do users disable Hovercards? A high disable rate would indicate that users do not want this feature. A low rate would indicate that users like the feature and would continue using it.

We also wanted to study how Hovercards change reading behavior:
 * 1) Changes in session depth - readers will only be opening the pages they are particularly interested in and viewing summaries on all other pages. Thus, we expect to see a decrease in pages viewed per browser session when Hovercards are turned on.
 * 2) Changes in link interactions - as users will be interacting with links differently than before, we expect a decrease in shorter link interactions for the Hovercards on group, as users will avoid opening Hovercards for articles they do not want extra information on. We expect an increase in longer link interactions, as users will hover over links for longer periods of time to ensure their Hovercards open.
 * 3) Impact on fundraising - we are hypothesizing that implementing Hovercards will result in a decrease in pageviews. We wanted to measure the impact of Hovercards on donations.

Methodology
Hungarian Wikipedia was selected for the first experiment, as it is a mid-sized Wikipedia where the community was receptive to testing Hovercards. For the later experiments, we wanted to replicate the results on larger Wikipedias, selecting Russian and Italian based on community response. For Hungarian Wikipedia, Hovercards were enabled for 50% of anonymous and logged-in browser sessions on June 7. For Italian Wikipedia, Hovercards were enabled for 20% of anonymous browser sessions on Sept 23. For Russian Wikipedia, Hovercards were enabled for 10% of anonymous browser sessions on Sept 23. The sample rates for each Wikipedia were chosen so that sufficient data could be collected without burdening our EventLogging servers. The Hovercards test is scheduled to end on Nov 18, with the possibility of extending the test to account for technical issues in the instrumentation observed while running the tests. Results displayed below are from the period of Sept 23 - Nov 10.
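As an illustration of how per-wiki sample rates like these can be applied, the sketch below assigns a browser session to the Hovercards ON group by hashing its session token. Only the three rates come from the text above. The function name, the use of SHA-256, and the token format are assumptions made for this example, not a description of how the Popups extension actually buckets sessions.

```python
import hashlib

# Sample rates as stated in the Methodology text; everything else is hypothetical.
SAMPLE_RATES = {"huwiki": 0.50, "itwiki": 0.20, "ruwiki": 0.10}


def hovercards_enabled(wiki: str, session_token: str) -> bool:
    """Deterministically decide whether a session falls in the Hovercards ON group."""
    digest = hashlib.sha256(session_token.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a position in [0, 1).
    position = int.from_bytes(digest[:8], "big") / 2**64
    # Wikis without a configured rate never get Hovercards in this sketch.
    return position < SAMPLE_RATES.get(wiki, 0.0)
```

Hashing the token rather than drawing a random number on every pageview keeps a session in the same group for its whole lifetime, which per-session metrics such as session depth depend on.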

Data collection: Due to instrumentation issues within our data collection, we have employed the following restrictions for presenting the results below:
 * Removal of duplicates - more info: https://phabricator.wikimedia.org/T140485
 * Restricting data to Firefox - more info: https://phabricator.wikimedia.org/T146840
 * Other concerns we are currently considering:
 ** https://phabricator.wikimedia.org/T145665
 ** https://phabricator.wikimedia.org/T141922

Frequency of Hovercards usage
We examined how often readers intentionally looked at Hovercards (defined as a card showing for at least 1.5 seconds without the reader either clicking through to the linked article or dismissing the card by moving the mouse away). The average number of cards looked at per pageview was 0.28 on the Hungarian Wikipedia, 0.30 on the Italian Wikipedia, and 0.43 on the Russian Wikipedia. Per browser session, 0.65 cards were viewed on the Hungarian Wikipedia, 0.89 on the Italian Wikipedia, and 1.43 on the Russian Wikipedia (as of Nov. 1).
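The per-pageview and per-session averages can be sketched as a small aggregation over logged events. The example below follows the shape of the "Cards Viewed" query under Data sources: it counts distinct link interactions whose total interaction time exceeds the popup delay plus the perceived wait plus a viewing threshold, then divides by distinct pages and sessions. The dictionary keys are shortened, hypothetical versions of the schema field names. The threshold is a parameter because the prose cites 1.5 seconds while the query uses 1000 ms.

```python
def cards_viewed(events, min_view_ms=1000):
    """Return (cards per pageview, cards per session) for a list of event dicts."""
    pages, sessions, cards = set(), set(), set()
    for event in events:
        is_page_load = event["action"] == "pageLoaded"
        # A card counts as seen if it stayed up long enough after it appeared.
        seen = not is_page_load and (
            event["totalInteractionTime"]
            >= event["popupDelay"] + event["perceivedWait"] + min_view_ms
        )
        if not (is_page_load or seen):
            continue  # short hovers are ignored entirely, as in the query
        pages.add(event["pageToken"])
        sessions.add(event["sessionToken"])
        if seen:
            cards.add(event["linkInteractionToken"])
    if not pages:
        raise ValueError("no qualifying events")
    return len(cards) / len(pages), len(cards) / len(sessions)
```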

Hovercards usage was on average highest for Russian Wikipedia. No notable changes in usage were observed over time.



Rate of Disabling of Hovercards Feature
The disable rate (the ratio of sessions in which users decided to disable the Hovercards feature) was very low: the rate of clicking the settings cog for Hovercards was 0.02% for Hungarian Wikipedia, 0.034% for Italian Wikipedia, and 0.016% for Russian Wikipedia. The rates for disabling the feature were even lower.

Note: We know from user tests (on English Wikipedia) that readers are likely to find this settings menu if they want to disable Hovercards.

Changes in session depth
Average session depth (average number of pages viewed per session) remained relatively equal between the Hovercards on and off groups. Session depth was higher for the Hovercards off group for Russian Wikipedia, and higher for the Hovercards on group for Hungarian and Italian Wikipedias.

An increase in session depth was observed for Hungarian and Italian Wikipedia throughout the duration of the A/B test. We will be investigating the reasons behind the increase in further iterations of our analysis.



Changes in link interactions
Link interactions, when restricted to >300ms to confirm intentionality, were relatively similar for the Hovercards on and off groups, with more interactions recorded for the Hovercards off group for Italian and Hungarian Wikipedias, and fewer for Russian Wikipedia. However, when interactions were restricted to greater than 1s (to approximate the time a user needs to begin processing the information provided by the Hovercard), an average increase of 41% was observed. Link interactions were highest for Russian Wikipedia.

Link interactions > 1000ms

Link interactions > 300ms
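The threshold comparison described in this section can be sketched as follows. The example counts link interactions longer than a given duration per session and computes the relative change between the Hovercards on and off groups. The function names and the input shape (a mapping from session token to a list of interaction durations in milliseconds) are assumptions for illustration, and the sample data is made up.

```python
def interactions_per_session(durations_by_session, min_ms):
    """Average number of link interactions longer than min_ms per session."""
    if not durations_by_session:
        raise ValueError("no sessions")
    total = sum(
        sum(1 for duration in durations if duration > min_ms)
        for durations in durations_by_session.values()
    )
    return total / len(durations_by_session)


def relative_change(on_value, off_value):
    """Change of the ON group relative to the OFF group (0.41 means +41%)."""
    return (on_value - off_value) / off_value


# Made-up durations in milliseconds, for illustration only.
off_group = {"a": [350, 400], "b": [1200]}
on_group = {"c": [1500, 2000], "d": [320]}
for threshold in (300, 1000):
    on = interactions_per_session(on_group, threshold)
    off = interactions_per_session(off_group, threshold)
    print(threshold, on, off, relative_change(on, off))
```

In this toy data the two groups look identical at the 300 ms threshold and diverge at 1000 ms, which is the pattern the section reports.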

Clickthrough Rates for Hovercards
The average clickthrough rate for Hovercards ON groups was 0.079 across the three tested Wikipedias. Details below:

Impact on fundraising
Results from the fundraising test on Hungarian Wikipedia: https://www.mediawiki.org/wiki/Extension:Popups/Fundraising_test_1. The fundraising tests on Italian and Russian Wikipedias are in progress.

Frequency of Hovercards usage:
All three tested Wikipedias showed very high rates of Hovercards usage, indicating that readers find this feature useful.

Further, if we define a page interaction as any instance of a user interacting with the content of another page, and compare the rate of page interactions for the Hovercards off and Hovercards on groups, we get a 31% increase in overall page interactions. This allows us to conclude that Hovercards make the content of all articles more available to readers. It also confirms the hypothesis that Hovercards lower the cost of exploring a link, making users more likely to explore multiple pages while reading without the cost of navigating to a new page. Note: the number of total interactions per session includes multiple interactions with the same page at different times. We correct for users interacting with the same page through both a hovercard and a pageview by subtracting the clickthroughs for hovercards:

$$\text{total page interactions}_{\text{Hovercards ON}} = \text{Total Hovercards interactions per session} + \text{Total pages viewed per session} - \text{Clickthrough when a hovercard is seen}$$
$$= 0.99 + 2.88 - 0.080 = 3.79 \text{ page interactions/session}$$
$$\text{total page interactions}_{\text{Hovercards OFF}} = 2.88 \text{ page interactions/session}$$
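The arithmetic behind the reported increase can be checked with a few lines. The inputs are the figures given in the formula above. The function name is chosen for this example, and treating the OFF group as having zero hovercard interactions and zero clickthroughs is an assumption that follows from the feature being disabled.

```python
def total_page_interactions(cards_per_session, pages_per_session, clickthroughs_per_session):
    """Hovercards seen plus pages viewed, minus pages reached by clicking through a card."""
    return cards_per_session + pages_per_session - clickthroughs_per_session


on = total_page_interactions(0.99, 2.88, 0.080)
off = total_page_interactions(0.0, 2.88, 0.0)  # no cards when the feature is off
increase = (on - off) / off
print(f"{on:.2f} vs {off:.2f}: {increase:.1%} increase")
# prints "3.79 vs 2.88: 31.6% increase"
```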

We also seem to observe notable differences between the tested Wikipedias, with the average frequency of usage on Russian Wikipedia higher than on Hungarian and Italian Wikipedias. These differences could be due to the greater availability of articles and page links on larger Wikipedias, although further analysis would be needed to establish this claim. No increase or decrease in usage was observed over time.

Rate of disabling Hovercards
The rate of disabling Hovercards was extremely low for all three tested Wikipedias, with an average of 0.02%. To confirm that this rate was not due to usability issues, we cross-checked our qualitative test results, where all tested users disabled the feature without difficulty. We can thus conclude that users like the feature and do not wish to disable it. We can also conclude that the feature does not get in the way of a user’s ability to read each article.

For more info on our qualitative data: https://www.mediawiki.org/wiki/Wikimedia_Research/Design_Research/Reading_Team_UX_Research/Hovercards_Usability

Changes in Session Depth
Contrary to our previous hypothesis, we did not see a large decrease in session depth across all Wikipedias. While we would like to investigate further into the validity of these results, an early hypothesis is that Hovercards provide users with more precision when opening an article. Thus, the cost of exploration of a new page is lowered and users are opening pages they would have previously skipped.

Interactions within a given page
An increase in longer link interactions per page suggests that users are more likely to interact with other pages using the Hovercards feature when not required to navigate away from the article they are reading. However, the observed flat rate of total link interactions per session indicates that the feature may not be prompting readers to interact with more pages overall. The discrepancy between the two suggests that readers may be reading further into each selected page. However, this hypothesis will have to be confirmed by measuring the amount of time users spend on a page in the Hovercards off and on groups.

Note: we are planning on adding this metric for future analysis. For more info, see: https://phabricator.wikimedia.org/T147314

Conclusion
This set of A/B tests allowed us to confirm the results previously seen in the Greek and Catalan tests for Hovercards. Users are employing the Hovercards feature at a high rate, indicating that they find the feature useful. Rates of disabling the feature are very low, indicating that Hovercards do not get in the way of the user’s preferred reading experience and are beneficial to their current reading experience.

Changes in link interactions and session depth give us further glimpses into the ways in which Hovercards may alter the reading workflow. These sections will be studied further in the future. Based on our initial results, we would like to explore whether users are viewing more pages due to a decrease in the cost of exploration of other pages. We would also like to study individual user sessions, to note other changes in reading patterns.

Hovercards save time for users, and allow them to explore topics in depth, without being distracted by other topics or confused by unfamiliar references.

Next Steps
Due to the positive results above, we plan on promoting Hovercards out of beta, beginning with smaller Wikipedias, and working up to all appropriate Wikimedia projects.

Data sources
Cards Viewed
SELECT wiki,
  COUNT(DISTINCT event_linkInteractionToken)/COUNT(DISTINCT event_sessionToken) AS cards_viewed_per_session,
  COUNT(DISTINCT event_linkInteractionToken)/COUNT(DISTINCT event_pageToken) AS cards_viewed_per_page,
  COUNT(DISTINCT event_sessionToken) AS sessions,
  COUNT(DISTINCT event_pageToken) AS pageviews
FROM log.Popups_15906495
WHERE wiki IN ('huwiki', 'itwiki', 'ruwiki')
  AND event_isAnon = 1
  AND event_popupEnabled = 1
  AND LEFT(timestamp, 8) >= '20160925'
  AND LEFT(timestamp, 8) < '20161023'
  AND (event_action = 'pageLoaded'
    OR (event_totalInteractionTime >= event_popupDelay + event_perceivedWait + 1000)) # card seen for at least one second
  AND INSTR(userAgent,'Firefox')
  AND NOT INSTR(userAgent,'Seamonkey')
GROUP BY wiki, event_popupEnabled
ORDER BY wiki, event_popupEnabled;

Link interactions per page (>300ms)
SELECT wiki, event_popupEnabled,
  COUNT(DISTINCT event_linkInteractionToken)/COUNT(DISTINCT event_pageToken) AS link_interactions_per_page
FROM log.Popups_15906495
WHERE (event_action = 'pageLoaded' OR event_totalInteractionTime > 300)
  AND wiki IN ('huwiki', 'itwiki', 'ruwiki')
  AND event_isAnon = 1
  AND LEFT(timestamp, 8) >= '20160925'
  AND LEFT(timestamp, 8) < '20161030'
  AND INSTR(userAgent,'Firefox')
  AND NOT INSTR(userAgent,'Seamonkey')
  AND (event_action = 'pageLoaded' OR event_linkInteractionToken IS NOT NULL)
GROUP BY wiki, event_popupEnabled

Link interactions per session (>300ms)
SELECT wiki, event_popupEnabled,
  COUNT(DISTINCT event_linkInteractionToken)/COUNT(DISTINCT event_sessionToken) AS link_interactions_per_session,
  COUNT(DISTINCT event_sessionToken) AS sessions
FROM log.Popups_15906495
WHERE (event_action = 'pageLoaded' OR event_totalInteractionTime > 300)
  AND wiki IN ('huwiki', 'itwiki', 'ruwiki')
  AND event_isAnon = 1
  AND LEFT(timestamp, 8) >= '20160925'
  AND LEFT(timestamp, 8) < '20161030'
  AND INSTR(userAgent,'Firefox')
  AND NOT INSTR(userAgent,'Seamonkey')
  AND (event_action = 'pageLoaded' OR event_linkInteractionToken IS NOT NULL)
GROUP BY wiki, event_popupEnabled

Disables (upper bound: settings cog clicks)
SELECT wiki,
  SUM(1) AS Hovercards_shown,
  SUM(IF(event_action = 'tapped settings cog',1,0)) AS cogtaps
FROM log.Popups_15906495
WHERE event_popupEnabled = 1
  AND wiki IN ('huwiki', 'itwiki', 'ruwiki')
  AND event_isAnon = 1
  AND LEFT(timestamp, 8) >= '20160925'
  AND LEFT(timestamp, 8) < '20161030'
  AND event_linkInteractionToken IS NOT NULL
  AND event_totalInteractionTime > event_perceivedWait # i.e. card was shown
GROUP BY wiki
ORDER BY wiki;

Pageviews per session (session depth)
SELECT wiki, event_popupEnabled,
  COUNT(*) AS pageloaded_events,
  COUNT(DISTINCT event_pageToken) AS pageTokens,
  COUNT(DISTINCT event_sessionToken) AS sessions,
  COUNT(DISTINCT event_pageToken)/COUNT(DISTINCT event_sessionToken) AS page_tokens_per_session
FROM log.Popups_15906495
WHERE wiki IN ('huwiki', 'itwiki', 'ruwiki')
  AND event_isAnon = 1
  AND LEFT(timestamp, 8) >= '20160925'
  AND LEFT(timestamp, 8) < '20161030'
  AND event_action = 'pageLoaded'
  AND INSTR(userAgent,'Firefox')
  AND NOT INSTR(userAgent,'Seamonkey')
GROUP BY wiki, event_popupEnabled
ORDER BY wiki, event_popupEnabled;

Clickthrough rate (for "seen" Hovercards)
SELECT wiki,
  SUM(IF(event_action LIKE 'opened%',1,0))/SUM(1) AS clickthrough_ratio,
  SUM(1) AS cards_seen
FROM log.Popups_15906495
WHERE event_popupEnabled = 1
  AND wiki IN ('huwiki', 'itwiki', 'ruwiki')
  AND event_isAnon = 1
  AND LEFT(timestamp, 8) >= '20160925'
  AND LEFT(timestamp, 8) < '20161030'
  AND event_totalInteractionTime > event_perceivedWait + 1000
  AND event_linkInteractionToken IS NOT NULL
GROUP BY wiki