Collection Extension 2

This document describes a new version of the "collection extension", a.k.a. "book creator".

In a 2011 survey among Wikipedia readers, a large percentage of respondents expressed interest in using Wikipedia offline (41%) and in exporting articles to PDF (40%). Technically, this functionality is already provided by the collection extension, but apparently many users do not know about the feature and do not use it.

The goal of this effort is to make the collection and export functionality available to more Wikipedia users. We want to fundamentally question our assumptions about the way the collection extension works and come up with something that is significantly easier to use. This naturally entails changes in the labeling, functionality and placement of the collection extension in MediaWiki.

This document is a work in progress. Feedback is highly welcome.

Statistics for the collection extension
The collection extension was enabled in the English Wikipedia in May 2010. To understand the usage patterns, we analyzed HTTP logs covering 252 days, from Feb. 16, 2011 to Oct. 25, 2011. The logfiles contained anonymized session keys that made it possible to follow user behavior within a session.

Data Source
The logfiles included only collection extension requests (special:Book). Page views of "regular" Wikipedia articles were not included. The findings were normalized per session: for example, multiple PDF downloads of the same article in one session are counted only once. All numbers presented in the next section are averages per day over the 252 days of our analysis. We analyzed only data from the English Wikipedia.
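The normalization and averaging described above can be sketched as follows. This is only an illustration, not the script used for the analysis: the record layout `(day, session_key, action, article)`, the action name `download_pdf`, the function names and the sample records are all assumptions made for this example.

```python
def normalize(records):
    """Drop repeated records, keeping the first occurrence of each.

    Each record is a (day, session_key, action, article) tuple. This layout
    is hypothetical; the real log format is not described in this document.
    """
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


def daily_average(records, action, num_days):
    """Average number of normalized `action` records per day."""
    count = sum(1 for record in normalize(records) if record[2] == action)
    return count / num_days


# Made-up sample data: session s1 downloads "Berlin" twice on the same day.
records = [
    ("2011-02-16", "s1", "download_pdf", "Berlin"),
    ("2011-02-16", "s1", "download_pdf", "Berlin"),  # repeat, counted once
    ("2011-02-16", "s1", "download_pdf", "Paris"),
    ("2011-02-17", "s2", "download_pdf", "Berlin"),
]

print(len(normalize(records)))                    # 3
print(daily_average(records, "download_pdf", 2))  # 1.5
```

For the real analysis, `num_days` would be 252.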

A word of warning on the numbers: A cross-check of the number of downloads from HTTP logs with the logs of the PDF render servers (pdf[1-3].wikimedia.org) revealed significant differences. The PDF render servers reported about five times as many downloads. A plausible explanation might be that the HTTP logs captured only a fraction of the actual traffic.

The data for uploads and orders of PediaPress books is taken from PediaPress logs.

Results
The collection extension is used by only a tiny fraction of Wikipedia users. Wikipedia receives about 8 billion page views and 400 million unique users per month. Dividing page views by unique users gives a rough estimate of 20 page views per session. Approximately 250 million page views per day would therefore translate to about 12.5 million user sessions per day.



Compared to these 12.5 million sessions, the number of PDF download sessions is tiny: only 12,698 sessions (0.1%) interacted with the collection extension and only 7,450 sessions contained downloads (0.06%).
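The estimate can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in this section; the variable and function names are chosen for illustration.

```python
PAGE_VIEWS_PER_MONTH = 8_000_000_000
UNIQUE_USERS_PER_MONTH = 400_000_000
PAGE_VIEWS_PER_DAY = 250_000_000  # approximate daily figure used in the text

# Rough assumption from the text: one unique user corresponds to one session.
views_per_session = PAGE_VIEWS_PER_MONTH / UNIQUE_USERS_PER_MONTH
sessions_per_day = PAGE_VIEWS_PER_DAY / views_per_session


def share_of_sessions(count):
    """Percentage of the estimated daily sessions that `count` represents."""
    return 100 * count / sessions_per_day


print(views_per_session)                    # 20.0
print(sessions_per_day)                     # 12500000.0
print(f"{share_of_sessions(12_698):.2f}%")  # 0.10%
print(f"{share_of_sessions(7_450):.2f}%")   # 0.06%
```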



In the 7,450 download sessions the users downloaded 15,309 files. 15,230 of these files were PDF files, 66 were in OpenDocument format, and 13 in ZIM format. Only 462 of the PDF files contained a collection of more than one article.



Most of the users download single articles in PDF format. The book creator is used only by very few users. Our logfile recorded 2,958 clicks on the "Create a book" link in the left navigation column. Only 966 proceeded past the "Start book creator" page and actually activated the book creator toolbar. 232 uploaded their collection to PediaPress and 4 actually ordered a book.
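To make the drop-off between the steps easier to see, the counts above can be expressed as a simple funnel. The sketch below uses only the daily averages quoted in the paragraph; the step labels and the function name are illustrative.

```python
funnel = [
    ('Clicked "Create a book"', 2_958),
    ("Activated the book creator toolbar", 966),
    ("Uploaded collection to PediaPress", 232),
    ("Ordered a book", 4),
]


def funnel_rates(steps):
    """Return (label, count, % of first step, % of previous step) per step."""
    first = steps[0][1]
    previous = first
    rows = []
    for label, count in steps:
        rows.append((label, count, 100 * count / first, 100 * count / previous))
        previous = count
    return rows


for label, count, of_first, of_previous in funnel_rates(funnel):
    print(f"{label:<36} {count:>5} {of_first:7.2f}% {of_previous:7.2f}%")
```

By this calculation, roughly a third of the clicks on "Create a book" lead to an activated toolbar, and only a small fraction of a percent end in an order.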

Metaphor
The collection extension allows Wikipedia users to collect articles and store them for later. Right now the collection extension uses a book metaphor to present its functionality to users ("create a book", "insert chapter"). The results from the user survey and the logfile analysis suggest that this metaphor does not work well. Wikipedia users said they wanted to "save articles for offline reading" (41%) or "bookmark articles for later viewing or repeated viewings" (36%). So it does not come as a surprise that people do not look for this functionality under "Create a book".

The goal of this rebranding effort is to search for a metaphor that better relates to the mental model of users.

The concept or "interaction pattern" of collections is very well known on the web, but it is quite often disguised in different metaphors. Erin Malone, for example, divides Collecting into the related articles "Saving", "Favorites", "Tagging" and "Displaying". These activities only partially match what a user can do with the collection extension:


 * The articles themselves are not "saved" by the collection extension: the user only saves a pointer to the original item and not a copy.
 * "Favorites" are a closer match in terms of functionality, but they imply a rating of the article that might not be appropriate for our collections.
 * "Tagging" is a completely different activity, and
 * "Displaying" is implemented in the "Manage your book" page.

Another solution might be the "shopping cart" metaphor. It fits pretty well in terms of functionality, but implies commercial activities that are not required and not intended by most users.

Only recently, a new variation of the collection pattern, "Read later", has emerged, popularized by tools and companies such as Instapaper, Readability and ReadItLater. "Read later" seems to be the best match for the user needs expressed in the survey. Although the collection extension offers slightly different functionality, we will explore this concept further.

Information architecture
The following images show the current user flow when interacting with the collection extension.



When a user clicks on "Create a book", the extension checks (via cookie) whether the user already has an existing collection. If so, it displays a JavaScript popup asking whether they want to continue their previous collection. If the user clicks "OK", they are taken to the "Manage Collection" page. Otherwise, the "Start Book Creator" page with an explanation of the functionality is shown. When the user clicks on "Start Book Creator", they are taken back to the previous article and the Collection Toolbar is enabled. The whole starting process seems overly complicated and could be greatly simplified.
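The branching in the current flow can be summarized in a few lines. The sketch below is a model of the description above, not code from the extension; the function name and parameters are hypothetical, and it assumes that declining the popup leads to the same "Start Book Creator" page as having no existing collection.

```python
def page_after_create_a_book(has_existing_collection, clicks_ok=False):
    """Return the page shown after a click on "Create a book".

    `has_existing_collection` stands in for the cookie check and `clicks_ok`
    for the user's answer to the JavaScript popup. Both names are
    hypothetical and exist only for this illustration.
    """
    if has_existing_collection and clicks_ok:
        return "Manage Collection"
    # Assumption: no existing collection and a declined popup both end here.
    return "Start Book Creator"


print(page_after_create_a_book(True, clicks_ok=True))   # Manage Collection
print(page_after_create_a_book(True, clicks_ok=False))  # Start Book Creator
print(page_after_create_a_book(False))                  # Start Book Creator
```

A new user therefore needs at least two clicks and one intermediate page before the first article can be added.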





The new collection extension should always be active and operate modelessly. This means that the controls that are currently presented in the toolbar should be visible all the time and integrated into the layout. That way, articles can be added to and removed from the collection at any time.

The "Suggest Articles" feature should be integrated into the "Manage Collection" page. There is no need for a separate page.



Website Placement and Layout
Commercial websites often use a toolbar or toolbox to link to print versions and display social media features. This functionality is most often placed "above the fold" to be instantly accessible to readers. Wikipedia uses a top navigation area to display a search box and various buttons that are useful mainly for editors. Therefore, print and collection functionality should be placed inside the main content of the page.

Wireframe Layout for placing the print and Collection extension buttons
The following wireframes illustrate the new ideas for the collection extension. They are a work in progress and not pixel-perfect.




 * The "Print" and "Read Later" buttons are placed inside the "content pane".
 * The buttons are aligned with the page title.
 * A click on each of the buttons opens a panel that floats on top of the page.
 * In the "Print" panel, users can find the links to the "Printable version" and to "Download as PDF".
 * Articles can be added and removed directly from the "Read Later"-panel.
 * The collection extension is always active. There is no need to activate it manually.
 * The "Show List"-link takes users to the "Manage Collection" page.

Wireframe for the "Manage collection" page
The layout of the "Manage Collection" page can also be simplified and adjusted to fit the new "Reading List" metaphor.




 * Changed: The title of the page becomes "Manage Your Reading Lists" to reflect the change in metaphor.
 * New: The page is divided into two columns.
 * New: The left side displays the active list of articles; the right side displays related functions like multi-list management, export, and sharing.
 * Changed: New items are added on top of the list. By default, the list displays the items in reverse chronological order. (This is the default for reading lists.)
 * New: When you hover over a list item, three buttons are displayed ("Move", "Delete" and "Rename").
 * Changed: When you click on the article title, the article opens in the current window (normal link behavior like everywhere else in Wikipedia).
 * Changed: The default title of the list is "Reading List". The list can be renamed.
 * Changed: On top of the reading list is a horizontal toolbar with "Insert Folder", "Sort", and "Empty List".
 * New: The list of suggested articles is displayed below the regular Reading List. As this section might not be of interest to all users, it could possibly be closed or inactive by default.
 * New: Users can easily manage multiple lists. The content of the lists is stored either via HTML5 local storage or, for logged-in users, directly in Wikipedia. The latter has the advantage of keeping your lists when accessing Wikipedia from multiple devices.
 * Changed: Only one list can be active at a time. Lists are automatically saved. Lists can be swapped by clicking on the list name in the "Your Reading Lists" section.
 * New: Lists can be deleted by hovering over a list and clicking a "delete" button.
 * Changed: Instead of a drop-down field, the various export formats are displayed as graphical icons.
 * Changed: Free export formats are displayed before the PediaPress export.
 * Changed: When a list is shared, it has to be transformed into a book first. This is handled by a different form which is not part of the "Manage Collection" page. There, a user can enter the title, subtitle, editor and description for the book and select the cover image and color.
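The list behavior described above (new items on top, a renamable default title, several lists with exactly one active) can be modeled in a few lines. This is a minimal sketch under stated assumptions, not a proposed implementation: the class and method names are hypothetical, duplicate handling is an assumption, and persistence (HTML5 local storage or server-side storage) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ReadingList:
    name: str = "Reading List"  # default title from the notes above
    articles: list = field(default_factory=list)

    def add(self, title):
        # New items go on top, so the list reads newest-first.
        # Ignoring duplicates is an assumption of this sketch.
        if title not in self.articles:
            self.articles.insert(0, title)

    def remove(self, title):
        if title in self.articles:
            self.articles.remove(title)


class ReadingLists:
    """Holds several lists; exactly one of them is active at a time."""

    def __init__(self):
        default = ReadingList()
        self.lists = {default.name: default}
        self.active = default

    def create(self, name):
        self.lists[name] = ReadingList(name)
        return self.lists[name]

    def switch(self, name):
        self.active = self.lists[name]

    def rename(self, old_name, new_name):
        reading_list = self.lists.pop(old_name)
        reading_list.name = new_name
        self.lists[new_name] = reading_list


lists = ReadingLists()
lists.active.add("Berlin")
lists.active.add("Paris")
print(lists.active.articles)  # ['Paris', 'Berlin']
```

Because adding and removing always operate on `lists.active`, the "Read Later" panel would never need a separate activation step.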