The Wikipedia Library/Search

The Wikipedia Library is integrating a search tool into the Library Card platform to enable users to search across the library's collections from one place. This project page summarises this work and will provide updates as it progresses.

Please leave comments, feedback, and questions on the talk page.

Background
Users of The Wikipedia Library have access to content from more than 60 publishers, most of which hold several distinct collections. Available content totals more than 100,000 periodicals, comprising a very large number of individual sources that editors may wish to access, in addition to books, data, and other sources.

In its current form, the Library directs users to each publisher’s website individually, where users can then use the unique search and discovery capabilities of that website to search across its content. This presents a number of challenges to users. They ideally need to know which publishers have the content they want before searching, and must navigate a new website interface for every publisher they access. Advanced searches or filters, such as date ranges, need to be re-entered on each website. In effect, users need a high level of research literacy to identify publishers with relevant content, and may spend a long time searching before finding the right information. This leads to frustration and confusion.

We want to provide editors with an easy way to search across all of their available collections from a single location, removing the need to visit individual websites and allowing cross-cutting filtering. We will present Library Bundle content (for which users simply need to meet an automatically verified activity threshold) as the default results, ensuring that users can navigate the results with confidence. We will also index free-to-read content and provide links to open access versions where possible.

Building a cross-publisher search platform ourselves is well out of scope for our team, and is a problem that other organisations are already solving. Major search products are already being used by libraries around the world, including Primo, WorldCat, and EBSCO Discovery Service. These products provide fully fledged search platforms and index collections from publishers, keeping them up to date for libraries.

EBSCO Discovery Service
We have a hosted instance of EBSCO Discovery Service (EDS), a library holdings search platform, for this project. The platform was chosen for three primary reasons: our ongoing good relationship with EBSCO, a high level of customisability, and an interface that has been translated into a substantial number of languages (~30 at the time of writing).

EDS can be configured to index the content that The Wikipedia Library has access to through its partnerships. EBSCO keeps these databases up to date, so we only need to flag the collections we want to index and their contents will be updated automatically. EDS has a wide range of configurable settings for the interface presented to users. Most importantly, we can add our own JavaScript and CSS to the interface to customise the user experience in more detail.

EBSCO also makes available a range of EBSCO Apps - technical solutions that have been developed to support specific workflows and use cases and are made available to all libraries. So far we have installed the following apps:


 * Unpaywall, which adds links to open access versions of results.
 * Zotero, which allows users to save a result to the bibliographic management software Zotero.
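To make the idea of an open access link concrete, the sketch below shows how a link of this kind could be derived from a DOI using the public Unpaywall REST API. It is an illustration only: the source does not describe how the EBSCO app works internally, the function names are hypothetical, and the endpoint and field names (`is_oa`, `best_oa_location`, `url_for_pdf`, `url`) reflect our understanding of the Unpaywall v2 API rather than anything configured in EDS.

```python
from typing import Optional
from urllib.parse import quote

# Base URL of the public Unpaywall v2 API (assumption: not taken from this page).
UNPAYWALL_BASE = "https://api.unpaywall.org/v2/"


def unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL for a DOI; Unpaywall asks callers to pass an email."""
    return f"{UNPAYWALL_BASE}{quote(doi, safe='/')}?email={quote(email)}"


def open_access_link(record: dict) -> Optional[str]:
    """Pick a free-to-read link from an Unpaywall-style record, if one exists."""
    if not record.get("is_oa"):
        return None
    location = record.get("best_oa_location") or {}
    # Prefer a direct PDF link, falling back to the general location URL.
    return location.get("url_for_pdf") or location.get("url")


# Hypothetical record, shaped like an Unpaywall response.
sample = {
    "is_oa": True,
    "best_oa_location": {"url": "https://example.org/article", "url_for_pdf": None},
}
link = open_access_link(sample)
```

A link returned this way is the kind of free-to-read URL an editor could add to a Wikipedia citation alongside the paywalled source.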

User stories

 * As a Library Card user, I want to search for content from all my collections in one place so I can find the right sources faster
 * As a Library Card user, I want to browse content from each collection I have access to in the same place so that I don’t need to learn and use multiple interfaces
 * As a Library Card user, I want flexible filtering and advanced search options so that I can find the most suitable content
 * As a Library Card user, I want to see open access links so that I can add free-to-read links to Wikipedia articles
 * As a Library Card user with little research experience, I want guidance on how to use the interface so that I can research effectively
 * As a Wikipedia editor, I want to browse content available through The Wikipedia Library so that I can identify collections to apply for

Design
EBSCO Discovery Service comes with an out-of-the-box design which we would like to further customise. Many interface elements may not be needed, or could be confusing, and we want to ensure the design is consistent with the Library Card platform.

Design iterations will be posted here as we work on them.

Implementation
Integration details are tracked at https://phabricator.wikimedia.org/T240128 and its subtasks.

In the default view presented to users, we will only index content from the Library Bundle. This accounts for more than 60% of our content across ~25 collections. While we would ideally index all of our content, we feel that this would lead to a confusing user experience in which users can't easily tell which results they do or don't have access to. Users will have an option to browse all content, but individual results will not, at least in the initial deployment, highlight whether that content is accessible.
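The intended behaviour of the two views can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not how EDS is configured: in practice the choice of indexed collections is made in the EDS settings, and the `Result` type, the `visible_results` function, and the collection names below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Result:
    """A hypothetical search result tagged with the collection it comes from."""
    title: str
    collection: str


def visible_results(
    results: Iterable[Result],
    bundle_collections: Set[str],
    browse_all: bool = False,
) -> List[Result]:
    """Return Library Bundle results by default, or everything when browsing all."""
    if browse_all:
        # No accessibility marker is attached here, mirroring the initial deployment.
        return list(results)
    return [r for r in results if r.collection in bundle_collections]


# Hypothetical data: one Bundle collection and one that requires an application.
bundle = {"Collection A"}
results = [
    Result("Article one", "Collection A"),
    Result("Article two", "Collection B"),
]
default_view = visible_results(results, bundle)
all_view = visible_results(results, bundle, browse_all=True)
```

In the default view every result shown is one the user can open, which is the property the Bundle-only default is meant to guarantee.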