Note that there was a discussion on this topic with the community and the previous Executive Director of the Wikimedia Foundation: Knowledge Engine/FAQ
What is the Knowledge Engine?
"Knowledge Engine" was an early term used to describe a number of initiatives related to search and discovery of content across Wikimedia projects. The Knight Foundation listed it under "What we fund / Journalism / Knowledge Engine By Wikipedia", with the description: "To advance new models for finding information by supporting stage one development of the Knowledge Engine by Wikipedia, a system for discovering reliable and trustworthy public information on the Internet."
There were requests to publish the details of the grant, and Lila Tretikov shared her thoughts along with the grant's activities and outcomes. The Knight Foundation agreed to the publication of the grant application in February 2016.
As the concept evolved, it included a wide variety of ideas, many of which ended up being discarded. Although the original grant referred to "Knowledge Engine by Wikipedia" as "the Internet's first transparent search engine," the term "Knowledge Engine" was also used as shorthand for what the Discovery team was focusing on, rather than for a new project. The Wikimedia Foundation stopped using the term as of Q3 FY15-16 because it caused confusion.
Regardless of what "Knowledge Engine" may have meant to different people at different times in the past, this page reflects the current thinking and plans, as understood by the Discovery department.
Are you building a search engine to compete with Google?
No. As we said when this FAQ was created in early November 2015, we are not creating a general internet search engine. The team has been tasked with improving the search function for Wikimedia sites.
In the sense that every site on the Internet competes for attention, and especially as a "go-to" resource for users, Wikipedia and its sister sites compete with Google and other search engines for certain kinds of inquiries, namely those from people who want to learn. Our visitors are people who want information and knowledge, and our mission is to serve them well. Our current search function does not serve our users well, so we are improving the existing CirrusSearch infrastructure with better relevance, multi-language and multi-project search, and new data sources (like Maps) for our projects. We want a relevant and consistent experience for users across searches on both wikipedia.org and our project sites. Looking farther forward, we will explore including other sources of open knowledge. We remain fully committed to the movement's vision and values.
In the past, some within the WMF talked about building a noncommercial search engine. The WMF considered that direction, but it did not go anywhere as an actual deliverable. By the time this page was created, the focus was on the search function of Wikimedia projects, and the joint announcement with Knight noted that the focus of initial research was improving the in-house function. That announcement also leaves open the possibility of broadening the reach of the program as it develops to include other publicly available sources of information. In February 2016, the management of the WMF clarified that the WMF is "not building a global crawler search engine."
Nobody at the WMF intended for the search function at WMF sites to do things to help you find things to buy, as Google does, and there is no evidence that the WMF wanted to do all the many, many things that Google/Alphabet does.
Could search results in Wikipedia include more information from its sister projects?
The Wikimedia movement's vision is to make the sum of all human knowledge freely available to everyone. Wikipedia is our largest and most well-known project, but there are many other projects, like Wikimedia Commons and Wikidata, that move us towards making our vision happen. These projects have millions of users every month! So, can we make a search system that is good, meets the needs of our users, and shows content from around our movement? We think people would use it. As a result, it would bring more attention to the great work in projects across the movement by using one of the larger, more well-known places people visit.
What licenses will those new data sources be under?
This will need more discussion as we want to be able to conform to the standards and policies of the Wikimedia projects they would need to serve. Our first exploration was with OSM licensing and legal and we'll want to learn from that in any further work.
Does that mean we are looking to shift search traffic away from third parties?
We love all the third party traffic that we get and we hope that it increases over time.
What we want is to improve people's ability to find what they want, once they are here. Too many times, users experience this:
- Search on Google, Bing, etc.
- Follow a Wikipedia link
- Leave and search Google, Bing, etc. again, because they are looking for more information but couldn't find it using the on-wiki search (CirrusSearch).
We want people who come to Wikipedia and the other Wikimedia projects to find the information they want. The more effective we are at serving them, the more they will come and the more they will stay. This means we are better serving our mission.
In addition, the WMF survives on donations, and the more people come directly to us, and the more they find what they need here, the more likely they are to donate.
What is this grant for?
The Knight Foundation has awarded the Wikimedia Foundation an exploratory grant to research and evaluate ways to measure and improve search results on Wikimedia projects.
This is a restricted grant, and the funds may only be used by the Discovery team for the deliverables specified in the grant.
This grant does not increase the team's budget for this fiscal year.
- Grant agreement, with scope-of-work at the end - awarded 2015-09-18, accepted by WMF 2015-11-20
- Joint press release - Knight Foundation web site, 2016-01-06
- "Exploring how people discover knowledge on Wikipedia and its sister projects" - Knight Foundation blog, 2016-01-06
- "Wikimedia Foundation to explore new ways to search and discover reliable, relevant, free information with $250,000 from Knight Foundation" - Wikimedia Foundation blog, 2016-01-06
- Grant announcement - Knight Foundation web site, date unknown, identifying the grant period as running from 2015-09-01 to 2016-08-31
- Knowledge Engine FAQ with Lila
Who is the Knight Foundation?
The Knight Foundation is a philanthropic organization dedicated to supporting media and journalism and to fostering communities and the arts.
As part of their efforts, they have previously given grants to the Wikimedia Foundation to support Wikipedia Zero.
What are the activities the WMF said they would conduct under the grant?
- Answer key questions:
  - Would users go to Wikipedia if it were an open channel beyond an encyclopedia?
  - Can the Wikimedia Foundation get Wikipedia embedded via carriers and Original Equipment Manufacturers?
- Use Key Performance Indicators (KPIs) to inform product iteration, and establish key understanding and feature development for the prototypes
- Conduct tests with potential users
- Create a public-facing dashboard of key KPIs:
  - User satisfaction (by analyzing the rate at which queries surface relevant content)
  - User-perceived load time
  - No results rate
  - Application Programming Interface (API) usage
- Test results exploring relevancy of content surfaced
- Test results from research and user testing
- An improved search engine and API for Wikipedia searches
- A public-facing dashboard of core metrics used in product development
- A sample prototype on a small dataset to showcase possibilities
- User testing and research on current user flows to understand the search and discovery experience
- The release says: "Over the last decade, the world has seen a surge in digital information. People can access large amounts of information online, mostly through a small number of closed technologies. Through this project, the Wikimedia Foundation will test ways to make relevant information more accessible and investigate transparent methods for collecting, connecting and retrieving this information consistent with the values of Wikipedia and the open Web."
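
To make the "No results rate" KPI listed above more concrete, here is a minimal sketch of how such a figure could be computed. This FAQ does not define the metric precisely, so the sketch assumes it means the fraction of search queries that return zero results. The `SearchEvent` record and the `zero_results_rate` function are hypothetical names used only for illustration; they are not part of CirrusSearch or of the Discovery team's dashboards.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SearchEvent:
    """Hypothetical log record: one search query and how many results it returned."""
    query: str
    result_count: int


def zero_results_rate(events: Iterable[SearchEvent]) -> float:
    """Return the fraction of queries that produced no results.

    Assumption: the KPI is (queries with zero results) / (all queries).
    An empty log yields 0.0 rather than raising a division error.
    """
    events = list(events)
    if not events:
        return 0.0
    zero = sum(1 for event in events if event.result_count == 0)
    return zero / len(events)


# Made-up sample data, for illustration only.
sample = [
    SearchEvent("knight foundation", 12),
    SearchEvent("knwoledge engine", 0),  # misspelled query, no matches
    SearchEvent("cirrussearch", 3),
    SearchEvent("asdfgh", 0),
]

rate = zero_results_rate(sample)
print(f"No results rate: {rate:.0%}")  # prints "No results rate: 50%"
```

A real dashboard would likely also need to decide how to treat automated traffic and repeated queries, which this sketch ignores.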