Talk:Cross-wiki Search Result Improvements
This page used the Structured Discussions extension to give structured discussions. It has since been converted to wikitext, so the content and history here are only an approximation of what was actually displayed at the time these comments were made.
How do we want these new, additional, relevant search results to be displayed?
Some starter questions:
- Should the results from whatever wiki you're on be shown first, with an option to show more from other wikis?
- Should the additional results be inter-mixed with the local wiki results?
- Should the additional results be displayed off to the side (or maybe the bottom) of the results page?
- Should we have the option to turn off these other relevant search results (a user and/or project opt-out)?
- This could be a keyword search term or maybe a button for a visitor to click
- This could also be similar to the local: keyword, which searches only for images on the local wiki and not Commons files, for instance.
- Would the additional results be best displayed as a list or a grid design?
- Should we include relevant metadata (images and/or a short description) with the search results?
- Do the results need to have the size of the article (i.e.: 848 bytes (104 words)) and the date it was created/modified?
- Should we indicate that clicking on a result will take you to another wiki project?
- How many results from other wikis should we show - 1, 2, 3, or more?
- Should we limit the existing method of displaying results from the wiki that you searched on?
- We currently show up to 10,000 results in a paginated manner, but testing shows that generally only the first 3 results are ever acted upon. CKoerner (WMF) (talk) 20:36, 2 September 2016 (UTC)
- In my opinion:
- Numbers 1 - 3, 5 & 8: Show the search results of the wiki you're currently on, and then a separate column beside it for results from other Wikiprojects (in a list), titled something like "Results from other Wikiprojects". This means that the results of other wikis can be found, instead of being somewhere at the bottom where most people don't look. Don't mix the additional results with local wiki results, since it could look (in my opinion) very messy when that is the case. For example: English results mixed with languages that don't use Latin letters (like Arabic) and vice versa.
- Number 4: It would be useful to provide an option to turn it off, in either the Preferences page or when viewing the search results (or maybe both?). In my opinion it is better to avoid using prefixes, especially when someone doesn't know about that.
- Number 6: Images (the first image that is used on the page) would be useful, so that the (possible) reader globally knows what the subject of the page is. Kind of what currently happens when searching on the mobile Wikipedia site.
- Number 9: I think showing the most relevant 1 to 3 results at the top, and then at the bottom of those results the option "Show more results" (loaded on demand rather than when the search page first loads, or something like that), so that searching stays as fast as it was before the implementation.
- Number 10: I don't even know why you would look through 10,000 pages to find a subject... I think capping it at about 500 to 1,000 should be more than enough, since the testing shows that generally only the first 3 links are being used.
- BTW, this message may contain some Dunglish DarkShadowTNT (talk) 14:30, 8 September 2016 (UTC)
- Thanks for the feedback!
- I think we can add in a setting in the preferences that could be used to turn on (or off) the cross-wiki search results for the more experienced users. But I do like the idea of having it visible on the page to turn it on/off easily. DTankersley (WMF) (talk) 18:49, 8 September 2016 (UTC)
- Show the best match above a threshold for each of the other projects, in a single list, and with a "more" link which opens additional entries inline.
- An alternative is a "more like this" and then opening additional entries on the side. Think of it as a sideways stack, where l-to-r languages have the new column on the right. That would make it possible to drill down in a relevance ranked set, simply by clicking in the additional columns.
- If you select (click) on a page from another project, then that project should take precedence in the new column, but when you have selected two or more projects then all of them should take precedence.
- It follows from this that the page from column A could be a Wikipedia article and the page from column B could be a page from Wiktionary. The resulting relevance ranked column should then be pages that rank according to the selected pages from those two projects. Jeblad (talk) 09:02, 9 September 2016 (UTC)
- Hi @Jeblad - that sounds quite intriguing!
- Our initial focus is to gather that first selection of related (or "more like this") articles from the sister projects to display the first time a user enters a query into the search box.
- Your idea is to expand this process to display a second set of related results on the page result (that the user selected) and expanding to include results from the first wiki and the second clicked-through to wiki site. Sounds cool!
- We'll keep your idea on the backlog - as once we've launched the first iteration of showing additional relevant articles across wikis, we'll need to closely monitor it to see if our community likes it. :) DTankersley (WMF) (talk) 14:02, 9 September 2016 (UTC)
- I don't think a user has a clear idea of where a "more like this" starts, and thus the first level (s)he opens is probably pretty close to a simple "more" in the user's mental model. That could imply that it is wise to start with a simple solution, and then later on twist the idea into full relevance ranking.
- Forgot to mention that on smaller screens the additional columns can be rendered and then slide in horizontally, and the old columns sliding out, thereby signaling to the user how (s)he can navigate back to previous results. On large screens (4k++) the columns can stack up side by side. Jeblad (talk) 18:19, 9 September 2016 (UTC)
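The "best match above a threshold for each of the other projects" idea from this thread, with a "more" link that opens additional entries, could be sketched roughly as follows. This is only an illustration: the Hit class, the score field, and the threshold value are hypothetical and do not describe how the actual search backend ranks or groups results.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    project: str   # hypothetical project label, e.g. "wiktionary"
    title: str
    score: float   # hypothetical relevance score, higher is better


def best_per_project(hits, threshold, extra=0):
    """Group hits by project, keeping the best hit at or above `threshold`.

    `extra` is the number of additional entries a "more" link would reveal.
    Projects with no hit at or above the threshold are left out entirely.
    """
    by_project = {}
    for hit in hits:
        if hit.score >= threshold:
            by_project.setdefault(hit.project, []).append(hit)
    return {
        project: sorted(group, key=lambda h: h.score, reverse=True)[: 1 + extra]
        for project, group in by_project.items()
    }


# Made-up example data, only to show the call shape.
sample = [
    Hit("wiktionary", "gato", 0.9),
    Hit("wiktionary", "gatos", 0.7),
    Hit("wikiquote", "Gato", 0.2),
]
collapsed = best_per_project(sample, threshold=0.5)
expanded = best_per_project(sample, threshold=0.5, extra=2)
```

Here collapsed holds one entry per qualifying project, and expanded is what clicking "more" might show.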
Do we want these new search results to work across all Wikimedia projects?
- For example, if I'm on Wikiquote, do I want to also see relevant search results from Wikivoyage, Wikipedia or Wikinews?
- Or, if I'm on Wikipedia, just show me results from other projects? CKoerner (WMF) (talk) 20:37, 2 September 2016 (UTC)
- I wonder if specific projects have a given relevance for other projects, like Wiktionary having a higher relevance for Wikipedia, and a lower one for Wikispecies. It will probably also change given the categorization of pages within the projects. Wikispecies has a high relevance for articles in Wikipedia within biography, but would have a low relevance for art.
- If you do a search in a project, then the categories could be used as an indicator for how relevant (likely) some other project would be, given this specific result set. If a project is highly relevant, then the number of hits could be increased from 1 to 3 (just an example, use whatever number). Jeblad (talk) 09:18, 9 September 2016 (UTC)
- It really depends on the nature of the question. If someone is looking for the meaning of the Latin word ''vicesimanus'', Wiktionary information will be of most use, and it may not matter which language Wiktionary the results come from, as the word may only appear in a few projects, and might be illustrated with a picture, with a list of translations into other languages, or at least with an explanation in another language besides Latin. Likewise if someone is looking up the pronunciation of a word, or its syllabification for the purposes of hyphenating it, or synonyms. All of these features of a word may be presented on any Wiktionary, and may be found independently of the project language. EncycloPetey (talk) 02:17, 13 September 2016 (UTC)
- I don’t think the average user searching English Wiktionary would be happy with a definition of a Latin term that was in Finnish, Russian, or Chinese—generally in any non-Indo-European language or any language that doesn't use the Latin alphabet. The lack of readable cognates makes those pages useless. Look at the Russian page for gato (Spanish "cat"). If you don't at least know some Cyrillic, you can't get much out of that page. Finnish gato is actually better than I expected, but only because there are some cognates (Espanja, Portugali, and substantiivi). You can translate those pages using your browser or online tools, but I think that's getting into the realm of “power users” unfortunately.
- My intuition is that what most people want is results in the language of the project they are on, or projects in the same language. (Exception: when their query is clearly in another language. Exception to the exception: when they are on Wiktionary—which is where I often go for words I don’t know even when they are not in English.) Users could also use results in other languages they can read (which they need to specify or we need to surmise, say, based on browser settings). Only power users and researchers are going to dig into results for languages they don't know. This may change over time as machine translation gets better and people become more sophisticated about handling text in other languages—but I think most people aren't there yet.
- I’m open to other opinions on user preferences and the typicality of any given use cases, of course!
- However, there may be some technical limitations. We can’t index English Wiktionary both with all the other English projects and with all the other Wiktionary projects. Searching across all Wiktionaries without a shared index is probably too resource intensive to be practical. TJones (WMF) (talk) 17:32, 13 September 2016 (UTC)
- Re: "Only power users and researchers are going to dig into results for languages they don't know." I disagree. During the time I was seriously active on Wiktionary, requests for translations into languages the user did not know were very common. We had daily requests for assistance. EncycloPetey (talk) 18:26, 13 September 2016 (UTC)
- Interesting! Requests for translations into, say, Russian, seems very different from using a Wiktionary page in Russian (without machine translation). TJones (WMF) (talk) 18:35, 13 September 2016 (UTC)
- For example, if I'm on Wikiquote, do I want to also see relevant search results from Wikivoyage, Wikipedia or Wikinews?
- Or, if I'm on Wikipedia, just show me results from other projects?
- This answers it better than anything (in short both):
- https://xkcd.com/214/
- To be clear, this "problem" should be expanded to most projects so that anyone can keep jumping from wiktionary or commons to others, and back again in a perfect loop. If nothing else it helps with cross-wiki vandalism detection. 197.218.89.65 (talk) 19:59, 24 June 2017 (UTC)
- Great suggestion and yes, we've thought about it and Wikivoyage is actively talking about it (adding in results from other projects into their project). :)
- I've added a phab task to keep it on our backlog for now. DTankersley (WMF) (talk) 14:15, 26 June 2017 (UTC)
- Considering this has been ~1% deployed for years in the Italian wiki projects (it.wiktionary, it.wikivoyage, etc.) (https://it.wiktionary.org/w/index.php?search=~rome&title=Speciale:Ricerca&profile=default&fulltext=1&searchToken=4lpgbyehwomrct7tasvm04c7), and probably without complaints, it seems that this would be pretty much welcome on most sister projects.
- It might be worth considering sister search for mediawiki.org . The natural sisters could be :
- meta.wikimedia.org - (it is also confusing how documentation is split between this wiki and meta and this might bridge the gap)
- commons.wikimedia.org - PDFs (e.g. about Lua programming, programming in general, and other illustrations)
- wikitech.wikimedia.org - (there seems to be documentation tidbits there that are useful for general mediawiki users)
- The meta wiki on the other hand could probably search all wiki projects (https://wikimediafoundation.org/wiki/Our_projects) as suggested here: https://phabricator.wikimedia.org/T87632, and https://meta.wikimedia.org/w/index.php?title=Meta:Babel&oldid=11078192#footer. Perhaps wikimediafoundation.org or wikimedia.org could also be considered.
- Enabling it on meta should be pretty non-controversial and would make it the go to place to search all wikis and already has users who agreed to it.
- Also, the original task to enable it everywhere seems to be this. 197.218.80.220 (talk) 00:21, 27 June 2017 (UTC)
- Thanks! I've added your suggestions to the ticket I created yesterday and I also added a comment onto the original task as well. :) DTankersley (WMF) (talk) 18:32, 27 June 2017 (UTC)
Would these other relevant search results be useful and encourage deeper exploration into various topics?
- Is it annoying to see the other wiki search results?
- Conversely, does it encourage a user to discover more knowledge?
- How much weight do we give results from other wiki projects in the results? CKoerner (WMF) (talk) 20:37, 2 September 2016 (UTC)
- I believe so, but this is a research issue. (The problem is formulated as a question about convictions and feelings.) Jeblad (talk) 09:20, 9 September 2016 (UTC)
Will the display of the additional search results from other wikis encourage contributions from editors?
- i.e.: if you search for Piazza del Duomo and don't see a Wikivoyage article about it (while I'm searching on Wikiquote), would that encourage you to start an article for it? CKoerner (WMF) (talk) 20:37, 2 September 2016 (UTC)
- I believe so, but this is a research issue. (The problem is formulated as a question about convictions and feelings.) Jeblad (talk) 09:21, 9 September 2016 (UTC)
- If the intention is to replace the existing search function with the proposed new one, then the answer to the question will vary by project. If someone is searching from a Wikipedia, it may encourage work on the other projects. If someone is searching on Wiktionary, it may promote contributions to Wiktionaries in other languages. But a search from projects like Wikisource would pull contributors away from a project that is already struggling to gain new contributors, and would be irritating to users who are trying to search Wikisource itself. EncycloPetey (talk) 02:22, 13 September 2016 (UTC)
- Hi @EncycloPetey - thanks for the feedback.
- We're initially thinking of only putting the additional relevant search results that are gathered from the sister projects on the Wikipedia project itself. We'd like to see how this new feature is used by the broader community before determining if the new search results would work on the other projects (i.e.: searching Wikisource might not need or want results from Wikipedia or Wikivoyage). DTankersley (WMF) (talk) 12:09, 13 September 2016 (UTC)
Should we limit the amount of languages we search in?
- i.e.: only implement this for the top 50 languages?
- Or, only use the languages that we detect when a query is in a language other than that of the wiki the user is on? CKoerner (WMF) (talk) 20:37, 2 September 2016 (UTC)
- ''within the same language'' is stated clearly in Cross-wiki Search Result Improvements#A New Goal. So start easily.
- If no results are found in the current language, a next step could be to search other languages, starting with those the user seems to understand (for instance: other language wikis he has contributed to). Top languages may yield more results, but also require more capacity, and not everyone is capable of reading the top 5-50 languages.
- Another option is to stick to related languages, for instance Romance languages if the request comes from a Spanish or Italian wiki, or Germanic languages if the request comes from a Danish or Norwegian site. RonnieV (talk) 13:43, 8 September 2016 (UTC)
- If the number of languages is not limited, then smaller languages (and projects) will be swamped by hits from larger ones, and that would be very unproductive.
- The actual languages should be the ones the user knows, ie. the babel list of languages. I don't think it should be limited to the language of the source wiki. Jeblad (talk) 09:24, 9 September 2016 (UTC)
- My comment above applies here as well: It really depends on the nature of the question. If someone is looking for the meaning of the Latin word ''vicesimanus'', Wiktionary information will be of most use, and it may not matter which language Wiktionary the results come from, as the word may only appear in a few projects, and might be illustrated with a picture, with a list of translations into other languages, or at least with an explanation in another language besides Latin. Likewise if someone is looking up the pronunciation of a word, or its syllabification for the purposes of hyphenating it, or synonyms. All of these features of a word may be presented on any Wiktionary, and may be found independently of the project language. EncycloPetey (talk) 02:18, 13 September 2016 (UTC)
- For myself (and perhaps most logged-in users), I think it would be ideal to allow quick searching from a specified list of wikis. (either by default, or as a new supplementary button in the Special:Search page).
- For myself, as a somewhat meta-oriented and monolingual editor, I often want to be able to search (all of, or specific namespaces in):
- English Wikipedia
- Metawiki
- Mediawikiwiki
- Outreachwiki
- WikimaniaXXXX
- for example when I'm looking for documentation, or an essay, or an old presentation, or an old discussion.
- Speculation: For multilingual logged-in users, I wonder if we could somehow re-use the info that Compact Language Links stores about languages we've purposefully visited? Or use a new shared storage location with a user-configurable override? (i.e. so that I can manually add a list of n languages.) –Quiddity (talk) 22:09, 2 February 2017 (UTC)
- Yes, we've discussed this in the past and we have a few tickets in Phabricator to investigate further, at least on the wikipedia.org portal page. One ticket would add in a button or a link to switch the query string to search in different languages. Another ticket would display regional language links, but that probably wouldn't fit your need here. Just as an FYI, on the mobile apps, there is a way to preset languages to use for searching and it's fairly easy to switch between multiple languages.
- However, you're requesting searching by project or namespace all at once, based on a pre-determined set of sites. I've added that idea into a new ticket to investigate how we could do this. I can envision a variety of ways we could do this, but we'd need to test and see which is more effective and intuitive to users (logged in or not).
- Thanks for the suggestion! DTankersley (WMF) (talk) 04:45, 13 February 2017 (UTC)
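The ordering suggested in this thread (the wiki's own language first, then languages the user is known to read, such as a Babel list, then related languages) could be sketched as below. All names here are hypothetical, and the related-language table is made-up sample data, not an existing configuration.

```python
def languages_to_search(wiki_lang, user_langs=(), related=None, limit=5):
    """Return an ordered, de-duplicated list of language codes to search.

    wiki_lang  -- language of the wiki the search started on
    user_langs -- languages the user says they read (e.g. from a Babel list)
    related    -- hypothetical mapping of a language to its related languages
    limit      -- cap so large languages do not swamp the result list
    """
    related = related or {}
    candidates = [wiki_lang, *user_langs, *related.get(wiki_lang, ())]
    ordered = []
    for code in candidates:
        if code not in ordered:
            ordered.append(code)
    return ordered[:limit]


# Made-up example: a Danish wiki, a user who also reads English.
example = languages_to_search("da", ["en", "da"], {"da": ["no", "sv"]})
```

Capping the list with limit reflects the concern raised above that an unlimited set of languages would let larger projects crowd out smaller ones.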
How can we help users better understand the functionality of other wikis through verb-driven language?
Suggestion for language on boxes: of TIs there a way to indicate to users what they might find in each Wiki through a verb-driven phrase that prompts them to take an action? This could be hover-text on added tags or banners, or specific verb-driven language. For example, instead of "free dictionary" we may want to say "Look up the meaning of a word" which drives the user to take a specific action on Wiktionary and also tells them what Wiktionary does, instead of what it is. We might be able to test this, which would indicate might help users understand what these other resources are. srousers what h MKramer (WMF) (talk) 16:41, 7 September 2016 (UTC)
- Yup - those all sound like great suggestions!
- We've got several initial design mocks on the /Design page and these suggestions sound like a new mock candidate(s). DTankersley (WMF) (talk) 19:04, 7 September 2016 (UTC)
- Thanks @DTankersley (WMF)! I'm not sure why additional letters were added to that question! Apologies for that!
- I'm happy to write up verb-driven action sentences for each platform, although I don't know if we have standards floating around somewhere. MKramer (WMF) (talk) 17:06, 9 September 2016 (UTC)
- Hi! A few examples would be cool to put on a mock, but I don't know if there are standards written up anywhere. :)
- Thanks! DTankersley (WMF) (talk) 17:18, 9 September 2016 (UTC)
- Awesome. Here are some examples, but these are drafts (and I am very open to feedback and edits...)
- Wiktionary - Look up the meaning of a word
- Wikiquote - Discover quotes from thousands of people
- Wikibooks - Read free textbooks about a variety of subjects
- Wikipedia - Read a free encyclopedia that anyone can edit
- Happy to contribute more if these are useful! MKramer (WMF) (talk) 12:57, 13 September 2016 (UTC)
Which wiki projects would be included in the search results?
Currently, the Discovery Search team is focusing the new cross-wiki search results to be gathered only on the sister wiki projects.
For example, when searching on fr.wikipedia - cross-wiki results could be shown from fr.wikivoyage, fr.wikiquote, fr.wikiversity, fr.wikisource, fr.wikinews, fr.wikispecies, fr.wiktionary and fr.wikibooks - if results are found in those sister projects. DTankersley (WMF) (talk) 19:09, 7 September 2016 (UTC)
- Wikidata can show really pertinent results.
- Also, have you noticed Extension:ArticlePlaceholder? Trizek_(WMF) (talk) 17:19, 8 September 2016 (UTC)
- Hi @Trizek (WMF) - I chatted with the team and we feel that Wikidata is very excellent and a lovely pivot that we can maybe use for searching across languages- it might be a good candidate as a companion to our language detection efforts.
- But for the cross-wiki search functionality, we really don't have any control on what data is actually searched for - because everything is merged into one big field. So, we don't think Wikidata would be a good fit for this particular functionality. DTankersley (WMF) (talk) 20:26, 8 September 2016 (UTC)
- Note that some sister projects are very small and using them will give very poor quality results. With few and poor quality pages on a sister project those will end up on a lot of searches on some other project. That could trigger a lot of discussions, and prompt inclusion of a user option to turn the results off.
- A solution could be to set a threshold on some kind of quality on the included hits, thereby limiting the number of pages where a certain hit can show up.
- This could also give more incentive to clean up the low quality pages on other projects, so it could be "a wanted effect" even if it will create a lot of discussions. Jeblad (talk) 09:31, 9 September 2016 (UTC)
“The Plan” needs update
The Plan section looks outdated, showing items (that should be done by now) in the future tense. Can someone from the Engineering or Discovery teams update it? Eduardogobi (talk) 00:49, 9 January 2017 (UTC)
- Will do, thanks for the reminder! :) DTankersley (WMF) (talk) 17:29, 9 January 2017 (UTC)
Preview of audio files
I tried the example of searching for ogg files, and some results are shown in a "multimedia card". Since all results are audio files, they all look the same (a speaker icon), so I have to hover to get some more details based on the file name and make several clicks to actually listen to one.
Maybe we should consider a more informative way of previewing results when they are audio files. This may involve including part of the name of the file, showing a relevant related image in addition to the speaker, playing the audio on click/hover, or something else. Pginer-WMF (talk) 13:15, 9 February 2017 (UTC)
- Thanks for the suggestion - I've created a Phabricator ticket for this work. DTankersley (WMF) (talk) 07:32, 10 February 2017 (UTC)
Suggestion: Use Wikidata to fetch the main image for multimedia results
Problem: As a non-English reader, search may often return irrelevant and inappropriate results due to using the image caption / description or inappropriate page-image.
Background: Consider if someone searches for monkey , and a man shows up due to the label or description of the image. In certain contexts this may be insulting. Aside from that, in non-English wikis, a lot of the times the multimedia results will yield wrong images due to the fact that the image descriptions are not translated.
Proposed solution:
When the search matches a wikidata item / alias use the image property as the primary image.
Compare:
(Shows these: https://fr.wikipedia.org/wiki/Fichier:Poisson%20distribution%20lambda%20001.svg, https://fr.wikipedia.org/wiki/Fichier:Poisson%20distribution%20CMF.png, https://fr.wikipedia.org/wiki/Fichier:Poisson%20distribution%20PMF.png)
VS
https://www.wikidata.org/wiki/Q152#P18
(https://commons.wikimedia.org/wiki/File:Abramis_brama_drawing.jpg) 197.218.80.233 (talk) 18:01, 17 February 2017 (UTC)
- Thanks for the suggestion, but using wikidata to search Commons for images matching a search query is a bit different than what we're trying to do - right now.
- Sometimes the images that are returned could indeed be offensive but we don't censor things in regards to how an image is tagged. We just do the search and then display the top most relevant results.
- On the other hand, this presents an excellent opportunity to edit images that are mis-tagged or mis-labeled to keep them from showing in search results where they really don't belong. :)
- Using your sample URL, I think if I was on frwiki and I was trying to do a search for a fish and I saw some math diagrams, I'd try a different search term. According to fr.wiktionary, the more common term for 'fish' in French is 'poiscaille' - https://fr.wikipedia.org/wiki/Special:Search?search=poiscaille&fulltext=1&cirrusUserTesting=recall_sidebar_results&searchToken=7bt1at9xmpdr5s4pch1ksd904 which does return one image of a fish.
- Or, being redundant in your search query by using 'poisson' and 'fish' returns an image of a fish, a fossilized fish and a large rock structure in the shape of a fish: https://fr.wikipedia.org/wiki/Special:Search?search=poisson+fish&fulltext=1&cirrusUserTesting=recall_sidebar_results&searchToken=cioy7r5b1uycvbaoczumkffvz Still not optimal. Clicking on the multimedia link (on the search results page) does indeed show more images of fishes: https://fr.wikipedia.org/w/index.php?title=Sp%C3%A9cial:Recherche&profile=images&search=poisson+fish&fulltext=1&searchToken=chdtvl1pm555wt1d03fk7ojxb but it needs to have the dual search query terms. Again, not optimal at this time.
- We have a new project that is starting soon that will help with this in the long term. The new project will be for structured data in Commons and we will be updating the API for searches like this. DTankersley (WMF) (talk) 02:04, 24 February 2017 (UTC)
- Thanks for the response.
- Perhaps it might be prudent to not let perfect be the enemy of good by implementing something like https://phabricator.wikimedia.org/T95026. This would be a short term solution until the structured metadata project comes along and eventually replaces it. Currently people seem to ignore inaccurate images because they aren't really visible. The search interface doesn't surface them except on mobile, and on wikipedia.org.
- Once this becomes widely deployed, you're very likely to frequently receive a bunch of "bug reports" of inaccurate multimedia content showing up as every search will potentially show up some image, audio or video, whereas currently only text is shown. Labeling these images won't work because the search engine seems to emphasize the image filename, instead of its description.
- The idea is not to only use wikidata, but simply choose 1 image from it in addition to the normal ones that are already displayed. 197.218.91.5 (talk) 16:30, 24 February 2017 (UTC)
- One alternative idea is simply to pull the pageimage of the matching page. For example, for the "poisson" search string above, the first article in the search results is an exact match, and its page image would be perfect to illustrate the fish. Currently it seems that the search engine relies on simply searching the file namespace or commons for the article title, and it yields worse results than the first image in the article itself:
- https://fr.wikipedia.org/wiki/Poisson
- https://upload.wikimedia.org/wikipedia/commons/thumb/2/23/Georgia_Aquarium_-_Giant_Grouper_edit.jpg/330px-Georgia_Aquarium_-_Giant_Grouper_edit.jpg
- Even articles with no images may yield some illustration picked up from wikidata, if the task above is solved. 197.218.91.214 (talk) 18:04, 24 February 2017 (UTC)
- This is pretty bad even on English Wikipedia, searching for something as common as a "leg":
- https://en.wikipedia.org/w/index.php?search=leg&title=Special:Search&uselang=en&fulltext=1&cirrusUserTesting=recall_sidebar_results
- Images:
- https://en.wikipedia.org/wiki/File:Wooden%20Leg%201913.jpg
- https://en.wikipedia.org/wiki/File:Las%20Limas%20left%20leg.svg
- https://en.wikipedia.org/wiki/File:Las%20Limas%20right%20leg.svg
- The article itself contains a pretty strange image too (without the caption it was hard to tell what it is):
- https://en.wikipedia.org/wiki/Leg
- https://upload.wikimedia.org/wikipedia/commons/thumb/3/30/InsectLeg.png/220px-InsectLeg.png
- But wikidata has more easily recognizable images of legs:
- https://www.wikidata.org/wiki/Q133105
- Images
- https://commons.wikimedia.org/wiki/File:Legs_of_woman.jpg
- https://commons.wikimedia.org/wiki/File:Beine.JPG 197.218.91.148 (talk) 22:40, 25 February 2017 (UTC)
- I agree that there could be better images for 'leg' but unfortunately, without more context about 'leg' it's hard to get really good results back.
- For instance - using 'human leg' works pretty well:
- https://en.wikipedia.org/w/index.php?search=~human+leg&title=Special:Search&cirrusUserTesting=recall_sidebar_results&searchToken=5hsh03wb9249tpoued9vs818s
- or using 'dog leg' which shows a dogleg (zig-zag) and two images of a dog: https://en.wikipedia.org/w/index.php?search=~dog+leg&title=Special:Search&cirrusUserTesting=recall_sidebar_results&searchToken=d80f0qxtteiygpfabs0c0dmu9 DTankersley (WMF) (talk) 16:19, 3 March 2017 (UTC)
- Well, that's partly true. Although leg is a pretty simple term, it wouldn't be that surprising if it at least managed to show a "table leg". There are even more cases where it fails for common terms:
- silver mineral - https://en.wikipedia.org/w/index.php?search=~silver+mineral&title=Special:Search&cirrusUserTesting=recall_sidebar_results&searchToken=cl9rccdv6ojlr3d34mijhmv9q
- human tears - https://en.wikipedia.org/w/index.php?search=~human+tears&title=Special:Search&cirrusUserTesting=recall_sidebar_results&searchToken=14t8ew06qrr90dzyz95mod3kq
- ice
- The thing to remember is that many non-native English speakers use these resources. It may be quite easy for a native speaker to try and narrow their search but this isn't always possible for someone with limited knowledge especially in cases where a wikipedia / wikimedia project doesn't have resources in their native language. Relevant images would make it far easier to ensure that search results are relevant even before clicking any of them. Consider the discussion in this forum:
- http://ell.stackexchange.com/questions/87976/is-there-an-english-word-for-the-fruit-we-call-paterna-in-el-salvador
- That individual would likely recognise the image faster than the text in the article. In fact the article description might just confuse them. A picture is worth a thousand words after all.
- Anyway, hopefully structured commons will come eventually. 197.218.90.68 (talk) 18:50, 4 March 2017 (UTC)
- Thanks for the real-life examples - they're always helpful. :)
- I chatted with the Search team about this topic this morning - to be sure there wasn't anything that I was missing. Creating a new method, right now, to search for content on Commons will be a bit of an exercise in futility. Once the Structured Data team ramps up and gets their new format of metadata established, the Search team will incorporate it into the widely used CirrusSearch API and any extra work we do now will be trashed.
- The goal behind the sister project search results is to give our readers and editors more information about their search query - to enable discovery into the other projects that maybe they didn't know about.
- I'm confident that adding in the new sister project search results will aid in that discovery for millions of users - even though a better method of utilizing our search APIs will be coming in the near future.
- For the example about 'paterna' -- if the 'inga feuilleei' term was used instead, it would indeed show lovely images of the fruit the user was hoping to find. Maybe those images could be tagged with the term 'paterna' by some very kind contributors, to make it easier for all to find?
- https://en.wikipedia.org/w/index.php?search=~Inga+feuilleei&title=Special:Search&cirrusUserTesting=recall_sidebar_results&searchToken=20m0hy0wegstk363rj4jmb7no DTankersley (WMF) (talk) 19:57, 6 March 2017 (UTC)
Suggestion: Reduce the weight of the description page and add a reporting tool
One of the drawbacks of simply searching the file description and title is that it can be very unreliable and can easily be used to vandalize. This means that searching for "whore" surfaces obvious vandalism like this [1], [2]. The image of a famous woman is "shown in results" for "whore", or for "molester" (https://en.wikipedia.org/wiki/File:Photo_of_molester.jpg, Carl_Sagan).
Proposed solutions:
- Reduce the weight of the file description page - it can often be misleading and just plain bad especially on non-English wikis.
- Add a reporting tool to make it easier for people to report / remove vandalism - this will also help get more eyes on commons, and potentially somewhat reduce their workload.
- Prioritize media used locally in multimedia results - images used locally might be more relevant to the project and to the search. 197.218.81.62 (talk) 20:14, 14 April 2017 (UTC)
- We'll be launching the sister project snippets soon and based on a RfC on enwiki, we will not display the multimedia results on the English Wikipedia search results pages. You can test the new look by using this url: https://en.wikipedia.org/wiki/Special:Search?search=~test&fulltext=1&cirrusUserTesting=recall_sidebar_results
- Cheers! DTankersley (WMF) (talk) 17:31, 6 June 2017 (UTC)
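The first and third proposed solutions in this thread (lower weight for the file description, priority for locally used media) can be sketched as a toy scoring function. This is only an illustration under stated assumptions: `MediaHit`, `score`, `rank`, and all weight values are hypothetical names and numbers invented here, not CirrusSearch code or configuration.

```python
from dataclasses import dataclass


@dataclass
class MediaHit:
    """Hypothetical search hit for a media file."""
    title: str
    description: str
    used_locally: bool  # True if the file is embedded in a page on the local wiki


def score(hit, query, title_weight=3.0, description_weight=0.5, local_boost=1.5):
    """Toy relevance score: title matches count far more than description matches.

    All weights are made-up illustrative values.
    """
    terms = query.lower().split()
    title_hits = sum(term in hit.title.lower() for term in terms)
    description_hits = sum(term in hit.description.lower() for term in terms)
    base = title_weight * title_hits + description_weight * description_hits
    # Files already used on the local wiki get a multiplicative boost.
    return base * local_boost if hit.used_locally else base


def rank(hits, query):
    """Return hits ordered from highest to lowest toy score."""
    return sorted(hits, key=lambda hit: score(hit, query), reverse=True)
```

With weights like these, a file whose only match is in an easily vandalized description ranks well below a file that matches in its title and is already used on the local wiki.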
Ignore image annotations on commons
From https://en.wikipedia.org/wiki/Special:Search?search=ugly&fulltext=1&cirrusUserTesting=recall_sidebar_results&searchToken=362e1qz8z5cc9qtw1y48rb7gj it appears that Cross-wiki search tries to read the (history of) annotations on commons. It finds strings like " ugly stitching error" here: https://en.wikipedia.org/w/index.php?title=Special:Search&profile=images&search=ugly&fulltext=1 Vexations (talk) 12:31, 1 June 2017 (UTC)
- Hi @Mduvekot, the testing URL that you're using in your sample is currently awaiting a new code update to not display the commons / multimedia results in the sister project snippets on English Wikipedia. DTankersley (WMF) (talk) 12:55, 1 June 2017 (UTC)
Suggestion: Show page image (thumbnail) for wiktionary results
Issue
Search results don't provide enough visual clues to make it easy to find content.
Background
While normal search results from Commons are unstructured and may produce a lot of false positives, Wiktionary's narrow focus makes it quite useful as a "visual" dictionary, and it often avoids controversial images.
Proposed solution
When Wiktionary returns search results, show the page image used in the page as a thumbnail, either as:
- A small icon near the wiktionary results (the pageimage)
- A multimedia result at the top of the side box
This seems to give quite good results compared to Commons, e.g.:
Term | Commons | Wiktionary( see image) |
---|---|---|
hair | hair | https://en.wiktionary.org/w/index.php?title=hair&action=info |
ram | ram | https://en.wiktionary.org/w/index.php?title=ram&action=info |
unicorn | [3] | https://en.wiktionary.org/w/index.php?title=unicorn&action=info |
shag | [4] | https://en.wiktionary.org/w/index.php?title=shag&action=info |
Honey (mel, honig) | [5][6] | [7][8][9] |
As demonstrated by the last row, it also returns useful images for the same term in various languages. Until a better system comes around this seems to be a reasonable alternative. 197.218.89.1 (talk) 21:35, 2 June 2017 (UTC)
- Thanks for the suggestion and samples! We'll take it into account when we start working on the thumbnail icons next to the search results. :) DTankersley (WMF) (talk) 22:43, 2 June 2017 (UTC)
How is the ranking of sister projects determined?
I just tried out searching for some popular travel locations, and Wikivoyage came out last or next to last (i.e. below the fold) in each of these examples:
- "Cape Cod" on enwiki
- "Mallorca" on dewiki
- "Hawaii" on enwiki
- "Rome" on enwiki
- "Alaska" on enwiki (voy:Alaska ranks below q:Alaska Thunderfuck 5000 and b:Solitaire card games/Alaska)
- "Ko Samui" on enwiki (voy:Ko Samui ranks below wikt:come and q:Urusei Yatsura)
(Examples are not cherry-picked, these were just the first few that came to my mind.) Tbayer (WMF) (talk) 23:45, 27 June 2017 (UTC)
- It seems to be pretty well documented:
- https://wikimedia-research.github.io/Discovery-Search-Test-CrosswikiSidebar/
- https://commons.wikimedia.org/wiki/File:Second_Test_Of_Cross-wiki_Search_-_Helping_More_Users_Discover_Content_On_Wikipedia%E2%80%99s_Sister_Projects.pdf
- https://phabricator.wikimedia.org/T149806
- (Props to the project managers and analysts for reporting it so well).
- It will probably be more interesting to read the outcomes and analysis (spoiler: it is not random). 197.218.81.178 (talk) 00:29, 28 June 2017 (UTC)
- Projects should – in theory – be ordered according to recall (most to least number of articles returned from each project). And this is mostly true if you open each sister project's search results page separately.
- Looking at InterwikiSearchResultSetWidget in MediaWiki Core, it does not appear there is explicit front-end code for ordering the projects when the SERP is rendered and the order is determined by Cirrus (IIRC) on the back-end when it returns the interwiki results – which should be according to recall.
- Looking at InterwikiSearcher.php in Cirrus source, we still have code from the first cross-wiki search A/B test where results can be returned in a random order if the configuration requests it, but they should be returned according to recall in production. Although maybe we're accidentally still using the static order that the switch statement defaults to. I'll reach out to @EBernhardson (WMF) and @DCausse (WMF) for clarification. MPopov (WMF) (talk) 21:18, 28 June 2017 (UTC)
- Absolutely, the wiki blocks are ordered by recall. Large wikis are likely to be ordered first frequently.
- Concerning wikivoyage there's a small variation. During the RFC it was requested to strongly filter wikivoyage results on title. Today we ensure that 80% of the search terms (stop-words excluded) appear in a title for wikivoyage results. In other words, it decreases recall for wikivoyage and is probably one of the reasons you feel that wikivoyage is ranked so badly.
- Without the title filter wikivoyage would be ranked #3 (just below wiktionary) for the query Alaska. DCausse (WMF) (talk) 07:26, 29 June 2017 (UTC)
- Might be good to expand the documentation :Extension:CirrusSearch/Scoring#Cross-wiki as this will be asked again ... 197.218.81.79 (talk) 13:24, 3 July 2017 (UTC)
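The behaviour described in this thread (blocks ordered by recall, plus a title filter for Wikivoyage that requires 80% of the non-stop-word search terms to appear in the title) can be sketched roughly as follows. This is a minimal illustration, not the CirrusSearch implementation: the function names, the stop-word list, the whitespace tokenization, and the sample titles are all assumptions made up for the example.

```python
# Illustrative stop-word list only; the real list is not given in this discussion.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def title_coverage(query, title, stop_words=STOP_WORDS):
    """Fraction of non-stop-word query terms that appear as words in the title."""
    terms = [term for term in query.lower().split() if term not in stop_words]
    if not terms:
        return 1.0  # nothing to match, so the title trivially passes
    title_words = set(title.lower().split())
    matched = sum(1 for term in terms if term in title_words)
    return matched / len(terms)


def filter_by_title(query, titles, min_coverage=0.8):
    """Keep only titles covering at least min_coverage of the query terms."""
    return [title for title in titles if title_coverage(query, title) >= min_coverage]


def order_by_recall(hits_by_project):
    """Order project keys by number of hits, most hits first."""
    return sorted(hits_by_project, key=lambda project: len(hits_by_project[project]), reverse=True)


# Invented sample data for the query "Alaska".
raw_hits = {
    "wikt": ["alaska", "alaskan", "baked alaska"],
    "q": ["Alaska quote page", "Another page"],
    "voy": ["Alaska", "Anchorage", "Juneau", "Fairbanks"],
}
filtered_hits = dict(raw_hits, voy=filter_by_title("Alaska", raw_hits["voy"]))

print(order_by_recall(raw_hits))       # order without the title filter
print(order_by_recall(filtered_hits))  # order once Wikivoyage is filtered on title
```

In this invented data set the title filter leaves Wikivoyage with a single hit, so it drops from the first position to the last, which matches the effect described above.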
Simple English Wikipedia
Why isn't the Simple English Wikipedia shown in the cross-wiki search results for articles which have identical titles in the English Wikipedia? The Simple English Wikipedia is not known by many Wikipedia users, and I believe that it would be beneficial to include results from it with other cross-wiki results on the English Wikipedia. Daylen (talk) 00:21, 28 June 2017 (UTC)
- Hi @Daylen, thanks for the question.
- The Simple English Wikipedia has its own sister projects displaying in the search results page, but only Simple Wiktionary and Simple Commons (multimedia), as shown in this query for Paris (Simple Wikiquote is locked).
- As you mentioned, Simple English Wikipedia is not well known by most Wikipedia users; however, including results from Simple English Wikipedia in the sister projects on English Wikipedia would probably cause a lot of confusion for the general population, because they don't know the project exists.
- It'd be great if there was a better way to encourage discovery, reading and editing in Simple English Wikipedia; please let us know if you have any ideas. :) DTankersley (WMF) (talk) 21:30, 28 June 2017 (UTC)
Suggestion: Always make wiktionary results as the first in the sidebar
- Issue:
- As a user, it frequently happens that when I mistype a search string I give up searching because it shows no local results.
- Background:
- Wiktionary results are almost always relevant. The reason is pretty simple: it adds a natural disambiguator, and it may serve as an improved "did you mean". It may help the reader / user to correct their results and search again. As it contains a lot of words and relevant synonyms, it also helps in the scenario where one looks for "automobile", "vehicle", "car", helping non-native speakers find more common words.
- Finally the primary reason is that it is often the case that someone may be interested in a very simple fact about something, and they may find exactly what they are looking for without clicking any search results, by just looking at the wiktionary text snippet.
- Proposed solution
- Always put the wiktionary results as the first box in the sidebar cross-wiki results
- Allow snippets to store more info so the wiktionary snippet can show more info at a glance
- Potentially look into using Extension:TextExtracts to parse the page and show only the most relevant snippets without all maintenance templates and other unnecessary content.
- It might also be fruitful to deploy the wiktionary results to all projects, including this one. The context it provides tends to help users retype their search queries and find what they are looking for. 197.218.90.174 (talk) 21:45, 28 June 2017 (UTC)
- Hi,
- Thanks for your suggestions—we have a future update to the search results page that is a Wiktionary widget that I hope would be good for you and all our users. You can read more here and the A/B test page is here as well as a self-guided test you can add to your own logged in account.
- Once we get done with all the testing and chat with the community about it, I think it would be great if we can put it on all Wikipedias and sister projects.
- However, Wiktionary results won't always display, it just depends on the query the user inputs. But, we'll see what surfaces in our testing. :) DTankersley (WMF) (talk) 20:37, 30 June 2017 (UTC)
- It looks pretty good. But it is missing the most interesting thing, clearly highlighted synonyms !
- The results don't seem to clearly emphasize them: https://en.wikipedia.org/w/index.php?search=trunk&title=Special:Search&profile=default&fulltext=1&searchToken=3h3zlq4ixyinhbe07ko2gqjp1
- It currently shows synonyms jumbled together with basic definitions of the word, when these are pretty useful for anyone searching and should at least be in bold.
- It might also be useful to include the related image (https://upload.wikimedia.org/wikipedia/commons/thumb/e/e5/Yellow_birch_trunk.jpg/220px-Yellow_birch_trunk.jpg).
- In cases where wiktionary doesn't show anything, the other projects might make up for it. 197.218.83.157 (talk) 16:49, 1 July 2017 (UTC)
- Correction: As a user, it frequently happens that when I mistype a search string I give up searching because it shows no local results. 197.218.90.174 (talk) 22:20, 28 June 2017 (UTC)
- 197.218.90.174 I have updated your initial question. If you press the three dots next to a post, an option to edit the post appears. Have a nice day! Daylen (talk) 22:42, 28 June 2017 (UTC)
- Thanks. In case you weren't aware logged out users cannot edit their posts using Flow , they can only edit the title... 197.218.90.174 (talk) 23:30, 28 June 2017 (UTC)
- Thank you for letting me know. It was my understanding that IP users could edit posts that they created; why can't they, @Quiddity (WMF)? Daylen (talk) 00:08, 29 June 2017 (UTC)
- Hi, it's configured that way because IP addresses can be shared between many people (hundreds or even thousands). (Known addresses are sometimes tagged with a template, such as w:en:Template:Shared IP header templates at Enwiki, but it's not always obvious, and is inconsistently & manually applied.)
- Note that aside from that restriction, there is also a restriction on who can edit other people's posts (Flow#Can I edit other people's posts?).
- There was some discussion (a long time ago) about the desire to "enable IPs to edit their posts for x minutes after save" (5 or 30 etc). But I don't think anyone ever filed a task for it. I've now filed phab:T169167. Quiddity (WMF) (talk) 01:06, 29 June 2017 (UTC)
Suggestion: Provide search suggestions using synonyms from wiktionary when the search matches a title
Issue
As a user I'd like to be presented with suggestions to improve my search.
Background
Currently search depends entirely on a word either matching the search terms, or matching the title of a page. This reduces the usefulness of the search when a word can mean so many things, for example, looking for "trunk", one may mean a proboscis ("elephant's trunk"), boot (a part of a car), a part of a tree, part of a body, and so forth.
Proposed solution
- Extract synonyms from the page with a matching title in the Wiktionary search results, much like the widget at Cross-wiki Search Result Improvements/self-guided testing#Wiktionary, for example https://en.wiktionary.org/wiki/trunk
- Provide a search suggestion: "you may be interested in: proboscis, boot" using words extracted from the Synonyms sub-heading
Considering the different wiktionaries and different headings or rules in each wiki, this may not be feasible until there is some way to store these in a structured manner.
Even so, just showing the contents under the synonym (and similar ones in other wiktionaries) heading will be a good short term improvement. 197.218.83.157 (talk) 16:27, 1 July 2017 (UTC)
- There's also :https://en.wiktionary.org/wiki/Wiktionary:Wikisaurus, see https://en.wiktionary.org/wiki/Wikisaurus:greed. 197.218.83.157 (talk) 17:06, 1 July 2017 (UTC)
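The proposal above (pull the words under the Synonyms sub-heading and offer them as a search suggestion) could be prototyped roughly like this. It is a sketch under stated assumptions: `SAMPLE_WIKITEXT` is an invented, simplified entry, the function names are hypothetical, and real Wiktionary pages often use templates and per-wiki heading conventions that this plain `[[link]]` parser does not handle.

```python
import re

# Invented, simplified wikitext; not copied from any real Wiktionary page.
SAMPLE_WIKITEXT = """\
==English==
===Noun===
# The luggage compartment of a car.
====Synonyms====
* [[boot]] (UK)
* [[proboscis]]
====Translations====
* French: [[coffre]]
"""


def extract_synonyms(wikitext):
    """Collect link targets listed under any heading named "Synonyms"."""
    synonyms = []
    in_section = False
    for line in wikitext.splitlines():
        heading = re.fullmatch(r"\s*(=+)\s*(.+?)\s*=+\s*", line)
        if heading:
            # Any heading starts a new section; only "Synonyms" is of interest.
            in_section = heading.group(2).lower() == "synonyms"
            continue
        if in_section:
            # Take the target of each [[target]] or [[target|label]] link.
            for target in re.findall(r"\[\[([^\]|#]+)", line):
                target = target.strip()
                if target and target not in synonyms:
                    synonyms.append(target)
    return synonyms


def build_suggestion(synonyms, limit=3):
    """Format a short suggestion line, or an empty string if there is nothing to suggest."""
    if not synonyms:
        return ""
    return "You may be interested in: " + ", ".join(synonyms[:limit])


print(build_suggestion(extract_synonyms(SAMPLE_WIKITEXT)))
```

As noted in the post, differences between Wiktionaries in headings and markup mean a parser like this would be fragile until synonyms are stored in a structured way.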
- Hi, this is an intriguing suggestion, thanks for posting.
- However, using 'food' or 'greed' or 'house' as a search term goes directly to the article page for that search query. Are you suggesting adding the 'you may be interested in ___' phrase with the synonym subheading within article pages? DTankersley (WMF) (talk) 18:18, 6 July 2017 (UTC)
- The suggestion was actually to add it just below the search box in Special:Search.
- ----
- >search term goes directly to the article page on that search query
- Well, that's because those are single word queries, and also because that's English wikipedia, and they are addicted to creating redirects for everything. In fact, the reason it works is probably because they used a bot to find synonyms and add redirects to make up for the limitations in the search engine.
- In smaller Wikipedias, uncommon words will always have such problems:
- buzgut (glutton) - https://en.wiktionary.org/wiki/buzgut, enwiki
- paupérrimo (synonym poor) - ptwiki
- méphitique - (foul smelling ) - [10]
- In some of the above results wiktionary sister search provides enough context for a person to improve their search. In other cases, it doesn't help.
- Seems like a somewhat similar idea has been suggested (although not exactly the same):
- https://phabricator.wikimedia.org/T127874
- https://phabricator.wikimedia.org/T85770
- The automobile example still doesn't show the expected results, and wiktionary snippets are not that helpful in that case, but its synonyms are (https://en.wiktionary.org/wiki/car#Synonyms) "private vehicle that moves independently): auto, motorcar, vehicle; automobile (US), motor (British colloquial), carriage (obsolete)".
- If the user had been suggested "US automobile", and they used it, they would have likely found the page they were looking for. 197.218.80.160 (talk) 23:48, 6 July 2017 (UTC)
- Also, it might be worth evaluating the possibility of disabling (for unregistered users) the "automatic go to article".
- It can often be very confusing to type something and suddenly be taken to an article, which in some cases may be unrelated, as the user may merely want to have an idea of the existing pages before improving search keywords.
- Entering a wrong keyword there can also quickly pull you into a completely unrelated wikimedia project, e.g. Special:Search/meta:monkeys or randomly push you into random wikis (https://www.mediawiki.org/wiki/Special:GoToInterwiki/wikicities:monkeys).
- Not to mention that it may result in a lot of deadends and needlessly skew the search results statistics. 197.218.80.160 (talk) 23:57, 6 July 2017 (UTC)
- Hi,
- I've added this conversation to the older (but similar) ticket you noted earlier: https://phabricator.wikimedia.org/T85770. We'll take a look at this and scope out the work that this type of new update would probably need and then prioritize it from there. :) DTankersley (WMF) (talk) 17:32, 13 July 2017 (UTC)
- This project would likely give the search engine a huge boost: https://phabricator.wikimedia.org/T986 (although that may take a year or more to complete), since it will make it possible to differentiate between synonyms and the same word in different dialects, e.g. boot vs trunk, pants vs underwear, and automobile vs motor.
- Thanks for considering the idea! 197.218.91.95 (talk) 20:23, 13 July 2017 (UTC)
Issue: Commons multimedia results disabled on all wikis?
Steps to reproduce:
- Go to pt:wikipedia
Expected:
In the sidebar for interwiki searches some images of milk from commons, e.g. commons:Special:search/~milk
Actual:
Sidebar with interwiki results but no images from commons. 197.235.241.38 (talk) 13:38, 30 August 2019 (UTC)
- I think this was only meant to be disabled on english wikipedia. 197.235.241.38 (talk) 13:56, 30 August 2019 (UTC)
- Thanks for letting us know about this issue. I've filed a task on Phabricator and we'll take a look at it soon! :) https://phabricator.wikimedia.org/T232032 DTankersley (WMF) (talk) 19:00, 4 September 2019 (UTC)
- You're welcome.
- As a sidenote, I think the reason people didn't realize it was missing is partly because it is completely random whether anything will ever appear because the sidebar with search results isn't always there. I noticed this problem months ago, but originally thought perhaps the images were simply not matching or it was taking a long time to load.
- As a potential future change it might be useful to keep a heading or button always available to show potential results, consider for example, if one searches in google (www.google.com/search?q=lx14566&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjYns3wnMTkAhWTlFwKHTD8DHsQ_AUIEygC&biw=1299&bih=637) for a non-existing image. One would note that despite the search not "currently" showing any images it still contains a heading that people can click on to see if there are any.
- Perhaps there could be a button like "Show sister wiki results", or something like that. Just a thought. 197.235.248.150 (talk) 17:13, 9 September 2019 (UTC)
- Thanks for the suggestions! :) The patch is rolling out this week on the train, please let us know if you see more weirdness on the search results page. DTankersley (WMF) (talk) 14:10, 10 September 2019 (UTC)
- It seems to be working again.
- Although it is odd that the same search string (in different wikis) may not always bring up commons search results. For example:
- Leopard seal
- https://fr.wikipedia.org/w/index.php?sort=relevance&search=leopard+seal (some multimedia results e.g.: https://fr.wikipedia.org/wiki/Fichier:Leopard%20seal%20in%20antarctica.jpg)
- https://pt.wikipedia.org/w/index.php?sort=relevance&search=leopard+seal (no multimedia results)
- shrimp fisherman
- What makes it weird is that the file title actually contains those strings. On a positive note, it seems to be nicely matching some commons structured data, so while it can't seem to find the content in one label, it finds it in another, e.g.:
- It also seems to be quite accurate when searching for a single word, presumably if that's at least included in the title. But with two or more words it seems to fail fairly often, even in cases where there is a perfect match in the title, e.g.: "leaning tower of pisa" , https://fr.wikipedia.org/w/index.php?sort=relevance&search=leaning+tower+of+pisa->https://commons.wikimedia.org/wiki/File:Leaning_tower_of_pisa.jpg. 197.235.230.130 (talk) 15:45, 16 September 2019 (UTC)
- Apparently this issue came back, none of the links above seem to show any commons images. 197.218.88.244 (talk) 13:44, 27 December 2020 (UTC)
How to start cross-wiki search on my own wiki?
I have two wikis, and I would like their search results to be interconnected, as on Wikipedia (File:Sidebar-crosswiki-search-results.jpg), where sister site search results are displayed on the right. What should I do? 36.227.241.50 (talk) 06:08, 24 February 2023 (UTC)
- Hi, I looked at the linked image and its /ja page: have you had a look at Cross-wiki Search Result Improvements , where everything seems to start/evolve? Pardon me if I am too short-sighted. Cheers, Omotecho (talk) 06:21, 24 February 2023 (UTC)
- I've seen it, but the documentation doesn't tell me how to enable it on other wikis. 220.246.250.11 (talk) 03:11, 27 February 2023 (UTC)