Hi, how do I search using multiple keywords? For example: Libra ascendant born on 1965. How can I search with these parameters?
Help talk:CirrusSearch
Simply by typing libra ascendant born 1965 into the search form (I assume "on" is a so-called stop word). If there are dedicated categories for a topic you could also use the filter word incategory, e.g. ascendant libra incategory:"1965 births".
I tried to use the deepcat feature, but it shows no results. Replacing deepcat with incategory shows results, so there should be results with deepcat as well. How can I make this query work?
I just saw that I did not use Special:Search, where deepcat is working.
Hi all. I had a lot of trouble understanding the #Prefix and namespace section, so I did my best to make it more readable.
Specifically confusing was term vs. term:, so I tried to make that more consistent within that section. When I see code, the programmer part of my brain thinks "oh, I type this in." Italics is used in prose for emphasis, is not very visually distinctive, and therefore doesn't trigger the same "oh, I must do something!" response. So I hope it makes sense why I think term: is preferable here.
Secondly, the HTML <kbd> tag is typically used to denote hotkeys, such as Control+c, but I had a look at the MDN docs and it seems that using it for strings of literal text input is OK, too. What would be the preference then, <kbd> or <code>?
Would it be helpful to do a pass over the whole article, or the whole batch of CirrusSearch user-facing documentation, in order to make the use of ''term'' / <code>term</code> / <kbd>term</kbd> more consistent? --Ernstkm (talk) 03:17, 12 November 2023 (UTC)
Thanks for improving the documentation!
I think clarity is the most important goal, but consistency almost always helps with clarity. A lot of what the italics and <code>/<kbd> markup is trying to get at is the use–mention distinction. The problem is that there's no consistency on how to format mentions, and different traditions vary, so we are collectively not always consistent.
Adding the monospaced <code> or <kbd> to the mix lets us make finer distinctions (linguistics does this kind of thing, too, sometimes using both italics and quotes for different mentions, and mixing single and double quotes: He said, "I told my cat that gato means 'cat' in Spanish."). Search discussions often use italics rather than quotes so we can mention quotes: You should search for "pet" dog cat. And like you, I tend to interpret monospaced text as things I could type; I guess it's a tech-flavored mention.
I guess I'm mostly agreeing that it's a mess, but I think that trying to make another finer distinction between <code> and <kbd> would only make it messier and harder for newcomers to understand or contribute. Since ⌘ Command+⇧ Shift+6 (on a Mac) generates <code> tags, that's probably the best thing to standardize on, unless clarity of formatting creates a reason to use <kbd>, too.
OK, I agree, and thanks for the lesson on "use-mention distinction." I guess I intuitively knew that was a thing, but didn't know its name.
I'll go ahead and replace the <kbd>s with <code>s. I thought the use of <kbd> was a little odd anyway, given that it's used on other sites like Stack Overflow and GitHub specifically to indicate keypresses, and gets styled like keycaps, in the same way that {{key press}} is used here.
…or not. There are 420 uses of <kbd> in the article. I could search-and-replace all of them, but I'm not sure that improves the article materially. Leaving as is for now.
> …or not. There are 420 uses of <kbd> in the article.
Fair enough!
How to search for an exact string including greyspace characters?
Try "exact string including?" insource:/"exact string including?"/. The last part is found under Regular Expression searches.
The page says that the search index is updated at least once a day. I've been trying to fix broken files over at Commons that are 0 x 0 px. I used the search fileh:0 filew:0 filetype:image -filemime:image/tiff to find them. Files I fixed weeks ago are still listed in the results. When will they go away?
Thanks for reporting the problem. There seems to be an issue in the way CirrusSearch is handling these edits; I filed Phab:T342562 to track and fix it.
Okay, perfect.
Hi all, going to https://www.mediawiki.org/w/index.php?search=%2A&title=Special:Search&profile=advanced&fulltext=1&ns10=1 I want all existing templates, i.e. all page titles in the Template: namespace. After selecting only this namespace from the drop-down list, I tried several forms without success: 1. with no search string I get no results; 2. with the wildcard '*' I get only the template named *.
So please, what is the syntax for this elementary request, "give me all page titles of ns Template:"? Thanks -- Christian 🇫🇷 FR (talk) 07:03, 27 June 2023 (UTC)
Search cannot do that. That's what the api or quarry is for.
Or Special:AllPages: https://www.mediawiki.org/wiki/Special:AllPages?namespace=10
That is a feature that I too once wanted: a list of page titles matching some query. Instead I settled on storing the search result as text, and then using my text-processing skills to extract the titles.
In your case it works to first capture the search result of prefix: template: to a file.
Then you can grep for the titles and sort them alphabetically.
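The grep-and-sort step could be sketched roughly like this in Python. It is only an illustration: the function name `extract_titles` is made up, and it assumes the saved search result has each page title on its own line, which depends on how the result was captured.

```python
def extract_titles(saved_text, prefix="Template:"):
    """Return the sorted, de-duplicated lines that start with the prefix.

    Assumption: in the saved text every title sits on a line of its own;
    snippet and size lines are skipped because they lack the prefix.
    """
    titles = set()
    for line in saved_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            titles.add(line)
    return sorted(titles)


# Hypothetical usage with a file captured from the search result page:
# with open("search_result.txt", encoding="utf-8") as handle:
#     for title in extract_titles(handle.read()):
#         print(title)
```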
Again, this is not what you are supposed to use search for. If you want a list, you should use something made to generate lists, like Special:AllPages, database dumps or Quarry. Search is fuzzy; it's optimised to find words, not to generate lists.
This is an example to get the first 50 template names on mediawiki.org which are not redirects and not deleted:
https://quarry.wmcloud.org/query/74910
And when lists get really big, you will HAVE to use pagination. There is no way around this as WMF properties generally are very big properties.
Hello everybody, is there a possibility to automatically jump/redirect to the first result? Obviously this works for dewiki
https://de.wikipedia.org/w/index.php?search=Espenfeld
but not for Wikidata:
https://www.wikidata.org/w/index.php?search=Am_Hanffgraben_(Berlin)
Maybe there is a parameter that can be added to the URL?
Thank you in advance, --~~~~
There is no such functionality. What you are seeing is title matching. If your search exactly matches the title of a page, it will take you to that page. For Wikidata, the title of a page is its Q id. So you can do https://www.wikidata.org/w/index.php?search=Q111351350 and it will take you to that Q id.
Hi, I have a search result on the wiki for "articles without ref tags", and I want to dump/export the list of all the titles from that search. I tried the API, but 500 is the limit for each call.
Can anyone help? Thanks in advance.
Hi, sadly this is not possible.
You can try to make multiple calls to the API, using pagination via the API:Continue parameter, to gather more than 500 results.
But there are limits there too: you can't paginate past the 10000th result.
Such limits are in place to protect the service, because even when using the continue parameter, Elasticsearch (the underlying search engine used by CirrusSearch) has to keep all the results from the start in memory.
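A minimal sketch of that pagination in Python, under some assumptions: the endpoint URL is whatever api.php your wiki exposes, the names `search_titles` and `http_fetch` and the User-Agent string are made up for illustration, and `max_results` simply mirrors the 10000-result ceiling mentioned above. It uses the standard `list=search` module and feeds the `continue` object from each response back into the next request.

```python
import json
import urllib.parse
import urllib.request


def http_fetch(api_url, params):
    """Perform one GET request against an api.php endpoint and decode the JSON."""
    url = api_url + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "search-title-export-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def search_titles(api_url, query, fetch=http_fetch, max_results=10000):
    """Yield page titles for a search query, following API continuation."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": query,
        "srlimit": "max",  # ask for the largest batch the API allows
    }
    continuation = {}
    yielded = 0
    while True:
        # Merge the continuation values from the previous response, if any.
        data = fetch(api_url, {**params, **continuation})
        for hit in data.get("query", {}).get("search", []):
            if yielded >= max_results:
                return
            yield hit["title"]
            yielded += 1
        if "continue" not in data:
            return  # no further batches
        continuation = data["continue"]


# Hypothetical usage (performs real network requests):
# for title in search_titles("https://id.wikipedia.org/w/api.php", "some query"):
#     print(title)
```

The `fetch` argument is injectable so the loop can be tried against canned responses before pointing it at a live wiki.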
A quick note regarding your query:
-insource:<ref>
The characters < and > will be ignored and what is actually run is
-insource:ref
and thus you might exclude pages that have the word ref used outside a <ref>, e.g. https://id.wikipedia.org/wiki/Sumber_primer.
If you want to actually search for the < and > characters you have to use the regular expression syntax by wrapping your search text between a pair of / and escaping the < > characters with a \:
-insource:/\<ref\>/
But beware that the query above might not filter pages with named references <ref name="named ref"> or pages where the reference tag is added via a template.
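To see why the two forms behave differently, here is a small illustration using Python's `re` module as a stand-in. This is an assumption-laden sketch: Python's regex flavour is not the one CirrusSearch uses, and the sample strings are invented; it only shows which wikitext a bare word and a literal <ref> pattern would each cover.

```python
import re

# Invented sample wikitext, one entry per case discussed above.
samples = {
    "plain tag": "Text<ref>Source</ref>",
    "named tag": 'Text<ref name="named ref">Source</ref>',
    "word only": "See the ref desk for details.",
}

literal_tag = re.compile(r"<ref>")    # the tag exactly as written, brackets included
bare_word = re.compile(r"\bref\b")    # roughly what is left once < and > are ignored


def matches(pattern):
    """Return the labels of the samples in which the pattern is found."""
    return {label for label, text in samples.items() if pattern.search(text)}


print(sorted(matches(literal_tag)))
print(sorted(matches(bare_word)))
```

The literal pattern only finds the plain tag, so named references slip through, while the bare word also hits ordinary prose.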
Thank you for the answer and the correction!
When trying to post my question here I get the ⧼abusefilter-warning-linkspam⧽ error, so I posted my full question on Stack Overflow at questions/75269346 and I will post only a summary here:
I have installed Cirrus, Elastica and ElasticSearch as per the instructions, but no matter what I do (for example php ./maintenance/runJobs.php, php maintenance/updateSpecialPages.php), the number of words on the statistics page never updates.
How can I get that to update? Thanks!
Hi,
This number is cached by CirrusSearch for one day: https://gerrit.wikimedia.org/g/mediawiki/extensions/CirrusSearch/+/be6fd75573ebabbae739823d0b53bac9727ead57/includes/Query/CountContentWordsBuilder.php#15
It might be the reason why it was not updated right after you edited the page.
Thanks! Wow that was driving me nuts!
Hi,
Is there a profile example showing how we can use a synonym file with CirrusSearch and Elastic?
Thanks
Unfortunately synonyms aren't something CirrusSearch has any support for. It's been in the background as something to work on, but we need to come up with a solution that works in hundreds of languages and likely defers the actual synonym definition to wiki editors rather than system administrators.
While not exactly synonyms, on the WMF wikis we rely on redirects to pages to provide alternate names for them. In most cases where wiki search externally appears to have used synonyms what actually happened was there was a redirect to the page giving alternate titles (that are used as a fairly strong ranking signal).
Thanks for the feedback.
Because Elasticsearch does support synonyms as a filter, and Cirrus is really just a bridge to Elastic, I was hoping we could work this out with profiles, such as
'default' => [
    'builder_class' => Query\FullTextQueryStringQueryBuilder::class,
    'settings' => [
        'filter' => [
            'type' => 'synonym',
            'settings' => [
                'synonyms_path' => 'my_synonyms.txt',
                'updateable' => 'true'
            ]
        ]
    ],
Synonyms are important to us (medical wiki): for instance, if you look for, say, "audition", you should find not only pages with "audition" in them, but also pages with "hear" or "malleus" (a small bone inside the ear).
Editing the page to add synonyms is not an option for us, as this will add a lot of work for page producers.