Topic on Extension talk:Lucene-search

Different results on Wikipedia mirror than live Wikipedia

46.30.65.92 (talk | contribs)

We've set up an English Wikipedia mirror with Lucene search. For certain queries, the search results differ from those on the live English Wikipedia. For example, a search for "player" returns John_Player_%26_Sons as hit #1 on our mirror, while live Wikipedia returns "Player" (the disambiguation page).

Interestingly, John_Player_%26_Sons is also the #1 hit on live Wikipedia when queried through the API: http://en.wikipedia.org/w/api.php?action=query&format=xml&list=search&srwhat=text&srlimit=1&srsearch=player

However, a user who types the query into the search box on an article page gets the (generally more desirable) "Player" result. On the search page (http://en.wikipedia.org/w/index.php?search=&button=&title=Special%3ASearch), the #1 result is again John_Player_%26_Sons.

The same happens for other queries, e.g. "American".
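
The API query quoted above can be rebuilt for any search term with a short script. This is only a minimal sketch in Python 3 using the standard library. The helper names build_search_url and top_titles are hypothetical, and the assumption that each hit in the format=xml response is a <p> element with a title attribute should be checked against a real response.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint taken from the API URL quoted in this post.
API_URL = "http://en.wikipedia.org/w/api.php"


def build_search_url(query, limit=1, base=API_URL):
    """Build the same list=search request as in the post for any query."""
    params = {
        "action": "query",
        "format": "xml",
        "list": "search",
        "srwhat": "text",
        "srlimit": limit,
        "srsearch": query,
    }
    return base + "?" + urlencode(params)


def top_titles(xml_text):
    """Return hit titles in rank order.

    Assumption: every hit is a <p> element carrying a title attribute.
    """
    root = ET.fromstring(xml_text)
    return [p.get("title") for p in root.iter("p") if p.get("title")]


if __name__ == "__main__":
    # Print the request URLs for the two queries mentioned in the post.
    for q in ("player", "American"):
        print(build_search_url(q))
```

The response body can then be fetched with urllib.request.urlopen and passed to top_titles to compare the first hit against what the mirror returns.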

What is the magic behind this? How can I programmatically get "Player" as the #1 hit for "player" on my Wikipedia mirror?

Search result snippet:

459301
#info search=[ner], highlight=[ner] in 344 ms
#no suggestion
#interwiki 0 0
#results 20
4779.728 0 John_Player_%26_Sons
#h.title [] [5,11] [] John+Player+%26+Sons
#h.text [] [5,11,36,42] [+] John+Player+%26+Sons%2C+known+simply+as+Player%27s%2C+was+a+tobacco++and+cigarette++manufacturer+based+in+Nottingham+%2C+England+.+
#h.text [] [] [] It+is+today+a+
#h.redirect [] [0,6] [] Player%27s 0%3APlayer%27s
#h.date 2012-08-25T04:29:23Z
#h.wordcount 1475
#h.size 10036
4088.8413 0 Player
#h.title [] [0,6] [] Player
#h.text [] [0,6] [+] Player+may+refer+to%3A+
#h.text [] [0,6] [] Player+%28dating%29%2C+a+man+or+woman%2C+who+has+romantic+affairs+or+sexual+relations%2C+or+both%2C+with+other+women%2C+or+men+and+
#h.date 2012-08-26T00:15:02Z
#h.wordcount 301
#h.size 2359
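
To make the ranking in the snippet easier to inspect, the hit lines (score, namespace, URL-encoded title) can be pulled out with a few lines of Python. This is a rough sketch based only on the two hits shown above. The names parse_hits and promote_exact_title are hypothetical, and promoting an exact title match is just one possible client-side workaround, not a claim about how live Wikipedia picks its #1 hit.

```python
import re
from urllib.parse import unquote

# The two hit lines copied from the snippet above.
SNIPPET = """\
4779.728 0 John_Player_%26_Sons
4088.8413 0 Player
"""

# A hit line is assumed to be exactly: <score> <namespace> <encoded title>.
HIT_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(\d+)\s+(\S+)\s*$")


def parse_hits(raw):
    """Return (score, namespace, title) tuples in the order they appear."""
    hits = []
    for line in raw.splitlines():
        m = HIT_LINE.match(line)
        if m:
            score, ns, title = m.groups()
            # Decode %XX escapes and turn underscores back into spaces.
            hits.append((float(score), int(ns), unquote(title).replace("_", " ")))
    return hits


def promote_exact_title(hits, query):
    """Move hits whose title equals the query (ignoring case) to the front.

    sorted() is stable, so all other hits keep their original order.
    """
    wanted = query.strip().lower()
    return sorted(hits, key=lambda h: h[2].lower() != wanted)


if __name__ == "__main__":
    hits = parse_hits(SNIPPET)
    for score, ns, title in promote_exact_title(hits, "player"):
        print(score, ns, title)
```

Metadata lines such as the h.title and h.text entries do not match the three-field pattern, so the full snippet can be fed to parse_hits unchanged.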