Search/Old/status

Last update on: 2014-07-monthly

2013-03-11
Ram made an update to reduce noise in log files; this has been reviewed and merged. Chad made an update to remove unused code and reduce build-time warnings; this is under review. Ram and Antoine worked on getting search updated on the Beta cluster. Ram created an initial patch for fixing, which was reverted to make sure that the deployment would go smoothly, and that backwards-compatibility was taken into account.

2013-03-15
The noise reduction patches are now in production, and ops reports that the logs are much cleaner. Ram has pushed a patch to fix the problem of updates sometimes being lost; it awaits review. Most of Chad's cleanup patches have been merged. Ram is instrumenting the code with better diagnostics so that we can troubleshoot search issues more effectively.

2013-03-monthly
Search deployed to Beta Cluster. Search code instrumented for better troubleshooting and identification of issues, and work is underway to add PoolCounter support. Plan for April to make search updates more robust.

2013-04-monthly
Code has been instrumented (and will soon be deployed) to log more data to allow root cause analysis of the spurious "Zero results" issue. Some log analysis was also done. The Puppet configuration on beta was updated to limit lucene-search-2 memory usage on Labs.

2013-06-monthly
Work has pretty much shifted from supporting MWSearch/lsearchd to investigating and implementing Solr. Nik Everett and Chad Horohoe have begun writing an extension to implement Solr searching for MediaWiki, and we've gotten a lot of the initial basic functionality completed. Peter Youngmeister and Andrew Bogott will be handling the operations tasks for the new setup. Initial operations tasks will involve packaging Solr 4 and working with Chad to puppetize the whole design. Additionally, we're going to do some investigation into ElasticSearch, as it's been suggested as an alternative to Solr.

2013-07-monthly
Nik Everett and Chad Horohoe have continued writing an extension to implement ElasticSearch searching for MediaWiki, and we've finished most of the required features. Next comes deploying it, scaling it, and fixing the inevitable bugs. We're aiming to deploy to the test site beta.wmflabs.org before the end of the month. Peter Youngmeister and Asher Feldman will be handling the operations tasks for the new setup.

2013-08-monthly
In August we deployed CirrusSearch to test2.wikipedia.org and mediawiki.org and we're testing there. We're actively looking for other volunteers to test out CirrusSearch. Right now, CirrusSearch is not the primary search for mediawiki.org; you have to use a URL parameter to test it. We're hoping to make it the primary in September.

2013-09-monthly
In September, we expanded the new CirrusSearch back-end to a number of wikis. Italian Wiktionary, Catalan Wikipedia and English Wikisource are all running CirrusSearch now. Additionally, we deployed to all "closed" wikis. Further feature refinement and bugfixing are ongoing, with roughly 2 to 3 deployments a week.

2013-10-monthly
<section begin="2013-10-monthly"/>In October, CirrusSearch was deployed as a secondary search engine to Wikidata, all Wikivoyage wikis, and Wikipedia in Bengali. It became the primary search engine on Wiktionary in Italian, Wikipedia in Catalan and Wikisource in English. In November, we plan to deploy to many more wikis, including some larger than the Catalan Wikipedia. To expand to those larger wikis, we've negotiated some new hardware that should be deployed mid-month.<section end="2013-10-monthly"/>

2013-11-monthly
<section begin="2013-11-monthly"/>Before November 18, we were spinning up an aggressive plan to add many new wikis to CirrusSearch. On November 18, we had multiple incidents that caused us to roll all wikis using CirrusSearch back to Lucene; we've spent the rest of November implementing fixes for all issues discovered on the 18th. That is now done and we plan to switch all wikis that used to have CirrusSearch back to running it as a secondary search engine on December 2. We'll attempt to restart our aggressive plan as soon as we're comfortable with it again.<section end="2013-11-monthly"/>

2013-12-monthly
<section begin="2013-12-monthly"/>We've continued our aggressive roll-out of Cirrus as a Beta Feature. You can now search 52% of pages, including Commons and Wikidata, via CirrusSearch. We've fallen behind somewhat on our goal to make Cirrus the primary search engine. Right now, we only handle about 1.5% of search traffic.

While we will be switching more wikis over to Cirrus as the primary search back-end in January, the theme of the month really is adding Cirrus as a Beta Feature to more wikis, including the English Wikipedia. We're not sure how many wikis we'll be able to add before we consider ourselves out of hardware space. We're planning on 50% more servers in February so we'll likely be able to finish adding wikis then.<section end="2013-12-monthly"/>

2014-01-monthly
<section begin="2014-01-monthly"/>As of February 3, CirrusSearch is available as a Beta Feature on wikis representing about three quarters of all pages, and serves about 7.5% of our search traffic. Next month, we hope to get the hardware that we need to be a Beta Feature on the remaining wikis. We also hope to be the primary search back-end for more wikis. To that end, we're working through performance and recall issues as well as trying to save space in the indexes.<section end="2014-01-monthly"/>

2014-02-monthly
<section begin="2014-02-monthly"/>This month, almost all LuceneSearch and MWSearch bugs have either been closed as problems that are fixed in CirrusSearch, or moved to the CirrusSearch component. We then prioritized all CirrusSearch bugs. After clearing out any remaining high priority issues, engineering work for an update to the design of the search results page is due to commence on March 10.<section end="2014-02-monthly"/>

2014-03-monthly
<section begin="2014-03-monthly"/>In March we upgraded to the newest version of Elasticsearch and expanded onto more wikis. We also began a performance assessment, which is showing us the work required to use Cirrus as the primary search back-end for the larger wikis, and we have started on that work.<section end="2014-03-monthly"/>

2014-04-monthly
<section begin="2014-04-monthly"/>We deployed Cirrus as a Beta Feature on all wikis that didn't yet have it. We're working on deploying a change to how snippets are generated that should be faster and better. We're also starting to work with Elasticsearch plugins for improved analysis of some languages as well as backup.<section end="2014-04-monthly"/>

2014-05-monthly
<section begin="2014-05-monthly"/>In May, we deployed changes to improve snippets generated by Cirrus to a handful of wikis, spent some time improving its analysis for Hebrew, and added more backwards compatibility with lsearchd's syntax to Cirrus.<section end="2014-05-monthly"/>

2014-06-monthly
<section begin="2014-06-monthly"/>CirrusSearch is running as the default search engine on all but the highest-traffic wikis at this point. Nik Everett and Chad Horohoe plan to migrate most of the remaining wikis in July, leaving only the German and English Wikipedias to migrate in August.<section end="2014-06-monthly"/>

2014-07-monthly
<section begin="2014-07-monthly"/>I'm writing to send another CirrusSearch update. This one isn't good news. We got a bit overaggressive about pushing Cirrus as the primary search backend for bigger wikis and pushed ourselves over the edge, but in slow motion. Things started breaking down during Europe's peak time on Tuesday. I wrestled with the production system all day, trying to get an accurate fix on exactly how we were failing and to stem the tide. I thought I had it by the end of my day on Tuesday. On my Wednesday morning (Europe's afternoon) I woke to see us slipping again, so I rolled back the "set cirrus as primary" deploys. See here for wikis that don't have cirrus as primary: (Search)

Now that we're stable again I've started working on the problem at the root:
 * 1) We're getting more servers. We're roughly doubling the cluster size.
 * 2) I'm putting together more optimizations for the portion of Cirrus that fell over (the working set). If everything goes as planned, we'll reduce the working set by about 80%. These optimizations trade indexing performance for search performance.

I'm going to start rolling those optimizations out on Monday, July 28th. They won't go everywhere immediately; I'll roll them out gradually as I see how the index-time performance hit affects us.

Also: these changes will shift result relevance somewhat. In my local testing it looked like everything still worked: title still beats redirect still beats category still beats heading still beats article lead-in still beats text still beats image captions and file contents. BUT I still expect things to shift around a bit. Please let me know if you see anything fishy.<section end="2014-07-monthly"/>