Search/Old/Solr 4

Here are Chad's first impressions of Solr 4.2, after a week of using it.


 * I set up a SolrCloud on labs
 * Used 3 nodes acting as zookeeper (solr-zk[0-2])
 * Used 4 nodes for solr (solr0-solr3), one collection in two shards
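
As a rough sketch of how that layout fits together, the snippet below builds the ZooKeeper connect string and a Collections API CREATE request for it. The host names come from the list above. Everything else is an assumption for illustration: the default ports (2181 for ZooKeeper, 8983 for Solr), the collection name "demo", and a replication factor of 2 (4 nodes spread over 2 shards). It only constructs strings and does not contact any server.

```python
from urllib.parse import urlencode

# Host names taken from the setup notes above.
ZK_HOSTS = [f"solr-zk{i}" for i in range(3)]   # solr-zk0 .. solr-zk2
SOLR_HOSTS = [f"solr{i}" for i in range(4)]    # solr0 .. solr3

# Assumptions: stock default ports, not confirmed for the labs setup.
ZK_PORT = 2181
SOLR_PORT = 8983


def zk_connect_string(hosts, port=ZK_PORT):
    """Comma-separated host:port list, as passed to Solr's zkHost setting."""
    return ",".join(f"{host}:{port}" for host in hosts)


def create_collection_url(solr_host, name, num_shards, replication_factor,
                          port=SOLR_PORT):
    """URL for a Collections API CREATE call against one Solr node."""
    params = urlencode({
        "action": "CREATE",
        "name": name,                              # hypothetical collection name
        "numShards": num_shards,
        "replicationFactor": replication_factor,   # assumed, see lead-in
    })
    return f"http://{solr_host}:{port}/solr/admin/collections?{params}"


if __name__ == "__main__":
    print(zk_connect_string(ZK_HOSTS))
    print(create_collection_url(SOLR_HOSTS[0], "demo", 2, 2))
```

Any of the four Solr nodes could receive the CREATE request, since cluster state lives in ZooKeeper rather than on a particular node.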


 * The new SolrCloud stuff is (mostly) awesome
 * It's super easy to add new replicas to a collection, so it scales out nicely.
 * Each instance can act as a master (for writes) or a slave (for reads)
 * This removes the SPOF of having a single indexer (lsearchd) or a single master (solr 3.x and below)
 * Has a GUI which makes it easy to look at the state of the "cloud"
 * Zookeeper manages config & index state
 * Can't re-shard a collection; that requires an index rebuild. There are bugs reported for this, but no ETA.
 * Proper initial planning for the larger indices would make this less of a priority.


 * Zookeeper was easy to set up, works with standard Ubuntu packages & minimal config
 * Not really a SPOF since it requires multiple instances to run.
 * Formula is "require 50% + 1" to operate, i.e. a strict majority of the servers. So with 3 servers you need 2/3 operational, with 5 you need 3/5, etc.
 * Even the "leader" isn't a SPOF--if the leader goes away then zookeeper elects a new leader.
 * Zookeeper is already used by analytics, and they like it.
 * Unknown how well it would work cross-DC (conflicting reports)
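
The "50% + 1" rule above can be written out as a small calculation. This is only a sketch of the arithmetic (a strict majority, computed with integer division), not anything taken from ZooKeeper's code; the function names are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble must have at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} servers: need {quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")
```

Note that 4 servers tolerate no more failures than 3 do, which is why odd-sized ensembles are the usual choice.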


 * Solr 4.x isn't packaged in Ubuntu yet, not even in raring
 * Installed by hand for the demo, but we'll want to look into real packages.