User:Stefahn/Solr Docu

My own documentation on Solr and SolrStore.

Indexing and updating

 * You put documents into Solr (called "indexing") via XML, JSON, CSV or binary over HTTP.
 * You can modify a Solr index by POSTing XML documents containing instructions to add (or update) documents, delete documents, commit pending adds and deletes, and optimize your index.
 * schema.xml can specify a "uniqueKey" field called "id". Whenever you POST instructions to Solr to add a document with the same uniqueKey value as an existing document, Solr automatically replaces the existing document.
 * Index changes are not visible until they are committed and a new searcher is opened.
 * A commit can be an expensive operation, so it's best to make many changes to an index in a batch and then send the commit command at the end.
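
The update instructions described above can be sketched in Python as follows. This is only an illustration, not how SolrStore does it: the field names ("title"), the document ids and the update URL mentioned in the comments are assumptions. The snippet only builds the XML messages; the helper post_message is defined but not called, so nothing is sent to a server.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build an <add> message from a list of {field name: value} dicts."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_delete_message(doc_id):
    """Build a <delete> message for one uniqueKey value."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="unicode")


COMMIT_MESSAGE = "<commit/>"


def post_message(update_url, message):
    """POST one XML message to an update handler.

    update_url is an assumption, e.g. http://localhost:8080/solr/core0/update
    """
    request = urllib.request.Request(
        update_url,
        data=message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Batch: all adds and deletes first, one single commit at the end.
batch = [
    build_add_message([
        {"id": "page-1", "title": "First page"},
        {"id": "page-2", "title": "Second page"},
    ]),
    build_delete_message("page-3"),
    COMMIT_MESSAGE,
]
```

Sending every entry of batch in order with post_message would make the changes visible only after the final commit message.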

Query

 * You query Solr via HTTP GET and receive XML, JSON, CSV or binary results.
 * Basics: http://lucene.apache.org/solr/api-3_6_2/doc-files/tutorial.html#Querying+Data
 * Test and debug queries within your Solr: http://localhost:8080/solr/core0/admin/form.jsp
 * Example search UI: http://localhost:8983/solr/browse
 * Query syntax: http://wiki.apache.org/solr/SolrQuerySyntax
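
A query is just a GET URL, so building one can be sketched like this. The base URL (taken from the core0 admin link above), the field names and the helper name build_select_url are assumptions; q, rows, wt and fl are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, rows=10, fmt="json", fields=None):
    """Return the GET URL for a query against the /select handler."""
    params = {"q": query, "rows": rows, "wt": fmt}
    if fields:
        # fl limits the fields returned for each document
        params["fl"] = ",".join(fields)
    return base_url + "/select?" + urlencode(params)


url = build_select_url(
    "http://localhost:8080/solr/core0",  # assumed core URL
    "title:solr",                        # assumed field name "title"
    rows=5,
    fields=["id", "title"],
)
print(url)
```

Opening the printed URL in a browser (or with urllib.request.urlopen) should return the matching documents in the requested format.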

Installation

 * Extension:SolrStore/Install_Solr
 * How to install Apache Solr on Windows XP: http://www.icuriousmedia.com/blog/how-to-install-apache-solr-on-windows-xp-1439.php
 * The folder solr in tomcat/webapps is generated automatically. One doesn't need to copy it from other locations.

Restarting Solr
Do the following as root (or with sudo):
 cd /opt
 ./tomcat/bin/shutdown.sh
 ./tomcat/bin/startup.sh

Attention: the plain command "shutdown" (without the Tomcat path) turns off the whole server!

schema.xml

 * http://wiki.apache.org/solr/SchemaXml
 * in SolrStore: located in core0/conf/
 * Defines the fields of documents, their field types, and what type of analysis is applied to them.
 * The schema your server is currently using can be viewed via the [SCHEMA] link on the admin page.
 * Analyzers are applied to field types.
 * Attention: a comment nested within another comment leads to an error (XML does not allow "--" inside a comment).
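
The nested-comment problem can be checked before restarting Solr with a small well-formedness test. This is a sketch using Python's standard XML parser, not Solr's own loader; the two sample strings are made-up fragments, not a real schema.xml.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Made-up fragments for illustration only.
GOOD = '<schema name="example"><!-- one comment --><fields/></schema>'
BAD = '<schema name="example"><!-- outer <!-- inner --> outer --><fields/></schema>'

good_ok = is_well_formed(GOOD)
bad_ok = is_well_formed(BAD)
```

For a real file, reading its content and passing it to is_well_formed gives a quick check after commenting out parts of the schema.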

Changing and reindexing

 * Script (I don't know how to use it yet): http://www.jason-palmer.com/2011/05/how-to-reindex-a-solr-database/
 * What helps to re-index: edit the articles and save them again.
 * Quitting XAMPP seems to cause no problem: the data is still there the next time XAMPP is launched.

Note that when you change schema.xml you usually have to not only restart Solr but also rebuild the index. In general, be very careful when changing the schema without reindexing. The most efficient/complete way is to:
 1. Stop your application server.
 2. Change your schema.xml file.
 3. Delete the index directory in your data directory (Stefan: in the core directory).
 4. Start your application server (Solr will detect that there is no existing index and create a new one).
 5. Re-index your data (see the notes above).
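
Step 3 could be scripted roughly as below. The location data/index inside the core directory is an assumption and may differ per installation; the helper name delete_index_dir is hypothetical. The demonstration runs on a throwaway temporary directory, not on a real Solr installation, and the application server must be stopped before doing this for real.

```python
import shutil
import tempfile
from pathlib import Path


def delete_index_dir(core_dir):
    """Remove <core_dir>/data/index if it exists (assumed layout).

    Returns True if a directory was removed, False if there was none.
    """
    index_dir = Path(core_dir) / "data" / "index"
    if index_dir.is_dir():
        shutil.rmtree(index_dir)
        return True
    return False


# Demonstration on a temporary directory that mimics a core folder.
with tempfile.TemporaryDirectory() as tmp:
    demo_core = Path(tmp) / "core0"
    (demo_core / "data" / "index").mkdir(parents=True)
    (demo_core / "data" / "index" / "segments_1").write_text("dummy")
    removed_first = delete_index_dir(demo_core)   # index existed
    removed_second = delete_index_dir(demo_core)  # already gone
```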

Alternative: http://wiki.apache.org/solr/CoreAdmin#RELOAD
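
The RELOAD action is called via a plain URL, which can be sketched as follows. Host, port and core name are assumptions, and the path /admin/cores is only the usual default (it is set by adminPath in solr.xml). As far as I understand, RELOAD re-reads the configuration but does not rebuild the index.

```python
from urllib.parse import urlencode


def build_reload_url(solr_base_url, core_name):
    """Return the CoreAdmin URL that reloads one core."""
    params = {"action": "RELOAD", "core": core_name}
    return solr_base_url + "/admin/cores?" + urlencode(params)


# Assumed base URL and core name.
reload_url = build_reload_url("http://localhost:8080/solr", "core0")
print(reload_url)
```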

Multicore

 * Multicore means running more than one Solr core in a single Solr instance.
 * Purpose: you can have a single Solr instance with separate configurations and indexes - while having the convenience of unified administration. More info: http://wiki.apache.org/solr/CoreAdmin
 * Cores are defined in solr.xml
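
To see which cores a solr.xml defines, the file can be read with a few lines of Python. The sample below follows the older solr.xml layout with a cores element; the core names and the second core are made up for illustration, and a real file may contain more attributes.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the older <cores> layout.
SOLR_XML = """<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>"""


def list_cores(solr_xml_text):
    """Return {core name: instance directory} for every <core> element."""
    root = ET.fromstring(solr_xml_text)
    return {
        core.get("name"): core.get("instanceDir")
        for core in root.findall("./cores/core")
    }


cores = list_cores(SOLR_XML)
print(cores)
```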

Links

 * http://php-solr-lucene.blogspot.com/