User:Stefahn/Solr Docu

My own documentation about Solr and SolrStore.

Indexing and updating

 * You put documents into Solr (called "indexing") via XML, JSON, CSV or binary over HTTP.
 * You can modify a Solr index by POSTing XML documents containing instructions to add (or update) documents, delete documents, commit pending adds and deletes, and optimize the index.
 * schema.xml can specify a "uniqueKey" field (here called "id"). Whenever you POST instructions to add a document with the same uniqueKey value as an existing document, Solr automatically replaces the existing document.
 * Index changes are not visible until they are committed and a new searcher is opened.
 * Commit can be an expensive operation so it's best to make many changes to an index in a batch and then send the commit command at the end.
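
The update messages described above can be sketched in Python. This is only a minimal sketch that builds the XML; the field names "id" and "subject" and the helper names are assumptions for illustration, and actually POSTing the messages to Solr is left out.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build an <add> message from a list of {field name: value} dicts."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_delete_message(doc_id):
    """Build a <delete> message that removes one document by its uniqueKey."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="unicode")


# Send one commit after the whole batch instead of one per document.
COMMIT_MESSAGE = "<commit/>"

# Hypothetical documents; "id" plays the role of the uniqueKey field.
batch = [
    {"id": "1", "subject": "first page"},
    {"id": "2", "subject": "second page"},
]
messages = [build_add_message(batch), COMMIT_MESSAGE]
for message in messages:
    print(message)
```

Adding a document whose "id" already exists in the index would replace the old document, so the same message shape covers both adding and updating.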

Query

 * You query Solr via HTTP GET and receive XML, JSON, CSV or binary results.
 * Basics: http://lucene.apache.org/solr/api-3_6_2/doc-files/tutorial.html#Querying+Data
 * Test and debug queries within your Solr: http://localhost:8080/solr/core0/admin/form.jsp
 * Example search UI: http://localhost:8983/solr/browse
 * Query syntax: http://wiki.apache.org/solr/SolrQuerySyntax
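
A query is just an HTTP GET with URL parameters, so building one can be sketched as follows. The base URL (host, port, core name "core0") and the field name "subject" are assumptions; "q", "rows", "wt" and "sort" are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Assumed address of the core's select handler; adjust host, port and core name.
SELECT_URL = "http://localhost:8080/solr/core0/select"


def build_query_url(query, rows=10, response_format="json", sort=None):
    """Return the GET URL for a query; nothing is sent to Solr here."""
    params = {"q": query, "rows": rows, "wt": response_format}
    if sort:
        params["sort"] = sort
    return SELECT_URL + "?" + urlencode(params)


print(build_query_url("subject:solr", rows=5, sort="id asc"))
```

The resulting URL can be pasted into a browser or fetched with any HTTP client.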

Installation

 * Extension:SolrStore/Install_Solr
 * http://www.icuriousmedia.com/blog/how-to-install-apache-solr-on-windows-xp-1439.php
 * The folder solr in tomcat/webapps is generated automatically. One doesn't need to copy it from other locations.

Restarting Solr
Do the following as root (or with sudo):
 cd /opt
 ./tomcat/bin/shutdown.sh
 ./tomcat/bin/startup.sh

Attention: the plain command "shutdown" (without the path to the Tomcat script) turns off the whole server!

schema.xml
Example: in a field definition with name "subject" and type "text_general", "subject" is the field and "text_general" is the field type (analyzer) that is applied to the field called "subject".
 * Info: http://wiki.apache.org/solr/SchemaXml
 * Located in:
   * SolrStore: solr/core0/conf/
   * Solr example: solr/example/solr/conf
 * The schema defines the field types and the fields in the index, and what type of analysis (field type) is applied to each field.
 * The current schema your server is using may be accessed via the [SCHEMA] link on the admin page.
 * Attention: a comment nested within another comment (<!-- ... <!-- ... --> ... -->) leads to an error, because XML does not allow nested comments.
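
To see which field type each field uses, and to check a schema for well-formedness (for example after commenting something out), a small Python sketch can help. The schema fragment below is a made-up minimal example, not the SolrStore schema; the helper names are also assumptions.

```python
import xml.etree.ElementTree as ET

# Made-up minimal schema fragment for illustration only.
SCHEMA_FRAGMENT = """
<schema name="example" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text_general" class="solr.TextField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="subject" type="text_general" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
"""

# A comment inside a comment: "--" is not allowed within an XML comment.
NESTED_COMMENT = "<schema><!-- outer <!-- inner --> --></schema>"


def field_types(schema_xml):
    """Map each field name to the name of its field type."""
    root = ET.fromstring(schema_xml)
    return {f.get("name"): f.get("type") for f in root.iter("field")}


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(field_types(SCHEMA_FRAGMENT))
print(is_well_formed(NESTED_COMMENT))
```

Running the well-formedness check on a copy of schema.xml before restarting Solr catches the nested-comment mistake early.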

Tips and tricks

 * If you want to sort by an attribute with values like "1 - rookie", "2 - advanced", "3 - expert", don't choose "text_general" as field type, but "string" for example. If you choose text_general, results are sorted in this way: advanced, expert, rookie (apparently because the value is tokenized and the leading "1 -" is skipped somehow).

SolrStore

 * You don't need to define the SMW attributes as fields in your schema.xml. You only need to define fields if you want to do one of the following:
   * You want to sort results by an attribute.
   * You want to have a search input that searches in more than one attribute (for example, search in wikitext and pagetitle at the same time).

multivalued

 * multiValued = the field may contain multiple values per document, i.e. it can appear multiple times in one document.
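
In an XML update message, a multiValued field simply appears as several field elements with the same name. A minimal sketch, assuming a hypothetical multiValued field called "category":

```python
import xml.etree.ElementTree as ET


def build_doc(fields):
    """Build a <doc> element; list values become repeated <field> elements."""
    doc = ET.Element("doc")
    for name, value in fields.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for single in values:
            ET.SubElement(doc, "field", name=name).text = str(single)
    return ET.tostring(doc, encoding="unicode")


# "category" is a hypothetical field that would need multiValued="true".
print(build_doc({"id": "1", "category": ["wiki", "search"]}))
```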

Changing and reindexing
When you change schema.xml, you not only have to restart Solr but also rebuild the index.

Way to go:
 * 1) Stop your application server
 * 2) Change your schema.xml file
 * 3) Delete the index directory in your data directory (Stefan: in the core directory)
 * 4) Start your application server (Solr will detect that there is no existing index and make a new one)
 * 5) Re-Index your data
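
Step 3 can be scripted. The following is a sketch under the assumption that the index lives in data/index below the core directory; the function name is made up, and the demonstration runs on a throwaway directory rather than a real core. Only run something like this while the application server is stopped.

```python
import shutil
import tempfile
from pathlib import Path


def delete_index_dir(core_dir):
    """Remove <core_dir>/data/index; return True if a directory was deleted."""
    index_dir = Path(core_dir) / "data" / "index"
    if not index_dir.is_dir():
        return False
    shutil.rmtree(index_dir)
    return True


# Demonstration on a temporary directory instead of a real Solr core.
with tempfile.TemporaryDirectory() as tmp:
    fake_index = Path(tmp) / "data" / "index"
    fake_index.mkdir(parents=True)
    print(delete_index_dir(tmp))
    print(fake_index.exists())
```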

Ways to reindex:
 * For SMW: Use the following two commands on a shell:
    php SMW_refreshData.php -ftpv
    php SMW_refreshData.php -v
 * See for more info.


 * Script (I don't know how to use it so far; Simon: doesn't work with SMW): http://www.jason-palmer.com/2011/05/how-to-reindex-a-solr-database/
 * Modify articles and save afterwards

Misc:
 * Quitting XAMPP seems to cause no problem: the data is still there the next time XAMPP is launched (reason: the index is saved on disk).
 * In general, you need to be very careful when you change the schema without reindexing - see
 * Alternative to stopping application server: use multi-core - see

Multicore

 * Multicore means one has more than one Solr core
 * Purpose: you can have a single Solr instance with separate configurations and indexes - while having the convenience of unified administration. More info: http://wiki.apache.org/solr/CoreAdmin
 * Cores are defined in solr.xml
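
To check which cores a solr.xml defines, the file can be read with a few lines of Python. The sample below uses the older solr.xml layout with a cores element; the core names "core0" and "core1" and the helper name are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Sample solr.xml content (assumed layout and core names).
SOLR_XML = """
<solr persistent="false">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>
"""


def list_cores(solr_xml):
    """Return (name, instanceDir) for every core defined in solr.xml."""
    root = ET.fromstring(solr_xml)
    return [(c.get("name"), c.get("instanceDir")) for c in root.iter("core")]


for name, instance_dir in list_cores(SOLR_XML):
    print(name, "->", instance_dir)
```

Each core has its own conf directory below its instanceDir, so every core can use a different schema.xml.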

Links

 * http://php-solr-lucene.blogspot.com/