Wikibase/Indexing

Goals

 * ideally, a public web service
 * external requests return within a few seconds
 * how to enforce this time limit is still to be determined, and the choice will influence the architecture
 * internal requests are allowed to run much longer
 * internal requests must not crash external requests, and external load must not overwhelm internal requests
 * high concurrency, failover / replication
 * needs to support continuous updates to reflect the latest Wikidata state
 * a lag of seconds, or even a minute or two, seems acceptable at this point, but nothing beyond that
 * support for queries that satisfy the needs of WikiGrok, cf. Extension:MobileFrontend/WikiGrok/Claim_suggestions

Candidate solution: Titan

 * Distributed graph database
 * Supports online modification (OLTP), so can reflect current state
 * Expressive query language (Gremlin), shared with other graph databases such as Neo4j
 * Implemented as a thin stateless layer on top of Cassandra or HBase, which provide transparent sharding, replication and fail-over
 * Async multi-cluster replication can be used to isolate research clusters and for DC fail-over
 * Supports relatively rich indexing, including complex indexes using Elasticsearch
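
To make the indexing and query points more concrete, here is a toy in-memory stand-in for the kind of lookup a claim-suggestion query might need: use an index to find items with one claim, then filter out those that already have another property. This is a sketch in plain Python, not Titan or Gremlin code; `ToyGraph`, its methods, the `item-*` identifiers and the query shape itself are assumptions made for illustration. P31 (instance of), Q5 (human) and P106 (occupation) are used only as familiar Wikidata identifiers.

```python
from collections import defaultdict


class ToyGraph:
    """In-memory stand-in for a property graph with one (property, value) index."""

    def __init__(self):
        self.out_edges = defaultdict(list)  # subject -> [(property, value)]
        self.index = defaultdict(set)       # (property, value) -> {subject}

    def add_claim(self, subject, prop, value):
        """Store a claim and keep the index in sync, as an online update would."""
        self.out_edges[subject].append((prop, value))
        self.index[(prop, value)].add(subject)

    def has_property(self, subject, prop):
        return any(p == prop for p, _ in self.out_edges[subject])

    def missing_property(self, prop, value, wanted):
        """Subjects with claim (prop, value) that have no `wanted` claim."""
        candidates = self.index.get((prop, value), set())
        return sorted(s for s in candidates if not self.has_property(s, wanted))


g = ToyGraph()
g.add_claim("item-a", "P31", "Q5")          # hypothetical item, already has an occupation
g.add_claim("item-a", "P106", "occupation-x")
g.add_claim("item-b", "P31", "Q5")          # hypothetical item, no occupation yet
g.add_claim("item-c", "P31", "other-class")  # not a human, so never a candidate

print(g.missing_property("P31", "Q5", "P106"))  # prints ['item-b']
```

The index lookup keeps the candidate set small before any per-item filtering, which is the property that matters for keeping external requests within their time limit.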