User:GWicke/Notes/Storage

See bug 48483.

Cassandra
Distributed storage with support for indexes, CAS (compare-and-set), and clustering with transparent compression. Avoids IO hot spots, which are a problem in the ExternalStore sharding scheme.

Idea: Use Cassandra for revision storage, behind a simple Node.js storage service front-end. This is easier to implement than a front-end for ExternalStore, and it provides testing for possible wider use.

CREATE TABLE revisions (
    id uuid,
    key text,
    ts timestamp,
    value blob,
    PRIMARY KEY (id, key, ts)
) WITH CLUSTERING ORDER BY (key ASC, ts DESC);
 * helenus: Node.js bindings for Cassandra
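To make the effect of the primary key concrete, here is a minimal in-memory sketch of how rows in one partition would be ordered and how the latest revision could be read. It is only a model of the table layout, not Cassandra client code: the `RevisionStore` class, its method names, and the use of integer millisecond timestamps are assumptions made for illustration.

```python
class RevisionStore:
    """In-memory stand-in for the revisions table (hypothetical, not a Cassandra client)."""

    def __init__(self):
        # partition key (id) -> {(key, ts): value}
        self._partitions = {}

    def put(self, page_id, key, ts, value):
        # Writing the same (id, key, ts) again overwrites, like a CQL upsert.
        self._partitions.setdefault(page_id, {})[(key, ts)] = value

    def rows(self, page_id):
        # Within a partition, order by key ascending, then ts descending,
        # mirroring PRIMARY KEY (id, key, ts) with ts DESC.
        part = self._partitions.get(page_id, {})
        ordered = sorted(part, key=lambda c: (c[0], -c[1]))
        return [(key, ts, part[(key, ts)]) for key, ts in ordered]

    def latest(self, page_id, key):
        # The first row for a key is its newest revision because ts sorts descending.
        for k, ts, value in self.rows(page_id):
            if k == key:
                return ts, value
        return None


store = RevisionStore()
pid = "550e8400-e29b-41d4-a716-446655440000"
store.put(pid, "html", 0, b"foo")
store.put(pid, "html", 0, b"foobar")  # same key and ts: overwrites
store.put(pid, "wikitext", 0, b"foo")
store.put(pid, "wikitext", 1000, b"a really long string")

for row in store.rows(pid):
    print(row)
print(store.latest(pid, "wikitext"))
```

With descending `ts`, reading the current revision of a key is a read of the first row in its range, and all revisions of one key stay next to each other, which is the property the compression argument relies on.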

// sstable2json output demonstrating the resulting clustering by column for delta compression
[
  {
    "key": "550e8400e29b41d4a716446655440000",
    "columns": [
      ["html:1969-12-31 16\\:00\\:00-0800:", "", 1379879308010000],
      ["html:1969-12-31 16\\:00\\:00-0800:value", "666f6f", 1379879308010000],
      ["html:1969-12-31 16\\:00\\:00-0800:", "", 1379879315744000],
      ["html:1969-12-31 16\\:00\\:00-0800:value", "666f6f626172", 1379879315744000],
      ["wikitext:1969-12-31 16\\:00\\:00-0800:", "", 1379879325607000],
      ["wikitext:1969-12-31 16\\:00\\:00-0800:value", "666f6f", 1379879325607000],
      ["wikitext:1969-12-31 16\\:16\\:40-0800:", "", 1379879583462000],
      ["wikitext:1969-12-31 16\\:16\\:40-0800:value", "61207265616c6c79206c6f6e6720737472696e67", 1379879583462000]
    ]
  }
]
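The column names in the sstable2json output above are composites of the form `key:ts:column`, with colons inside the timestamp escaped, and the values are hex-encoded. The following sketch decodes an excerpt of that output (value columns only) to make the layout easier to read. The helper names `split_composite` and `decode` are made up for this example, and the interpretation of the name format is inferred from the dump itself.

```python
import json
import re
from itertools import groupby

# Excerpt of the dump above, value columns only.
DUMP = r'''
[{"key": "550e8400e29b41d4a716446655440000", "columns": [
  ["html:1969-12-31 16\\:00\\:00-0800:value", "666f6f", 1379879308010000],
  ["html:1969-12-31 16\\:00\\:00-0800:value", "666f6f626172", 1379879315744000],
  ["wikitext:1969-12-31 16\\:00\\:00-0800:value", "666f6f", 1379879325607000],
  ["wikitext:1969-12-31 16\\:16\\:40-0800:value",
   "61207265616c6c79206c6f6e6720737472696e67", 1379879583462000]
]}]
'''


def split_composite(name):
    """Split a composite column name on unescaped ':' and unescape each part."""
    parts = re.split(r'(?<!\\):', name)
    return [p.replace('\\:', ':') for p in parts]


def decode(dump):
    """Return (key, ts, value_bytes, write_timestamp) for every value column."""
    rows = []
    for partition in json.loads(dump):
        for name, hex_value, write_ts in partition["columns"]:
            key, ts, column = split_composite(name)
            if column == "value":
                rows.append((key, ts, bytes.fromhex(hex_value), write_ts))
    return rows


rows = decode(DUMP)
for key, ts, value, write_ts in rows:
    print(key, ts, value)

# Revisions of the same key sit next to each other in the dump.
keys = [r[0] for r in rows]
contiguous = len(list(groupby(keys))) == len(set(keys))
print("same-key revisions contiguous:", contiguous)
```

The hex values decode to `foo`, `foobar`, `foo` and `a really long string`. Revisions stored under the same key are adjacent, so similar content ends up in the same compressed block.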

Alternatives

 * Swift: A bit hacky conceptually. Lacks clustering / compression features. Was not the most reliable when used for thumbnails.
 * Riak: Similar to Cassandra. Does not offer clustering and compression. Reportedly less mature and slower. Smaller community. No cross-datacenter replication in the open source edition.

Related REST storage interfaces

 * Amazon S3
 * Swift
 * CouchDB: underscore prefix for private resources