User:GWicke/Notes/Storage

See bug 48483.

Cassandra
Distributed storage with support for indexes, CAS (compare-and-set), and clustering with transparent compression. Avoids IO hot spots (a problem in the ExternalStore sharding scheme).

Idea: Use this for revision storage, with a simple Node.js storage service front-end. This is easier to implement than building a front-end for ExternalStore, and provides testing for possible wider use.

CREATE TABLE revisions (
  id uuid,
  key text,
  ts timestamp,
  value blob,
  PRIMARY KEY (id, key, ts)
) WITH CLUSTERING ORDER BY (ts DESC);
 * helenus: Node.js bindings for Cassandra
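
The table layout can be modelled in a few lines of Python. This is only an in-memory sketch, not Cassandra or helenus code: the `RevisionStore` class and its method names are hypothetical, timestamps are plain integers (seconds), and it assumes the intended clustering order is `key` ascending, then `ts` descending. The sample values are the ones that appear hex-encoded in the sstable2json dump on this page (`666f6f` is `foo`).

```python
class RevisionStore:
    """Hypothetical in-memory stand-in for the revisions table."""

    def __init__(self):
        # partition key (id) -> {(key, ts): value}
        self._partitions = {}

    def put(self, id_, key, ts, value):
        # Writing the same (id, key, ts) again replaces the earlier value.
        self._partitions.setdefault(id_, {})[(key, ts)] = value

    def rows(self, id_):
        # Assumed clustering order: key ascending, ts descending (newest first).
        part = self._partitions.get(id_, {})
        ordered = sorted(part, key=lambda c: (c[0], -c[1]))
        return [(key, ts, part[(key, ts)]) for key, ts in ordered]

    def latest(self, id_, key):
        # With ts descending, the first row for a key is its newest revision.
        for k, ts, value in self.rows(id_):
            if k == key:
                return ts, value
        return None


store = RevisionStore()
page = "550e8400-e29b-41d4-a716-446655440000"
store.put(page, "html", 0, b"foo")
store.put(page, "html", 0, b"foobar")
store.put(page, "wikitext", 0, b"foo")
store.put(page, "wikitext", 1000, b"a really long string")

for row in store.rows(page):
    print(row)
print("latest wikitext:", store.latest(page, "wikitext"))
```

All revisions of one item live in a single partition, and revisions of the same key sit next to each other, which is the property the delta compression idea relies on.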

// sstable2json output demonstrating resulting clustering by column for delta compression
[
  {
    "key": "550e8400e29b41d4a716446655440000",
    "columns": [
      ["html:1969-12-31 16\\:00\\:00-0800:", "", 1379879308010000],
      ["html:1969-12-31 16\\:00\\:00-0800:value", "666f6f", 1379879308010000],
      ["html:1969-12-31 16\\:00\\:00-0800:", "", 1379879315744000],
      ["html:1969-12-31 16\\:00\\:00-0800:value", "666f6f626172", 1379879315744000],
      ["wikitext:1969-12-31 16\\:00\\:00-0800:", "", 1379879325607000],
      ["wikitext:1969-12-31 16\\:00\\:00-0800:value", "666f6f", 1379879325607000],
      ["wikitext:1969-12-31 16\\:16\\:40-0800:", "", 1379879583462000],
      ["wikitext:1969-12-31 16\\:16\\:40-0800:value", "61207265616c6c79206c6f6e6720737472696e67", 1379879583462000]
    ]
  }
]
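
Why the on-disk clustering matters for compression can be illustrated with a small experiment. This is a rough sketch under stated assumptions, not a measurement of Cassandra: the pages are synthetic (random words, about 18 KB per revision), `make_revisions` is a hypothetical helper, and zlib (32 KB window) stands in for the block compressor. It compares a layout where revisions of the same key are adjacent with one where revisions of different keys are interleaved.

```python
import random
import zlib


def make_revisions(seed, n_revisions=4, n_words=3000):
    """Return versions of a synthetic page, each a small edit of the previous one."""
    rng = random.Random(seed)
    vocab = ["w%04d" % i for i in range(2000)]
    words = [rng.choice(vocab) for _ in range(n_words)]
    revisions = []
    for _ in range(n_revisions):
        # Replace a handful of words to mimic a small edit.
        for _ in range(5):
            words[rng.randrange(n_words)] = rng.choice(vocab)
        revisions.append(" ".join(words).encode("ascii"))
    return revisions


html = make_revisions(seed=1)
wikitext = make_revisions(seed=2)

# Clustered: all revisions of one key are stored next to each other.
clustered = b"".join(html + wikitext)
# Interleaved: revisions of different keys alternate, so the previous
# revision of the same key is about 36 KB back, outside the zlib window.
interleaved = b"".join(h + w for h, w in zip(html, wikitext))

clustered_size = len(zlib.compress(clustered, 9))
interleaved_size = len(zlib.compress(interleaved, 9))
print("raw bytes:  ", len(clustered))
print("clustered:  ", clustered_size)
print("interleaved:", interleaved_size)
```

Both layouts hold exactly the same bytes; only the ordering differs. With the clustered ordering, each new revision can be encoded mostly as back-references to its predecessor.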

History compression
History compression used to be more efficient when pages on Wikipedia were still smaller than the compression algorithm's window size (typically 64k): meta:History_compression.

-rw-r--r-- 1 gabriel gabriel 143K Sep 23 14:00 /tmp/Atheism.txt
-rw-r--r-- 1 gabriel gabriel 14M Sep 23 14:01 /tmp/Atheism-100.txt
-rw-r--r-- 1 gabriel gabriel 7.8M Sep 23 14:29 /tmp/Atheism-100.txt.lz4
-rw-r--r-- 1 gabriel gabriel 5.0M Sep 23 14:02 /tmp/Atheism-100.txt.gzip9
-rw-r--r-- 1 gabriel gabriel 1.3M Sep 23 14:01 /tmp/Atheism-100.txt.bz2
-rw-r--r-- 1 gabriel gabriel 49K Sep 23 14:05 /tmp/Atheism-100.txt.lzma
 * 1) The -100 file is 100 concatenations of the single file.
 * 2) The first test uses a page larger than the typical 64k compression window.
 * 3) Only lzma fully picks up the repetition with its large window.

-rw-r--r-- 1 gabriel gabriel 7.0K Sep 23 14:16 /tmp/Storage.html
-rw-r--r-- 1 gabriel gabriel 699K Sep 23 14:16 /tmp/Storage-100.html
-rw-r--r-- 1 gabriel gabriel 6.8K Sep 23 14:17 /tmp/Storage-100.html.gz
-rw-r--r-- 1 gabriel gabriel 5.7K Sep 23 14:29 /tmp/Storage-100.html.lz4
-rw-r--r-- 1 gabriel gabriel 4.9K Sep 23 14:16 /tmp/Storage-100.html.bz2
-rw-r--r-- 1 gabriel gabriel 2.2K Sep 23 14:18 /tmp/Storage-100.html.lzma
 * 1) Now a small (more typical) 7k page, this time as HTML.
 * 2) Compression works well using all algorithms.
 * 3) LZ4 (fast and default in Cassandra) produces smaller output than gzip -9 here (5.7K vs. 6.8K).
 * Size stats enwiki: 99.9% of all articles are < 64k
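
The two experiments above can be approximated with the Python standard library. This is a sketch under assumptions, not a reproduction of the numbers above: the pages are synthetic random-word text rather than real articles, `synthetic_page` and `compressed_sizes` are hypothetical helpers, the large page is repeated only 20 times to keep the run short, and LZ4 is left out because it is not in the standard library.

```python
import bz2
import lzma
import random
import zlib


def synthetic_page(n_bytes, seed=0):
    """Build roughly n_bytes of random-word text as a stand-in for a page."""
    rng = random.Random(seed)
    vocab = ["w%04d" % i for i in range(5000)]
    words = []
    size = 0
    while size < n_bytes:
        word = rng.choice(vocab)
        words.append(word)
        size += len(word) + 1
    return " ".join(words).encode("ascii")


def compressed_sizes(data):
    """Return the raw size and the compressed size per algorithm, in bytes."""
    return {
        "raw": len(data),
        "zlib -9": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }


# A page larger than the compression window, repeated 20 times.
large = compressed_sizes(synthetic_page(143 * 1024) * 20)
# A small, more typical page, repeated 100 times.
small = compressed_sizes(synthetic_page(7 * 1024) * 100)

for label, sizes in (("large page", large), ("small page", small)):
    print(label)
    for name, size in sizes.items():
        print("  %-8s %9d bytes" % (name, size))
```

For the large page, only a compressor whose window spans a whole copy can encode later copies as back-references; for the small page, every copy falls inside the window of all three.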

Alternatives

 * Swift: A bit hacky conceptually. Lacks clustering / compression features. Was not the most reliable when used for thumbnails.
 * Riak: Similar to Cassandra. Does not offer clustering and compression. Reportedly less mature and slower. Smaller community. No cross-datacenter replication in open source edition.

Related REST storage interfaces

 * Amazon S3
 * Swift
 * CouchDB: uses an underscore prefix for private resources