Multi-Content Revisions/Blob Storage

This page is part of the MCR documentation.

The lowest level of the storage infrastructure is the blob storage service, which allows arbitrary binary data to be stored and retrieved. Several such storage facilities should be available side by side, each addressed by name. In particular, it should be possible to use different storage backends for different slots, or for different content models.

The general interface for storing data blobs is defined by the BlobStore interface:

interface BlobStore {
/**
 * Retrieve a blob, given an address.
 *
 * MCR migration note: this replaces Revision::loadText
 *
 * @param string $blobAddress The blob address as returned by storeBlob(),
 *        such as "tt:12345" or "ex:DB://s16/456/9876".
 * @param int $queryFlags See IDBAccessObject.
 *
 * @throws BlobAccessException
 * @return string binary blob data
 */
public function getBlob( $blobAddress, $queryFlags = 0 );

/**
 * A batched version of BlobStore::getBlob.
 *
 * @param string[] $blobAddresses An array of blob addresses.
 * @param int $queryFlags See IDBAccessObject.
 * @throws BlobAccessException
 * @return StatusValue A status with a map of blobAddress => binary blob data or null
 *         if fetching the blob has failed. Fetch failures are reported as
 *         warnings in the status object.
 * @since 1.34
 */
public function getBlobBatch( $blobAddresses, $queryFlags = 0 );

/**
 * Stores an arbitrary blob of data and returns an address that can be used with
 * getBlob() to retrieve the same blob of data.
 *
 * @param string $data raw binary data
 * @param array $hints An array of hints. Implementations may use the hints to optimize storage.
 * All hints are optional, supported hints depend on the implementation. Hint names by
 * convention correspond to the names of fields in the database. Callers are encouraged to
 * provide the well known hints as defined by the XXX_HINT constants.
 *
 * @throws BlobAccessException
 * @return string an address that can be used with getBlob() to retrieve the data.
 */
public function storeBlob( $data, $hints = [] );
}

The BlobStore has full control over the address that shall be used later to retrieve the blob. storeBlob() returns the address that can be used with getBlob() to retrieve the blob.
The address is completely opaque. It may be based on the blob's content hash, use incremental numbering, or GUIDs, or some other scheme.
Hints are used to expose meta-data to the storage layer to allow for optimization (e.g. revisions from the same page can be grouped together for compression).
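To illustrate the contract described above, here is a minimal sketch of a store that keeps blobs in memory and derives addresses from the content hash. SketchBlobStore, InMemoryHashBlobStore and the "mem:" prefix are hypothetical names made up for this example; the sketch leaves out $queryFlags and batching, and throws a plain RuntimeException where the real interface would throw BlobAccessException.

```php
<?php
// Hypothetical, simplified stand-in for the BlobStore interface above.
// It is not the MediaWiki interface: $queryFlags and batching are omitted.
interface SketchBlobStore {
	public function getBlob( string $blobAddress ): string;
	public function storeBlob( string $data, array $hints = [] ): string;
}

// In-memory store that derives the address from the content hash.
class InMemoryHashBlobStore implements SketchBlobStore {
	/** @var string[] Map of address => data */
	private $blobs = [];

	public function storeBlob( string $data, array $hints = [] ): string {
		// The store alone decides what the address looks like.
		$address = 'mem:' . sha1( $data );
		$this->blobs[$address] = $data;
		return $address;
	}

	public function getBlob( string $blobAddress ): string {
		if ( !isset( $this->blobs[$blobAddress] ) ) {
			// Stand-in for BlobAccessException, to keep the sketch self-contained.
			throw new RuntimeException( "Unknown blob address: $blobAddress" );
		}
		return $this->blobs[$blobAddress];
	}
}

$store = new InMemoryHashBlobStore();
$address = $store->storeBlob( 'Hello, world' );
// Callers treat $address as opaque and only ever hand it back to getBlob().
$data = $store->getBlob( $address );
```

Because this particular sketch uses hash-based addresses, storing the same data twice yields the same address; a store using incremental numbering or GUIDs would behave differently, which is why callers should not rely on the address format.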

Managing multiple storage backends

The mapping between slots and storage backends can be split into two steps:

BlobStore names are associated with BlobStore implementations and configurations, designating a concrete storage location. This association must NEVER change, otherwise any stored data will become inaccessible (this is similar to how externalstore clusters are configured). This mapping is managed by BlobStoreMux (see below).
Slot names are associated with a BlobStore name. This indicates which store is to be used when storing new data. This association can be changed at will. This mapping is not used by the blob storage layer; it is only needed when saving content for a new revision.
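The two mappings above could be expressed as two separate configuration arrays, roughly as sketched below. The array layout, the class names in the configuration, the slot names and the helper storeNameForSlot() are all assumptions made for illustration; no such configuration format is defined by this page.

```php
<?php
// Step 1: BlobStore name => implementation and configuration.
// Entries here must never change once data has been written under a name.
// The class names and options are hypothetical.
$blobStoreConfig = [
	'tt' => [ 'class' => 'TextTableBlobStore' ],
	'ex' => [ 'class' => 'ExternalBlobStore', 'cluster' => 'cluster5' ],
];

// Step 2: slot name => BlobStore name for NEW data.
// This may change at any time; old blobs stay reachable via their address.
$slotStores = [
	'main' => 'tt',
	'metadata' => 'ex',
];

/**
 * Pick the store name to use when saving new content for a slot.
 * Falls back to a default store for slots without an explicit entry.
 */
function storeNameForSlot( array $slotStores, string $slot, string $default = 'tt' ): string {
	return $slotStores[$slot] ?? $default;
}

$mainStore = storeNameForSlot( $slotStores, 'main' );
$otherStore = storeNameForSlot( $slotStores, 'unconfigured-slot' );
```

Only the first array is needed to read existing blobs, since the store name is carried in the blob address; the second array is consulted solely when writing.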

The string returned by storeBlob() is an opaque URL for later loading the slot data using getBlob().

/**
* Stores data to one of several underlying BlobStores, based on a logical name.
* Provides blob address resolution based on the logical name of a blob store
* encoded in a blob address.
*
* @todo find a better name. BlobAddressResolver? BlobStorageManager?
* @todo should this be a BlobStoreRegistry, or have a BlobStoreRegistry?
**/
interface BlobStoreMux extends BlobLookup {

	/**
	 * @param string $storeName the logical name of the BlobStore to save the data to.
	 * @param string $data binary data to store
	 * @param array $hints associative array of hint keys to hint values.
	 *
	 * @todo Add a transaction context as a parameter.
	 *
	 * @return string the permanent canonical address of the blob. Can be used
	 *         with BlobStoreMux::getBlob(). The address will typically contain
	 *         the store name as a prefix, and the address returned by the
	 *         underlying store as a suffix.
	 */
	public function storeBlobTo( $storeName, $data, $hints = [] );
}

(Related code experiment: https://gerrit.wikimedia.org/r/#/c/217710/)

BlobStoreMux manages a number of BlobStores by logical name, and provides address resolution based on such names. Specifically:

The address returned by storeBlobTo is composed of two parts: the name of the BlobStore, and the address returned by the BlobStore.
BlobStoreMux's implementation of getBlob() relies on the prefix in the $address to find the correct BlobStore to load the blob.
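The address composition and resolution described above can be sketched as follows. ArrayBlobStore and SketchBlobStoreMux are hypothetical classes written for this example, and the colon separator is an assumption modelled on the "tt:12345" style addresses shown earlier; the sketch does not implement the actual interfaces.

```php
<?php
// Hypothetical minimal store using incremental numbering for its addresses.
class ArrayBlobStore {
	/** @var string[] */
	private $blobs = [];

	public function storeBlob( string $data, array $hints = [] ): string {
		$this->blobs[] = $data;
		// The address is the index of the blob that was just appended.
		return (string)( count( $this->blobs ) - 1 );
	}

	public function getBlob( string $blobAddress ): string {
		if ( !isset( $this->blobs[$blobAddress] ) ) {
			throw new RuntimeException( "Unknown blob address: $blobAddress" );
		}
		return $this->blobs[$blobAddress];
	}
}

// Sketch of a multiplexer: store name as prefix, inner address as suffix.
class SketchBlobStoreMux {
	/** @var ArrayBlobStore[] Map of store name => store */
	private $stores;

	public function __construct( array $stores ) {
		$this->stores = $stores;
	}

	public function storeBlobTo( string $storeName, string $data, array $hints = [] ): string {
		$innerAddress = $this->getStore( $storeName )->storeBlob( $data, $hints );
		return $storeName . ':' . $innerAddress;
	}

	public function getBlob( string $blobAddress ): string {
		// Split on the first colon only: the inner address may contain colons.
		$parts = explode( ':', $blobAddress, 2 );
		if ( count( $parts ) !== 2 ) {
			throw new InvalidArgumentException( "Malformed blob address: $blobAddress" );
		}
		[ $storeName, $innerAddress ] = $parts;
		return $this->getStore( $storeName )->getBlob( $innerAddress );
	}

	private function getStore( string $storeName ): ArrayBlobStore {
		if ( !isset( $this->stores[$storeName] ) ) {
			throw new InvalidArgumentException( "Unknown blob store: $storeName" );
		}
		return $this->stores[$storeName];
	}
}

$mux = new SketchBlobStoreMux( [
	'tt' => new ArrayBlobStore(),
	'ex' => new ArrayBlobStore(),
] );
$first = $mux->storeBlobTo( 'tt', 'first blob' );
$second = $mux->storeBlobTo( 'ex', 'second blob' );
```

Both inner stores hand out the inner address "0" here, but the prefixed addresses remain distinct. This is also why the name-to-store association must never change: the name is baked into every address already handed out.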

Note: storeBlob() and storeBlobTo() should probably get some kind of transaction context as an additional parameter. See Transaction Management.

Basic BlobStore Implementation

The basic BlobStore implementation is based on the text table. It has support for compression, transcoding, and External Storage. Later, support for External Storage can be improved by bypassing the text table altogether, and using the external storage URL as the blob address. See the section on integration with ExternalStore below.

Besides the SQL-based storage currently used by MediaWiki, BlobStores could be implemented on top of the raw file system, Cassandra, OpenStack Swift, higher-level HTTP-based services like RESTBase, etc.

Hints

Storage hints provide a way to "leak" high-level information to the blob storage layer to allow for optimization. Similarly, hints can be used by the blob storage layer to expose meta-data to higher-level code. All hints are optional; every storage operation should function without them.

Some hints that may be useful:

model: Hint at the content model of the data. BlobStores may use this to optimize storage for well known content models.
format: Hint at the serialization format of the data. BlobStores may use this to optimize storage for well known data formats.
hash: Hint at the hash of the data. BlobStores that need a content hash may use this instead of re-calculating the hash.
page: Hint at the page the data belongs to. BlobStores may use this to group related data together, e.g. for prefetching.
revision: Hint at the revision the data belongs to. BlobStores may use this to group related data together, e.g. for prefetching.
replace: Hint at an address of a blob that is superseded by the new data. BlobStores may use this to discard obsolete data.
similar: Hint at an address of a blob that is probably similar. BlobStores may use this to group similar data together, e.g. for compression.
parent: Hint at the parent revision of the revision the data belongs to. BlobStores may use this to group similar data together, e.g. for compression.
used-by: Hint at the URI of a resource that uses the blob. BlobStores may use this for reference counting.
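As a rough illustration of how a store might consume some of these hints, the sketch below uses the hash hint to avoid recomputing the hash and the page hint to group related blobs. GroupingBlobStore, getAddressesForPage() and the "sha1:" address prefix are hypothetical; the sketch also trusts the supplied hash, which a real store may prefer to verify.

```php
<?php
// Hypothetical store that takes advantage of the "hash" and "page" hints
// but works just as well when no hints are given.
class GroupingBlobStore {
	/** @var string[] Map of address => data */
	private $blobs = [];
	/** @var array[] Map of page ID => list of addresses */
	private $byPage = [];

	public function storeBlob( string $data, array $hints = [] ): string {
		// Reuse the caller's hash if provided, otherwise compute it.
		$hash = $hints['hash'] ?? sha1( $data );
		$address = 'sha1:' . $hash;
		$this->blobs[$address] = $data;

		// Group blobs of the same page, e.g. for later prefetching.
		if ( isset( $hints['page'] ) ) {
			$this->byPage[$hints['page']][] = $address;
		}
		return $address;
	}

	public function getAddressesForPage( int $pageId ): array {
		return $this->byPage[$pageId] ?? [];
	}
}

$store = new GroupingBlobStore();
$hinted = $store->storeBlob( 'text A', [ 'page' => 42, 'hash' => sha1( 'text A' ) ] );
$unhinted = $store->storeBlob( 'text B' );
```

Both calls succeed and return usable addresses; the hints only affect what extra bookkeeping the store can do.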

Integration with ExternalStore

The introduction of BlobStore is not blocked on integrating with ExternalStore, nor is the introduction of Content Meta-Data. An initial BlobStore implementation can use ExternalStore as-is, including the indirection via the text table.

However, to avoid the indirection of storing ExternalStore URLs in the text table, ExternalStore should be integrated with the new BlobStore infrastructure. This poses some minor challenges: the class ExternalStoreMedium exposes methods similar to those of the BlobStoreMux interface described above:

 public function fetchFromURL( $url );
 public function store( $location, $data );

However, ExternalStoreMedium is not accessed directly; instead, the ExternalStore class is used to do the multiplexing between different ExternalStoreMedium objects. It exposes the following relevant methods:

 public static function fetchFromURL( $url, array $params = [] )
 public function insertWithFallback( array $tryStores, $data, array $params = [] )

Note that insertWithFallback doesn't get one store, but a fallback chain of stores; this doesn't seem to be used in practice, though.

The $params argument means that for each call to insertWithFallback or fetchFromURL, a new ExternalStoreMedium instance is created based on these parameters. The main purpose of $params seems to be to select a wiki, in case we are trying to load blobs that belong to another site. With the BlobStoreMux interface, each service instance would be permanent, and bound to a specific wiki. To access blobs for a different wiki, an appropriate BlobStoreMux instance would have to be acquired from a factory.

When introducing BlobStore, the existing older interface needs to be integrated. There seem to be three options:

Don't introduce a new blob storage interface at all; instead, expand and adopt the existing ExternalStore facility. This would be a considerable logical break: ExternalStore would no longer be an implementation detail hidden by the code that manages the text table. On the contrary, it would be the primary way to access content data, and the text table would just be one possible ExternalStoreMedium. Addresses for direct storage in the text table would look something like TT:7641432, while externally stored blobs would keep using addresses like DB://cluster5/873284.
Turn ExternalStore into a BlobStoreMux, in addition to the purely static interface. This means two-level multiplexing, once for the BlobStore interface, and once for the ExternalStoreMedium interface. Blob addresses using the external store would look something like ES:DB://cluster5/873284. (Code experiment: https://gerrit.wikimedia.org/r/#/c/300533)
Create an ExternalStoreMediumBlobStore adapter that implements the BlobStore interface on top of an ExternalStoreMedium instance. The BlobStoreMux would use these adapters directly, bypassing the old ExternalStore class completely. Blob addresses from this adapter would be the same as the old external store URLs, e.g. DB://cluster5/873284.
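The third option, the adapter, might look roughly like the sketch below. SketchMedium and FakeDbMedium are hypothetical stand-ins exposing only the two ExternalStoreMedium methods quoted earlier; the conventions that store() returns the URL and that fetchFromURL() returns false on failure are assumptions of this sketch, as is fixing the storage location in the adapter's constructor.

```php
<?php
// Hypothetical stand-in exposing the two ExternalStoreMedium methods quoted above.
interface SketchMedium {
	public function fetchFromURL( $url );
	public function store( $location, $data );
}

// Fake medium that keeps data in an array and hands out DB:// style URLs.
class FakeDbMedium implements SketchMedium {
	private $rows = [];

	public function store( $location, $data ) {
		$url = "DB://$location/" . ( count( $this->rows ) + 1 );
		$this->rows[$url] = $data;
		return $url;
	}

	public function fetchFromURL( $url ) {
		// Assumption of this sketch: false signals a failed fetch.
		return $this->rows[$url] ?? false;
	}
}

// Adapter: BlobStore-style methods on top of a single medium and location.
class ExternalStoreMediumBlobStore {
	private $medium;
	private $location;

	public function __construct( SketchMedium $medium, string $location ) {
		$this->medium = $medium;
		$this->location = $location;
	}

	public function storeBlob( string $data, array $hints = [] ): string {
		// The external store URL is used as the blob address, unchanged.
		return $this->medium->store( $this->location, $data );
	}

	public function getBlob( string $blobAddress ): string {
		$data = $this->medium->fetchFromURL( $blobAddress );
		if ( $data === false ) {
			// Stand-in for BlobAccessException.
			throw new RuntimeException( "Cannot fetch blob: $blobAddress" );
		}
		return $data;
	}
}

$adapter = new ExternalStoreMediumBlobStore( new FakeDbMedium(), 'cluster5' );
$address = $adapter->storeBlob( 'some revision text' );
$text = $adapter->getBlob( $address );
```

With this shape, the multiplexing between media would be left entirely to the BlobStoreMux, which avoids the two-level multiplexing of the second option.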

Note that with the content meta-data storage in place, we no longer need to write the URL to the text table, since it can be stored in the cont_address field directly. However, when converting from ES-URL-in-the-text-table to ES-URL-in-cont_address, care must be taken to encode the information in old_flags into the URL, e.g. DB://cluster5/873284;latin1,gzip.
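A minimal sketch of such an encoding, following the DB://cluster5/873284;latin1,gzip example above, is shown below. The function names are hypothetical, and the sketch assumes that external store URLs never contain a semicolon and that flag names never contain a comma or semicolon.

```php
<?php
/**
 * Append storage flags (as found in old_flags) to an external store URL.
 * Returns the URL unchanged if there are no flags.
 */
function appendFlagsToAddress( string $url, array $flags ): string {
	return $flags ? $url . ';' . implode( ',', $flags ) : $url;
}

/**
 * Split an address back into the plain URL and the list of flags.
 *
 * @return array Two elements: the URL string, and the array of flags
 */
function splitAddressFlags( string $address ): array {
	$parts = explode( ';', $address, 2 );
	$hasFlags = isset( $parts[1] ) && $parts[1] !== '';
	$flags = $hasFlags ? explode( ',', $parts[1] ) : [];
	return [ $parts[0], $flags ];
}

$address = appendFlagsToAddress( 'DB://cluster5/873284', [ 'latin1', 'gzip' ] );
[ $url, $flags ] = splitAddressFlags( $address );
```

The reader of the address would then apply the flags (for example decompression and transcoding) after fetching the raw data from the URL part.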

Structured Storage

In the future, it may become desirable to store content objects in a structured way, instead of blobs. For example, tabular data could be stored in a relational database table, workflow state could be stored in a document oriented database, etc. To allow this, we would need a storage level interface that handles Content objects instead of blobs:

interface ContentLookup {
	/**
	 * @param string $address The desired content's address, as returned by ContentStore::storeContent().
	 * @param int $queryFlags Bitfield, see the IDBAccessObject::READ_XXX constants.
	 *
	 * @return Content 
	 */
	public function loadContent( $address, $queryFlags = 0 );
}

interface ContentStore extends ContentLookup {
	/**
	 * @param Content $content content to store
	 * @param array $hints associative array of hint keys to hint values.
	 *
	 * @todo Add a transaction context as a parameter.
	 *
	 * @return string the permanent canonical address of the content. Can be used
	 *         with ContentLookup::loadContent().
	 */
	public function storeContent( $content, $hints = [] );
}

And similarly, a ContentStoreMux class, analogous to BlobStoreMux.

Higher-level code could always go through the content-based interface, which would fall back to a generic implementation of ContentStore based on serialized storage wherever no structured storage is available.
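Such a generic fallback could be sketched as below: content is serialized and handed to an ordinary blob store. SketchContent, TinyBlobStore and SerializedContentStore are hypothetical classes for this example. Wrapping the data in a JSON envelope that records the content model is purely an assumption of the sketch, made so that loadContent() can rebuild the object from the address alone; an actual implementation might obtain the model from elsewhere.

```php
<?php
// Hypothetical stand-in for a Content object.
class SketchContent {
	private $model;
	private $text;

	public function __construct( string $model, string $text ) {
		$this->model = $model;
		$this->text = $text;
	}

	public function getModel(): string {
		return $this->model;
	}

	public function serialize(): string {
		return $this->text;
	}
}

// Hypothetical minimal blob store with hash-based addresses.
class TinyBlobStore {
	private $blobs = [];

	public function storeBlob( string $data, array $hints = [] ): string {
		$address = 'mem:' . sha1( $data );
		$this->blobs[$address] = $data;
		return $address;
	}

	public function getBlob( string $blobAddress ): string {
		if ( !isset( $this->blobs[$blobAddress] ) ) {
			throw new RuntimeException( "Unknown blob address: $blobAddress" );
		}
		return $this->blobs[$blobAddress];
	}
}

// Generic content store: serializes content and delegates to a blob store.
class SerializedContentStore {
	private $blobStore;

	public function __construct( TinyBlobStore $blobStore ) {
		$this->blobStore = $blobStore;
	}

	public function storeContent( SketchContent $content, array $hints = [] ): string {
		// Envelope format is an assumption; base64 keeps binary data JSON-safe.
		$envelope = json_encode( [
			'model' => $content->getModel(),
			'data' => base64_encode( $content->serialize() ),
		] );
		// Pass the model along as a hint unless the caller already set one.
		return $this->blobStore->storeBlob( $envelope, $hints + [ 'model' => $content->getModel() ] );
	}

	public function loadContent( string $address ): SketchContent {
		$envelope = json_decode( $this->blobStore->getBlob( $address ), true );
		if ( !is_array( $envelope ) || !isset( $envelope['model'], $envelope['data'] ) ) {
			throw new RuntimeException( "Malformed content envelope at: $address" );
		}
		return new SketchContent( $envelope['model'], base64_decode( $envelope['data'] ) );
	}
}

$contentStore = new SerializedContentStore( new TinyBlobStore() );
$address = $contentStore->storeContent( new SketchContent( 'wikitext', "''Hello''" ) );
$loaded = $contentStore->loadContent( $address );
```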

TBD: it may be more useful to place the structured storage interface at a slightly higher level of abstraction, so it doesn't use opaque addresses, but has access to revision ID and slot name: function loadContent( $revision, $slot, $queryFlags = 0 ). This would allow for virtual slots to be implemented naturally using this interface, though it would not solve the problem of a virtual slot implementation needing access to the primary slot content (and perhaps also content of the parent revision, for blame maps and diffs).

(TBD: This perhaps fits better with Multi-Content Revisions/Revision Retrieval than here)

Batch Interface

A batch interface will likely be required in the future, at least for loading, but perhaps also for writing blobs (e.g. when importing a dump). The batch methods should probably be defined in separate interfaces:

BatchContentLookup::getBlobBatch
BatchContentStore::storeBlobBatch
BatchContentStoreMux::storeBlobBatchTo
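For stores without native batch support, a batch lookup could fall back to a simple loop, as in the sketch below. The helper getBlobBatchFallback() is hypothetical, and it returns a plain array with "blobs" and "warnings" keys as a stand-in for the StatusValue used by getBlobBatch() in the BlobStore interface above.

```php
<?php
/**
 * Loop-based batch lookup on top of a single-blob getter.
 *
 * @param callable $getBlob Function taking an address and returning the data,
 *        throwing an exception on failure
 * @param string[] $blobAddresses
 * @return array 'blobs' maps each address to its data or null on failure;
 *         'warnings' lists the failure messages
 */
function getBlobBatchFallback( callable $getBlob, array $blobAddresses ): array {
	$blobs = [];
	$warnings = [];
	foreach ( $blobAddresses as $address ) {
		try {
			$blobs[$address] = $getBlob( $address );
		} catch ( Exception $e ) {
			// A failed fetch does not abort the batch; it is recorded instead.
			$blobs[$address] = null;
			$warnings[] = $e->getMessage();
		}
	}
	return [ 'blobs' => $blobs, 'warnings' => $warnings ];
}

// Usage with a trivial array-backed getter.
$data = [ 'tt:1' => 'one', 'tt:2' => 'two' ];
$getBlob = function ( string $address ) use ( $data ): string {
	if ( !isset( $data[$address] ) ) {
		throw new RuntimeException( "Unknown blob address: $address" );
	}
	return $data[$address];
};
$result = getBlobBatchFallback( $getBlob, [ 'tt:1', 'tt:9' ] );
```

A backend with real batch capabilities (for instance one query per cluster) would replace the loop, while keeping the same result shape.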