Platform Engineering Team/Personal Development Share Back/Distributed Storage Transactions

Idea
Distributed data stores can provide massive scalability, fault tolerance, and replication semantics for robust geographic distribution, all compelling features for an organization like Wikimedia. However, these systems typically sacrifice important capabilities for the sake of distribution, such as joins and ACID transactions. We must therefore evaluate them as a set of trade-offs between their unique capabilities and what must be given up to make use of them.

To illustrate the problem, consider our core platform. MediaWiki is a multi-user content management system in which users make edits to create new revisions of pages. Users, pages, and their revisions are all objects that require state to be persisted. It makes sense to organize, or model, these objects grouped by like entities (users with users, pages with other pages, etc.). This is called normalization. However, it must still be possible to maintain the correctness of this disjoint data during updates, and to join linked objects at query time. MediaWiki uses an RDBMS, and such systems are well suited to these data models. A distributed database without support for multi-item transactions would require de-normalization of the data, and de-normalization results in duplication and creates significant challenges to maintaining correctness.
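The trade-off can be made concrete with a small sketch. The dictionaries and function names below are hypothetical and are not MediaWiki's actual schema; they only contrast a normalized layout (one copy of each fact, joined at query time) with a de-normalized one (copies embedded in each record), and show how an interrupted multi-record update leaves the copies disagreeing when no multi-item transaction is available.

```python
# Normalized: each entity is stored once; revisions reference users by id.
# (Hypothetical illustration, not MediaWiki's schema.)
users = {1: {"name": "Alice"}}
revisions = {
    100: {"page": "Main_Page", "user_id": 1},
    101: {"page": "Sandbox", "user_id": 1},
}


def author_of(rev_id):
    """Join a revision to its user at query time."""
    return users[revisions[rev_id]["user_id"]]["name"]


def rename_user_normalized(user_id, new_name):
    """A rename touches exactly one record."""
    users[user_id]["name"] = new_name


# De-normalized: the user name is copied into every revision, so no join
# is needed on read, but the same fact now lives in several places.
denormalized_revisions = {
    100: {"page": "Main_Page", "user_name": "Alice"},
    101: {"page": "Sandbox", "user_name": "Alice"},
}


def rename_user_denormalized(old_name, new_name, fail_after=None):
    """Rewrite every copy of the name, one record at a time.

    fail_after simulates a crash after that many records were visited.
    Without a multi-item transaction, the records already rewritten stay
    rewritten and the rest keep the old name.
    """
    for count, rev in enumerate(denormalized_revisions.values()):
        if fail_after is not None and count >= fail_after:
            raise RuntimeError("simulated crash mid-update")
        if rev["user_name"] == old_name:
            rev["user_name"] = new_name
```

Calling `rename_user_denormalized("Alice", "Bob", fail_after=1)` raises part-way through and leaves one revision attributed to "Bob" and the other to "Alice", which is the kind of correctness problem that multi-item transactions are meant to prevent.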

While it is unlikely that we'll ever be able to both have our proverbial cake and eat it, there is a growing body of research that explores the idea of adding transactions to distributed databases. Even limited support for multi-item transactions could be a game-changer, opening the door to use cases that would benefit from distribution but might otherwise be considered intractable.

What is proposed here is a long-term, open-ended project to evaluate, research, and experiment with technologies and techniques to address some of these missing capabilities. This work will focus particularly on Apache Cassandra, since it is a system already in use at Wikimedia.

Phase 1
Create a greenfield implementation of Cherry Garcia, an abstraction for the client-coordinated transaction commitment protocol outlined in "Scalable Distributed Transactions across Heterogeneous Stores":

"...we propose an approach that enables multi-item transactions with snapshot isolation across multiple heterogeneous data stores using only a minimal set of commonly implemented features such as single item consistency, conditional updates, and the ability to store additional meta-data. We define a client-coordinated transaction commitment protocol that does not rely on a central coordinating infrastructure. The application can take advantage of the scalability and fault-tolerance characteristics of modern key-value stores and access existing data in them, and also have multi-item transactional access guarantees with little performance impact."
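To give a feel for what "client-coordinated" commitment built on conditional updates and per-item metadata might look like, here is a deliberately simplified sketch. It is not the Cherry Garcia protocol and it does not implement snapshot isolation or timestamps. The `Store`, `Transaction`, and `Conflict` classes, the `pending` metadata field, and the separate status store are all assumptions made for illustration. The sketch shows only the general idea: the client stages each write with a compare-and-set, records a single status entry that acts as the commit point, and readers use that entry to decide which value is visible.

```python
import uuid


class Conflict(Exception):
    """Raised when a transaction cannot commit and has been rolled back."""


class Store:
    """In-memory stand-in for a key-value store with conditional updates."""

    def __init__(self):
        self.data = {}  # key -> (version, record)

    def get(self, key):
        return self.data.get(key, (0, None))

    def put_if_version(self, key, expected_version, record):
        """Compare-and-set: write only if the stored version is unchanged."""
        version, _ = self.get(key)
        if version != expected_version:
            return False
        self.data[key] = (version + 1, record)
        return True


class Transaction:
    def __init__(self, store, status):
        self.store, self.status = store, status
        self.txid = str(uuid.uuid4())
        self.reads = {}   # key -> version observed when read
        self.writes = {}  # key -> new value, buffered until commit

    def _committed(self, txid):
        return self.status.get(txid)[1] is not None

    def _visible(self, record):
        """Roll a staged write forward if its transaction has committed."""
        if record is None:
            return None
        pending = record["pending"]
        if pending is not None and self._committed(pending["txid"]):
            return pending["value"]
        return record["committed"]

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        version, record = self.store.get(key)
        self.reads[key] = version
        return self._visible(record)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        prepared = {}  # key -> version produced by our staging write
        try:
            # Phase 1: stage every write as item metadata, in a fixed key order.
            for key in sorted(self.writes):
                version, record = self.store.get(key)
                stale = key in self.reads and self.reads[key] != version
                pending = record["pending"] if record else None
                blocked = pending is not None and not self._committed(pending["txid"])
                staged = {"committed": self._visible(record),
                          "pending": {"txid": self.txid, "value": self.writes[key]}}
                if stale or blocked or not self.store.put_if_version(key, version, staged):
                    raise Conflict(key)
                prepared[key] = version + 1
        except Conflict:
            # Undo our own staged writes, then report the conflict.
            for key, version in prepared.items():
                _, record = self.store.get(key)
                self.store.put_if_version(
                    key, version, {"committed": record["committed"], "pending": None})
            raise
        # Commit point: one conditional write of the status record.
        self.status.put_if_version(self.txid, 0, {"state": "committed"})
        # Phase 2: best-effort cleanup; readers can already roll forward.
        for key, version in prepared.items():
            self.store.put_if_version(
                key, version, {"committed": self.writes[key], "pending": None})
```

In this sketch, a transaction that wrote `a` and `b` either has both writes visible (once its status record exists) or neither, and a second transaction that read `a` before the first one committed fails with `Conflict` when it tries to overwrite `a`. An unresolved staged write from another transaction is simply treated as a conflict here; a real protocol would also need timeouts and recovery for clients that crash before the commit point.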