Multi-Content Revisions/Revision Retrieval

From mediawiki.org
This page was part of the MCR proposal.
See the RevisionStore class for the actual implementation.

Storage layer code that deals with revisions should also be migrated from using Revision objects to the RevisionLookup service, which returns RevisionRecord objects.

interface RevisionLookup {
    function getRevision( Title $title, $revisionId ): RevisionRecord;
}

// Declared abstract so that the body-less method signatures are valid PHP.
abstract class RevisionRecord {
  abstract public function getId();
  abstract public function getParentId();
  abstract public function getSha1();
  abstract public function getSize();
  abstract public function getTitle(); // XXX: need?
  abstract public function getPage(); // XXX: need?
  abstract public function getUserId();
  abstract public function getUserName();
  abstract public function getComment();
  abstract public function getFlags(); // minor, ...
  abstract public function isDeleted( $field );
  abstract public function getReadRestrictions(): array; // string[]; getVisibility
  abstract public function getTimestamp();
  abstract public function isCurrent(); // XXX: really?

  abstract public function listPrimarySlots();
  abstract public function getSlotInfo( $slot );
  abstract public function getSlotContent( $slot );
}

(Code experiment: https://gerrit.wikimedia.org/r/#/c/217710)

RevisionRecord should work as a replacement for the Revision class in most situations.

TBD: Shall we use the full Content interface for derived content as well, or should we define a narrower interface with only simple methods like getModel() and isEmpty()?
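To make the second option concrete, here is a minimal sketch of what a narrower interface could look like. Only getModel() and isEmpty() come from the text above; the names SketchDerivedContent and SketchLinkListContent, the model ID, and the return types are assumptions made up for illustration.

```php
<?php
// Hypothetical narrow interface: only the cheap, model-agnostic accessors.
interface SketchDerivedContent {
    public function getModel(): string;
    public function isEmpty(): bool;
}

// Hypothetical derived content: a plain list of link targets.
final class SketchLinkListContent implements SketchDerivedContent {
    /** @var string[] */
    private array $links;

    public function __construct( array $links ) {
        $this->links = $links;
    }

    public function getModel(): string {
        // Made-up model ID, used only in this sketch.
        return 'sketch-link-list';
    }

    public function isEmpty(): bool {
        return $this->links === [];
    }
}

$content = new SketchLinkListContent( [ 'Main_Page' ] );
```

A narrow interface like this would keep derived content free of editing-related methods, at the cost of having two content-like type hierarchies.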

The getSlotInfo and getSlotContent methods will rely on RevisionSlotLookup and BlobLookup. It would perhaps be useful to introduce a structured storage layer in between, so the logic for serializing and deserializing is isolated from RevisionRecord. This would also define an injection point for structured (as opposed to blob) storage services, as well as virtual slots.
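The following sketch shows one way such an intermediate layer might be shaped. RevisionSlotLookup and BlobLookup are named above, but their method signatures here are assumptions, as are the StructuredSlotStore class, the "address" and "model" fields, and the in-memory stubs.

```php
<?php
// Assumed signature: resolves a blob address to raw serialized data.
interface BlobLookup {
    public function getBlob( string $address ): string;
}

// Assumed signature: returns slot metadata as [ 'model' => ..., 'address' => ... ].
interface RevisionSlotLookup {
    public function getSlotInfo( int $revisionId, string $slot ): array;
}

// Hypothetical structured storage layer: the only place that knows how
// blobs are turned into structured data.
class StructuredSlotStore {
    private RevisionSlotLookup $slots;
    private BlobLookup $blobs;
    /** @var callable[] map of content model => deserializer */
    private array $deserializers;

    public function __construct( RevisionSlotLookup $slots, BlobLookup $blobs, array $deserializers ) {
        $this->slots = $slots;
        $this->blobs = $blobs;
        $this->deserializers = $deserializers;
    }

    public function loadContent( int $revisionId, string $slot ) {
        $info = $this->slots->getSlotInfo( $revisionId, $slot );
        $model = $info['model'];
        if ( !isset( $this->deserializers[$model] ) ) {
            throw new RuntimeException( "No deserializer for model $model" );
        }
        $blob = $this->blobs->getBlob( $info['address'] );
        return ( $this->deserializers[$model] )( $blob );
    }
}

// In-memory stubs, for demonstration only.
$slots = new class implements RevisionSlotLookup {
    public function getSlotInfo( int $revisionId, string $slot ): array {
        return [ 'model' => 'json', 'address' => "mem:$revisionId/$slot" ];
    }
};
$blobs = new class implements BlobLookup {
    public function getBlob( string $address ): string {
        return '{"address":"' . $address . '"}';
    }
};
$store = new StructuredSlotStore( $slots, $blobs, [
    'json' => static function ( string $blob ) {
        return json_decode( $blob, true );
    },
] );
$data = $store->loadContent( 42, 'main' );
```

With this shape, a RevisionRecord would only delegate to the store, and a non-blob backend could be swapped in by providing a different store implementation.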

RevisionRecord should have a "lazy" mode (or implementation) that loads slot metadata only on demand, and loads slot content only when specifically requested. In general, getSlotContent() should be considered potentially expensive, since it may trigger lazy loading of the content blob.
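A lazy implementation along these lines could look roughly like the sketch below. The class name LazyRevisionRecordSketch, the two loader callbacks (standing in for RevisionSlotLookup and BlobLookup), and the shape of the slot metadata are all assumptions for illustration, not part of the proposal.

```php
<?php
// Hypothetical lazy record: nothing is loaded in the constructor.
class LazyRevisionRecordSketch {
    private int $id;
    /** @var callable fn( int $revId ): array, slot name => slot info */
    private $slotInfoLoader;
    /** @var callable fn( string $address ): string */
    private $contentLoader;
    private ?array $slotInfo = null;
    /** @var string[] cache of slot name => content */
    private array $content = [];

    public function __construct( int $id, callable $slotInfoLoader, callable $contentLoader ) {
        $this->id = $id;
        $this->slotInfoLoader = $slotInfoLoader;
        $this->contentLoader = $contentLoader;
    }

    public function listPrimarySlots(): array {
        return array_keys( $this->loadSlotInfo() );
    }

    public function getSlotInfo( string $slot ): array {
        $info = $this->loadSlotInfo();
        if ( !isset( $info[$slot] ) ) {
            throw new InvalidArgumentException( "Unknown slot: $slot" );
        }
        return $info[$slot];
    }

    // Potentially expensive: the first call per slot fetches the blob.
    public function getSlotContent( string $slot ): string {
        if ( !array_key_exists( $slot, $this->content ) ) {
            $info = $this->getSlotInfo( $slot );
            $this->content[$slot] = ( $this->contentLoader )( $info['address'] );
        }
        return $this->content[$slot];
    }

    // Slot metadata is fetched once, on first use.
    private function loadSlotInfo(): array {
        if ( $this->slotInfo === null ) {
            $this->slotInfo = ( $this->slotInfoLoader )( $this->id );
        }
        return $this->slotInfo;
    }
}

// Counters that record how often each loader is actually invoked.
$calls = [ 'info' => 0, 'content' => 0 ];
$record = new LazyRevisionRecordSketch(
    7,
    function ( int $revId ) use ( &$calls ): array {
        $calls['info']++;
        return [ 'main' => [ 'model' => 'wikitext', 'address' => "mem:$revId/main" ] ];
    },
    function ( string $address ) use ( &$calls ): string {
        $calls['content']++;
        return "content at $address";
    }
);
```

In this sketch, constructing the record triggers no loading at all; listing slots loads only the metadata, and repeated getSlotContent() calls for the same slot reuse the cached content.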

For virtual slots, getSlotContent() and getSlotInfo() would generate the desired information on the fly. To allow virtual slots to be defined, getSlotInfo() and getSlotContent() can call a hook (or a dedicated service) before falling back to RevisionSlotLookup and BlobLookup to retrieve the Content object. This could perhaps also be implemented in the structured storage layer.
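The "dedicated service" variant might be sketched as follows: providers for virtual slots are consulted first, and only if none claims the slot is the stored content loaded. The names VirtualSlotProviderSketch and SlotContentResolverSketch, the "summary" slot, and the use of plain strings in place of Content objects are assumptions for illustration.

```php
<?php
// Hypothetical service that can generate content for virtual slots.
interface VirtualSlotProviderSketch {
    public function hasSlot( string $slot ): bool;
    public function generateContent( int $revisionId, string $slot ): string;
}

class SlotContentResolverSketch {
    /** @var VirtualSlotProviderSketch[] */
    private array $providers;
    /** @var callable fn( int $revisionId, string $slot ): string; stands in for stored lookup */
    private $storedLoader;

    public function __construct( array $providers, callable $storedLoader ) {
        $this->providers = $providers;
        $this->storedLoader = $storedLoader;
    }

    public function getSlotContent( int $revisionId, string $slot ): string {
        // Virtual slots take precedence and are generated on the fly.
        foreach ( $this->providers as $provider ) {
            if ( $provider->hasSlot( $slot ) ) {
                return $provider->generateContent( $revisionId, $slot );
            }
        }
        // Otherwise fall back to the stored slot content.
        return ( $this->storedLoader )( $revisionId, $slot );
    }
}

$summaryProvider = new class implements VirtualSlotProviderSketch {
    public function hasSlot( string $slot ): bool {
        return $slot === 'summary';
    }

    public function generateContent( int $revisionId, string $slot ): string {
        return "generated summary for revision $revisionId";
    }
};

$resolver = new SlotContentResolverSketch(
    [ $summaryProvider ],
    static function ( int $revisionId, string $slot ): string {
        return "stored $slot of revision $revisionId";
    }
);
```

Callers cannot tell from the return value whether a slot was virtual or stored, which is the property that would let virtual slots be introduced without changing RevisionRecord consumers.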