User:Daniel Kinzler (WMDE)/MCR-achitecture


This is a very rough architecture brain dump. It's based on the original MCR proposal as documented in Requests_for_comment/Multi-Content_Revisions#Architecture.

RevisionStore loads and stores RevisionRecords. RevisionRecords provide access to SlotRecords and Content objects through lazy loading. Serialized content is loaded and stored using BlobStore.
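The lazy-loading relationship between these objects can be sketched as follows. This is only an illustration in Python, not MediaWiki code: the method names (`insert_revision`, `get_revision`, `get_blob`, etc.), the in-memory blob store, and the `mem:` address scheme are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Tuple


class InMemoryBlobStore:
    """Hypothetical stand-in for BlobStore: maps addresses to serialized content."""

    def __init__(self) -> None:
        self._blobs: Dict[str, str] = {}
        self.reads = 0  # counts how often a blob is actually fetched

    def store_blob(self, data: str) -> str:
        address = f"mem:{len(self._blobs)}"  # made-up address scheme
        self._blobs[address] = data
        return address

    def get_blob(self, address: str) -> str:
        self.reads += 1
        return self._blobs[address]


class SlotRecord:
    """Slot metadata plus a loader that runs only on first content access."""

    def __init__(self, role: str, model: str, loader: Callable[[], str]) -> None:
        self.role = role
        self.model = model
        self._loader = loader
        self._content: Optional[str] = None
        self._loaded = False

    def get_content(self) -> str:
        if not self._loaded:
            self._content = self._loader()
            self._loaded = True
        return self._content


class RevisionRecord:
    def __init__(self, rev_id: int, slots: Dict[str, SlotRecord]) -> None:
        self.rev_id = rev_id
        self._slots = slots

    def get_slot(self, role: str) -> SlotRecord:
        return self._slots[role]

    def get_content(self, role: str) -> str:
        return self._slots[role].get_content()


class RevisionStore:
    def __init__(self, blob_store: InMemoryBlobStore) -> None:
        self._blob_store = blob_store
        # rev_id -> role -> (content model, blob address)
        self._rows: Dict[int, Dict[str, Tuple[str, str]]] = {}
        self._next_id = 1

    def insert_revision(self, contents: Dict[str, Tuple[str, str]]) -> int:
        """contents maps role -> (content model, serialized content)."""
        rev_id = self._next_id
        self._next_id += 1
        self._rows[rev_id] = {
            role: (model, self._blob_store.store_blob(data))
            for role, (model, data) in contents.items()
        }
        return rev_id

    def get_revision(self, rev_id: int) -> RevisionRecord:
        slots = {}
        for role, (model, address) in self._rows[rev_id].items():
            # Bind the address now; the blob itself is fetched only on demand.
            loader = lambda a=address: self._blob_store.get_blob(a)
            slots[role] = SlotRecord(role, model, loader)
        return RevisionRecord(rev_id, slots)
```

In this sketch, loading a RevisionRecord touches no blobs; only the first call to `get_content` for a given slot reads from the blob store, and later calls reuse the loaded value.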

PageStore loads and stores PageRecords, which are immutable value objects. It also acts as a factory for PageUpdaters, which are stateful controller-like objects that manage the creation of revisions on a page and trigger the necessary updates of derived data (cached rendered content, links tables, etc.).
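A minimal sketch of the split between immutable PageRecords and stateful PageUpdaters, assuming hypothetical method names (`new_page_updater`, `set_content`, `save_revision`) and a simple list standing in for the derived-data updates:

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class PageRecord:
    """Immutable value object; an edit yields a new record instead of mutating."""
    page_id: int
    title: str
    latest_rev_id: int  # assumption for this sketch: 0 means "no revision yet"


class PageUpdater:
    """Stateful, single-use object that collects slot content and saves it."""

    def __init__(self, store: "PageStore", page: PageRecord) -> None:
        self._store = store
        self._page = page
        self._slots: Dict[str, str] = {}
        self._saved = False

    def set_content(self, role: str, content: str) -> None:
        if self._saved:
            raise RuntimeError("updater was already used")
        self._slots[role] = content

    def save_revision(self, summary: str) -> PageRecord:
        if self._saved:
            raise RuntimeError("updater was already used")
        if not self._slots:
            raise ValueError("no content to save")
        self._saved = True
        return self._store._commit(self._page, dict(self._slots), summary)


class PageStore:
    def __init__(self) -> None:
        self._pages: Dict[int, PageRecord] = {}
        self._next_page_id = 1
        self._next_rev_id = 1
        # Stand-in for cache purges, links table updates, and similar work.
        self.derived_updates: List[str] = []

    def create_page(self, title: str) -> PageRecord:
        page = PageRecord(self._next_page_id, title, 0)
        self._next_page_id += 1
        self._pages[page.page_id] = page
        return page

    def get_page(self, page_id: int) -> PageRecord:
        return self._pages[page_id]

    def new_page_updater(self, page: PageRecord) -> PageUpdater:
        return PageUpdater(self, page)

    def _commit(self, page: PageRecord, slots: Dict[str, str], summary: str) -> PageRecord:
        rev_id = self._next_rev_id
        self._next_rev_id += 1
        updated = replace(page, latest_rev_id=rev_id)
        self._pages[page.page_id] = updated
        self.derived_updates.append(f"refresh:{page.title}@{rev_id}")
        return updated
```

The caller's original PageRecord is left unchanged after a save; the updater returns a fresh record, and all mutable state lives in the updater and the store.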

TBD: we need a way for callbacks to access post-PST content and rendered content during the edit process, before the content gets saved. This access is currently provided via WikiPage::prepareContentForEdit and implemented in DerivedPageDataUpdater, but it would be better if the necessary information could be passed to hook handlers directly.

TBD: we need a way to trigger the various updates that follow page update, creation, and deletion, whether these are caused by edits, invalidation, purges, import, undeletion, etc. These operations need access to rendered content, which should be re-used if it is already present, either from the parser cache or directly if that information is on the stack during an edit. This is currently covered by DerivedPageDataUpdater, but needs further refactoring.

RevisionRenderer is a stateless service that generates the canonical rendering of the content of the revision (all slots) for display, preview, indexing, etc. It may rely on SlotRoleHandler and/or PageTypeHandler for composing HTML. The RevisionRenderer relies on the rendered content of the individual slots, which is accessible via a SlotRenderingProvider. The main purpose of the SlotRenderingProvider is lazy creation of ParserOutput and caching: generating ParserOutput is expensive, so it should be done only when needed, and only once. For long-term caching, a subclass of ParserOutput may implement the SlotRenderingProvider interface, to allow the rendering of each slot to be persisted in the ParserCache along with the combined rendering. This would allow a new combined ParserOutput to be constructed after an edit without having to re-generate all output for slots that did not change (and do not depend on slots that changed).
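The "only when needed, and only once" behaviour might look roughly like this. The class `LazySlotRenderingProvider`, its methods, and the `<div class="slot-...">` wrapper are invented for illustration; plain strings stand in for ParserOutput.

```python
from typing import Callable, Dict, List


class LazySlotRenderingProvider:
    """Renders each slot on first request and remembers the result."""

    def __init__(self, renderers: Dict[str, Callable[[], str]]) -> None:
        self._renderers = renderers  # role -> function producing that slot's HTML
        self._cache: Dict[str, str] = {}
        self.render_calls = 0  # counts the expensive render operations

    def roles(self) -> List[str]:
        return list(self._renderers)

    def get_slot_html(self, role: str) -> str:
        if role not in self._cache:
            self.render_calls += 1
            self._cache[role] = self._renderers[role]()
        return self._cache[role]


class RevisionRenderer:
    """Stateless: all per-revision state lives in the provider passed in."""

    def render(self, provider: LazySlotRenderingProvider) -> str:
        parts = [
            f'<div class="slot-{role}">{provider.get_slot_html(role)}</div>'
            for role in provider.roles()
        ]
        return "\n".join(parts)
```

Rendering the same provider twice (for example once for preview and once for indexing) invokes each slot's renderer a single time. Persisting the per-slot results across edits, as described above for the ParserCache, is not modelled here.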

EditControllers are controller-like objects ("interactors") that model the "edit" user action in a presentation-neutral way (no knowledge about HTML/UI or JSON/API or HTTP requests). An EditController uses a PageUpdater to create a revision, after it performs transformations on the user input (section replacement, edit conflict resolution - maybe in the future also PST), and checks the user's ability to perform the action (permission check including blocks and protection, token check, rate limit check, probably also edit filter callbacks).
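As a rough sketch of that flow, the controller below checks the user's ability to act first, then transforms the input, then hands the result to an updater. Everything here is hypothetical: `EditRequest`, the error codes, the stand-in `RecordingPageUpdater`, and the simplification that "sections" are paragraphs separated by blank lines.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class EditRequest:
    """Presentation-neutral description of an edit (no HTTP or UI details)."""
    user: str
    token: str
    text: str
    section: Optional[int] = None  # None means "replace the whole page"


class RecordingPageUpdater:
    """Stand-in for a PageUpdater; only records what would be saved."""

    def __init__(self, current_text: str) -> None:
        self.current_text = current_text
        self.saved: List[str] = []

    def save(self, text: str) -> None:
        self.saved.append(text)
        self.current_text = text


class EditController:
    def __init__(self, updater: RecordingPageUpdater,
                 allowed_users: Set[str], valid_token: str) -> None:
        self._updater = updater
        self._allowed_users = allowed_users
        self._valid_token = valid_token

    def execute(self, request: EditRequest) -> List[str]:
        """Returns a list of error codes; an empty list means the edit was saved."""
        errors = []
        if request.user not in self._allowed_users:
            errors.append("permission-denied")
        if request.token != self._valid_token:
            errors.append("bad-token")
        if errors:
            return errors  # nothing is saved when any check fails
        self._updater.save(self._apply_section(request))
        return []

    def _apply_section(self, request: EditRequest) -> str:
        if request.section is None:
            return request.text
        # Simplification: treat blank-line-separated paragraphs as sections.
        sections = self._updater.current_text.split("\n\n")
        sections[request.section] = request.text
        return "\n\n".join(sections)
```

Because the controller returns plain error codes rather than rendering anything, the same object could in principle back both an HTML form and an API module.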

SlotRoleHandlers declare slot roles, and provide basic functionality. SlotRoleHandlers are available via a SlotRoleRegistry. They at the very least define the content model of the slot. They will probably also provide a mechanism for placing the HTML of the slot's rendered content in a combined ParserOutput.
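One possible shape for the registry, sketched in Python with assumed method names (`define_role`, `get_role_handler`, `place_html`) and a made-up `documentation` role; a list of HTML fragments stands in for the combined ParserOutput:

```python
from typing import Dict, List


class SlotRoleHandler:
    """Declares one slot role and the content model used for it."""

    def __init__(self, role: str, content_model: str) -> None:
        self.role = role
        self.content_model = content_model

    def place_html(self, combined: List[str], html: str) -> None:
        # Default placement for this sketch: append the slot's HTML in a wrapper.
        combined.append(f'<div data-slot="{self.role}">{html}</div>')


class SlotRoleRegistry:
    def __init__(self) -> None:
        self._handlers: Dict[str, SlotRoleHandler] = {}

    def define_role(self, handler: SlotRoleHandler) -> None:
        if handler.role in self._handlers:
            raise ValueError(f"role already defined: {handler.role}")
        self._handlers[handler.role] = handler

    def get_role_handler(self, role: str) -> SlotRoleHandler:
        if role not in self._handlers:
            raise KeyError(f"unknown slot role: {role}")
        return self._handlers[role]

    def get_defined_roles(self) -> List[str]:
        return sorted(self._handlers)
```

Usage would be along the lines of registering a handler for `main` at setup time and looking it up by role whenever a slot's content model or HTML placement is needed.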

PageTypeHandlers replace the WikiPage/WikiFilePage and Article/ImagePage hierarchies. They provide functionality specific to a certain kind of page (article, file, category, message, module, script, template, etc.), which is currently implemented as special cases in WikiPage::onArticleEdit, WikiPage::doEditUpdates, etc. They also specify which slots are required or allowed, and what the model of the main slot is. They also implement the mechanism for combining the HTML of different slots, which by default can rely fully on what the SlotRoleHandlers want to do, but could override behavior for some well-known slots (e.g. providing a side-by-side view for content transcription, or applying template styles to the output on preview).
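To make the slot constraints and the HTML override concrete, here is a small sketch. The `scan` slot, the `TranscriptionPageHandler` class, the problem codes, and the markup are all invented for the example; the default `combine_html` simply concatenates slots rather than delegating to SlotRoleHandlers.

```python
from typing import Dict, FrozenSet, Iterable, List


class PageTypeHandler:
    """Default handler: a single required main slot, slots combined in order."""
    required_slots: FrozenSet[str] = frozenset({"main"})
    allowed_slots: FrozenSet[str] = frozenset({"main"})
    main_model = "wikitext"

    def validate_slots(self, roles: Iterable[str]) -> List[str]:
        """Returns a list of problem codes; empty means the slot set is valid."""
        present = set(roles)
        problems = [f"missing:{r}" for r in sorted(self.required_slots - present)]
        problems += [f"not-allowed:{r}" for r in sorted(present - self.allowed_slots)]
        return problems

    def combine_html(self, slot_html: Dict[str, str]) -> str:
        return "".join(f"<section>{html}</section>" for html in slot_html.values())


class TranscriptionPageHandler(PageTypeHandler):
    """Hypothetical page type showing a scan next to its transcription."""
    required_slots = frozenset({"main", "scan"})
    allowed_slots = frozenset({"main", "scan"})

    def combine_html(self, slot_html: Dict[str, str]) -> str:
        # Override the default layout for the well-known "scan" slot.
        return (
            '<div class="side-by-side">'
            f'<div>{slot_html["scan"]}</div><div>{slot_html["main"]}</div>'
            "</div>"
        )
```

A caller would pick the handler for the page's type, validate the proposed slot set before saving, and use `combine_html` when assembling the combined rendering.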