VisualEditor/Design/Software overview

MediaWiki allows users to quickly edit web pages. Editing is done by modifying an article’s source code directly within the browser. This source code, called WikiText, is a combination of three distinct kinds of syntax: macros, shorthand, and HTML. Macros are either templates or hooks; both are referred to by name and optionally given arguments that direct their expanded result. Shorthand is a meta-syntax for rendering HTML as well as for specifying meta-information about the page. A subset of actual HTML is also allowed to pass through the rendering process, whereas disallowed HTML tags are escaped and rendered as plain text.

This document specifies the information models and technologies required to interact with WikiText visually.

Project status
This project, like this document, is in a research and design phase. Details about the project are subject to change, and will evolve through research and prototyping. This document represents a view of how the software and information models may work, but is not a complete or approved design. For more information about the motivation behind this project, see the Great Movement Projects section of the Product Whitepaper area of the Wikimedia Strategy Wiki.

Objectives
A visual editor should make it easier for new users to contribute productively on a wiki. Studies have shown that entry-level users of MediaWiki have difficulty learning WikiText, which becomes a factor in their decision to limit or stop contributing. Editing therefore tends to be monopolized by those who are able and willing to spend the time to learn WikiText, time that could otherwise be used for actually editing content.

Visual editing should first improve the usability of the most common tasks. Less frequent tasks may still be performed using a source code editing mode. In early versions, a visual editor may only implement a minimal subset of features, so it’s important that these initial features target the most common use cases. Reliance on source code editing should naturally decrease as the software matures.

Visual editing should enhance, not degrade, the ability to inspect what was changed between revisions. "Dirty diffs" are a common pitfall of visual editing systems that work with WikiText; they occur when portions of the document that the user did not intend to change are modified, obscuring the user's contribution. These unexpected modifications can occur when converting WikiText to HTML for editing and then from HTML back into WikiText upon saving. Ideally, a visual editor should be able to keep track of changes more accurately as they are made, and provide information beyond a simple diff, indicating the user’s intentions more clearly. At the very least, a visual editor should not make more work for administrators and editors who are reviewing edits done by others.

Note that visual editing and source editing should not be considered as entirely unrelated. Some existing visual editors simply allow no way to alter components they don't grok, but by using a clean document model we can avoid that. A visual editor that doesn't implement a nice UI for, say, parser function or tag hook invocations will still know the boundaries between the surrounding document and the bits within, and can expose that piece of the document as editable source in-place within a larger editing context.

Constraints
To facilitate visual editing of WikiText, several constraints must be applied to the rendering system. Macros, such as templates and hooks, must be rendered prior to final resolution in the document, and their resulting HTML structures must be balanced. This will allow macros to be safely treated as discrete objects while editing visually. Macros that do not expand into balanced HTML structures, or which cannot be successfully validated, should either be fixed using a best-effort approach, or rejected and replaced with a visible error in the final rendered document.
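
One way to picture the balance requirement is a stack-based check over a macro's expanded HTML. The sketch below is only an illustration: `is_balanced` and `render_macro` are hypothetical names, the error markup is made up, and the sketch takes the "reject" option rather than the best-effort repair mentioned above.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are skipped when balancing.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class BalanceChecker(HTMLParser):
    """Tracks open tags on a stack and flags any mismatched close tag."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.balanced = True

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # A close tag must match the most recently opened tag.
        if not self.stack or self.stack.pop() != tag:
            self.balanced = False


def is_balanced(html_fragment: str) -> bool:
    checker = BalanceChecker()
    checker.feed(html_fragment)
    checker.close()
    # Balanced means no mismatches and nothing left open.
    return checker.balanced and not checker.stack


def render_macro(expansion: str) -> str:
    """Hypothetical policy: pass balanced output through, otherwise show an error."""
    if is_balanced(expansion):
        return expansion
    return '<span class="error">Unbalanced macro output</span>'
```

Under this check, a template that expands to `</td><td>` (closing a cell it did not open) would be rejected, while `<div><b>x</b></div>` would pass through unchanged.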

Normalization
To avoid a long tail of minor differences between manually and automatically written WikiText, normalization can be applied before saving. This will cause a single “dirty diff” when a document is normalized for the first time; however, all subsequent changes will be far more stable, which helps keep the content clean between revisions and reduces the required complexity of the parsing and editing systems. In effect, this both prescribes and enforces an official set of formatting rules, which should improve the readability of WikiText source code.
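
The property that matters here is idempotence: once a document has been normalized, normalizing it again changes nothing, so only the first save produces a dirty diff. The sketch below assumes two made-up formatting rules (one space inside heading markers and one space after `*` bullet markers); the actual rule set is not defined in this document.

```python
import re

# Illustrative rules only; the official formatting rules are an assumption here.
HEADING = re.compile(r"^(={1,6})\s*(.*?)\s*\1\s*$")
BULLET = re.compile(r"^(\*+)\s*(.*)$")


def normalize_line(line: str) -> str:
    line = line.rstrip()
    match = HEADING.match(line)
    if match and match.group(2):
        marks, title = match.group(1), match.group(2)
        return f"{marks} {title} {marks}"
    match = BULLET.match(line)
    if match:
        return f"{match.group(1)} {match.group(2)}".rstrip()
    return line


def normalize(wikitext: str) -> str:
    return "\n".join(normalize_line(line) for line in wikitext.split("\n"))


source = "==Title==\n*item  \nPlain text"
once = normalize(source)
twice = normalize(once)
```

Here `once` differs from `source` (the one-time dirty diff), while `twice` is identical to `once`, so later saves only show what the user actually changed.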

Document model
While WikiText contains HTML, and has traditionally been converted exclusively into HTML, there is no 1:1 correspondence between the two. This is due to a combination of features being present in one but not the other, ambiguities in shorthand syntax, and the generally forgiving nature of WikiText.

Structured content blocks containing annotated text can represent WikiText in a sufficiently abstract manner. This allows WikiText to be parsed, modified, and rendered back into WikiText without loss of information, as well as rendered into other formats: various styles of HTML (such as HTML4, HTML5, or a simplified form of HTML for mobile devices) and non-HTML formats such as PDF or plain text.

Elements
 * Documents: Sequences of blocks, each a discrete element within the document, displayed in sequence from top to bottom.
 * Blocks: Data structures which may contain documents or annotated text, and have specific rendering intentions.
 * Annotated text: Pairing of a string of plain text and an annotation table which defines formatting, rendering and meaning information for ranges of the text.
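
The three element kinds nest inside one another, which is easiest to see as a small set of data types. The following is a minimal sketch, not an approved schema: every class and field name is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Annotation:
    kind: str    # e.g. "bold" (hypothetical label)
    start: int   # first character covered
    end: int     # one past the last character covered


@dataclass
class AnnotatedText:
    text: str
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class Block:
    type: str    # e.g. "paragraph", "heading", "list"
    content: List[Union["Document", AnnotatedText]] = field(default_factory=list)


@dataclass
class Document:
    blocks: List[Block] = field(default_factory=list)


def plain_text(document: Document) -> str:
    """Walk the tree and collect the text, ignoring all annotations."""
    parts = []
    for block in document.blocks:
        for item in block.content:
            if isinstance(item, Document):
                parts.append(plain_text(item))
            else:
                parts.append(item.text)
    return "\n".join(parts)


doc = Document([
    Block("heading", [AnnotatedText("Intro")]),
    Block("list", [
        Document([Block("paragraph", [AnnotatedText("one")])]),
        Document([Block("paragraph",
                        [AnnotatedText("two", [Annotation("bold", 0, 3)])])]),
    ]),
])
```

Because list items are themselves documents, `plain_text(doc)` recurses into them and yields the three strings in top-to-bottom order.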

Blocks
 * Paragraph: Annotated text.
 * Heading: Beginning of a specific section level containing annotated text.
 * List: Mix of nestable ordered and unordered lists containing documents.
 * Table: Grid of columns and rows containing documents.
 * Template: Application controlled content with parameters containing documents.
 * Hook: Application controlled content with plain text parameters.
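
To show how typed blocks can be rendered back into WikiText, here is a rough serializer for four of the block types. It is a sketch under simplifying assumptions: blocks are plain dictionaries with hypothetical keys, and list items and template parameters hold plain strings rather than nested documents.

```python
def render_items(items, prefix=""):
    """Render nested list items; each level appends '*' or '#' to the marker."""
    lines = []
    for item in items:
        marker = prefix + ("#" if item.get("ordered") else "*")
        lines.append(f"{marker} {item['text']}")
        lines.extend(render_items(item.get("items", []), marker))
    return lines


def render_block(block):
    kind = block["type"]
    if kind == "paragraph":
        return block["text"]
    if kind == "heading":
        marks = "=" * block["level"]
        return f"{marks} {block['text']} {marks}"
    if kind == "list":
        return "\n".join(render_items(block["items"]))
    if kind == "template":
        parts = [block["name"]]
        parts += [f"{key}={value}" for key, value in block["params"].items()]
        return "{{" + "|".join(parts) + "}}"
    raise ValueError(f"unsupported block type: {kind}")


def render_document(blocks):
    # Blocks are displayed top to bottom, separated by blank lines.
    return "\n\n".join(render_block(block) for block in blocks)


blocks = [
    {"type": "heading", "level": 2, "text": "Intro"},
    {"type": "paragraph", "text": "Hello"},
    {"type": "list", "items": [
        {"text": "a", "items": [{"text": "b", "ordered": True}]},
    ]},
    {"type": "template", "name": "Infobox", "params": {"title": "X"}},
]
wikitext = render_document(blocks)
```

Tables and hooks are omitted to keep the sketch short; they would follow the same dispatch-on-type pattern.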

Annotated Text
The meaning or appearance of text can be defined by applying annotations to regions of the text. Additional in-line content can be injected by applying rendering annotations at specific positions within the text.

 * Formatting: Bold, italic, internal and external links, etc.
 * Rendering: Images, templates and hooks.
 * Meaning: Semantic relationships and comments.
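
Keeping the text and its annotations separate means one structure can be rendered into more than one output format. The sketch below covers formatting annotations only and assumes that ranges are `(start, end, kind)` tuples with an exclusive end and that they never overlap; the function and table names are hypothetical.

```python
import html

# Opening and closing markers for each annotation kind, per output format.
FORMATS = {
    "wikitext": {"bold": ("'''", "'''"), "italic": ("''", "''")},
    "html": {"bold": ("<b>", "</b>"), "italic": ("<i>", "</i>")},
}


def render(text, annotations, fmt):
    marks = FORMATS[fmt]
    # Only HTML output needs its text escaped.
    escape = html.escape if fmt == "html" else (lambda s: s)
    pieces, pos = [], 0
    for start, end, kind in sorted(annotations):
        if start < pos or end < start:
            raise ValueError("annotations must be ordered and non-overlapping")
        open_mark, close_mark = marks[kind]
        pieces.append(escape(text[pos:start]))
        pieces.append(open_mark + escape(text[start:end]) + close_mark)
        pos = end
    pieces.append(escape(text[pos:]))
    return "".join(pieces)


text = "Hello wiki world"
ranges = [(6, 10, "bold"), (0, 5, "italic")]
as_wikitext = render(text, ranges, "wikitext")
as_html = render(text, ranges, "html")
```

The same `ranges` table produces `''Hello'' '''wiki''' world` as WikiText and `<i>Hello</i> <b>wiki</b> world` as HTML. Overlapping ranges, rendering annotations and meaning annotations would need more machinery than shown here.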

Transactions
As the user works with a document, their use of the mouse and keyboard is interpreted into a series of transactions which can be applied to the document immediately, logged for later analysis, or communicated to other users and transformed against their transactions, allowing real-time collaboration.
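
Transforming one transaction against another can be illustrated with two users inserting text concurrently. This is a sketch of the general operational-transformation idea, not a design this document commits to; the `Insert` type, the `site` tie-breaker and the function names are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    offset: int   # position in the text where the insertion happens
    text: str
    site: int     # hypothetical user id, used only to break ties


def apply(doc: str, op: Insert) -> str:
    return doc[:op.offset] + op.text + doc[op.offset:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has been applied."""
    lands_before = against.offset < op.offset or (
        against.offset == op.offset and against.site < op.site
    )
    if lands_before:
        # The other insertion shifted our target position to the right.
        return Insert(op.offset + len(against.text), op.text, op.site)
    return op


doc = "abc"
a = Insert(1, "X", site=1)
b = Insert(1, "Y", site=2)

# Each user applies their own edit first, then the other's transformed edit.
seen_by_a = apply(apply(doc, a), transform(b, a))
seen_by_b = apply(apply(doc, b), transform(a, b))
```

Both users end up with the same text even though they applied the edits in different orders, which is the property real-time collaboration depends on.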

Block transactions
 * Insert content: Adding content to a block.
 * Delete content: Removing content from a block.
 * Move content: Relocating content from one block to another.
 * Annotate content: Applying an annotation to a portion of content.
 * De-annotate content: Removing an annotation from a portion of content.
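
The main subtlety in block transactions is keeping annotation ranges aligned with the text as it changes. The sketch below assumes a block's content is a string plus a list of `(start, end, kind)` ranges with an exclusive end; the `Content` type, the function names and the edge-case policy (text inserted exactly at a range's end is not covered by it) are illustrative choices.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Range = Tuple[int, int, str]  # (start, end, kind), end exclusive


@dataclass
class Content:
    text: str = ""
    annotations: List[Range] = field(default_factory=list)


def insert(c: Content, offset: int, text: str) -> Content:
    n = len(text)
    shifted = []
    for start, end, kind in c.annotations:
        if start >= offset:          # range lies after the insertion: shift it
            start, end = start + n, end + n
        elif end > offset:           # insertion falls inside the range: grow it
            end += n
        shifted.append((start, end, kind))
    return Content(c.text[:offset] + text + c.text[offset:], shifted)


def delete(c: Content, start: int, end: int) -> Content:
    n = end - start

    def remap(pos: int) -> int:
        # Positions inside the deleted span collapse onto its start.
        return pos if pos <= start else max(start, pos - n)

    kept = []
    for a_start, a_end, kind in c.annotations:
        new_start, new_end = remap(a_start), remap(a_end)
        if new_start < new_end:      # drop ranges that became empty
            kept.append((new_start, new_end, kind))
    return Content(c.text[:start] + c.text[end:], kept)


def annotate(c: Content, start: int, end: int, kind: str) -> Content:
    return Content(c.text, c.annotations + [(start, end, kind)])


def deannotate(c: Content, start: int, end: int, kind: str) -> Content:
    result = []
    for a_start, a_end, a_kind in c.annotations:
        if a_kind != kind or a_end <= start or a_start >= end:
            result.append((a_start, a_end, a_kind))
            continue
        if a_start < start:          # keep the part before the cleared span
            result.append((a_start, start, a_kind))
        if a_end > end:              # keep the part after the cleared span
            result.append((end, a_end, a_kind))
    return Content(c.text, result)
```

For example, annotating "world" in "Hello world" as bold and then inserting "Oh, " at offset 0 moves the bold range from `(6, 11)` to `(10, 15)`, so it still covers "world". Moving content could be composed from `delete` on one block and `insert` on another; carrying the annotations along is left out of this sketch.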

Document transactions
 * Insert block: Adding a new block.
 * Delete block: Removing an existing block.
 * Move block: Rearranging blocks.
 * Merge blocks: Joining two consecutive blocks together.
 * Split block: Dividing a block into two consecutive blocks.
 * Configure document: Change configuration information about the document.
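
At the document level, these transactions are operations on the sequence of blocks. The sketch below makes the simplifying assumption that a document is a list of plain strings, one per block, and that each function returns a new list rather than modifying its input; the function names are hypothetical.

```python
from typing import List


def insert_block(blocks: List[str], index: int, block: str) -> List[str]:
    return blocks[:index] + [block] + blocks[index:]


def delete_block(blocks: List[str], index: int) -> List[str]:
    return blocks[:index] + blocks[index + 1:]


def move_block(blocks: List[str], src: int, dst: int) -> List[str]:
    # `dst` is the position in the list after the block has been removed.
    return insert_block(delete_block(blocks, src), dst, blocks[src])


def split_block(blocks: List[str], index: int, offset: int) -> List[str]:
    text = blocks[index]
    return blocks[:index] + [text[:offset], text[offset:]] + blocks[index + 1:]


def merge_blocks(blocks: List[str], index: int) -> List[str]:
    # Joins the block at `index` with the one immediately after it.
    merged = blocks[index] + blocks[index + 1]
    return blocks[:index] + [merged] + blocks[index + 2:]


paragraphs = ["First paragraph.", "Second paragraph."]
split = split_block(paragraphs, 0, 6)
restored = merge_blocks(split, 0)
```

Splitting a block and then merging the two halves gives back the original sequence, which makes the pair a natural fit for undo. Document configuration is not modeled here.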

Wiki transactions
 * Move block: Move a block from one document to another.
 * Merge documents: Combine two documents into a single document.
 * Split document: Divide a single document into two documents.