VisualEditor/Design/Software overview

MediaWiki allows users to quickly edit web pages. Editing is done by modifying an article’s source code directly within the browser. This source code, called Wikitext, is a combination of three distinct kinds of syntax: macros, shorthand, and HTML. Macros are either templates or hooks, both referred to by name and optionally given arguments which influence their expanded result. Shorthand is a meta syntax for rendering HTML as well as specifying meta information for the page. A subset of actual HTML is also allowed to pass through the rendering process, whereas the use of disallowed HTML tags are escaped and rendered as plain text.

This document specifies the information models and technologies required to interact with Wikitext visually.

Project status
This project, like this document, is in a research and design phase. Details about the project are subject to change, and will evolve through research and prototyping. This document represents a view of how the software and information models may work, but is not a complete or approved design. For more information about the motivation behind this project, see the Great Movement Projects section of the Product Whitepaper area of the Wikimedia Strategy Wiki.

Objectives
A visual editor should make it easier for new users to contribute productively on a wiki. Studies have shown that entry level users of MediaWiki have difficulties learning Wikitext; that becomes a factor in their deciding to limit or stop contributing. Thus editing tends to be monopolized by those who are able and willing to spend the time to learn Wikitext, time that otherwise could be used for actually editing content.

Visual editing should first improve the usability of the most common tasks. Less frequent tasks may still be performed using a source code editing mode. In early versions, a visual editor may only implement a minimal subset of features, so it’s important that these initial features target the most common use cases. Reliance on source code editing should naturally decrease as the software matures.

Visual editing should enhance, not degrade, the ability to inspect what was changed between revisions. "Dirty diffs" are a common pitfall of visual editing systems that work with Wikitext; they occur when portions of the document that the user did not intend to change are modified, obscuring the user's contribution. These unexpected modifications can occur when converting Wikitext to HTML for editing and then from HTML back into Wikitext upon saving. Ideally, a visual editor should be able to more accurately keep track of changes as they are made, and provide information beyond a simple diff, indicating more clearly the user’s intentions. At the very least, a visual editor should not make more work for administrators and editors who are reviewing edits done by others.

Editing a document visually may occasionally involve the use of source code, especially while the editor is under development. When a visual editor is given a complex document that contains syntax it's not designed to handle, one of three choices must be made; ignore the unrecognized syntax and risk loosing information, disable the visual editor for the entire page and return to source code editing or isolate these portions of the document to be edited as source code while leaving other portions of the document editable visually. Over time more of these edge cases will be resolved as more user interfaces are developed around them. This progressive enhancement approach will allow incremental development and maximize the number of documents which the visual editor can edit while safe-guarding against data corruption. To ensure that the editor need not be disabled on a document, some syntax that may have been previously accepted will need to be constrained.

Constraints
To enhance the user experience of visual editing of Wikitext, and support a more portable document object model, several constraints will need to be applied to the parsing and rendering systems. Macros, such as templates and hooks, will need to be rendered prior to final resolution in the document, and their resulting HTML structures will need to be complete and valid HTML. This will allow macros to be safely treated as discreet objects while editing visually or when rendering into various alternative formats. Macros that do not expand into valid HTML structures should either be fixed using a best-effort approach, or rejected and replaced with a visible error in the final rendered document.



The sequential example illustrates a template is poorly factored, using separate templates to open the table, add rows and cells, and to close it out. While the hierarchical document structure can still describe this template, a visual editor will not be able to present the user with a sensible user interface because there is no clear connection between the opening, contents and closing. Finding and migrating old templates in this style will be an ongoing task to be considered in how new tools get rolled out.

Ideally templates that generate tables or other arbitrarily long lists of content would be invoked using a method more similar to the encapsulated example. It's important to not that this may require more powerful iteration utilities for the template processor. An added benefit of this approach is that the rendered result could easily encapsulate the content that each template expanded to without breaking the HTML structure.

Normalization
To avoid a long tail of minor differences in manually and automatically written Wikitext, normalization can be applied before saving. This will cause a single “dirty diff” when a document is normalized for the first time, however all subsequent changes will be far more stable helping to improve the cleanliness of the content between revisions and reduce the required complexity of the parsing and editing systems. This will in effect both prescribe and enforce an official set of formatting rules, which should result in automatic consistency and thus improved readability of Wikitext source code.

Document model
While Wikitext contains, and has been traditionally converted exclusively into, HTML, there is not a 1:1 correlation between Wikitext and HTML due to a combination of features being present in one but not the other, ambiguities in shorthand syntax and the general forgiving nature of Wikitext.

Parsing Wikitext and converting it into a data structure of blocks, each containing content, allows Wikitext to be represented in a sufficiently abstract manner, such that it can be modified and rendered back into Wikitext without loss of information, as well as rendered into a variety of formats including a variety of styles of HTML, such as HTML4 or HTML5, a simplified form of HTML for mobile devices, or non HTML formats such as PDF or plain text.

Elements
Sequences of blocks, each a discreet element within the document which is displayed in sequence from top to bottom. Data structures which may contain content or nested documents, and have specific interaction models and rendering intentions. Pairing of a string of plain text and a series of annotations, each defining formatting, rendering or meaning information for ranges of the text.
 * Documents
 * Blocks
 * Content

Blocks
Series of lines of content. Single line of content and a heading level. Series of items, each containing a single line of content and any number of nested lists. Series of rows, each containing a series of columns, each containing containing a document. Application controlled content with any number of parameters, each a document. Application controlled content with plain text parameters.
 * Paragraph
 * Heading
 * List
 * Table
 * Template
 * Hook

Content
The meaning or appearance of text can be defined by applying annotations to regions of the text. Additional in-line content can be injected by applying rendering annotations at a specific position within the text, typically occupied by a space character.

Bold, italic, internal and external links, etc. Images, templates and hooks. Semantic relationships and comments. (not currently supported in Wikitext)
 * Formatting
 * Rendering
 * Meaning



Transactions
As the user works with a document, their use of the mouse and keyboard is interpreted into a series of transactions which can be applied to the document immediately, logged for later analysis, or communicated to other users and transformed against their transactions, allowing real-time collaboration.

Block transactions
Adding content to a block. Removing content from a block. Add or remove annotation on a range of content.
 * Insert content
 * Remove content
 * Annotate content

Document transactions
Adding a new block. Removing an existing block. Dividing a block into two consecutive blocks. Joining two consecutive blocks together. Rearranging blocks. Change configuration information about the document.
 * Insert block
 * Remove block
 * Split block
 * Merge blocks
 * Arrange blocks
 * Configure document

Wiki transactions
Move a block from one document to another. Divide a single document into two documents. Combine two documents into a single document.
 * Move block
 * Split document
 * Merge documents

Wikitext Representations


Once parsed, Wikitext will be converted into a serializable data structure, which can be sent to a web browser as JSON data. Once at the client, the VisualEditor converts the serialized form into a data structure optimized for editing and rendering, so that user interactions are responsive. HTML, or any other output format, can also be generated from the serialized form.

Linear Addressability


At the document level, all content is accessed in a single linear address space. Blocks have local address space which is utilized during rendering and interaction, and handle translating the positions of aggregate contents if any, such as in the case of lists.