Talk:VisualEditor/Design/Software overview


About this board

Table start/table end templates

Yair rand (talkcontribs)
Trevor Parscal (WMF) (talkcontribs)

Our goal is for the answer to be no, it will not be disallowed, because all existing wikitext should still work. However, the editor will still not know that the top, mid and bottom templates are in any way related, so the user experience will suffer greatly compared to using templates with an encapsulated model. Eventually we want to see the ability for the template call/transclusion user interface to be customizable using on-wiki template parameter definitions and gadget code - but the editor will still need templates to be encapsulated for these user interfaces to make sense to the user.

This post was posted by Trevor Parscal (WMF), but signed as Trevor Parscal.

G.Hagedorn (talkcontribs)

Another note on "Template Encapsulation": yes, I agree that the hierarchical way is preferable, but I can provide (anecdotal) evidence that we attempted this design a few years ago (around MW version 1.14) and ran into a wall: the parser simply did not allow templates to contain that much content, and threw error messages. We ended up with a Key Start / Lead / Key End structure (random example: http://species-id.net/w/index.php?title=Fannia_postica&action=edit ).

This may be one reason why Wikipedia table templates, given that some lists generated with them are long, often use the serialized version.

Also, I would like to point out that, for all SMW users, Semantic Forms strongly favors the serialized version: multiple records (1:n relations) can be conveniently edited that way. While WMF does not use SMW yet, it is very widely used elsewhere.

Thus, while I agree on the preference for hierarchical display, I wonder whether it is truly impossible to find a solution, a kind of virtual hierarchy. I am thinking of providing the ability to extend existing templates with a "start nesting" parser function, which would nest all further content until the call of some "end nesting" (or the end of the page). This would greatly smooth the transition for existing templates.

Reply to "Table start/table end templates"
(talkcontribs)

I think you're making some heavy mistakes there. I was also thinking about such a live editor and semantic autoformatting, and instead of just starting to hack (OK, I did, and rewrote Preprocessor_DOM in JavaScript) I pondered a lot about parsing and editing. I would have loved to join the hackathon, but I had to study for my tests.

At first I also thought about a top-down document model, but I quickly came to the conclusion that this is only doable for very, very simple pages. An autoformatter that sees an unclosed table/div/whatever never knows what's hidden in the following templates. A live parser/autoformatter/semantic lexer has to use a bottom-up model, just like the current parser. The steps would be:

  1. Getting the XML-like tag hooks, comments and inclusion handlers (what to do if malformed? Currently: run to the end)
  2. Parsing headings, templates and template arguments
  3. Expanding templates
  4. Parsing wikitext into tables/blocks/images/whatever and doing text annotations
  5. Tidying the generated HTML for output

The current parser does the first two steps together; semantically they could be divided. I'm not sure about the fourth step, as I've not dived into the source code yet, so maybe I'm writing nonsense about that.
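The staged, bottom-up pipeline described above could be sketched roughly like this (all names are hypothetical; this is a toy, not the MediaWiki parser, and it only handles headings, comments and simple template calls):

```javascript
// Steps 1+2: lex comments, headings and {{template}} calls first,
// so later passes never look inside them.
function lexTemplates(wikitext) {
  var tokens = [], last = 0, m;
  var re = /\{\{([^{}|]+)((?:\|[^{}]*)*)\}\}|<!--[\s\S]*?-->|==+[^=\n]+==+/g;
  while ((m = re.exec(wikitext)) !== null) {
    if (m.index > last) {
      tokens.push({ type: 'text', text: wikitext.slice(last, m.index) });
    }
    if (m[0].slice(0, 4) === '<!--') {
      tokens.push({ type: 'comment', text: m[0] });
    } else if (m[0].charAt(0) === '{') {
      tokens.push({ type: 'template', name: m[1].trim(),
                    args: m[2] ? m[2].slice(1).split('|') : [] });
    } else {
      tokens.push({ type: 'heading', text: m[0] });
    }
    last = m.index + m[0].length;
  }
  if (last < wikitext.length) {
    tokens.push({ type: 'text', text: wikitext.slice(last) });
  }
  return tokens;
}

// Step 3: expand templates; a plain lookup table stands in for transclusion.
function expandTemplates(tokens, templates) {
  return tokens.map(function (t) {
    return (t.type === 'template' && templates[t.name] !== undefined)
      ? { type: 'text', text: templates[t.name] }
      : t;
  });
}

var tokens = lexTemplates('== Intro ==\nHello {{smile}} world');
var expanded = expandTemplates(tokens, { smile: ':-)' });
```

Later passes (block parsing, HTML tidy) would then consume the expanded token stream, which is what makes the model bottom-up: nothing above this layer ever sees an unexpanded template.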

My conclusion is that a semantic lexer has to start at the bottom, while an autoformatter or editing transaction needs to run down from the top (the generated result) again. Anything else would narrow the range of syntax that could be supported.

Of course, I think it's right to have the document-block-annotatedText model as a data format for saving pages, with the possibility of rendering to HTML4, HTML5, PDF, RSS etc., for quickly generating cached content and, most of all, for creating diffs. But for editing we will have to go deeper into wikitext, which will have to stay as uncomfortable as it is today, and templates should not be a part of the DOM.

Trevor Parscal (WMF) (talkcontribs)

You have some great points and have clearly thought this out. Most of what you are focusing on has to do with the parser, so you might want to get involved over here. One thing I will say, though, is that it's important to remember that there are, and will always be, many edge cases that aren't being addressed. What we hope to do is meet in the middle, between supporting exotic cases and reforming content. While it may not be reasonable for us to support every imaginable edge case, it is quite reasonable for us to provide alternative solutions to the use cases that are causing the edge cases. With careful consideration and research, these alternative solutions can serve the use case and the editor software equally well. It's important to keep a sense of balance in this work, neither diving too deep into edge cases nor pretending there are none. Hopefully you can help User:Brion VIBBER and others who are focused on the parser to keep that balance, and contribute your expertise.

This post was posted by Trevor Parscal (WMF), but signed as Trevor Parscal.

Reply to "Constraints"
P858snake (talkcontribs)
Structured content blocks containing annotated text can provide a way to represent WikiText in a sufficiently abstract manner, allowing WikiText to be parsed, modified and rendered back into WikiText without loss of information, as well as rendered into a variety of formats including a variety of styles of HTML, such as HTML4 or HTML5, a simplified form of HTML for mobile devices, or non HTML formats such as PDF or plain text.

Imho, it would be quite interesting to give users (including programs) a choice of formats to select from for various purposes, including somewhat selective outputs like sections, the table of contents, resource description formats, references for quotations (aka the current Special:Cite), etc. For instance, if one wants to quote from an article when writing a paper, one could ask for a section in LaTeX or .rtf format, paste it into their work, open a footnote, copy the appropriate BibTeX entry from the page's Special:Cite page, close the footnote, and be done without having to worry about converting formats.

Moving comment from page into LQT discussion (Peachey88).
Trevor Parscal (WMF) (talkcontribs)

This, and many other use cases, should be easily supported, since the official structure will be in a generic and easy-to-convert format (we are calling it WikiDom, but it's just an ordered map tree that's easily encoded into JSON).
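The thread doesn't spell out the WikiDom schema, but a hypothetical sketch of what "an ordered map tree that's easily encoded into JSON" might look like for a one-paragraph document (node and annotation names are invented for illustration):

```javascript
// Hypothetical WikiDom-style document: an ordered tree of typed nodes,
// with leaf text carrying offset-based annotations. The real schema
// may differ; this only illustrates the "ordered map tree" idea.
var doc = {
  type: 'document',
  children: [ {
    type: 'paragraph',
    content: {
      text: 'Hello world',
      annotations: [
        { type: 'textStyle/bold', range: { start: 0, end: 5 } }
      ]
    }
  } ]
};

// Because it is plain data, it round-trips through JSON losslessly,
// which is what makes alternative output formats cheap to add.
var copy = JSON.parse(JSON.stringify(doc));
```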

This post was posted by Trevor Parscal (WMF), but signed as Trevor Parscal.

Reply to "Various output formats"
115.152.227.60 (talkcontribs)

What’s a line, and why is it being introduced here? How is a heading limited to one line, while a paragraph is not? There’s no such concept in HTML.

The idea that an element contains an element or content, but not both, is also novel and in need of basic explanation.

Jdforrester (WMF) (talkcontribs)

I believe that the first of these is a basic assumption of wikitext; the second is a modelling assumption.

Catrope (talkcontribs)

The second bit is mostly a modeling assumption for simplicity. It basically means there is a hierarchy of what can contain what:

  • branch nodes can have children that are either branch nodes or content branches, but can't contain content directly. Examples of branch nodes are tables and lists
  • content branches can have children, but those children must be content. Examples of content branches are paragraphs, headings and pre's
  • content nodes "are" content, and can't have children. Examples are text nodes (plain or annotated text), images and br's

This means that some things that are legal in HTML are not legal in our model. For instance, in the HTML that we get, it's common for <li>'s to contain text directly. In our model, that's represented as a list item containing a paragraph containing a text node.
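As an illustration of those containment rules (hypothetical node shapes, not actual VisualEditor code), the `<li>` example could be modeled and checked like this:

```javascript
// <ul><li>foo</li></ul> with text directly inside the <li> is remodeled
// as listItem > paragraph > text, because branch nodes may not contain
// content directly.
var model = {
  type: 'list',                         // branch node
  children: [ {
    type: 'listItem',                   // branch node
    children: [ {
      type: 'paragraph',                // content branch
      children: [ { type: 'text', text: 'foo' } ]   // content node
    } ]
  } ]
};

// A toy validity check for the three-level hierarchy described above.
function isValid(node) {
  if (node.type === 'text') {
    return node.children === undefined;       // content nodes have no children
  }
  if (node.type === 'paragraph' || node.type === 'heading') {
    return node.children.every(function (c) { // content branches hold only content
      return c.type === 'text' && isValid(c);
    });
  }
  return node.children.every(function (c) {   // branch nodes never hold content
    return c.type !== 'text' && isValid(c);
  });
}
```

Under these rules, a list whose child is a bare text node is invalid, which is exactly why the converter wraps such text in a paragraph.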

Reply to "“lines”"
189.35.191.61 (talkcontribs)

I don't understand why you don't use a DOM model, and why you use such a complex and non-standard "linear model".

Catrope (talkcontribs)

The editor uses the linear model internally because it's easier to define transactions on. We could've gotten away with using a DOM model or a DOM-like tree instead, but that would have made a future collaborative editing feature a lot harder to implement. It so happens we do actually use a tree, built from the linear model, for some purposes (including rendering and traversal) because it makes more sense to use a tree for those applications.

We also can't use the input DOM directly because it doesn't have a 1:1 mapping to our conceptual nodes, so there has to be some internal data structure that is different from the input DOM, be it a linear model, a tree, or something else. The points where the 1:1 mapping breaks down are mostly "alien nodes" (things we don't understand and render as an uneditable box; in the DOM, these are usually subtrees or sets of adjacent subtrees rather than a single node) and "meta nodes" (things like categories and magic words; these are <meta>/<link> tags in the DOM, not present in the editor, but still need to be restored in the right place on output).
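A toy version of such a linear model (invented element shapes, heavily simplified from the idea described above) shows why flat offsets make transactions easy to define:

```javascript
// A paragraph containing bold "Hi", flattened into a linear array:
// structure becomes open/close elements, characters become plain items.
var data = [
  { type: 'paragraph' },           // open element
  ['H', ['textStyle/bold']],       // annotated character
  ['i', ['textStyle/bold']],
  { type: '/paragraph' }           // close element
];

// A transaction is then just "retain N items, insert, retain the rest" -
// a single splice on a flat array, with no tree surgery involved.
function applyInsert(data, offset, items) {
  return data.slice(0, offset).concat(items, data.slice(offset));
}

var after = applyInsert(data, 3, ['!']);  // plain '!' before the close element
```

Operations like this compose and transform against each other far more simply than edits addressed by DOM paths, which is the property collaborative editing needs.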

Reply to "Linear model and not DOM?"
Jdforrester (WMF) (talkcontribs)

Our plan for coping with meta-nodes (i.e., non-positional nodes that make page-level changes, but which nevertheless can appear anywhere in the document):

We will have a Meta Linear Model:

  • (ve.dm.Document).meta=[];
  • Transactions & processing
  • A sparsely-indexed array correlating to offsets in the data array
  • Offsets are maintained with splice: insertions splice undefined values into the meta array; deletions splice out the removed range of the meta array but keep its meta elements in place.
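That splice-maintenance scheme could be sketched as follows (hypothetical code, not ve.dm.Document itself; here each meta slot holds a list so that deleted ranges can leave their meta items in place at the deletion offset):

```javascript
function MetaDoc(data) {
  this.data = data.slice();
  this.meta = new Array(this.data.length); // sparse; meta[i] is a list or undefined
}

// Attach a meta item (e.g. a category or magic word) to an offset.
MetaDoc.prototype.addMeta = function (offset, item) {
  (this.meta[offset] = this.meta[offset] || []).push(item);
};

// Insertion: splice undefined slots into the meta array so that
// existing meta items stay aligned with their data offsets.
MetaDoc.prototype.insert = function (offset, items) {
  this.data.splice.apply(this.data, [offset, 0].concat(items));
  this.meta.splice.apply(this.meta, [offset, 0].concat(new Array(items.length)));
};

// Deletion: splice the range out of both arrays, but re-attach any meta
// items found in the removed range at the deletion offset, so page-level
// things like categories survive the edit.
MetaDoc.prototype.remove = function (offset, count) {
  this.data.splice(offset, count);
  var removed = this.meta.splice(offset, count), kept = [];
  removed.forEach(function (list) { if (list) { kept = kept.concat(list); } });
  if (kept.length) {
    this.meta[offset] = (this.meta[offset] || []).concat(kept);
  }
};
```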
Reply to "Coping with meta-nodes"