Parser 2011/Stage 2: Informal grammar

...

Loose structure assembly
Our primary document structure nests along the lines of template expansions, parser functions, tag hooks, and links, which have pretty firmly defined start, end, and content.

However, a lot of constructs are built on looser structure; the first formal parsing steps will give us tokens/nodes for pieces of those looser structures, like tables and HTML elements, which then need to be assembled into higher-level structures.

Separate nesting levels
While not popular with purists, structures that span separate nesting levels are a fact of life in existing wikitext. :) It is not at all uncommon to find structures like this:


 * some page
 * start template
 * row template
 * || blah
 * end template
 * |}
 * end template
 * |}

When producing HTML output, we need to emit something that looks more like:


 * some page
 * <table>
 * <tr>
 * <td> blah
 * </table>
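
As a rough illustration of that assembly step, here is a minimal sketch in Python (used purely for illustration). It assumes the earlier stages have already flattened the template output into one stream of `(kind, text)` tokens; the token kinds `table_start`, `row`, `cell`, `table_end`, and `text`, and the function name `assemble_table`, are hypothetical and not part of any existing parser. It adds implicit opens and closes where pieces are missing and falls back to plain text for stray table tokens.

```python
def assemble_table(tokens):
    """Assemble flat table tokens into nested HTML, adding implicit tags.

    `tokens` is a list of (kind, text) pairs; the kinds are hypothetical.
    """
    out = []
    stack = []  # currently open elements, outermost first

    def close_while(tags):
        # Close open elements from the top of the stack while they match.
        while stack and stack[-1] in tags:
            out.append(f"</{stack.pop()}>")

    def open_tag(tag):
        out.append(f"<{tag}>")
        stack.append(tag)

    for kind, text in tokens:
        if kind == "table_start":
            open_tag("table")
        elif kind == "row" and "table" in stack:
            close_while({"td", "tr"})
            open_tag("tr")
        elif kind == "cell" and "table" in stack:
            close_while({"td"})
            if stack[-1] != "tr":
                open_tag("tr")  # implicit row for a cell with no row marker
            open_tag("td")
            out.append(text)
        elif kind == "table_end" and "table" in stack:
            close_while({"td", "tr"})
            out.append(f"</{stack.pop()}>")  # the table itself
        else:
            # Plain text, or a stray table token with no open table:
            # "disable" it by emitting its source text unchanged.
            out.append(text)

    # Implicit closes for anything still open at the end of input.
    while stack:
        out.append(f"</{stack.pop()}>")
    return "".join(out)


page = [("text", "some page"), ("table_start", "{|"), ("row", "|-"),
        ("cell", "blah"), ("table_end", "|}")]
print(assemble_table(page))
```

The sketch ignores attributes, header cells, captions, and the question of which template each token came from, all of which a real assembler would need to track.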

Similarly, sequences of HTML-like or table tags may be missing clearly ordered close tags, etc., and may need to be disabled or have implicit closes added. Even if all the pieces are there, their nesting may not match the brace structure:


 * some page
 * bla bla
 * bla bla

Here, we have to "pull through" those #if function blocks.

Sequences of adjacent list item lines may similarly need to be reassembled into a properly nested structure of lists and list items when producing HTML or similar output -- they too may come from a combination of different template/function nesting levels.
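
The list reassembly could be sketched along the following lines, again in Python for illustration only. The assumptions here are that the item lines have already been collected into one flat sequence regardless of which template or function produced them, and that only the `*` and `#` markers matter; `split_prefix` and `assemble_lists` are hypothetical names. Each line's marker run is compared with the lists currently open: lists that the new line does not continue are closed, and any deeper ones are opened.

```python
LIST_TAGS = {"*": "ul", "#": "ol"}


def split_prefix(line):
    """Split a line into its leading run of list markers and the rest."""
    i = 0
    while i < len(line) and line[i] in LIST_TAGS:
        i += 1
    return line[:i], line[i:].strip()


def assemble_lists(lines):
    """Turn adjacent list-item lines into nested <ul>/<ol> markup."""
    out = []
    stack = []  # markers of the currently open lists, outermost first
    for line in lines:
        prefix, text = split_prefix(line)
        # How many of the open lists does this line continue?
        common = 0
        while (common < len(stack) and common < len(prefix)
               and stack[common] == prefix[common]):
            common += 1
        # Close the lists (and their open items) that are not continued.
        while len(stack) > common:
            out.append(f"</li></{LIST_TAGS[stack.pop()]}>")
        if not prefix:
            out.append(text)  # not a list line: emit as-is
            continue
        if len(prefix) == common:
            out.append("</li><li>")  # next item in the same list
        else:
            # Open deeper lists inside the item that is still open.
            for marker in prefix[common:]:
                out.append(f"<{LIST_TAGS[marker]}><li>")
                stack.append(marker)
        out.append(text)
    # Close whatever is still open at the end of the sequence.
    while stack:
        out.append(f"</li></{LIST_TAGS[stack.pop()]}>")
    return "".join(out)


print(assemble_lists(["* a", "** b", "* c"]))
```

Definition-list markers (`;` and `:`) and items containing block content are left out to keep the sketch short.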

We may however be able to limit some of these sorts of structures for editing purposes...?