Markup spec/BNF/Special block

From mediawiki.org

Special blocks are constructs such as itemized lists, which start with *. They can only begin at the start of a line and usually run to the end of that line.

<special-block>           ::= <horizontal-rule> | <heading> | <list-item> | <table> | <space-block> | ...

The remaining alternatives (the trailing dots) still need to be filled in.

Horizontal rule

A horizontal rule is specified by 4 or more dashes. It is translated to an <hr> element.

<horizontal-rule>         ::= "----" [<dashes>] [<inline-text>] <newline>
<dashes>                  ::= "-" [<dashes>]

If the inline-text is present, it is not wrapped in a <p> element.
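
The rule above can be sketched in Python as follows. This is only an illustration, not the parser's actual code: the name `render_horizontal_rule` is hypothetical, and HTML-escaping stands in for real inline-text parsing.

```python
import re
from html import escape

# Mirrors: <horizontal-rule> ::= "----" [<dashes>] [<inline-text>] <newline>
# Four or more dashes at the start of the line, then optional trailing text.
_HORIZONTAL_RULE = re.compile(r"^-{4,}(.*)$")


def render_horizontal_rule(line):
    """Return HTML for a horizontal-rule line, or None if the line is not one.

    Hypothetical helper: escaping is a placeholder for inline-text parsing.
    """
    match = _HORIZONTAL_RULE.match(line.rstrip("\n"))
    if match is None:
        return None
    # Any trailing inline text follows the <hr> directly, with no <p> wrapper.
    return "<hr>" + escape(match.group(1))
```

Because the dash run is matched greedily, every leading dash belongs to the rule and only the remainder is treated as inline text.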

Heading

A level-n heading is translated to an <hn> element.

<heading>                 ::= <level-6-heading> | <level-5-heading> | <level-4-heading> 
                                 | <level-3-heading> | <level-2-heading> | <level-1-heading>
<level-6-heading>         ::= "======" <inline-text> "======" <space-tabs> <newline>
<level-5-heading>         ::= "====="  <inline-text> "====="  <space-tabs> <newline>
<level-4-heading>         ::= "===="   <inline-text> "===="   <space-tabs> <newline>
<level-3-heading>         ::= "==="    <inline-text> "==="    <space-tabs> <newline>
<level-2-heading>         ::= "=="     <inline-text> "=="     <space-tabs> <newline>
<level-1-heading>         ::= "="      <inline-text> "="      <space-tabs> <newline>

The alternatives in the first rule must be tried from left to right, so that the longest heading marker is matched first.

Some notes (as implied by the grammar):

  • An unterminated heading tag is treated as normal text.
  • Unbalanced tags are treated as the shorter of the two tags (e.g. ==== heading == renders as a level-2 heading with the text == heading).
  • More than 6 = signs are treated as 6, with the extra symbols being included in the heading text.
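
The ordered choice in the grammar, and the three notes that follow from it, can be reproduced with a short Python sketch. The function name `parse_heading` is hypothetical, and stripping whitespace around the heading text is an assumption made here for readability; the grammar itself does not say so.

```python
def parse_heading(line):
    """Return (level, text) for a heading line, or None if it is not a heading.

    Hypothetical helper that tries level 6 down to level 1, mirroring the
    left-to-right order of the alternatives in <heading>.
    """
    # <space-tabs> <newline>: trailing spaces and tabs are not part of the heading.
    text = line.rstrip("\n").rstrip(" \t")
    for level in range(6, 0, -1):
        marker = "=" * level
        # Both markers must be present and leave room for non-empty inline text.
        if text.startswith(marker) and text.endswith(marker) and len(text) > 2 * level:
            # Assumption: surrounding whitespace is dropped from the heading text.
            return level, text[level:-level].strip()
    return None
```

With this sketch, `parse_heading("==== heading ==")` gives `(2, "== heading")`, a line with seven = signs on each side yields level 6 with one extra = on each side of the text, and `parse_heading("== unterminated")` gives `None`, so the line falls through to normal text.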

List item

<list-item>               ::= <indent-item> |  <enumerated-item> | <bullet-item> 
<indent-item>             ::= ":" [(<list-item> | <item-body>)]
<enumerated-item>         ::= "#" [(<list-item> | <item-body>)]
<bullet-item>             ::= "*" [(<list-item> | <item-body>)]
<item-body>               ::= <defined-term> | [<whitespace>] <inline-text>

<defined-term>            ::= ";" <text> [ (<definition>)]
<definition>              ::= ":" <inline-text>

Semantics:

  • An <indent-item> or a <definition> is translated to a <dd> element wrapped in a <dl>.
  • A <bullet-item> is translated to a <li> element wrapped in a <ul>.
  • An <enumerated-item> is translated to a <li> element wrapped in a <ol>.
  • A <defined-term> is translated to a <dt> element wrapped in a <dl>.
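
As a rough illustration of these semantics, the sketch below renders a single list line as standalone nested HTML, following the grammar as written above. The names `render_list_item` and `_render` are hypothetical, escaping stands in for inline-text parsing, and splitting a defined term at its first colon is a simplifying assumption.

```python
from html import escape

# Prefix character -> (wrapper element, item element), per the semantics above.
_WRAPPERS = {":": ("dl", "dd"), "#": ("ol", "li"), "*": ("ul", "li")}


def _render(text):
    # <list-item>: one prefix character, then a nested list item or an item body.
    if text and text[0] in _WRAPPERS:
        outer, inner = _WRAPPERS[text[0]]
        return f"<{outer}><{inner}>{_render(text[1:])}</{inner}></{outer}>"
    # <defined-term> ::= ";" <text> [<definition>]
    if text.startswith(";"):
        # Assumption: the term ends at the first colon, if there is one.
        term, colon, definition = text[1:].partition(":")
        html = f"<dl><dt>{escape(term.strip())}</dt>"
        if colon:
            html += f"<dd>{escape(definition.strip())}</dd>"
        return html + "</dl>"
    # Plain item body; escaping is a placeholder for inline-text parsing.
    return escape(text.strip())


def render_list_item(line):
    """Return nested HTML for one list line, or None if it is not a <list-item>."""
    text = line.rstrip("\n")
    if not text or text[0] not in _WRAPPERS:
        return None
    return _render(text)
```

Each line is rendered in isolation, so two successive bullet lines produce two separate <ul> elements; grouping them is a separate step.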

Notes:

  • The grouping of successive list items cannot be captured in EBNF. The simplest approach would appear to be a second pass whereby successive pairings of close/open list are eliminated. For example, <ol><li>Foo</li></ol><ol><li>Boo</li></ol> would be rewritten as <ol><li>Foo</li><li>Boo</li></ol>.
  • <list-item> and <defined-term> are matched in preference to <inline-text>. The user has to insert whitespace in order to get inline-text starting with #, ;, * or :.
  • The current parser accepts a wider range of syntax than the above, allowing other list items to appear after a definition list (;). This appears to be arbitrary, unpredictable and not particularly useful. See bug 11894.
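
The second pass suggested in the first note could look roughly like the following Python sketch. It is a naive string-level rewrite under the assumption that the first pass emits each list item wrapped in its own list element; the name `merge_adjacent_lists` is hypothetical.

```python
import re

# A closing list tag immediately followed by an opening tag of the same kind.
_ADJACENT_LISTS = re.compile(r"</(ol|ul|dl)>\s*<\1>")


def merge_adjacent_lists(html):
    """Remove close/open pairs of the same list element until none remain.

    Hypothetical second pass: it works on the HTML string directly and does
    not build a document tree.
    """
    previous = None
    while previous != html:
        previous = html
        # Removing one pair can bring another pair together, so repeat until stable.
        html = _ADJACENT_LISTS.sub("", html)
    return html
```

Applied to the example above, `merge_adjacent_lists("<ol><li>Foo</li></ol><ol><li>Boo</li></ol>")` returns `<ol><li>Foo</li><li>Boo</li></ol>`. A close/open pair of different kinds, such as </ol><ul>, is left alone.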

Table

From Meta, with minor reformatting.

 <Table>                   ::=  "{|" [ " " TableParameters ] NewLine TableFirstRow "|}" ;
 <TableFirstRow>           ::= TableColumnLine NewLine | TableColumnMultiLine | TableRow ;
 <TableRow>                ::= "|-" [ CSS ] NewLine TableColumn [ TableRow ] ;
 <TableColumn>             ::= TableColumnLine | TableColumnMultiLine ;
 <TableColumnLine>         ::= "|" InlineText [ "|" TableColumnLine ] ;
 <TableColumnMultiLine>    ::= "|" [ TableCellParameters "|" ] AnyText NewLine [ TableColumnMultiLine ] ;
 <TableParameters>         ::= CSS | ? HTML table attributes ? ;
 <TableCellParameters>     ::= CSS | ? HTML cell attributes ? ;

Space block

Starting a line with a space creates a pre-formatted block of text similar to using <pre>. The big difference is that the contained text is still parsed and rendered normally.

<space-block>             ::= " " <inline-text> <newline> [ {<space-block-2>} ]
<space-block-2>           ::= " " [<inline-text>] <newline>

Rendering:

The block is surrounded with <pre>. White space and newlines are preserved literally.

Note that the first line of a space block must have text in it. Subsequent lines can be composed of just spaces.
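
The grouping rule can be illustrated with a small Python sketch. It assumes the input is a list of lines without trailing newlines, that the single leading space is dropped from each line of the block, and that HTML-escaping stands in for inline-text parsing; the name `render_space_blocks` is hypothetical.

```python
from html import escape


def render_space_blocks(lines):
    """Group consecutive space-prefixed lines into <pre> blocks.

    Hypothetical helper: lines outside a space block are passed through unchanged.
    """
    out = []
    block = None  # Lines of the space block being collected, or None.

    def flush():
        # Newlines and remaining white space inside the block are kept literally.
        out.append("<pre>" + escape("\n".join(block)) + "</pre>")

    for line in lines:
        if block is None:
            # <space-block>: the first line needs text after the leading space.
            if line.startswith(" ") and line.strip():
                block = [line[1:]]  # Assumption: drop the one leading space.
            else:
                out.append(line)
        elif line.startswith(" "):
            # <space-block-2>: later lines may consist of spaces only.
            block.append(line[1:])
        else:
            flush()
            block = None
            out.append(line)
    if block is not None:
        flush()
    return out
```

For example, `render_space_blocks([" foo", " ", "  bar", "baz"])` yields one <pre> block containing three lines, followed by the untouched line `baz`, while a leading line made only of spaces does not open a block.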