Markup spec/BNF/Article

Wiki-page
The top-level element is wiki-page which describes the contents of a page. A page can either be a redirect or a normal article.

<wiki-page> ::= [ ] | [ ]
            ::=    ( | | EOL)
            ::= FROM_LANGUAGE_FILE

The link-related nonterminals used in the redirect rule are defined in ../Links/.

Notes:

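The redirect tag (the rule whose right-hand side is FROM_LANGUAGE_FILE) is language-specific and may have more than one possible value. By default the value on the right-hand side of the rule (replacing FROM_LANGUAGE_FILE) is #REDIRECT, the tag used in the example below, but some languages, such as Estonian, define their own value. The match is case-insensitive, though this too may be overridden in the language file.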

The text between the redirect tag and the link should be matched non-greedily: it is the longest run of characters that does not contain the opening of the link.

For example, the redirect rule will match the following and treat it as a redirect to foo:
 * #REDireCTnon%^sense[[foo|and this is parsed as article content


 * Interwiki prefixes may not be supported in redirect links. (Is this configurable?)
 * The article content following the redirect link is not rendered. However, it is parsed, so interwiki links, category links and even normal links are still processed and behave "normally".
 * Anchors (Article#Section) are supported, but not yet described in the grammar.
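
The redirect notes above can be sketched as a small matcher. This is only an illustration under stated assumptions: the tag list `REDIRECT_TAGS` stands in for whatever the language file supplies, the link delimiters are taken to be `[[`, `]]` and `|`, and the function name `match_redirect` is hypothetical. It does not handle interwiki prefixes or treat anchors specially.

```python
import re

# Assumption: the default redirect tag is "#REDIRECT"; a real parser would
# read the tag (or tags) from the language file instead.
REDIRECT_TAGS = ["#REDIRECT"]


def match_redirect(page_text, tags=REDIRECT_TAGS):
    """Return the redirect target, or None if the page is not a redirect."""
    tag_alternatives = "|".join(re.escape(tag) for tag in tags)
    parts = [
        "(?:" + tag_alternatives + ")",  # redirect tag (matched case-insensitively)
        r"(?:(?!\[\[).)*?",              # filler text that never contains "[["
        r"\[\[",                         # opening of the link
        r"([^\]|\n]+)",                  # article title
        r"(?:\]\]|\||$)",                # "]]", "|" or end of line
    ]
    pattern = re.compile("".join(parts), re.IGNORECASE | re.MULTILINE)
    match = pattern.match(page_text)
    return match.group(1).strip() if match else None


target = match_redirect("#REDireCTnon%^sense[[foo|and this is parsed as article content")
```

Here `target` is the title captured between `[[` and `|`. Everything after the link is left for the article parser, which matches the note that the trailing content is parsed but not rendered.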

Article
This describes the contents of an article. An article consists of blocks, which come in two flavours: paragraphs and special blocks. Both of them end with a newline. Paragraphs are separated by empty lines.

<special-block-and-more> ::= <special-block> ( EOF | [ ] <special-block-and-more>
                                                   | ( | "") <paragraph-and-more> )
<paragraph-and-more>     ::= ( EOF | [ ] <special-block-and-more>
                                   | <paragraph-and-more> )

The nonterminals special-block-and-more and paragraph-and-more are not disjoint; the parser should first try to match against special-block-and-more.

The expression ( | "") is a greedy version of [ ]: both accept either a newline or the empty string. If both the empty string and a newline can be matched, the former expression matches the newline, while the latter matches the empty string, according to the conventions on ../.


 * For the definition of special block, see ../Special block.


 * Note
 * Any line that does not start with one of the following is not a special block:
 * This should assist in parsing.
 * Hey, that's almost what it says in the current parser. I must be onto something. Wonder why it doesn't cover space or = though.

Paragraph
Every paragraph ends with a newline character. A paragraph is translated into a <p> element.

<paragraph>     ::= [<lines-of-text>] | <lines-of-text>
<lines-of-text> ::= <line-of-text> [<lines-of-text>]
<line-of-text>  ::= <inline-text>


 * For the definition of inline text, see ../Inline text.

The recursion in the second rule should be non-greedy, i.e., it should match as few lines as possible. For instance,
 * abc
 * ----

should be parsed as one line-of-text and one horizontal-rule, but
 * abc
 * ---

should be parsed as two line-of-text nonterminals.

If a paragraph starts with a newline, the newline is rendered as a <br> element.
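
The non-greedy recursion can be illustrated with a small line splitter. This is a sketch only: it assumes that a horizontal rule is a line starting with four or more hyphens and treats that as the only kind of special block. The helper names `starts_special_block` and `split_paragraph` are hypothetical, and the leading-newline case is not handled.

```python
import re

# Assumption: a horizontal rule is a line that starts with four or more hyphens.
HORIZONTAL_RULE = re.compile(r"-{4,}")


def starts_special_block(line):
    """Return True if the line could start a special block (here: only a rule)."""
    return HORIZONTAL_RULE.match(line) is not None


def split_paragraph(lines):
    """Split lines into (paragraph_lines, remaining_lines).

    The paragraph takes as few lines as possible: it stops before the first
    line that could start a special block, or at an empty line.
    """
    paragraph = []
    for index, line in enumerate(lines):
        if line == "" or starts_special_block(line):
            return paragraph, lines[index:]
        paragraph.append(line)
    return paragraph, []


with_rule = split_paragraph(["abc", "----"])
without_rule = split_paragraph(["abc", "---"])
```

In the first call the paragraph ends after `abc`, because the next line can start a horizontal rule. In the second call three hyphens do not form a rule, so both lines stay in the paragraph.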

Block HTML
(not referred to yet)

BlockHTML = Pre | Blockquote | TableHTML | Div | HeaderHTML ;

String Types
''This text came from Meta-Wiki. It is not immediately compatible with the surrounding text (it is EBNF rather than BNF, for a start). However, it is much more precise about the nature of lines, and it captures rules about whitespace normalisation.''

Fundamental strings

WikiMarkupCharacters = "|" | "[" | "]" | "*" | "#" | ":" | ";" | "<" | ">" | "=" | "'" | "{" | "}" ;

UnicodeCharacter = ? all supported Unicode characters ? - WhiteSpaces ;
UnicodeWiki = UnicodeCharacter - WikiMarkupCharacters ;
PlainText = UnicodeWiki
          | "  " { "|" | "[" | "]" | "<" | ">" | "{" | "}" } "   "
          | UnicodeWiki { " " } ( "*" | "#" | ":" | ";" )
          | UnicodeWiki [ " " ] "=" [ " " ] UnicodeWiki
          | UnicodeWiki "'"
          | " '" UnicodeWiki ;
WhiteSpaces = " " | NewLine | ? carriage return ? | ? line feed ? | ? tab ? | ? variants of spaces ? ;
NewLine = ? carriage return and line feed ? ;

Article strings

Line = PlainText { PlainText } { " " { " " } PlainText { PlainText } } ;
Text = Line { Line } { NewLine { NewLine } Line { Line } } ;

Titles

PageName = TitleCharacter, { [ " " ] TitleCharacter } ;
PageNameLink = TitleCharacter, { [ " " | "_" ] TitleCharacter } ;
SectionTitle = ( SectionLinkCharacter - "=" ) { [ " " ] ( SectionLinkCharacter - "=" ) } ;
SectionLink = SectionLinkCharacter { [ "_" ] SectionLinkCharacter } ;
LinkTitle = { UnicodeCharacter { " " } } ( UnicodeCharacter - "]" ) ;

TitleCharacter = UnicodeCharacter - BadTitleCharacters ;
BadTitleCharacters = "[" | "]" | "{" | "}" | "<" | ">" | "_" | "|" | "#" ;
SectionLinkCharacter = UnicodeCharacter - BadSectionLinkCharacters ;
BadSectionLinkCharacters = "[" | "]" | "|" ;
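
The PageName and PageNameLink rules can be read as "title characters, with at most one separator between any two of them". The following sketch checks a string against that reading. The function names are hypothetical, and Python's `str.isspace()` is used as an approximation of the whitespace set excluded from UnicodeCharacter.

```python
BAD_TITLE_CHARACTERS = set("[]{}<>_|#")


def is_title_character(ch):
    """TitleCharacter = UnicodeCharacter - BadTitleCharacters.

    Assumption: str.isspace() approximates the excluded whitespace set.
    """
    return not ch.isspace() and ch not in BAD_TITLE_CHARACTERS


def matches_title(name, separators):
    """Check: TitleCharacter, { [ separator ] TitleCharacter }."""
    if not name or not is_title_character(name[0]):
        return False
    previous_was_separator = False
    for ch in name[1:]:
        if ch in separators:
            if previous_was_separator:
                return False  # two separators in a row are not allowed
            previous_was_separator = True
        elif is_title_character(ch):
            previous_was_separator = False
        else:
            return False
    return not previous_was_separator  # must end on a title character


def is_page_name(name):
    return matches_title(name, separators=" ")


def is_page_name_link(name):
    return matches_title(name, separators=" _")
```

Under this reading `Main Page` is a valid PageName, while `Main_Page` is only valid as a PageNameLink, because the underscore is a bad title character that the link form readmits as a separator.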