Extension Syntax

This page is used to discuss and vote on a syntax for extensions to the MediaWiki software. These include:
 * LaTeX (mathematics, formulas, chess..)
 * musical notes (Lilypond)
 * plots and timelines (gnuplot, ploticus, EasyTimeline)
 * SVG->PNG rendering
 * hieroglyphs (WikiHiero)
 * map rendering
 * 3d models rendering
 * anything else that's cool and useful

There will be a brainstorming phase of 7 days, after which voting will start on these proposals (on April 4). The voting phase will last seven days.

Voting has begun. Voting will last until April 12, 2004, 20:00 UTC. PLEASE DO NOT VOTE UNLESS YOU UNDERSTAND THE PROBLEM. THIS IS ABOUT FINDING A SYNTAX TO ENCAPSULATE DIFFERENT EXTENSIONS, NOT ABOUT THE SYNTAX OF THE EXTENSIONS THEMSELVES.

Only registered users can vote. You can vote for any number of options, but not for and against an option, and not for a single option multiple times.

= Issues =


 * Find an easy to use and intuitive syntax
 * Be consistent with already used syntax
 * Be able to use data placed between the tags or taken from another page
 * If possible, do not forbid special characters in the data (e.g. a syntax that ends with }} may forbid }} inside the data)

= Proposals =

Erik's proposal
We should use an XML-like syntax for extensions:
 * 1) &lt;math&gt;insert code here&lt;/math&gt;
 * 2) &lt;music&gt;insert code here&lt;/music&gt;
 * 3) &lt;hiero&gt;insert code here&lt;/hiero&gt;

Long code segments could be moved into the template namespace and transcluded in the standard manner, e.g. Template:Beethoven's 9th Symphony (sample) would contain the &lt;music&gt;...&lt;/music&gt; code, and could be transcluded using the template syntax. (This syntax obsoletes the current "msg:" syntax and will go live with the next software update.)
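To make the idea concrete, here is a minimal sketch of how such blocks might be picked out of page text. It assumes a plain regular-expression scan and a fixed list of tag names taken from the examples above; the names EXTENSION_TAGS and find_extension_blocks are invented for illustration and are not part of the MediaWiki code.

```python
import re

# Hypothetical set of registered extension tags (from the examples above).
EXTENSION_TAGS = ("math", "music", "hiero")

# <name> ... </name>, where the closing tag must repeat the opening name.
EXTENSION_RE = re.compile(
    r"<(?P<name>" + "|".join(EXTENSION_TAGS) + r")>(?P<body>.*?)</(?P=name)>",
    re.DOTALL,
)


def find_extension_blocks(wikitext):
    """Return (tag name, raw body) pairs in document order."""
    return [(m.group("name"), m.group("body"))
            for m in EXTENSION_RE.finditer(wikitext)]


sample = "Pythagoras: <math>x^2 + y^2 = z^2</math> and a tune <music>c d e</music>."
blocks = find_extension_blocks(sample)
```

Because the closing tag names the block it ends, a mismatched pair such as &lt;math&gt;...&lt;/music&gt; is simply not recognized as an extension block in this sketch.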

Arguments for:
 * Many people are already familiar with XML-style syntax
 * Indeed, many people are already familiar with using &lt;math&gt; &lt;/math&gt;, and many articles already include this syntax.
 * It is immediately obvious over long segments of code what class a particular segment belongs to (i.e. you can look at the bottom of a music segment and know that it is music code, because there's a &lt;/music&gt; closing tag)
 * It is easy to remember (at least for HTML-savvy people)
 * It is sufficiently unique to avoid parsing problems (as opposed to a short sequence of control characters, which might conflict with any present or future extensions)
 * It is easy to standardize the parser for it
 * It is at least somewhat more wiki-like than "&lt;rend class="music"&gt;"
 * It is consistent with existing uses of different kinds of brackets
 * [ is associated with links ( [ext. link], [[int. link]] ); { is associated with "transclusion" (e.g. the "msg:" syntax and the upcoming templates); &lt; is already associated with HTML and HTML-style formatting tags. The extension syntax marks a special formatting rule for the enclosed text, and therefore fits with the other uses of &lt; &gt;.

Arguments against:
 * In short segments of code the redundant closing tag can be annoying
 * Needs to be localized
 * Easy using Tim's MagicWord class, which also allows for synonyms, e.g. the English version could be valid in all translations to allow for easy copy & pasting. Possible disadvantage of confusing users ("what's the difference between "math" and "qxyz"?"), but most users who would deal with these issues would probably be familiar with both languages.
 * Gives the false impression of being real HTML or XHTML
 * arguably, it is real XML
 * I rather dislike the association that it's XML. We should think of it as an arbitrary syntax convention styled after XML, because otherwise we'll start thinking of it as a hierarchical element structure, which is not the case with wiki-markup in general. (In the same way, Lilypond's authors stress that it's not TeX, even though it looks similar) - IMSoP 16:38, 31 Mar 2004 (UTC)
 * In turn gives (to the slightly less technically- and logically-minded people) the false impression that Wikipedia allows all of HTML
 * Less technically and logically minded people probably don't know what HTML is anyway, and those who do will easily figure out what works and what doesn't
 * Gives the false impression of nestability ( $$ x^2 =  $$ )
 * The nesting argument can be applied to any syntax that has different opening and closing tags. Nestings will be possible where they make sense, just like you used the &lt;nowiki&gt; tag to create the example above, a case of nesting. Nestings are not possible / have no effect in HTML where they do not make sense.
 * Rather hard to type
 * Easy to go wrong and mess up a whole page ( x^2 $$ )
 * We could auto-fix that
 * that could begin to make the parser's behaviour more confusing and less predictable
 * Hardly, as it would only happen during errors. There is no slippery slope here.
 * Generally just as easy to spot and fix
 * The same is true of or  or any of the other HTML/XML-like tags we already use
 * then this is equally an argument against and
 * Goes counter to the trend of instituting a wiki-like syntax for everything (like tables); if tags-that-look-like-HTML were what we are looking for, we wouldn't have needed the wiki table syntax.
 * The reason we created a table syntax is that tables consist of many, many different, nested tags.
 * articles with a lot of mathematics and chemistry would then also contain many, many tags.
 * Identical and in clear distance from another, not nested and different.
 * This syntax will only create the same number of tags as any other syntax on this page. The only difference will be the form (including size) of those tags.
 * There is no real trend to replace "everything" with our own syntax (e.g. HTML table parameters, CSS and DIV tags).
 * Difficult to read, because the tags distract and don't give an intuitive visual appearance of encapsulation (parenthesisation, or whatever you call it)
 * Among programmers/hackers/geeks, the tags are the most widely recognized form of labeled encapsulation. Laymen, however, may find it hard to grasp.
 * Unlabeled encapsulation is confusing over large segments, and more difficult to learn because the brain recognizes and remembers new words easier than new symbols
 * <> and  are no less symbols than and  are (i.e. this is not the only form of labelled encapsulation suggested here)
 * Nobody claimed it was. It is, however, the most widely recognized one.
 * New extensions could potentially clash with other kinds of markup which use the same syntax (though unlikely)

Votes for Erik's proposal

 * 1) Eloquence 00:18, 6 Apr 2004 (UTC)

Magnus' proposal
For images:

can produce a PNG or an SVG, depending on user settings or browser identification. That can utilize goodies like thumbnail generation etc. An image is an image is an image, after all.

For more complex structures (hiero, music):

or. The first variant will use "stuff" directly as data, while the second one will use the data stored in stuff. That way, a complicated timeline can get its own "article" (I suggest a "data:" namespace), while a few hieroglyphs can be entered directly.

Alternative for page reference (result of discussion):



Pros :
 * Simple syntax
 * Consistent with existing syntax
 * – parser is already in place
 * Allows for easy handling of large amounts of raw data without cluttering the article source

Cons :
 * Needs to be localized (easy using Tim's MagicWord class, which also allows for synonyms, e.g. the English version could be valid in all translations to allow for easy copy & pasting)
 * Re: images, does not allow for SVG editing as specified
 * Re: extensions, makes large blocks of text difficult to read because it's not clear what the closing tag refers to (as opposed to, e.g. &lt;/math&gt;, where it is immediately clear that it ends a math block)
 * And if we need }} in markup? Regular expression to match is not a "parser". Taw 18:50, 28 Mar 2004 (UTC)
 * Why not? I fail to see the problem here. If you want to display two curly braces in the text, put them in &lt;nowiki&gt; tags. -- Stw 10:45, 29 Mar 2004 (UTC)
 * 2 curly braces in embedded markup, not in text: ( \frac{10^{15}}{\pi} ). That's very common combination in TeX. One of the reasons why I chose &lt;math&gt; was because there's no chance in hell that &lt;/math&gt; would appear in math markup. Taw 01:56, 30 Mar 2004 (UTC)
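The clash Taw describes can be reproduced in a few lines. This is only a sketch: it assumes a hypothetical curly-brace form {{math:...}} matched by a simple non-greedy regular expression, and compares it with the &lt;math&gt;...&lt;/math&gt; form. Neither pattern is taken from the actual software.

```python
import re

# Hypothetical curly-brace form of an inline extension: {{math:...}}
CURLY_RE = re.compile(r"\{\{math:(.*?)\}\}", re.DOTALL)
# XML-like form: <math>...</math>
TAG_RE = re.compile(r"<math>(.*?)</math>", re.DOTALL)

tex = r"\frac{10^{15}}{\pi}"

# The first "}}" inside the TeX ends the curly-brace match too early.
curly_body = CURLY_RE.search("{{math:" + tex + "}}").group(1)
# "</math>" never occurs inside the TeX, so the whole formula is captured.
tag_body = TAG_RE.search("<math>" + tex + "</math>").group(1)
```

Here curly_body stops right after "10^{15", while tag_body holds the complete formula. A real parser could count braces instead of using a regular expression, which is the distinction Taw draws between a regular expression and a "parser".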

Comments
I'd like to see the extension part go away in image links. The link should be so that whatever format happens to be best can be used. Currently one can't change the format of an image without uploading to a new name (with the new extension) and then changing all the links. Replacement of an image could be done with a special 'replace image' link on each image page. Audin 04:07, 30 Mar 2004 (UTC)

Votes against Magnus' proposal
Eloquence 00:20, 6 Apr 2004 (UTC) (too confusing with existing use of curly brackets, inline code will get hard to read)

Forum like (Aoineko)

 * [math]...[/math]
 * [hiero]...[/hiero]
 * [music]...[/music]

Forum like (with no /) (Aoineko)

 * [math]...[math]
 * [hiero]...[hiero]
 * [music]...[music]

Forum like (with same end marker) (Aoineko)

 * [math]...[end]
 * [hiero]...[end]
 * [music]...[end]

HTML like alternative 1 (Aoineko)
no / end marker
 * &lt;math&gt;...&lt;math&gt;
 * &lt;music&gt;...&lt;music&gt;
 * &lt;hiero&gt;...&lt;hiero&gt;

HTML like alternative 2 (Aoineko)
always the same end marker
 * &lt;math&gt;...&lt;end&gt;
 * &lt;music&gt;...&lt;end&gt;
 * &lt;hiero&gt;...&lt;end&gt;

Votes against

 * 1) Eloquence 00:20, 6 Apr 2004 (UTC)

Magnus' proposal alternative 1 (Aoineko)
directly as data
 * math:...
 * hiero:...
 * music:...

from data page
 * <tt>

</tt>
 * <tt></tt>
 * <tt></tt>

Magnus' proposal alternative 2 (Aoineko)

 * <tt>

</tt>
 * <tt></tt>
 * <tt></tt>

Here the software checks whether foo is a valid page (data:foo). If it is, it parses the data page; if not, it parses the text in the tags.
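The lookup rule could be sketched roughly as follows. The page store is a plain dictionary standing in for the wiki database, and the names PAGES and resolve_extension_data are made up for this illustration.

```python
# Hypothetical stand-in for the wiki's page store: title -> page text.
PAGES = {"data:Beethoven": "c d e f g"}


def resolve_extension_data(content, pages=PAGES, namespace="data:"):
    """Return (source, data): the data page if `content` names one,
    otherwise the inline text itself."""
    title = namespace + content.strip()
    if title in pages:
        return ("page", pages[title])
    return ("inline", content)


from_page = resolve_extension_data("Beethoven")
inline = resolve_extension_data("a b c")
```

One consequence of this design is that inline text changes meaning if somebody later creates a data page whose title happens to match it.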

Magnus' proposal alternative 3 (Aoineko)
directly as data
 * <tt>

</tt>
 * <tt></tt>
 * <tt></tt>

from data page (like #redirect )
 * <tt></tt>
 * <tt></tt>
 * <tt></tt>

Magnus' proposal alternative 4 (Aoineko)
To be able to use }} inside the code.
 * <tt>

or  </tt>
 * <tt>  or    </tt>
 * <tt>  or   </tt>

Uli's Proposal
Abstract: This is essentially a variant of Erik's proposal with more intelligent templates.

I had suggested something like this some time ago in the discussion on navigation bars. As I understand it, we have some issues that should be covered:


 * Navigational data (like theme-rings, article grouping ("history of germany, part 1..)) which should not be rendered at the position they are placed in the text, but instead at a - probably skin specific - position, and possibly only in certain situations (for example, theme rings should not be rendered in a print view).
 * Defining short non-textual data within articles (Hieroglyphs), to be rendered at the position where they are placed
 * Including long non-textual data, to be rendered at the position where they are placed
 * Possibly also including long non-textual data, to be rendered at a specific position (I'm thinking of those large information tables in the upper right corner of states, cities, elements and so on)
 * Probably we want to have some sort of parameters to pass to a transcluded item

So, suppose we get a namespace "Include:". We would separate that namespace again by convention into type-specific segments, so article fragments containing music data would be named "Include:Music:Beethovens 9th Symphony", tabular data to be included into an article would be named "Include:Table:Dollar rate since 1991", the big tables (upper right corner) possibly "Include:Infotable:Uranium", a navigation bar "Include:Navlist:10 largest cities of Island" and so on. It's important to have the type of the included data somehow coded into the article name, so you can render that fragment stand-alone!

Those fragments would be included with the already discussed syntax. For data that is not included, I'd prefer the XML-type syntax ( &lt;math&gt;&lt;/math&gt; ); transcluding is something different from switching the syntax within the article, so we should separate those two use cases.

Very important: depending on the type (Music, Infotable, Navigation, ...) of a transcluded fragment, the software should decide not only how to interpret the given data, but also when and where to render it.
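As a rough illustration of deciding "how, when and where" from the page name alone, the sketch below splits an "Include:" title into its type and name and looks up a rendering rule. The rule table, the position labels and the function names are all assumptions made for this example.

```python
# Hypothetical table: fragment type -> (renderer name, where it is rendered).
RENDER_RULES = {
    "Music": ("music", "inline"),
    "Table": ("table", "inline"),
    "Infotable": ("table", "top-right"),
    "Navlist": ("navigation", "skin-specific"),
}


def parse_include_title(title):
    """Split 'Include:<Type>:<Name>' into (type, name)."""
    prefix, fragment_type, name = title.split(":", 2)
    if prefix != "Include":
        raise ValueError("not an Include: page: " + title)
    return fragment_type, name


def plan_rendering(title, print_view=False):
    """Return a rendering plan, or None if the fragment is not rendered."""
    fragment_type, name = parse_include_title(title)
    renderer, position = RENDER_RULES[fragment_type]
    # Navigational fragments are skipped in a print view.
    if print_view and fragment_type == "Navlist":
        return None
    return {"name": name, "renderer": renderer, "position": position}


music_plan = plan_rendering("Include:Music:Beethovens 9th Symphony")
nav_in_print = plan_rendering("Include:Navlist:10 largest cities of Island", print_view=True)
```

Since the type is part of the title, the same function can also render a fragment on its own page, without knowing which article includes it.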

Peter's proposal


pros:

cons:
 * complex syntax, which is more suitable for programmers than users

Inline brackets proposal

 * [!math x^2 + y^2 = z^2  !]
 * [!hiero b-l:a-h  !]
 * [!music do re mi fa sol !]

Symbol bracket proposal
pros:
 * no key word to translate
 * quick to type
 * not significantly more difficult, for some perhaps even easier, to remember than " ... "
 * easy to read in source text, provides visual encapsulation/parenthesisation
 * easy to get right (it is intuitive to think you have to close the brackets you open)

cons:
 * Looks like a link, but isn't
 * Not an argument &mdash; nor is [[Image:]] . &mdash; Timwi 13:44, 30 Mar 2004 (UTC)
 * Well, in a sense it is: it links to an image somewhere else. Formulae don't involve anything stored anywhere else. IMSoP 00:50, 31 Mar 2004 (UTC)
 * ... yes they do ;-) But I see your point. &mdash; Timwi 10:04, 31 Mar 2004 (UTC)
 * not obvious which markup refers to which feature
 * hard to memorize or even only to recognize - with more and more randomly assigned special characters we might end up with a syntax that is only comprehensible for Perl programmers Erik Zachte 00:52, 31 Mar 2004 (UTC)


 * difficult to add additional types of markup
 * not really. Admittedly, once we have hundreds of extensions, it might start to be difficult to remember which one is @# and which one is #$, but (1) I don't think we'll ever have that many extensions (HTML and Unicode do evolve too); (2) I don't think any single user uses more than a couple of them. &mdash; Timwi 10:04, 31 Mar 2004 (UTC)

IMSoP's proposal
<tt> ... </tt> <tt> ... </tt>

Advantages:
 * Retains most of the advantages of
 * Easier to add new extensions - i.e. <tt> </tt> cannot possibly clash with something other than an extension, unlike <tt> </tt>
 * Even more obvious to a reader that this is a block of special <tt>foo</tt> markup and not just gibberish

Disadvantages:
 * Harder/slower to type
 * Even more localisation needed, unless an international word for "special" can be found

Phil's proposal
Building upon IMSoP's proposal:

<tt> ... ...  ... </tt>

Advantages as IMSoP's proposal plus:
 * wiki needs no localisation
 * &lt;wiki&gt; is intuitively the inverse of <tt>&lt;nowiki&gt;</tt> and thuswise easier to remember

Saff's proposal
''Note: This is not a generic proposal as it does not allow for arbitrary extensions. However, it does allow for a certain type of user-created extensions. If you think that is enough, or should be used in addition to the other proposals, vote for this one.''

I like the go syntax used at [senseis.xmp.net Sensei's Library], a go wiki. It looks something like:

$$B Main line
$$ --
$$ . . . . . . . 2 1 . . |
$$ . . . . . . . O X O O |
$$ . . X . X . O . X O 5 |
$$ . . . . . O X . X O X |
$$ . . . . . . O O X X 6 |
$$ . . . . . . . . O 3 4 |
$$ . . . . . . . O . O X |
$$ . . . . . . . . . O . |
$$ . . . . . . . . . . . |
$$ . . . . . . . . . . . |

So, what I think would be perfect for many things (games, music, probably not svg) is to create a new namespace, called markup or something. On a "markup" page, certain text strings can be marked for replacement by any object (different text, images, etc.):

Maybe something like:

...
== Text to replace ==
Comments about this item.
 * 1) What to replace it with
...

(example "markup:go")

== X ==
An 'X' represents a black stone.
 * 1) [[Image:black_stone.png]]

== O ==
An 'O' represents a white stone.
 * 1) [[Image:white_stone.png]]

== -- ==
This represents the edge of a board.
 * 1) [[Image:board_edge.png]]

... and so on, which would then be used:

$$B Main line
$$ --
$$ . . . . . . . 2 1 . . |
$$ . . . . . . . O X O O |
$$ . . X . X . O . X O 5 |
$$ . . . . . O X . X O X |
$$ . . . . . . O O X X 6 |
$$ . . . . . . . . O 3 4 |
$$ . . . . . . . O . O X |
$$ . . . . . . . . . O . |
$$ . . . . . . . . . . . |
$$ . . . . . . . . . . . |

For inline markup, just {{markup:music} a b c# } should work. The lack of a closing bracket does not seem like a big deal; whatever you do, there must be some terminating sequence. Multiline markup should use a special starting character, like $, #, or :. It would then end at the last line that starts with the special character.

This could be very generic and wiki-like. Kevin Saff 15:29, 2 Apr 2004 (UTC)
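A replacement table of this kind might work roughly as in the sketch below. The dictionary stands in for a hypothetical "markup:go" page, and the token handling (split on whitespace, leave unknown tokens alone) is an assumption rather than part of the proposal.

```python
# Hypothetical contents of a "markup:go" page: token -> replacement.
GO_MARKUP = {
    "X": "[[Image:black_stone.png]]",
    "O": "[[Image:white_stone.png]]",
    "--": "[[Image:board_edge.png]]",
}


def expand_markup(block, table):
    """Replace each whitespace-separated token on every '$$' line.

    Tokens without an entry in the table are kept unchanged.
    """
    rows = []
    for line in block.splitlines():
        if not line.startswith("$$"):
            continue  # only lines starting with the marker belong to the block
        tokens = line[2:].split()
        rows.append([table.get(tok, tok) for tok in tokens])
    return rows


diagram = "$$ --\n$$ . O X ."
rows = expand_markup(diagram, GO_MARKUP)
```

A title line such as "$$B Main line" would pass through this sketch as three ordinary unmapped tokens, so a real implementation would need a rule for it.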