Wikitext/2011-01 format discussion

From mediawiki.org

This is an attempt to structure the discussions spread across many threads on wikitech-l:

1. Goals: what are we trying to achieve?

  • Tool interoperability
    • Alternative parsers
    • GUIs
    • Real-time editing (à la Etherpad)
  • Ease of editing raw text
  • Ease of structuring the data
  • Template language with fewer squirrelly brackets
  • Performance
  • Security
  • What else?

2. Abstract format: regardless of syntax, what are we trying to express?

  • Currently, we don't have an abstract format; markup just maps to a subset of HTML (so perhaps the HTML DOM is our abstract format)
  • What subset of HTML do we use?
  • What subset of HTML do we need?
  • What parts of HTML do we *not* want to allow in any form?
  • What parts of HTML do we only want to allow in limited form (e.g. only safely generated from some abstract format)?
  • Is the HTML DOM sufficiently abstract, or do we want/need some intermediate conceptual format?
  • Is browser support for XML sufficiently useful to try to rely on that?
  • Will it be helpful to expose the abstract format in any way?

3. Syntax: what syntax should we store (and expose to users)?

  • Should we store some serialization of the abstract format instead of markup?
  • Is hand editing of markup a viable long-term strategy?
  • How important is having something expressible with BNF?
  • Is XML viable as an editing format? JSON? YAML?
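
To make the serialization question concrete, here is a minimal sketch of storing a JSON serialization of an abstract tree instead of markup. The node shape (`type`, `text`, `children`) and the `TAGS` table are assumptions made up for illustration, not an existing MediaWiki format; the sketch only shows that a small whitelisted tree can round-trip through JSON and be rendered to a safe subset of HTML.

```python
import json
from html import escape

# Hypothetical abstract tree for the wikitext "Hello '''world'''".
# The node shape ("type", "text", "children") is an assumption for illustration.
doc = {
    "type": "paragraph",
    "children": [
        {"type": "text", "text": "Hello "},
        {"type": "bold", "children": [{"type": "text", "text": "world"}]},
    ],
}

# Whitelist of node types and the HTML element each one may produce.
TAGS = {"paragraph": "p", "bold": "b"}


def render(node):
    """Render a node to HTML, escaping all text and rejecting unknown types."""
    if node["type"] == "text":
        return escape(node["text"])
    tag = TAGS[node["type"]]  # raises KeyError for anything outside the whitelist
    inner = "".join(render(child) for child in node.get("children", []))
    return f"<{tag}>{inner}</{tag}>"


stored = json.dumps(doc)           # what would be stored instead of markup
html = render(json.loads(stored))  # round-trip through the serialization
print(html)  # prints <p>Hello <b>world</b></p>
```

Because HTML is only ever generated from whitelisted node types, this also touches the section 2 question of which parts of HTML to allow only when safely generated. Whether such a tree is pleasant to edit by hand is a separate question.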

4. Tools (e.g. WYSIWYG)

  • Do our tool options get better if we fix up the abstract format and syntax?
  • Tools:
    • Wikia WYSIWYG editor
    • Magnus Manske's new thing
    • Line-by-line editing

...list goes on...

5. Infrastructure: how would one support mucking around with the data?

  • Support for per-wiki data formats?
  • Support for per-page data formats?
  • Support for per-revision data formats?
  • Evolve existing syntax with no infrastructure changes?
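
As one possible reading of the per-wiki, per-page, and per-revision options above, the following sketch picks a renderer by falling back from the revision to the page to the wiki default. The `format` field, the format names, and both renderer functions are hypothetical placeholders, not existing MediaWiki infrastructure.

```python
from html import escape


def render_plain(text):
    """Hypothetical renderer for a 'plain' format: escaped preformatted text."""
    return "<pre>" + escape(text) + "</pre>"


def render_wikitext_stub(text):
    """Placeholder standing in for the existing parser; NOT a real wikitext parser."""
    return "<p>" + escape(text) + "</p>"


# Registry of known formats (names are assumptions for illustration).
RENDERERS = {"wikitext": render_wikitext_stub, "plain": render_plain}


def pick_format(revision, page, wiki_default="wikitext"):
    """Most specific setting wins: revision, then page, then wiki default."""
    return revision.get("format") or page.get("format") or wiki_default


def render_revision(revision, page, wiki_default="wikitext"):
    fmt = pick_format(revision, page, wiki_default)
    if fmt not in RENDERERS:
        raise ValueError(f"unsupported format: {fmt}")
    return RENDERERS[fmt](revision["text"])


# An old revision with no recorded format falls back to the wiki default,
# while a newer revision of the same page can declare its own format.
page = {"title": "Sandbox"}
old_rev = {"text": "a < b"}
new_rev = {"text": "a < b", "format": "plain"}
print(render_revision(old_rev, page))
print(render_revision(new_rev, page))
```

Under this reading, existing revisions need no migration, since a revision without a recorded format keeps rendering with the default.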