Wiki Interchange Format

Magnus Manske (MediaWiki), Eugene Eric Kim (PurpleWiki), Sven Dowideit (TWiki), and Christoph Lange (along with a few others who dropped in and out) gathered at WikiMania on August 4, 2005 to discuss moving forward on a Wiki Interchange Format. Although none of us were intimately familiar with previous work in this area (WikiInterchangeFormat, WikiInterchangeFormat, CommunityWiki:WikiInterchangeFormat, and InterWikiMarkupLanguage), there is, not surprisingly, quite a bit of overlap. Nevertheless, I think the way we discussed the issue and our plan moving forward are useful. The following are notes from the meeting; further discussion will most likely take place at WikiInterchangeFormat.

Assumptions
A universal markup language is a bad idea. First, it's not realistic. People won't agree on a standard, because there aren't good reasons for it. Many argue that a single markup standard would make it easy for users to use multiple Wikis. While true, this is not the correct solution. Alternative ways of entering data into Wikis are a good thing. One alternative entry method is a WYSIWYG text widget. That, more than universal markup, will make things easier on users.

100 percent semantic interoperability is impossible. This complicates interchange, but all in all, it's also a good thing. Different domains will have different needs; there will never be such a thing as a one-size-fits-all Wiki. Moreover, innovation requires breaking from the mold. If we assume innovation is good, then we must also assume that full semantic interoperability will never be possible.

Use Cases

 * Migrating Wikis (probably relatively minor)
 * Universal Wiki export format
 * Client-Side interaction
 * Online/offline Wiki synchronization

Note that the first three are also listed at WikiInterchangeLanguage.

Design
Simple, but complete.

Create an XML vocabulary for the most common subset of WikiText markup formats. XHTML is as good a candidate as any.

Create a tag (with namespaces for different Wikis) for custom markup. Its content will most likely be CDATA.

For custom markup, encourage the export of both the source markup from that Wiki and some visually representative rendered output that can be used to show the result if the markup is not supported by the importing system. (This is more relevant when using the format for editing than for a pure import.)
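
As a rough illustration of the design above, the following Python sketch exports one paragraph as XHTML with a namespaced custom element that carries both the original source markup and a rendered fallback. The element names (`custom`, `source`, `rendered`), the `type` attribute, and the MediaWiki namespace URI are assumptions made up for this example; the meeting did not define them. Python's ElementTree escapes the source text rather than writing a CDATA section, which an XML parser reads back identically.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Hypothetical namespace URI for MediaWiki-specific markup (not from any spec).
MEDIAWIKI_NS = "http://example.org/wif/mediawiki"

# Use readable prefixes instead of the auto-generated ns0, ns1, ...
ET.register_namespace("xhtml", XHTML_NS)
ET.register_namespace("mw", MEDIAWIKI_NS)


def build_page():
    """Build a tiny page: common markup as XHTML, custom markup in its own namespace."""
    body = ET.Element(f"{{{XHTML_NS}}}div")
    para = ET.SubElement(body, f"{{{XHTML_NS}}}p")
    para.text = "The area of a circle is "

    # Custom markup: keep the Wiki's own source so same-Wiki round trips are lossless.
    custom = ET.SubElement(para, f"{{{MEDIAWIKI_NS}}}custom", {"type": "math"})
    source = ET.SubElement(custom, f"{{{MEDIAWIKI_NS}}}source")
    source.text = r"<math>\pi r^2</math>"

    # Rendered fallback for importers that do not understand the custom markup.
    rendered = ET.SubElement(custom, f"{{{MEDIAWIKI_NS}}}rendered")
    span = ET.SubElement(rendered, f"{{{XHTML_NS}}}span")
    span.text = "πr²"

    custom.tail = "."
    return body


xml_text = ET.tostring(build_page(), encoding="unicode")
print(xml_text)
```

Keeping the source and the rendered form side by side lets the importing Wiki choose between them without having to parse another Wiki's syntax.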

Discussion
Interchange will be lossy only between different Wiki engines. It won't be lossy between instances of the same Wiki engine, which is useful for dumping/serializing the database and for offline-online Wiki synchronization.

Import scripts will determine what to do with markup a Wiki does not understand or handles differently. For example, if MediaWiki sees a tag representing a TWiki plugin, it will most likely decide to ignore it. If UseModWiki sees a tag containing illegal syntax for internal links, it can choose either to transform it or to delink it. It's all up to the clients.

A custom tag could conceivably contain both tag information and also rendered output. Magnus suggested that this might be useful for the MediaWiki math rendering.
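
The import-side behaviour discussed here might look something like the Python sketch below. It assumes a hypothetical layout in which each Wiki's custom markup sits in a `custom` element in that Wiki's namespace, with optional `source` and `rendered` children; the namespace URIs and element names are invented for illustration. An importer keeps the source of its own custom markup, falls back to the rendered output of foreign markup, and ignores foreign markup that has no rendering.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Hypothetical namespace URIs; no spec defines these.
MEDIAWIKI_NS = "http://example.org/wif/mediawiki"
TWIKI_NS = "http://example.org/wif/twiki"

SAMPLE = f"""
<div xmlns="{XHTML_NS}" xmlns:mw="{MEDIAWIKI_NS}" xmlns:tw="{TWIKI_NS}">
  <p>Area: <mw:custom><mw:source><![CDATA[<math>\\pi r^2</math>]]></mw:source><mw:rendered>πr²</mw:rendered></mw:custom></p>
  <p>Index: <tw:custom><tw:source><![CDATA[%TOC%]]></tw:source></tw:custom></p>
</div>
"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def import_text(elem, native_ns):
    """Flatten an element to text the way a Wiki with namespace `native_ns` might."""
    parts = [elem.text or ""]
    for child in elem:
        namespace, local = split_tag(child.tag)
        if local == "custom":
            if namespace == native_ns:
                # Our own markup: keep the original source (lossless).
                parts.append(child.findtext(f"{{{namespace}}}source", default=""))
            else:
                # Foreign markup: use the rendered fallback, or ignore it if absent.
                parts.append(child.findtext(f"{{{namespace}}}rendered", default=""))
        else:
            parts.append(import_text(child, native_ns))
        parts.append(child.tail or "")
    return "".join(parts)


def imported_lines(xml_text, native_ns):
    """Return the non-empty, stripped lines of the imported text."""
    text = import_text(ET.fromstring(xml_text), native_ns)
    return [line.strip() for line in text.splitlines() if line.strip()]


for name, ns in (("MediaWiki", MEDIAWIKI_NS), ("TWiki", TWIKI_NS)):
    print(name, imported_lines(SAMPLE, ns))
```

Ignoring unknown markup is only one policy; a client could equally transform it or preserve it verbatim, which is the point of leaving the decision to the importer.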

Next Steps
Make sure we look at the work that's been done before. We don't want to be Yet Another Proposal; we want to work with the community and build on top of existing work.

Develop a spec.

Implement. Magnus already has some code. Multiple serializations are already easy to do in PurpleWiki. Sven is not highly motivated due to TWiki's nature, but we've got three more days to beat him into submission.

Publicize and get other folks on board.

Other Links

 * The Semantic Web presentation at Wikimania by John Breslin, "An ontology for describing and exchanging articles", shows the use of RDF as the interchange format. This would be an existing and already useful format that we could use (oops, I said we).
 * Recent approaches go towards using HTML and RDFa; see Wiki Specification Request 3.