Sweble is an alternative parser for MediaWiki's Wikitext, written in Java.
We also wrote a paper on the design of the Sweble Wikitext Parser (archived 2015-02-24 at the Wayback Machine), which can be found in the download section of the Sweble wiki. More information on Sweble and its parser can be found on the Sweble Blog.
The Sweble parser produces an abstract syntax tree (AST) as output. The AST is, as the name suggests, close to the actual syntax of the Wikitext document. It is also specific to the design of the parser and the surrounding framework, and is therefore not practical as a data interchange or storage format. A brief description of the AST used by the Sweble parser can be found in the Sweble repository tree.
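To illustrate what "close to the actual syntax" means, here is a minimal sketch of a syntax-oriented tree in Java. The node types `Text`, `Bold` and `Paragraph` and the method `toWikitext` are hypothetical names invented for this example. They are not Sweble's actual AST classes, which are described in the repository.

```java
import java.util.List;

public class AstSketch {
    // Hypothetical node interface; NOT part of Sweble's real AST.
    interface Node {
        String toWikitext();
    }

    // Plain text leaf.
    static final class Text implements Node {
        final String content;
        Text(String content) { this.content = content; }
        public String toWikitext() { return content; }
    }

    // Bold markup: mirrors the ''' ... ''' syntax one-to-one.
    static final class Bold implements Node {
        final List<Node> children;
        Bold(List<Node> children) { this.children = children; }
        public String toWikitext() { return "'''" + join(children) + "'''"; }
    }

    // A paragraph is just a sequence of inline nodes.
    static final class Paragraph implements Node {
        final List<Node> children;
        Paragraph(List<Node> children) { this.children = children; }
        public String toWikitext() { return join(children); }
    }

    // Concatenate the Wikitext of all child nodes in order.
    static String join(List<Node> nodes) {
        StringBuilder sb = new StringBuilder();
        for (Node n : nodes) {
            sb.append(n.toWikitext());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node ast = new Paragraph(List.of(
                new Text("Sweble is "),
                new Bold(List.of(new Text("a parser"))),
                new Text(".")));
        System.out.println(ast.toWikitext()); // prints: Sweble is '''a parser'''.
    }
}
```

Because each node corresponds directly to a piece of markup, such a tree can be turned back into Wikitext easily, but its shape depends on parser-internal decisions. That is the reason a separate exchange format is useful.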
The eXtensible Wikitext Markup Language (XWML) is an XML language designed to work as a data exchange and storage format for articles written in MediaWiki's Wikitext. The language is defined in a technical report that can be found in the download section of the Sweble wiki.
The XML Schema file that defines XWML can be found in the Sweble repository tree.
WOM Java interfaces
The Wikitext Object Model (WOM) is defined in the same technical report in which XWML is defined. The WOM is similar to the Document Object Model (DOM) of HTML, which defines a way to access and modify an HTML document in a program. The WOM Java interfaces define a way to access and modify an XWML document inside a program. The Java interfaces can be found in the Sweble repository tree.
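For readers unfamiliar with the DOM style of access that the WOM is compared to, the following sketch uses only the standard Java DOM API (`javax.xml.parsers` and `org.w3c.dom`), not the WOM interfaces themselves. The XML snippet and its element names (`article`, `title`, `paragraph`) are placeholders chosen for this example and do not follow the XWML schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomStyleAccessSketch {
    // Placeholder document; element names are hypothetical, not XWML.
    static final String SAMPLE =
            "<article><title>Old title</title><paragraph>Hello</paragraph></article>";

    // Parse an XML string into an in-memory DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Read the first <title> element, replace its text, and return the old text.
    static String renameTitle(Document doc, String newTitle) {
        Element title = (Element) doc.getElementsByTagName("title").item(0);
        String old = title.getTextContent();
        title.setTextContent(newTitle);
        return old;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        String old = renameTitle(doc, "New title");
        String current = doc.getElementsByTagName("title").item(0).getTextContent();
        System.out.println(old + " -> " + current); // prints: Old title -> New title
    }
}
```

The WOM interfaces in the Sweble repository play the analogous role for XWML documents. Consult them for the actual type and method names.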