Parsoid/Wikimania 2014

Let's summarize our talk plans for Wikimania.

Talk abstract
Over the last year, most Wikipedias have seen the rollout of VisualEditor. In this presentation, we (the Parsoid developers) will talk about Parsoid, one of the key technologies that enable visual editing in the presence of wikitext. Parsoid converts wikitext to (editable) HTML and converts (edited) HTML back to wikitext. We will start by describing the problem of bidirectional conversion between wikitext and HTML and what makes it hard. We will also present our testing methodology and how we ensure that Parsoid's conversion is accurate.

VisualEditor is only one of Parsoid's clients. After introducing Parsoid, we'll discuss how the RDFa-annotated HTML representation of wikitext lets us support other clients and applications. For example, Flow uses Parsoid to support wikitext editing for users who prefer to participate in Flow discussions via wikitext. Kiwix uses Parsoid's HTML output to generate offline dumps of various Wikipedias.

We'll then present how Parsoid's API can enable other applications and how you might use it to build gadgets, tools, and so on.

Finally, we will close by talking about future plans for Parsoid: how we hope to move towards storing parsed HTML for pages, enabling efficient editing and access to content without having to deal with wikitext. Parsoid deals with wikitext so you don't have to.

Questions to answer

 * What is the problem we are trying to solve?
 * Why is this hard? (examples!)
 * How do we address those problems?
 * How does having HTML+RDFa enable new features?
   * Kiwix, PDF rendering, lintoid, Google, translations, Flow, ...
 * How can I use this in my gadget / bot / whatever?
   * How to use the API (hopefully with a save API by then)
 * What are the future plans regarding content, templating, etc.?
   * Rashomon, fast page loads for logged-in users, HTML templating, ?

Outline
Here's one possible outline (cscott):

 * Introduction / motivating example
 * The difficulties of wikitext
   * wikitext tarpits
   * parser codebase
   * practical issues: hard to write bots, etc.
 * Vision
   * A more standard representation
   * Editable with existing CE tools
 * The HTML+RDFa promised land
   * Some examples: it's just HTML!
     * can use jQuery to find all links, etc.
   * RDFa semantic data
 * Current applications
   * VisualEditor
   * Kiwix
   * PDF rendering
 * Future applications
   * lintoid
   * better templating
   * easier bots
   * unified storage?
 * Community
   * how can I use Parsoid? (Parsoid API service, etc.)
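
A possible slide example for the "it's just HTML!" point: because wiki links are ordinary anchor elements with an RDFa-style rel attribute, a tool can find them with any HTML parser and never touch wikitext. The sketch below uses Python's standard library rather than jQuery so that it runs without a browser. The sample markup is hand-written in the style of Parsoid output (rel="mw:WikiLink" for wiki links, rel="mw:ExtLink" for external links) and is not captured from a real page; the names WikiLinkCollector and find_wiki_links are made up for this illustration.

```python
from html.parser import HTMLParser


class WikiLinkCollector(HTMLParser):
    """Collect href values of <a> elements marked rel="mw:WikiLink"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before testing.
        rel_tokens = (attr_map.get("rel") or "").split()
        if "mw:WikiLink" in rel_tokens and attr_map.get("href"):
            self.links.append(attr_map["href"])


def find_wiki_links(html):
    """Return the hrefs of all wiki links in the given HTML, in document order."""
    collector = WikiLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hand-written sample in the style of Parsoid output (an assumption for
# illustration, not real service output).
SAMPLE_HTML = (
    '<p>See <a rel="mw:WikiLink" href="./Wikimania">Wikimania</a> and the '
    '<a rel="mw:ExtLink" href="https://www.mediawiki.org/">MediaWiki site</a>, '
    'or read about <a rel="mw:WikiLink" href="./Parsoid">Parsoid</a>.</p>'
)

if __name__ == "__main__":
    print(find_wiki_links(SAMPLE_HTML))
```

The same filter in a browser would be a one-line attribute selector, which is the point the slide is meant to make: a bot or gadget author only needs HTML tooling.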