Parsoid/API

Parsoid converts MediaWiki's wikitext to XHTML5 + RDFa and back.

In addition to the API defined below, the Parsoid service provides some form-based debugging tools. These are subject to change and may disappear at any time.

Common HTTP headers supported in all entry points

 * Accept-Encoding: Clients should accept gzip.
 * Cookie: Cookie header that is forwarded to the MediaWiki API, which makes it possible to use Parsoid with private wikis. Setting a cookie implicitly disables all caching for security reasons, so do not send a cookie for public wikis if you care about caching.
 * x-request-id: A request id that is forwarded to the MediaWiki API.
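
As an illustration of the headers listed above, the following is a minimal Python sketch that assembles them for a request. The helper name `build_common_headers` is hypothetical, and using a random UUID as the request id is an assumption; the text above does not prescribe a format for it.

```python
import uuid
from typing import Dict, Optional


def build_common_headers(cookie: Optional[str] = None,
                         request_id: Optional[str] = None) -> Dict[str, str]:
    """Assemble the common request headers (hypothetical helper)."""
    headers = {
        # Ask for a gzip-compressed response.
        "Accept-Encoding": "gzip",
        # Assumption: any unique string works as a request id.
        "x-request-id": request_id or str(uuid.uuid4()),
    }
    # Only send a cookie for private wikis: a cookie disables caching.
    if cookie:
        headers["Cookie"] = cookie
    return headers
```

Calling `build_common_headers()` without a cookie keeps the request cacheable, while `build_common_headers(cookie="...")` would be used for a private wiki.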

Common path parameters across all requests

 * domain: The hostname of the wiki.
 * title: Page title. It must be URL-encoded (percent-encoded).
 * revision: Revision id of the title.
 * format: Input / output format of the content, one of wikitext, html, or pagebundle.
   * wikitext: Plain text that is treated as wikitext. Content type is text/plain.
   * html: Parsoid's XHTML5 + RDFa output, which includes inlined data-parsoid attributes. The HTML conforms to the MediaWiki DOM spec. Content type is text/html.
   * pagebundle: A JSON blob containing the above html with the data-parsoid attributes split out and ids added to each node. Content type is application/json.
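
The percent-encoding required for the title parameter can be sketched with Python's standard library. The helper name `encode_title` is hypothetical. Escaping "/" as well (by passing `safe=""`) is an assumption made here so that a subpage title stays within a single path segment.

```python
from urllib.parse import quote


def encode_title(title: str) -> str:
    """Percent-encode a page title for use as a path parameter."""
    # safe="" also escapes "/", which quote() leaves alone by default.
    return quote(title, safe="")
```

For example, `encode_title("Talk:Foo/Bar")` gives `Talk%3AFoo%2FBar`, and non-ASCII characters are encoded as UTF-8 bytes.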

Pagebundle blobs have the form,

For wikitext -> HTML requests

 * body_only: Optional boolean flag. When set, only the body.innerHTML is returned instead of a full HTML document.

For HTML -> wikitext requests

 * scrub_wikitext: Optional boolean flag that normalizes the DOM to yield cleaner wikitext than might otherwise be generated.

Wikitext -> HTML

 * revision: Optional. However, GET requests without a revision id should be considered a convenience method: if no revision id is provided, the request is redirected to the latest revision.
 * format: One of html or pagebundle.

Some querystring parameters are also accepted: body_only
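
A minimal Python sketch of appending the body_only querystring parameter to a request URL follows. The helper name `add_query` and the example URL are hypothetical placeholders, and encoding the boolean as the string "true" is an assumption, not something stated on this page.

```python
from urllib.parse import urlencode


def add_query(url: str, body_only: bool = False) -> str:
    """Append the optional body_only flag to a request URL."""
    params = {}
    if body_only:
        # Assumption: the flag is sent as the literal string "true".
        params["body_only"] = "true"
    # Leave the URL untouched when there is nothing to add.
    return f"{url}?{urlencode(params)}" if params else url
```

With a placeholder URL, `add_query("http://parsoid.example/PLACEHOLDER", body_only=True)` yields the same URL followed by `?body_only=true`.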

POST
The content type for the POST payload can be:,  , or

Wikitext -> HTML

 * from: wikitext
 * format: One of html or pagebundle

The payload can contain,

Some other fields exist, including fields for expansion reuse. See Parsoid's API test suite for their use.

HTML -> Wikitext

 * from: One of html or pagebundle
 * format: wikitext

The payload can contain,

Parsoid serializes HTML to a normalized form of wikitext. To avoid "dirty diffs" (differences outside the edited region of content) when serializing HTML generated from a given wikitext source, pass in the revision (either in the path or in the payload). Optionally, also pass in the original wikitext source and the unedited HTML; this is an optimization, because Parsoid will fetch or generate them if they are missing. This strategy is known as "selective serialization". An example can be seen in the test suite.
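
The idea behind selective serialization can be illustrated with a toy Python sketch. This is not Parsoid's algorithm: it assumes the page is already split into blocks aligned by position, and `toy_serialize` is a stand-in normalizing serializer invented for this example. Unchanged blocks reuse their original wikitext verbatim, so only edited blocks are normalized.

```python
from typing import Callable, List, Tuple


def toy_serialize(html: str) -> str:
    """Stand-in serializer: always emits normalized quote markup."""
    return (html.replace("<b>", "'''").replace("</b>", "'''")
                .replace("<i>", "''").replace("</i>", "''"))


def selective_serialize(original: List[Tuple[str, str]],
                        edited_html: List[str],
                        serialize: Callable[[str], str]) -> List[str]:
    """original holds (wikitext, html) pairs, one per block."""
    result = []
    for index, html in enumerate(edited_html):
        if index < len(original) and original[index][1] == html:
            # Block is unedited: reuse the source wikitext as written.
            result.append(original[index][0])
        else:
            # Block was edited or added: serialize it from scratch.
            result.append(serialize(html))
    return result
```

In this sketch, a block whose source was written as `<b>one</b>` stays that way when untouched, whereas serializing the whole page again would rewrite it as `'''one'''` and produce a dirty diff.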

HTML -> HTML
Parsoid exposes an API which transforms Parsoid-format HTML (encapsulated as a page bundle) to itself, performing a number of possible transformations. T114413 discusses some of the transformations, both actual and potential.

The payload is of the form:

The  field is a pagebundle blob, as described above.

The  field specifies the desired transformations, which are described in more detail below.

Redlinks
XXX: write me

Variant
See T43716.

XXX: write me

Content up/downgrade
XXX: write me

Wikitext -> Lint
Parsoid also exposes an API to get wikitext "syntax" errors for a given page, revision or wikitext.

The payload can contain:

Examples
For more intricate examples, see Parsoid's API test suite.

GET
Some simple GET requests to a Parsoid HTTP server bound to.

Returns text/html

Returns application/json

POST
POSTing the following blob,

to,

returns,

POST
POSTing the following blob,

to  returns

POST
POSTing the following blob

to  returns

Content Negotiation

 * Accept

When making a parse request (wikitext -> HTML), passing an Accept header that defines an acceptable spec version induces Parsoid to return HTML that satisfies that version, following Semantic Versioning caret semantics, or to respond with an error status code.
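
Caret semantics can be sketched in Python as follows. This follows the common definition of caret ranges (as used, for example, by npm's semver), where the left-most non-zero component of the requested version must not change. The function names are hypothetical, and whether Parsoid treats 0.x versions exactly this way is an assumption.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse "MAJOR.MINOR.PATCH" into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def satisfies_caret(available: str, requested: str) -> bool:
    """Return True if `available` falls within ^requested."""
    have = parse_version(available)
    want = parse_version(requested)
    if have < want:
        return False
    # Upper bound: bump the left-most non-zero component.
    if want[0] > 0:
        upper = (want[0] + 1, 0, 0)
    elif want[1] > 0:
        upper = (0, want[1] + 1, 0)
    else:
        upper = (0, 0, want[2] + 1)
    return have < upper
```

Under these rules a server offering 1.4.0 satisfies a request for 1.2.3, while one offering 2.0.0 does not.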

Older entry points
These versions have been deprecated.


 * Parsoid/API/v2
 * Parsoid/API/v1