Parsoid/API: Difference between revisions

From mediawiki.org
Revision as of 00:53, 21 January 2016

Parsoid converts MediaWiki's Wikitext to XHTML5 + RDFa and back.

In addition to the API defined below, the Parsoid service provides some form-based debugging tools at /. These are subject to change and may disappear at any time.

Common HTTP headers supported in all entry points

Accept-encoding
Clients should accept gzip-compressed responses.
Cookie
Cookie header that will be forwarded to the MediaWiki API. Makes it possible to use Parsoid with private wikis. Setting a cookie implicitly disables all caching for security reasons, so do not send a cookie for public wikis if you care about caching.
x-request-id
A request ID that will be forwarded to the MediaWiki API.
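
The header rules above can be sketched as a small helper. This is only an illustration: the function name buildParsoidHeaders, its option names, and the sample cookie and request ID values are assumptions, not part of Parsoid.

```javascript
// Hypothetical helper: assembles the common request headers described above.
// The function and option names are illustrative, not a Parsoid API.
function buildParsoidHeaders(options = {}) {
  // Always advertise gzip support.
  const headers = { "Accept-Encoding": "gzip" };

  // Forward a cookie only for private wikis: sending one disables caching.
  if (options.cookie) {
    headers["Cookie"] = options.cookie;
  }

  // Optional request ID, forwarded to the MediaWiki API.
  if (options.requestId) {
    headers["X-Request-Id"] = options.requestId;
  }
  return headers;
}

// Public wiki: no cookie, so responses stay cacheable.
const publicHeaders = buildParsoidHeaders({ requestId: "example-request-1" });

// Private wiki: the cookie value here is a made-up placeholder.
const privateHeaders = buildParsoidHeaders({ cookie: "session=abc123" });
```

Leaving the cookie out by default keeps the cache-friendly path the easy one, which matches the advice above for public wikis.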

v3 API

Common path parameters across all requests

domain
The hostname of the wiki.
title
Page title. Must be URL-encoded (percent-encoded).
revision
Revision id of the title.
format
Input / output format of content - wikitext, html, or pagebundle
wikitext
Plain text that is treated as wikitext. Content type is text/plain.
html
Parsoid's XHTML5 + RDFa output, which includes inlined data-parsoid attributes. The HTML conforms to the MediaWiki DOM spec. Content type is text/html.
pagebundle
A JSON blob containing the above html with the data-parsoid attributes split out and ids added to each node. Content type is application/json.

Pagebundle blobs have the form,

{
  "html": {
    "headers": {
      "content-type": "text/html;profile='mediawiki.org/specs/html/1.0.0'"
    },
    "body": "<!DOCTYPE html> ... </html>"
  },
  "data-parsoid": {
    "headers": {
      "content-type": "application/json;profile='mediawiki.org/specs/data-parsoid/0.0.1'"
    },
    "body": {
      "counter": n,
      "ids": { ... }
    }
  }
}
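
As a rough sketch of how a client might take such a blob apart, the snippet below reads the HTML and the per-node data out of a pagebundle-shaped object. The sample bundle is invented for illustration (the id "mwAA", the counter value, and the dsr array are not real Parsoid output), and splitPagebundle is a hypothetical helper name.

```javascript
// Sample object following the pagebundle shape shown above.
// The id "mwAA", the counter, and the dsr values are made-up illustration data.
const samplePagebundle = {
  html: {
    headers: {
      "content-type": "text/html;profile='mediawiki.org/specs/html/1.0.0'",
    },
    body: '<!DOCTYPE html><html><body id="mwAA"></body></html>',
  },
  "data-parsoid": {
    headers: {
      "content-type":
        "application/json;profile='mediawiki.org/specs/data-parsoid/0.0.1'",
    },
    body: { counter: 0, ids: { mwAA: { dsr: [0, 8, 0, 0] } } },
  },
};

// Hypothetical helper: pulls the HTML string and the id-keyed data apart.
function splitPagebundle(bundle) {
  if (!bundle || !bundle.html || !bundle["data-parsoid"]) {
    throw new TypeError("not a pagebundle: missing html or data-parsoid");
  }
  return {
    html: bundle.html.body,
    htmlContentType: bundle.html.headers["content-type"],
    ids: bundle["data-parsoid"].body.ids,
  };
}

const parts = splitPagebundle(samplePagebundle);
```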

Common payload / querystring parameters across all requests

body_only
Optional boolean flag. If set, only the HTML body.innerHTML is returned instead of a full document.
scrub_wikitext
Optional boolean flag, which normalizes the DOM to yield cleaner wikitext than might otherwise be generated.

GET requests

Wikitext -> HTML

GET /:domain/v3/page/:format/:title/:revision?

revision
The revision is optional, but GET requests without a revision id should be considered a convenience method. If no revision id is provided, the request is redirected to the latest revision.
format
One of html or pagebundle

Some querystring parameters are also accepted: body_only
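
A minimal sketch of assembling this GET URL follows. The helper name buildPageUrl and its argument order are assumptions made for illustration. Note that encodeURIComponent also encodes the colon in a title (as %3A), which is still valid percent-encoding.

```javascript
// Hypothetical helper: builds a GET /:domain/v3/page/:format/:title/:revision? URL.
function buildPageUrl(base, domain, format, title, revision, bodyOnly) {
  if (format !== "html" && format !== "pagebundle") {
    throw new RangeError("format must be 'html' or 'pagebundle'");
  }
  // The title must be percent-encoded, including any "/" it contains.
  let url = `${base}/${domain}/v3/page/${format}/${encodeURIComponent(title)}`;

  // The revision is optional; without it the server redirects to the latest.
  if (revision !== undefined) {
    url += `/${revision}`;
  }
  if (bodyOnly) {
    url += "?body_only=true";
  }
  return url;
}

const pageUrl = buildPageUrl(
  "http://localhost:8000",
  "en.wikipedia.org",
  "pagebundle",
  "User:Arlolra/sandbox",
  696653152,
  true
);
```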

POST requests

The content type for the POST payload can be: application/x-www-form-urlencoded, application/json, or multipart/form-data
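
To illustrate two of these content types, the sketch below serializes the same payload as JSON and as a URL-encoded form. The variable names are arbitrary, and sending the boolean as the string "true" in the form variant is an assumption modelled on the body_only=true querystring usage elsewhere on this page.

```javascript
// The same payload, serialized for two of the accepted content types.
const payload = { wikitext: "== h2 ==", body_only: true };

// application/json: the payload is sent as a JSON string.
const asJson = {
  contentType: "application/json",
  body: JSON.stringify(payload),
};

// application/x-www-form-urlencoded: every value becomes a string.
// Passing the boolean as "true" is an assumption, see the note above.
const asForm = {
  contentType: "application/x-www-form-urlencoded",
  body: new URLSearchParams({
    wikitext: payload.wikitext,
    body_only: String(payload.body_only),
  }).toString(),
};
```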

Wikitext -> HTML

POST /:domain/v3/transform/:from/to/:format/:title?/:revision?

from
wikitext
format
One of html or pagebundle

The payload can contain,

{
  "wikitext": "...",  // if omitted, a title is required to fetch wt source
  "title": "...",  // optional, instead of in the path
  "revid": n,  // optional, instead of in the path
  "body_only": true  // optional
}

Some other fields exist (including original and previous for expansion reuse). See Parsoid's API test suite for their use.
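
The request above could be put together roughly as follows. This is a sketch under stated assumptions: buildWikitextToHtmlRequest is a hypothetical helper, it always sends JSON, and it only supports passing the title in the payload rather than in the path.

```javascript
// Hypothetical helper: describes a POST to
// /:domain/v3/transform/wikitext/to/:format/ with a JSON payload.
function buildWikitextToHtmlRequest(base, domain, format, payload) {
  if (format !== "html" && format !== "pagebundle") {
    throw new RangeError("format must be 'html' or 'pagebundle'");
  }
  // If wikitext is omitted, a title is needed so the source can be fetched.
  if (payload.wikitext === undefined && payload.title === undefined) {
    throw new Error("payload needs either wikitext or a title");
  }
  return {
    method: "POST",
    url: `${base}/${domain}/v3/transform/wikitext/to/${format}/`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

const wt2html = buildWikitextToHtmlRequest(
  "http://localhost:8000",
  "localhost",
  "html",
  { wikitext: "== h2 ==", body_only: true }
);
// The descriptor could then be handed to an HTTP client of your choice.
```

Keeping request construction separate from sending makes the URL and payload easy to inspect before anything goes over the wire.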

HTML -> Wikitext

POST /:domain/v3/transform/:from/to/:format/:title?/:revision?

from
One of html or pagebundle
format
wikitext

Some payload parameters are also accepted: scrub_wikitext
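
For the reverse direction, a request might be sketched as below. The helper name is hypothetical, and the payload field name "html" is an assumption: this page does not spell out the payload for this direction, so check the API test suite before relying on it.

```javascript
// Hypothetical helper: describes a POST to
// /:domain/v3/transform/html/to/wikitext/ with a JSON payload.
function buildHtmlToWikitextRequest(base, domain, html, scrubWikitext) {
  // Assumption: the HTML is sent in a field named "html".
  const payload = { html: html };

  // scrub_wikitext asks for the DOM to be normalized for cleaner wikitext.
  if (scrubWikitext) {
    payload.scrub_wikitext = true;
  }
  return {
    method: "POST",
    url: `${base}/${domain}/v3/transform/html/to/wikitext/`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

const html2wt = buildHtmlToWikitextRequest(
  "http://localhost:8000",
  "localhost",
  "<h2> h2 </h2>",
  true
);
```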

Examples

Some simple GET requests to a Parsoid HTTP server bound to localhost:8000.

http://localhost:8000/en.wikipedia.org/v3/page/html/User:Arlolra%2Fsandbox/696653152

Returns text/html

http://localhost:8000/en.wikipedia.org/v3/page/pagebundle/User:Arlolra%2Fsandbox/696653152?body_only=true

Returns application/json

POSTing the following blob,

{
  "wikitext": "== h2 =="
}

to,

http://localhost:8000/localhost/v3/transform/wikitext/to/html/

returns,

<!DOCTYPE html>
<html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/"><head ...>...</head><body data-parsoid='{"dsr":[0,8,0,0]}' lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body mw-body-content mediawiki" dir="ltr"><h2 data-parsoid='{"dsr":[0,8,2,2]}'> h2 </h2></body></html>

For more intricate examples, see Parsoid's API test suite.

Older entry points

These versions have been deprecated.

Parsoid/API/v2

Parsoid/API/v1