Parsoid/API: Difference between revisions
Revision as of 00:53, 21 January 2016
Parsoid converts MediaWiki's Wikitext to XHTML5 + RDFa and back.
In addition to the API defined below, the Parsoid service provides some form-based debugging tools at /. These are subject to change and may disappear at any time.
Common HTTP headers supported in all entry points
- Accept-encoding
- Please accept gzip.
- Cookie
- Cookie header that will be forwarded to the MediaWiki API. Makes it possible to use Parsoid with private wikis. Setting a cookie implicitly disables all caching for security reasons, so do not send a cookie for public wikis if you care about caching.
- x-request-id
- A request ID that will be forwarded to the MediaWiki API.
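As a rough sketch of how a client might assemble these headers, the helper below only attaches a cookie when one is explicitly supplied, so that requests to public wikis stay cacheable. The function name, its option names, and the sample values are hypothetical, and the sketch assumes a server-side client such as Node.js, since browsers do not let scripts set Accept-Encoding or Cookie directly.

```javascript
// Hypothetical helper (not part of Parsoid): builds the common headers.
function buildCommonHeaders({ cookie, requestId } = {}) {
  // Always advertise gzip support, as the API asks.
  const headers = { 'Accept-Encoding': 'gzip' };
  // Forward a cookie only when the wiki is private; sending one
  // disables caching on the Parsoid side.
  if (cookie) {
    headers['Cookie'] = cookie;
  }
  // Optional request id, forwarded to the MediaWiki API.
  if (requestId) {
    headers['x-request-id'] = requestId;
  }
  return headers;
}

// Sample values below are made up for illustration.
const publicWikiHeaders = buildCommonHeaders({ requestId: 'example-request-1' });
const privateWikiHeaders = buildCommonHeaders({ cookie: 'session=abc123' });
```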
v3 API
Common path parameters across all requests
- domain
- The hostname of the wiki.
- title
- Page title -- needs to be urlencoded (percent encoded).
- revision
- Revision id of the title.
- format
- Input / output format of content - wikitext, html, or pagebundle
- wikitext
- Plain text that is treated as wikitext. Content type is text/plain.
- html
- Parsoid's XHTML5 + RDFa output, which includes inlined data-parsoid attributes. The HTML conforms to the MediaWiki DOM spec. Content type is text/html.
- pagebundle
- A JSON blob containing the above html with the data-parsoid attributes split out and ids added to each node. Content type is application/json.
Pagebundle blobs have the form,
{
"html": {
"headers": {
"content-type": "text/html;profile='mediawiki.org/specs/html/1.0.0'"
},
"body": "<!DOCTYPE html> ... </html>"
},
"data-parsoid": {
"headers": {
"content-type": "application/json;profile='mediawiki.org/specs/data-parsoid/0.0.1'"
},
"body": {
"counter": n,
"ids": { ... }
}
}
}
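A consumer of this blob will usually want the HTML string and the per-node data-parsoid entries side by side. The following is a minimal sketch of such an unpacking step; the helper name and the sample values (including the node id mwAA) are made up for illustration and are not taken from real Parsoid output.

```javascript
// Hypothetical helper: extracts the useful parts of a pagebundle blob.
function unpackPagebundle(bundle) {
  if (!bundle || !bundle.html || typeof bundle.html.body !== 'string') {
    throw new TypeError('pagebundle is missing html.body');
  }
  const dp = bundle['data-parsoid'];
  if (!dp || !dp.body || dp.body.ids === null || typeof dp.body.ids !== 'object') {
    throw new TypeError('pagebundle is missing data-parsoid.body.ids');
  }
  return {
    html: bundle.html.body,
    // Headers are read defensively in case they are absent.
    htmlContentType: bundle.html.headers?.['content-type'],
    counter: dp.body.counter,
    ids: dp.body.ids,
  };
}

// Illustrative blob only; ids and counter are invented.
const samplePagebundle = {
  html: {
    headers: { 'content-type': "text/html;profile='mediawiki.org/specs/html/1.0.0'" },
    body: '<!DOCTYPE html><html><body id="mwAA"></body></html>',
  },
  'data-parsoid': {
    headers: { 'content-type': "application/json;profile='mediawiki.org/specs/data-parsoid/0.0.1'" },
    body: { counter: 1, ids: { mwAA: { dsr: [0, 8, 0, 0] } } },
  },
};

const unpackedBundle = unpackPagebundle(samplePagebundle);
```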
Common payload / querystring parameters across all requests
- body_only
- Optional boolean flag, only return the HTML body.innerHTML instead of a full document.
- scrub_wikitext
- Optional boolean flag, which normalizes the DOM to yield cleaner wikitext than might otherwise be generated.
GET requests
Wikitext -> HTML
GET /:domain/v3/page/:format/:title/:revision?
- revision
- Revision is optional; however, GET requests without a revision ID should be considered a convenience method. If no revision ID is provided, the request redirects to the latest revision.
- format
- One of html or pagebundle
Some querystring parameters are also accepted: body_only
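The path template above can be filled in mechanically. The sketch below is one way to do it, assuming a Parsoid server at a caller-supplied base URL; the function and option names are hypothetical. Note that encodeURIComponent also encodes the colon in a title, which decodes to the same title on the server.

```javascript
// Hypothetical helper: builds a URL for GET /:domain/v3/page/:format/:title/:revision?
function buildPageUrl(base, { domain, format, title, revision, bodyOnly = false }) {
  if (format !== 'html' && format !== 'pagebundle') {
    throw new RangeError('format must be "html" or "pagebundle"');
  }
  // The title must be percent-encoded, including any "/" it contains.
  let path = `/${domain}/v3/page/${format}/${encodeURIComponent(title)}`;
  // The revision is optional; without it the server redirects to the latest one.
  if (revision !== undefined) {
    path += `/${revision}`;
  }
  const url = new URL(path, base);
  if (bodyOnly) {
    url.searchParams.set('body_only', 'true');
  }
  return url.toString();
}

const sandboxPageUrl = buildPageUrl('http://localhost:8000', {
  domain: 'en.wikipedia.org',
  format: 'pagebundle',
  title: 'User:Arlolra/sandbox',
  revision: 696653152,
  bodyOnly: true,
});
```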
POST requests
The content type for the POST payload can be: application/x-www-form-urlencoded, application/json, or multipart/form-data
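To illustrate two of these content types, the same payload can be serialized either as JSON or as a URL-encoded form using only built-in JavaScript facilities. This is a sketch: the variable names are illustrative, and sending the boolean as the string "true" in the form variant is an assumption about how the server reads form fields.

```javascript
// Example payload; field names follow the payload description on this page.
const examplePayload = { wikitext: '== h2 ==', body_only: true };

// Content-Type: application/json
const jsonBody = JSON.stringify(examplePayload);

// Content-Type: application/x-www-form-urlencoded
// ASSUMPTION: booleans are sent as the string "true".
const formBody = new URLSearchParams({
  wikitext: examplePayload.wikitext,
  body_only: String(examplePayload.body_only),
}).toString();
```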
Wikitext -> HTML
POST /:domain/v3/transform/:from/to/:format/:title?/:revision?
- from
- wikitext
- format
- One of html or pagebundle
The payload can contain,
{
"wikitext": "...", // if omitted, a title is required to fetch wt source
"title": "...", // optional, instead of in the path
"revid": n, // optional, instead of in the path
"body_only": true // optional
}
Some other fields exist (including original and previous for expansion reuse). See Parsoid's API test suite for their use.
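A request for this entry point could be assembled as in the sketch below. The helper name and its return shape are hypothetical; it only prepares the URL and options that a client would hand to an HTTP function such as fetch, and it sends the payload as application/json.

```javascript
// Hypothetical helper: prepares POST /:domain/v3/transform/wikitext/to/:format/
function buildWikitextToHtmlRequest(base, domain, format, payload) {
  if (format !== 'html' && format !== 'pagebundle') {
    throw new RangeError('format must be "html" or "pagebundle"');
  }
  return {
    url: new URL(`/${domain}/v3/transform/wikitext/to/${format}/`, base).toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    },
  };
}

const wt2htmlRequest = buildWikitextToHtmlRequest(
  'http://localhost:8000',
  'localhost',
  'html',
  { wikitext: '== h2 ==', body_only: true }
);
// To send it (not executed here):
//   fetch(wt2htmlRequest.url, wt2htmlRequest.options).then((res) => res.text());
```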
HTML -> Wikitext
POST /:domain/v3/transform/:from/to/:format/:title?/:revision?
- from
- One of html or pagebundle
- format
- wikitext
Some payload parameters are also accepted: scrub_wikitext
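The reverse direction can be sketched the same way. The page does not name the payload field that carries the HTML, so the field name html used below is an assumption, as are the helper name and its option names.

```javascript
// Hypothetical helper: prepares POST /:domain/v3/transform/html/to/wikitext/
function buildHtmlToWikitextRequest(base, domain, html, { scrubWikitext = false } = {}) {
  // ASSUMPTION: the HTML is sent in a payload field called "html".
  const payload = { html };
  // scrub_wikitext asks the server to normalize the DOM for cleaner wikitext.
  if (scrubWikitext) {
    payload.scrub_wikitext = true;
  }
  return {
    url: new URL(`/${domain}/v3/transform/html/to/wikitext/`, base).toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    },
  };
}

const html2wtRequest = buildHtmlToWikitextRequest(
  'http://localhost:8000',
  'localhost',
  '<h2> h2 </h2>',
  { scrubWikitext: true }
);
```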
Examples
Some simple GET requests to a Parsoid HTTP server bound to localhost:8000.
http://localhost:8000/en.wikipedia.org/v3/page/html/User:Arlolra%2Fsandbox/696653152
Returns text/html
http://localhost:8000/en.wikipedia.org/v3/page/pagebundle/User:Arlolra%2Fsandbox/696653152?body_only=true
Returns application/json
POSTing the following blob,
{
  "wikitext": "== h2 =="
}
to,
http://localhost:8000/localhost/v3/transform/wikitext/to/html/
returns,
<!DOCTYPE html>
<html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/"><head ...>...</head><body data-parsoid='{"dsr":[0,8,0,0]}' lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body mw-body-content mediawiki" dir="ltr"><h2 data-parsoid='{"dsr":[0,8,2,2]}'> h2 </h2></body></html>
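The data-parsoid attribute in this output is itself JSON. As a small illustration, the sketch below parses the value attached to the h2 element and uses it to slice the original wikitext. It assumes that dsr holds source offsets in the order start, end, opening-markup width, closing-markup width; the page itself does not spell that out, so treat the interpretation as an assumption.

```javascript
const postedWikitext = '== h2 ==';
// Attribute value copied from the <h2> element in the response above.
const h2DataParsoid = JSON.parse('{"dsr":[0,8,2,2]}');

// ASSUMPTION: dsr = [start, end, openWidth, closeWidth] in source offsets.
const [dsrStart, dsrEnd, openWidth, closeWidth] = h2DataParsoid.dsr;

// Full source range covered by the heading node.
const headingSource = postedWikitext.slice(dsrStart, dsrEnd);
// Source of the heading content, without the "==" markers.
const headingContent = postedWikitext.slice(dsrStart + openWidth, dsrEnd - closeWidth);
```

Under that reading, headingSource is the whole "== h2 ==" string and headingContent is " h2 ", which matches the text inside the h2 element.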
For more intricate examples, see Parsoid's API test suite.
Older entry points
These versions have been deprecated.