Topic on Talk:Parsoid

Cannot GET /localhost/mywiki/v3/transform/wikitext/to/html

Summary by Arlolra

URLs with "transform" in the path only accept POST requests.

Emske (talkcontribs)

Hello,

I've been having trouble issuing the wikitext-to-HTML request for a whole MediaWiki page. I've tried out the parse.js program on small snippets of text, but I need to be able to convert wikitext to HTML for a whole page. In the long run, I'll need to be able to do it for several pages.

I tried recreating the command listed in mediawiki.org/wiki/Parsoid/API#For_wikitext_->_HTML_requests:

/:domain/v3/page/:format/:title/:revision?

by using :

http://localhost:8000/localhost/mywiki/v3/transform/wikitext/to/html

where 'localhost/mywiki' is the path to my wiki's main page (I have another page, titled 'Sandbox', located at localhost/mywiki/index.php/Sandbox that I'd like to run wikitext to html on).

However, the above command I tried out returned the following:

Cannot GET /localhost/mywiki/v3/transform/wikitext/to/html

What is the proper implementation of this command? Or should I be doing something different?

Arlolra (talkcontribs)

The :domain part of /:domain/v3/page/:format/:title/:revision? is supposed to correspond to the domain you've defined in your config.yaml.

I imagine you've got something like,

mwApis:
  - uri: 'http://localhost/mywiki/api.php'
    domain: 'localhost'

In which case, you'd want to query,

http://localhost:8000/localhost/v3/page/html/Main_Page or http://localhost:8000/localhost/v3/page/html/Sandbox
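To make the URL pattern concrete, and to save the returned HTML to a file rather than view it in a browser, here is a minimal Python sketch. The helper names page_url and save_page_html are made up for illustration, and it assumes Parsoid is listening on http://localhost:8000 with the domain 'localhost' configured as above.

```python
from urllib.parse import quote
from urllib.request import urlopen


def page_url(base, domain, title, fmt="html"):
    """Build a page URL of the form /:domain/v3/page/:format/:title.

    `domain` must match the `domain` value in config.yaml, not the wiki path.
    """
    return f"{base.rstrip('/')}/{domain}/v3/page/{fmt}/{quote(title, safe='')}"


def save_page_html(base, domain, title, out_path):
    """GET the page from Parsoid and write the raw HTML to out_path."""
    with urlopen(page_url(base, domain, title)) as resp:
        html = resp.read().decode("utf-8")
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(html)
    return html


# Example call (needs a running Parsoid service, so it is left commented out):
# save_page_html("http://localhost:8000", "localhost", "Sandbox", "Sandbox.html")
```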

Note that URLs with "transform" in the path only accept POST requests.

See https://github.com/wikimedia/parsoid/blob/master/lib/api/ParsoidService.js#L212-L214
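For the transform route, a POST request could be built roughly like this in Python. This is a sketch only: the helper names are hypothetical, and the JSON body with a "wikitext" field is my assumption about the payload, so check it against the Parsoid API page before relying on it.

```python
import json
from urllib.request import Request, urlopen


def build_transform_request(base, domain, wikitext):
    """Build (but do not send) a POST to /:domain/v3/transform/wikitext/to/html."""
    url = f"{base.rstrip('/')}/{domain}/v3/transform/wikitext/to/html"
    # Assumption: the service reads the source from a "wikitext" field.
    body = json.dumps({"wikitext": wikitext}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # a GET on this path gives "Cannot GET ..."
    )


def wikitext_to_html(base, domain, wikitext):
    """Send the request and return the response body as text."""
    with urlopen(build_transform_request(base, domain, wikitext)) as resp:
        return resp.read().decode("utf-8")


# Example call (needs a running Parsoid service, so it is left commented out):
# print(wikitext_to_html("http://localhost:8000", "localhost", "''hello''"))
```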

Emske (talkcontribs)

Hi Arlolra,

Thank you so much for the swift response. The latter query appears to have worked. It displayed my Sandbox page in HTML.

However, I would like the HTML code dump (preferably in a txt file, but I think JSON would also work). Is there a way to output the actual HTML?

Thanks so much again!

Arlolra (talkcontribs)

Try,

node bin/parse.js --config --domain localhost --pageName Sandbox < /dev/null

Emske (talkcontribs)

I tried that; the system couldn't find the path specified (I'm guessing /dev/null, but I can look through the parsoid or node_modules directory to try a different path).

I tried executing without < /dev/null and got:

Waiting for stdin...

What input is it waiting on? I thought passing Sandbox to the --pageName flag would have covered that?

Arlolra (talkcontribs)

If you supply an input with the --pageName argument set, it parses the input in the context of the page. If no input is supplied, it'll fetch the source for that page. If you hit ctrl-c when it says "Waiting for stdin ..." it should do what you expect. Piping in < /dev/null is another way of achieving the same thing.
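If the shell has no /dev/null (in Windows cmd the equivalent redirect is < NUL), the same effect can be had by launching parse.js with its stdin closed. Below is a minimal Python sketch that also captures the output in a file. The helper names are made up, and it assumes parse.js writes the HTML to stdout and is run from the Parsoid checkout directory.

```python
import subprocess


def build_parse_command(domain, page_name):
    """Return the parse.js invocation from this thread as an argument list."""
    return ["node", "bin/parse.js", "--config",
            "--domain", domain, "--pageName", page_name]


def dump_page_html(domain, page_name, out_path, cwd="."):
    """Run parse.js with an empty stdin and write its stdout to out_path.

    stdin=subprocess.DEVNULL plays the role of `< /dev/null` on any platform,
    so parse.js fetches the page source instead of waiting for input.
    """
    with open(out_path, "w", encoding="utf-8") as out:
        subprocess.run(
            build_parse_command(domain, page_name),
            stdin=subprocess.DEVNULL,
            stdout=out,
            cwd=cwd,      # assumed to be the Parsoid checkout
            check=True,   # raise if node exits with an error
        )


# Example call (needs node and a Parsoid checkout, so it is left commented out):
# dump_page_html("localhost", "Sandbox", "Sandbox.html", cwd="path/to/parsoid")
```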