Talk:Parsoid


How to limit size of error_log?

1
Johnywhy (talkcontribs)

Can I set it to auto-purge old errors, or is there some other way to limit its size?

If not, how can I prevent error logging altogether?

My error log grew to 14 GB and maxed out my web hosting.
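
One generic way to cap a log file's size, independent of whatever is writing it, is to trim it periodically to its most recent bytes. The sketch below is only an illustration under stated assumptions, not a Parsoid or MediaWiki feature: `trim_log`, the file path, and the size cap are hypothetical names chosen for the example, and it would have to be run on a schedule (for instance from cron) if the host allows that.

```python
import os


def trim_log(path: str, keep_bytes: int) -> int:
    """Keep roughly the last keep_bytes of the file at path; return the new size.

    Hypothetical helper for illustration only.
    """
    size = os.path.getsize(path)
    if size <= keep_bytes:
        return size  # already small enough, nothing to do

    with open(path, "rb+") as f:
        # Read only the tail we want to keep, not the whole file.
        f.seek(size - keep_bytes)
        tail = f.read()

        # Drop a partial first line so the file starts on a line boundary.
        newline = tail.find(b"\n")
        if newline != -1:
            tail = tail[newline + 1:]

        # Overwrite the file with the tail and cut off everything after it.
        f.seek(0)
        f.write(tail)
        f.truncate()

    return os.path.getsize(path)
```

For example, `trim_log("error_log", 10 * 1024 * 1024)` would keep about the last 10 MB. Where logrotate is available on the host, it is usually the more robust choice.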

Reply to "How to limit size of error_log?"

after upgrade to Debian 11: Error loading data from server: apierror-visualeditor-docserver-http-error

2
RobFantini (talkcontribs)

Hello

After upgrading from Debian buster to bullseye, pressing the Edit button shows this message:

Error loading data from server: apierror-visualeditor-docserver-http-error: ⧼apierror-visualeditor-docserver-http-error⧽. Would you like to retry?

"Edit source" mode works.

Version info:

Product     Version
MediaWiki   1.34.4
PHP         7.4.21 (apache2handler)
MariaDB     10.5.11-MariaDB-1
ICU         67.1

/var/log/parsoid/parsoid.log:

{"name":"parsoid","hostname":"wiki-v134","pid":932,"level":30,"levelPath":"info/service-runner","msg":"master(932) initializing 2 workers","time":"2021-08-22T18:49:14.192Z","v":0}

{"name":"parsoid","hostname":"wiki-v134","pid":1028,"level":60,"code":"MODULE_NOT_FOUND","requireStack":["/usr/lib/parsoid/node_modules/service-runner/lib/base_service.js","/usr/lib/parsoid/node_modules/service-runner/lib/master.js","/usr/lib/parsoid/node_modules/service-runner/service-runner.js"],"moduleName":"../src/lib/index.js","levelPath":"fatal/service-runner/worker","msg":"Cannot find module '../src/lib/index.js'\nRequire stack:\n- /usr/lib/parsoid/node_modules/service-runner/lib/base_service.js\n- /usr/lib/parsoid/node_modules/service-runner/lib/master.js\n- /usr/lib/parsoid/node_modules/service-runner/service-runner.js","time":"2021-08-22T18:49:14.692Z","v":0}


{"name":"parsoid","hostname":"wiki-v134","pid":932,"level":40,"message":"first worker died during startup, continue startup","worker_pid":1028,"exit_code":1,"startup_attempt":1,"levelPath":"warn/service-runner/master","msg":"first worker died during startup, continue startup","time":"2021-08-22T18:49:15.701Z","v":0}


From the log I cannot see which module is missing.
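
The fatal record in the log above does carry the name of the module that could not be loaded, in its `moduleName` field, alongside the `requireStack` of files that asked for it. A minimal sketch for pulling those fields out of a JSON-lines log follows; `missing_modules` is a hypothetical helper name, and the sample record is abbreviated from the log above.

```python
import json


def missing_modules(log_lines):
    """Yield (moduleName, first requiring file) for MODULE_NOT_FOUND records.

    Hypothetical helper: expects one JSON object per line, as in the
    parsoid.log excerpt quoted above.
    """
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log record
        if not isinstance(entry, dict):
            continue
        if entry.get("code") == "MODULE_NOT_FOUND":
            stack = entry.get("requireStack") or [None]
            yield entry.get("moduleName"), stack[0]


# Abbreviated copy of the fatal record from the log above.
sample = (
    '{"level":60,"code":"MODULE_NOT_FOUND",'
    '"requireStack":["/usr/lib/parsoid/node_modules/service-runner/lib/base_service.js"],'
    '"moduleName":"../src/lib/index.js"}'
)

for name, required_by in missing_modules([sample]):
    print(name, "required by", required_by)
```

In practice the function could be fed an open file, for example `missing_modules(open("/var/log/parsoid/parsoid.log"))`.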

I would appreciate any advice or suggestions to fix this.

RobFantini (talkcontribs)
Reply to "after upgrade to Debian 11: Error loading data from server: apierror-visualeditor-docserver-http-error"
parsoid offline mode

2001:4998:EFFD:7804:0:0:0:104C (talkcontribs)

Hi,


I see an offline option. I want to render the infoboxes offline. Is there a way I can fetch the templates and then render the HTML without making calls to Wikipedia/MediaWiki?


Thanks

Reply to "parsoid offline mode"

Installation section shall reflect current version

1
Grin (talkcontribs)

1.36.1 seems to be the latest release, so the comment about developers using 1.36 seems to be slightly out of date.

Reply to "Installation section shall reflect current version"

Can Parsoid fully work without Internet connection?

1
DungLe94 (talkcontribs)

Assume that I have set up MediaWiki along with Parsoid properly. I don't import the `.xml.bz2` dump into MediaWiki, and I don't have an Internet connection. Is it possible for Parsoid to convert the wikitext in the dump into complete HTML, just as the Standard HTML REST API call from the wiki website does?

Reply to "Can Parsoid fully work without Internet connection?"

How to use thenets/parsoid in Docker in Windows 10?

1
DungLe94 (talkcontribs)

I've installed thenets/parsoid on Docker on Windows 10. I want to convert the text file F:\zim\pomme.txt to HTML. I tried:

docker run --name myparsoid -d -t -i -v /f/zim:/zim thenets/parsoid:latest sh

type /zim/pomme.txt | docker exec myparsoid php bin/parse.php --wt2html --offline

but it returns an error:

Microsoft Windows [Version 10.0.19042.928]
(c) Microsoft Corporation. All rights reserved.

C:\Users\Akira>docker run --name myparsoid -d -t -i -v /f/zim:/zim thenets/parsoid:latest sh
7912b0cef8fba4244b2519f4f9603ec8e278b67bcc4fe08f4658721b98f941f3

C:\Users\Akira>type /zim/pomme.txt | docker exec myparsoid node bin/parse.js --wt2html --offline
The syntax of the command is incorrect.
internal/modules/cjs/loader.js:638

   throw err;
   ^

Error: Cannot find module '/bin/parse.js'

   at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
   at Function.Module._load (internal/modules/cjs/loader.js:562:25)
   at Function.Module.runMain (internal/modules/cjs/loader.js:831:12)
   at startup (internal/bootstrap/node.js:283:19)
   at bootstrapNodeJSCore (internal/bootstrap/node.js:622:3)

Could you please shed some light on how to fix the error?
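
Two things in the transcript above look independent of Parsoid itself. "The syntax of the command is incorrect." most likely comes from the Windows shell, because `type` is given the container path `/zim/pomme.txt` rather than the host path, and `Cannot find module '/bin/parse.js'` suggests the command ran with `/` as its working directory. In addition, `docker exec` only forwards standard input when `-i` is given. A sketch of one way around all three, reading the file on the host and piping it in, is below. The Parsoid directory inside the image is not stated anywhere in this thread, so `/path/to/parsoid` is a placeholder, and `build_exec_command` and `convert` are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def build_exec_command(container: str, workdir: str, script: str = "bin/parse.js"):
    """Build the docker exec argument list.

    -i keeps stdin attached; -w sets the working directory so the
    relative script path resolves inside the Parsoid checkout.
    """
    return [
        "docker", "exec", "-i", "-w", workdir, container,
        "node", script, "--wt2html", "--offline",
    ]


def convert(host_file: str, container: str, workdir: str) -> str:
    """Feed the wikitext in host_file to Parsoid's stdin and return its stdout."""
    wikitext = Path(host_file).read_bytes()  # read on the host, no `type` needed
    result = subprocess.run(
        build_exec_command(container, workdir),
        input=wikitext,
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")


# Example call (placeholder directory, adjust to where Parsoid lives in the image):
# html = convert(r"F:\zim\pomme.txt", "myparsoid", "/path/to/parsoid")
```

Whether `--offline` produces the desired output for a given page is a separate question; this only addresses how the input reaches the script.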

Reply to "How to use thenets/parsoid in Docker in Windows 10?"

Whitespace in headings?

2
Summary by Arlolra
RoySmith (talkcontribs)

What is the intended behavior when parsing:

== Foo ==

I would have expected the whitespace around Foo to be preserved, but it's not. The example at Parsoid/API#POST 2 implies that it is, but when I try it, the whitespace is gone:


wget -q -O -  'http://en.wikipedia.org/api/rest_v1/page/html/User:RoySmith%2Fsandbox%2Fparsoid-whitespace-example'


<!DOCTYPE html>

<html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/" about="https://en.wikipedia.org/wiki/Special:Redirect/revision/1005513872"><head prefix="mwr: https://en.wikipedia.org/wiki/Special:Redirect/"><meta property="mw:TimeUuid" content="b56a5f60-69b0-11eb-876b-49aa12313550"/><meta charset="utf-8"/><meta property="mw:pageId" content="66664679"/><meta property="mw:pageNamespace" content="2"/><link rel="dc:replaces" resource="mwr:revision/0"/><meta property="mw:revisionSHA1" content="9fa2ea02674418d1bab8d09bd0c639bcf220a57b"/><meta property="dc:modified" content="2021-02-08T01:54:45.000Z"/><meta property="mw:html:version" content="2.2.0"/><link rel="dc:isVersionOf" href="//en.wikipedia.org/wiki/User%3ARoySmith/sandbox/parsoid-whitespace-example"/><title>User:RoySmith/sandbox/parsoid-whitespace-example</title><base href="//en.wikipedia.org/wiki/"/><link rel="stylesheet" href="/w/load.php?lang=en&amp;modules=mediawiki.skinning.content.parsoid%7Cmediawiki.skinning.interface%7Csite.styles&amp;only=styles&amp;skin=vector"/><meta http-equiv="content-language" content="en"/><meta http-equiv="vary" content="Accept"/></head><body id="mwAA" lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr"><section data-mw-section-id="0" id="mwAQ"></section><section data-mw-section-id="1" id="mwAg"><h2 id="Foo">Foo</h2></section></body></html>
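
To check the heading's whitespace programmatically rather than by eye, the heading text can be extracted from the returned HTML. A minimal sketch using only the Python standard library follows; `HeadingText` and `heading_texts` are hypothetical names, and the sample markup is abbreviated from the response above.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingText(HTMLParser):
    """Collect the raw text content of every heading element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        # Text is appended unmodified, so leading and trailing spaces survive.
        if self._in_heading:
            self.headings[-1] += data


def heading_texts(html: str):
    parser = HeadingText()
    parser.feed(html)
    parser.close()
    return parser.headings


# Abbreviated from the response above: no whitespace around "Foo".
sample = '<section data-mw-section-id="1" id="mwAg"><h2 id="Foo">Foo</h2></section>'
print([repr(text) for text in heading_texts(sample)])
```

Comparing each result with its `strip()`-ed form shows immediately whether any surrounding whitespace was kept.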

SSastry (WMF) (talkcontribs)

Parsoid starts but fails to connect with curl

2
Summary by Arlolra

User disappeared

Johnjin216326 (talkcontribs)

The OS is Fedora 31.

I downloaded Parsoid from the BlueSpice wiki.

ii. Create a service under /etc/system/system/parsoid.service


[Unit]
Description=Mediawiki Parsoid web service on node.js
Documentation=
Wants=local-fs.target network.target
After=local-fs.target network.target

[Install]
WantedBy=multi-user.target

[Service]
Type=simple
User=nobody
Group=nobody
WorkingDirectory=/opt/parsoid
#EnvironmentFile=-/etc/parsoid/parsoid.env
ExecStart=/usr/bin/nodejs /opt/parsoid /bin/server.js
KillMode=process
Restart=on-success
PrivateTmp=true
StandardOutput=syslog


iii. Under /opt/parsoid/config.yaml

worker_heartbeat_timeout: 300000
logging:
    level: info
services:
  - module: lib/index.js
    entrypoint: apiServiceWorker
    conf:
        localsettings: ./localsettings.js

iv. Under /opt/parsoid/localsettings.js

/*
 * This is an example configuration for a BlueSpiceWikiFarm setup
 * In this case 'httpd' is used as wiki webserver machine name as it is in our
 * docker environment.
 */
'use strict';

exports.setup = function(parsoidConfig) {
    parsoidConfig.dynamicConfig = function(domain) {
        var baseUrl = Buffer.from( domain, 'base64').toString();
        parsoidConfig.setMwApi({
            uri: baseUrl + '/api.php',
            domain: domain,
            strictSSL: false
        });
    }
};
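
In the settings above, the `domain` value is treated as a base64-encoded base URL, and the MediaWiki API address is that decoded URL with `/api.php` appended. The Python sketch below mirrors that derivation so a given `domain` string can be checked by hand. `encode_domain` and `api_uri_for` are hypothetical helper names, and `http://httpd/w` is an assumed example URL, not one taken from this setup.

```python
import base64


def encode_domain(base_url: str) -> str:
    """Produce the base64 'domain' string for a wiki base URL."""
    return base64.b64encode(base_url.encode("utf-8")).decode("ascii")


def api_uri_for(domain: str) -> str:
    """Mirror the dynamicConfig logic: decode the domain, append /api.php."""
    base_url = base64.b64decode(domain).decode("utf-8")
    return base_url + "/api.php"


# Assumed example value; substitute the wiki's real base URL.
domain = encode_domain("http://httpd/w")
print(domain, "->", api_uri_for(domain))
```

Python's `b64decode` insists on correct padding, while Node's `Buffer.from(..., 'base64')` is more lenient, so a string that fails here may still decode in Node.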

Node.js is at version 10 and Parsoid is v0.10.

Here's the output of curl:

[root@wiki-server BlueSpice3]# curl http://127.0.0.1:8000

<!DOCTYPE html>

<#html lang="en">

<#head>

<#meta charset="utf-8">

<#title>Error<#/title>

<#/head>

<#body>

<#pre>Internal Server Error<#/pre>

<#/body>

<#/html>

(I've added a # in the bracket to show more info)

SELinux is disabled, and the firewall is open with port 8000 listening, although netstat doesn't show that the Parsoid service is using the port:

[root@wiki-server BlueSpice3]# netstat -aon | grep 8000

tcp 0 0 127.0.0.1:8000 0.0.0.0:* LISTEN off (0.00/0/0)

httpd is configured with SSL domain certificate and https enabled.

Why does this fail?

Arlolra (talkcontribs)

Try restarting the Parsoid service and see what information is logged to syslog?

Parsoid is not working in non-english language

2
Summary by Arlolra

User disappeared

186.151.92.120 (talkcontribs)

I tried several times to set up my wiki in Spanish with MediaWiki 1.35.1, and it always shows a Parsoid/REST error (curl error 7). However, when I choose English as the wiki language, everything works fine.

I hope this can be fixed, because my users don't speak English.

Arlolra (talkcontribs)

How did you install Parsoid?

Parsoid - Memory exhaustion on a big page

5
Summary by Arlolra

Something in the user's setup is enforcing the limit.

189.9.10.125 (talkcontribs)

Hello, I'm having some problems with MediaWiki Parsoid regarding memory exhaustion. Can someone help me?

On a very big page (I can't tell the exact size, but the original, written in MS Word, has more than 70 pages) I get the following error:

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 135168 bytes) in /var/www/html/mediawiki-1.35.1/vendor/wikimedia/parsoid/src/Html2Wt/WikitextSerializer.php on line 1683

That line is a call to the explode function.


As you can see, it says the memory limit is 128M, but my phpinfo says 750M, which I configured in several places to make sure (php.ini, php-fpm.conf).

From my phpinfo:

memory_limit 750M 750M

Here's a grep -r memory_limit on my /etc:

php-fpm.d/www.conf:php_admin_value[memory_limit] = 750M

php.ini:memory_limit = 750M

So both php.ini and FPM are configured with 750M.
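
The byte count in the fatal error can be compared directly with the configured values by expanding PHP's shorthand sizes, where K, M, and G are powers of 1024. A small sketch of that arithmetic follows; `php_bytes` and `error_limit` are hypothetical helper names used only for this illustration.

```python
import re

# PHP shorthand size suffixes are binary multiples.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def php_bytes(value: str) -> int:
    """Convert a PHP shorthand size such as '750M' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMG]?)\s*", value.upper())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def error_limit(message: str):
    """Extract the limit, in bytes, from a PHP 'Allowed memory size' error."""
    match = re.search(r"Allowed memory size of (\d+) bytes", message)
    return int(match.group(1)) if match else None


message = "Fatal error: Allowed memory size of 134217728 bytes exhausted"
limit = error_limit(message)

# 134217728 bytes is exactly 128M, far below the configured 750M.
print(limit == php_bytes("128M"), limit < php_bytes("750M"))
```

So the process that hit the error was running with 128M, which means the 750M settings were not the ones in effect for that request.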


I already tried setting memory_limit in LocalSettings.php, but that didn't help either.


PHP 7.4.14 (fpm-fcgi)

MediaWiki 1.35.1

Lua 5.1.5

ICU 65.1

MySQL 5.6.35-80.0-log

wikimedia/parsoid 0.12.1


Can someone help me? This is preventing me and my team from creating long and important documents.

Thank you!

189.9.10.125 (talkcontribs)

Oh, and I did stop/restart php, phpfpm and httpd. Even restarted the OS.

189.9.10.125 (talkcontribs)

I also tried calling ini_set( 'memory_limit', '750M' ); in wikitext2html and in WikitextSerializer.php, inside serializeDOM, but it raises the same error.

Arlolra (talkcontribs)

This isn't an inherent problem with Parsoid. On the WMF cluster, Parsoid runs with a ~1.4G memory limit, which it occasionally hits, so it is certainly not limited to 128M.

https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/InitialiseSettings.php#L18558-L18560

Something in your setup is enforcing that limit. Maybe it's the OS, maybe it's the HTTP server, or PHP configurations you're mentioning.

Try isolating it. You can run Parsoid on the command line with bin/parse.php.

Pass it your large page and see if you run up against the memory limit there.

189.9.10.125 (talkcontribs)

OK, I'll try that,

thank you