Topic on Talk:Parsoid

Use a single Parsoid PHP server for multiple wikis

Calebgcooper (talkcontribs)

I am trying to move RESTBase and the other Node.js apps into separate VMs/containers. I am testing with MediaWiki 1.38 and 1.37.


If I configure the Parsoid host (parsoidHost) like this in the RESTBase config.yaml:

  /parsoid:
    x-modules:
      - path: sys/parsoid.js
        options:
          parsoidHost: "{{'http://{domain}/rest.php'}}"


I get the following error in the log, which indicates that {domain} has not been replaced with the actual domain:

"HTTPError","message":"Invalid URI \"http:///%7Bdomain%7D/rest.php/testwiki.wiki.internal/v3/page/pagebundle/Main_Page/20\"","status":504,

The correct URL for Parsoid should be:


http://testwiki.wiki.internal/rest.php/testwiki.wiki.internal/v3/page/html/Main%20Page/20
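
The `%7B` and `%7D` in the failing URL are the percent-encoded forms of `{` and `}`, which suggests the placeholder reached the HTTP client as literal text instead of being substituted. A minimal Python sketch of the difference, assuming only the URL shapes quoted above (the helper name `expected_parsoid_url` is made up for illustration and is not part of RESTBase or Parsoid):

```python
from urllib.parse import quote


def expected_parsoid_url(domain: str, title: str, revision: int) -> str:
    """Build the Parsoid URL shape quoted above for one wiki (hypothetical helper)."""
    return f"http://{domain}/rest.php/{domain}/v3/page/html/{quote(title)}/{revision}"


# An unexpanded placeholder is percent-encoded rather than substituted.
unexpanded = quote("{domain}")

# The URL the request should have used for this wiki.
expected = expected_parsoid_url("testwiki.wiki.internal", "Main Page", 20)

print(unexpanded)
print(expected)
```

Comparing the two printed values against the logged URI makes it easy to see which part of the URL was never expanded.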


This format gives a similar error:

parsoidHost: "{{'http://{$.request.params.domain}/rest.php'}}"


Full error:
{"name":"restbase","hostname":"restbase","pid":7192,"level":50,"message":"Invalid URI \"http:///$%7B$.request.params.domain%7D/rest.php/testwiki-fat.wiki.internal/v3/transform/pagebundle/to/wikitext/Main_Page/20\"","res":{"name":"HTTPError","message":"Invalid URI \"http:///$%7B$.request.params.domain%7D/rest.php/testwiki-fat.wiki.internal/v3/transform/pagebundle/to/wikitext/Main_Page/20\"","status":504,"headers":{"content-type":"application/problem+json","access-control-allow-origin":"*","access-control-allow-methods":"GET,HEAD","access-control-allow-headers":"accept, content-type, content-length, cache-control, accept-language, api-user-agent, if-match, if-modified-since, if-none-match, dnt, accept-encoding","access-control-expose-headers":"etag","x-content-type-options":"nosniff","x-frame-options":"SAMEORIGIN","referrer-policy":"origin-when-cross-origin","x-xss-protection":"1; mode=block","content-security-policy":"default-src 'none'; frame-ancestors 'none'","x-content-security-policy":"default-src 'none'; frame-ancestors 'none'","x-webkit-csp":"default-src 'none'; frame-ancestors 'none'","cache-control":"private, max-age=0, s-maxage=0, must-revalidate","x-request-id":"65e84430-ca39-11ec-80d5-5303b9bf502a","server":"restbase"},"body":{"type":"internal_http_error","detail":"Invalid URI \"http:///$%7B$.request.params.domain%7D/rest.php/testwiki-fat.wiki.internal/v3/transform/pagebundle/to/wikitext/Main_Page/20\"","internalStack":"Error: Invalid URI \"http:///$%7B$.request.params.domain%7D/rest.php/testwiki-fat.wiki.internal/v3/transform/pagebundle/to/wikitext/Main_Page/20\"\n    at Request.init (/var/lib/restbase/node_modules/request/request.js:273:31)\n    at new Request (/var/lib/restbase/node_modules/request/request.js:127:8)\n    at Function.request (/var/lib/restbase/node_modules/request/index.js:53:10)\n    at Request._tryUntilFail (/var/lib/restbase/node_modules/requestretry/index.js:124:23)\n    at Factory (/var/lib/restbase/node_modules/requestretry/index.js:178:7)\n    at Request.run (/var/lib/restbase/node_modules/preq/index.js:160:16)\n    at preq (/var/lib/restbase/node_modules/preq/index.js:267:46)\n    at Object.module.exports [as filter] (/var/lib/restbase/node_modules/hyperswitch/lib/filters/http.js:76:12)\n    at handlerWrapper (/var/lib/restbase/node_modules/hyperswitch/lib/hyperswitch.js:420:27)\n    at /var/lib/restbase/node_modules/hyperswitch/lib/hyperswitch.js:426:28\n    at HyperSwitch._filteredRequest (/var/lib/restbase/node_modules/hyperswitch/lib/hyperswitch.js:171:19)\n    at HyperSwitch.<computed> [as post] (/var/lib/restbase/node_modules/hyperswitch/lib/hyperswitch.js:460:21)\n    at ParsoidService.callParsoidTransform (/var/lib/restbase/sys/parsoid.js:690:40)\n    at /var/lib/restbase/sys/parsoid.js:643:25\n    at tryCatcher (/var/lib/restbase/node_modules/bluebird/js/release/util.js:16:23)\n    at Promise._settlePromiseFromHandler (/var/lib/restbase/node_modules/bluebird/js/release/promise.js:547:31)","internalURI":"http://${$.request.params.domain}/rest.php/testwiki-fat.wiki.internal/v3/transform/pagebundle/to/wikitext/Main_Page/20","internalQuery":"{}","internalErr":"Invalid URI \"http:///$%7B$.request.params.domain%7D/rest.php/testwiki-fat.wiki.internal/v3/transform/pagebundle/to/wikitext/Main_Page/20\"","internalMethod":"post"},"requestName":"get_from_backend"},"stack":"HTTPError: Invalid URI \"http:///$%7B$.request.params.domain%7D/rest.php/testwiki-fat.wiki.internal/v3/transform/pagebundle/to/wikitext/Main_Page/20\"\n    at /var/lib/restbase/node_modules/preq/index.js:246:19\n    at tryCatcher (/var/lib/restbase/node_modules/bluebird/js/release/util.js:16:23)\n    at Promise._settlePromiseFromHandler (/var/lib/restbase/node_modules/bluebird/js/release/promise.js:547:31)\n    at Promise._settlePromise (/var/lib/restbase/node_modules/bluebird/js/release/promise.js:604:18)\n    at Promise._settlePromiseCtx (/var/lib/restbase/node_modules/bluebird/js/release/promise.js:641:10)\n    at _drainQueueStep (/var/lib/restbase/node_modules/bluebird/js/release/async.js:97:12)\n    at _drainQueue (/var/lib/restbase/node_modules/bluebird/js/release/async.js:86:9)\n    at Async._drainQueues (/var/lib/restbase/node_modules/bluebird/js/release/async.js:102:5)\n    at Immediate.Async.drainQueues [as _onImmediate] (/var/lib/restbase/node_modules/bluebird/js/release/async.js:15:14)\n    at processImmediate (node:internal/timers:466:21)","latency":21,"root_req":{"method":"post","uri":"/testwiki-fat.wiki.internal/v1/transform/html/to/wikitext/Main_Page/20","headers":{"content-length":"5758","content-type":"multipart/form-data; boundary=------------------------54315dbeb2b1f7be","if-match":"\"20/956ec650-ca1d-11ec-b753-6728b9e18a10\"","user-agent":"VisualEditor-MediaWiki/1.38.0-rc.0","api-user-agent":"VisualEditor-MediaWiki/1.38.0-rc.0","x-client-ip":"::ffff:192.168.128.20","x-forwarded-for":"::ffff:192.168.128.20","x-request-id":"65e84430-ca39-11ec-80d5-5303b9bf502a","x-request-class":"external"}},"request_id":"65e84430-ca39-11ec-80d5-5303b9bf502a","levelPath":"error/request","msg":"Invalid URI \"http:///$%7B$.request.params.domain%7D/rest.php/testwiki-fat.wiki.internal/v3/transform/pagebundle/to/wikitext/Main_Page/20\"","time":"2022-05-02T17:00:41.352Z","v":0}
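
Most of that entry is headers and stack trace. As a reading aid, here is a small Python sketch that pulls out the three fields that matter for this problem: the status, the internal URI RESTBase tried to call, and the original request URI. The function name `summarize_restbase_error` is hypothetical, and the sample is a hand-trimmed copy of the entry above rather than new log output:

```python
import json


def summarize_restbase_error(log_line: str) -> dict:
    """Extract the fields relevant to URL construction from one JSON log line."""
    entry = json.loads(log_line)
    res = entry.get("res", {})
    return {
        "status": res.get("status"),
        "internal_uri": res.get("body", {}).get("internalURI"),
        "root_uri": entry.get("root_req", {}).get("uri"),
    }


# Hand-trimmed copy of the log entry above (only the fields used here).
sample = json.dumps({
    "res": {
        "status": 504,
        "body": {
            "internalURI": "http://${$.request.params.domain}/rest.php/"
                           "testwiki-fat.wiki.internal/v3/transform/pagebundle/to/wikitext/Main_Page/20",
        },
    },
    "root_req": {"uri": "/testwiki-fat.wiki.internal/v1/transform/html/to/wikitext/Main_Page/20"},
})

summary = summarize_restbase_error(sample)
print(summary)
```

The `internalURI` field shows the placeholder still present in the host position, while the domain in the path segment was filled in correctly.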


However, this syntax does work for the API endpoint and the RESTBase base URI (and is recommended in the docs):

apiUriTemplate: "{{'http://{domain}/api.php'}}"
baseUriTemplate: "{{'http://restbase.wiki.internal:7231/{domain}/v1'}}"
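
For comparison, this is roughly what those two options appear to do when they are expanded. The Python below is a simplified model written for this thread, not RESTBase's or hyperswitch's actual template engine; `expand_template` is a hypothetical name, and the model handles only the `{{'...{name}...'}}` shape used in this config:

```python
import re


def expand_template(template: str, params: dict) -> str:
    """Simplified model of {{'...{name}...'}} expansion (not the real engine)."""
    match = re.fullmatch(r"\{\{'(.*)'\}\}", template)
    if match is None:
        # Anything without the {{'...'}} wrapper is treated as a literal string.
        return template
    # Replace each {name} placeholder with the matching request parameter.
    return re.sub(r"\{(\w+)\}", lambda m: params[m.group(1)], match.group(1))


params = {"domain": "testwiki.wiki.internal"}
api_uri = expand_template("{{'http://{domain}/api.php'}}", params)
base_uri = expand_template("{{'http://restbase.wiki.internal:7231/{domain}/v1'}}", params)
print(api_uri)
print(base_uri)
```

The Parsoid errors above look like what would happen if the `parsoidHost` value skipped this expansion step and was used as a plain string, although that is an inference from the log rather than something confirmed in the RESTBase source.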


Using curl or a web browser, this returns the main page as expected:

http://restbase.wiki.internal/testwiki.wiki.internal/v1/page/html/Main_Page


As does the same request directly to parsoid:

http://testwiki.wiki.internal/rest.php/testwiki.wiki.internal/v3/page/html/Main%20Page/20


If I change the RESTBase config.yaml to use a DNS entry that points all wikis at a single wiki for Parsoid, I get errors like 'Validation of `domain` failed':
{"message":"Error: exception of type Wikimedia\\ParamValidator\\ValidationException: Validation of `domain` failed: invalid-domain {\"expected\":\"testwiki-a.wiki.internal\",\"actual\":\"testwiki-b.wiki.internal\"}","exception":{"id":"699104fa54c628eaf170cbb0","type":"Wikimedia\\ParamValidator\\ValidationException","file":"/var/lib/mediawiki/vendor/wikimedia/parsoid/extension/src/Rest/Handler/ParsoidHandler.php","line":162,"message":"Validation of `domain` failed: invalid-domain {\"expected\":\"testwiki-a.wiki.internal\",\"actual\":\"testwiki-b.wiki.internal\"}","code":0,"url":"/rest.php/testwiki-b.wiki.internal/v3/page/html/Main%20Page/20","caught_by":"other","backtrace":[{"file":"/var/lib/mediawiki/vendor/wikimedia/parsoid/extension/src/Rest/Handler/ParsoidHandler.php","line":260,"function":"assertDomainIsCorrect","class":"MWParsoid\\Rest\\Handler\\ParsoidHandler","type":"->"},{"file":"/var/lib/mediawiki/vendor/wikimedia/parsoid/extension/src/Rest/Handler/PageHandler.php","line":75,"function":"getRequestAttributes","class":"MWParsoid\\Rest\\Handler\\ParsoidHandler","type":"->"},{"file":"/var/lib/mediawiki/includes/Rest/Router.php","line":414,"function":"execute","class":"MWParsoid\\Rest\\Handler\\PageHandler","type":"->"},{"file":"/var/lib/mediawiki/includes/Rest/Router.php","line":338,"function":"executeHandler","class":"MediaWiki\\Rest\\Router","type":"->"},{"file":"/var/lib/mediawiki/includes/Rest/EntryPoint.php","line":167,"function":"execute","class":"MediaWiki\\Rest\\Router","type":"->"},{"file":"/var/lib/mediawiki/includes/Rest/EntryPoint.php","line":132,"function":"execute","class":"MediaWiki\\Rest\\EntryPoint","type":"->"},{"file":"/var/lib/mediawiki/rest.php","line":31,"function":"main","class":"MediaWiki\\Rest\\EntryPoint","type":"::"}]},"httpCode":500,"httpReason":"Internal Server Error"}
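
The useful part of that exception is the expected/actual pair. A short Python sketch for extracting it from the decoded `exception.message` string (the helper name `domain_mismatch` is made up for illustration; the message text is copied from the error above):

```python
import json
import re


def domain_mismatch(message: str):
    """Return (expected, actual) from an invalid-domain message, or None if absent."""
    match = re.search(r"invalid-domain (\{.*?\})", message)
    if match is None:
        return None
    details = json.loads(match.group(1))
    return details["expected"], details["actual"]


message = (
    "Validation of `domain` failed: invalid-domain "
    '{"expected":"testwiki-a.wiki.internal","actual":"testwiki-b.wiki.internal"}'
)
mismatch = domain_mismatch(message)
print(mismatch)
```

Here the wiki that received the request expected its own domain (testwiki-a) in the URL path but was asked about testwiki-b.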


Clearly one wiki is not allowed to parse text for another. Is there a flag to change this behavior?


So I need to figure out either how to get RESTBase to build the correct URL dynamically, or how to configure a Parsoid PHP server to respond for other wikis.
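
One possible direction for the first option, sketched under stated assumptions: if `parsoidHost` has to be a fixed prefix (which the errors above suggest, though this is not confirmed), a small router in front of the wikis could choose the upstream wiki from the domain segment of the request path. The Python function below only shows the mapping; `upstream_for` is a hypothetical name, and a real setup would need an actual reverse proxy around it:

```python
def upstream_for(path: str, scheme: str = "http") -> str:
    """Map /rest.php/<domain>/... to the wiki that owns <domain> (hypothetical router)."""
    prefix = "/rest.php/"
    if not path.startswith(prefix):
        raise ValueError(f"not a Parsoid REST path: {path}")
    # The first path segment after /rest.php/ is the wiki domain.
    domain = path[len(prefix):].split("/", 1)[0]
    if not domain:
        raise ValueError("missing domain segment")
    # Send the request to the wiki whose own domain matches the path.
    return f"{scheme}://{domain}{path}"


target = upstream_for("/rest.php/testwiki-b.wiki.internal/v3/page/html/Main%20Page/20")
print(target)
```

Because each request is sent to the wiki whose domain appears in the path, the domain check in ParsoidHandler.php should pass without any change on the Parsoid side.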


I have no problem getting restbase and parsoid working when everything is in the same container/VM.

Calebgcooper (talkcontribs)
Here is my full config.yaml:
bash-5.1# cat restbase/config.yaml
#
# Simple RESTBase config for Mediawiki Container
# https://www.mediawiki.org/wiki/RESTBase/Installation
#
# - cassandra DB
# - parsoid at http://localhost/rest.php
# - wiki at http://testwiki.wiki.internal/api.php
#
# - proxied via nginx, available via
# - http://hostname/api/rest_v1/
#
services:
  - name: restbase
    module: hyperswitch
    conf:
      port: 7231
      salt: 988881adc9fc3655077dc2d4d757d480b5ea0e11
      default_page_size: 125
      user_agent: RESTBase
      ui_name: RESTBase
      ui_url: https://www.mediawiki.org/wiki/RESTBase
      ui_title: RESTBase docs
      spec:
        x-request-filters:
          - path: lib/security_response_header_filter.js
          - path: lib/normalize_headers_filter.js
        x-sub-request-filters:
          - type: default
            name: http
            options:
              allow:
                - pattern: /^https?:\/\//
        paths:
          /{domain}/{api:v1}:
            x-modules:
              - spec:
                  info:
                    version: 1.0.0
                    title: Wikimedia REST API
                    description: Welcome to your RESTBase API.
                  x-route-filters:
                    - path: ./lib/normalize_title_filter.js
                      options:
                        redirect_cache_control: 's-maxage=0, max-age=86400'
                  paths:
                    /page:
                      x-modules:
                        - path: v1/content.yaml
                          options:
                            response_cache_control: 's-maxage=0, max-age=86400'
                        - path: v1/content_segments.yaml
                          options:
                            response_cache_control: 's-maxage=0, max-age=86400'
                        - path: v1/common_schemas.yaml
                        - path: v1/summary.js
                        - path: v1/related.js
                        - path: v1/random.yaml
                        - path: v1/pdf.js
                          options:
                            uri: http://restbase.wiki.internal:3030
                            secret: secret
                    /transform:
                      x-modules:
                        - path: v1/transform.yaml
                    /media:
                      x-modules:
                        - path: v1/mathoid.yaml
                          options:
                            host: http://restbase.wiki.internal:10042
                    /data:
                      x-modules:
                        - path: v1/citoid.js
                          options:
                            host: http://restbase.wiki.internal:1970
                        - path: v1/lists.js
                        - path: v1/recommend.yaml
                        - path: v1/javascript.yaml
                        - path: v1/css.yaml

          /{domain}/{api:sys}:
            x-modules:
              - path: projects/proxy.yaml
                options:
                  backend_host_template: '{{"/{domain}/sys/legacy"}}'
              - spec:
                  paths:
                    /table:
                      x-modules:
                        - path: sys/table.js
                          options:
                            conf:
                              version: 1
                              backend: cassandra
                              hosts:
                                - cassandradb
                              pool_idle_timeout: 20000
                              retry_delay: 250
                              retry_limit: 10
                              show_sql: false
                              keyspace: system
                              defaultConsistency: localOne
                              localDc: datacenter1
                              datacenters:
                                - datacenter1
                              storage_groups:
                                - name: local
                                  domains: /./
                    /legacy/key_value:
                      x-modules:
                        - path: sys/key_value.js
                    /legacy/page_revisions:
                      x-modules:
                        - path: sys/page_revisions.js
                    /post_data:
                      x-modules:
                        - path: sys/post_data.js
                    /action:
                      x-modules:
                        - path: sys/action.js
                          options:
                            apiUriTemplate: "{{'http://{domain}/api.php'}}"
                            baseUriTemplate: "{{'http://restbase.wiki.internal:7231/{domain}/v1'}}"
                    /page_save:
                      x-modules:
                        - path: sys/page_save.js
                    /events:
                      x-modules:
                        - path: sys/events.js
                    /parsoid:
                      x-modules:
                        - path: sys/parsoid.js
                          options:
                            parsoidHost: "{{'http://{domain}/rest.php'}}"
                            grace_ttl: 1000000
                    /mathoid:
                      x-modules:
                        - path: sys/mathoid.js
                          options:
                            host: "{{'http://restbase.wiki.internal:10042'}}"


# Finally, a standard service-runner config.
info:
  name: restbase

logging:
  name: restbase
  level: warn
  streams:
    - type: stdout


num_workers: 0



I found this page, but it is for Parsoid/JS. An equivalent for Parsoid/PHP would be very helpful: https://www.mediawiki.org/wiki/Parsoid/Setup/RESTBase/Arbitrary_domains
