User:ArielGlenn/Envoy for wikifarm

From mediawiki.org

Overview and disclaimers

The configuration file below is set up for my local development and testing environment: a local wikifarm with CentralAuth running. There are two domains, a loginwiki for SUL2, a shared auth domain for SUL3, and a couple of regular wikis for each domain.

The wikifarm has apache listening on 443 and on 80. Most browsers these days will try to use port 443 automatically for connections, and that's certainly the case for Firefox, which I use for my testing. However, the domains I use, subdomains of *.test, are chosen so that requests to port 80 do not automatically get rewritten by the browser.

Having said that, I almost exclusively test via https these days; ymmv.

Envoy is set up to listen on 8443, expecting https requests. It uses the same private keys and certs as the backend wikifarm servers. To remove the extraneous 8443 from the Host: header that envoy forwards, you'll need the strip_matching_host_port attribute of your http connection manager set to true. If your setup is different, you may not need this.

I did not configure envoy to validate the upstream certs, though I could have. For a local dev environment, it doesn't matter much.
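
For anyone who does want validation, here is a minimal sketch of what a cluster's transport socket might look like. The trusted_ca path is hypothetical (it assumes a local CA cert that signed the wikifarm certs), and the sni value assumes the svc.wikipedia.test hostname used for the cluster in the config file. Note that trusted_ca on its own only checks the certificate chain; checking the name on the cert would need a subject alternative name matcher as well.

```yaml
# sketch only: would replace the transport_socket of the service-wikipedias cluster
transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
    # server name sent to the backend in the TLS handshake (assumed value)
    sni: svc.wikipedia.test
    common_tls_context:
      validation_context:
        trusted_ca:
          # hypothetical path to the CA cert that signed the backend certs
          filename: /usr/local/etc/localpki/ca.crt
```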

envoy listeners and clusters

Each domain (wikipedia.test, wiktionary.test) gets its own filter chain within a single listener, selected by matching the server name (SNI) of the incoming request. The wildcard cert and associated key for the domain go with the appropriate filter chain. I've added both domains into the Subject Alternative Name field of each cert, following our practice in production, where all top level domains are included in that field. Each domain also gets a separate route config in its filter chain, going to a separately defined cluster.

Each cluster gets the same hostname for the address; all hostnames listed anywhere in the config map to the loopback address in my /etc/hosts file. You may have things set up differently; perhaps you have multiple docker containers for a little group of servers, each with its own IP address, all resolving to the same name on a round robin basis. That would work too.

So the point is that as long as the hostname pointed to by the cluster resolves to the same group of IP addresses as the host(s) actually serving mediawiki, you're good to go.

logging requests

Each filter chain logs to a separate file, so the wikipedia.test requests from users go to one file, and the wiktionary.test requests from users go to another. The router filter in the http connection manager supports upstream logging as well (requests from envoy to the wikifarm), so I log those too, in two more files, one for requests to wikipedia.test in the wikifarm, and the other for the wiktionary.test requests.

admin interface

I didn't bother to configure an admin interface yet, though I'll likely play with that soon. I've got a few runtime variables defined, which could be tweaked via that interface, just for that purpose. The runtime config capability seems to be quite flexible.
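
A minimal sketch of how an admin interface and a runtime layer might be added at the top level of the config, alongside static_resources. This is not part of the config file below. The admin address and port and the layer names are assumptions; the runtime key is the one used for the wikipedia.test access log filter.

```yaml
# sketch only: top-level bootstrap entries, siblings of static_resources
admin:
  address:
    socket_address:
      # assumed values; bind to loopback so the interface is not exposed
      address: 127.0.0.1
      port_value: 9901

layered_runtime:
  layers:
  # starting values for runtime keys, read at startup
  - name: static_layer
    static_layer:
      global_tls_wp_min_log_code: 200
  # allows overriding values through the admin interface
  - name: admin_layer
    admin_layer: {}
```

With something like this in place, a POST request to /runtime_modify?global_tls_wp_min_log_code=400 on the admin port should raise the threshold so that only responses with status 400 and up are logged.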

Docs for various elements in this config file:

helpful hints

For checking syntax of the yaml file before feeding it to envoy, I used yq via the command cat my-envoy-config-file.yaml | yq e -o=json and it was very helpful.

For everything else, see inline comments, or the envoy docs at https://www.envoyproxy.io/docs/envoy/latest/about_docs

The config file

# notes: we use two filter chains, matched on server name (SNI), so that each domain serves its own wildcard cert to the browser
static_resources:

  # one listener on a custom port handles all upstream clusters
  listeners:

  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 8443

    listener_filters:
    - name: envoy.filters.listener.tls_inspector
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    tcp_fast_open_queue_length: 150

    filter_chains:

    # all *.wikipedia.test sites
    - filter_chain_match:
        server_names: [testen.wikipedia.test, test2en.wikipedia.test, login.wikipedia.test, meta.wikipedia.test, auth.wikipedia.test]
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          # this is mandatory, even if we don't plan on ever using the stats interface for queries
          stat_prefix: ingress_http
          strip_matching_host_port: true

          access_log:
          - name: envoy.file_access_log
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
              path: /var/log/envoy/global_tls_wikipedia.log
            filter:
              status_code_filter:
                comparison:
                  op: GE
                  value:
                    default_value: 200
                    # for adjusting this value on the fly, if we want
                    runtime_key: global_tls_wp_min_log_code

          route_config:
            name: wikipedia_routes
            virtual_hosts:

            - name: wikipedias
              domains: [testen.wikipedia.test, test2en.wikipedia.test, login.wikipedia.test, meta.wikipedia.test, auth.wikipedia.test]
              # rate limits: FIXME add later
              retry_policy:
                num_retries: 0
              routes:
              - match:
                  prefix: /
                route:
                  cluster: service-wikipedias
                  timeout: 60s

          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

              upstream_log:
              - name: envoy.file_upstream_wikipedia_log
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                  path: /var/log/envoy/global_tls_upstream_wikipedia.log
                filter:
                  status_code_filter:
                    comparison:
                      op: GE
                      value:
                        default_value: 200
                        # for adjusting this value on the fly, if we want
                        runtime_key: global_tls_wp_upstream_min_log_code

      # common cert and key for all *.wikipedia.test sites
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          common_tls_context:
            tls_certificates:
            - certificate_chain:
                filename: /usr/local/etc/localpki/wikipedia.test/pki/certs/wikipedia.test.crt
              private_key:
                filename: /usr/local/etc/localpki/wikipedia.test/pki/private/wikipedia.test.key

    # all *.wiktionary.test sites
    - filter_chain_match:
        server_names: [testen.wiktionary.test, test2en.wiktionary.test]
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          # this is mandatory, even if we don't plan on ever using the stats interface for queries
          stat_prefix: ingress_http
          strip_matching_host_port: true

          access_log:
          - name: envoy.file_access_log
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
              path: /var/log/envoy/global_tls_wiktionary.log
            filter:
              status_code_filter:
                comparison:
                  op: GE
                  value:
                    default_value: 200
                    # for adjusting this value on the fly, if we want
                    runtime_key: global_tls_wikt_min_log_code

          route_config:
            name: wiktionary_routes
            virtual_hosts:

            # - name: testenwikt
            - name: wiktionaries
              domains: [testen.wiktionary.test, test2en.wiktionary.test]
              retry_policy:
                num_retries: 0
              routes:
              - match:
                  prefix: /
                route:
                  cluster: service-wiktionaries
                  timeout: 60s

          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

              upstream_log:
              - name: envoy.file_upstream_wiktionary_log
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                  path: /var/log/envoy/global_tls_upstream_wiktionary.log
                filter:
                  status_code_filter:
                    comparison:
                      op: GE
                      value:
                        default_value: 200
                        # for adjusting this value on the fly, if we want
                        runtime_key: global_tls_wikt_upstream_min_log_code

      # common cert and key for all *.wiktionary.test sites
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          common_tls_context:
            tls_certificates:
            - certificate_chain:
                filename: /usr/local/etc/localpki/wiktionary.test/pki/certs/wiktionary.test.crt
              private_key:
                filename: /usr/local/etc/localpki/wiktionary.test/pki/private/wiktionary.test.key

  clusters:

  - name: service-wikipedias
    type: STRICT_DNS
    # only one host so no point in setting an lb policy 
    # lb_policy: ROUND_ROBIN
    # dns_lookup_family: V4_ONLY
    load_assignment:
      cluster_name: service-wikipedias
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: svc.wikipedia.test
                port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        common_tls_context:
          tls_params:
            tls_minimum_protocol_version: TLSv1_3
            tls_maximum_protocol_version: TLSv1_3

  - name: service-wiktionaries
    type: STRICT_DNS
    # only one host so no point in setting an lb policy 
    # lb_policy: ROUND_ROBIN
    # dns_lookup_family: V4_ONLY
    load_assignment:
      cluster_name: service-wiktionaries
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: svc.wiktionary.test
                port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        common_tls_context:
          tls_params:
            tls_minimum_protocol_version: TLSv1_3
            tls_maximum_protocol_version: TLSv1_3