Topic on Talk:Requests for comment/CentralNotice Caching Overhaul - Frontend Proxy

Node.js as frontend proxy

Mark Bergsma (talkcontribs)

I mostly like this proposal, and think it's the right way to go about it. I do however disagree with the idea of choosing Node.js as the frontend proxy. I would much rather see Varnish be the frontend proxy, with the CentralNotice banner selection code being a backend or plugin to that - even if it means that every request needs to pass through it anyway.

From an Operations management perspective, using Node.js as frontend proxy instead of one of our existing solutions is not a great idea:

  • Even though its performance and scalability are not bad, they're not as good as those of the existing solutions we use (Varnish, nginx)
  • Having an additional piece of software as a frontend proxy means that we need to integrate it into our solutions for logging, analytics, statistics, alerting, configuration management, IP ACLs & banning, etc., which is quite a lot of work on top of developing the proxy itself
  • We have far less expertise in dealing with performance and reliability problems in Node.js
  • Node.js would also need to work with whatever solution we use for SSL termination - e.g. X-Forwarded-For and X-Forwarded-Proto handling with nginx now, possibly stud with the PROXY protocol in the future
  • We would be unable to serve these banners from the same servers/clusters (e.g. bits or text), which would otherwise be desirable to reduce hostname lookups and save hardware resources. Each additional cluster normally means about 4 servers per datacenter, and today we have 4 data centers, with more planned for the future, so the option to consolidate is always nice.

In short, I think there's more to it than simply selecting Node.js because it's a popular and nice choice for putting a little bit of logic into.

I don't necessarily think putting it in Varnish VCL is such a great idea, however. Perhaps an integrated Lua implementation (as has been talked about for Varnish, and for which a VMOD exists) could be nice, but may not be very realistic at this point. I think Varnish pass'ing all requests to a Node.js backend would be reasonable, along with ESI.
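As a rough illustration of that pass + ESI arrangement (Varnish 3-era syntax assumed; the backend host, port and URL pattern are placeholders, not an actual configuration):

    backend centralnotice {
        .host = "cn1.example.wmnet";   # hypothetical Node.js banner-selection service
        .port = "8080";
    }

    sub vcl_recv {
        # ESI sub-requests for the banner fragment go to the Node.js backend
        # and are always passed, since the selection varies per request.
        if (req.url ~ "^/banner/") {
            set req.backend = centralnotice;
            return (pass);
        }
    }

    sub vcl_fetch {
        # Process ESI on the cached HTML pages that embed an <esi:include>
        # tag pointing at the banner URL above.
        if (beresp.http.Content-Type ~ "text/html") {
            set beresp.do_esi = true;
        }
    }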

Mwalker (WMF) (talkcontribs)

I'm definitely not opposed to writing this as an extension that can then be loaded by VCL and processed. Regarding Lua: if I'm already going to be in the Varnish runtime, I'd much rather use C directly. Lua has the same drawbacks as running standalone Node with none of the advantages.

A Varnish solution would look like this: I would write a compiled C library that Varnish loads from VCL; vcl_recv would hit it in a guarded expression (like we do with geoiplookup), which would then launch a backend request based on a string passed back from the library. The library would maintain some shared memory with the data it loads from the PHP backend. That data could be purged via an HTTP call to the server, and the library would independently request new data based on an expiry timestamp provided with it. (There would be two copies so that old data could continue to be served while new data is fetched.)
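As a hypothetical sketch of the shape of that guarded expression (Varnish 3-era inline VCL C; cn_banner.h and cn_select_banner() are assumed names for the shared-memory library described above, not existing code):

    C{
        #include "cn_banner.h"   /* hypothetical header for the shared-memory library */
    }C

    sub vcl_recv {
        if (req.url ~ "^/banner/") {
            C{
                /* Ask the library, which reads the shared-memory copy of the
                   data fetched from the PHP backend, which banner to serve. */
                VRT_SetHdr(sp, HDR_REQ, "\011X-Banner:",
                           cn_select_banner(), vrt_magic_string_end);
            }C
            /* The backend request is then keyed on that string. */
            return (pass);
        }
    }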

I'm a little bit confused though; are you saying you'd rather have the bits/text Varnishes pass the request back to a dedicated server and not have any logic on the frontend Varnish? Or are you saying that you'd accept a solution like I described above that lives on the frontend Varnish? I personally would rather have the request be passed back to a dedicated backend so that any code issues we have will not affect larger cluster stability.

If the former, you seem to be presenting an argument for it to be a Varnish (and not a Node) backend?

In terms of development -- analytics and statistics are not something I'm terribly worried about. We already have a solution, and it would not be changed by deploying this one. In fact, these requests already pollute our current statistics (because they currently look like page requests), so not collecting them into the main pool will probably be a net gain.

Mark Bergsma (talkcontribs)

I guess the terminology here is a bit confusing.

If you want to write it as a Varnish plugin in C, it should probably be a Varnish VMOD as opposed to inline VCL C, which is fairly limited and hacky.
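For comparison, the VCL-facing side of such a VMOD could look roughly like this (the module and function names are invented for the example and don't correspond to an existing VMOD):

    import centralnotice;   # hypothetical VMOD wrapping the banner-selection library

    sub vcl_recv {
        if (req.url ~ "^/banner/") {
            # The VMOD call replaces the inline C{ }C block, keeping the VCL clean.
            set req.http.X-Banner = centralnotice.select_banner(req.http.Cookie);
            return (pass);
        }
    }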

If it's done as a backend to Varnish, it can be done with Node.js and would probably live on another server. That would likely mean each and every request gets forwarded to the backend, but on a high-performance, low-latency network that's not a very big deal.

My main concern is that it should actually be Varnish handling the frontend HTTP request, and not another (custom) solution, which would make this a lot harder for Operations to maintain while offering little advantage for development. I think I have a slight preference for this to be implemented as a backend to Varnish rather than a (VMOD/C/Lua) plugin, because it's cleaner and a bit more portable, even if just slightly less efficient. But with a well-implemented and very efficient Varnish plugin (VMOD) I'm open to having it on the Varnish boxes, too.
