We'll be talking about this RfC on Wednesday 30 July on IRC -- please join us.
Talk:Requests for comment/CentralNotice Caching Overhaul - Frontend Proxy
See Architecture_meetings/RFC_review_2014-08-13#Meeting_summary, RfC followup section.
User:Mwalker (WMF), have you had a chance to update the RFC per the meeting from October?
<off-topic>Mentions don't work with LQT</off-topic>
Matt, would you like to sprint together on this perhaps Thursday of this week?
As Matt explained just before he updated it:
- I'm going to unify the URL to bits.wikimedia.org; Varnish will initially serve the request before forwarding it to a backend; I'll add a bit about CSP and why I don't currently care about it, with the justification that the first load is REALLY important; I'm going to add a note on where the 200GB comes from
- the only thing I haven't touched on is the performance question -- but I think gwicke and I have slaughtered that in other RfCs and we don't have much of a choice given mark's comments on the talk page (e.g. doing this in a VMOD is highly discouraged)
I agree in principle; it's not something I introduce lightly. However, having to call down a JS file first would introduce a round trip and delay, something I'm explicitly avoiding (otherwise I could just hack RL to have it be the first delivered module and be done with it.)
In theory the static JS file could be cached client-side, which would remove the RTT from the picture. However, comScore's numbers on the site indicate that the 50th-percentile user visits the site only every 17 days, so the local file would need a cache timeout of at least that, if not 30 days, at which point it's on par with page expiry.
This matters because Fundraising gets most of our donations from the first impression. We don't know if time to display is a factor for banners, but for landing pages milliseconds matter in whether a user decides to donate.
Based on these points, it still seems sanest to put it directly in the page.
The only issue is that such a script will be broken when Content Security Policy is finally implemented in core.
That's a fair point. I talked with Chris on Friday about this and he thinks that the timeline for CSP deployment is greater than nine months. I'll have to check with Mark, but I'm guessing we'll have Varnish deployed to the text cluster before then.
When I do a Wikimedia technical content search for Content Security Policy via http://hexm.de/mw-search , all I see is https://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#Removing_inline_CSS.2FJS_from_MediaWiki . That preliminary work is, I think, not done. I also don't see mention of a CSP in Wikimedia Engineering/2014-15 Goals. So I think we can work on this RfC under the assumption that we are not about to launch a Content Security Policy.
IRC meeting 2013-10-02
<gwicke> mwalker: re load, did you do further benchmarking on real hardware?
<mwalker> gwicke: no; not yet -- more on that later
<TimStarling> I am reading the CentralNotice one, I haven't seen it before
<TimStarling> btw RFCs should generally be subpages of https://www.mediawiki.org/wiki/Requests_for_comment/
<TimStarling> otherwise my scripts will get confused
<mwalker> yep yep - I started it as a brainstorming page on the extension
<mwalker> I can add a redirect to it -- would that make things work better?
<Elsie> Instead of moving it?
<TimStarling> and also having RFCs there makes sure that the page is clear about its purpose -- i.e. definitely an RFC rather than some other kind of design page
<mwalker> Elsie: I'm not opposed to moving it -- but I dont have those rights
<legoktm> mwalker: anyone can move a page
<Elsie> It's a wiki, man.
<legoktm> (if not, someone will just give you sysop rights)
<mwalker> oh hey -- moving is allowed -- don't know why I thought it wasn't
<TimStarling> so there will be a script tag in the header that provides a JSON blob
<TimStarling> how does the data from the JSON blob get into the actual page content?
<mwalker> the rest of the CentralNotice JS will be delivered via resourceloader
<TimStarling> but that part will be semi-static JS?
<mwalker> but having the banner content already available reduces the amount of time it takes to display -- and will also get rid of a round trip (to get the geoiplookup)
<Elsie> Where does the 200GB figure come from?
<mwalker> "but that part will be semi-static JS?" -- yes; the bit in the head will be as small, simple, and static as I can make it
<mwalker> "Where does the 200GB figure come from?" CN has a potential space of all projects, languages, countries, user states, buckets, and slots -- which comes to a large number which is then multiplied by the average size of a fundraising banner and varnish overhead
<mwalker> *trying to find my worksheet on that now
<TimStarling> presumably some of those dimensions would have to be fairly large
<TimStarling> is that what you mean by "worst case", that they are all as large as possible?
<TimStarling> e.g. buckets and slots, there are not always a lot of those, right?
<mark> i wonder how many we've actually got cached right now
<mwalker> so right now the space is 14 projects * ~300 languages * ~200 countries * 3 device types * 30 slots * 4 buckets * 2 user states
<mark> not entirely trivial to figure out though
<mwalker> mark: most of them :(
<mwalker> the timeout is 15minutes
<mwalker> and for everything but wikipedia they're empty
<gwicke> mwalker: I believe that your performance estimation for node might be about right, but it would definitely be good to establish a baseline on a real machine
<gwicke> I'm getting about 7k req/s on my laptop with a trivial http server
<TimStarling> so what is the reason for using a separate domain name?
<TimStarling> connection setup is expensive
<mark> so, we don't need to do that
<mark> in the discussion page I argue it shouldn't be a separate (node.js) server, but should probably just use varnish and be a backend to or a plugin of that
<mark> and then we also have the option to do this on one of the existing host names/clusters
<TimStarling> it seems pretty similar to the mobile varnish stuff
<TimStarling> the problem is that it is dynamic in various annoying ways, right?
<TimStarling> and we want to resolve that to some smaller number of cacheable objects
<mwalker> TimStarling: it was a naive suggestion thinking that I could separate the infrastructure entirely from the rest of the site so that if it goes down nothing else suffers
<mwalker> but I agree with mark that bits varnish could 'pass' a request to bits.wm.o/banners or something to a backend
<MaxSem> separate LB and varnish boxes?
<mwalker> TimStarling: and yes; your summary of the problem is correct
<TimStarling> have you considered redirecting?
<mwalker> it costs a round trip; and additionally still has to be cached
<mwalker> the round trip is important because something a lot of users complain about is the 'page bump' that happens when a banner loads
<TimStarling> I mean, have the dynamic part redirect to the static part
<mark> ESI is one of the options on the table for that
<TimStarling> presumably the way to avoid a page bump is to load early
<mwalker> ya; that's the thought process
<TimStarling> then you could use document.write()
<TimStarling> like advertising, advertising usually uses document.write(), doesn't it?
<mwalker> potentially yes -- banners are fairly dynamic though so it's not the best solution
<TimStarling> what is the difference between this and what you have proposed?
<mwalker> not much; in my original proposal I had the proxy being the machine LVS directed to; in mark's suggestion LVS directs to bits.wm.o which will vcl_pass to a backend
<mark> bits or anything else
<mark> can be a separate cluster, but doesn't have to be
<TimStarling> you are saying that JS in the head will fetch some dynamic data
<TimStarling> then a script on the client side will use this dynamic data to form a URL to request some static data
<TimStarling> then that static data will be used by another script to actually display the HTML
<TimStarling> is that fair?
<mwalker> no; everything dynamic is fetched in the first call; which is then used by a resourceloader script to actually display
<TimStarling> so the banner HTML is delivered in the first call?
<mwalker> yes -- that's the plan
<TimStarling> ok, I agree with mark that doing this in varnish would be better than doing it in node.js
<TimStarling> I think a node.js frontend server would add quite a lot of complexity
<mwalker> do you have thoughts on the backend technology?
<gwicke> I see no reason why varnish can't do the front-end, with all requests being forwarded to a node backend
<mwalker> if the backend should be written as a VMOD to eventually move into the frontend -- or if it should be a node server?
<gwicke> should keep the backend simple
<mark> I think I prefer to have it as a backend
<TimStarling> I think we should continue this on the talk page
<mark> there's not much reason to integrate it into varnish itself
<mark> other than that it needs to see every request
<mwalker> ok -- I will update the RfC and we can continue with other topics
<mark> varnish can't cache anything there
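The size of the key space mwalker quotes in the log above can be worked out directly. The following is a minimal sketch in Python: the dimension sizes are the ones given in the log (the "~" values are approximations), while the function names and the `bytes_per_object` parameter are hypothetical and only there to show how a worst-case storage estimate would follow from the count. No per-object size is assumed here, since the log does not give one.

```python
from math import prod

# Dimension sizes as quoted in the IRC log; "languages" and "countries"
# were given as approximate ("~300", "~200").
DIMENSIONS = {
    "projects": 14,
    "languages": 300,
    "countries": 200,
    "device_types": 3,
    "slots": 30,
    "buckets": 4,
    "user_states": 2,
}


def combination_count(dimensions):
    """Number of distinct cache keys if every dimension varies independently."""
    return prod(dimensions.values())


def worst_case_bytes(dimensions, bytes_per_object):
    """Worst-case storage, given a hypothetical average size per cached object
    (banner content plus cache overhead). The size is an input, not a known value."""
    return combination_count(dimensions) * bytes_per_object


if __name__ == "__main__":
    print(f"{combination_count(DIMENSIONS):,} possible cache keys")
```

With the figures from the log this gives 604,800,000 possible keys, which is why even a small per-object size adds up quickly in the worst case.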
Node.js backend vs Varnish plugin
- + Failure of the banner code does not take down the rest of cluster
  - That's a strong statement which may not be true for either case. :)
- + Faster development with fewer bugs
- + Faster deployment in case of bugs/changes
- - Requires a wrapper to be written around Maxmind's GeoIP libraries and it's another place to update that data
  - That data is already automatically updated on all servers by Puppet, so that doesn't matter.
- - Requires additional servers to be provisioned and maintained in all our data centres (for optimal latency)
  - Not necessarily, because we could choose to run them on the same servers as Varnish. With a Varnish plugin, we don't really have a choice.
- ? Node's efficiency
- ? Portability -- we will be locked to this technology
- ? Though not addressed in this RFC, with node we can run dynamic JS locally on the server that is served with a banner from the backend that can determine if it wants to display or not -- saving bandwidth. Potentially we could do the same thing in a VMOD with Lua.
  - Bandwidth is hardly a problem between these servers.
Varnish VMOD Considerations
- + Can be developed as a standalone library with VMOD/Nginx/PHP bindings for portability without changing core code
- + Can use GeoIP code already written
  - Existing GeoIP code is just a few lines calling into libgeoip, so that hardly makes a difference.
- + Can eventually reside on the frontend proxy obviating the need for additional servers once proven
  - Actually needs to reside on the Varnish frontend proxies, as opposed to a backend which can reside anywhere.
- - Bigger, more rapidly changing, list of servers that will need to be tracked by CentralNotice for purging purposes (possibly can use built in MediaWiki purge mechanism with some changes)
- - Slower to develop / deploy
- ? Likely to be faster / more memory efficient than node once optimized
  - Yes. Of course, it's also possible to use C/C++ for a backend implementation, which is an independent question. But I think Node.js is reasonable for now.
In general, writing it as a (Node.js) backend means less coupling and more flexibility. I don't see any strong drawbacks to that approach in the above.
OK; node it is to start then. :)
Node.js as frontend proxy
I mostly like this proposal, and think it's the right way to go about it. I do however disagree with the idea of choosing Node.js as the frontend proxy. I would much rather see Varnish as the frontend proxy, with the CentralNotice banner selection code as a backend or plugin to it, even if that means every request needs to pass through it anyway.
From an Operations management perspective, using Node.js as frontend proxy instead of one of our existing solutions is not a great idea:
- Even though its performance and scalability are not bad, they're not as good as those of the existing solutions we use (Varnish, nginx)
- Having an additional piece of software as a frontend proxy means that we need to integrate it into our solutions for logging, analytics, statistics, alerting, configuration management, IP ACLs & banning, etc, which is quite a lot of work on top of developing just this proxy
- We have far less expertise dealing with performance and reliability problems of Node.js
- Node.js would also need to work with whatever solution we use for SSL termination - e.g. X-Forwarded-For and X-Forwarded-Proto handling with nginx now, possibly stud with the PROXY protocol in the future
- We would be unable to serve these banners from the same servers/clusters (e.g. bits or text), which may be good to reduce hostname lookups, and save hardware resources. For each additional cluster, we're normally talking about 4 servers per datacenter. Today we have 4 data centers, with more planned for the future. Therefore, the option to consolidate is always nice.
In short, I think there's more to it than simply selecting node.js because it's a popular and nice choice to put a little bit of logic into.
I don't necessarily think putting it in Varnish VCL is such a great idea, however. Perhaps an integrated Lua implementation (as has been talked about for Varnish, and for which a VMOD exists) could be nice, but may not be very realistic at this point. I think Varnish pass'ing all requests to a Node.js backend would be reasonable, along with ESI.
I'm definitely not opposed to writing this as an extension that can then be loaded by VCL and processed. As for Lua: if I'm already going to be in the Varnish runtime, I'd much rather be using C directly. Lua has the same drawbacks as running standalone node with none of the advantages.
A Varnish solution would be: I would write a compiled library in C that Varnish loads from VCL; Varnish would call it in a guarded expression in vcl_recv (like we do with geoiplookup) and then launch a backend request based on a string passed back from the library. The library would maintain some shared memory with the data it loads from the PHP backend. The data could be purged via an HTTP call to the server, and the library would independently request new data based on an expiry timestamp provided with it. (There would be two copies so that old data could continue to be served whilst new data is fetched.)
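To make the expiry, purge and two-copy behaviour concrete, here is a minimal single-threaded sketch in Python rather than C. It is not the proposed library: the class name, the `fetch` callable (standing in for the request to the PHP backend, returning the data together with its expiry timestamp) and the choice to keep serving the old copy when a refresh fails are all assumptions made for illustration. A real implementation would use shared memory and fetch in the background.

```python
import time


class DoubleBufferedStore:
    """Serve the current copy of the data while a replacement copy is built."""

    def __init__(self, fetch, clock=time.monotonic):
        # fetch: hypothetical callable returning (data, expires_at), where
        # expires_at is on the same time scale as clock().
        self._fetch = fetch
        self._clock = clock
        self._active = None       # copy currently being served
        self._expires_at = 0.0    # 0.0 means "stale, refresh on next access"

    def _refresh(self):
        try:
            # The new copy lives only in local variables until it is complete,
            # so the old copy stays intact and servable the whole time.
            new_data, new_expiry = self._fetch()
        except Exception:
            return False          # assumption: on failure, keep serving old data
        self._active = new_data   # swap in the new copy in one assignment
        self._expires_at = new_expiry
        return True

    def get(self):
        """Return the active copy, refreshing first if it is missing or expired."""
        if self._active is None or self._clock() >= self._expires_at:
            self._refresh()
        return self._active

    def purge(self):
        """Mark the data stale, as an HTTP purge call would; next get() refetches."""
        self._expires_at = 0.0
```

Passing the clock in as a parameter keeps the expiry logic testable without waiting for real time to pass.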
I'm a little bit confused, though. Are you saying you'd rather have the bits/text Varnishes pass the request back to a dedicated server and not have any logic on the frontend Varnish? Or are you saying that you'd accept a solution like the one I described above that lives on the frontend Varnish? I personally would rather have the request passed back to a dedicated backend so that any code issues we have will not affect larger cluster stability.
If the former, you seem to be presenting an argument to have it be a Varnish (and not a node) backend?
In terms of development -- analytics and statistics are not something I'm terribly worried about. We have a solution already that would not be changed by the deployment of this solution. In fact, these requests already pollute our current statistics (because they currently look like page requests), so not collecting them into the main pool will probably be a net gain.
I guess the terminology here is a bit confusing.
If you want to write it as a Varnish plugin in C, it should probably be a Varnish VMOD as opposed to inline VCL C, which is fairly limited and hacky.
If it's done as a backend to Varnish, it can be done with Node.js and would probably live on another server. That would likely mean it gets each and every request forwarded to the backend, but on a high performance, low latency network that's not a very big deal.
My main concern is that it'll actually be Varnish handling the frontend HTTP request, and not another (custom) solution. A custom frontend makes maintaining this a lot harder for Operations, and seems to have little advantage for developing this. I think I have a slight preference for this to be implemented as a backend to Varnish rather than a (VMOD/C/Lua) plugin, because it's cleaner and a bit more portable, and only slightly less efficient. But, with a well implemented and very efficient Varnish plugin (VMOD) I'm open to having it on the Varnish boxes, too.
Migration? & minor cleanup
Migration strategy is key here, please add a section. I'd like to see a single skin change, ahead of time, which can be used to deliver banners from the existing system. Then, we can toggle back and forth between banner providers without the cache purge overhead screwing with the experiment.
It would be good to mention that the design presented here is meant to be trivially migrated to ESI when that is available, cos this is a huge win.
You are using "mincut" in a way that I have not seen before. I assume you mean the allocation mapping's image, and the preimage grouped by output element... Please give a definition! Also, the example does not demonstrate how we are compressing the codomain down to the mapping's image. As you explain in the text, there are two normalized tables: one gives criteria bits -> allocation row id, and the second is the banner allocation for that preimage. "mapstring" and "mapline" need more defining also.
I don't get this step: "2. In each map line, check if offset is set"
"name" is not a variation parameter for caching a single banner
"VCL" is the mini-language inside of varnish, so I think you mean "C library called from VCL".
- I'd like to see a single skin change, ahead of time, which can be used to deliver banners from the existing system. Then, we can toggle back and forth between banner providers without the cache purge overhead screwing with the experiment.
What is the experiment? Whether we make more money from the new system? Whether we have removed the banner bump? Whilst I agree with adding a toggle switch, we will not be able to make meaningful measurements for months: at least 2, probably more like 3. (First month for natural page cache expiration, next two for cache epoch purging.)
- It would be good to mention that the design presented here is meant to be trivially migrated to ESI when that is available, cos this is a huge win.
I would not describe this as trivial -- it still requires user-side data. We can deliver only the initial default content via ESI.
- You are using "mincut" in a way that I have not seen before
Yes, bad terminology on my part. I really meant the set of disjoint unions over allocations.
- "VCL" is the mini-language inside of varnish, so I think you mean "C library called from VCL".
You can write native C in VCL. It does not necessarily require a native library.
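The "two normalized tables" reading discussed in this thread (criteria -> allocation row id, plus one banner allocation per row) can be sketched as follows. This is only an illustration of the idea in Python, not the RfC's actual "mapstring"/"mapline" format: the function names are hypothetical, and criteria are plain tuples here rather than bit strings.

```python
def normalize(allocations):
    """Split a criteria -> allocation mapping into two tables.

    allocations: dict mapping a criteria tuple to a banner allocation
    (a dict of banner name -> weight).
    Returns (criteria_to_row, rows), where rows holds each distinct
    allocation exactly once, i.e. the image of the mapping.
    """
    row_ids = {}          # canonical form of an allocation -> row id
    criteria_to_row = {}  # table 1: criteria -> allocation row id
    rows = []             # table 2: row id -> banner allocation
    for criteria, allocation in allocations.items():
        key = tuple(sorted(allocation.items()))  # order-independent identity
        if key not in row_ids:
            row_ids[key] = len(rows)
            rows.append(dict(allocation))
        criteria_to_row[criteria] = row_ids[key]
    return criteria_to_row, rows


def preimages(criteria_to_row):
    """Group criteria by the allocation row they map to (disjoint groups)."""
    groups = {}
    for criteria, row in criteria_to_row.items():
        groups.setdefault(row, []).append(criteria)
    return groups


if __name__ == "__main__":
    # Made-up example data: (country, device) criteria and two banners.
    example = {
        ("US", "desktop"): {"bannerA": 1.0},
        ("CA", "desktop"): {"bannerA": 1.0},
        ("US", "mobile"): {"bannerA": 0.5, "bannerB": 0.5},
    }
    table1, table2 = normalize(example)
    print(table1)
    print(table2)
    print(preimages(table1))
```

In the made-up example, the two desktop criteria share one allocation row, so only two rows need to be stored for three criteria combinations.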