Requests for comment/Unified Zero design

Background
ESI fragments may not work, so we are examining tags as an alternative.

Please review Requests_for_comment/Zero_Architecture and Requests_for_comment/Data-driven_Zero_Varnish_Configuration to become familiar with the challenge.

Problem
Wikipedia Zero webpages are served to users on mobile devices with participating mobile carriers. The number of Wikipedia Zero cached pages, in excess of non-Wikipedia Zero mobile-formatted pages, is roughly:

cached_pages = 0
foreach carrier c:
    cached_pages += c.one_or_two_subdomains_from_m_or_zero_subdomains * c.num_languages_supported

Each carrier supports zero-rating of .zero.wikipedia.org, .m.wikipedia.org, or both. A carrier either supports a customized list of up to ten free languages or supports all languages.
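As a rough illustration of the estimate above, here is a minimal Python sketch. The Carrier class, its field names, and the two sample carriers are hypothetical and are not taken from any real carrier configuration; the arithmetic is the same sum as in the pseudocode.

```python
from dataclasses import dataclass


@dataclass
class Carrier:
    name: str        # hypothetical label, for illustration only
    subdomains: int  # 1 if only zero. or only m. is zero-rated, 2 if both
    languages: int   # number of languages the carrier supports


def extra_cached_pages(carriers):
    """Sum subdomains * languages over all carriers, as in the pseudocode."""
    return sum(c.subdomains * c.languages for c in carriers)


# Made-up sample data.
carriers = [
    Carrier("carrier-a", subdomains=2, languages=10),
    Carrier("carrier-b", subdomains=1, languages=3),
]
print(extra_cached_pages(carriers))  # 2*10 + 1*3 = 23
```

The result is a multiplier on cache variants, so each additional carrier that supports both subdomains and many languages adds to it linearly.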

The amplification of cached pages means more hits at the origin servers than desired, and therefore slower page loads for Wikipedia Zero users. Furthermore, the current page caching scheme makes it hard to distinguish Wikipedia Zero cached pages from non-Wikipedia Zero cached pages. This is a problem when the Wikipedia Zero team wants to purge the cache to modify aspects of the Wikipedia Zero experience without affecting the rest of the Wikipedia mobile-formatted experience.

Varnish logic
For all mobile traffic (both ZERODOT and MDOT), set X-CS2 if the request comes from a carrier network (already implemented):
 * If the request is HTTPS, use the top value of X-Forwarded-For
 * If the IP matches ANY proxy, use the top value of X-Forwarded-For
 * If the IP matches ANY carrier, set req.http.X-CS2 = carrier ID


 * In vcl_deliver, append req.http.X-CS2 to resp.http.X-Analytics
 * All m. and zero. responses are varied on the X-CS and X-SUBDOMAIN headers
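The tagging steps above can be sketched in Python rather than VCL. This is only an illustration under stated assumptions: the proxy and carrier ranges are made-up documentation addresses, carrier_id is a hypothetical helper, and "top value" of X-Forwarded-For is assumed here to mean the first (leftmost) entry.

```python
import ipaddress

# Hypothetical ranges; the real proxy and carrier lists are not given here.
PROXY_NETS = [ipaddress.ip_network("198.51.100.0/24")]
CARRIER_NETS = {
    "carrier-a": [ipaddress.ip_network("203.0.113.0/24")],
}


def _in_any(ip, nets):
    """True if ip falls inside any of the given networks."""
    return any(ip in net for net in nets)


def _top_xff(xff):
    # Assumption: "top value" is the first (leftmost) X-Forwarded-For entry.
    return ipaddress.ip_address(xff.split(",")[0].strip())


def carrier_id(client_ip, xff=None, is_https=False):
    """Return the carrier ID to put in X-CS2, or None if no carrier matches."""
    ip = ipaddress.ip_address(client_ip)
    if is_https and xff:
        ip = _top_xff(xff)
    if _in_any(ip, PROXY_NETS) and xff:
        ip = _top_xff(xff)
    for cid, nets in CARRIER_NETS.items():
        if _in_any(ip, nets):
            return cid
    return None


print(carrier_id("198.51.100.5", xff="203.0.113.9, 10.0.0.1"))  # carrier-a
```

In this sketch X-Forwarded-For is trusted only for HTTPS requests or when the connecting address is a known proxy; for any other request the header is ignored.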

Banner image generation

 * Banner is included with
 * Banner is rendered as a short, non-customizable image for the specific carrier in format "free from ". The <img> tag will not set a width, because banner widths will differ
 * If the request is not zero traffic according to the carrier's settings, a 1px x 1px image is returned
 * For ZERODOT on a partner network:
 ** The banner response is a RED WARNING if lang.zero is not supported; the article content still comes back, though
 ** Otherwise, it is the normal banner
 * For ZERODOT on a non-partner network:
 ** Show an UNCACHED error with the IP address
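The banner rules above can be read as a small decision function. The following Python sketch is one possible reading, not the actual implementation: the Banner enum, the choose_banner helper, and the carrier dictionary layout are all hypothetical, and languages=None is assumed to stand for "all languages".

```python
from enum import Enum


class Banner(Enum):
    NORMAL = "normal"
    RED_WARNING = "red-warning"
    PIXEL = "1x1"
    UNCACHED_ERROR = "uncached-error"


def _zero_rated(carrier, subdomain, lang):
    """True if the carrier zero-rates this subdomain and language."""
    langs = carrier["languages"]  # None is assumed to mean all languages
    return subdomain in carrier["subdomains"] and (langs is None or lang in langs)


def choose_banner(subdomain, carrier, lang):
    """subdomain is "zero" or "m"; carrier is None on a non-partner network."""
    if subdomain == "zero":
        if carrier is None:
            return Banner.UNCACHED_ERROR  # shown together with the IP address
        if not _zero_rated(carrier, "zero", lang):
            return Banner.RED_WARNING     # article content is still served
        return Banner.NORMAL
    # m. traffic: anything that is not zero traffic gets the 1px x 1px image.
    if carrier is None or not _zero_rated(carrier, "m", lang):
        return Banner.PIXEL
    return Banner.NORMAL


# Hypothetical carrier that zero-rates only zero. in English and French.
carrier_a = {"subdomains": {"zero"}, "languages": {"en", "fr"}}
print(choose_banner("zero", carrier_a, "de"))  # Banner.RED_WARNING
```

Keeping the decision in a single function makes it easy to see that only the non-partner ZERODOT case needs an uncached response.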

TBD

 * image library i18n exotic character support
 * Image width/height JS/CSS and old HTML support
 * For smartphones, what is the best way to override banner just-in-time
 * Avoid FOUC - Flash of Unstyled Content
 * Prevent downloading of unneeded banner
 * Acceptable stats deviations from reality
 * Acceptability of "one" banner format per language
 * Whether it is okay on ZERODOT to show article content but with a red banner. This is still relatively low bandwidth and should be a rare occurrence.
 * Any impacts on existing app APIs? Don't think so off top of head, but need to check

Analytics
Since we are switching from accurately tagging only applicable traffic for whitelisted domains to tagging all of the carrier's traffic, X-ANALYTICS will contain an X-CS value more frequently. This means that we will not look at the total M. + ZERO. traffic, but rather at what the carrier is actually whitelisting. Since the difference between ZERO. traffic and combined ZERO. and M. traffic is negligible for most carriers, the change in the numbers is negligible as well. If a carrier only whitelists ZERO., we will need to look only at the ZERO. subgraph, not the total. Also, if a carrier only whitelists common languages on M., the graph will be slightly inflated by non-whitelisted M. languages; but since those are not frequently used, the difference should not have a high impact. Alternatively, could the MapReduce routines be coded to count only eligible subdomain traffic?
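To make the last question concrete, here is a minimal Python sketch of counting only eligible traffic. It is not the existing MapReduce code: the WHITELIST layout, the record fields, and the sample records are hypothetical, and languages=None is assumed to mean that all languages are whitelisted.

```python
from collections import Counter

# Hypothetical whitelist: subdomains and languages each carrier zero-rates.
WHITELIST = {
    "carrier-a": {"subdomains": {"zero"}, "languages": None},
    "carrier-b": {"subdomains": {"zero", "m"}, "languages": {"en", "fr"}},
}


def is_eligible(record):
    """True if the tagged request hits a whitelisted subdomain and language."""
    cfg = WHITELIST.get(record["carrier"])
    if cfg is None:
        return False
    if record["subdomain"] not in cfg["subdomains"]:
        return False
    return cfg["languages"] is None or record["lang"] in cfg["languages"]


def count_eligible(records):
    """Count eligible requests per carrier, ignoring everything else."""
    return Counter(r["carrier"] for r in records if is_eligible(r))


# Made-up log records carrying the X-CS value, subdomain, and language.
records = [
    {"carrier": "carrier-a", "subdomain": "zero", "lang": "sw"},
    {"carrier": "carrier-a", "subdomain": "m", "lang": "sw"},  # m. not whitelisted
    {"carrier": "carrier-b", "subdomain": "m", "lang": "en"},
    {"carrier": "carrier-b", "subdomain": "m", "lang": "de"},  # language not whitelisted
]
print(count_eligible(records))
```

A filter like this would remove the inflation from non-whitelisted subdomains and languages, at the cost of keeping the analytics job in sync with each carrier's configuration.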