Analytics/Reports/Clients without JavaScript

Goal
The goal of this project is to get a rough estimate of what percentage of our page requests come from browsers with partial (or no) JavaScript support. The methodology provides a rough, not detailed, estimate. The idea is to know whether the share of such clients is smaller than 10%, 1% or 0.1%.

Methodology
Every request to any Wikimedia project is stored in Hadoop for about 30 days. Requests are segregated into "text" (requests to desktop websites), "mobile" (requests to apps and the mobile website) and "bits" (requests to our static domain, from which JavaScript and CSS are served). In addition, for every request to "text" and "mobile" we store whether the request is a pageview or not according to the new pageview definition.

Our method for the rough estimation goes as follows:

 * 1. For a time period T, get all requests to text and mobile.
 * 2. Calculate browser percentages for all those requests (let's call this 'set#1').

At the end of this step we know, for example, that "1.2% of our pageviews come from IE10". We do not take the device into account.

 * 3. For time period T, get all requests for JavaScript files to bits.
 * 4. Calculate browser percentages on the JavaScript bits data (let's call this 'set#2').
 * 5. Compare set#2 with set#1; set#1 should be a superset of set#2. Browsers that are in set#1 but do not appear in set#2 represent the set of browsers for which JavaScript is not enabled.
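
The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the request records, the `browser` field name, and the sample data are hypothetical, and the real computation runs over request logs in Hadoop rather than in-memory lists.

```python
from collections import Counter

def browser_percentages(requests):
    """Return {browser: percentage of requests} for a list of request dicts.

    Each request is assumed (hypothetically) to carry a 'browser' key.
    """
    counts = Counter(r["browser"] for r in requests)
    total = sum(counts.values())
    return {browser: 100.0 * n / total for browser, n in counts.items()}

def browsers_without_js(pageview_requests, js_requests):
    """Browsers present in set#1 (pageviews) but absent from set#2 (JS requests)."""
    set1 = browser_percentages(pageview_requests)  # steps 1 and 2
    set2 = browser_percentages(js_requests)        # steps 3 and 4
    return {b: pct for b, pct in set1.items() if b not in set2}  # step 5

# Hypothetical sample data for time period T.
pageviews = (
    [{"browser": "Chrome 39"}] * 6
    + [{"browser": "IE 10"}] * 3
    + [{"browser": "Lynx 2"}] * 1
)
js_requests = [{"browser": "Chrome 39"}] * 2 + [{"browser": "IE 10"}] * 1

no_js = browsers_without_js(pageviews, js_requests)
print(no_js)
```

With this sample, only the browser that never requests a JavaScript file from bits is reported, together with its share of pageviews.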

The time period we chose was the 2nd week of January.

Caveats
This methodology will not catch users who navigate with a modern browser (say Chrome 39) but with JavaScript turned off. To detect those, a very specific experiment is needed. We also need to choose a precision at which to report our data, as browser numbers become imprecise below a certain percentage. Since we consider some bot requests to be pageviews, our report will include bot requests and count those as clients without JavaScript.

We do not expect browser percentages on text and mobile to exactly match those on bits, as requests for static files are subject to different cache ratios than requests for main content, which, in the case of our projects, is never cached on the client side. But the ratios between browser percentages should match. For example: if we get 0.6% of our pageview requests from Chrome 39 on Mac OS X version 10.6 and 0.4% of pageviews come from Mac OS X version 10.9, the ratio 0.6/0.4 should be about the same in the browser percentages on bits.
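
The ratio check described above can be expressed as a small helper. This is a sketch, not the code used for the report: the function name, the dictionary keys, the bits percentages, and the 20% tolerance are all assumptions chosen for illustration.

```python
import math

def ratios_match(pageview_pct, bits_pct, a, b, rel_tol=0.2):
    """Check that the a/b percentage ratio is about the same in both data sets.

    rel_tol is a hypothetical tolerance; the source only says "about the same".
    """
    pageview_ratio = pageview_pct[a] / pageview_pct[b]
    bits_ratio = bits_pct[a] / bits_pct[b]
    return math.isclose(pageview_ratio, bits_ratio, rel_tol=rel_tol)

# Pageview percentages from the example in the text.
pageview_pct = {"Mac OS X 10.6": 0.6, "Mac OS X 10.9": 0.4}

# Hypothetical bits percentages: the absolute values differ because of
# caching, but the ratio between the two entries is preserved.
bits_ok = {"Mac OS X 10.6": 0.3, "Mac OS X 10.9": 0.2}
# Hypothetical bits percentages where the ratio is clearly off.
bits_off = {"Mac OS X 10.6": 0.3, "Mac OS X 10.9": 0.6}

print(ratios_match(pageview_pct, bits_ok, "Mac OS X 10.6", "Mac OS X 10.9"))
print(ratios_match(pageview_pct, bits_off, "Mac OS X 10.6", "Mac OS X 10.9"))
```

Comparing ratios rather than absolute percentages keeps the check independent of how often static files are served from a cache.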

Preliminary Results
These results use data from the 2nd week of January. They need to be double-checked with data from a different week.

The approximate total of pageview requests without JavaScript enabled is about 10%, but note that this includes bot requests. If we remove the 3 main bots we see (Bingbot, YandexBot and Googlebot), the percentage is much lower, about 3%.
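
Removing the main bots before totalling can be sketched as follows. The percentages in `sample` are made-up round numbers used only to show the mechanics, not the report's data, and the `browser_family` spellings of the bots are an assumption about how the user agents are labelled.

```python
# Browser families treated as the three main bots (spelling is an assumption).
MAIN_BOTS = {"bingbot", "YandexBot", "Googlebot"}

def total_without_bots(percentages, bots=MAIN_BOTS):
    """Sum the pageview percentages of every browser family that is not a bot."""
    return sum(pct for family, pct in percentages.items() if family not in bots)

# Hypothetical percentages of pageviews from clients that never request JS.
sample = {
    "bingbot": 6.0,
    "YandexBot": 0.5,
    "Googlebot": 0.5,
    "IE": 0.25,
    "Opera": 0.25,
}

print(sum(sample.values()))        # total including bots
print(total_without_bots(sample))  # total with the three main bots removed
```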

Details
Percentage of pageview totals for browsers that do not request javascript files on bits.

With OS Info
Note that browsers responsible for less than 0.01% of pageviews are not in this list, which is why IE6 and Opera Mini (for which there is a lot of fragmentation across OSes) do not appear. The total for browsers that do not support JavaScript (3 main bots removed) according to this list is 3.15%.

0.0101 {"os_minor": "6", "os_major": "10", "os_family": "Mac OS X", "browser_major": "454", "browser_family": "CFNetwork"}
0.0112 {"os_minor": "3", "os_major": "9", "os_family": "Symbian OS", "browser_major": "7", "browser_family": "Nokia Browser"}
0.0115 {"os_minor": "0", "os_major": "8", "os_family": "iOS", "browser_major": "4", "browser_family": "Sleipnir"}
0.0115 {"os_minor": "11", "os_major": "3", "os_family": "Linux", "browser_major": "2", "browser_family": "Python Requests"}
0.0124 {"os_minor": "-", "os_major": "-", "os_family": "Mac OS X", "browser_major": "-", "browser_family": "Safari"}
0.0138 {"os_minor": "-", "os_major": "-", "os_family": "Other", "browser_major": "7", "browser_family": "IE"}
0.0145 {"os_minor": "4", "os_major": "9", "os_family": "Symbian OS", "browser_major": "7", "browser_family": "Nokia Browser"}
0.0164 {"os_minor": "1", "os_major": "6", "os_family": "iOS", "browser_major": "609", "browser_family": "CFNetwork"}
0.0167 {"os_minor": "-", "os_major": "-", "os_family": "Nokia Series 40", "browser_major": "3", "browser_family": "Ovi Browser"}
0.0176 {"os_minor": "-", "os_major": "-", "os_family": "Windows", "browser_major": "5", "browser_family": "IE"}
0.0215 {"os_minor": "-", "os_major": "-", "os_family": "Windows", "browser_major": "4", "browser_family": "IE"}
0.0221 {"os_minor": "-", "os_major": "-", "os_family": "Other", "browser_major": "1", "browser_family": "TwitterBot"}
0.0287 {"os_minor": "-", "os_major": "-", "os_family": "Mac OS X", "browser_major": "-", "browser_family": "Other"}
0.0323 {"os_minor": "-", "os_major": "-", "os_family": "Windows 95", "browser_major": "-", "browser_family": "Other"}
0.0368 {"os_minor": "-", "os_major": "-", "os_family": "Nokia Series 40", "browser_major": "2", "browser_family": "Ovi Browser"}
0.0408 {"os_minor": "0", "os_major": "7", "os_family": "iOS", "browser_major": "672", "browser_family": "CFNetwork"}
0.0425 {"os_minor": "-", "os_major": "-", "os_family": "Red Hat", "browser_major": "-", "browser_family": "Other"}
0.0432 {"os_minor": "-", "os_major": "-", "os_family": "Windows XP", "browser_major": "-", "browser_family": "Safari"}
0.0454 {"os_minor": "0", "os_major": "7", "os_family": "iOS", "browser_major": "2", "browser_family": "bingbot"}
0.0551 {"os_minor": "-", "os_major": "-", "os_family": "Other", "browser_major": "6", "browser_family": "UP.Browser"}
0.0943 {"os_minor": "13", "os_major": "3", "os_family": "Linux", "browser_major": "2", "browser_family": "Python Requests"}
0.1101 {"os_minor": "-", "os_major": "-", "os_family": "Windows XP", "browser_major": "8", "browser_family": "Opera"}
0.1122 {"os_minor": "-", "os_major": "-", "os_family": "Windows CE", "browser_major": "4", "browser_family": "IE"}
0.1182 {"os_minor": "-", "os_major": "-", "os_family": "Windows 2000", "browser_major": "5", "browser_family": "IE"}
0.2476 {"os_minor": "0", "os_major": "8", "os_family": "iOS", "browser_major": "711", "browser_family": "CFNetwork"}
0.2718 {"os_minor": "-", "os_major": "-", "os_family": "Other", "browser_major": "9", "browser_family": "Chrome"}
0.3111 {"os_minor": "-", "os_major": "-", "os_family": "Windows 98", "browser_major": "-", "browser_family": "Other"}
0.5503 {"os_minor": "-", "os_major": "-", "os_family": "Other", "browser_major": "-", "browser_family": "YandexBot"}
0.6775 {"os_minor": "0", "os_major": "6", "os_family": "iOS", "browser_major": "2", "browser_family": "Googlebot"}
0.882 {"os_minor": "-", "os_major": "-", "os_family": "Other", "browser_major": "-", "browser_family": "Slurp"}
6.0139 {"os_minor": "-", "os_major": "-", "os_family": "Other", "browser_major": "2", "browser_family": "bingbot"}

Without OS info
The percentage of browsers without JavaScript enabled, with bots removed, is still in the same ballpark (2.47%). Note that this list reports browsers responsible for at least 0.001% of pageviews at the OS level, which makes Opera Mini visible.

0.0012 {"browser_major": "15", "browser_family": "Chrome Frame"}
0.0012 {"browser_major": "21", "browser_family": "Opera Mini"}
0.0013 {"browser_major": "2", "browser_family": "Opera Mini"}
0.0013 {"browser_major": "22", "browser_family": "Opera Mini"}
0.0014 {"browser_major": "0", "browser_family": "Maxthon"}
0.0014 {"browser_major": "25", "browser_family": "Opera Mini"}
0.0014 {"browser_major": "7", "browser_family": "Opera"}
0.0014 {"browser_major": "8530", "browser_family": "BlackBerry"}
0.0017 {"browser_major": "5", "browser_family": "Baidu Browser"}
0.0017 {"browser_major": "9700", "browser_family": "BlackBerry"}
0.0019 {"browser_major": "0", "browser_family": "Kazehakase"}
0.0019 {"browser_major": "11", "browser_family": "Opera Mobile"}
0.0019 {"browser_major": "14", "browser_family": "Opera Mobile"}
0.0019 {"browser_major": "2", "browser_family": "iBrowser"}
0.002 {"browser_major": "4", "browser_family": "SEMC-Browser"}
0.002 {"browser_major": "537", "browser_family": "WebKit Nightly"}
0.0021 {"browser_major": "0", "browser_family": "Python Requests"}
0.0021 {"browser_major": "2010", "browser_family": "Outlook"}
0.0021 {"browser_major": "720", "browser_family": "CFNetwork"}
0.0023 {"browser_major": "1", "browser_family": "K-Meleon"}
0.0024 {"browser_major": "10", "browser_family": "Opera Mobile"}
0.0027 {"browser_major": "3", "browser_family": "Nokia OSS Browser"}
0.0029 {"browser_major": "24", "browser_family": "Thunderbird"}
0.0031 {"browser_major": "0", "browser_family": "K-Meleon"}
0.0031 {"browser_major": "548", "browser_family": "CFNetwork"}
0.0035 {"browser_major": "2", "browser_family": "Lynx"}
0.0035 {"browser_major": "9", "browser_family": "Opera Mini"}
0.0036 {"browser_major": "2007", "browser_family": "Outlook"}
0.0037 {"browser_major": "-", "browser_family": "CFNetwork"}
0.0046 {"browser_major": "31", "browser_family": "Thunderbird"}
0.0053 {"browser_major": "4", "browser_family": "Ovi Browser"}
0.0078 {"browser_major": "3", "browser_family": "NetFront"}
0.0084 {"browser_major": "18", "browser_family": "Chromium"}
0.0101 {"browser_major": "454", "browser_family": "CFNetwork"}
0.0164 {"browser_major": "609", "browser_family": "CFNetwork"}
0.0207 {"browser_major": "3", "browser_family": "Ovi Browser"}
0.0221 {"browser_major": "1", "browser_family": "TwitterBot"}
0.0339 {"browser_major": "7", "browser_family": "Nokia Browser"}
0.0368 {"browser_major": "2", "browser_family": "Ovi Browser"}
0.0551 {"browser_major": "6", "browser_family": "UP.Browser"}
0.1104 {"browser_major": "8", "browser_family": "Opera"}
0.124 {"browser_major": "2", "browser_family": "Python Requests"}
0.1377 {"browser_major": "5", "browser_family": "IE"}
0.1387 {"browser_major": "4", "browser_family": "IE"}
0.2476 {"browser_major": "711", "browser_family": "CFNetwork"}
0.5503 {"browser_major": "-", "browser_family": "YandexBot"}
0.882 {"browser_major": "-", "browser_family": "Slurp"}
6.0593 {"browser_major": "2", "browser_family": "bingbot"}

What about IE6?
We do see user agents on bits for IE6, namely this one: {"browser_major":"6","os_family":"Windows XP","os_major":"-","device_family":"Other","browser_family":"IE","os_minor":"-"}. This browser is likely not identified by our code as IE6 and is therefore being served JavaScript (this is a bug). It represents about 1% of total pageviews.