Requests for comment/Server-side Javascript error logging

Providing a stack trace with any error report has been standard practice for the last few decades, and greatly reduces the time and effort needed to pinpoint the source of a bug. However, in the world of frontend development such support is still unusual, and for MediaWiki development it is missing completely. This is a proposal to change that.

The task of making a Javascript error available to the developers, without involving the user on whose browser the error happens, can be split into four parts which are mostly unrelated:
 * how to write a Javascript error handler which can obtain all the information (message, location, stacktrace etc) provided by the browser when an error happens
 * how to get this information to the server where it will be processed
 * how to process it on the server-side to make it maximally useful for developers
 * how to display it/make it available

Catching the error
There are two ways to catch a Javascript error: try/catch and window.onerror. Exception handling is superior in multiple ways, but has to be added to the code manually or via some sort of automated code generation. (How an exception will be caught by try/catch is not in the scope of this RfC, but providing an easy way to log such exceptions is in scope.)

window.onerror, on the other hand, is meant to be added globally, without modifying application code, but has its shortcomings:
 * It does not include column numbers on older browsers, which is problematic for minified code. (Although this is a problem for try/catch exception handling as well.)
 * The exception object or stack trace is also not available on older browsers. The WHATWG HTML5 standard includes these parameters in the onerror parameter list, and recent Chrome and Firefox provide access to the stack trace in window.onerror; Safari and IE do not, although there are hacks for the latter.
 * If the script was loaded from a different domain (which is almost always the case for WMF sites), the browser hides all error details as a security measure; only a non-specific error message such as "Script error." is passed. Most recent browsers (at least Chrome, Firefox and WebKit) allow opting in to show this information via CORS, by setting an Access-Control-Allow-Origin HTTP header on the script resource and adding a crossorigin attribute to the <script> tag.
 * The CORS standard requires a CORS resource failure to be handled as a network error, which means that whenever the crossorigin attribute is specified but the HTTP header is somehow missing, script loading will break completely.
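
A global handler along these lines might look like the following. This is only a minimal sketch: installErrorHandler and the report callback are hypothetical names, not part of MediaWiki, and the handler takes the window-like object as a parameter so that it can be exercised outside a browser. It records the column and stack trace only when the browser supplies them, which matches the limitations listed above.

```javascript
// Sketch only: `installErrorHandler` and `report` are hypothetical names.
// `target` is a window-like object; in a browser this would be `window`.
function installErrorHandler(target, report) {
  var previous = target.onerror;
  target.onerror = function (message, url, line, column, error) {
    report({
      message: String(message),
      url: url || null,
      line: typeof line === 'number' ? line : null,
      // Older browsers pass neither the column nor the error object.
      column: typeof column === 'number' ? column : null,
      stack: error && error.stack ? String(error.stack) : null
    });
    // Keep any handler that was installed before this one working.
    if (typeof previous === 'function') {
      return previous.apply(this, arguments);
    }
    // Returning false leaves the browser's default error handling in place.
    return false;
  };
}

// Usage in a browser (skipped in other environments):
if (typeof window !== 'undefined') {
  installErrorHandler(window, function (details) {
    // Hand `details` to whatever transport is chosen.
  });
}
```

Code that already uses try/catch could call the same report callback directly with the caught exception, so both paths produce the same kind of record.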

To make it easy to connect stack traces with error reports, the error catching script should also generate an error id which can be displayed to the user by the application. This would be some sort of hash generated from the error details (message + filename + position, or maybe just filename + position to avoid duplicates based on language / browser version).
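
One way to derive such an id is sketched below. It takes the "filename + position" variant and hashes it with an FNV-1a-style function over UTF-16 code units; the choice of hash, the 8-digit hex format and the names fnv1a and errorId are all assumptions made for illustration, not something this RfC prescribes.

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units (illustrative choice of hash).
function fnv1a(text) {
  var hash = 0x811c9dc5;
  for (var i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Math.imul keeps the multiplication within 32 bits; >>> 0 makes it unsigned.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hypothetical helper: builds the id from filename + position only, so that
// localized or browser-specific messages do not produce different ids.
function errorId(details) {
  var key = [details.url, details.line, details.column].join(':');
  // Left-pad to a fixed width of 8 hex digits.
  return ('0000000' + fnv1a(key).toString(16)).slice(-8);
}
```

Because the message is left out of the key, the same failure reported in two interface languages gets the same id, at the cost of merging distinct errors that happen to be raised from the same position.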

Sending the error to the server
There should be a way to transfer error data to the server that is simple to set up and suitable for most MediaWiki installs; WMF, with its huge traffic, probably needs something more complex.

The generic solution could simply be an AJAX request to an API endpoint (maybe with some sort of throttling), which then uses standard logging with a reserved channel; the site operator can configure where that channel goes in the usual way. (This assumes that the structured logging RFC will be implemented.)
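
The client-side throttling could be as simple as a fixed-window counter, sketched here. createThrottledSender is a hypothetical name, the limits are arbitrary, and the send callback stands in for whatever AJAX call would post to the API endpoint (which does not exist yet). The clock is injectable so the behavior can be checked without waiting.

```javascript
// Sketch only: allows at most `maxPerWindow` reports per `windowMs` milliseconds
// and silently drops the rest. `now` defaults to Date.now.
function createThrottledSender(send, maxPerWindow, windowMs, now) {
  now = now || Date.now;
  var windowStart = now();
  var sent = 0;
  return function (details) {
    var t = now();
    if (t - windowStart >= windowMs) {
      // A new window has started; reset the counter.
      windowStart = t;
      sent = 0;
    }
    if (sent >= maxPerWindow) {
      return false; // dropped
    }
    sent++;
    send(details);
    return true; // forwarded
  };
}
```

Throttling on the client only protects against a single page flooding the server; it does nothing when many users hit the same error at once, which is why the server side would still need its own limits.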

The WMF solution will need to be able to handle huge traffic. (If things break badly, every single pageview could trigger an error. If things break really badly (e.g. error in a mousemove handler), every pageview might generate hundreds of them.) This could be done by some EventLogging-ish setup (possibly with some sort of throttling or sampling): add an error.gif file to the DOM (TODO: could GET length limits be problematic for huge stack traces?), have Varnish send a UDP packet with the error details, URL, user agent etc., and process it with some script that handles lots of incoming connections well. (EventLogging uses twisted(?); the other obvious alternative would be to use node.js with something like bunyan).
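
The sampling and the GET length concern could be handled on the client roughly as follows. This is a sketch under assumptions: the parameter names message and stack, the helper names, and the idea of dropping trailing stack frames until the URL fits are all illustrative, and the actual safe length limit would have to be determined for the real setup.

```javascript
// Returns true for roughly `rate` (0..1) of calls. `random` is injectable for testing.
function shouldSample(rate, random) {
  return (random || Math.random)() < rate;
}

// Builds the beacon URL, dropping stack frames from the end of the trace
// until the URL fits into `maxLength` characters. If it still does not fit
// with an empty stack, the over-long URL is returned unchanged.
function buildBeaconUrl(baseUrl, details, maxLength) {
  var frames = details.stack ? details.stack.split('\n') : [];
  for (;;) {
    var url = baseUrl +
      '?message=' + encodeURIComponent(details.message) +
      '&stack=' + encodeURIComponent(frames.join('\n'));
    if (url.length <= maxLength || frames.length === 0) {
      return url;
    }
    frames.pop(); // drop the last (outermost) frame and try again
  }
}

// Usage in a browser (skipped elsewhere); the 1% rate and 2000-character
// limit are placeholder values.
if (typeof window !== 'undefined' && shouldSample(0.01)) {
  new Image().src = buildBeaconUrl('/error.gif', { message: 'example', stack: '' }, 2000);
}
```

Dropping frames from the end keeps the innermost frames, which are usually the most useful for locating the bug.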

Processing
Before storing:
 * Most errors will reference minified files - we can ask users to reproduce in debug mode (blehh) or use source maps to reconstruct. For LESS etc. source maps would be preferable anyway. (TODO what are the existing tools for this?)
 * The same script or some later post-processing could try to figure out which groups an error is in (e.g. which extension owns the file) and ping graphite so we have nice error frequency stats.
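
The grouping step might be sketched like this. It assumes that, after source map resolution, file paths contain an extensions/<Name>/ segment for extension code, and it lumps everything else together as "other"; the jserror metric prefix is made up. The output line uses Graphite's plaintext format (metric path, value, Unix timestamp in seconds).

```javascript
// Hypothetical helper: guesses the owning group from the resolved file path.
function owningGroup(filePath) {
  var match = /(?:^|\/)extensions\/([^\/]+)\//.exec(filePath);
  return match ? 'extension.' + match[1] : 'other';
}

// Formats one data point in Graphite's plaintext protocol.
function graphiteLine(metric, value, timestampSeconds) {
  return metric + ' ' + value + ' ' + timestampSeconds + '\n';
}

// Example: one error counted against the owning extension.
var exampleLine = graphiteLine(
  'jserror.' + owningGroup('/w/extensions/VisualEditor/modules/ve.js'),
  1,
  Math.floor(Date.now() / 1000)
);
```

A real implementation would need a better mapping for core modules, gadgets and user scripts, none of which this path-based guess can distinguish.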

After storing:
 * We need to deduplicate errors if we want to get any useful overview:
   * The same error can occur on multiple pages, multiple sites, normal vs. debug mode, possibly multiple resource URLs due to different batching of files in ResourceLoader. (Source maps solve this issue, but not all browsers return column numbers.)
   * Error messages might vary due to i18n and browser differences. (Discard non-English to make it manageable? Ignore message, rely on file/line info?) Besides deduplication, we want to make sure that developers see the English message (also log browser language?).
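
The "ignore message, rely on file/line info" option could be sketched as below. The record shape (file, line, message, lang) and the name deduplicate are assumptions for illustration; file and line are meant to be the source-mapped original location, and lang is the browser language that the proposal suggests logging.

```javascript
// Sketch only: groups reports by original file + line, counts them, and
// prefers an English message as the representative one when available.
function deduplicate(reports) {
  var groups = Object.create(null);
  reports.forEach(function (r) {
    var key = r.file + ':' + r.line;
    var group = groups[key];
    if (!group) {
      group = groups[key] = { key: key, count: 0, message: null, messageLang: null };
    }
    group.count++;
    // Keep the first message seen, but replace it once an English one turns up.
    if (group.message === null || (r.lang === 'en' && group.messageLang !== 'en')) {
      group.message = r.message;
      group.messageLang = r.lang;
    }
  });
  return Object.keys(groups).map(function (key) {
    return groups[key];
  });
}
```

Grouping on file and line alone merges different errors raised on the same line, so column numbers would be worth adding to the key wherever browsers report them.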

Displaying
Send results to ElasticSearch, set up a Kibana frontend to it? (This is how backend errors are handled.)

Due to security and privacy issues, the availability of this has to be strongly limited, but knowing about these errors would be very useful for many people who work with JS (gadget maintainers, site admins changing MediaWiki:Common.js etc); publish some sort of stats?

Non-MediaWiki examples
Free software:
 * stacktrace.js - Javascript library for obtaining the stacktrace
 * TraceKit - Javascript library for obtaining the stacktrace
 * Sentry - full-stack error logging service (not JS-specific, but supports JS)
 * jsErrorLog - full-stack JS error logging service
 * ErrorBoard - simple full-stack JS error logging service

Commercial / SaaS: TrackJS, bugsense, JSLogger, Qbaka, Muscula, errorception, ExceptionHub, Bugsnag, Exceptional, Airbrake, Raygun, RollBar

Related bugs

 * : same idea, but WMF-specific

Good articles

 * JS stacktraces. The good, the bad, and the ugly.
 * Error Object Compatibility Table
 * Column numbers in Firefox