Requests for comment/Content-Security-Policy
|Document status||in discussion
The purpose of this RFC is to propose using the Content-Security-Policy HTTP header on Wikimedia Wikis. Skip to tl;dr
<script>doStuff()</script>, etc). In the case of inline scripts in script tags, it also has the option to only allow them if they have a nonce attribute that matches the nonce value supplied in the HTTP header.
With these restrictions in place, it can be very difficult for an attacker who found an XSS to actually exploit it. While the current architecture of MediaWiki limits the applicability of this protection somewhat (particularly, the fact we allow users to make their own JS files, and users expect to be able to load arbitrary wikipages as JS files), I still believe that we can significantly benefit from using CSP.
CSP is of course no substitute for proper security practices like escaping. However, I hope that it can be a defense-in-depth measure that severely restricts the exploitability of escaping related bugs.
I have split this RFC into stages for different uses of the CSP header. I expect stage 0-2 to be relatively uncontroversial, while the other stages will be more controversial as they will inconvenience some gadget developers. I believe all the stages to be worthwhile, but 0-2 to be particularly important.
Potential attacks that are relevant to this discussion:
- [Persistent XSS] User discovers a way to inject into the parser an attribute like onmouseover, which should not be allowed (a surprisingly large number of XSS vulnerabilities allow the user to inject an attribute, but not new elements generally) [For the purpose of this numbering scheme, I'm not counting injecting data attributes here]
- [Persistent XSS] User discovers a method of injecting arbitrary HTML into the parser.
- [DOM XSS] User discovers a method of injecting arbitrary HTML into DOM with the help of client side scripts.
- [Persistent XSS-data] Attacker discovers a method of injecting data-mw-foo, data-ooui attributes into the parser (The sanitizer bans data attributes starting with data-mw, in order that client side js can use them as trusted input)
- [Privacy fail] A misguided administrator adds Google Analytics to MediaWiki:Common.js
- [Privacy fail] Somebody loads images/iframes/etc from an external source. Similar to the Google Analytics case, but not as severe
CSP can control what resources the browser loads, and what scripts it executes. It has two modes, report-only and enforce. I want to first enable it in report-only to see what breaks. Once there is sufficient awareness and enough scripts have been migrated, we continue to enable it in enforce mode. The aim of this feature is to make it very difficult to exploit an XSS bug in the parser or in the upload file filter.
- [stage 3-6]:
- We use CSP to ban JS from domains not controlled by Wikimedia.
<script>doStuff();</script>, inline event handler attributes like
onmouseover="doStuff()", linking to
- For inline script tags that MW needs, there is a nonce value supplied in an HTTP header. Script tags like
<script nonce="[nonce value from http header]">doStuff();</script>are still allowed to run. The nonce changes on every page view for non-Varnish users, and whenever cache is cleared for Varnish users
- On the server side, we use a regex to look for
<script>tags, and remove them from parser output. This is in order to have the benefit of CSP while still allowing custom user JS. The assumption being, it's difficult to ban every type of script loading thing, but its easy to just blacklist
Done: Stage 0 (Create CSP reporting end-point)
Content-Security-Policy (and Content-Security-Policy-Report-Only) allows one to specify a URL where the browser will send violation reports
Stage 0 was to merge https://gerrit.wikimedia.org/r/#/c/274022/ so we have an API end-point to receive these reports. This was completed in 2016.
Stage 1 (Enable CSP reporting for uploads)
Write a Varnish patch to add the following to images served from upload.wikimedia.org:
Content-Security-Policy-Report-Only: default-src 'none'; style-src 'unsafe-inline' data:; font-src data:; img-src data:; media-src data:; sandbox; report-uri https://commons.wikimedia.org/w/api.php?reportonly=1&src=image X-Content-Security-Policy-Report-Only: default-src 'none'; style-src 'unsafe-inline' data:; font-src data:; img-src data:; media-src data:; sandbox; report-uri https://commons.wikimedia.org/w/api.php?reportonly=1&src=image X-Webkit-CSP-Report-Only: default-src 'none'; style-src 'unsafe-inline' data:; font-src data:; img-src data:; media-src data:; sandbox; report-uri https://commons.wikimedia.org/w/api.php?reportonly=1&src=image
This should ensure we get a report any time something on upload.wikimedia.org tries to execute a script, or load an external resource
Stage 2 (CSP enforce for uploads)
This will stop attack #5.
If there are no unexpected consequences of stage 1, remove the report-only part of the headers from stage 1, to start enforcing CSP.
This will also make sure that any old SVGs uploaded before the current filters are in place, render in the browser much closer to how we render them, so will probably be a good move for consistency's sake too.
I think it would be a good idea to also put a
.htaccess file in the upload directory of default MediaWiki that adds these headers. This would make third-party users who use Apache more secure against malicious uploads.
Stage 3 (Prepare Wikimedia wikis for general CSP)
The next stage would be to try and enable CSP on our main sites. Before doing that, we should fix cases we already know are going to break:
- Where previously using event handler attributes in extension HTML was already strongly discouraged, it must now be banned entirely.
- We should allow
- Announce to the Wikimedia community that use of inline event handlers and
- Merge the XSS filter patch. This will remove
<script>. Thus we blacklist script tags directly on the server, and rely on CSP to filter all the other ways of loading JS. This of course means we can't really protect against DOM-based XSS.
Stage 4 (Report only CSP on testwiki)
This requires I80f6f469b.
We enable a Content-Security-Policy-Report-Only: header on testwiki. This will allow us to see what parts of MediaWiki trigger CSP, without getting overwhelmed with reports.
We could set
$wgCSPReportOnlyHeader = true;
to do this. But that might make a very long header:
Content-Security-Policy-Report-Only: script-src 'unsafe-eval' 'self' 'nonce-2jGMIpo4g6rZ+pvkjldH' 'unsafe-inline' *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org www.wikidata.org m.wikidata.org test.wikidata.org *.wikivoyage.org www.mediawiki.org m.mediawiki.org wikimediafoundation.org advisory.wikimedia.org affcom.wikimedia.org auditcom.wikimedia.org boardgovcom.wikimedia.org board.wikimedia.org chair.wikimedia.org checkuser.wikimedia.org collab.wikimedia.org commons.wikimedia.org donate.wikimedia.org exec.wikimedia.org grants.wikimedia.org incubator.wikimedia.org internal.wikimedia.org login.wikimedia.org meta.wikimedia.org movementroles.wikimedia.org office.wikimedia.org otrs-wiki.wikimedia.org outreach.wikimedia.org quality.wikimedia.org searchcom.wikimedia.org spcom.wikimedia.org species.wikimedia.org steward.wikimedia.org strategy.wikimedia.org usability.wikimedia.org wikimaniateam.wikimedia.org; default-src * data: blob:; style-src * data: blob: 'unsafe-inline'; report-uri /w/api.php?action=cspreport&format=json&reportonly=1
Hopefully header compression in HTTP/2 will reduce the impact of the header length. Nonetheless it is very long. We want to ban loading scripts from domains like Google for privacy reasons, however given that users can create their own script pages, there's probably not much security benefit to being so strict which Wikimedia sub-domains to load from. So instead, lets just include *.wikimedia.org
$wgCSPReportOnlyHeader = [ 'includeCORS' => false, 'script-src' => [ '*.wikipedia.org', '*.wikinews.org', '*.wiktionary.org', '*.wikibooks.org', '*.wikiversity.org', '*.wikisource.org', 'wikisource.org', '*.wikiquote.org', '*.wikidata.org', '*.wikivoyage.org', '*.mediawiki.org', '*.wikimedia.org', 'wikimediafoundation.org', ] ];
Which makes a header like:
Content-Security-Policy-Report-Only: script-src 'unsafe-eval' 'self' 'nonce-ukjJjdMn5uApeN2LKiqN' *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org *.wikidata.org *.wikivoyage.org *.mediawiki.org *.wikimedia.org wikimediafoundation.org 'unsafe-inline'; default-src * data: blob:; style-src * data: blob: 'unsafe-inline'; report-uri /w/api.php?action=cspreport&format=json&reportonly=1
Which is still sort of long, but significantly shorter.
If we're willing to break some things, we could make the assumption that most cross wiki user scripts are either on a different language edition of the same project, at commons (e.g. HotCat), or at meta, and simply whitelist *.wikimedia.org and *.<currentproject>.org. I'm not sure how many things that would break. I'm not sure if this is worth the back-compat cost, but if we're willing to accept more breakage, we can shorten things down to:
Content-Security-Policy-Report-Only: script-src 'unsafe-eval' 'self' 'nonce-ukjJjdMn5uApeN2LKiqN' *.wikipedia.org *.wikimedia.org 'unsafe-inline'; default-src * data: blob:; style-src * data: blob: 'unsafe-inline'; report-uri /w/api.php?action=cspreport&format=json&reportonly=1
[Assuming we're on en.wikipedia.org. If on different project replace *.wikipedia.org with *.<projectname>.org. *.wikimedia.org is always needed as meta has a special role, and probably also loginwiki. We could even kill the *.<projectname>.org if we really wanted to]
So what does this do? It tells the browser to report if any of the following happens
- Loading a script from a non-Wikimedia controlled domain. (Attack #6)
- an inline event handler (attribute starting with on) is executed
- A script tag that has neither a src attribute, or a nonce attribute is executed. e.g.
<script>doStuff()</script>. More on the nonce attribute thing below
The actual header can be parsed as follows:
script-src 'unsafe-eval' 'self' 'nonce-ukjJjdMn5uApeN2LKiqN' *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org *.wikidata.org *.wikivoyage.org *.mediawiki.org *.wikimedia.org wikimediafoundation.org 'unsafe-inline'
unsafe-evalmeans that you can use
$.globalEval()in JS code. This is needed for RL module caching. Most uses of eval in core are sane (with the exception of jquery.ui.datepicker). The main benefit to disabling eval would be to prevent users from doing stupid things in gadgets, since eval is easy to misuse. However, we need it, so it's still enabled.
'self'means we can load JS from ourselves. Kind of repetitive since we list basically every Wikimedia domain, but good as a fallback just in case.
'nonce-ukjJjdMn5uApeN2LKiqN'means that inline script tags (
<script>doStuff();</script>) are not allowed. But
<script nonce="ukjJjdMn5uApeN2LKiqN">doStuff()</script>is allowed. It also causes
- The domains obviously allow loading scripts from those domains.
'unsafe-inline'is a fallback line for browsers that don't support
'nonce-foo'. Browsers supporting nonce should ignore it.
style-src * data: blob: 'unsafe-inline';
- For style, we allow CSS from everything.
blob:is probably not needed here strictly speaking, but it doesn't seem to hurt anything.
unsafe-inlinemeans we allow
styleattributes on elements.
default-src * data: blob:;
- For everything else (images,
<video>, iframe, etc). We allow all domains,
blob:is needed for Special:Upload and UploadWizard)
Note: This uses some CSP2 features, so we only use the modern Content-Security-Policy header (not the X- variants) which has less browser support compared to stage 0-2. This should work on modern Firefox, Chrome and Opera.
Todo: Tyler suggested adding a meta tag with a more restrictive policy after page load. This would help mitigate the fact that anon users get repeated nonces, and also help provide better protection for browsers that don't support CSP2. Need to investigate this.
Stage 5 (Enforce CSP on testwiki, report on others)
At this point we start enabling report-only mode on other wikis (perhaps starting with smaller wikis first so we don't overwhelm ourselves with reports). We start working with users to fix their scripts. It is expected that this stage will take a long time.
- how to differentiate personal scripts and site-wide situations (like Common.js or a default gadget),
- This will be fairly easy - If in a half hour period there are > 10,000 reports, probably a sitewide script. The reports also include line number and file name (however that's less useful in the context of non-debug mode RL) -bawolff
- For privacy violation aspects, that probably won't come to much later, once we have the main script security violations figured out. Initial deployment will be hard enough with just the anti-XSS features. But once that eventually happens, I imagine someone (like me) will be monitoring the log, and if action is needed, will forward it to the appropriate people. -bawolff
- how to communicate with developers/local administrators for the rest.
loginwiki might also be a good first candidate to start enforcing CSP on.
Stage 6 (Actually enforce CSP)
Start enforcing on some wikis as popular gadgets get fixed. I expect this will not happen all at once, but gradually we're notice things like - hmm, frwikisource stopped sending CSP reports, let's enable it there, etc.
Once we've done the first 6 stages, we should have mitigated attacks 1,2,5,6 for all users, and attack #3 totally for logged in users, and sort of partially for logged out users. Leaving just #4 and #7.
Stage 7 (CSP to prevent tracking pixels, etc.)
For privacy reasons, we could start enforcing CSP for other resources (
<img>, iframe, etc). The idea being, even in the event of an XSS (or just unfiltered style attribute), the attacker should not be able to leak things like IP addresses of our users to a third party, or embed tracking images with third party cookies.
This will fix attack #7. At least against a naive attacker or someone just misguided and not malicious. There are still side channel attacks that a sophisticated attacker could probably use.
The header would have to look like:
Content-Security-Policy: script-src 'unsafe-eval' 'self' 'nonce-1yaAqSy3ScolhtfIS+Kz' *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org *.wikidata.org *.wikivoyage.org *.mediawiki.org *.wikimedia.org wikimediafoundation.org 'unsafe-inline'; default-src 'self' data: blob: *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org *.wikidata.org *.wikivoyage.org *.mediawiki.org *.wikimedia.org wikimediafoundation.org; style-src 'self' data: blob: *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org *.wikidata.org *.wikivoyage.org *.mediawiki.org *.wikimedia.org wikimediafoundation.org 'unsafe-inline'; report-uri /w/api.php?action=cspreport&format=json
Stage 8 (Option in the installer?)
Eventually I think it would be a good idea to add CSP as an option in the installer for security-conscious third-party users. However I don't think it should be enabled by default since it breaks many legitimate use cases for third-parties.
Potential solutions to data-mw attributes (Attack #4)
The largest hole in this scheme is the
data-ooui prefix attributes. The MediaWiki sanitizer bans attributes starting with those prefixes in order to allow trusted input to be given to the client side. Some parts of MediaWiki use
data-mw to mark interface elements. An attacker forging them could potentially confuse the user into doing something they don't want, or mess up the interface. Older versions of OOJS allowed constructing links to
data-ooui attribute. New versions of OOJS don't have this issue and CSP would ban the
One idea I had to mitigate this issue which I would like feedback on, is to do something similar to the nonce attribute for script tags. We could add a per page nonce variable to (the client side) mw.config.get(). When we want to use a trusted data attribute, we would add an attribute like
data-mw-nonce="<number of data attributes>;<nonce>" Then instead of
.data(), client side scripts could use some new function
.verifiedData(). This code would look for the data-mw-nonce attribute, and only return the other data attributes if the number of data attributes, and the nonce value, is correct. The nonce would change everytime the page is parsed.
Thus in order to inject an evil data attribute, one would have to have an XSS directly in the middle of an existing element containing a data-mw-nonce attribute. And one would have to be able to do more than just add attributes, since the number of data attributes is included in the nonce (e.g. If someone only forgot to escape quote marks, that's still no good, as that only allows you to add additional attributes, not end the element early).