Security for developers/Architecture

This guide is being developed to help lead developers, architects, and product managers make decisions that protect MediaWiki's users when developing new features.

All MediaWiki developers can follow these principles and process when developing new core features or extensions. If a developer or team is planning to have their code deployed on the Wikimedia cluster, following these guidelines will ensure the security review process is quick and requires minimal changes before deployment.

This guide interrelates with the Architecture guidelines and Performance guidelines.

Developing Securely in MediaWiki
When writing a new feature or refactoring old code in MediaWiki, it's important that you consider security throughout the process. You should:
 * Know what types of information we are trying to protect in MediaWiki
 * Before starting, assess whether this feature is a good idea
 * While designing (architecting), prefer decisions that promote secure design principles
 * For security-critical features, model the threats, and get input from security reviewers on how you will mitigate them
 * When writing code, ensure you avoid common security mistakes
 * Ensure you have browser tests for all attack surfaces, so we can automate security testing
 * Identify someone who will be responsible for addressing security vulnerabilities in the future

What are we trying to protect?
Although MediaWiki is primarily designed to allow anyone to edit, and to provide access to other users' information, there are a few things that the system attempts to protect:
 * Confidentiality of deleted & suppressed content. E.g. content in an article, edit summary, username of editor, or specific log entries that have been deleted by an admin, or suppressed by a user with suppression rights.
 * Confidentiality of data protected by the WMF privacy policy:
   * IP address or UserAgent of editors
   * Email address or password associated with an account
   * Other private data defined in our Data retention guidelines
 * Allow for deletion and suppression of any contributed content. Any content that comes from a user (actual content, or metadata such as username or edit summary) must be removable from the view of normal users (deletion) and also of admins (suppression).
 * Integrity of content, attribution and logs. Your feature should attribute content to the author, and allow neither the attribution nor the content of the contribution to be changed without further attributing the change to the user who made it. Logging information shouldn't be deleted or changed, unless there is an approval and oversight process. In general, a user should not be able to deny that they made an edit attributed to their account, nor should an admin be able to deny taking an administrative action that the logs report they took. For any actions that change the state of the wiki (or add content), ensure anti-CSRF mechanisms are used. Any contributed content must integrate with the CheckUser extension so vandalism and threats to personal safety can be appropriately investigated.
 * Prevention of site Denial of Service (DoS). Normal/anonymous users shouldn't be able to asymmetrically cause MediaWiki to do a lot of work. If a feature is easy for the user to request, but requires significant resources for the server to fulfill, the feature should be limited per user or by total instances running on the server.
 * Prevention of content DoS, e.g. vandalism and spam. All contributed content should integrate with spam prevention tools (SpamBlacklist, AbuseFilter), and spam-bot prevention tools (ConfirmEdit). New features should not allow contributions by users or IPs who have been previously blocked.
 * Prevent accounts from elevating their privileges without authorization. Your feature should protect the existing access rights to content, and your feature should not allow changing user rights outside of the existing process.
 * Although MediaWiki doesn't attempt to allow enforcing fine-grained access controls, it is especially important to ensure that your feature doesn't allow read access if the wiki is private.
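
The per-user limit described in the Denial of Service item above can be illustrated with a small fixed-window counter. This is only a sketch of the idea, written in Python for brevity rather than in MediaWiki's PHP; the class name `FixedWindowLimiter` and its parameters are hypothetical and do not correspond to any MediaWiki API.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` expensive requests per user in each time window.

    Hypothetical illustration only; a production limiter would also expire
    old windows and share its counters between servers.
    """

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._counts = defaultdict(int)  # (user, window index) -> requests seen

    def allow(self, user):
        """Return True if the user may run the expensive operation now."""
        window = int(self.clock() // self.window_seconds)
        key = (user, window)
        if self._counts[key] >= self.limit:
            return False  # over the limit: refuse instead of doing the work
        self._counts[key] += 1
        return True
```

A caller would check `limiter.allow(user)` before starting the expensive work and return an error to the user when it is refused, so a cheap request cannot trigger unbounded server effort.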

Assess your feature's security
A feature may not have any of the implementation flaws listed on Security for developers, but if it fails to protect the items listed above, then we don't want this feature running on our sites. Always ensure that the process that your feature enables isn't itself flawed.

When starting a new application or refactoring an existing application, you should consider each functional feature, and ask:
 * Is the process surrounding this feature as safe as possible? In other words, is this a flawed process?
 * If I were evil, how would I abuse this feature?
 * Is the feature required to be on by default? If so, are there limits or options that could help reduce the risk from this feature?

If your feature has specific security features (authenticates users, implements its own XSS sanitization, or similar tasks), these specific features should have their design reviewed by someone on the security team. Additionally, high-risk applications (e.g., those dealing with sensitive data, or handling new, complex file formats) should also have their design reviewed. See "Does my application need a security review" to determine if you need to review the architecture before you start implementing your feature.

Secure Design Principles
Many design principles for security features have been discussed in academia and the information security community over the years. The lists from Saltzer and Schroeder's 1975 paper and from OWASP's 2005 Developer's Guide are often cited. Although a case could be made that all code in MediaWiki should follow all of the principles from both lists, the items that we especially value and look for during review are:


 * Minimize the attack surface. For any feature (and especially security-critical features), limit the avenues of attack.
 * Bad example: In the early days of the API, each API method generated and distributed its own anti-CSRF token for state-changing operations. However, when we expanded the API to handle JSONP calls, many of the individual token handling functions weren't updated. This resulted in bug ??, in which anti-CSRF tokens were made available via JSONP in several API methods. Since JSONP allows JavaScript responses to be read from other domains, this allowed attackers to read a user's token from another website and perform CSRF attacks against them.
 * (pos) Maintenance scripts can only be run from the command line (CLI)
 * Simplicity
 * (neg) User identification
 * Secure (fail-safe) defaults. Prefer code that defaults to being most restrictive, and give administrators or developers the ability to only enable features that they need. Prefer to filter via whitelist instead of trying to blacklist insecure values.
 * (pos) Ex:RSS requires an administrator to whitelist known good feeds for inclusion in a wiki, instead of blacklisting sites that have caused problems. This limits the server's exposure to client-side vulnerabilities in curl and related libraries.
 * Least Privilege. Users of MediaWiki should be able to work with the least set of privileges necessary to complete their task. Additionally, MediaWiki should strive to function well with minimal rights on the hosting server.
 * (neg) For a long time, MediaWiki
 * (pos) MediaWiki's web installer allows the user to provide two database users. One that has privileges to create the tables during the initial installation, and one that will be used during operation which only needs basic CRUD rights on the existing tables.
 * (pos) Many useful tools run on the WMF's toollabs system, instead of being built as extensions or functionality in MediaWiki. Further, many of these tools now use OAuth, so users only grant these tools limited access rights on their behalf (for example, the image rotate tool on Commons).
 * Psychological acceptability. "It is essential that the human interface be designed for ease of use, so that users routinely and automatically apply the protection mechanisms correctly."
 * (pos) When enabling HTTPS for all logged-in sessions, we gave users the option to opt out of using HTTPS for their session
 * (neg) MediaWiki's HTML generation classes (Xml and Html) are often shunned by developers as being too complex and cumbersome to use, so they generate the HTML themselves, and sometimes introduce XSS vulnerabilities.
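
The secure-defaults principle above (filter via whitelist, and default to the most restrictive setting) can be sketched as follows. The example is in Python for brevity and is not taken from the RSS extension; `ALLOWED_FEED_HOSTS` and `is_feed_allowed` are hypothetical names, and the empty default is an assumption chosen to show a fail-safe starting point.

```python
from urllib.parse import urlsplit

# Hypothetical configuration: empty by default, so no remote feed is fetched
# until an administrator explicitly lists a host.
ALLOWED_FEED_HOSTS = frozenset()


def is_feed_allowed(url, allowed_hosts=ALLOWED_FEED_HOSTS):
    """Return True only for http(s) URLs whose host is explicitly listed."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # unparseable input is rejected, not guessed at
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname  # lowercased by urlsplit; None when missing
    return host is not None and host in allowed_hosts
```

Every branch that is not an explicit match returns `False`, so a new URL scheme or an unexpected input is refused without anyone having to anticipate it, which is the advantage over a blacklist.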

Threat Modeling
When looking at specific security considerations and controls in your feature, you also need to consider how your feature will be attacked. For each possible attack, you need to decide if the risk is worth mitigating with a technical control and how that should be done, or you can make the conscious decision to accept the risk posed by the threat. Accepted risks should be communicated to your stakeholders.

To think through the different threats that your feature will face, Adam Shostack, in Threat Modeling: Designing for Security (ISBN 1118809998), recommends first drawing a data flow diagram. The diagram represents the external actors, processes, and datastores for your feature and how the data flows between them, with trust boundaries drawn around the actors and processes that trust each other. Once this diagram is drawn, for each place where data flows across a trust boundary, consider how your feature will prevent Spoofing, Tampering, Repudiation, Information disclosure, Denial of Service, and Elevation of privileges (STRIDE).
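
As a rough illustration of the STRIDE walk-through described above, the following Python sketch lists one question per threat category for every data flow that crosses a trust boundary. The `Flow` type, the zone names, and the example feature are all hypothetical; a real model would come from your own data flow diagram.

```python
from dataclasses import dataclass

STRIDE = (
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information disclosure",
    "Denial of Service",
    "Elevation of privileges",
)


@dataclass(frozen=True)
class Flow:
    source: str
    source_zone: str  # trust boundary the source sits inside
    target: str
    target_zone: str  # trust boundary the target sits inside


def stride_checklist(flows):
    """Yield (flow, threat) pairs for each flow that crosses a trust boundary."""
    for flow in flows:
        if flow.source_zone != flow.target_zone:
            for threat in STRIDE:
                yield flow, threat


# Hypothetical feature: a browser submits data to a request handler,
# which then writes to a database inside the same trust boundary.
flows = [
    Flow("browser", "internet", "web request handler", "server"),
    Flow("web request handler", "server", "database", "server"),
]

for flow, threat in stride_checklist(flows):
    print(f"{flow.source} -> {flow.target}: how is {threat} prevented?")
```

Only the browser-to-handler flow crosses a boundary here, so it is the only one that produces questions; each question still needs a human answer, either a mitigation or an explicitly accepted risk.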

Alternatively (or in addition to STRIDE modelling), you can use MITRE's list of common attack patterns (CAPEC) to think through common attacks on your feature, and how you can mitigate each if applicable. STRIDE is often a useful mnemonic when you are initially thinking through the major ways your code can be abused, while CAPEC is a long but fairly comprehensive list of attacks and may be useful as a checklist to review your design.

Implementation
As you implement the feature and controls, you need to make sure that:


 * The controls you identified while doing the design and threat models were correctly implemented
 * The code does not allow attacks on the site's users (XSS, CSRF) or the server (SQL or Command Injection). Review Security for developers to become familiar with the most common mistakes.
 * Verify that:
   * MediaWiki's authorization structure is enforced
   * You integrate with MediaWiki's anti-spam and anti-vandalism controls (at minimum CheckUser, AbuseFilter, and SpamBlacklist)
   * If your code processes XML, you have disabled XML external entity loading (XXE Attacks)
   * If your code redirects or proxies requests, the location has been sanitized and approved via whitelist
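
The last check in the list above (redirect locations approved via whitelist) might look like the following. This is a minimal sketch in Python rather than MediaWiki's PHP; the host list, the fallback URL, and the function name `safe_redirect_target` are assumptions made for illustration, and it is not a complete defense against every URL-parsing trick.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist of hosts this feature is allowed to redirect to.
ALLOWED_REDIRECT_HOSTS = frozenset({"wiki.example.org", "files.example.org"})
FALLBACK_URL = "/"  # assumed safe local default


def safe_redirect_target(requested, allowed_hosts=ALLOWED_REDIRECT_HOSTS):
    """Return `requested` only if it is an https URL on a whitelisted host."""
    try:
        parts = urlsplit(requested)
    except ValueError:
        return FALLBACK_URL  # unparseable input never reaches the redirect
    if parts.scheme == "https" and parts.hostname in allowed_hosts:
        return requested
    return FALLBACK_URL
```

Comparing the parsed hostname, instead of searching the raw string for an approved domain, avoids being fooled by inputs such as `https://wiki.example.org@evil.test/`, where the approved name is only the userinfo part of the URL.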

Security Testing
Security critical units of code should have a comprehensive set of unit tests. Both positive and negative tests should be used (e.g., a user can be authenticated with the correct password, and the user is not authenticated with the wrong password or a blank password). The tests should show that basic, adversarial inputs are accounted for.

Most (or all?) ways that users can interact with your feature should have browser tests defined. This will help us run security scanning software to test all of the ways a user can interact with your feature for security vulnerabilities.

Ongoing Response to Security Bugs
Before your project or feature is put into production, the people or teams who will be responsible for providing security fixes in the future must be identified.

Security bugs will all be made public in Bugzilla, where you can see the types of bugs we have fixed in MediaWiki. Each bug should include all the information specified in Security bug information, so you can see what the problems were, and how we fixed them.

Resources

 * OWASP top 10 - https://www.owasp.org/index.php/Top_10_2013-Top_10
 * CWE top 25 - https://cwe.mitre.org/top25/
 * CAPEC - http://capec.mitre.org/data/definitions/1000.html
 * STRIDE - http://msdn.microsoft.com/en-us/magazine/cc163519.aspx
 * BSIMM - http://bsimm.com/