Requests for comment/Encrypting DB fields

This is more of a rough idea page at this point than a solid proposal. It is definitely not ready for prime time yet, although I would of course love any thoughts and feedback.

Problem
Right now MediaWiki has very little defense in depth against data breaches. If a malicious person gets access to the DB servers or to any MediaWiki application server [as the MW user], they learn basically everything that MediaWiki has collected.

Idea
We should consider encrypting sensitive fields in the database (i.e. anything non-public, whether that's emails, CheckUser data, etc.). The encryption key should be stored at a central point (or a few central points for redundancy) that exposes a decryption service. The decryption service never exposes the key directly, but is a small trusted base responsible for decrypting strings, logging [and alerting], and maybe rate limiting. Direct access to this service would be extremely restricted (if we really want to go hardcore, I guess we could even use TPMs). This service might also be responsible for calculating brute-force-resistant hashes, where that makes sense.
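As a rough illustration of the shape such a service could take, here is a minimal Python sketch. Everything in it is hypothetical: the names `DecryptionService` and `RateLimitExceeded`, the per-caller sliding-window limit, and the log format are assumptions made for the example, not part of any existing MediaWiki component. The actual cipher is deliberately left out; `decrypt_fn` stands in for whatever vetted decryption routine closes over the key.

```python
import logging
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict

# Hypothetical audit logger; a real deployment would ship this off-host.
audit_log = logging.getLogger("decryption_service.audit")


class RateLimitExceeded(Exception):
    """Raised when one caller requests too many decryptions too quickly."""


class DecryptionService:
    """Small trusted wrapper: decrypts, logs every request, rate limits.

    The key itself lives only inside decrypt_fn and is never returned.
    """

    def __init__(
        self,
        decrypt_fn: Callable[[bytes], bytes],
        max_requests: int = 5,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._decrypt_fn = decrypt_fn
        self._max_requests = max_requests
        self._window = window_seconds
        self._clock = clock
        # Timestamps of recent requests, tracked per caller.
        self._recent: Dict[str, Deque[float]] = defaultdict(deque)

    def decrypt(self, caller: str, field: str, ciphertext: bytes) -> bytes:
        now = self._clock()
        recent = self._recent[caller]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self._window:
            recent.popleft()
        if len(recent) >= self._max_requests:
            audit_log.warning("rate limit hit: caller=%s field=%s", caller, field)
            raise RateLimitExceeded(caller)
        recent.append(now)
        # Log who asked for which field, never the plaintext.
        audit_log.info("decrypt: caller=%s field=%s", caller, field)
        return self._decrypt_fn(ciphertext)
```

The point of the design is that application servers only ever call `decrypt`, so every read of a sensitive field leaves a log entry and counts against a limit.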

I believe this approach is sometimes referred to as "crypto-anchoring".

Threats
There are a number of threats to consider. This approach would outright prevent a couple of them, but the bigger benefit is that it would convert many attacks from offline attacks into online attacks, increasing the likelihood of detection, perhaps even mid-attack. It would also ensure more foolproof logging, allowing better auditability and reconstruction after the fact. There are of course many attacks that this doesn't prevent.
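To make the offline-versus-online distinction concrete, here is a small Python sketch of a keyed hash computed by the central service. The class name `HashingService`, the choice of PBKDF2 followed by HMAC-SHA256, and the iteration count are all assumptions for illustration, not a proposal for the actual scheme. The idea is that someone holding only a database dump cannot test password guesses without the key, so every guess has to go through the (logged, rate-limited) service.

```python
import hashlib
import hmac


class HashingService:
    """Hypothetical service-side hasher; the key never leaves this object."""

    def __init__(self, key: bytes, iterations: int = 100_000) -> None:
        self._key = key
        self._iterations = iterations

    def keyed_hash(self, value: bytes, salt: bytes) -> bytes:
        # Stretch first, so each guess stays slow even if the key leaks.
        stretched = hashlib.pbkdf2_hmac("sha256", value, salt, self._iterations)
        # Then bind the result to the secret key held only by the service.
        return hmac.new(self._key, stretched, hashlib.sha256).digest()

    def verify(self, value: bytes, salt: bytes, expected: bytes) -> bool:
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(self.keyed_hash(value, salt), expected)
```

The database would store only the salt and the output of `keyed_hash`; the same input hashed under a different key gives an unrelated digest, which is what makes a stolen dump useless for offline guessing.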

Prevents

 * SQL injection to dump sensitive database fields (since they are now encrypted). This is a major one: although we haven't suffered from all that many, it is a super common vulnerability in web apps and takes pretty low skill to exploit.
 * Malicious actor gains access to a private DB backup
 * Malicious actor gains access to just the DB server (unlikely)

Mitigates
Mitigate here means that although it doesn't prevent the attack, we would now have an audit trail and maybe alerting. Additionally, the attack would have to be online and much slower, possibly allowing detection mid-attack.
 * Malicious insider attempts to use shell access to extract data from the DB [malicious insider does not have access to the key server]
    * Although the first thing that comes to mind would be e.g. a massive dump of passwords, other possibilities might be trying to secretly run an unauthorized CheckUser without it showing up in the on-wiki log.
 * Malicious actor hacks a MediaWiki application server and wants to extract sensitive data
    * There are a lot of potential vulnerabilities where this is probably the end result: unserialization bugs, RCE in MediaWiki, phishing someone with access, somehow getting a malicious SSH key authorized, stealing someone's open laptop, etc.
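The "detection mid-attack" idea can be sketched as a scan over the service's audit trail. The following Python is only an illustration under assumed inputs: the function name `find_suspicious_callers`, the `(timestamp, caller)` event shape, and the threshold values are hypothetical, and real alerting would likely be more nuanced than a simple count.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Set, Tuple


def find_suspicious_callers(
    events: Iterable[Tuple[float, str]],
    threshold: int,
    window_seconds: float,
) -> Set[str]:
    """Flag callers with more than `threshold` requests inside any window.

    `events` must be (timestamp, caller) pairs sorted by timestamp.
    """
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    flagged: Set[str] = set()
    for timestamp, caller in events:
        window = recent[caller]
        window.append(timestamp)
        # Drop requests older than the window relative to this one.
        while timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) > threshold:
            flagged.add(caller)
    return flagged
```

A bulk extraction through the service shows up as a burst from one caller, whereas normal traffic stays spread out; that difference is what an alert would key on.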

Does not prevent
Nothing fixes everything.
 * Physical access to all the servers [perhaps TPMs would have some effect on this, but I don't think it's realistic that we can do much against this threat]
 * Live capturing data as it comes in
 * Lots of other stuff.