Requests for comment/Moving database abstractions out of MediaWiki core

Just quick notes for now.

Background
We "support" mysql, postgres, sqlite, oracle, and mssql as database abstraction layers in MediaWiki, but in reality only mysql and sqlite are well supported. MediaWiki has no support whatsoever for third-party database drivers, meaning that anyone who wants to take it on themselves to add a new abstraction layer must put the code in core. This leads to numerous problems, and is inconsistent with many other parts of MediaWiki that are easily customized and extended by third-party code.

Problem
For oracle and mssql especially, very few developers can support those codebases, and those who can don't often have the time to do so. Skizzerz, for example, maintains the mssql implementation, but only plans on maintaining the LTS releases and doesn't have time to keep up with master's development cycle. Even for DBMSes that are free (postgres), bundling the libraries in core when they may not fully support the latest version, or are partially broken, gives users a false impression of the level of support when they see the options available in the installer. If users had to explicitly install the libraries themselves, they would know how well supported a given DBMS is and could make a more educated decision on whether or not to use it for their MediaWiki installation. Additionally, our unit test suite does not exercise alternate DBMSes at all, which further obscures how well supported or buggy a particular implementation is.

A simple solution is to remove these abstractions from core and be done with it; however, once removed, they become impossible to use without hacking core code to put them back in. This RFC aims to provide the underlying infrastructure in MediaWiki to allow database abstractions to be installed and included much like extensions and skins are. This gives us the best of both worlds: someone downloading the core code can be assured that the functionality and features they want to use are supported by all bundled database abstractions, and they also have the option of installing third-party abstractions that may offer less functionality, be less well supported, or have been made for older versions of MediaWiki, with full knowledge of the caveats of doing so.

Proposal

 * Allow a new type of extension for custom database abstractions. This needs to be as low-friction as possible, quite possibly with auto-detection of database types: we cannot rely on LocalSettings config to register them, since they are needed during the install process, before LocalSettings exists. To reduce the attack surface and to prevent issues going forward as breaking changes are made, it may be wise to only autodetect/autoload third-party database abstraction layers during the install process, and rely on LocalSettings to explicitly load them after the install is complete.
 * These could perhaps live in the $IP/db directory to differentiate them from normal extensions and skins, as they provide low-level functionality. While the existing database code lives in $IP/includes/db, I'm not sure that's the best place for third-party code (it breaks convention with extensions and skins, which are top-level, among other things).
 * As part of this, we could possibly move the natively supported database abstractions (e.g. mysql and sqlite) into $IP/db as well. This would provide a good test bed to ensure that loading database abstractions from that location works as expected, and would be consistent with core skins, which already live alongside third-party skins.
 * Speaking of testing, it would be interesting to explore whether we can run the unit test suite using a specified DBMS as the backend instead of only sqlite. This would benefit the whole pipeline for third-party and first-party database abstractions alike. A patch that touches a particular abstraction could automatically be run against that DBMS in Jenkins, so that we know if it causes any regressions. Third-party developers could also use the unit tests as a guideline for determining whether their abstraction supports everything it needs to in order to work properly with core code, or with any given extension that defines its own tests.
 * The difficulty with Jenkins auto-running these tests is the commercial DBMSes, which require purchased software licenses or have other requirements: how do we manage that? Do we accept people donating properly-licensed slaves? From a non-technical standpoint, could adding "official" support in the development pipeline for commercial/proprietary DBMSes be construed as not conforming to our mission (and if so, how do we avoid that)?
 * The bane of unit testing on a myriad of DBMSes (or even supporting them in the first place) is code that passes raw SQL strings in to be queried directly. Is it possible/feasible/desirable to deprecate the ability for an extension to pass in arbitrary SQL, and instead enforce use of our abstraction wrappers? The wrappers themselves also accept raw SQL fragments in places, and eliminating those would be part of this as well. Hopefully no core code would be impacted, but I can foresee extensions being affected by this change, especially extensions written for corporate environments where cross-DBMS functionality would be considered a waste of time rather than a desirable feature.
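As a sketch of what explicit post-install loading could look like, a LocalSettings.php entry might resemble the following. This is purely illustrative: wfLoadDatabaseAbstraction() does not exist and is a hypothetical placeholder analogous to wfLoadExtension()/wfLoadSkin(); only $wgDBtype is a real config variable today.

```php
<?php
// LocalSettings.php — hypothetical registration of a third-party
// database abstraction after installation has completed.
// wfLoadDatabaseAbstraction() is a placeholder name, by analogy
// with wfLoadExtension() and wfLoadSkin(); it would load the
// abstraction from e.g. $IP/db/MSSQL/.
wfLoadDatabaseAbstraction( 'MSSQL' );

// The loaded layer would then be selectable as the wiki's backend
// via the existing configuration variable:
$wgDBtype = 'mssql';
```

During the install process itself, the installer would instead autodetect layers present under the proposed $IP/db directory, since this file does not exist yet at that point.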
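For illustration of the last bullet, the structured wrapper calls that would become the only path to the database already exist in core. A typical read through the existing abstraction looks like this (it runs inside MediaWiki, not standalone; exact constants such as DB_REPLICA vary by release):

```php
<?php
// Structured query via the existing database wrapper, instead of
// passing a raw SQL string to query(). The structured form lets
// each abstraction layer build DBMS-appropriate SQL and quote
// values safely.
$dbr = wfGetDB( DB_REPLICA );
$res = $dbr->select(
	'page',                          // table
	[ 'page_id', 'page_title' ],     // fields to select
	[ 'page_namespace' => NS_MAIN ], // conditions, safely quoted
	__METHOD__                       // caller name, for profiling
);
foreach ( $res as $row ) {
	// each $row is an object with the selected fields
}
```

Because every part of the query is passed as structured data, an alternate abstraction layer can translate it without parsing SQL; it is the raw-string escape hatches (query(), SQL fragments inside conditions) that this bullet proposes deprecating.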