Requests for comment/Dependency injection/2014

This RFC proposes a lightweight mechanism for dependency injection. An implementation with tests and examples is provided. A facility like this could be combined with improved autoloading, and we could add it to core as a first step in an iterative development process. Other options are also discussed.

Problem statement
Dependency injection (DI) is a design pattern that can facilitate unit testing, loose coupling and architecture description. Although it's more useful in some languages than in others, it is a well-established pattern, and there is a solid ecosystem of DI libraries for PHP.

MediaWiki doesn't have a dedicated DI mechanism, though adding one has been discussed, and some new code in core does DI by hand. Also, WikibaseQuery has classes for DI.

Adding simple DI support to core would be a first step towards consistent, concise use of this pattern. Since we'll probably need at least a few iterations and use cases to get it right, this first step could be a kind of "internal API beta feature".

Previous discussions
Using DI in MediaWiki has been considered before. Here are some earlier conversations about it:
 * Discussion of changes to Architecture guidelines at Wikimania 2013
 * Section on DI for external resources in Talk:Architecture guidelines
 * Discussion of TitleValue at the Architecture Summit 2014
 * The TitleValue RFC and the ServiceRegistry section on that RFC's Talk page

Some issues that came up:
 * DI adds some kinds of complexity and reduces others. Is there a net benefit?
 * Will it be easier or harder to refactor code that uses DI?
 * Since access to a central instance registry is needed, is this a version of the kitchen sink anti-pattern?
 * Could  serve as a central instance registry?
 * Can we avoid DI leading to ugly zillion-parameter constructors?
 * It's easy to do DI wrong.
 * Would DI be a barrier to entry for volunteer developers?

In those discussions, it was generally accepted that DI would facilitate testing and would have to be added to MediaWiki incrementally. The apparent conclusion to the kitchen-sink-anti-pattern threads was that no, DI is not a version of that pattern, because the central registry is only accessed during bootstrapping. With regard to using  as a central registry, it seems that that does not make sense since   appears throughout MediaWiki code, so using it would increase coupling, not loosen it.

Note that those specific conclusions were not formalized; the above is an interpretation. However, at the Architecture Summit 2014, a decision was made to refactor  and include the by-hand DI currently in core (see below).

Current functionality
The proposed implementation is minimalistic, though usage is not too different from some existing DI libraries. Here's how it works.

Let's say you have the following interfaces and classes:

Assuming you want only one instance of these classes per request, you can set up and use DI like this:

(Note: the global variable  and the class   are actually called   and   in the version of the implementation in Gerrit.)

First we register the types (in this case, interfaces) and their realization classes (the concrete classes). When a type is requested, an instance of the corresponding realization class will be provided.

The default scope (in fact, the only scope implemented) is. In this scope,  creates and caches a single instance of the realization class the first time it (or rather, its corresponding type) is requested. So the first time we request, an instance of   is created and cached. The next time  is requested, the same object will be returned.

To instantiate,   looks at the type hint in the constructor and notices that the type   is also registered and that its realization class is. So it creates or fetches from its cache the singleton instance of  and injects it.

Note that the type registered doesn't have to be an interface. It can also be a superclass of, or even the same class as, the realization class.
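
The behaviour described above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the implementation in Gerrit: every name in it (`SketchRegistry`, `MessageStore`, `Notifier` and their realization classes) is hypothetical, and it supports only the singleton scope and constructor injection via type hints.

```php
<?php
// Hypothetical names throughout; this is not the API of the Gerrit implementation.

interface MessageStore {
	public function save( string $text ): void;
}

interface Notifier {
	public function notify( string $text ): void;
}

class ArrayMessageStore implements MessageStore {
	public array $saved = [];

	public function save( string $text ): void {
		$this->saved[] = $text;
	}
}

class StoringNotifier implements Notifier {
	public MessageStore $store;

	// The type hint is what the registry inspects to find the dependency.
	public function __construct( MessageStore $store ) {
		$this->store = $store;
	}

	public function notify( string $text ): void {
		$this->store->save( $text );
	}
}

class SketchRegistry {
	/** @var array<string,string> type name => realization class name */
	private array $realizations = [];
	/** @var array<string,object> type name => cached singleton instance */
	private array $instances = [];

	public function register( string $type, string $realization ): void {
		$this->realizations[$type] = $realization;
	}

	public function get( string $type ): object {
		if ( !isset( $this->realizations[$type] ) ) {
			throw new InvalidArgumentException( "Type not registered: $type" );
		}
		// Singleton scope: build on first request, then reuse.
		if ( !isset( $this->instances[$type] ) ) {
			$this->instances[$type] = $this->build( $this->realizations[$type] );
		}
		return $this->instances[$type];
	}

	private function build( string $class ): object {
		$constructor = ( new ReflectionClass( $class ) )->getConstructor();
		$args = [];
		if ( $constructor !== null ) {
			foreach ( $constructor->getParameters() as $param ) {
				$hint = $param->getType();
				if ( !( $hint instanceof ReflectionNamedType ) || $hint->isBuiltin() ) {
					throw new LogicException(
						'Cannot inject parameter ' . $param->getName() . " of $class"
					);
				}
				// Resolve each type-hinted parameter through the registry.
				$args[] = $this->get( $hint->getName() );
			}
		}
		return new $class( ...$args );
	}
}

$registry = new SketchRegistry();
$registry->register( MessageStore::class, ArrayMessageStore::class );
$registry->register( Notifier::class, StoringNotifier::class );

$notifier = $registry->get( Notifier::class );
$notifier->notify( 'hello' );
```

Requesting `Notifier` a second time returns the same object, and the `MessageStore` it received is the same instance that a direct request for `MessageStore` would return. The sketch does not guard against circular dependencies, which a real facility would need to detect.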

Factory example
The proposed implementation is much less featureful than most DI libraries. However, by creating factories, you can still use it to set up loose coupling for classes that you need many instances of. For example, suppose that we have the following interface-class pair for something we need to be able to create on demand:

In this case we can create a factory and inject that into the class that will be creating the s:

Here, we isolate the call to  for the concrete class  in a very simple factory, and inject the factory into. That way,  doesn't know about the implementation details of , and limits its knowledge to just the contracts it needs, that is,   and the very simple.
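
As a rough sketch of this factory arrangement, with hypothetical names (`Ticket`, `SimpleTicket`, `TicketFactory`, `SimpleTicketFactory`, `HelpDesk`) standing in for the classes of the example, and with the wiring done by hand so that the snippet is self-contained:

```php
<?php
// Hypothetical names; only the shape of the pattern is taken from the text.

interface Ticket {
	public function getSubject(): string;
}

class SimpleTicket implements Ticket {
	private string $subject;

	public function __construct( string $subject ) {
		$this->subject = $subject;
	}

	public function getSubject(): string {
		return $this->subject;
	}
}

interface TicketFactory {
	public function newTicket( string $subject ): Ticket;
}

class SimpleTicketFactory implements TicketFactory {
	public function newTicket( string $subject ): Ticket {
		// The only place that names the concrete class.
		return new SimpleTicket( $subject );
	}
}

class HelpDesk {
	private TicketFactory $factory;

	// The factory is injected; HelpDesk depends only on the two interfaces.
	public function __construct( TicketFactory $factory ) {
		$this->factory = $factory;
	}

	/** @return Ticket[] */
	public function open( string ...$subjects ): array {
		$tickets = [];
		foreach ( $subjects as $subject ) {
			$tickets[] = $this->factory->newTicket( $subject );
		}
		return $tickets;
	}
}

$desk = new HelpDesk( new SimpleTicketFactory() );
$tickets = $desk->open( 'Login fails', 'Typo on main page' );
```

With a DI facility in place, the factory interface and its realization class would be registered like any other type, and the hand-written `new HelpDesk( ... )` line would go away.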

Possible combination with better autoloading
The verbosity of registrations is sadly reminiscent of MediaWiki's also-verbose autoloading registrations. However, better autoloading is definitely possible. If both DI and improved autoloading were added to core, it might make sense to combine them or link them up somehow. Convention-over-configuration could be used to reduce the verbosity of setup code for the most typical configurations.

Tests and example
Please see the Campaigns extension for examples and unit tests.

Other options
DI is already used to some extent in MediaWiki and sister projects. We could adopt one of the approaches already in use instead of the implementation presented here. We could also use an external library. The following is a quick discussion of some of these options and their advantages and disadvantages.

DI by hand in MediaWiki
At its simplest, dependency injection just means that code outside a component sets up the component's dependencies. A dedicated facility is not required.

This sort of injection "by hand" is used in core for new classes introduced with. For example:
 * The interface  sets out the contract that a page link renderer must fulfill.
 * is the default implementation of.
 * The legacy class  has been updated to use.
 * Instantiation of  is embedded in legacy code that cannot be easily changed, so method injection, rather than constructor injection, is used. The   method allows different implementations of    to be switched in.
 * also contains code to instantiate the standard implementation,, if no   is provided via method injection.
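
The combination of method injection and a built-in default described in the list above might look roughly like this. The names (`LinkRenderer`, `HtmlLinkRenderer`, `PlainTextRenderer`, `LegacyPage`, `setLinkRenderer`) are hypothetical stand-ins and are not the actual core classes or methods:

```php
<?php
// Hypothetical names; they stand in for the core classes discussed above.

interface LinkRenderer {
	public function render( string $title ): string;
}

class HtmlLinkRenderer implements LinkRenderer {
	public function render( string $title ): string {
		$safe = htmlspecialchars( $title );
		return '<a href="/wiki/' . rawurlencode( $title ) . '">' . $safe . '</a>';
	}
}

class PlainTextRenderer implements LinkRenderer {
	public function render( string $title ): string {
		return '[[' . $title . ']]';
	}
}

class LegacyPage {
	private ?LinkRenderer $renderer = null;

	// Method injection: callers may swap in another implementation
	// after the object has been constructed by legacy code.
	public function setLinkRenderer( LinkRenderer $renderer ): void {
		$this->renderer = $renderer;
	}

	private function getLinkRenderer(): LinkRenderer {
		// Fall back to the standard implementation if none was injected.
		if ( $this->renderer === null ) {
			$this->renderer = new HtmlLinkRenderer();
		}
		return $this->renderer;
	}

	public function linkTo( string $title ): string {
		return $this->getLinkRenderer()->render( $title );
	}
}

$page = new LegacyPage();
$default = $page->linkTo( 'Main Page' );

$page->setLinkRenderer( new PlainTextRenderer() );
$plain = $page->linkTo( 'Main Page' );
```

The fallback keeps legacy call sites working unchanged, at the cost of the legacy class still naming one concrete implementation.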

DI in WikibaseQuery
The DI in WikibaseQuery (part of Wikibase) is a midpoint between DI by hand and declaratively-set-up-and-library-handled DI. Here's roughly how it works. (The classes mentioned below are in  or  .)
 * Objects are built by builders, which are subclasses of  and must implement.
 * Builders can get pretty complicated. See, for example,.
 * Builders are registered with a  and are associated with a string key.
 * provides an outward-facing interface for obtaining instances. It has a method for each type of instance that can be obtained. (For example, .) Most of the application does not access the   directly, but rather goes through   to get instances.
 * According to the documentation, if object caching is needed, it should be done in.
 * The singleton instance of  is obtained via a static method on , which also provides a means of switching in different DI configurations.
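
To make the shape of this arrangement concrete, here is a loose sketch of builders registered under string keys and fronted by a facade with one method per type. All names (`ObjectBuilder`, `BuilderRegistry`, `AppServices`, `Clock`, `Greeter`) are hypothetical and are not the WikibaseQuery classes; the static accessor and any object caching are left out.

```php
<?php
// Hypothetical names throughout; not the actual WikibaseQuery classes.

abstract class ObjectBuilder {
	abstract public function build( BuilderRegistry $registry ): object;
}

class BuilderRegistry {
	/** @var array<string,ObjectBuilder> string key => builder */
	private array $builders = [];

	public function registerBuilder( string $key, ObjectBuilder $builder ): void {
		$this->builders[$key] = $builder;
	}

	public function newObject( string $key ): object {
		if ( !isset( $this->builders[$key] ) ) {
			throw new OutOfBoundsException( "No builder for key: $key" );
		}
		// A builder may ask the registry for its own dependencies.
		return $this->builders[$key]->build( $this );
	}
}

class Clock {
	public function now(): int {
		return time();
	}
}

class Greeter {
	public Clock $clock;

	public function __construct( Clock $clock ) {
		$this->clock = $clock;
	}
}

class ClockBuilder extends ObjectBuilder {
	public function build( BuilderRegistry $registry ): object {
		return new Clock();
	}
}

class GreeterBuilder extends ObjectBuilder {
	public function build( BuilderRegistry $registry ): object {
		return new Greeter( $registry->newObject( 'clock' ) );
	}
}

// Facade: one method per obtainable type, hiding the string keys.
class AppServices {
	private BuilderRegistry $registry;

	public function __construct( BuilderRegistry $registry ) {
		$this->registry = $registry;
	}

	public function getGreeter(): Greeter {
		return $this->registry->newObject( 'greeter' );
	}
}

$registry = new BuilderRegistry();
$registry->registerBuilder( 'clock', new ClockBuilder() );
$registry->registerBuilder( 'greeter', new GreeterBuilder() );

$services = new AppServices( $registry );
$greeter = $services->getGreeter();
```

Because nothing is cached in this sketch, each call to `getGreeter()` builds a fresh object graph. Compared with type-hint-driven injection, every dependency edge here is written out by hand inside a builder.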

External libraries
There are several popular DI libraries for PHP, such as Pimple, PHP-DI and Symfony DependencyInjection. They offer more features than the approaches described here, including the proposed implementation. Some are rather heavyweight—for example, Symfony DI supports optional dependencies, multiple scopes, configuration via YAML, XML or PHP, and service tags.

Discussion

 * It seems likely that after a while DI by hand would get more complex and verbose than DI using a dedicated facility.
 * DI by hand does help loosen coupling. But it doesn't add much as regards architecture description.
 * The approach taken in WikibaseQuery feels a bit unusual and verbose, though sufficiently flexible.
 * Doing DI with an external library would have similar advantages and disadvantages to using external libraries for other MW functions: the potential for more features and better support, synergy with other free software ecosystems, the introduction of complexity due to unused features, possible security implications, and the need for coordination with upstream. See this RFC and this one for more.
 * The implementation proposed here aims to be concise, extensible and similar enough to external libraries that switching to one would be easy.

Proposed methodology
As has been stated in previous discussions, any addition of DI to MediaWiki should be gradual. This RFC is not about refactoring existing MediaWiki classes to use DI, but about adding lightweight DI facilities to MediaWiki. Such facilities could be used with new and non-central MW code on the understanding that they are experimental and could change or even disappear at any time. They would be a sort of internal "beta feature". Reviewing how they are used and whether or not they improve code quality would be a central task. This makes sense since internal APIs, just like user-facing features, must integrate with many systems, including human and social ones, so they are just as hard to get right! That's why an iterative process that includes lots of review seems like the best bet. If we rework core code to use DI facilities, it would probably be wise to do so only once those facilities have stabilized.