Requests for comment/TitleValue

This request for comments introduces a new class named TitleValue to take over many uses of the current Title class.

Background
During the MediaWiki architecture discussion at Wikimania 2013, the merits of using value objects over active records were once more discussed. The consensus was that instead of making a fundamental decision and planning major refactoring, we will try out the idea on a part of the codebase where it appears to be beneficial. The idea is to continue the architecture discussion once we have collected some experience with the new approach.

A quick primer about value objects:


 * Methods in value objects have no side effects.
 * Value objects can easily be serialized and stored.
 * Value objects can be instantiated easily and efficiently.
 * Value objects represent the value, and operations on the value, but not operations with the value.
 * Value objects are typically, but not necessarily, immutable.
 * Value objects follow the principle "hair should not know how to cut itself". If you want to use a value in an operation, you need a service object that operates on the value.

Motivation
The old Title class is huge and has many dependencies. It relies on global state for things like namespace resolution and permission checks. It requires a database connection for caching.

This makes it hard to use Title objects in a different context, such as unit tests. That in turn makes it quite difficult to write clean unit tests (ones that use no global state) for MediaWiki, since many classes require Title objects as parameters.

In a more fundamental sense, the fact that Title has so many dependencies, and everything that uses a Title object inherits all of these dependencies, means that the MediaWiki codebase as a whole has highly "tangled" dependencies, and it is very hard to use individual classes separately.

Instead of trying to refactor and redefine the Title class, this proposal suggests introducing an alternative class that can be used instead of a Title object to represent the title of a wiki page. The implementation of the old Title class should be changed to rely on the new code where possible, but its interface and behavior should not change.

Architecture
The proposed architecture consists of three parts, initially:


 * 1) The TitleValue class itself. As a value object, this has no knowledge about namespaces, permissions, etc. It does not support normalization either, since that would require knowledge about the local configuration.
 * 2) A TitleParser service that has configuration knowledge about namespaces and normalization rules. Any class that needs to turn a string into a TitleValue should require a TitleParser service as a constructor argument (dependency injection). Should that not be possible, a default TitleParser can be obtained from a global registry.
 * 3) A TitleFormatter service that has configuration knowledge about namespaces and normalization rules. Any class that needs to turn a TitleValue into a string should require a TitleFormatter service as a constructor argument (dependency injection). Should that not be possible, a default TitleFormatter can be obtained from a global registry.

That is the basic design. It can be extended and elaborated in several ways, for example by defining:


 * a WikiLink class with subclasses for internal links, interwiki links, and external links.
 * a TitleResolver service that provides URLs for a given TitleValue.
 * a UserPermissions service that can check a user's permissions with respect to a TitleValue.
 * PageStore and RevisionStore services for looking up whether a title exists, loading the latest revision, etc.

Implementation
Below are interfaces/stubs for the proposed classes:
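A minimal sketch of what such stubs might look like, in PHP. The method names and signatures here (getNamespace, getText, getSection, parseTitle, getFullText, getDBKey) are assumptions made for illustration, not the final interfaces:

```php
<?php

/**
 * Immutable value object: a namespace ID plus the title text.
 * It knows nothing about namespace names, permissions, or the database.
 */
final class TitleValue {
    private int $namespace;
    private string $text;
    private string $section;

    public function __construct(int $namespace, string $text, string $section = '') {
        $this->namespace = $namespace;
        $this->text = $text;
        $this->section = $section;
    }

    public function getNamespace(): int {
        return $this->namespace;
    }

    public function getText(): string {
        return $this->text;
    }

    public function getSection(): string {
        return $this->section;
    }
}

/** Service that turns a string into a TitleValue using local configuration. */
interface TitleParser {
    public function parseTitle(string $text, int $defaultNamespace = 0): TitleValue;
}

/** Service that turns a TitleValue into one of its string forms. */
interface TitleFormatter {
    /** Human-readable form including the namespace name. */
    public function getFullText(TitleValue $title): string;

    /** Database key form of the title text. */
    public function getDBKey(TitleValue $title): string;
}

// Constructing a value needs no services and no configuration.
$title = new TitleValue(1, 'Foo bar', 'Intro');
```

Note that TitleValue has no setters and no service dependencies, so it can be created freely in tests.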

We might want to include functions for getting the DBKey for a title directly from the TitleValue object, for convenience. And perhaps also the form used in URLs (especially important for section titles). That is tempting, but means that we cannot inject a different DB key format or section identifier for use in a different context.

Alternatively, the TitleFormatter interface could define just a single method, and there would be separate implementations for different forms of the title. That means handling more services, but allows more fine-grained control over which formatting is used for what and where:
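The single-method variant could be sketched roughly as follows. The class names DBKeyTitleFormatter and DisplayTitleFormatter, the method name format, and the simplified formatting rules are hypothetical; TitleValue is reduced to a stand-in so the example runs on its own:

```php
<?php

// Minimal stand-in for the proposed value object.
final class TitleValue {
    private int $namespace;
    private string $text;

    public function __construct(int $namespace, string $text) {
        $this->namespace = $namespace;
        $this->text = $text;
    }

    public function getNamespace(): int {
        return $this->namespace;
    }

    public function getText(): string {
        return $this->text;
    }
}

/** One method; each implementation produces one form of the title. */
interface TitleFormatter {
    public function format(TitleValue $title): string;
}

/** Hypothetical: database key form (here simply spaces to underscores). */
final class DBKeyTitleFormatter implements TitleFormatter {
    public function format(TitleValue $title): string {
        return str_replace(' ', '_', $title->getText());
    }
}

/** Hypothetical: display form, prefixed with the namespace name if any. */
final class DisplayTitleFormatter implements TitleFormatter {
    /** @var array<int,string> namespace ID => namespace name */
    private array $namespaceNames;

    public function __construct(array $namespaceNames) {
        $this->namespaceNames = $namespaceNames;
    }

    public function format(TitleValue $title): string {
        $prefix = $this->namespaceNames[$title->getNamespace()] ?? '';
        return $prefix === '' ? $title->getText() : $prefix . ':' . $title->getText();
    }
}

$title = new TitleValue(1, 'Foo bar');
$dbKey = (new DBKeyTitleFormatter())->format($title);
$display = (new DisplayTitleFormatter([0 => '', 1 => 'Talk']))->format($title);
```

A consumer then declares exactly which form it needs by which formatter it asks for in its constructor.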

For getting access to the TitleParser or TitleFormatter objects, a global registry object is proposed. This should be used only where dependency injection is not possible, such as static hook functions or where there is no control over constructor calls. In general, explicit dependency injection as a constructor parameter is preferred.

There are two major use cases for using the registry singleton:

Firstly, gaining access to the service objects in a static context, such as a hook handler function. Here, the singleton would be used to get the service objects that then get injected into an object that implements the actual logic that should be attached to the hook:
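One possible shape for such a hook handler is sketched below. ServiceRegistry, getDefaultInstance, FooHookHandler and the onPageSaved hook are all hypothetical names chosen for illustration, and the formatter is a trivial stand-in:

```php
<?php

// Minimal stand-ins so the sketch runs on its own.
final class TitleValue {
    private int $namespace;
    private string $text;

    public function __construct(int $namespace, string $text) {
        $this->namespace = $namespace;
        $this->text = $text;
    }

    public function getText(): string {
        return $this->text;
    }
}

interface TitleFormatter {
    public function format(TitleValue $title): string;
}

final class PlainTitleFormatter implements TitleFormatter {
    public function format(TitleValue $title): string {
        return $title->getText();
    }
}

/** Hypothetical global registry; only static entry points should touch it. */
final class ServiceRegistry {
    private static ?ServiceRegistry $instance = null;
    private TitleFormatter $titleFormatter;

    private function __construct() {
        $this->titleFormatter = new PlainTitleFormatter();
    }

    public static function getDefaultInstance(): ServiceRegistry {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    public function getTitleFormatter(): TitleFormatter {
        return $this->titleFormatter;
    }
}

/** Holds the actual logic; dependencies arrive through the constructor. */
final class FooHookHandler {
    private TitleFormatter $formatter;

    public function __construct(TitleFormatter $formatter) {
        $this->formatter = $formatter;
    }

    public function describe(TitleValue $title): string {
        return 'Saved: ' . $this->formatter->format($title);
    }

    /** Static hook entry point: the only code here that uses the registry. */
    public static function onPageSaved(TitleValue $title, string &$message): bool {
        $handler = new self(ServiceRegistry::getDefaultInstance()->getTitleFormatter());
        $message = $handler->describe($title);
        return true;
    }
}

$message = '';
FooHookHandler::onPageSaved(new TitleValue(0, 'Main Page'), $message);
```

Because the logic lives in describe() rather than in the static function, it can be tested with any TitleFormatter, without the registry.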

Secondly, support for legacy code. For instance, Title::getLocalURL should be changed to use a TitleFormatter, but there is no good way to inject a TitleFormatter into a Title object. So, Title::getLocalURL would use the global registry instance to get the service.

As an alternative, it would be possible to use the RequestContext class as a registry, or make the registry available from RequestContext. But that would mean that all code that uses RequestContext directly or indirectly (which is pretty much everything in MediaWiki) would then needlessly also depend on the new services. This would make the problem of entangled dependencies worse instead of improving it.

Instead, use of the registry object should be restricted to a few isolated places. The registry should not be passed around at all. Anything that needs a service should ideally ask for that service explicitly in the constructor.

Usage
In MediaWiki core, Title objects are often used where a reference to a wiki page is needed. However, because they are so heavyweight, they drag in a large number of dependencies and make testing the respective code quite hard. TitleValue could be used in places where only a reference to a wiki page is needed. For example:


 * in Revision, to represent the title of the page the revision belongs to.
 * in the Linker, specifying which page to link to.
 * in WatchItem, specifying which page to watch.
 * etc.

Each of these classes needs to perform some operation on the title that TitleValue itself does not support, like getting the DB key form, or checking whether the page exists. Service objects for performing these tasks would need to be injected. This may seem troublesome, but is actually an advantage: it means that we can control how that class checks whether a title exists, and can provide a dummy method for use in tests.
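As a rough illustration of that advantage, the sketch below injects an in-memory stand-in for an existence check. The PageStore interface follows the service name suggested earlier, but its exists method, InMemoryPageStore and LinkClassChooser are assumptions invented for this example:

```php
<?php

// Minimal stand-in for the proposed value object.
final class TitleValue {
    private int $namespace;
    private string $text;

    public function __construct(int $namespace, string $text) {
        $this->namespace = $namespace;
        $this->text = $text;
    }

    public function getNamespace(): int {
        return $this->namespace;
    }

    public function getText(): string {
        return $this->text;
    }
}

/** Hypothetical service for existence checks. */
interface PageStore {
    public function exists(TitleValue $title): bool;
}

/** Dummy implementation for tests; needs no database. */
final class InMemoryPageStore implements PageStore {
    /** @var array<string,bool> */
    private array $known = [];

    public function add(TitleValue $title): void {
        $this->known[$this->key($title)] = true;
    }

    public function exists(TitleValue $title): bool {
        return isset($this->known[$this->key($title)]);
    }

    private function key(TitleValue $title): string {
        return $title->getNamespace() . ':' . $title->getText();
    }
}

/** Simplified consumer: picks a link style depending on page existence. */
final class LinkClassChooser {
    private PageStore $pages;

    public function __construct(PageStore $pages) {
        $this->pages = $pages;
    }

    public function cssClassFor(TitleValue $target): string {
        return $this->pages->exists($target) ? 'existing' : 'new';
    }
}

$store = new InMemoryPageStore();
$store->add(new TitleValue(0, 'Main Page'));
$chooser = new LinkClassChooser($store);

$existing = $chooser->cssClassFor(new TitleValue(0, 'Main Page'));
$missing = $chooser->cssClassFor(new TitleValue(0, 'No such page'));
```

In production the same LinkClassChooser would receive a database-backed PageStore instead, with no change to its own code.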