Requests for comment/Linker refactor

Static methods are widely used throughout the MediaWiki code base. This encourages procedural code, introduces global state, and makes it quite a challenge to write good unit tests for MediaWiki code. Some of the most egregious uses of static methods are in the Linker class. Linker static functions are used nearly 500 times throughout MediaWiki. Especially heavy use occurs in special pages, ChangesList / Enhanced Changes, Difference Engine, etc., making them difficult to test fully.

MediaWiki should move away from static functions and turn utility classes like Linker into service objects. Tests for classes that use Linker can then mock out those methods and exercise the class under test in isolation.
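As a rough illustration of the idea (written in Python for brevity rather than in MediaWiki's PHP), the sketch below injects a link service into a caller and replaces it with a mock. The name LinkFormatter is taken from the Formatters list in this proposal; ChangesListLine, format_with_stub and all method signatures are hypothetical and exist only for this example.

```python
from unittest import mock


class LinkFormatter:
    """Hypothetical service object standing in for static Linker methods."""

    def link(self, title: str, text: str) -> str:
        # A real implementation would consult the LinkCache, escape HTML, etc.
        return f'<a href="/wiki/{title}">{text}</a>'


class ChangesListLine:
    """Hypothetical caller that receives its link service via the constructor."""

    def __init__(self, link_formatter: LinkFormatter) -> None:
        self.link_formatter = link_formatter

    def format(self, title: str, user: str) -> str:
        page_link = self.link_formatter.link(title, title)
        return f"{page_link} . . {user}"


def format_with_stub(title: str, user: str) -> str:
    """Format a line with a mocked link service, as a unit test would."""
    stub = mock.Mock(spec=LinkFormatter)
    stub.link.return_value = "[link]"
    line = ChangesListLine(stub).format(title, user)
    # The caller's use of the service can be verified without rendering links.
    stub.link.assert_called_once_with(title, title)
    return line
```

Because the dependency is passed in rather than reached through a static call, the caller's own logic can be tested without touching link rendering, the database or any global state.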

Linker also mixes in code that handles formatting comments, such as in recent changes pages. This should be separated into distinct value and service classes (Comment/Summary + formatters).

Linker also contains code for formatting the table of contents in pages. This could be split into its own formatter class.

Any refactoring of the Linker class needs to keep performance in mind, in particular how the code interacts with the LinkCache. Methods such as Linker::link can be quite inefficient, especially when called many times on a single page (e.g. on the watchlist).
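One way a service object could address this is to resolve page existence for many titles in a single batch instead of once per link. The following is only a sketch of that pattern in Python. LinkExistenceCache, preload, exists and render_links are hypothetical names, not the actual LinkCache API, and the injected lookup callable stands in for whatever single database query a real implementation would run.

```python
from typing import Callable, Dict, Iterable, List, Set


class LinkExistenceCache:
    """Hypothetical stand-in for LinkCache: remembers which titles exist."""

    def __init__(self, lookup: Callable[[Set[str]], Set[str]]) -> None:
        # `lookup` takes a set of titles and returns the subset that exists,
        # e.g. via one database query. It is injected so tests can replace it.
        self._lookup = lookup
        self._known: Dict[str, bool] = {}

    def preload(self, titles: Iterable[str]) -> None:
        """Resolve all not-yet-known titles with a single lookup call."""
        missing = {t for t in titles if t not in self._known}
        if not missing:
            return
        existing = self._lookup(missing)
        for title in missing:
            self._known[title] = title in existing

    def exists(self, title: str) -> bool:
        """Return cached existence, falling back to a one-title lookup."""
        if title not in self._known:
            self.preload([title])
        return self._known[title]


def render_links(titles: List[str], cache: LinkExistenceCache) -> List[str]:
    """Mark each title as existing or new after one batched preload."""
    cache.preload(titles)
    return [("existing" if cache.exists(t) else "new") + ":" + t for t in titles]
```

A page such as the watchlist would call preload once with every title it is about to link, so the per-link calls are answered from memory.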

Justification

 * Improve testability and code quality of MediaWiki
 * More bugs caught via tests and prevented from making their way into production

Static methods and tests
Static methods are widely used throughout the MediaWiki code base. This encourages procedural code, introduces global state, and makes it quite a challenge to write good unit tests for MediaWiki code. It is difficult to isolate and test each piece of code, each class and each method, and it is not possible to mock the static methods of another class.

To improve the quality of MediaWiki, we should strive to make the code more testable and to have more unit tests, in addition to the browser tests. As of September 2014, ~6-7% of MediaWiki is covered by unit tests. In the September 2014 quarterly review of the Release/QA engineering team, concern about this was raised by Lila ("we need all three (integration, browser, unit tests)", "at Sugar, we enforced 'no commit without tests'") and others ("some of MW's tech debts make it actively hostile to unit tests" --robla, meeting notes).

At the same time, WMF is migrating to HHVM, and HHVM prides itself on 100% passing tests on dozens of frameworks and PHP projects. For MediaWiki, that figure is misleading: only 6-7% of the code is covered by tests, and for the rest we really don't know how it behaves except by "testing in production". It is scary to assume all is good with MediaWiki on HHVM when such a small portion of MediaWiki has test coverage.

Implementation

 * Introduce comment / summary value classes and formatters to eventually replace code in Linker
 * Deprecate comment handling code in Linker
 * Split table of contents formatting code + deprecate existing code in Linker
 * Split out image linker code into separate class
 * Introduce a new Linker (name suggestion please?) service object class for more general linking code
 * Phase out / deprecate the static methods in the current Linker class
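The last step could be done gradually by keeping the static entry points as thin wrappers that warn and delegate to the new service. A minimal Python sketch of that pattern follows. LinkFormatter is a placeholder name borrowed from the Formatters list in this proposal, the link signature is hypothetical, and the shared default instance is an assumption made only for this illustration.

```python
import warnings


class LinkFormatter:
    """Hypothetical replacement service; the final name is still open."""

    def link(self, title: str, text: str) -> str:
        return f'<a href="/wiki/{title}">{text}</a>'


class Linker:
    """Legacy facade: the static entry point remains but only delegates."""

    # Assumption: one shared default instance backs the legacy static calls.
    _default_formatter = LinkFormatter()

    @staticmethod
    def link(title: str, text: str) -> str:
        warnings.warn(
            "Linker.link is deprecated; use a LinkFormatter instance instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return Linker._default_formatter.link(title, text)
```

Existing callers keep working while new code takes the service through its constructor, and the deprecation warnings show which call sites still need to be migrated.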

Summary classes
Summaries (or "comments") are stored in the recentchanges and revision tables (and maybe elsewhere also). The comments are stored in a format such as " ". The first part is the "autocomment". In Wikibase, to make autocomments more suitable for a multilingual environment, they are stored in a "cryptic" format like " ". In the recentchanges table, comments are stored in a varchar(255) field, while in the revision table they are stored as tinyblob (255 bytes).

As these are stored in the database, we will always need a way to parse and work with this format. Then we need a value object to represent the summary, and then a service for formatting them.


 * Summary - value object
 * SummaryParser - to transform what is stored in the database into an object
 * SummaryFormatter - to format in places such as recent changes
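The three pieces might fit together roughly as in the Python sketch below. It assumes the autocomment is a leading /* ... */ block, the convention MediaWiki uses for section-edit summaries. The field names, method signatures and the HTML produced are hypothetical and only illustrate the separation of value object, parser and formatter.

```python
import html
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Summary:
    """Value object: an optional autocomment plus the free-text part."""

    autocomment: Optional[str]
    text: str


class SummaryParser:
    """Turns the stored string into a Summary."""

    # Assumption: the autocomment is a leading /* ... */ block.
    _PATTERN = re.compile(r"^\s*/\*\s*(.*?)\s*\*/\s*(.*)$", re.DOTALL)

    def parse(self, raw: str) -> Summary:
        match = self._PATTERN.match(raw)
        if match is None:
            return Summary(autocomment=None, text=raw.strip())
        return Summary(autocomment=match.group(1), text=match.group(2).strip())


class SummaryFormatter:
    """Renders a Summary as HTML, e.g. for a recent changes line."""

    def format(self, summary: Summary) -> str:
        text = html.escape(summary.text)
        if summary.autocomment is None:
            return text
        auto = (
            '<span class="autocomment">'
            + html.escape(summary.autocomment)
            + "</span>"
        )
        return f"{auto} {text}" if text else auto
```

With this split, the parser and the formatter can each be unit tested on plain strings, and a different formatter (for example one that localizes Wikibase autocomments) can be swapped in without touching the parser.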

In the future, it may be useful to store comments in a more consistent manner and perhaps accommodate storing them in a structured way (e.g. JSON).

Formatters

 * LinkFormatter - creates and formats various types of links (might be useful to split this up further)
 * TableOfContentsFormatter - handles formatting for the table of contents
 * ImageLinker - makeImageLink, makeThumbLink2 and related code factored out
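To give a feel for how small such a formatter can be once it is separated from Linker, here is a hedged Python sketch of a TableOfContentsFormatter. The input shape (level, anchor, text), the flat list output and the CSS class names are assumptions for illustration and do not reproduce the markup Linker currently emits.

```python
import html
from typing import List, Tuple

# Assumed input shape: (heading level, anchor id, heading text).
Heading = Tuple[int, str, str]


class TableOfContentsFormatter:
    """Hypothetical formatter: renders headings as a flat HTML list."""

    def format(self, headings: List[Heading]) -> str:
        if not headings:
            return ""
        # Depth is relative to the shallowest heading on the page.
        top = min(level for level, _, _ in headings)
        items = []
        for level, anchor, text in headings:
            depth = level - top + 1
            items.append(
                f'<li class="toclevel-{depth}">'
                f'<a href="#{html.escape(anchor, quote=True)}">'
                f"{html.escape(text)}</a></li>"
            )
        return '<ul class="toc">' + "".join(items) + "</ul>"
```

The formatter depends only on its input, so it needs no mocks at all in a unit test.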