- <TimStarling> temporary solutions have a terrible habit of becoming permanent, around here 
The Technical Debt Program has been put in place to help the Wikimedia Foundation and the broader community better understand and manage MediaWiki technical debt. This article's purpose is to help establish a common understanding of technical debt for the Wikimedia Foundation and the broader MediaWiki community.
What is Technical Debt and why is it important?
As Dan Rawsthorne puts it simply: "Technical Debt is what makes code hard to work with."
Although simplistic, the spirit of the above definition is generally at the core of most industry experts’ own definitions.
It is important because code that is hard to work with generally hampers developers’ productivity and results in less stable code.
All too often the term “technical debt” ends up being applied to a wide range of issues, and as such, becomes unmanageable. Technical debt is NOT a bug or the lack of a feature. Technical debt is not just a fancy name for “sloppy code”.
Technical debt is generally not visible to a user of the system. It’s visible to developers and to those who have to work with the source code in some capacity. Although defects themselves are not technical debt, they can be a result of technical debt.
I. Debt incurred unintentionally due to low quality work
II. Debt incurred intentionally
II.A. Short-term debt, usually incurred reactively, for tactical reasons
II.A.1. Individually identifiable shortcuts (like a car loan)
II.A.2. Numerous tiny shortcuts (like credit card debt)
II.B. Long-term debt, usually incurred proactively, for strategic reasons
|I||Code developed and submitted that doesn’t follow coding standards.||This results in code that is difficult to read and update, which leads to lost productivity and/or defects.|
|II.A.1||Avoiding refactoring a new feature in order to get immediate user feedback. The plan is to refactor it once feedback is received.||In these cases, it is not known whether the new feature will remain in its current form until user feedback is received.|
|II.A.2||Used nondescriptive function/variable names “for now”. Didn’t create a unit test for a newly fixed defect because we need to get the fix out ASAP. We’ll automate it later.||The saved time in both of these cases is trivial compared to the cost over time. In the first case, each time a developer touches the code, they’ll need to learn what the function/variable is used for. In the second case, each time the code is changed, there will be a risk of undetected breakage.|
|II.B||Decision to use the standard MVC design pattern in iOS app even though there’s a belief that app performance will suffer as user base increases.||This is a strategic decision to use a design pattern that is well understood now, knowing that there will likely be a scaling issue in the future.|
Is Technical Debt Bad?
Often we are faced with the decision to take a shortcut in order to get code deployed faster or make progress in another area. Those shortcuts have a future cost, not unlike real debt. Do the short-term benefits of the shortcut outweigh the long-term costs? If so, it might be worth it. But as with any debt, accrual of the debt without plans to repay it is a bad idea. It leads to ongoing “interest” payments in the form of slower progress whenever working in that part of the code. So, if you consciously make a decision to incur the technical debt, do so with a plan to repay it as soon as possible.
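The trade-off described above can be sketched as simple arithmetic. The following is a minimal illustration only, not a method used by the Technical Debt Program; the function name, its parameters, and the hour figures are all assumptions made up for the example.

```python
def net_cost_of_shortcut(hours_saved_now, interest_hours_per_change,
                         expected_changes, repayment_hours):
    """Return the net hours lost (positive) or gained (negative) by a shortcut.

    All inputs are hypothetical estimates:
    - hours_saved_now: time the shortcut saves today
    - interest_hours_per_change: extra time each later change costs
    - expected_changes: changes expected before the debt is repaid
    - repayment_hours: time needed to remove the debt later
    """
    total_interest = interest_hours_per_change * expected_changes
    return total_interest + repayment_hours - hours_saved_now


# Debt left in place while the code is changed 10 more times.
late_repayment = net_cost_of_shortcut(8, 0.5, 10, 6)

# Same shortcut, repaid after only 2 further changes.
early_repayment = net_cost_of_shortcut(8, 0.5, 2, 6)

print(late_repayment)   # 3.0  (the shortcut cost more than it saved)
print(early_repayment)  # -1.0 (the shortcut was a net gain)
```

With these made-up numbers, the same shortcut is a net gain only when it is repaid quickly, which is the reason for having a repayment plan.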
Looking at the aforementioned Technical Debt Taxonomy, types I and II.A.2 should really be avoided at all costs. They generally don’t result in any near-term advantage and are generally difficult to track.
When deciding whether or not to take on technical debt, make sure that your stakeholders are involved in the decision. They will not only know what to expect in the near term, but also be able to support refactoring work later.
Removal (paying off debt)
Removal of technical debt is focused on the overall reduction of existing technical debt. Within the MediaWiki ecosystem, we have a large existing codebase with an undefined amount of technical debt. This existing technical debt may or may not be impacting our daily efforts. How do we find it and determine if it needs to be removed?
Patterns to look for are:
- An area of code that people avoid at all costs due to fear of breaking it or a general lack of understanding.
- An area of code that seems to break often.
- An area of code that is developed and maintained by a single person (because they are the only one who understands it).
Generally speaking, those who work on the code gain a pretty good understanding of the areas that are hard to work with. They may even have some ideas as to how to make them easier to work with.
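Two of the patterns above (code that breaks often, and code maintained by a single person) can sometimes be approximated from change history. The sketch below is a rough illustration under stated assumptions: the input format, the area and author names, and the `min_fixes` threshold are all hypothetical, and avoidance of an area cannot be detected this way at all. It complements, rather than replaces, asking the people who work on the code.

```python
from collections import defaultdict


def find_candidate_areas(changes, min_fixes=3):
    """Flag areas that match the 'single maintainer' or 'breaks often' pattern.

    changes: iterable of (area, author, is_fix) tuples, a hypothetical
    summary of commit history. min_fixes is an arbitrary threshold.
    """
    authors = defaultdict(set)
    fixes = defaultdict(int)
    for area, author, is_fix in changes:
        authors[area].add(author)
        if is_fix:
            fixes[area] += 1

    report = {}
    for area in authors:
        flags = []
        if len(authors[area]) == 1:
            flags.append("single maintainer")
        if fixes[area] >= min_fixes:
            flags.append("breaks often")
        if flags:
            report[area] = flags
    return report


# Made-up history for three hypothetical areas.
history = [
    ("parser", "alice", True),
    ("parser", "alice", True),
    ("parser", "alice", True),
    ("skin", "bob", False),
    ("skin", "carol", True),
    ("api", "dave", False),
]
report = find_candidate_areas(history)
print(report)
```

These are crude signals: an area with one recorded change will always look like it has a single maintainer, so the output is a starting point for discussion and not a verdict.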
Once areas have been identified, what then? Prioritizing the work can be challenging and often results in the “low hanging fruit” (LHF) approach. All things being equal, the LHF approach isn’t a bad one. When you have equally impactful areas that need attention, choose the easiest ones first. However, all too often, impact is not factored in when selecting the LHF. As a result, a bunch of work is done (tasks completed), and people feel good about the shorter list, but the actual impact is marginal.
For that reason, impact should be the primary criterion when prioritizing the removal of technical debt. For example, if one area of code is routinely modified and another is changed only once every 3 years, removing the technical debt from the former will probably have the greater impact. This is perhaps an overly simple example, but it does demonstrate some of the factors to consider.
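The prioritization rule above (impact first, ease only as a tie-breaker) can be sketched as a sort key. This is a minimal sketch under assumptions: the `DebtItem` fields, the way impact is estimated (change frequency times slowdown per change), and all of the numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    name: str
    changes_per_year: float        # how often the area is modified
    extra_hours_per_change: float  # estimated slowdown caused by the debt
    effort_hours: float            # estimated cost of removing the debt

    @property
    def impact(self):
        # Hypothetical impact estimate: hours lost per year to the debt.
        return self.changes_per_year * self.extra_hours_per_change


def prioritize(items):
    # Highest impact first; among equally impactful items, easiest first.
    return sorted(items, key=lambda item: (-item.impact, item.effort_hours))


items = [
    DebtItem("rarely_touched_module", 1 / 3, 2.0, 5.0),
    DebtItem("frequently_edited_module", 50, 2.0, 40.0),
    DebtItem("equally_costly_but_easier", 25, 4.0, 20.0),
]
ordered = [item.name for item in prioritize(items)]
print(ordered)
```

Note that the rarely touched module is the cheapest to fix but still lands last: picking it first would shorten the task list without much actual benefit.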
Once the areas are identified, the actual work needs to be defined. To that end, sound design and implementation principles should be followed, or one risks removing some technical debt while introducing different technical debt. For that reason, it’s important that all who contribute to MediaWiki employ sound software development principles. The Code Health Group, a cross-organizational group responsible for defining these principles, is a good place to start.
In practice: All too often, developers go into refactoring efforts with the expectation that nothing will break. The reality, however, is that things WILL break. As with any other development, having adequate tests in place will help manage the breakage. Don’t be afraid to break stuff; expect it.
Avoidance (not incurring new debt)
Where removal focuses on the reduction of technical debt, avoidance is about minimizing the introduction of any additional technical debt. As noted previously, technical debt isn’t necessarily bad, provided it’s a conscious decision with well understood benefits and costs, and a plan to remove it at a later date.
Avoidance is made up of two distinct parts: knowing how to design and implement a debt-free system, and understanding when to incur technical debt and when not to.
Knowing how to design and implement a debt-free system is outside the scope of this article. Please see Code Health for more resources on that topic. That said, examples of the factors involved are code complexity, coding standards, and test coverage.
Equally important is when to make the decision to incur technical debt and when not to. The following is a diagram of a decision tree to help make a good technical debt decision.
Key points from the diagram:
- Don’t incur the tech debt if it takes longer to discuss and document it than to take the more desirable approach.
- A lot of small shortcuts are difficult to document and track. Over time, they will crush productivity.
- Although discussing tech debt decisions with peers and your direct engineering manager may be enough for tactical short-term tech debt, longer-term strategic tech debt should also be discussed with the Product Manager and Tech Committee as the impact of the tech debt may be broader.
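The three key points above can be sketched as a small function. This is not a reproduction of the decision tree diagram; it encodes only the listed key points, and the function name, parameters, and return structure are assumptions made for illustration.

```python
def review_tech_debt_proposal(discuss_hours, proper_fix_hours,
                              many_tiny_shortcuts, long_term_strategic):
    """Apply the three key points to a proposed shortcut.

    Returns a dict saying whether the proposal should go on to discussion
    and, if so, who should be consulted. All inputs are hypothetical
    estimates supplied by the developer.
    """
    if discuss_hours > proper_fix_hours:
        return {
            "discuss": False,
            "reason": "Documenting the debt costs more than the desirable approach.",
            "consult": [],
        }
    if many_tiny_shortcuts:
        return {
            "discuss": False,
            "reason": "Numerous small shortcuts are hard to document and track.",
            "consult": [],
        }
    consult = ["peers", "direct engineering manager"]
    if long_term_strategic:
        consult += ["Product Manager", "Tech Committee"]
    return {
        "discuss": True,
        "reason": "Debt may be acceptable if benefits, costs, and repayment are agreed.",
        "consult": consult,
    }


tactical = review_tech_debt_proposal(1, 16, False, False)
strategic = review_tech_debt_proposal(4, 200, False, True)
print(tactical["consult"])   # ['peers', 'direct engineering manager']
print(strategic["consult"])  # adds 'Product Manager' and 'Tech Committee'
```

A positive result here means only that the proposal is worth discussing; whether to actually incur the debt remains a decision for the people consulted.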
UX debt is similar to technical debt. To rephrase Dan Rawsthorne: "Design Debt is what makes designs hard to work with." Unlike technical debt, UX debt is exposed to the users. This makes refactoring hard, because users get used to the (debt-ridden) UX. After a refactoring of the design, users may complain that the design "feels wrong" and that it lowers their efficiency (since the old usage strategies don't work anymore).
Causes of design debt can be:
- Design incoherence after several changes due to A/B testing
- More and more elements have been added to controls that can only handle a few items well (e.g. tabs)
  - E.g. we used to have all functions in a little icon bar, but that bar now holds 15 functions, since more and more have been added.
- Quick "hacks"
  - Like using an icon instead of a link, since it is smaller
  - Adding implicit modes like "If the user just has edited something, this button also does…" or "if you click in the empty space, function X is automatically called."
- Paradigms change between different parts of the software, e.g.
  - VisualEditor uses a lot of popups, the "old" MediaWiki uses dedicated pages, other parts may use sidebars, overlays…
- A transition is not done yet, e.g. from jQuery UI to OOUI
  - Some pages have OOUI in the main area, but old buttons for "save" and "cancel" (like the Beta features page as of June 2017)
  - Wikidata builds on its own style based on jQuery UI and does not yet use OOUI
- Technical Debt SIG
- User:Daniel Kinzler (WMDE)/Avoiding the Tech Debt Trap (interesting essay and ensuing discussion on the topic)
- Bug management/Development prioritization#Why are there so many open tasks?
- "Temporary solutions" quip (archived on Toolforge) - one of many MediaWiki Quips archived after the Bugzilla->Phabricator migration
- 18F blog post series: Managing technical debt and Preventing technical debt
- Wikimedia Foundation Blog: The what and how of Code Health