User:Duesentrieb/Architecture Guidelines

This is a DRAFT of a PROPOSAL for a RECOMMENDATION of BEST PRACTICE regarding code architecture. It is written with MediaWiki core in mind, but should be applicable to any software project of similar size and intent, written for a similar platform.

The general principles laid out below should serve as a guideline when writing and reviewing new code, and when refactoring old code.

Rationale
The principles laid out below are designed to improve two desirable properties of any software system: maintainability and reusability. Key to improving these two properties is modularity, meaning that software components (functions, classes, libraries, and applications) should have well-defined, narrow interfaces and well-defined, minimal dependencies. Better modularity also means better testability, which in turn improves maintainability because it allows confident change.

It should be noted, however, that none of these principles is absolute, and sometimes they even conflict, so a balance needs to be struck. For instance, improving the separation of concerns may have a negative effect on information locality. More generally, reducing the complexity of individual components often means increasing granularity, which increases the complexity of the larger system that composes and uses these components. Finer granularity brings a higher degree of abstraction, making it easier to understand intent, whereas coarser granularity makes it easier to understand the actual operation. The right balance between such aspects of a software system often depends on external factors, including the available tools, the code review process, and social and cultural factors.

Furthermore, even though the principles below are generally helpful for creating maintainable and reusable software, any of them may be discarded temporarily or even permanently for a given component, if there is a good reason to do so. Such a reason should however be thoroughly documented, so that others understand why the component was written in violation of best practice and do not try to fix the perceived shortcoming.

With regard to MediaWiki, it's important to realize that a lot of the core code is quite old, or was built upon antique foundations, and often does not conform to the principles described in this document (as of summer 2015). This legacy code more often serves as a bad example than as a guide to how new code should be written. However, writing new code in such an environment often requires compromises, and improving the legacy code base requires patience and due care as well as attention to practical factors such as runtime performance and scalability.

-> Reusability (link/move)
 * documentation

-> Maintainability (link/move)
 * what vs why
 * operation vs intent
 * understand and change

---

No Global State

 * global state means no isolation/modularity. This is bad for:
   * debugging
   * testability
   * re-use
 * global state is an implicit addition to the declared interface
 * Singletons are a lie (or rather, they are relative)
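
The point that global state is an implicit addition to the declared interface can be illustrated with a minimal Python sketch. All names here (SiteConfig, page_title, page_title_global) are hypothetical and chosen only for illustration; they are not MediaWiki code.

```python
from dataclasses import dataclass


# Hypothetical configuration object, used only for this illustration.
@dataclass(frozen=True)
class SiteConfig:
    site_name: str


# Global state: anything that reads this variable depends on it invisibly.
_GLOBAL_CONFIG = SiteConfig(site_name="Example Wiki")


def page_title_global(page: str) -> str:
    # The signature claims the function needs only `page`, but the result
    # also depends on _GLOBAL_CONFIG, which callers cannot see or replace.
    return f"{page} - {_GLOBAL_CONFIG.site_name}"


def page_title(page: str, config: SiteConfig) -> str:
    # Everything the function depends on is part of the declared interface.
    return f"{page} - {config.site_name}"
```

A test or a second caller can pass any SiteConfig to page_title, whereas page_title_global can only be exercised by modifying the shared variable, which affects every other user of it.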

Dependency Injection
Dependency injection refers to the idea that objects should get all they need to function in the constructor (or through a call to a setter, if the respective dependency is optional). Ideally, an implementation would have static dependencies only on interfaces. Concrete implementations would be "injected" by passing the respective instances to the constructor. Using this pattern, classes explicitly declare their dependencies, either in the constructor signature or through setter methods.


 * Constructor vs. Setter
 * minimal static dependencies
 * ad hoc / no framework
 * control over construction
 * Gaia - no kitchen sinks
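
As a rough sketch of constructor versus setter injection, wired by hand without a framework, consider the following Python example. The names PageStore, Logger, InMemoryPageStore and PageRenderer are assumptions made for illustration and do not refer to existing classes.

```python
from typing import Optional, Protocol


class PageStore(Protocol):
    """Interface for the required dependency (hypothetical)."""

    def load(self, title: str) -> str: ...


class Logger(Protocol):
    """Interface for the optional dependency (hypothetical)."""

    def log(self, message: str) -> None: ...


class InMemoryPageStore:
    """One concrete PageStore; the renderer never names this class."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._pages = pages

    def load(self, title: str) -> str:
        return self._pages[title]


class PageRenderer:
    def __init__(self, store: PageStore) -> None:
        # Required dependency: injected through the constructor.
        self._store = store
        self._logger: Optional[Logger] = None

    def set_logger(self, logger: Logger) -> None:
        # Optional dependency: injected through a setter.
        self._logger = logger

    def render(self, title: str) -> str:
        if self._logger is not None:
            self._logger.log(f"rendering {title}")
        return f"<h1>{title}</h1><p>{self._store.load(title)}</p>"
```

PageRenderer depends statically only on the two interfaces. Whoever constructs it decides which implementations are used, which is what gives the caller control over construction.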

Bootstrapping
 * App level registry/factory
 * Static entry points

-> Less static analysis -> security -> Migration (move/link)

Interface Segregation

 * narrow interfaces -> avoid unnecessary dependencies
 * less to implement if an alternative implementation (or mock) is needed
 * Interface design should be driven by usage/need, not by "what can be done with this".
 * Interfaces cost little. Create many.


 * transitive dependencies
 * Low level vs high level, hubs vs authorities...
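
A small Python sketch of narrow, usage-driven interfaces follows. The interfaces TitleLookup and PageWriter, the class DictPageStorage and the function red_links are hypothetical names introduced for this example only.

```python
from typing import Protocol


class TitleLookup(Protocol):
    """Narrow interface: only what link rendering needs."""

    def exists(self, title: str) -> bool: ...


class PageWriter(Protocol):
    """Separate narrow interface for code that saves pages."""

    def save(self, title: str, text: str) -> None: ...


class DictPageStorage:
    """One class may satisfy several narrow interfaces."""

    def __init__(self) -> None:
        self._pages: dict[str, str] = {}

    def exists(self, title: str) -> bool:
        return title in self._pages

    def save(self, title: str, text: str) -> None:
        self._pages[title] = text


def red_links(titles: list[str], lookup: TitleLookup) -> list[str]:
    # Depends on TitleLookup only, so it has no dependency on saving.
    return [title for title in titles if not lookup.exists(title)]


class FixedTitles:
    """Alternative implementation (or test stand-in): one method suffices."""

    def __init__(self, known: set[str]) -> None:
        self._known = known

    def exists(self, title: str) -> bool:
        return title in self._known
```

Because red_links asks only for a TitleLookup, the stand-in FixedTitles needs a single method, and a change to how pages are saved cannot affect this function.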

Separation of Concerns

 * Reduce component complexity
 * mixing concerns -> bundling dependencies
 * warning signs:
   * "and" in the description
   * some dependencies only used in part of the interface or implementation
   * more than 500 ELOC per class, more than 50 ELOC per function.

Information Locality

 * need to know
 * keep it local
 * information hiding

Composition

 * Composition vs. Inheritance
 * Protected vs. Private
 * Static composition vs. injection
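
One way to picture composition in place of inheritance is the following Python sketch. RevisionHistory is a hypothetical class that holds a list privately instead of subclassing list.

```python
class RevisionHistory:
    """Wraps a list instead of inheriting from it (hypothetical example)."""

    def __init__(self) -> None:
        # The list is a private implementation detail.
        self._revisions: list[str] = []

    def add(self, text: str) -> None:
        self._revisions.append(text)

    def latest(self) -> str:
        return self._revisions[-1]

    def __len__(self) -> int:
        return len(self._revisions)
```

Had RevisionHistory inherited from list, operations such as pop or clear would silently become part of its interface. With composition, only the three operations above are exposed, and the internal storage can be replaced without affecting callers.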

Error Handling

 * Use Exceptions
 * Catch late
 * Localize later
 * Document which exceptions are thrown when
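
The following Python sketch shows one possible reading of "catch late, localize later": the exception carries a message key and parameters, intermediate layers let it pass, and only the outermost handler turns it into user-facing text. All names (PageNotFoundError, load_page, word_count, handle_request, MESSAGES) and the message keys are assumptions made for this example.

```python
class PageNotFoundError(Exception):
    """Carries a message key and parameters, not user-facing text."""

    def __init__(self, title: str) -> None:
        super().__init__(f"page not found: {title}")
        self.message_key = "page-not-found"
        self.params = {"title": title}


def load_page(pages: dict[str, str], title: str) -> str:
    """Return the page text.

    Raises PageNotFoundError if `title` is not in `pages`.
    """
    try:
        return pages[title]
    except KeyError:
        raise PageNotFoundError(title) from None


def word_count(pages: dict[str, str], title: str) -> int:
    # No try/except here: this layer cannot do anything useful with the
    # error, so it lets the exception propagate to its caller.
    return len(load_page(pages, title).split())


# Hypothetical message catalogue, keyed by language and message key.
MESSAGES = {
    "en": {"page-not-found": 'The page "{title}" does not exist.'},
    "de": {"page-not-found": 'Die Seite "{title}" existiert nicht.'},
}


def handle_request(pages: dict[str, str], title: str, lang: str) -> str:
    # Catch late: only the outermost layer knows how to report the error.
    try:
        return str(word_count(pages, title))
    except PageNotFoundError as error:
        # Localize later: text is chosen only here, in the user's language.
        return MESSAGES[lang][error.message_key].format(**error.params)
```

The docstring of load_page documents which exception is thrown and when, so callers can decide whether they are the right place to handle it.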

Performance (link/move)

 * Hot path vs cold path
 * no premature optimization, no micro-optimization. Measure!
 * scalability >> performance
 * -> lazy init
 * -> caching
 * -> DB
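
Lazy initialization and caching can be combined in a few lines of Python, as sketched below using functools.cached_property from the standard library. SiteStats and its count_pages callback are hypothetical names; the callback stands in for an expensive operation such as a database query.

```python
from functools import cached_property
from typing import Callable


class SiteStats:
    def __init__(self, count_pages: Callable[[], int]) -> None:
        # Nothing expensive happens at construction time.
        self._count_pages = count_pages

    @cached_property
    def page_count(self) -> int:
        # Computed on first access only, then stored on the instance.
        return self._count_pages()
```

Whether such a cache actually helps depends on how often the value is read and how stale it may become, which is something to measure and not to assume.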

---

Values

 * Typically "newable"
 * Equals, toString, hashing, etc
 * Immutable, unless there is a very good reason
 * If Mutable: LSP, cloning
 * Often no interface, single impl
 * If multiple implementations, use a factory
 * Mutable value -> Model, mutable list/map, etc


 * -> LSP
 * avoid mutable
 * needs generics
 * typical issue for lists/sets

Builders and Cursors

 * Represent state
 * Cursors typically for I/O
 * Builders typically for complex values
 * Model == Builder?
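
As an illustration of a builder for a complex value, the Python sketch below keeps the mutable state in the builder and produces an immutable value at the end. Query and QueryBuilder are hypothetical names and do not describe an existing API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Query:
    """The immutable value produced by the builder."""

    table: str
    conditions: tuple[str, ...]
    limit: Optional[int]


class QueryBuilder:
    """Holds the intermediate, mutable state while the value is assembled."""

    def __init__(self, table: str) -> None:
        self._table = table
        self._conditions: list[str] = []
        self._limit: Optional[int] = None

    def where(self, condition: str) -> "QueryBuilder":
        self._conditions.append(condition)
        return self

    def limit(self, count: int) -> "QueryBuilder":
        self._limit = count
        return self

    def build(self) -> Query:
        # Copy the list into a tuple so later builder calls cannot
        # change a Query that has already been built.
        return Query(self._table, tuple(self._conditions), self._limit)
```

A typical call chain would be QueryBuilder("page").where("ns = 0").limit(10).build().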

Services

 * Typically application scope
 * "Stateless" singletons (state: lazy initialization, caching)
 * Storage, views, etc
 * converters (formatters, serializers, etc)

Factories

 * newXXX has parameters (or the whole purpose is lazy init)
 * inject context knowledge
 * Constructs values or controllers, rarely services

Registries

 * Lazy init of services / factories / registries
 * newXXX has no parameters
 * DI entry point
 * Should not be passed around
 * App: defines application as service network
 * App: may have static singleton
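
A possible shape for an application-level registry is sketched below in Python. The class App, its accessors page_store and renderer, and the function main are assumptions made for this example; the accessors take no parameters and create each service on first use.

```python
from typing import Optional


class PageStore:
    """Hypothetical storage service."""

    def __init__(self) -> None:
        self.pages: dict[str, str] = {"Main Page": "Welcome"}


class Renderer:
    """Hypothetical service that depends on PageStore."""

    def __init__(self, store: PageStore) -> None:
        self._store = store

    def render(self, title: str) -> str:
        return f"<p>{self._store.pages[title]}</p>"


class App:
    """Registry: defines the application as a network of services."""

    _default: Optional["App"] = None

    def __init__(self) -> None:
        self._page_store: Optional[PageStore] = None
        self._renderer: Optional[Renderer] = None

    @classmethod
    def default(cls) -> "App":
        # Static singleton, intended for static entry points only.
        if cls._default is None:
            cls._default = cls()
        return cls._default

    def page_store(self) -> PageStore:
        if self._page_store is None:
            self._page_store = PageStore()
        return self._page_store

    def renderer(self) -> Renderer:
        # The wiring happens here: the registry injects the store.
        if self._renderer is None:
            self._renderer = Renderer(self.page_store())
        return self._renderer


def main(title: str) -> str:
    # Static entry point: the one place that touches the singleton.
    return App.default().renderer().render(title)
```

Renderer receives a PageStore and not the App itself, in line with the rule that the registry should not be passed around. Tests can create their own App instance without touching the singleton.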

Controllers

 * Business case
 * Use services
 * Created by factory
 * May be stateful
 * MVC

Glue

 * adapters
 * decorators
 * event/hook handlers
 * callbacks

---

Class Level

 * Concept
 * Purpose
 * Relationship with other classes/interfaces
 * Usage

Method Level

 * Contract vs Implementation
 * array structures
 * formats / escaping

Medium Level

 * logical data model
 * information flow
 * hooks, options, etc

Guides and Howtos

 * Install & Update
 * Backup/Restore
 * Extending

Testing

 * Injection & Mocking
 * CI & confidence
 * Coverage
 * Unit vs Integration
 * Security -> Interface contracts (escaped vs unescaped)
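
The link between injection and mocking can be shown with a short Python unit test that uses unittest.mock from the standard library. Notifier and its mailer dependency are hypothetical; the mailer is assumed to offer a send(recipient, message) method.

```python
import unittest
from unittest.mock import Mock


class Notifier:
    """Hypothetical class under test; the mailer is injected."""

    def __init__(self, mailer) -> None:
        self._mailer = mailer

    def notify_watchers(self, title: str, watchers: list[str]) -> int:
        for watcher in watchers:
            self._mailer.send(watcher, f"{title} changed")
        return len(watchers)


class NotifierTest(unittest.TestCase):
    def test_sends_one_mail_per_watcher(self) -> None:
        # Because the dependency is injected, a mock can take its place
        # and no real mail is sent during the test.
        mailer = Mock()
        notifier = Notifier(mailer)

        sent = notifier.notify_watchers("Foo", ["alice", "bob"])

        self.assertEqual(sent, 2)
        self.assertEqual(mailer.send.call_count, 2)
        mailer.send.assert_any_call("alice", "Foo changed")
        mailer.send.assert_any_call("bob", "Foo changed")
```

This is a unit test in the narrow sense: it exercises Notifier in isolation. An integration test would additionally check that Notifier works with a real mailer implementation.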