User:DKinzler (WMF)/Software Design Practices

From mediawiki.org

This page aims to establish best practices for software design. It focuses on design patterns and practices that improve the stability of a software system. Improved stability means that a change to one part of the system requires fewer changes to other parts of the system, which improves productivity and reduces the likelihood of errors.

TBD: Cross-link and further integrate with Manual:Coding conventions/PHP.

TBD: expand to cover Dependency_Injection and complete that page.

TBD: relate to SOLID

TBD: relate to GRASP

TBD: Integrate Architecture_guidelines

Modularization

  • Cyclic dependencies between components should be avoided. Two components that have a (public) cyclic dependency (directly or indirectly) do not behave like separate components, but like a single component: when changing one, we cannot be sure that the other does not need changing too.
  • One critical step for modularization is removing cyclic dependencies between components.
  • There shall be no cyclic dependencies between modules, public or private. That is, for code to be split into a separate module, all of the dependencies it has on the code that remains in the old module need to be removed.
  • Cyclic runtime dependencies between modules are ok, but should be handled with care, and should be documented when expected/desired.
  • There shall be no cyclic public dependencies between any components (modules, namespaces, classes, etc).
  • Cyclic private dependencies between components are acceptable (but still undesirable) if the components are members (sub-components) of the same higher level component (e.g. classes in the same namespace may depend on each other internally, methods of the same class may call each other internally, etc).
  • All classes are considered part of their defining module's public interface, unless they are marked as internal. Internal classes are still part of the public interface of their namespace, and can be used from other namespaces in the same module.
  • In the namespace hierarchy, it is discouraged for the code in a namespace to depend on code in any of its child (or more remote descendant) namespaces. This is primarily because it is expected for child namespaces to depend on code in their parent namespace, and having dependencies both ways would create a cycle.
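
Because cycles may be indirect, they are easy to miss by inspection. The following is a minimal sketch of how a dependency map could be checked for cycles with a depth-first search. The function name findDependencyCycle() and the component names in $graph are hypothetical, chosen only for illustration; this is not an existing MediaWiki tool.

```php
<?php
// Hypothetical dependency map: component name => components it depends on.
// The names are illustrative and do not refer to real code.
$graph = [
	'Page' => [ 'Renderer' ],
	'Renderer' => [ 'LinkFormatter' ],
	'LinkFormatter' => [ 'Escaper', 'Page' ], // indirect cycle back to Page
	'Escaper' => [],
];

/**
 * Returns one dependency cycle found in $graph as a list of component
 * names, or null if the graph contains no cycle.
 *
 * @param array<string,string[]> $graph
 * @return string[]|null
 */
function findDependencyCycle( array $graph ): ?array {
	$state = []; // component => 'visiting' | 'done'
	$path = []; // components on the current depth-first search path

	$visit = function ( string $node ) use ( &$visit, &$state, &$path, $graph ): ?array {
		$nodeState = $state[$node] ?? null;
		if ( $nodeState === 'done' ) {
			return null;
		}
		if ( $nodeState === 'visiting' ) {
			// $node is already on the current path, so a cycle was closed.
			$start = array_search( $node, $path, true );
			return array_merge( array_slice( $path, $start ), [ $node ] );
		}
		$state[$node] = 'visiting';
		$path[] = $node;
		foreach ( $graph[$node] ?? [] as $dependency ) {
			$cycle = $visit( $dependency );
			if ( $cycle !== null ) {
				return $cycle;
			}
		}
		array_pop( $path );
		$state[$node] = 'done';
		return null;
	};

	foreach ( array_keys( $graph ) as $node ) {
		$cycle = $visit( $node );
		if ( $cycle !== null ) {
			return $cycle;
		}
	}
	return null;
}

echo implode( ' -> ', findDependencyCycle( $graph ) ), "\n";
// prints "Page -> Renderer -> LinkFormatter -> Page"
```

In this sketch, removing the dependency of LinkFormatter on Page makes the function return null, which corresponds to the three components becoming separable.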

Service Objects

  • Most objects should be either immutable values or "stateless" services. Auxiliary types of objects are mutable value objects, builders, and cursors (which include iterators and streams). Some common kinds of services include factories/registries and persistence services.
  • Service objects do not have to be truly stateless: they may maintain "hidden" state, that is, use things like caching or lazy initialization. However, operations offered by services are to be idempotent unless otherwise noted. Operations that are not idempotent should be reserved for services that provide persistence or I/O, and must be clearly marked as such.
  • Only value objects should be "newable"; services should never be instantiated directly by application logic. Cursors should be obtained from factory services.
  • The code for instantiating services constitutes the "wiring" of the application, and should be isolated as much as possible from application logic.
  • Services shall ask for all required configuration and all required services in their constructor. Optional services and configuration that has defaults may use setters. Services should not rely on global state: any reference to a global variable or call to a non-pure static function is technical debt.
  • Accessing global service instances via the global service locator is preferred to using global state directly. However, use of the global service locator should only be accepted as an intermediary step towards proper injection of services via the constructor.
  • Service containers should never be injected into service objects, because doing so introduces a dependency on all services. Generic configuration objects should never be injected into service objects, because doing so introduces a dependency on all settings.
  • Lazy initialization in value objects is acceptable and sometimes necessary, but should not be taken lightly: hidden costs and hidden failure modes can cause hard-to-find issues. There should always be a non-lazy alternative (e.g. a plain value implementation).
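
As an illustration of the rules above, here is a minimal sketch of a newable value object, a service that receives its collaborators and configuration through the constructor, and a separate wiring function. All names (Greeting, NameLookup, ArrayNameLookup, Greeter, wireGreeter) are hypothetical and exist only for this example.

```php
<?php
// All names below are hypothetical and only illustrate the pattern.

/** Immutable value object: "newable" anywhere in application logic. */
final class Greeting {
	private string $text;

	public function __construct( string $text ) {
		$this->text = $text;
	}

	public function getText(): string {
		return $this->text;
	}
}

/** Service interface that other services can depend on. */
interface NameLookup {
	public function getDisplayName( int $userId ): string;
}

/** Trivial in-memory implementation, e.g. for tests. */
final class ArrayNameLookup implements NameLookup {
	/** @var array<int,string> */
	private array $names;

	public function __construct( array $names ) {
		$this->names = $names;
	}

	public function getDisplayName( int $userId ): string {
		return $this->names[$userId] ?? 'Anonymous';
	}
}

/**
 * "Stateless" service: required services and configuration arrive through
 * the constructor. It sees neither globals nor a service container, and it
 * receives one specific setting rather than a generic configuration object.
 */
final class Greeter {
	private NameLookup $lookup;
	private string $salutation;

	public function __construct( NameLookup $lookup, string $salutation ) {
		$this->lookup = $lookup;
		$this->salutation = $salutation;
	}

	public function greet( int $userId ): Greeting {
		$name = $this->lookup->getDisplayName( $userId );
		return new Greeting( $this->salutation . ', ' . $name . '!' );
	}
}

/** Wiring: the only place in this sketch where services are instantiated. */
function wireGreeter( array $config ): Greeter {
	return new Greeter(
		new ArrayNameLookup( $config['names'] ),
		$config['salutation']
	);
}

$greeter = wireGreeter( [ 'names' => [ 7 => 'Alice' ], 'salutation' => 'Hello' ] );
echo $greeter->greet( 7 )->getText(), "\n"; // prints "Hello, Alice!"
```

Application logic creates Greeting values freely, but it obtains Greeter only from the wiring code, so a test can construct Greeter with a different NameLookup without touching any global state.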

Stability

  • Components that are used in a lot of places should themselves not depend on many other components. Components that do depend on other components a lot should not be used in many places. This keeps the dependency graph shallow.
  • Only pure functions shall be static methods (or global or namespaced functions).
  • Subclassing should never be used just to share code. Use composition and traits for code sharing instead. Static composition (one service-like object directly instantiating another service-like object in its constructor) is acceptable, but injection is generally preferred.
  • Classes from another component should never be extended (subclassed) unless this is explicitly allowed in the documentation of the class. This restriction means that protected members of a class are not by default part of the public interface of the module, and that the behavior of instances of the class is fully under the control of the module.
  • Knowledge locality (aka single source of truth) improves code stability. This means for instance that, while separation of concerns tells us that there should be separate interfaces for serialization and deserialization, a concrete class implementing a specific serialization format may well implement both interfaces, so the knowledge about encoding and decoding a given format resides in a single place, rather than in two classes.
  • To provide confidence in the interface of mutable or configurable objects, a method's contract should document the expected interaction with other methods. For instance, the contract of a put() method on an interface for hash maps may specify that the value passed to the put() method is guaranteed to be returned by the get() method when called with the same key.
  • Interface contracts should be enforced by a compliance test. Traits can be used to conveniently apply the same test cases to all implementations of an interface.
  • Static entry points that serve as hook handlers should contain minimal code, typically getting any needed services from the global service locator, setting up the handler object, and calling the non-static handler method on that. This way, the actual handler code is isolated from global state, so unit tests for it do not require global fixtures.
  • Interfaces make bad extension points, because they cannot be changed without breaking implementations; use abstract base classes instead.
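
The put()/get() contract and the trait-based compliance test mentioned above could look roughly like the following sketch. The interface KeyValueMap, both implementations, and the trait KeyValueMapContractTests are hypothetical. Real tests would normally use a framework such as PHPUnit; plain exceptions are used here only to keep the example self-contained.

```php
<?php
// Hypothetical interface and implementations, for illustration only.

interface KeyValueMap {
	/** Contract: after put( $k, $v ), get( $k ) returns $v until $k is overwritten. */
	public function put( string $key, $value ): void;

	/** Contract: returns the value last put under $key, or null if there is none. */
	public function get( string $key );
}

final class ArrayMap implements KeyValueMap {
	private array $data = [];

	public function put( string $key, $value ): void {
		$this->data[$key] = $value;
	}

	public function get( string $key ) {
		return $this->data[$key] ?? null;
	}
}

final class ArrayObjectMap implements KeyValueMap {
	private ArrayObject $data;

	public function __construct() {
		$this->data = new ArrayObject();
	}

	public function put( string $key, $value ): void {
		$this->data[$key] = $value;
	}

	public function get( string $key ) {
		return $this->data->offsetExists( $key ) ? $this->data[$key] : null;
	}
}

/** Shared test cases: every implementation must pass the same checks. */
trait KeyValueMapContractTests {
	/** Each test class supplies the implementation under test. */
	abstract protected function newMap(): KeyValueMap;

	public function testGetReturnsWhatWasPut(): void {
		$map = $this->newMap();
		$map->put( 'a', 1 );
		$map->put( 'b', 2 );
		$map->put( 'a', 3 ); // the later value must win
		$this->check( $map->get( 'a' ) === 3, 'get() must return the last value put' );
		$this->check( $map->get( 'b' ) === 2, 'other keys must be unaffected' );
	}

	public function testGetUnknownKeyReturnsNull(): void {
		$this->check( $this->newMap()->get( 'missing' ) === null, 'unknown key must yield null' );
	}

	private function check( bool $ok, string $message ): void {
		if ( !$ok ) {
			throw new LogicException( static::class . ': ' . $message );
		}
	}
}

final class ArrayMapTest {
	use KeyValueMapContractTests;

	protected function newMap(): KeyValueMap {
		return new ArrayMap();
	}
}

final class ArrayObjectMapTest {
	use KeyValueMapContractTests;

	protected function newMap(): KeyValueMap {
		return new ArrayObjectMap();
	}
}

// Run the shared contract tests against both implementations.
foreach ( [ new ArrayMapTest(), new ArrayObjectMapTest() ] as $test ) {
	$test->testGetReturnsWhatWasPut();
	$test->testGetUnknownKeyReturnsNull();
}
```

Adding a third implementation then only requires a new test class with a one-line newMap() method; the contract itself stays documented and tested in a single place.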

Refactoring

[This section is fairly abstract, and should probably live in a separate document. It definitely needs fleshing out with examples and rationales.]

  • factor out: we move any components used by both N and M to yet another new module K. Now, both M and N can depend on K. This creates a tight coupling of M to N (and of both N and M on K).
  • abstract out: for the relevant component C in M that is needed by X, introduce an interface C' that C implements. This interface may then live in the new module N, or in a module K that both M and N depend on. This creates a tight coupling of M to N (and potentially K).
  • translate: for the relevant component C in M that is needed by X, introduce an alternative D for use in N. Code in M now has to translate C to D (and vice versa) when calling code in N. This creates a loose coupling, providing isolation of vocabularies on the domain boundary, but also means writing and touching more code. This should be used when M and N model distinct domains, rather than M using the domain modeled by N directly.
  • hide mutability: A mutable class may implement an immutable interface that can be exposed to consumers that must not be able to modify the object. Such consumers may also operate on a copy of the state exposed by the immutable interface. Note however that such an interface only guarantees immutability in one direction: it guarantees that the consumer cannot modify the object, but it does not guarantee to the consumer that the object cannot change due to side effects triggered by actions the consumer takes. Code that requires such a guarantee would have to create an immutable copy. For this purpose, such immutable interfaces should always have a truly immutable implementation as well as the original, mutable implementation.
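
The "hide mutability" item can be sketched as follows, under the assumption of a simple settings object. The names ReadableSettings, MutableSettings, FrozenSettings, freeze(), and describe() are hypothetical and not taken from any existing code.

```php
<?php
// Hypothetical names throughout.

/** Read-only interface: consumers holding this type cannot modify the object. */
interface ReadableSettings {
	public function get( string $name ): ?string;
}

/** Mutable implementation, for the code that owns the settings. */
final class MutableSettings implements ReadableSettings {
	/** @var array<string,string> */
	private array $values = [];

	public function set( string $name, string $value ): void {
		$this->values[$name] = $value;
	}

	public function get( string $name ): ?string {
		return $this->values[$name] ?? null;
	}

	/** Creates an immutable copy for consumers that need stable state. */
	public function freeze(): FrozenSettings {
		return new FrozenSettings( $this->values );
	}
}

/** Truly immutable implementation of the same interface. */
final class FrozenSettings implements ReadableSettings {
	/** @var array<string,string> */
	private array $values;

	public function __construct( array $values ) {
		// PHP arrays have value semantics, so this is an independent copy.
		$this->values = $values;
	}

	public function get( string $name ): ?string {
		return $this->values[$name] ?? null;
	}
}

/** A consumer that can read settings but has no way to change them. */
function describe( ReadableSettings $settings ): string {
	return 'lang=' . ( $settings->get( 'lang' ) ?? 'unset' );
}

$settings = new MutableSettings();
$settings->set( 'lang', 'en' );
$snapshot = $settings->freeze();
$settings->set( 'lang', 'de' );

echo describe( $settings ), "\n"; // prints "lang=de": the view follows later changes
echo describe( $snapshot ), "\n"; // prints "lang=en": the immutable copy does not
```

The two outputs show the one-directional guarantee: describe() cannot modify either object, but only the frozen copy is protected against changes made elsewhere.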

Definitions

  • Component: a collection of code with a well-defined public interface. Classes are the typical example, but namespaces, modules, and even individual functions can also be considered components.
  • Module (aka package, aka library, aka bundle): a set of components that share a versioning history and release cycle. In practice, this usually means things that are in a single git repo. In some cases, a single repo may contain multiple modules (or proto-modules) with the intention to move them into separate repos in the future.
  • Public dependency: any mention of (or knowledge of) another component in the public interface of a component. Public dependencies constitute tight coupling. Note that protected methods shall be considered part of the public interface of a class. The constructor signature of a class is considered part of the public interface if the class is considered "newable".
  • Private dependency (internal dependency, implementation dependency): mention of (or knowledge of) another component merely in the private code or declarations of a component.
  • Runtime dependency: one component using another component at runtime.
  • Newable classes: classes that application logic may instantiate directly, as opposed to requesting an instance from a factory or from a service container.
  • Static entry point: a code execution point that is static by necessity. This is at least the top level code of the application's public scripts (such as index.php), but also callback functions, such as hook handlers, that are referenced from declarations in configuration or extension.json. Nothing can be injected into such code, and all it can work with are the parameters passed to it, plus global state.
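
A static entry point that follows the guidance in the Stability section might look like the sketch below. ServiceLocator, HeadingFormatter, PageHeaderHooks, and the hook signature are invented stand-ins for this example; they are not MediaWiki APIs.

```php
<?php
// Hypothetical stand-ins, invented for this sketch.

final class HeadingFormatter {
	public function format( string $title ): string {
		return ucfirst( str_replace( '_', ' ', $title ) );
	}
}

/** Minimal global service locator: global state lives only here. */
final class ServiceLocator {
	private static ?ServiceLocator $instance = null;
	private ?HeadingFormatter $headingFormatter = null;

	public static function getInstance(): self {
		if ( self::$instance === null ) {
			self::$instance = new self();
		}
		return self::$instance;
	}

	public function getHeadingFormatter(): HeadingFormatter {
		if ( $this->headingFormatter === null ) {
			$this->headingFormatter = new HeadingFormatter();
		}
		return $this->headingFormatter;
	}
}

final class PageHeaderHooks {
	private HeadingFormatter $formatter;

	public function __construct( HeadingFormatter $formatter ) {
		$this->formatter = $formatter;
	}

	/**
	 * Static entry point, referenced from configuration. It only fetches
	 * services, builds the handler object, and delegates to it.
	 */
	public static function onBuildPageHeader( string $title, array &$headerLines ): bool {
		$handler = new self( ServiceLocator::getInstance()->getHeadingFormatter() );
		return $handler->doBuildPageHeader( $title, $headerLines );
	}

	/** The actual handler logic: no global state, so no global test fixtures. */
	public function doBuildPageHeader( string $title, array &$headerLines ): bool {
		$headerLines[] = '== ' . $this->formatter->format( $title ) . ' ==';
		return true;
	}
}

$lines = [];
PageHeaderHooks::onBuildPageHeader( 'main_page', $lines );
echo $lines[0], "\n"; // prints "== Main page =="
```

A unit test can construct PageHeaderHooks directly with any HeadingFormatter and call doBuildPageHeader(), bypassing the service locator entirely.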