User talk:DKinzler (WMF)/Software Design Practices

About this board

Smalyshev (WMF) (talkcontribs)

There shall be no cyclic dependencies between modules, public or private. That is, for code to be split into a separate module, all of the dependencies it has on the code that remains in the old module needs to be removed.

Not sure I understand this. If I split code from module A to module B, why B can't depend on A? Surely I don't want two-sided A<->B, but I don't see why one-sided dependency on old module is bad? This obviously makes no sense if the old module was core, since every extension would depend on core. But even between two extensions - if I had search functionality in Wikibase and I took it out to separate module, that module would still depend on Wikibase.

Maybe it was meaning the reverse - that the code that remains in old module can not depend on the code being moved out? That would make sense, if we don't consider hooks and other API overrides (such as configurable factories, etc.) a dependency.

In the namespace hierarchy, it is discouraged for the code in a namespace to depend on code in any of its child (or more remote descendant) namespaces

This looks a bit too limiting. E.g. if I have Search class that has fields and queries, I would have Search\Field and Search\Query namespaces where field and query related interfaces would live, and then the class that implements the main engine may live in main Search namespace, but still use interfaces and possibly implemented code from Search\Field and Search\Query. Not being able to do this requires hierarchies to be completely flat.

Most objects should be either immutable values or "stateless" services

Not sure what is meant by "most". Given that PHP has no concurrency, I don't see why we should emphasize immutability that much.

Service objects do not have to be truly stateless, they may maintain "hidden" state, that is, use things like caching or lazy initialization.

I think here it should be included that if the state is hidden, there should be a way to either clear it or at least reset it to a known state, otherwise such class is untestable.

Reply to "More notes"

Details I find confusing or disagree with

1
Thiemo Kreuz (WMDE) (talkcontribs)
"In the namespace hierarchy, it is discouraged for the code in a namespace to depend on code in any of its child (or more remote descendant) namespaces. This is primarily because it is expected for child namespaces to depend on code in their parent namespace, and having dependencies both ways would create a cycle."

Isn't this the wrong way around? Typically, I put the smaller classes that have less dependencies in the longer, deeper namespaces. These are then used by classes higher up in the namespace hierarchy, or even in adjacent branches.

"[…] is technical debt."

I wouldn't use the term "technical debt" here as if this would be the only possible definition of technical debt. I would argue that every code we write, no matter how well written it is, is technical debt. We introduced it, we need to maintain it, we need to constantly pay a little bit just to be able to have it around. The only code without any technical debt is code that doesn't exist.

"use of the global service locator should only be accepted as an intermediary step".

Thanks for this. The relevance of this can not be stressed enough. I see more and more code using service locators like nuts, but it creates almost all of the same problems as singletons a.k.a. global state.

"Generic configuration objects should never be injected into service objects, because doing so introduces a dependency on all settings."

That's a bit far stretched. When I have a class that accepts a Config object, and I clearly document the settings this Config object is expected to have, I can inject the global Config because it is compatible. This does not suddenly create a "dependency on all settings". I'm just passing useless settings the class doesn't use. As long as the class does not processes arbitrary settings, this is not a dependency.

"Subclassing should never be used just to share code."

Again, that's a bit heavy. There is not really a clear border between "this is clearly only done to share code" and when it is not. There is a gray area of personal style, opinions, and what a group of developers can agree on.

"Interfaces make bad extension points, because they cannot be changed without breaking implementations; use abstract base classes instead."

I find this somewhat weird to say. The only interface I know that is always acceptable does have a single, very trivial method. Using an abstract class instead of an interface does not make it easier to change the signature of this method. In case this is about the addition of more methods to the interface, I have to disagree. There should not be a need to "extend" an interface this way. It is a different interface then. Give it a new name, possibly make it extend the old one, or just copy the one trivial method header from the old one, and deprecate it.

Also, the basically only thing abstract classes do is enabling code sharing – which this very document forbids just a few lines above.

Reply to "Details I find confusing or disagree with"
CParle (WMF) (talkcontribs)

The first six bullets in "modularization" are all about cyclic dependencies. I think it'd be useful to spell out what a cyclic dependency is, and what the difference between a public and a private cyclic dependency is - before you talk about them, rather than at the end in 'definitions'. If we had this I think maybe some of the bullets could be consolidated.


This bullet

In the namespace hierarchy, it's also acceptable for classes in (transitive) parent namespaces and (transitive) child namespaces to depend on each other internally (but never publicly)

... suggests that that something in a parent namespace might depend on something in a child namespace, which I guess is not what you meant. Maybe something like

In the namespace hierarchy, classes lower in the hierarchy may depend on classes higher up, but not the other way around, and not on classes in other namespaces at the same level. Examples:

  • a class in Namespace/SubNamespaceA can depend on a class in Namespace
  • a class in Namespace should know nothing about classes in Namespace/SubNamespaceA
  • a class in Namespace/SubNamespaceA should know nothing about classes in Namespace/SubNamespaceB

I'm not really sure what you mean by "never publicly" ... do you mean that a class in Namespace/SubNamespaceA shouldn't directly inherit from a class in Namespace, but composition is ok?

DKinzler (WMF) (talkcontribs)

> In the namespace hierarchy, classes lower in the hierarchy may depend on classes higher up, but not the other way around, and not on classes in other namespaces at the same level.

I'm tempted to agree, but I have seen many examples to the contrary. E.g. factory methods in the parent namespace knowing about specialized implementations in nested namespaces. Or a class defined in the parent namespace pulling in utilities from nested namespaces as composites. Or even all the exceptions used in the parent namespace living in a sub namespace.

My idea here was to allow this as an implementation detail, but not publicly. Public dependency would not just be subclassing, but also any mention in a public (or even protected) method signature.

But of course, any cycles is always undesirable. And I'm rather unsure about whether there should be any special rules for the namespace hierarchy, or whether the hierarchy should be ignored entirely, and we just have rules for access between namespaces.

Because, as far as I can see, as long as there is no cycle, any access to any namespace from any other is generally allowed (we can forbid some, but the default is to allow it). At least for your third rule, I don't think it's enforcable - otherwise, nothing in the MyApp namespace (or any of it's child-namespaces) could ever use anything from outside MyApp, forbidding the use of any libraries. That doesn't make sense to me.

So, if MyApp/Foo can access CoolTool/Barf/Frobniz, why should MyApp/Foo not be allowed to access MyApp/Bar or MyApp/Foo/Bla?

The rule I am trying to establish is that "private" cycles between classes are "acceptable" (but still undesirable!) within a namespace. And maybe also "along" the hierarchy? But I'm happy to drop that, if it's not needed.

CParle (WMF) (talkcontribs)

> if MyApp/Foo can access CoolTool/Barf/Frobniz, why should MyApp/Foo not be allowed to access MyApp/Bar or MyApp/Foo/Bla

(scratches head)

Ok ... but if there are no access rules between levels of a hierarchy, what's the point of the hierarchy? Is it just a filing system?

DKinzler (WMF) (talkcontribs)

In my mind, yes, It's primarily a system of grouping. Namespaces form "components" and their dependencies should be acyclic, but that is unrelated to the hierarchy.

I have searched around a bit, and it seems like there are a number of people who agree with your rule of "you can access any other namespace except your descendant namespaces". But the reasoning seems to be based on the assumption that the child namespaces would naturally depend on the parent namespaces, so also depending the other way around would create a cycle.

So, it seems to me like this rule is really a natural consequence of "no cycles", and can be used as a rule of thumb to avoid cycles among closely related namespaces, but it doesn't provide additional value beyond that. As far as I can see, having a dependency on a child namespace is only bad because it's likely to create a cycle.

Now, all this doesn't really say anything about whether it should be acceptable to have "private" cyclic dependencies along lines of the namespace hierarchy. I think I'll just remove that bullet point. It's bound to create confusion, and since we want to get rid of all cycles, why tollerate more?

DKinzler (WMF) (talkcontribs)

I removed the bullet point about cycles being acceptable in the hierarchy.

I introduced a point discouraging dependencies on child namespaces:

  • In the namespace hierarchy, it is discouraged for the code in a namespace to depend on code in any of its child (or more remote descendant) namespaces. This is primarily because it is expected for child namespaces to depend on code in their parent namespace, and having dependencies both ways would create a cycle.
Reply to "Modularization"
Smalyshev (WMF) (talkcontribs)

Classes from another component should never be extended (subclassed), unless this is explicitly allowed in the documentation of this class.

This seems to be very restrictive. Component author may not foresee that somebody would need to extend/amend the functionality of the class. Requiring this would generate either copypaste or needless proxying.

Static entry points that serve as hook handlers should contain minimal code, typically getting any needed services from the global service locator, setting up the handler object, and calling the non-static handler method on that. This way, the actual handler code is isolated from global state, so unit tests for it do not require global fixtures.

I presume, this is for hooks that actually do depend on global state? Because a lot of hook implementations are not, though they can mutate global state (e.g. change registries). In this case, not sure how handler objects would help...

DKinzler (WMF) (talkcontribs)

> I presume, this is for hooks that actually do depend on global state? Because a lot of hook implementations are not, though they can mutate global state (e.g. change registries). In this case, not sure how handler objects would help...

"Pure" functions can be static, that's true. I can make that explicit if you like.

Handler objects help with exactly one thing: they make it easy to inject mocks for testing.

The idea is to consider the object separate from the wiring code in static methods (entry points). Pushing any access to globals out of the object makes it easier to reason about the object's behavior.

CParle (WMF) (talkcontribs)

+1 - I've been using this approach, and like it

Smalyshev (WMF) (talkcontribs)

Very common pattern for hooks is passing something to the hook (e.g. registry, state object, plain array, HTML, etc.) and the hook modifying it. Not sure if it counts as "pure" function strictly speaking but probably close to one as it doesn't change state outside what is explicitly passed to the function. Another pattern is for the hook to register certain things with a certain registry without passing that registry in the arguments - usually because there's no single object that can be passed. That would be harder to deal with without refactoring the hooks.

DKinzler (WMF) (talkcontribs)

> Very common pattern for hooks is passing something to the hook (e.g. registry, state object, plain array, HTML, etc.) and the hook modifying it. Not sure if it counts as "pure" function strictly speaking but probably close to one as it doesn't change state outside what is explicitly passed to the function.

This would be pure enough in this context, since the hook handler doesn't need to access ''global'' state, even though it modifies ''some'' state.

> Another pattern is for the hook to register certain things with a certain registry without passing that registry in the arguments - usually because there's no single object that can be passed. That would be harder to deal with without refactoring the hooks.

Well, according to this guideline, you could get it from MediaWikiServices::getInstance() and inject it into the handler object.

But a refactoring of (pretty much) all the hooks is in the pipeline anyway, see Platform Evolution/Roadmap/Projects/Reduce Extension Interface Surface Area and https://phabricator.wikimedia.org/T212482.


DKinzler (WMF) (talkcontribs)

> This seems to be very restrictive. Component author may not foresee that somebody would need to extend/amend the functionality of the class. Requiring this would generate either copypaste or needless proxying.

The whole point is that we want to support only explicitly defined extension points. Things that are not explicitly defined extension points should not be used as such. Extensions modifying core behavior in "unforseen" ways is exactly the problem we are trying to avoid.

So yes, it's restrictive. This restriction imposed on extension developers provides freedom to modify core, because the following changes don't have to be treated as breaking changes: modification and removal of protected methods and fields, broadening of parameter types, narrowing of return types, addition of optional method parameters, addition of abstract methods.

Treating all these things as breaking changes for all classes is really annoying and slows us down. And in fact, we don't: the deprecation policy already says that protected methods are only stable on classes that are explicitly defined to support sublcassing by extensions. So, in practice, subclassing of random classes is already unsupported.

Smalyshev (WMF) (talkcontribs)

My concern here is that there's no way we can predict how extension authors - including ourselves in the future - would need to modify existing functionality. If the method is protected and not private, it is already usually intended to be a flexibility point at least inside the module - then it is conceivable that outside extension may have similar needs.

I certainly get that only extending in specific pre-declared points makes refactoring core easier. On the other hand, it makes extending harder. I would probably be fine with "if you extend in places that are not specific extension point, then you'd have to fix your code once it changes" but if we're making policy that no extension should go beyond these points ever (of course this would be the case only for WMF extensions, we can't control anything else) then it would come back to core as a flood of requests for more extension points because it's not possible to foresee all needs.

So I guess the ultimate question is what "unsupported" means. If it means "we'll be trying to avoid it, and be extra vigilant when it happens and ready for breakage when refactoring occurs" then it's fine. If it means "we won't accept code that does it" then it's probably too narrow.

DKinzler (WMF) (talkcontribs)

> it would come back to core as a flood of requests for more extension points because it's not possible to foresee all needs.

I indeed want to see that flood, because it give me a lot of information about into what direction the platform should evolve.

Whether this is a SHOULD NOT or a MUST NOT can be discussed. The main point is really about the deprecation policy. I'd say that if an extensions messes with things that are not extension points in core, it should be expected that this extension breaks, since things that are not part of the public interface may change without notice.

Reply to "Stability"
Smalyshev (WMF) (talkcontribs)

Not sure what is the purpose of this section - is this just a list of possible refactoring methods? The recommended ones? Frequently used ones? Guide on how to do refactoring correctly? Some clarification IMHO would be useful.

DKinzler (WMF) (talkcontribs)

It's a brain dump. Ignore it for now.

Reply to "Refactoring"
DBarratt (WMF) (talkcontribs)

This part:

Interfaces make bad extension points, because they cannot be changed without breaking implementations; use abstract base classes instead.

Implies that the stability of an abstract class is greater than an interface. I would argue that an interface is more stable because it cannot be changed. Instead of changing it, a new interface must be created (or it is a breaking change, which is fine if the implementers have enough time to update).

In short, this statement is saying that the stability of the implementers is more important than the stability and consistency of the callers. I think the callers should be preferred over the implementers. Therefor, Interfaces should be preferred over (or used in addition to) abstract classes.

DKinzler (WMF) (talkcontribs)

This has all been discussed in length at phab:T193613. In short: introducing a new interface forces callers to do an instanceof check. Adding methods to an abstract base class don't force callers to do anything, and implementors are fine too, as long as the method has a default implementation. Interfaces cannot be changed in compliance with the deprecation policy. So abstract base classes are totally against what the textbook says, but our collective wisdom couldn't come up with anything better, even though nobody particularly likes the solution.

Reply to "Stability"
CParle (WMF) (talkcontribs)

It's hard to know what you're getting at here. Is 'abstract out' a better solution than 'factor out'? Is 'translate' better still? How is 'hide mutability' related to the other bullet points?

No doubt there are many people in the foundation who are comfortable with notation like M, N, K, C, C', X, etc, but it's 25 years since I did any math, and I am not one of those people. I would like to see descriptive variable names instead

DKinzler (WMF) (talkcontribs)

Yea, this is all a bit abstract and unconnected. A rough brain dump, really. That section should be moved to a subpage, and nobody should look at it until I flesh it out a bit :)

Reply to "Refactoring"
KHarlan (WMF) (talkcontribs)

I think I followed most of this document but for me it will be really useful to have concrete examples either linked to, or below each bullet point to illustrate "do this" and "don't do this".

Tangentially, it would also be helpful if those examples eventually made it into Manual:Coding conventions/PHP.

DKinzler (WMF) (talkcontribs)

Good examples are a lot of work :) I agree, of course, but making or finding *good* examples will take a long time.

The Coding conventions operate on a much lower level, I don't think the same examples would be useful. The documents should be cross-linked, though.

KHarlan (WMF) (talkcontribs)

The Coding conventions operate on a much lower level, I don't think the same examples would be useful. The documents should be cross-linked, though.

They mostly do, but the sections on Visibility and Integration in that document go beyond stylistic concerns. Anyway, yeah, a separate document would be fine as long as it's easy to find from there.

Reply to "Examples please"
There are no older topics