Topic on User talk:Daniel Kinzler (WMDE)/MCR-SlotRoleHandler

Comments on the May 30 rewrite

22
Anomie (talkcontribs)

SlotRoleHandler seems mostly ok. What is a "MessageLocalizer", and what exactly is the "heading text" that it is supposed to determine? ... Ah, apparently it's a new micro-interface for IContextSource for some reason. It's still not clear what "heading text" it's supposed to determine.

SlotRoleRegistry seems mostly ok, although it seems to be different in a few ways from what was discussed at the Hackathon: we had "getHandler" taking a SlotRecord and "getDefaultHandler" taking $role and $title, and didn't have "getDefaultContentModel" at all (presumably leaving that to the handler itself?).

SlotParserOutputProvider seems rather pointless to me. If you really want a micro-interface for that, let's drop the talk about having an implementation other than RevisionRenderer. I'd also say there's little point in making some non-ParserOutput interface around ParserOutput.

RenderedRevision probably also needs getters for the RevisionRecord and ParserOptions. Why is "RevisionRenderer" not the obvious answer to "what constructs this"?

RevisionRenderer: $forUser can't be just a User object. I doubt it's needed at all (access restrictions should probably be checked before getting a rendering rather than being the responsibility of the renderer), but if it is needed we'd need options for "public" and "raw" too (cf. CentralIdLookup's $audience or RevisionRecord's $audience + $user).

SlotOutputCombiner seems like another useless micro-interface to me, and even more unnecessary as a concrete class.

Tgr (WMF) (talkcontribs)
  • SlotRoleHandler::getRawHtml: the way it is defined seems somewhat overcomplicated compared to just returning a ParserOutput containing both the HTML and the metadata.
    Also I don't understand the point of passing in a SlotParserOutputProvider - isn't that exactly what the SlotRoleHandler's purpose is, to contain role-specific logic about how a given slot should be rendered? I don't get what circularity is being avoided here.
    Also, the "raw" HTML should not be wrapped in a div etc. There should probably be a separate method for that. The raw HTML will be needed for visual diffs, probably some AJAX calls...
  • SlotRoleHandler::__construct / MessageLocalizer: would the MessageLocalizer include the language or is it just a stateless service? In the first case, different pages might require different rolehandler instances, which seems ugly. In the second case, where does it get passed in?
  • SlotRoleRegistry::defineRole: $defaultModel seems unnecessary. I suppose you want to make it easy to use the same generic instantiator for different roles; I would rather see something more flexible used in place of a callable (maybe an ObjectFactory configuration array, or an extended callable as in Hooks) which can take that extra argument when needed. Same goes for defineCustomRole.
    Also, how does it handle multiple extensions sharing a role? There would have to be some kind of priority parameter (or would it enforce that exactly one instantiator can claim the slot?)
  • SlotRoleRegistry::getRoleHandler: the older draft passed a SlotRecord, which can take the content model into account. I'm not sure that can be avoided.
    Also, as in the older draft, a getDefaultRoleHandler(role, LinkTarget) would be needed, for cases where the SlotRecord does not yet exist. Which also means two different instantiators.
  • SlotRoleRegistry::getDefaultContentModel: should be a method of the SlotRoleHandler IMO.
  • SlotRoleRegistry::isSupportedContentModel: if it's not supported, SlotRoleRegistry::getRoleHandler can just return false (as long as it gets a content model or full SlotRecord). OTOH we might need getSupportedConentModels for something like Special:ChangeContentModel - and for that SlotRoleHandler would need a similar method too.
  • SlotRoleRegistry::getAllowedRoles/getRequiredRoles: these are good but are they enough for the UI layer? It might make sense to add some of the optional roles to the initial edit form (say, categories are not required for a new page, but we might want to hint to the user that they should be provided), which would in addition require something like getSuggestedRoles or getDefaultRoles.
    Also, should these only depend on the title and not the contents of the current revision?
  • RenderedRevision::getParserOutput: nitpick: I liked the name getCombinedParserOutput better.
  • RevisionRenderer::combineSlotOutput: there doesn't seem to be any need to expose this. RenderedRevision can just use a private callback.
  • RevisionRenderer::getRenderedRevision: (nitpick: I'd just call this renderRevision) having both $forUser and ParserOptions::getUser seems like unnecessary duplication.
  • SlotOutputCombiner: seems unnecessary (but given that this is an internal detail of RevisionRenderer and not a public interface, can be discussed later).
Duesentrieb (talkcontribs)

getRawHtml(): If getRawHtml does not usually modify the HTML, then it could just return the ParserOutput object. If it does, constructing yet anouther ParserOutput, with the need to copy over complex state (which is quite tricky, as it turns out), seems overkill. It would make the interface prettier, but it would make the code more complicated, both in inside the method and in the caller.

SlotParserOutputProvider is passed in because it serves as an in-place cache. I added an explanation to the spec.

"raw" here refers to the fact that the modifications that ParserOutput::getText() applies have not been applied. We could also leave wrapping and adding the header to the caller. But then, not much is left of the idea to let the SlotRoleHandler control how the slot is represented on the page.

Perhaps getRawHtml should not exist, and all of this should be left to RevisionRenderer? The SlotRoleHandler could still indicate via getOutputPlacementHints() that the slot should not be shown, or where the output is placed. It could however not provide different output. But perhaps that's a good thing?

MessageLocalizer: MessageLocalizer exists since 1.30 and is compatible with IContentSource. It does not take a language parameter but uses the user interface language. However, the output language can be controleld by the caller using Message::inLanguage().


SlotRoleRegistry::defineRole: If you don't like the arguments, let's just have one method that takes a callable. The idea was indeed that most roles can be defined using the SlotRoleRegistry base class.

Multiple extensions cannot define the same role. defineRole() should fail for a role that is already defined.

A generic handler for the "main" role is pre-defined. Overriding that for specific pages (namespaces) could be supported later, but I don't think we need to work that out in detail just yet. But it does mean that getRoleHandler() needs to know the page title.

getRoleHandler: I decided to try out Brad's idea to let SlotRoleHandler be agnostic of the content model. That means the SlotRoleHandler always depends on the role and the title, but never on the concrete model.

getDefaultContentModel is now in SlotRoleHandler.

isSupportedContentModel is now in SlotRoleHandler. We may have specialized SlotRoleHandlers that can not handle all kinds of content. But then, maybe it's easier to implement fallback behavior inside such slot handlers. Then this method would not be needed.

getAllowedRoles/getRequiredRoles are indeed basic, but should be sufficient for now. In my mind, more advanced control over the edit form should reside in the page type handler.

By the logic that the role handler only depends on the role and the title, the allowed/required roles only depend on the title, not the concrete model.

getParserOutput renamed to getRevisionParserOutput.

combineSlotOutput: good point on not exposing this, and passing it as a callback instead. Thank you, done!

getRenderedRevision: the user used for rendering, and the user used for audience checks, is not necessarily the same. In particular, when passing no user for audience checks (that is, the public audience), we still need a way to say what user should be used when resolving ~~~~.

SlotOutputCombiner: replaced by an anonymous callback.

Anomie (talkcontribs)

I think we'll need to keep isSupportedContentModel on the SlotRoleHandler. Consider TemplateStyles, if it allows for storing the CSS in a non-main slot on the template that slot should only contain sanitized-CSS. Some slot that stores structured data might require a JSON content model (and do something other than Content->getParserOutput() to make the output too) instead of making its own content model.

we still need a way to say what user should be used when resolving ~~~~

Tilde-expansion is part of PST. If something does need a user for some reason, it should get the user from the ParserOptions object.

Tgr (WMF) (talkcontribs)

Aside: what do you think about using mock gerrit patches for interface/architecture discussion in the future? Inline gerrit comments (with their "mark as resolved" option) seem like a better way of keeping track of many small threads of discussion than Flow or wikitext.

SlotRoleHandler::getRawHtml:

> If [SlotRoleHandler::getRawHtml] does [modify the HTML], constructing yet another ParserOutput, with the need to copy over complex state (which is quite tricky, as it turns out), seems overkill. It would make the interface prettier, but it would make the code more complicated, both in inside the method and in the caller.

If getRawHtml were replaced with getParserOutput, it would not need to merge anything AFICS, just clone the ParserOutput it gets from the ContentHandler and modify the HTML. And the caller can just fetch the raw HTML from it and be no worse off than using getRawHtml. And the method would be able to add ResourceLoader modules and whatnot, which might be easily a requirement (and might be different for different roles using the same content type so cannot be punted to the content handler).

> SlotParserOutputProvider is passed in because it serves as an in-place cache. I added an explanation to the spec.

The spec says "This follows the in-place caching approach of the current architecture." but I don't get what that refers to. Is the current equivalent $content->getParserOutput()->getRaw()? I don't see any caching in that. I would expect the new classes to behave similarly - if you want caching, use getRenderedRevision::getParserOutput, not the SlotRoleHandler directly.

> "raw" here refers to the fact that the modifications that ParserOutput::getText() applies have not been applied.

So the assumption is that those transformations will always be applied to the combined slot, and consequently the same options apply to all slots (e.g. no disabling section edit links for one slot but not the others)? That seems reasonable, just want to make it explicit.

Anyway my original point was that we need both a way to get the "naked" HTML (for visual diffs and maybe action=render&slot=foo) and to get the wrapped one.

> Perhaps getRawHtml should not exist, and all of this should be left to RevisionRenderer?

I don't think we can avoid having a method for slot-specific wrapping of the raw content HTML, for adding things like the documentation header (and possibly associated styles/scripts). I can't think of a use case for slot-specific generation of the raw content HTML, I think it's fine to have callers use the Content for that. (This is the kind of limitation that should be called out in the RfC though, to see if someone can spot a use case it would break.)

SlotRoleRegistry::getRoleHandler:

The spec says "Note that the same role may have different handlers depending on the page title" and here you say "defineRole() should fail for a role that is already defined." which seems contradictory. I agree that the name "SlotRoleHandler" kind of implies you get one for each role; but as it is specified now, that seems problematic.

SlotRoleHandler::getRawHtml gets a SlotRecord, so it doesn't make much difference whether different role handlers are returned for different content types, or always the same one and it adapts behavior to different content types via some internal behavior (proxying to a subhandler or whatever). That is not the case for getSlotHeadingText/getOutputPlacementHints. I'm not sure about the latter but changing the heading text / wrapping logic based on content type (or title, if that approach is prefered) seems like a reasonable use case. So either that should be passed to all methods, or it should be passed to getRoleHandler (and then instead of singleton-like handler objects we could have a separate one for each revision slot and it could hold the SlotRecord or LinkTarget as an internal variable).

As for content model agnosticism: using a SlotRecord instead of a LinkTarget is more generic as you can always get a LinkTarget from a SlotRecord (in theory it might require a DB lookup, but it's hard to imagine a situation where the slot record is loaded but the revision record isn't). And whatever use case you have in mind for behavior splitting, it seems like a fragile assumption that it will always be deterministic based on the title.

(Unless your goal is to discourage/disallow using the content model as "mutable" state for a slot. If that's the case, it should be more explicit as other parts of the system will have to be adapted - e.g. what will happen to the content type when the page gets renamed? Plus, you need to special-case the main slot again. )

supported content model:

As I said above, I see two options:

  • content model is expected to be a deterministic option of the role name and title (with special casing for things like Flow in the main slot). No need for isSupportedContentModel as it's basically the same thing as getDefaultContentModel, but there should be something like updateContentOnPageMove($oldTitle, $newTitle) which converts the old content into a new one that has the expected content model.
  • content model is sort of a parameter to the role which acts as mutable state (so an editor can change the slot from documentation<wikitext> to documentation<markdown>). We'll need getSupportedConentModels (not isSupportedConentModel) to tell users trying to change the content model what they can change it to.

The difference between the two options is not huge - you can achieve the same thing with having separate documentation-wikitext and documentation-markdown roles. It will result in different diff behavior though, so IMO the question is, what behavior do we want there? Having a single role (so that diffs work through content type changes) seems like the more user-friendly option to me.

getAllowedRoles/getRequiredRoles:

> In my mind, more advanced control over the edit form should reside in the page type handler.

Anything that can be customizable per role must be in the role handlers, as the page type handler cannot know what roles are added by extensions it is not aware of.

> By the logic that the role handler only depends on the role and the title, the allowed/required roles only depend on the title, not the concrete model.

That is unrelated. Allowed/required flags of course can't depend on their role slot's content model (since these flags are only relevant for roles that do not exist in the current revision) but they could depend on the main content model and on the existence/nonexistence of other roles. None of the use cases listed on the RfC seem to require that, but in general it's not hard to imagine situations in which it is useful.

Also, if these flags depend on the title, what happens when the page is moved and existing slots become forbidden / missing slots become required? The spec addresses that but only superficially. IMO it's one of the questions where we should involve UX experts and consider what kind of user interactions would be needed. Shouldrole handlers be able to prevent a page move when the slot does not make sense under the new title? Do we need some role handler function that provides a nice error message when that happens? (Or not prevent, but show a warning?) Should they be able to do some kind of auto-conversion?

getRenderedRevision:

> In particular, when passing no user for audience checks (that is, the public audience), we still need a way to say what user should be used when resolving ~~~~.

Only for PST. Is there any use case when we pass an unsaved revision to the renderer, and the audience is not the user who created it?

Anomie (talkcontribs)

A danger with mock gerrit patches is that it might encourage getting too far into the code level. We'd have to watch out for that.

I can't think of a use case for slot-specific generation of the raw content HTML,

A slot storing structured data might use a generic JSON content type instead of subclassing it to customize the standalone display. OTOH, there's no reason it couldn't subclass it either.

We'll need getSupportedConentModels (not isSupportedConentModel) to tell users trying to change the content model what they can change it to.

We could do array_filter( ContentHandler::getContentModels(), [ $slotRoleHandler, 'isSupportedContentModel' ] ) to get that list. Or isSupportedConentModel() could be implemented with in_array( $model, $this->getSupportedConentModels(), true ).

Tgr (WMF) (talkcontribs)

> A danger with mock gerrit patches is that it might encourage getting too far into the code level.

I meant mock patches with zero code - just interfaces and abstract classes with signatures and comment docblocks only, and maybe docs/*.txt files. The amount of information presented on MCR-SlotRoleHandler is about right, it's just hard to have granular discussions about it and know which comment is still relevant for the current version of the text.

Duesentrieb (talkcontribs)
Aside: what do you think about using mock gerrit patches for interface/architecture discussion in the future? Inline gerrit comments (with their "mark as resolved" option) seem like a better way of keeping track of many small threads of discussion than Flow or wikitext.

Absolutely! In fact, I need to write code in order to write specs, so that mock already exists: https://gerrit.wikimedia.org/r/c/434544/

I was a bit shy about sharing in, sins Brad complained about me writing code before we have agreed on architecture... it's the hen and the egg all over.

SlotRoleHandler::getRawHtml

I completely changed that, see my comment to brad (or better, see the code). I hope you like it better now.

So the assumption is that those transformations will always be applied to the combined slot, and consequently the same options apply to all slots (e.g. no disabling section edit links for one slot but not the others)?

yes, that's the idea.

The spec says "Note that the same role may have different handlers depending on the page title" and here you say "defineRole() should fail for a role that is already defined." which seems contradictory.

It's really about the main slot being special. This is not completely finished in my head yet, since all the options are a bit annyoing. But my current inclination is to say that you can register title based overrides for the main slot, but only for the main slot.

changing the heading text / wrapping logic based on content type (or title, if that approach is prefered) seems like a reasonable use case.

In my mind, different wrapping/hading means it should be a different SlotRoleHandler instance. Though internal dispatching is also possible.

I still have to update the proposal for SlotRoleRegistery. I'll probably make two versions, one where the handle depends on the model, and one where it depends on the title. Actually, I think what I really want is that the handler depends solely on the role, except for the main slot, where is solely depends on the page type.

you can always get a LinkTarget from a SlotRecord

No you can't. But with the updated proposal, SlotRoleHandle::getSlotParserOutput gets the full RevisionRecord, and you can get the title from that.

We'll need getSupportedConentModels (not isSupportedConentModel) to tell users trying to change the content model what they can change it to.

I'd want that per page type

Having a single role (so that diffs work through content type changes) seems like the more user-friendly option to me.

Diffs across content types are not required to work. Though they generally do work if both types are text based. It's nto a big deal, since content type changes are rare.

Anything that can be customizable per role must be in the role handlers, as the page type handler cannot know what roles are added by extensions it is not aware of.

I see it a bit differently: we give role handlers some controle. The default page type handler replies on that. Specialized page type handlers can override for some roles they know about. For other roles, they just use the default behavior provided by the slot hander.

what happens when the page is moved and existing slots become forbidden / missing slots become required?

Required / allowed is always only about new revisions. The code must handle non-conformat revisions gracefully.

Shouldrole handlers be able to prevent a page move when the slot does not make sense under the new title?

OMHO a page type handler would control whether the page can be moved to a given other title. I don't see what that has to do with slots.

Which of these considerations do you believe to be essential for specifying the interface now?

Is there any use case when we pass an unsaved revision to the renderer, and the audience is not the user who created it?

I seem to recall running into something like that, but I don't remember where... I'll look.

Tgr (WMF) (talkcontribs)

> I completely changed [SlotRoleHandler::getRawHtml], see my comment to brad (or better, see the code). I hope you like it better now.

Yup, thanks!

The assumption here is that we won't need to call getSlotOutput with more than one mode in the same request (as that would result in parsing multiple times). I think that's reasonable: the solo and combined views can't both appear in the same request, and while HTML and tracking can, in most cases, those processes are either completely intertwined (so requesting one always results in the other happening as well) or completely separate (so there is no overhead in doing two separate calls). Also the situations in which tracking information is needed are fairly limited.

> my current inclination is to say that you can register title based overrides for the main slot, but only for the main slot. > Actually, I think what I really want is that the handler depends solely on the role, except for the main slot, where is solely depends on the page type.

Having the main role handler depend on (or be the same as) the page type handler makes sense. For the rest, you have three options:

  • make the role handler depend on the content model (or title, or both)
  • pass in the content model (or title, or both) to all methods which should depend on it, like getSlotHeading, getOutputPlacementHints, supportsRedirects, possibly all the functionality discussed in "ContentHandler functionality that might be missing in SlotRoleHandler"
  • accept that all these functions cannot depend on the content model and title (well, some of them will need a content model anyway, so on the title in those cases).

The third option is not the end of the world but it does seem a bit restrictive to me. Varying slot headings (and edit textarea title and similar chrome that's presumably also going to be supplied by the role handler) on the content model seems like a nice ability to have.

> I'd want [to tell users trying to change the content model what they can change it to] per page type

For the main slot, that's fine. What about the rest? The page type handler won't know about most slots. Do you want to disallow changing non-main slot content types (unless maybe the page type handler specifically handles that one slot)? Otherwise, you'll need a SlotRoleHandler method for getting the list of supported content types.

> Diffs across content types are not required to work.

That depends on the use case. E.g. if a documentation slot can be either wikitext or Markdown, and someone converts it from one to the other (and changes the text accordingly), I'd want to be able to get a diff of that. (I'm not saying that's a must-have, but preferred.) Diffs between two arbitrary content types are not required to work, of course.

> we give role handlers some controle. The default page type handler replies on that. Specialized page type handlers can override for some roles they know about. For other roles, they just use the default behavior provided by the slot hander

That sounds fine. But it still means that anything that can be customizable per role must be in the role handlers, otherwise the page type handler cannot defer to them.

>> Is there any use case when we pass an unsaved revision to the renderer, and the audience is not the user who created it? > I seem to recall running into something like that, but I don't remember where... I'll look.

A better question would have been, is there any use case where we pass an unsaved revision and the audience matters? Audience checks are only relevant for revdeleted/suppressed revisions, and those are always post-PST content.

Anomie (talkcontribs)
I was a bit shy about sharing in, sins Brad complained about me writing code before we have agreed on architecture

That's only because of the issues we've run into so far:

  • Resistance to rewriting the code if it turns out to fall short in some way, because of deadlines.
  • It's easy to get caught up in code-level stuff rather than looking at the architecture. Missing the forest for the trees, so to speak.
  • The whole architecture isn't visible because only pieces are written, and are spread across multiple patches implementing the details.
  • Waiting for new code slows down the discussion.

On the other hand, I agree it can be hard to know what architecture is needed without actually trying things out. I don't know how to properly balance this.

We'll need getSupportedConentModels (not isSupportedConentModel) to tell users trying to change the content model what they can change it to.
I'd want that per page type

We need supported content models for each slot, not just for the main slot. That could be done with either getSupportedConentModels or with isSupportedConentModel filtering the list of all content models.

Shouldrole handlers be able to prevent a page move when the slot does not make sense under the new title?
OMHO a page type handler would control whether the page can be moved to a given other title. I don't see what that has to do with slots.

If title "Foo" allows role X, and title "Bar" does not allow role X, what should happen if someone tries to move the page at "Foo" (that has role X) to "Bar"? Keep in mind that the "page type handler" may not even exist, and if it does it may not actually know about some extension's role X.

The other half of the question is the reverse: if "Quux" requires role X and someone tries to move a page from "Bar" (that doesn't have role X) to "Quux", what should happen?

Tgr (WMF) (talkcontribs)

> $forUser can't be just a User object. I doubt it's needed at all (access restrictions should probably be checked before getting a rendering rather than being the responsibility of the renderer), but if it is needed we'd need options for "public" and "raw" too

Is there such a thing as rendering content for a raw audience? I can't think of a use case.

Anomie (talkcontribs)

I don't think there's any need for permissions checks at this level at all, permissions checks should probably happen before the content gets rendered.

If anything in the rendering does depend on the user somehow, it should probably be using ParserOptions::getUser() anyway.

Tgr (WMF) (talkcontribs)

> I don't think there's any need for permissions checks at this level at all, permissions checks should probably happen before the content gets rendered.

How would you do that when the parser can fetch the content of arbitrary titles for transclusion?

I agree ParserOptions should be used for that (semantically it's a bit ugly, but there isn't really a choice since parser extensions need to be able to do permission checks themselves, and they already receive the Parser object, and we surely don't want to add a User to all those hook interfaces). Which is why the question about raw audience is important as ParserOptions cannot express that.

Anomie (talkcontribs)
How would you do that when the parser can fetch the content of arbitrary titles for transclusion?

How does the parser do it now? Is there any need to change the mechanism?

Duesentrieb (talkcontribs)

RenderedRevision calls getContent() on the RevisionRecord. getContent() does an audience check. I quite dislike that fact, but I decided to not change this when replacing Revision by RevisionRecord. Changing this would have meant changing a lot of logic throughout the code.

Anomie (talkcontribs)

It could pass RevisionRecord::RAW for the audience, if the contract of RevisionRenderer/RenderedRevision specifies that the caller is responsible for such permission checks.

Tgr (WMF) (talkcontribs)

The hackathon meeting notes say we'll use message key conventions for i18n (role names and descriptions). Which is a good idea in general, but I'm not sure it removes the need for interface methods for getting those messages - other code still needs to be able to get them, while keeping the message key construction logic in one place.

Anomie (talkcontribs)

Construction logic is just `"some-prefix-$role"`, isn't it? I wouldn't go further than a static[1] method somewhere, along the lines of User::getRightDescription() or User::getGrantName().

[1]: Or I suppose in a DI world if it returns the Message or Message->text() instead of just the message key it'd be some mostly-stateless method on a service somewhere, where the only state used is the MessageLocalizer or whatever.

Tgr (WMF) (talkcontribs)

Yeah, we shouldn't push the burden of message construction to the caller - passing around message key strings is something we need to move away from. The methods will be trivial, but the SlotRoleRegistry still needs to expose them. (Also, it is not completely inconceivable that the description or even name of some role would depend on system configuration, in which case there is need for some way for handlers to override the default or pass parameters.)

Duesentrieb (talkcontribs)

Why have a method for getting the message key, if we can have method for getting the Message object?

Duesentrieb (talkcontribs)

Hm... I want to be logged in with my work account here, but with my private account on wikidata.org...

Reply to "Comments on the May 30 rewrite"