Talk:Requests for comment/Multi-Content Revisions

Status of the Revision class

Summary by DannyS712

Was deprecated in 1.31, hard deprecated in 1.35, and removed in 1.37

HappyDog (talkcontribs)

Now that this has landed, what is the plan for the Revision class? The release notes for MW 1.31 say this class is now deprecated; however, this page indicates that it will remain as a simple wrapper for the most common use case: accessing the main slot (and, commonly, the main slot of the current revision, e.g. instantiation via newFromTitle()).


Which is it? Deprecated or not?


If it is the former (the class is deprecated and is destined to be removed) we need some migration documentation set up. I would be happy to create the template for this, but would require some more knowledgeable people to populate the details.

Jdforrester (WMF) (talkcontribs)

It's still being worked upon. I don't think there will be a settled status for at least another year of development, sorry.

HappyDog (talkcontribs)

Thanks for the reply. Does that mean that it hasn't been decided whether Revision will be removed yet, and it may end up being kept? Or is it that it will definitely be removed in due course but we don't yet know the detail about what the new code will look like or when that will be?


The release notes for 1.31 say it has been deprecated, but it sounds like this is premature and should perhaps be removed from the release notes, as these are what guide us extension developers about what things we need to fix in order to remain compatible. If there is no change to how the class should be used, nor any recommended alternative, then surely it hasn't been deprecated yet?

HappyDog (talkcontribs)
Jdforrester (WMF) (talkcontribs)

Two comments in context of templates

PerfektesChaos (talkcontribs)

I have two comments in context of templates:

  1. Unified history for template programming and documentation – this is absolutely not desirable.
  2. JSON of TemplateData on separated unit – may be, if static generation, but other methods are to be supported as today.

Unified history for template programming and documentation

  • Today these are mostly separate pages with separate histories.
  • This makes it easy to get an overview of what happened when debugging.
  • Imagine that someone complains about strange behaviour, observed multiple times somewhere for about two weeks.
  • The history of the programming page shows a change 19 days ago. On closer inspection, a side effect of that change is evidently responsible for the problem under certain conditions. The previous edit was 63 days ago.
  • The doc page has been changed 3 days ago, 7 days ago, 10, 15, 20, 25, 33 days ago, on several days with multiple edits.
  • How would I see which of these edits affected transclusion?
  • The programming page is transcluded in thousands of pages.
  • Documentation, test collection, examples and so on belong to the same template, but are transcluded one or zero times. Why throw all these histories together into one big pile?

JSON of TemplateData on separated unit

  • That may be done if and only if this is a static JSON object.
  • However, TemplateData may be generated by template transclusion using #tag: syntax.
  • Look at w:en:Template:lang-en, lang-fr, lang-es, and hundreds more.
  • A central scheme may be maintained, which does most of documentation. That is called by template parameters en and English, or fr and French, and so on.
  • It is not desirable to be forced to create a separate JSON page for every single template.
  • See the corresponding w:de:Template:enS, frS, esS and hundreds more, derived from a shared pattern.

Ciencia Al Poder (talkcontribs)

Unified history for template programming and documentation

  • A possible additional problem: Updating documentation of a template should not trigger an invalidation of the cache of all pages using that template
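
The cache concern can be illustrated with a minimal sketch. It assumes a hypothetical set of slot roles that other pages actually transclude (here only "main"); the role names and the function are illustrative and not part of MediaWiki.

```python
def needs_cache_purge(changed_slots, transcluded_slots=frozenset({"main"})):
    """Return True only if an edit touched a slot that other pages transclude.

    Both arguments are hypothetical: changed_slots is the set of slot roles
    modified by an edit, transcluded_slots the roles whose content ends up
    in the pages that use the template.
    """
    return bool(set(changed_slots) & set(transcluded_slots))


# A documentation-only edit would leave the caches of transcluding pages alone.
doc_only_edit = needs_cache_purge({"documentation"})
# An edit that touches the template code would still trigger invalidation.
code_edit = needs_cache_purge({"main", "documentation"})
```

Under this assumption, invalidation is decided per slot rather than per page, which is the behaviour asked for above.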

Support for non-atomic updates to existing revisions?

Adamw (talkcontribs)

I couldn't find this mentioned anywhere, apologies if I missed it. I'm wondering how an MCR-aware workflow such as saving media with structured info will behave if extracting that structured info takes a non-zero amount of time, for example, if we need to open the media file, calculate some metadata, and add it to the media info content slot. It seems that we would want to store the revision and its main slot content, then kick off a background job which will come back and write media info to its own slot. Will this be in a new revision that somehow points to the old revision, or will it be adding the slot content to the old revision?

Duesentrieb (talkcontribs)

Revisions are immutable: once a revision exists, its content cannot be modified. So, if new information becomes known after a revision was already created, it has to be stored in a new revision. That doesn't mean copying any information: the content of slots that are not modified in a given revision is "inherited" from the parent revision.

So, short answer: if extracting some kind of information during upload has to be done asynchronously because it takes a non-trivial amount of time, storing it will create a new revision.
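
As a rough illustration of immutability plus slot inheritance, here is a minimal sketch. The `Revision` class, `new_revision` helper and slot names are hypothetical and only model the idea; they are not MediaWiki's actual classes or storage layout.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping, Optional


@dataclass(frozen=True)
class Revision:
    """Hypothetical immutable revision (illustration only)."""
    rev_id: int
    parent: Optional["Revision"]
    # Only the slots written by this revision; the rest come from the parent.
    own_slots: Mapping[str, str]

    def get_slot(self, role: str) -> Optional[str]:
        """Return slot content, walking up the parent chain if inherited."""
        rev: Optional[Revision] = self
        while rev is not None:
            if role in rev.own_slots:
                return rev.own_slots[role]
            rev = rev.parent
        return None


def new_revision(rev_id, parent, changed_slots):
    """Create a child revision that stores only the slots it changes."""
    return Revision(rev_id, parent, MappingProxyType(dict(changed_slots)))


# The upload creates the first revision with its main slot.
upload = new_revision(1, None, {"main": "file description"})
# The background job finishes later; its result goes into a new revision.
extracted = new_revision(2, upload, {"mediainfo": "width=800"})
```

In this model `extracted.get_slot("main")` resolves through the parent without any copy being made, while `upload` itself never changes.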

Longer answer:

Modifiable slots for derived data were considered in the original MCR proposal, and may be added in the future. But a slot can be either human editable or modifiable, not both. So the extracted data would have to live in yet another slot, not the MediaInfo slot. Information from that slot could then be merged with the human editable information in the MediaInfo slot to define "virtual statements".

Another wrinkle: perhaps we don't need a modifiable slot, we already have a place to store a meta data blob: img_metadata. This is not versioned, and can be updated at will, and MediaInfo could use it to expose virtual statements.

Virtual statements for extracted data seem like a really nice idea, but a lot of nitty gritty UI stuff needs to be sorted out to make them work. The idea has been floated several times, but it is not mature.

Adamw (talkcontribs)

That's helpful, thank you for the explanation. So it seems tricky to retrieve a historical revision along with extracted data, since I would need to know the revision at which the main slot modification of interest happens, then search forwards for the next derived slot edit which happens before another main slot edit.
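
The forward search described here could look roughly like the following sketch. It assumes a hypothetical history list ordered oldest to newest, where each entry records which slot roles that revision modified; the `Rev` type and the "derived" role name are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class Rev(NamedTuple):
    """Hypothetical history entry: the slot roles a revision modified."""
    rev_id: int
    changed: frozenset


def derived_revision_for(history: List[Rev], main_rev_id: int,
                         derived_role: str = "derived") -> Optional[int]:
    """Find the first later revision that wrote derived_role,
    stopping as soon as the main slot is edited again."""
    ids = [r.rev_id for r in history]
    start = ids.index(main_rev_id)
    for rev in history[start + 1:]:
        if "main" in rev.changed:
            return None  # main changed again before any derived data arrived
        if derived_role in rev.changed:
            return rev.rev_id
    return None


history = [
    Rev(10, frozenset({"main"})),
    Rev(11, frozenset({"derived"})),
    Rev(12, frozenset({"main"})),
    Rev(13, frozenset({"main"})),
    Rev(14, frozenset({"derived"})),
]
```

With this history, revision 10 pairs with revision 11, revision 13 pairs with revision 14, and revision 12 has no derived data because the main slot was edited again first.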

This is all a hypothetical use case of course, so it isn't urgent or anything, I'm just hoping to better understand the internals.


Protection of individual slots

Yair rand (talkcontribs)

Are we going to be stuck with a situation where documentation of a template/module/script can't be edited by most users without unprotecting the entire template/module/script itself? This page seems to indicate as much: "[T]here is a need for ... bundling different kinds of information on a single page ... to allow the different kinds of information to be watched, protected, moved, and deleted together." Will individual slots also be able to be watched, protected, etc. individually?

Anomie (talkcontribs)

My take:

  • Watched individually seems unlikely to me.
  • Protected individually is a requirement if we want to be able to move template documentation into a slot. Without that, template documentation would have to remain a subpage.
  • "Moved individually" would be something like a history merge/split and could probably use more thought.
  • "Deleted individually" might generally turn out to be the same thing as editing the page to remove/blank the slot. Deleting all the history of one slot while not deleting other slots would, again, probably be something like a history split.
  • Whether we'd need individual-slot RevDel is an open question, I believe.
TheDJ (talkcontribs)

> Protected individually is a requirement if we want to be able to move template documentation into a slot. Without that, template documentation would have to remain a subpage.

I note that even file descriptions + files has been discussed for MCR, and for that the same basic requirement would exist.

I agree that I would like to see a bit more discussion and forethought about how these kinds of interactions would work in an MCR world. They are critical to get right.

Anomie (talkcontribs)

Files and their descriptions are already being done in a sort of proto-MCR style. There the 'upload' protection controls the file "slot" and the 'edit' protection controls the description "slot".
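
A minimal sketch of that proto-MCR arrangement, assuming a hypothetical table that maps each slot role to the protection action guarding it. The role names, the shape of the protection data and the function are illustrative, not MediaWiki's API.

```python
# Hypothetical mapping from slot role to the protection action guarding it.
SLOT_PROTECTION_ACTION = {"file": "upload", "description": "edit"}


def can_modify_slot(role, page_protection, user_groups):
    """Check whether a user may change one slot of a page.

    page_protection maps an action name to the group it requires;
    an action missing from the mapping is treated as unprotected.
    """
    action = SLOT_PROTECTION_ACTION[role]
    required = page_protection.get(action)
    return required is None or required in user_groups


# Only the file itself is protected; the description stays editable.
protection = {"upload": "sysop"}
```

Here a regular user could still edit the description slot while only members of the required group could replace the file, which is the per-slot behaviour template documentation would need as well.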

This post was hidden by Jdforrester (WMF) (history)

Technical debt

Legoktm (talkcontribs)

I'm all for this proposal, however I'm very concerned that we're still drowning in technical debt from the last similar migration (ContentHandler), and if we go ahead with this without first cleaning up our technical debt we're going to be stuck supporting three different systems, two of which are legacy and deprecated.

So I have a tough time supporting working on this proposal without first finishing the previous migration and getting rid of that technical debt.

Duesentrieb (talkcontribs)

Cleaning up tech debt first is always good advice. I'm aware that there are still some loose ends left from the ContentHandler migration, but I did not get the impression that it's overwhelming. In any case, I'd be happy to work on this if I have someone to look at patches in a timely manner. Shall we do that together? Can you give me a list of tickets?

One point I don't follow is the bit about supporting three systems. ContentHandler isn't going to be deprecated, quite the contrary. Do you mean we need to support different db schemas? I agree that this is a concern.

Daniel Kinzler (WMDE) (talkcontribs)

Ah, I suppose the tracking ticket is Phab:T145728. There are quite a few subtasks, but it doesn't look so horrible. A handful of extensions are still using deprecated functions. I'm happy to help sort this out. The main issue seems to be manually testing these extensions if they don't have sufficient unit tests (or none at all).

When we introduced ContentHandler, we decided to not let the deprecated methods issue deprecation warnings at runtime, leaving that for "later". As such things go, "later" never happens until something breaks. I suggest we are a bit more aggressive about deprecating interfaces in the future.
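
To make "deprecation warnings at runtime" concrete, here is a generic sketch using Python's standard `warnings` module. It only illustrates the pattern; the `deprecated` decorator, `get_text` function and version string are hypothetical and this is not how MediaWiki itself emits such warnings.

```python
import functools
import warnings


def deprecated(since):
    """Decorator that emits a DeprecationWarning on every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated since {since}",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated(since="1.0")
def get_text(page):
    """Hypothetical legacy accessor kept only for backwards compatibility."""
    return page["text"]
```

The old function keeps working, but every caller is told at runtime that it needs to migrate, so "later" shows up in logs and test runs before anything breaks.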

Reply to "Technical debt"

Cross-pollination with bitrot concerns?

Lord Farin (talkcontribs)

While reading this, the recent discussion on wikitech-l about bitrot came to mind (https://lists.wikimedia.org/pipermail/wikitech-l/2016-August/086200.html).

It seems to me that one might consider saving the revision id of templates at time of storing a revision on the parent page. This under the premise that people will generally edit again in quick succession if a template produces strange results. So the assumption that upon edit, the page will look reasonable enough to persist that status doesn't seem bizarre (or at least, the representation will be faithful). In principle, if one stores the template invocations (are sub-slots an option here?), the exact revisions and how they look can be recursively identified based on timestamps and the revision history for the templates.
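
The timestamp-based lookup could be sketched as follows, assuming each template's history is available as a list of (timestamp, revision id) pairs sorted by timestamp. The function name and data shape are hypothetical.

```python
from bisect import bisect_right


def template_revision_at(history, saved_at):
    """Return the id of the template revision that was current at saved_at.

    history is a list of (timestamp, rev_id) tuples sorted by timestamp.
    Returns None if the template did not exist yet at that time.
    """
    timestamps = [ts for ts, _ in history]
    i = bisect_right(timestamps, saved_at)
    return history[i - 1][1] if i else None


# Hypothetical template history: three revisions at times 100, 200 and 300.
template_history = [(100, 1), (200, 2), (300, 3)]
```

Applying the same lookup recursively to templates used by that template revision would reconstruct the page as it looked when it was saved, at the cost of one lookup per transclusion.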

If this would somehow be feasible it sounds like it would alleviate breakage based on template changes, and especially with intricate templates, this can transform the look of an entire page. 't Would certainly be an interesting opportunity to look into IMHO.

Jdforrester (WMF) (talkcontribs)

Possible, and I think MCR would make it slightly more do-able, but I think it's off-topic for this discussion and given the heroïc additional storage requirements for the proposal unlikely to get much support, at least for Wikimedia wikis. I'd suggest making an RfC about such a change if you're interested.

Daniel Kinzler (WMDE) (talkcontribs)

Yea, sub-slots seem like a nice idea for things like this (Gabriel at least seems to think so), but I don't think it's feasible with the proposed storage scheme. It would add an order of magnitude to an already huge table.


+1

MarkTraceur (WMF) (talkcontribs)

Just wanted to stop by and mention that this is all looking very good.

As one note, the use case for "virtual" streams seems somewhat unclear to me based on this document, maybe an example could be added?

Duesentrieb (talkcontribs)

Good point, I'll think about how to make it clearer. The idea of "virtual" slots is that derived content doesn't necessarily have to be generated when the primary content is saved, and stored along with that content. It can also be generated on demand, which would be a virtual slot. To any calling code, this would be completely transparent. One possible use of this is getting the HTML rendering of a page. Even though it seems obvious enough, it's not a good example, since it raises a lot of other issues (different renderings for different target platforms, updating when a template changes, etc).
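
One way to picture a virtual slot is the sketch below. The `SlotStore` class, its methods and the "wordcount" role are hypothetical; a word count is used instead of HTML rendering to avoid the complications mentioned above.

```python
class SlotStore:
    """Hypothetical slot container mixing stored and virtual slots."""

    def __init__(self, stored):
        self._stored = dict(stored)
        self._generators = {}

    def register_virtual(self, role, generator):
        """Register a callable that derives the slot content on demand."""
        self._generators[role] = generator

    def get(self, role):
        """Callers use the same method for stored and virtual slots."""
        if role in self._stored:
            return self._stored[role]
        if role in self._generators:
            return self._generators[role](self)
        raise KeyError(role)


store = SlotStore({"main": "hello virtual world"})
# The derived content is never saved; it is computed when someone asks for it.
store.register_virtual("wordcount", lambda s: str(len(s.get("main").split())))
```

Code calling `store.get("wordcount")` cannot tell whether the value was stored at save time or generated on request, which is the transparency described above.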

Reply to "+1"