Extension talk:DiscussionTools/How it works

Tacsipacsi (talkcontribs)

In case the most recent revision containing the comment has been hidden, but the page hasn’t been deleted, would it be possible to have a link to the most recent non-revdel’ed revision (if there’s any)?

Matma Rex (talkcontribs)

In the currently proposed implementation, it wouldn't be possible, because we delete data about all comment revisions other than the newest and oldest, to save disk space. (I'll note this on the page.)
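To illustrate the pruning described above, here is a minimal Python sketch, not the extension's actual code. It assumes a hypothetical in-memory list of `(item_id, revision_id)` pairs standing in for the database rows, and it assumes that revision IDs grow over time, so the smallest ID is the oldest revision and the largest is the newest.

```python
from collections import defaultdict


def prune_to_oldest_and_newest(rows):
    """Keep only the oldest and newest revision recorded for each item.

    `rows` is an iterable of (item_id, revision_id) pairs. This layout is a
    hypothetical stand-in for the real table, used only for illustration.
    """
    revisions_by_item = defaultdict(list)
    for item_id, revision_id in rows:
        revisions_by_item[item_id].append(revision_id)

    kept = []
    for item_id, revisions in revisions_by_item.items():
        # A set collapses the case where oldest and newest are the same revision.
        for revision_id in {min(revisions), max(revisions)}:
            kept.append((item_id, revision_id))
    return sorted(kept)


example_rows = [("c1", 10), ("c1", 11), ("c1", 12), ("c2", 11)]
pruned_rows = prune_to_oldest_and_newest(example_rows)
```

In this example the middle row for `c1` (revision 11) is dropped, which is exactly the information that would be needed to link to a "most recent non-hidden" revision later.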

We could just stop the deleting, but IIRC I did some back-of-the-envelope calculations and came up with "way too much space". I don't have the numbers though, so I should re-do this.

It would be possible to have a link to the oldest revision though, at least.

But I think this is unlikely to come up in practice – in what scenario would you hide just one of the revisions containing a comment, but not all? I think that either the comment should be hidden, in which case you'd hide all revisions containing it, or it is just an innocent bystander in the hiding of another comment, in which case the page would be edited to bring it back after hiding the other revisions.

Tacsipacsi (talkcontribs)

I didn’t think of innocent bystanders “surviving” the revision deletion, but there may still be edge cases of this edge case:

  • The page should be edited to bring back the innocent bystander after the revdel, but there’s no guarantee it will happen.
  • If several revisions need to be revdel’ed because the bad comment/edit wasn’t reverted quickly enough, it may happen that one of the points under “It might not be visible anywhere, because” applies to some innocent bystanders.

Since it’s an edge case of an edge case, it’s not very severe, but it would still be nice to have. For the practical part, can’t you hook into the revision deletion? If you do, you could re-add the rows for the latest visible revisions and clean up rows referring to the hidden revisions—which in total would probably (very) slightly reduce the number of rows, since the row about the offending comment would be removed, not replaced (unless the administrator forgot to hide all revisions containing it, but that’s really just the edge case of the edge case of the edge case…).

Matma Rex (talkcontribs)

I admit I didn't think about hooking into revision deletion (it can be done: Manual:Hooks/ArticleRevisionVisibilitySet), but after thinking about it a little, I don't think it's feasible.

To correctly re-add the rows for the latest visible revisions, we have to first find which is the latest revision containing each comment – it doesn't have to be in the previous revision, so we potentially have to re-process the entire edit history of the page (up to the known oldest revision containing it, I guess).

Of course this probably would almost never happen in practice, but it would be possible to maliciously create such a revision that hiding it would result in huge database load, so we'd have to add another special case on top of this to prevent that.

Or if we ignored this possibility, and just looked at the previous revision, we'd occasionally still return incorrect results – just like in the currently proposed version, except that it would be much more difficult to document and understand the limitations.
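The trade-off above can be sketched in Python. This is only an illustration under assumptions: the `history` structure, the `max_scan` cap, and the function name are hypothetical and are not part of DiscussionTools or MediaWiki. The history is a list of revisions older than the one being hidden, newest first.

```python
def latest_visible_revision_with(comment_id, history, max_scan=None):
    """Walk back through older revisions to find the latest visible one
    that contains `comment_id`.

    `history` is a list of dicts, newest first, with the hypothetical keys
    "rev_id", "hidden" and "comments". `max_scan` caps how many revisions
    are examined; None means the whole history may be re-processed.
    Returns (rev_id or None, number of revisions examined).
    """
    scanned = 0
    for revision in history:
        if max_scan is not None and scanned >= max_scan:
            break
        scanned += 1
        if not revision["hidden"] and comment_id in revision["comments"]:
            return revision["rev_id"], scanned
    return None, scanned


# Comment "a" was removed in revision 3 and only restored by the revision
# that is now being hidden, so the previous revision does not contain it.
history = [
    {"rev_id": 4, "hidden": False, "comments": {"b"}},
    {"rev_id": 3, "hidden": False, "comments": {"b"}},
    {"rev_id": 2, "hidden": False, "comments": {"a", "b"}},
    {"rev_id": 1, "hidden": False, "comments": {"a"}},
]

full_scan = latest_visible_revision_with("a", history)
previous_only = latest_visible_revision_with("a", history, max_scan=1)
```

Without a cap, the cost grows with how far back the comment last appeared, which is the potential database load mentioned above. With `max_scan=1` (only the previous revision) the lookup is cheap but misses revision 2 in this example.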

Matma Rex (talkcontribs)

Also, regarding "just stop the deleting" and "way too much space": I did a test where I just removed the code that deletes old rows. On my local test wiki, the database tables would take up this much space:

Table name                      Size in MB                                 Rows
                                (only oldest and newest)  (all revisions)  (only oldest and newest)  (all revisions)
discussiontools_item_ids        0.86                      0.86             4046                      4046
discussiontools_item_pages      0.50                      0.50             3822                      3822
discussiontools_item_revisions  0.95                      20.55            8352                      171850
discussiontools_items           0.53                      0.53             3393                      3393
Total                           2.84                      22.44

(there are 4046 recorded comments or headings across many revisions of 165 pages, some created for testing and some copied from elsewhere)

I'm not sure how representative this is of real-world usage, but I think somewhere between 10x and 100x increase would be expected on a real wiki as well. (Depending on how many comments are present in a usual talk page revision… Most have few, but village pumps etc. can have hundreds, and some user talk pages where the owner refuses to ever archive them have thousands.)

When I was estimating disk usage in production in T303295#7850298, I figured enwiki gets ~11,500 comments per day, corresponding to roughly 1-2 MB of data per day (that might have been an underestimate, looking at it now). Keeping all revisions at a 100x increase would mean handling 100-200 MB per day, which probably wouldn't be impossible, but it would require additional maintenance and maybe additional hardware, and it didn't seem worth it (although I wanted to do it before I estimated this).
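For reference, the arithmetic behind these figures can be reproduced with a short Python sketch. The inputs are the numbers quoted in the table and in the estimate above. The 100x multiplier is the upper end of the guessed range, not a measured value, and the variable names are only illustrative.

```python
def growth_factor(pruned, all_revisions):
    """Ratio between keeping all revisions and keeping only oldest and newest."""
    return all_revisions / pruned


# Figures from the local test wiki table (discussiontools_item_revisions).
revisions_size_factor = growth_factor(0.95, 20.55)
revisions_rows_factor = growth_factor(8352, 171850)

# Figures from the "Total" row (all four tables together).
total_size_factor = growth_factor(2.84, 22.44)

# Production estimate: 1-2 MB per day today, scaled by an assumed 100x.
ASSUMED_MULTIPLIER = 100  # upper end of the guessed 10x-100x range
daily_mb_low, daily_mb_high = 1, 2
projected_daily_mb = (
    daily_mb_low * ASSUMED_MULTIPLIER,
    daily_mb_high * ASSUMED_MULTIPLIER,
)

print(f"item_revisions size: {revisions_size_factor:.1f}x")
print(f"item_revisions rows: {revisions_rows_factor:.1f}x")
print(f"all tables, size:    {total_size_factor:.1f}x")
print(f"projected MB/day:    {projected_daily_mb[0]}-{projected_daily_mb[1]}")
```

On the test wiki the revisions table alone grows by roughly 21x, while the total across all four tables grows by about 8x because the other three tables are unaffected.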

Reply to "Revision deletion"
PPelberg (WMF) (talkcontribs)

I think there would be value in creating a new section for the set of Usability Improvements we're working on.

To start, I think it would be worthwhile to share how the metadata that appears beneath each level 2 section heading is "calculated".

The above was prompted by the question @Nux raised in Topic:Wxw6b0v9kxz0wfri.

Matma Rex (talkcontribs)

Done

Whom should we tell about this?

Whatamidoing (WMF) (talkcontribs)

@Enterprisey, @Tacsipacsi, @Ladsgroup, @Jack who built the house, anyone else:

I understand that this page might be useful to a script/gadget writer who wants to incorporate the Reply tool inside a larger process. Do you know of anyone/any group that would be interested in this level of detail?

Tacsipacsi (talkcontribs)
Jack who built the house (talkcontribs)

I myself would certainly be interested, but I don't know of anyone else yet beyond the people you mentioned.
