Talk:Requests for comment/Branching


I think I'm actually interested in something similar to what you are (or at least related). I personally would really like to look into a flagged-revs type extension that works a little more like how Gerrit works - perhaps over the summer when I actually have free time. However, I do find some of this RFC confusing. "Enciclopedia Libre Universal en Español" doesn't seem particularly relevant. There have been thousands of Wikipedia forks. While that one is important for historical reasons as a very early fork, it's far from the only fork and doesn't really have anything to do with branching. I personally don't see the connection between release tags and rev tagging. I'm also doubtful that your proposal would exactly kill view source. You don't cover all the use cases of things being protected. (Sometimes things are protected because we literally want nobody to edit them whatsoever.) Bawolff (talk) 15:34, 1 January 2013 (UTC)

Thank you for the very helpful feedback. I think we're on the same page regarding rev tagging — would you explain how you envision making an extension which works like gerrit? Currently, all changes exist in a single branch, and the review process determines which revision is tagged as the "release". So, it would be more in line with our "gated trunk" source control policy if the master branch of an article contained only reviewed changes, and unreviewed changes had to be rebased to the master branch tip in order to be merged into "production".
You might be right about my "forking" tangent. I've expanded that section a bit to demonstrate that they accomplished this fork with virtually no automated tools at all (therefore, such a thing is even more feasible with better tools), but I should give more examples of what would become possible with a flexible history model. Adamw (talk) 20:17, 1 January 2013 (UTC)

Edit page always available

The idea of having the edit page always available, coupled with having every revision displayed in the history page, seems infeasible to me. The edit summary would be used for vandalism, spam, obscenity, trolling, etc. This is already a problem, and is dealt with by the liberal use of revision deletion, but the workload for people with revision deletion privileges would be much higher if there were no restrictions on edits to popular articles.

If revision deletion is used more heavily, or if some interface or system is provided for hiding of entire branches from the history page, then that implies a need for schema changes, since either feature would require unlimited row scanning at present. -- Tim Starling (talk) 00:02, 17 July 2013 (UTC)

Tim, interesting points. I think the vandalism to popular articles might actually be lessened, because the potential for vandals to profit and gain visibility will be greatly decreased. If a vandalized revision is on an orphaned branch, the content would be invisible unless a reader intentionally browses to that revision. Existing bots could be used to screen for offensive or harmful content such as links to attack sites; this is exactly the same problem we face today for unprotected, less popular pages.
The schema would have to be changed to support merging, and more advanced relational metadata between revisions. However, for an initial implementation which does not support multiple inheritance, I think the existing schema is workable. Censoring branches can be done with a constant-time algorithm, if we simply mark each revision with the "oversight" flag, or however we do this task today. Adamw (talk) 04:43, 7 August 2013 (UTC)


The following links might be relevant:

Helder 12:01, 19 September 2013 (UTC)

April 9th update

This RfC is due to be discussed briefly on April 9th; join us! Sharihareswara (WMF) (talk) 02:30, 8 April 2014 (UTC)

Meeting log:

22:36:51 <sumanah> #topic Nonlinear versioning
22:36:56 <sumanah> #link
22:37:01 <sumanah> #info Adam Wight last updated this in August 2013. This feels super experimental so I don't know whether any next actions are necessary; should we encourage Adam to prototype this?
22:37:15 <awight> There's also a human-language RFC at
22:37:16 <sumanah> awight: ^ :) I could be wrong!
22:37:29 <sumanah> #link human-language RfC
22:37:30 <awight> sumanah: no, you nailed it. I need to know if anyone out there cares.
22:37:44 <awight> Technically, it's almost straightforward.
22:37:53 * sumanah imagines awight singing his plaintive lament on a stage. "does anyone caaaaaare"
22:37:56 <TimStarling> "It has been pointed out that the current "revision" table already supports arbitrary directed graphs"
22:38:00 <TimStarling> no it doesn't
22:38:18 <awight> revision.parent can point to anything...
22:38:27 <brion> yeah but not much uses rev_parent
22:38:33 <awight> oic!
22:38:35 <brion> most stuff assumes order
22:38:42 <brion> by timestamp and/or id
22:38:45 <TimStarling> if you actually used that for branches, you wouldn't be able to display a single branch efficiently
22:38:58 <TimStarling> you would have to actually traverse the rev_parent linked list, which would take forever
22:39:21 <awight> I think I had a shortcut for that... yeah, a link table which marks each revision with a branch name.
22:39:27 * brion hrms
22:39:36 <awight> so, along a given branch, the history would look linear.
22:40:04 <TimStarling> a link table which would pretty much replace the revision table for paging purposes?
22:40:18 <TimStarling> hundreds of GB of extra indexes etc.?
22:40:25 <awight> probably, but the history pager would... be a totally different animal.
22:40:35 <sumanah> awight: I know there are people out there - not so much in the active wikitech-l community, but out there - who care about this sort of thing. David Gerard is one
22:40:37 <sumanah> I think
22:41:11 <awight> Possibly, a compelling use case is: "Handle massive editing demand for breaking news articles"
22:41:17 <sumanah> I feel like I keep hearing people at conferences who want to expound on this sort of idea after a beer or two
22:41:24 <awight> lol i bet
22:41:34 <TimStarling> the questions I have about this are:
22:41:38 <TimStarling> 1) do we want it?
22:41:47 <TimStarling> 2) how much hardware are we willing to buy to get it?
22:42:11 <MaxSem> 1) I think it's going to be helluva confusing for most users
22:42:14 <TimStarling> 3) how do you make efficient use of that extra hardware? what schema, what server, etc.?
22:42:27 <sumanah> #action awight ;-) (I bet people would come to your talk)
22:42:28 <awight> MaxSem: agreed, the complexity should be hidden most of the time.
22:42:50 <sumanah> is this maybe less for Wikimedia wikis and more for other wikis that people run for other purposes?
22:43:01 <brion> branching in svn and git is hard enough for devs who work with it every day. a good UX for a branching mechanism is HARD
Branching is very easy in Git; it's a one-liner: git branch <branchname>
--Geremia (talk) 20:15, 27 February 2016 (UTC)
22:43:09 <awight> Yes absolutely.
22:43:11 <sumanah> digital humanities stuff, SMW installs, art projects?
22:43:24 <TimStarling> you know that we are considering redesigning the revision system along SOA lines
22:43:31 <awight> The first phase would probably not be long-running branches. It would be conflict resolution.
22:43:34 <sumanah> awight: have you already looked at SparkleShare/Glitter Gallery which attempts to make git usable by designers?
22:43:59 <awight> sumanah: thx, I'll check it out
22:44:21 <awight> TimStarling: no, pls send a link if you have it
22:44:23 * sumanah slightly ignores 40-minute meeting limit but doesn't want to go much above 45 or 50
22:44:38 <TimStarling> there's no link afaik
22:44:42 <sumanah> #info <TimStarling> you know that we are considering redesigning the revision system along SOA lines
22:44:53 <awight> My personal next step is to collect edit conflict statistics, so we know if the massive demand to edit current news is a valid use case.
22:44:54 <TimStarling> gwicke has been plugging the idea
22:45:14 <sumanah> #info this might be better for non-Wikimedia wikis, or for use by massively multiplayer editing on fast-breaking news topics
22:45:31 <TimStarling> you can reduce edit conflict rates without branching
22:45:37 <sumanah> #info Tim's questions: 1) do we want it? 2) how much hardware are we willing to buy to get it? 3) how do you make efficient use of that extra hardware? what schema, what server, etc.?
22:45:41 <awight> hehe by fixing oldid for example ;)
22:46:17 <TimStarling> you know we still just use diff3 for edit conflict merging
22:46:29 <sumanah> #info performance, how to alter the DB schema, what does the current schema support
22:46:33 <TimStarling> a system which must have taken less than an hour to implement
22:46:59 <awight> TimStarling: yes, but word-based conflict resolution is currently unsolved AFAIK
22:47:08 <awight> I think it's gonna remain a human task
22:47:56 <TimStarling> unsolved? you're saying someone has tried to solve it?
22:48:38 <awight> The second use case that I think is relevant to WMF is to improve article protection, to make articles editable by anyone, all the time, but to shift the work from spam patrolling on suspicious edits to having a merge-resolution work queue.
22:48:40 <TimStarling> but what proportion of edit conflicts result from intraline editing? in my experience, it is mostly adding new lines to the ends of lists
22:48:57 <awight> TimStarling: that's the thing... we don't have any statistics on edit conflicts. They are not logged.
22:49:30 <sumanah> they aren't? I did not know that
22:49:48 <awight> Argh, I had a patch for that and... cannot find it.
22:49:48 <TimStarling> well, we can already make articles editable by anyone, all the time
22:50:15 <sumanah> We're now 10 min over
22:50:25 <sumanah> next steps for awight?
22:50:25 <TimStarling> it's called pending changes
22:50:48 <awight> TimStarling: it's linear however, so vandalism still affects future editing, right?
22:51:09 <TimStarling> yes, but that's not why it's underused, is it?
22:51:36 <awight> no, it'
22:51:49 <awight> it's probably because it creates a huge workload that nobody is ever going to do ;)
22:52:25 <awight> also, perhaps cos FlaggedRevs is overdue for a rewrite.
22:53:50 <sumanah> so: my opinion is that I want Adam to keep investigating this but it would need more thinking before getting to something where we want to prototype it, and he should have more detailed answers to Tim's 3 questions
22:54:28 <awight> Sure, but fwiw I cannot answer (1) myself
22:54:37 <sumanah> and Adam ought to reach outside the wikitech-l community to find potential users
22:54:39 <TimStarling> I would like to see some more rationale on the RFC
22:54:54 <awight> TimStarling: did you look at the IdeaLab page? I put the motivations there...
22:54:55 <TimStarling> comparing it to other ways of implementing a draft feature
22:55:09 <TimStarling> not yet, no
22:55:31 <sumanah> awight: maybe you should just transclude that stuff onto the RfC - the RfC will basically be the canonical document for motivation & implementation/architecture
22:55:38 <sumanah> (or just copy-n-paste)
22:55:47 <awight> makes sense, thanks
22:55:49 <sumanah> ok, I'm declaring some next actions :)
22:56:35 <sumanah> #action awight to keep investigating this, move motivations from Idealab page onto RfC, compare his idea to other ways of implementing a draft feature, try to answer Tim's questions 2 & 3, and look for potential users (maybe 3rd party wikis)
Sharihareswara (WMF) (talk) 11:28, 10 April 2014 (UTC)
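
As a rough illustration of the trade-off raised in the log (walking the rev_parent chain versus a link table that marks each revision with a branch name), here is a minimal in-memory sketch. The dictionaries, branch names, and function names are all hypothetical and are not MediaWiki's schema or API; a real implementation would be database tables with indexes, which is where the storage cost Tim mentions comes from.

```python
# Hypothetical in-memory model for illustration only; not MediaWiki's schema.

# rev_id -> parent rev_id (None for the first revision).
rev_parent = {1: None, 2: 1, 3: 2, 4: 2, 5: 4, 6: 3}

# Hypothetical link table: branch name -> revision ids, oldest first.
branch_revs = {
    "master": [1, 2, 3, 6],
    "draft": [1, 2, 4, 5],
}


def history_by_parent_walk(tip):
    """Follow parent pointers from a branch tip back to the root.

    Each step needs the result of the previous one, so the cost is one
    lookup per revision and the result cannot be paged with a range scan.
    """
    history = []
    rev = tip
    while rev is not None:
        history.append(rev)
        rev = rev_parent[rev]
    return history  # newest first


def history_page_by_branch(branch, offset, limit):
    """Return one page of a branch's history, newest first.

    With a per-branch ordering available, a page is a simple slice,
    so along a given branch the history looks linear.
    """
    newest_first = branch_revs[branch][::-1]
    return newest_first[offset:offset + limit]


print(history_by_parent_walk(5))              # [5, 4, 2, 1]
print(history_page_by_branch("draft", 0, 2))  # [5, 4]
```

The second function is cheap only because the per-branch ordering is stored separately, which is the extra index data questioned in the log.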

June 18 update

I'm currently working on EventLogging to produce raw edit conflict information that we can use to determine whether "nonlinear" will be helpful for core use cases. Adamw (talk) 17:59, 18 June 2014 (UTC)
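
For readers unfamiliar with how a line-based three-way merge ends up in conflict, the following is a simplified sketch of the idea, assuming a conflict means that both edits touch the same or adjacent lines of the common base. It is not diff3 itself and not MediaWiki's merge code, and the function names are made up for illustration. It reproduces the case mentioned in the meeting log above, where two editors each append a line to the end of the same list.

```python
from difflib import SequenceMatcher


def changed_ranges(base, edited):
    """Return the (start, end) line ranges of base that an edit changed.

    A pure insertion shows up as an empty range (start == end) at the
    position in base where the new lines were added.
    """
    matcher = SequenceMatcher(a=base, b=edited, autojunk=False)
    return [(i1, i2)
            for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
            if tag != "equal"]


def edits_conflict(base, mine, theirs):
    """Simplified check: do two edits touch the same or adjacent base lines?"""
    for a1, a2 in changed_ranges(base, mine):
        for b1, b2 in changed_ranges(base, theirs):
            if a1 <= b2 and b1 <= a2:
                return True
    return False


base = ["Intro text", "* apple", "* pear"]
append_plum = base + ["* plum"]
append_fig = base + ["* fig"]
reword_intro = ["Better intro text", "* apple", "* pear"]

print(edits_conflict(base, append_plum, append_fig))    # True
print(edits_conflict(base, reword_intro, append_fig))   # False
```

Both appends land at the same position in the base, so a line-based merge cannot order them without help, even though the two edits do not contradict each other.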

Stable element IDs could be useful

For the annotation aspects the element ID work in Parsoid land could come in handy. It will let us store annotations in a JSON blob keyed off a unique & stable ID per element. -- Gabriel Wicke (GWicke) (talk) 22:04, 4 November 2014 (UTC)
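
A minimal sketch of what such a blob might look like, assuming annotations are stored as a JSON object whose keys are element IDs. The IDs, field names, and helper function are invented for illustration and do not reflect any actual Parsoid format.

```python
import json

# Hypothetical element IDs and annotation fields, for illustration only.
annotations = {
    "elem-a1": [{"author": "ExampleUser", "note": "Needs a citation."}],
    "elem-b2": [{"author": "ExampleUser", "note": "Check the date."}],
}

# Serialize once; the blob can be stored alongside a revision.
blob = json.dumps(annotations, sort_keys=True)


def annotations_for(blob, element_id):
    """Return the annotations stored for one element, or an empty list."""
    return json.loads(blob).get(element_id, [])


print(annotations_for(blob, "elem-a1")[0]["note"])  # Needs a citation.
print(annotations_for(blob, "elem-zz"))             # []
```

Because the key is the element's ID rather than its position in the text, an annotation stays attached to its element when surrounding content is edited, provided the ID really is stable across revisions.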

Things this should not do

Would it be correct to say that everyone agrees that the addition of a branching feature should not do any of the following?

  • Enable violations of w:WP:POVFORK on projects where this is unwanted.
  • Enable submitting changes to pages that are specifically semi-/full-protected instead of PC-protected.
  • Add any likeness of Pending Changes or Flagged Revisions (in terms of preventing immediate edits anywhere) to projects that do not already have these.
  • Produce any changes to the content page as it appears to the reader.
  • Show words from the "branching vocabulary", such as "pull", "push", "branch", "fork", "commit", "merge", etc. to ordinary editors.

--Yair rand (talk) 07:42, 10 November 2015 (UTC)[reply]

I would very much appreciate an answer to this, especially since some of the stuff mentioned in the Doc from the WMF session regarding this seems to take as a given some very scary stuff, unless I'm misreading things. --Yair rand (talk) 22:31, 12 January 2016 (UTC)
Hi, apologies for the impossibly late response! These are great points and I agree with them. This isn't meant to circumvent or replace any existing content suppression tools. Ideally, nothing would look any different from an editor's perspective, but the chance of experiencing an edit conflict would drop to zero. Adamw (talk) 01:03, 7 December 2016 (UTC)

Decentralizing Wikis

Just yesterday I was thinking how great using Git as the version control for wiki pages would be, and someone has already begun developing such an extension (Nonlinear)! What's the current status of this project?