Talk:Requests for comment/Branching

I think I'm actually interested in something similar to what you are (or at least related). I personally would really like to look into a flagged-revs type extension that works a little more like how gerrit works - perhaps over the summer when I actually have free time. However, I do find some of this RFC confusing. "Enciclopedia Libre Universal en Español" doesn't seem particularly relevant - there have been thousands of Wikipedia forks. While that one is important for historical reasons, being a very early fork, it's far from the only fork and doesn't really have anything to do with branching. I personally don't see the connection between release tags and rev tagging. I'm also doubtful that your proposal would exactly kill view source. You don't cover all the use cases of things being protected. (Sometimes things are protected because we literally want nobody to edit them whatsoever.) Bawolff (talk) 15:34, 1 January 2013 (UTC)


 * Thank you for the very helpful feedback. I think we're on the same page regarding rev tagging. Would you explain how you envision making an extension which works like gerrit? Currently, all changes exist in a single branch, and the review process determines which revision is tagged as the "release". So, it would be more in line with our "gated trunk" source control policy if the master branch of an article contained only reviewed changes, and unreviewed changes had to be rebased to the master branch tip in order to be merged into "production".
 * You might be right about my "forking" tangent. I've expanded that section a bit to demonstrate that they accomplished this fork with virtually no automated tools at all (therefore, such a thing is even more feasible with better tools), but I should give more examples of what would become possible with a flexible history model. Adamw (talk) 20:17, 1 January 2013 (UTC)

Edit page always available
The idea of having the edit page always available, coupled with having every revision displayed on the history page, seems infeasible to me. The edit summary would be used for vandalism, spam, obscenity, trolling, etc. This is already a problem, and is dealt with by the liberal use of revision deletion, but the workload for people with revision deletion privileges would be much higher if there were no restrictions on edits to popular articles.

If revision deletion is used more heavily, or if some interface or system is provided for hiding of entire branches from the history page, then that implies a need for schema changes, since either feature would require unlimited row scanning at present. -- Tim Starling (talk) 00:02, 17 July 2013 (UTC)


 * Tim, interesting points. I think the vandalism to popular articles might actually be lessened, because the potential for vandals to profit and gain visibility will be greatly decreased. If a vandalized revision is on an orphaned branch, the content would be invisible unless a reader intentionally browses to that revision. Existing bots could be used to screen for offensive or harmful content such as links to attack sites; this is exactly the same problem we face today for unprotected, less popular pages.
 * The schema would have to be changed to support merging, and more advanced relational metadata between revisions. However, for an initial implementation which does not support multiple inheritance, I think the existing schema is workable.  Censoring branches can be done with a constant-time algorithm, if we simply mark each revision with the "oversight" flag, or however we do this task today.  Adamw (talk) 04:43, 7 August 2013 (UTC)
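
To make the per-revision flag idea above concrete, here is a minimal sketch in Python. It is only one possible reading of the proposal, not how MediaWiki implements revision deletion: the `parents` mapping, the `hidden` set, and both function names are hypothetical stand-ins for database state. Under these assumptions, checking whether a single revision is hidden is a constant-time lookup, while flagging a branch still visits each of its revisions once.

```python
from typing import Dict, List, Optional, Set


def mark_branch_hidden(
    parents: Dict[int, Optional[int]],
    hidden: Set[int],
    tip: int,
    fork_point: Optional[int],
) -> None:
    """Flag every revision from `tip` back to, but not including, `fork_point`.

    `parents` maps a revision id to its parent id (None for the first revision).
    This is a hypothetical in-memory model, not a real schema.
    """
    current: Optional[int] = tip
    while current is not None and current != fork_point:
        hidden.add(current)
        current = parents[current]


def visible_history(rev_ids: List[int], hidden: Set[int]) -> List[int]:
    """Filter a list of revision ids; each membership check is O(1)."""
    return [rev_id for rev_id in rev_ids if rev_id not in hidden]


# Revisions 1-2-3 form the main line; 4 and 5 branch off revision 2.
parents = {1: None, 2: 1, 3: 2, 4: 2, 5: 4}
hidden: Set[int] = set()
mark_branch_hidden(parents, hidden, tip=5, fork_point=2)
remaining = visible_history([1, 2, 3, 4, 5], hidden)
```

Whether such a flag avoids the row scanning Tim mentions would depend on how the history page is indexed, which this sketch does not model.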

Drafts
The following links might be relevant: Helder 12:01, 19 September 2013 (UTC)
 * Bug 37992 - Review and deploy Drafts extension to Wikimedia wikis
 * [Design] Drafts extension
 * [Wikitech-l] Feature proposal: backups while editing articles

April 9th update
This RfC is due to be discussed briefly on April 9th; join us! Sharihareswara (WMF) (talk) 02:30, 8 April 2014 (UTC)
 * Meeting log:


 * 22:36:51 #topic Nonlinear versioning
 * 22:36:56 #link https://www.mediawiki.org/wiki/Requests_for_comment/Nonlinear_versioning
 * 22:37:01 #info Adam Wight last updated this in August 2013. This feels super experimental so I don't know whether any next actions are necessary; should we encourage Adam to prototype this?
 * 22:37:15 There's also a human-language RFC at https://meta.wikimedia.org/wiki/Grants:IdeaLab/Edit_your_replica
 * 22:37:16 awight: ^ :) I could be wrong!
 * 22:37:29 #link https://meta.wikimedia.org/wiki/Grants:IdeaLab/Edit_your_replica human-language RfC
 * 22:37:30 sumanah: no, you nailed it. I need to know if anyone out there cares.
 * 22:37:44 Technically, it's almost straightforward.
 * 22:37:53 * sumanah imagines awight singing his plaintive lament on a stage. "does anyone caaaaaare"
 * 22:37:56  "It has been pointed out that the current "revision" table already supports arbitrary directed graphs"
 * 22:38:00  no it doesn't
 * 22:38:18 revision.parent can point to anything...
 * 22:38:27 yeah but not much uses rev_parent
 * 22:38:33 oic!
 * 22:38:35 most stuff assumes order
 * 22:38:42 by timestamp and/or id
 * 22:38:45  if you actually used that for branches, you wouldn't be able to display a single branch efficiently
 * 22:38:58  you would have to actually traverse the rev_parent linked list, which would take forever
 * 22:39:21 I think I had a shortcut for that... yeah, a link table which marks each revision with a branch name.
 * 22:39:27 * brion hrms
 * 22:39:36 so, along a given branch, the history would look linear.
 * 22:40:04  a link table which would pretty much replace the revision table for paging purposes?
 * 22:40:18  hundreds of GB of extra indexes etc.?
 * 22:40:25 probably, but the history pager would... be a totally different animal.
 * 22:40:35 awight: I know there are people out there - not so much in the active wikitech-l community, but out there - who care about this sort of thing. David Gerard is one
 * 22:40:37 I think
 * 22:41:11 Possibly, a compelling use case is: "Handle massive editing demand for breaking news articles"
 * 22:41:17 I feel like I keep hearing people at conferences who want to expound on this sort of idea after a beer or two
 * 22:41:24 lol i bet
 * 22:41:34  the questions I have about this are:
 * 22:41:38  1) do we want it?
 * 22:41:47  2) how much hardware are we willing to buy to get it?
 * 22:42:11  1) I think it's going to be helluva confusing for most users
 * 22:42:14  3) how do you make efficient use of that extra hardware? what schema, what server, etc.?
 * 22:42:27 #action awight http://opensourcebridge.org/blog/2014/03/submit-your-2014-proposals-today/ ;-) (I bet people would come to your talk)
 * 22:42:28 MaxSem: agreed, the complexity should be hidden most of the time.
 * 22:42:50 is this maybe less for Wikimedia wikis and more for other wikis that people run for other purposes?
 * 22:43:01 branching in svn and git is hard enough for devs who work with it every day. a good UX for a branching mechanism is HARD
 * 22:43:09 Yes absolutely.
 * 22:43:11 digital humanities stuff, SMW installs, art projects?
 * 22:43:24  you know that we are considering redesigning the revision system along SOA lines
 * 22:43:31 The first phase would probably not be long-running branches. It would be conflict resolution.
 * 22:43:34 awight: have you already looked at SparkleShare/Glitter Gallery which attempts to make git usable by designers?
 * 22:43:59 sumanah: thx, I'll check it out
 * 22:44:21 TimStarling: no, pls send a link if you have it
 * 22:44:23 * sumanah slightly ignores 40-minute meeting limit but doesn't want to go much above 45 or 50
 * 22:44:38  there's no link afaik
 * 22:44:42 #info  you know that we are considering redesigning the revision system along SOA lines
 * 22:44:53 My personal next step is to collect edit conflict statistics, so we know if the massive demand to edit current news is a valid use case.
 * 22:44:54  gwicke has been plugging the idea
 * 22:45:14 #info this might be better for non-Wikimedia wikis, or for use by massively multiplayer editing on fast-breaking news topics
 * 22:45:31  you can reduce edit conflict rates without branching
 * 22:45:37 #info Tim's questions: 1) do we want it? 2) how much hardware are we willing to buy to get it? 3) how do you make efficient use of that extra hardware? what schema, what server, etc.?
 * 22:45:41 hehe by fixing oldid for example ;)
 * 22:46:17 <TimStarling> you know we still just use diff3 for edit conflict merging
 * 22:46:29 #info performance, how to alter the DB schema, what does the current schema support
 * 22:46:33 <TimStarling> a system which must have taken less than an hour to implement
 * 22:46:59 TimStarling: yes, but word-based conflict resolution is currently unsolved AFAIK
 * 22:47:08 I think it's gonna remain a human task
 * 22:47:56 <TimStarling> unsolved? you're saying someone has tried to solve it?
 * 22:48:38 The second use case that I think is relevant to WMF is to improve article protection, to make articles editable by anyone, all the time, but to shift the work from spam patrolling on suspicious edits to having a merge-resolution work queue.
 * 22:48:40 <TimStarling> but what proportion of edit conflicts result from intraline editing? in my experience, it is mostly adding new lines to the ends of lists
 * 22:48:57 TimStarling: that's the thing... we don't have any statistics on edit conflicts. They are not logged.
 * 22:49:30 they aren't? I did not know that
 * 22:49:48 Argh, I had a patch for that and... cannot find it.
 * 22:49:48 <TimStarling> well, we can already make articles editable by anyone, all the time
 * 22:50:15 We're now 10 min over
 * 22:50:25 next steps for awight?
 * 22:50:25 <TimStarling> it's called pending changes
 * 22:50:48 TimStarling: it's linear however, so vandalism still affects future editing, right?
 * 22:51:09 <TimStarling> yes, but that's not why it's underused, is it?
 * 22:51:36 no, it'
 * 22:51:49 it's probably because it creates a huge workload that nobody is ever going to do ;)
 * 22:52:25 also, perhaps cos FlaggedRevs is overdue for a rewrite.
 * 22:53:50 so: my opinion is that I want Adam to keep investigating this but it would need more thinking before getting to something where we want to prototype it, and he should have more detailed answers to Tim's 3 questions
 * 22:54:28 Sure, but fwiw I cannot answer (1) myself
 * 22:54:37 and Adam ought to reach outside the wikitech-l community to find potential users
 * 22:54:39 <TimStarling> I would like to see some more rationale on the RFC
 * 22:54:54 TimStarling: did you look at the IdeaLab page? I put the motivations there...
 * 22:54:55 <TimStarling> comparing it to other ways of implementing a draft feature
 * 22:55:09 <TimStarling> not yet, no
 * 22:55:31 awight: maybe you should just transclude that stuff onto the RfC - the RfC will basically be the canonical document for motivation & implementation/architecture
 * 22:55:38 (or just copy-n-paste)
 * 22:55:47 makes sense, thanks
 * 22:55:49 ok, I'm declaring some next actions :)
 * 22:56:35 #action awight to keep investigating this, move motivations from Idealab page onto RfC, compare his idea to other ways of implementing a draft feature, try to answer Tim's questions 2 & 3, and look for potential users (maybe 3rd party wikis)
 * Sharihareswara (WMF) (talk) 11:28, 10 April 2014 (UTC)
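
The trade-off discussed in the log, walking parent pointers versus keeping a link table that marks each revision with a branch name, can be illustrated with a small Python sketch. Everything here is an assumption for illustration: `PARENTS` and `BRANCH_LINKS` are hypothetical in-memory stand-ins for database tables, and the function names are made up. The parent walk needs one lookup per revision back to the root, while the per-branch index can serve a history page with a single range lookup, at the cost of storing one extra row per revision per branch.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple

# Hypothetical data: rev_id -> parent rev_id (None for the first revision).
# Revisions 1-2-3 are "master"; 4 and 5 form a "draft" branch forked at 2.
PARENTS: Dict[int, Optional[int]] = {1: None, 2: 1, 3: 2, 4: 2, 5: 4}

# Hypothetical link table rows: (branch name, rev_id). Shared ancestors
# appear once per branch so that each branch reads as a linear history.
BRANCH_LINKS: List[Tuple[str, int]] = [
    ("master", 1), ("master", 2), ("master", 3),
    ("draft", 1), ("draft", 2), ("draft", 4), ("draft", 5),
]


def walk_parents(parents: Dict[int, Optional[int]], tip: int) -> List[int]:
    """Newest-first history by following parent pointers from the tip."""
    history: List[int] = []
    current: Optional[int] = tip
    while current is not None:
        history.append(current)
        current = parents[current]
    return history


def build_branch_index(links: List[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group link rows by branch, keeping each branch's rev ids sorted."""
    index: Dict[str, List[int]] = {}
    for branch, rev_id in links:
        index.setdefault(branch, []).append(rev_id)
    for rev_ids in index.values():
        rev_ids.sort()
    return index


def branch_page(
    index: Dict[str, List[int]],
    branch: str,
    limit: int,
    before: Optional[int] = None,
) -> List[int]:
    """One newest-first page of a branch, optionally older than `before`."""
    rev_ids = index.get(branch, [])
    end = len(rev_ids) if before is None else bisect_left(rev_ids, before)
    start = max(0, end - limit)
    return rev_ids[start:end][::-1]


index = build_branch_index(BRANCH_LINKS)
first_page = branch_page(index, "draft", limit=2)
second_page = branch_page(index, "draft", limit=2, before=first_page[-1])
```

The sorted list per branch plays the role of a database index on (branch, rev_id). The duplicated rows for shared ancestors are one way to picture the extra index size raised in the discussion.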

June 18 update
I'm currently working on EventLogging to produce raw edit conflict information that we can use to determine whether "nonlinear" will be helpful for core use cases. Adamw (talk) 17:59, 18 June 2014 (UTC)

Stable element IDs could be useful
For the annotation aspects the element ID work in Parsoid land could come in handy. It will let us store annotations in a JSON blob keyed off a unique & stable ID per element. -- Gabriel Wicke (GWicke) (talk) 22:04, 4 November 2014 (UTC)
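
As a rough illustration of annotations stored in a JSON blob keyed by element ID, here is a minimal Python sketch. The ID format (`"elem-a1"`), the annotation fields, and the function names are all hypothetical; the actual Parsoid ID scheme is not described here.

```python
import json
from typing import Dict, List


def add_annotation(
    store: Dict[str, List[dict]], element_id: str, author: str, text: str
) -> None:
    """Attach an annotation to the element identified by `element_id`."""
    store.setdefault(element_id, []).append({"author": author, "text": text})


def annotations_for(blob: str, element_id: str) -> List[dict]:
    """Look up annotations for one element in a serialized JSON blob."""
    return json.loads(blob).get(element_id, [])


store: Dict[str, List[dict]] = {}
add_annotation(store, "elem-a1", "ExampleUser", "Needs a citation.")
add_annotation(store, "elem-a1", "OtherUser", "Source added.")
blob = json.dumps(store, sort_keys=True)
notes = annotations_for(blob, "elem-a1")
```

Because the key is the element ID rather than a text offset, annotations in this model survive edits elsewhere in the page as long as the ID itself stays stable.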