Talk:Wikimedia Developer Summit/2016/Collaboration

Reminder: All current and past edits in any pad are public. Removing content from a pad does not erase it.

Collaboration working area (T119030), main topic Make code review not suck (T114419)

https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit_2016/T114419 Working area (Collaboration): https://phabricator.wikimedia.org/T119030 Topic (Make code review not suck): https://phabricator.wikimedia.org/T114419 Related link: https://www.mediawiki.org/wiki/User:AKlapper_%28WMF%29/Code_Review https://etherpad.wikimedia.org/p/WikiDev16-AllNotes Meeting type: Problem-solving

Goals
* What are the areas where people are frustrated. What are the root causes

* What are potential social changes we could do to reduce these frustrations

In particular, I want to emphasize that I want to focus on social proble ms first, not technical solutions

This meeting is meant to brainstorm root causes and solutions. We are not agreeing on a final solution

The ideal outcome is a summary of the ideas generated is presented to wikitech-l for futher evaluation, in the hopes that the wikitech-l discussion will narrow down the choices to the best ones.

Agenda
10 minutes: session introduction

25 minutes: Discussion of root problems of frustration. Or if there even is frustration

30 minutes: Discussion of potential solutions to problems

15 minutes: Conclusion. How do people feel about what's been discussed. Reflections on stuff discussed. Are there solutions that specificly excite people. Next steps.

Summary
Brian introduced the session. (note: the scribe isn't responsible for the summary)

What do other people find frustrating about code review?

How does a reviewer find reviewable patches?

How does a contributor find the person responsible for the code? Especially for code where no one is "responsible"? What about patches that add/change responsibilities?

Are we too conservative about what we accept? Why do things get stalled?

Is operations/puppet more lively than the MediaWiki codebase? (one person claimed code Review makes development un-fun)

How do you learn how to take responsibility for sections of code? Is CR a good learning opportunity for developers?

How do you budget time to do reviews for what you are responsible for?

Is code review necessary? How do we catch security bugs without CR? How should we treat people who make mistakes? Does our current process ensure secure code? How do we focus review on the important stuff, rather than nitpicking? Are there restrictions that force us to have CR?

Is a governance model necessary for CR?

What should we do differently?

Should we require a maintainer be identified for each patch? Should we assign patch review to specific individuals on a per patch basis? Should we ask for people to agree to be assigned maintainers? Do we need a more structured approach? What is the reward for being a maintainer? Can we make reviewers more prominent? Rewarding metrics? What if the assigned reviewer is too busy?

how can we make non-blocking comments on a patch?

Training sessions where main contributors explain the broader background and their plans for areas of code?

ANSWERED: Can we have office hours for code review? Is there value in synchronous review? +2 Yes!

What about post-commit discussion/improvement? Do github-style pull requests and feature branches work? Does Phab's model work as a middle ground between Github and Gerrit?

Are +1s useful at all? Is it a signal for if the reviewer read/understood the code? Is a +1 useful without a comment? If someone doesn't have +2, is it still useful? Does a +1 signal risk?

Does Gerrit embody/imply the policy we want? +2/+1/0/-1/-2 work?

How can we invite outside experts into CR in a patch? (e.g. localization?)

Is automerge a good idea? Does auto-merge demotivate reviewers?

Should we migrate to Phabricator?

Should we assign owners to areas of code?

Can our CR process scale to early Linux level? In CR, is being a jerk better than being silent? Is our code modular enough for this?

Have we got/do we need Code review guidelines?

Should we have a CR committee? Does a CR committee have sufficient ROI?

Are there good examples of WMF teams we can emulate?

Action items: * Greg: set up CR office hours (start the practice) * Kunal: update the ownership list * Bawolff: follow up on questions/notes from this meeting

Detailed notes
[bawolff] Talk about problems people have, what we can do differently. Focus on social conventions, less on technology. What do we ideally want, brianstorm ideas [bawolff] CR is random, from time to the depth of reviews, like playing roulette. Would like it to be consistent. [legoktm] Most frustrating is that when I have free time, it's hard to find reviewable patches. And when I need review myself, I ping people on IRC, because they don't respond to just Gerrit requests. [cindy] For me, hard to figure out who should review the code, unsure who the one person is to review. "when everyone is responsible, no one is responsible", differing opionions on how to implement features, no one to arbitrate that [bawolff] CR responsibility is not shared enough (etherpad) [legoktm points to http://koti.kapsi.fi/~federico/crstats/core.txt / http://koti.kapsi.fi/~federico/crstats/extensions.txt ] [ariel] bunch of MW code that is no longer owned because there is no MW Core team, no one is responsible (literally) [bawolff] that's bad [ori] I'm not a good code reviewer, ready to help with debugging in prod/handling fallout, wants to greenlight a patch even if they are good patches from trusted contributors; no socially acceptable way to take risks with patches. [ori] Compare ori's history in ops/puppet vs mediawiki/core, way more changes in puppet even though more interested in MW. In core CR can be "a drag", people don't take chances or deviate from patterns, doesn't let you build momentum or energy. (ops/puppet allows self-merging). Be more open to risk.

(etherpad) It's easy to size up the cost and risk that comes with modifying code, less easy to size up the cost and risk that comes with erecting barriers to contributions, letting code rot, etc. Our practices are geared toward preventing the former, but do nothing to prevent the latter.

[bawolff] summarizes as CR saps creativity [andre_] reviewers vs. contributors. Should we have a more structured process when it comes to reviewing code: blog post by Sarah Sharp? need more skillfull/confident reviewers? reduce areas where no one is responsible. Go through rotting patches and assigning someone who is responsible? (Long version: https://www.mediawiki.org/wiki/User:AKlapper_%28WMF%29/Code_Review ) [isarra] I have +2 in mediawiki/skins/*, but don't know how to review things [bawolff] there are no instructions for doing CR except for "don't screw up", no guidelines for what's okay or not okay. what's acceptable/minor/etc. You don't want to be the one who +2's something that wasn't supposed to be +2'd. [isarra] If people don't have time to review themselves, then definitely don't have time to teach others [marxarelli] Maintains mediawiki-vagrant, have event in calendar to just review code. considered doing office hours to do 1:1 reviews with people? (notes that vagrant reviews can be slow, initial provisions can take 15min). Dedicated office hours type time [csteipp] CR is necessary. From security perspective, non-zero number of security issues caught in CR. Standardize CR, make it better. We're too hard on people when they make mistakes. [bawolff] CR is the job that is thankless, no one gets appreciation for doing a lot of CR, compared to writing patches/fixing bugs. CR'ers take on the risk for a patch, but no reward [adamw] CR is one of favorite parts of this job. Hallway conversations...? Actual training would be effective, people interested in certain areas of code could teach others about that, problems they see, etc. [bawolff] we don't have anything like that [jrobson] I enjoy code review too. Feel good factor about it, asked volunteer why they contribute to MF, its because [jon] reviews their code. [bawolff] chances of someone sticking around reduces significantly if no review by 72 hours. Wikipedia is based on instant edits, etc. [jrobson] repos of code that are unmaintained. do we have guidelines for what is mergable? We should have CI set up for pointing out trivial errors. A fear that as soon as you merge code it runs on the train, .. possible alternative is development branches. [mah] need governance model, http://mwstake.org/mwstake/wiki/Blog_Post:37. quote. if you don't review people's code, it isn't being used and we're failing. Need a MW governance model. [bawolff] don't think we need governance model for this, that's more for high level decisionmaking [robla] we have architecture docs, kind of drafty, but we can improve those [bawolff] ... [ori] I want to challenge chris steipp on the security angle. You said that code review has found a low but non-zero number of defects. If we had 2 reviewers, we'd find even more security risks. If we upped it to three, we'd find even more risks. On the other side, there's tons of PHP code we have that has no maintainers, which is a huge security risk, and draw off contributors [csteipp] It is the only practical way we have of identifying security issues before releasing to production. I think the current trade-off between risks and obstacles is reasonable. [bawolff] fully standardize what we have to check for, to make reviews actually catch security issues. [k4-713] can't get rid of CR because of PCI (Payment Card Industry) rules. if thast happened, we'd have to stop using MW, or get someon to do reviews for us. [bawolff] I think we have to keep CR, but we need to modify how we do it, possibly radical, more structure will help. [janZ] difference between "moving fast/breaking things" and letting other people clean up afterwards, vs taking a risk together with a reviewer

What we should do differently
[legoktm] Brian proposed a solution on the phab task: assigning a reviewer person to every proposed patch. I think that's the key. Someone has to be in charge of the review process. big problem for maintainer-less areas. [bawolff] Even elsewhere, patches can fall through the cracks. Have to find a person responsible for it. [maxsem] How are you proposing to enforce this? [bawolff] Good question. The first step is to divide up MediaWiki core into sections, and allow interested people to sign up for a section. This comes with responsibilties, and you get removed if you don't meet them. Hope enough people sign up. [maxsem] Result will be zero signups. [ori] No effective-way of making non-blocking comments, that get author's attention but don't block the merge. Have to give a -1, otherwise people don't see the comments.

[dlynch] (etherpad) Differential helps with this.

[andre_] more structured review approach for reviewers. actively check who is using +1/-1, encourage them to receive +2 rights. Assign reviews that nobody selects. Need a review culture for WMF teams. Improve documentation for reviewers and contributors for what is acceptable.

MModell (etherpad): Phabricator will solve this.

g1

[mah] Follow up for what Max said, it sounds like a great idea but people won't do it. What is the reward for doing it for free? [k4-713] Have we tried live review/office hours session type thing? People who wrote code/reviewed code on IRC together. [Audience: no]. Had a lot of success with live feedback. [MatmaRex] What k4 said happens, but in an ad-hoc pattern. [millimetric] We use cloud nine(?) get code in shape and send it to gerrit [adamw] Merge patches if they're passable, leave comments and rely on authors goodwill to fix those issues. There is a big visibility problem with these post-merge comments, though. How does github-style feature branch development work? i.e. move forward with patch and squash later on. [bawolff] Often times the problem is that no one looks at the patch/comments/reviews/says anything, etc. [bd808] re: adam's q about github-style development. composer-merge-plugin is developed primarily on github, rationale was to reach normal PHP devs outside of MW community - and it worked, more contributions from outside. Key differences that are crappy about Github: no patch chains (dependent changes), cannot amend other people's patches - no collaboration [ostriches] Phab gets it right, straddles the difference between GH/Gerrit - less strict than gerrit, but you can do deps and stuff.

Solutions
[ori] Proposal: Get rid of +1's. Bugzilla quip where MZMcBride says "+1 means reviewer has working mouse" Legitimate only for controversial patches, to indicate agreement. (https://tools.wmflabs.org/bash/quip/AU7VVae36snAnmqnK_xL ) [bawolff] patch with +1 means nothing, usually means less likely to be mergable [marxarelli] Phab improves this. Gerrit is only binary. Responsiblity of merging on reviewer. [bawolff] I don't want to limit this to current technology options. [mooeypoo] +1 useful for when there's a single "expert" person, whom we want to give their opinion, but who don't necessarily understand the whole codebase. e.g. moriel is a RTL expert. Comments without +1 are often not noticed. [bawolff] siebrand +1's i18n changes, acting as "expert". [ostriches] re ori - that's configurable, if we agree on this, it can be done. I generally agree with ori, but see mooey's point. Another problem is that +2 implies merging, we backed ourselves into a corner here. Phab gets this right by encouraging actual comments, not numbers.

(etherpad) general etiquette: leave nitpicking comments vs. quickly patching them. Both work but as a reviewer which is expected? AW: IMO it's most polite to write a followon patch. The original author can apply the patch or not, and doesn't have to worry about merge conflicts.

[apergos] I use +1 and +2 like this: +1 is "I reviewed, haven't tested"; +2 is "Merging this, taking the responsibility". +1 makes sense in this way. [anomie] Same. [ostirches] Unlike -2 and +2, a -1 and +1 will disappeara on new non-trivial patchset. [legoktm] If you leave a number without commenting, you should have your rights taken away ;) Also, we need to give reviewers more recognition - e.g. put them on Special:Version, like the list of authors. The reviewer's name is also hidden in commit notes, which tools like Phab don't show by default. Brian's Phab proposal would be a good starting step. We have an initial list of "owners" in Reviewer-bot config.

(AW in etherpad:) Reviewers can already add themselves as automatically watching the repos they can review.

(etherpad comment from Brad): If only the reviewer bot could work on inheritance and not just filenames, I could review more extensions too... [fancy!] [bawolff] In BZ days, we had a monthly report email listing the most active bug closers. That motivated people (or at least, me). Nemo's crstats is also nice. http://koti.kapsi.fi/~federico/crstats/core.txt

(etherpad) number of commits vs. number of reviews

[twentyafterfour] +1 to ostriches ;) Biggest CR impediment for me is the automatic merging; I didn't know that +2 implies merge implies deploying (in some repos). [bawolff] mediawiki-config repo is another conversation. [Isarra] people have to +1 things because they don't have +2. (In particular volunteers.) [bawolff] If somebody's only action on gerrit is +1'ing a patch, that's often not a good sign. [ori] +1 means that someone likes the patch, but thought it was too risky to just merge.

(etherpad comment): for me (Tgr) it tends to mean "willing to merge but don't want to prevent others from reviewing". A "scheduled +2" option would be nice.

(etherpad) in repos with fewer maintainers (e.g. pywikibot) +1 means: ok for the author to self-merge

[janZ] I have used +1 with the comment "I have reviewed this, please merge" because I don't have permissions. A +1 with a comment that says what was reviewed is useful. [mmodell] take ownership https://phabricator.wikimedia.org/owners/

Conclusion / recap
[robla] I am only putting questions in the summary (above), because otherwise could "stop" the conversation. Have to look for easy/hard questions later. If the answer is obvios, maybe we could write it down. [twentyafterfour] Obvious to me: Move to Phabricator. Phab has Owners tool. Call to action: Set up an owners package for stuff you care about.

(etherpad comment): Phabricator still does not know about half the repos on gerrit

(etherpad): The "moving to differential" session was yesterday, and talked about our path to getting everything set up in Phabricator. (Yes, these sessions should probably have been reversed in order.)

[bawolff] We could've done that in SVN day and yet we didn't... [robla] Not so obvious :) [legoktm] Code review guidelines ... ? (missed it) [everyone discusses 'obvious' answers, see above] [legoktm] the Owners thing, the problem is social, not technical. List of maintainers at mw.org need updating. [greg] We need actual guideline/policy for this. [DanA] Look at Linux, it's reviewed on a mailing list and it works? We have smaller scale too. Maybe we have to look elsewhere. [bawolff] I don't know much about Linux code, but I think they have components.

(etherpad comment): Re "one jerk reviews all the code", we had that (although I don't think Brion or Tim are jerks). It didn't scale.

[bawolff] Indifference is even more sapping that bad review / people being assholes, it meant nobody even cares. [ariel] Big prestige of being a contributor to Linux kernel, people more likely to suffer through it. [David Lynch] Owners list is both social and technical. On Phab you get directly added as reviewers, good reason to keep up to date. [ostriches] CR through email didn't scale for us, a looong time ago. It comes down to lack of ownership, not a technical issue. Need to remove inactive owners, etc. (cf. https://www.mediawiki.org/wiki/Gerrit/Project_ownership#Requesting_repository_ownership ?) Need a way to cease maintainership too. [Cindy] My own patch is languishing. Need owners(?), but we also need people responsible for new code. Architectural issue, do you add stuff to an existing one or a new class?

Reiteration of "How does a contributor find the person responsible for the code? Especially for code where no one is "responsible"?

[legoktm] unsure, the patch has had multiple reviews. https://gerrit.wikimedia.org/r/#/c/250039/

[cindy] multiple contradictory reviews pointing in different directions - who makes the final decision?

[bawolff] what is an acceptable way to do things? Standardize CR process, not have it dependent upon which reviewer you get. Need a catch-all owner.

CODE Review WRANGLER

[greg-g] freaking out about no action items [robla] most important thing is to improve the CR process, not make sure the meeting was useful. Need to prioritize list of questions. [ori] CR has to do with rules about how we interact with software development, hard for people who are not invested with special power to change anythin - kind of meta. We should agree that CR is a severe issue that we should appoint some small committe to make binding recommendations on how to change it. [bawolff] just to double-check, who thinks that there is a problem with CR?  [DanA] does addressing this issue have sufficient ROI? [kaldari] Talk with teams that do a very good job at CR, and have volunteers who actually contribute [Catrope] It comes down to ownership. VE didn't have CR backlog because there is clear responsibility. [bawolff] Owners don't have to be WMF teams. They can be anyone. [Catrope] Maintainership is better term than ownership [robla] we can look into other orgs taking on ownership, i.e. use other people's code rather than reinventing the wheel and writing our own. [bawolff] Yay I will fix everything.

[bawolff] specific code review guidelines would be interesting to explore tomorrow at unconference!

Reminders
DON’T FORGET: When the meeting is over, copy any relevant notes (especially areas of agreement or disagreement, useful proposals, and action items) into the Phabricator task.

See https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit_2016/Session_checklist for more details.