Collaboration/Team/Processes

We (the Collaboration team working on Flow) are an agile team. We come up with processes that work to deliver software, donning suits as required.


 * Work on the Collaboration board, starting with the Code Review column and working right to left, top to bottom.
 * Review Echo and Flow open patches
 * Triage bugs
 * Monitor test results

TODO: The team is also responsible for Thanks, PageTriage, MoodBar, WikiLove, etc. plus extensions written by the old Growth team such as GettingStarted. Update all these links plus bug triage query to include all these extensions.

Project

 * Our deliverable is simple, but big: Extension:Flow.
 * The team is also responsible for Echo (Notifications), also Page Triage, Thanks, etc.
 * Incorporating Flow into MediaWiki often requires changes to core (BTW, "MediaWiki core" is not the same as the "Core Features" team), so we also make changes there.
 * configuration of the extension for WMF wikis is in, just like other extensions
 * Flow introduces new features to the MediaWiki UI visual design (formerly "Agora"). We cooperate with other engineers to eventually update core with the generally useful ones.

Review means release!

 * Echo and Flow open patches to review

When you +2 a change to Flow in gerrit, Jenkins will merge it to master. It's now on the release train to appear in front of users!
 * It will appear on beta-labs shortly.
 * It will appear on mediawiki.org next Thursday, per the Deployments calendar.

Meanwhile... Our team, especially our Product Manager who is the sole person who can "Accept" a story, tests new features on the beta cluster, or on our labs test instance ee-flow. So when you +2 any Flow change in gerrit, you must do some housekeeping:
 * In the Phabricator workboard for the sprint, drag the task for the change to "Product Review" and make sure it's obvious how to test.

Master has to work
Everything we merge to Flow master is deployed on en-beta and the ee-flow labs machine within minutes, and then our Jenkins Continuous Integration setup runs our simple browser tests against it every 12 hours or so. So merged commits must not break anything.
 * Review the Echo+Flow tests dashboard. TODO: add the PageTriage, WikiLove, etc. browsertests to this
 * Before you move a card to "BLOCKED" (because it's awaiting final code review), rebase against the latest master, verify that your change works locally, and run the tests.

Any problems you encounter with master should be filed as bugs in, because master is externally visible on en-beta and ee-flow.

Flow releases to production
Flow is enabled on a few pages on mediawiki, metawiki, and several language wikis including enwiki (see Flow/Rollout which also explains the process).

So whatever is in master as of Wednesday morning PST will be deployed to mediawiki.org around noon Thursday as part of the new 1.23wmfNN release in the WMF's one-week release cadence. This then "flows" to metawiki the next Tuesday, and then to enwiki a week later on Wednesday.


 * We have an informal code freeze 24 hours before the Wednesday deployment. Starting Tuesday 11am PDT do not +2 a gerrit change unless it's vital, safe, and tested.
 * Any patches requiring DB updates should be flagged (do they need to be in DB review?).

Backwards and forwards compatibility
Note that mw.org, meta, and enwiki will be running different releases yet talking to the same Flow DB and external store, so all DB updates must work with both newer and older code.

Likewise, Flow information in the cache (memcache/redis) may be shared between different code releases, so we have to carefully consider bumps to.
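One common way to keep releases from misreading each other's cache entries is to embed a format version in every cache key. The sketch below is illustrative Python rather than Flow's PHP, and `cache_key`, the key layout, and the version numbers are hypothetical names chosen for the example, not Flow's actual configuration.

```python
# Hypothetical sketch: version-prefixed cache keys shared by two releases.
# None of these names come from Flow; they only illustrate the idea.

def cache_key(cache_version: int, kind: str, object_id: str) -> str:
    """Build a cache key that embeds the code's cache format version."""
    return f"flow:v{cache_version}:{kind}:{object_id}"


# A plain dict stands in for the memcache/redis store shared by both releases.
shared_cache = {}

OLD_RELEASE_VERSION = 3  # assumed value in the older deployed branch
NEW_RELEASE_VERSION = 4  # assumed value after a bump in the newer branch

# The old release writes an entry in its own format.
shared_cache[cache_key(OLD_RELEASE_VERSION, "topic", "abc123")] = {"title": "Hello"}

# The new release looks under its own version and gets a miss, so it would
# rebuild from the database instead of misreading the old format.
new_hit = shared_cache.get(cache_key(NEW_RELEASE_VERSION, "topic", "abc123"))

# The old release still finds its entry, so it keeps working during rollout.
old_hit = shared_cache.get(cache_key(OLD_RELEASE_VERSION, "topic", "abc123"))
```

The trade-off is that a bump gives the new release a cold cache, and the two releases no longer see each other's writes, which is one reason a bump deserves careful thought.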

The Release readiness review ranger
On Tuesday after our unofficial code freeze, an engineer on the team reviews the state of Flow prior to the "release train" phase 0.
 * Update core and all extensions on ee-flow to master, run upgrade.php
 * Git log Echo and Flow (and other extensions) to see changes since the last branch. Try out the changes.
 * Review browser tests on beta labs on the Echo+Flow tests dashboard; they should be passing. If not, try Flow on the Beta cluster.
 * In particular Flow interacts with Echo, Parsoid, Thanks, AbuseFilter, ConfirmEdit & SpamBlacklist.

Fixes and backports
Flow has a Tuesday window in the Deployments calendar. We usually use it for DB updates and other complicated feature deployments, and to roll Flow out to new pages and wikis.

If we need to backport a fix
 * prioritize with Danny and quiddity
 * get OK from greg-g to add it to Deployments calendar, probably in a SWAT deploy window.
 * Danny or spage adds a card to current Trello calendar, gives it a deadline
 * get it merged to master well in advance
 * prepare backports, get them merged, then prepare Flow submodule bumps in the branches
 * use the Etherpad http://etherpad.wikimedia.org/p/Flow_Deployments to keep track of complicated deployments
 * do prolific manual testing afterwards

Phabricator notes
Each of our extensions (Flow, Echo, Thanks, PageCuration) is a project in Phabricator. Bugs and feature requests and work items for each extension go in there. Each extension project has a workboard, but we don't use these workboards to plan work.

Instead we have a Collaboration-Team project for managing work, /tag/collaboration-team/, and we do use this project's workboard, project/board/65/, extensively. Add this project to any task in Phabricator that:
 * the team is readying for a future sprint (planning, design, refining story, etc.)
 * the team is actively working on
 * the team needs to discuss at standup (on the workboard, move it to "Needs team triage")
 * is an interdependency with another team (also add the blocked-by-TeamName tag and the Scrum-of-scrums project)

Currently we manage both the current two-week sprint and tasks on this one board, a decision we'll re-evaluate.

Not yet using Sprint burndown project type?
Phabricator has a Sprint plug-in that supports Sprint projects. A sprint has start and end dates, and tasks in a sprint can have story points. The trigger for this behavior is the special character '§' in the project name.

SPage created "§Collaboration-Team-Sprint-2014-12-17", here's its Burndown view. But its Sprint board view was very slow to load, and after adding about 20 tasks to the project it times out. So until the performance issue T78679 is fixed we are not using this. Once fixed we should add the special '

Story grooming

 * All design issues should be wrapped up before the next iteration, so their due date is the Tuesday before the iteration starts, so engineers can estimate at the Wednesday poker meeting.
 * We want designers to be working two iterations out.
 * When cards move to Flow current iteration board, the designer remains on the card.

Story review and estimation
We use http://hatjitsu.wmflabs.org for estimation

Engineers READ THIS BIT
Developers work in Flow current iteration:
 * Work right-to-left – help with code review in gerrit, see if you can help cards "Blocked with Questions", make progress on the cards you have "In Development"

After that,
 * Take the top-most card from "Sprint ZZ column", add your name as a member to it, drag it to the "In Development" column.
 * When you think you're done, rebase your change on the latest master, make sure it still works (run all tests), add the gerrit URL in a comment, and move the card to the "Code Review" column.
 * If you need design guidance or have questions about the design, ask on IRC. Also move to "Blocked with questions" with a question, reach out to a designer, Danny, spage, etc.

When you +2 a change in gerrit
 * run make ee-flow to deploy it on ee-flow (see wikitech page for ssh setup)
 * move the card to Testing.

When you need something from design, talk on IRC, and:
 * add the purple "Design" label
 * if a designer isn't associated with the card add moizsyed_ to the card.
 * under Activity, add a comment with @username I need XYZ to create a mention.
 * If you can't get any further, move the card into the Blocked with questions column.

Bugs
We review bugs in Phabricator
 * Key query: all open bugs in Flow, Echo, Thanks, and PageCuration grouped by priority (i.e. Unbreak Now! followed by Needs triage, High, ...), then sorted by date modified
 * Note this doesn't include [Collaboration-Team], because any tasks that are only in that project aren't user-facing bugs
 * Bug management/Phabricator queries has more useful queries, customize these to add our projects.
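The ordering used by the key query can be sketched in a few lines of Python. This is only an illustration of "group by priority, then sort by date modified": the task IDs are made up, placing Normal and Low after High is an assumption, and so is showing the most recently modified task first within each group.

```python
from datetime import date

# Priority order from the text: Unbreak Now!, then Needs triage, then High.
# Placing Normal and Low after High is an assumption.
PRIORITY_ORDER = ["Unbreak Now!", "Needs triage", "High", "Normal", "Low"]


def triage_order(bugs):
    """Group bugs by priority, most recently modified first within a group."""
    return sorted(
        bugs,
        key=lambda bug: (
            PRIORITY_ORDER.index(bug["priority"]),
            -bug["modified"].toordinal(),  # newest first (assumed direction)
        ),
    )


# Made-up tasks for illustration only.
bugs = [
    {"id": "T1", "priority": "High", "modified": date(2015, 1, 5)},
    {"id": "T2", "priority": "Needs triage", "modified": date(2015, 1, 2)},
    {"id": "T3", "priority": "Unbreak Now!", "modified": date(2015, 1, 1)},
    {"id": "T4", "priority": "High", "modified": date(2015, 1, 9)},
]
ordered_ids = [bug["id"] for bug in triage_order(bugs)]
```

Putting "Needs triage" ahead of High means untriaged bugs are seen before the already-prioritized ones, which matches reviewing new bugs first at standup.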

Bug triage
Ideally takes place in standup.
 * First review bugs that team members have put in the "Needs team triage" column of the Collaboration team workboard.
 * Then go through the bug backlog.

When you triage a bug, edit it:
 * improve the task Title and Description
 * change Priority to something (default Normal, or Low if you're dubious)
 * add informative tags to the Projects field, e.g. design, easy, Performance. Note "Project" encompasses tags, teams, extensions, etc.; the icon and color distinguish different kinds.
 * if you think the team should evaluate it, add Collaboration-Team to the Projects field.
 * if you set Priority to High or Unbreak Now!, then definitely add Collaboration-Team to Projects so it's on the team's radar.
 * adding the Collaboration-Team project puts the task in the backlog column on our workboard, so on the workboard drag the task into a better column such as "Needs team triage" or "Consider for next sprint".

Every Phabricator project has a workboard, but we don't want to have to visit multiple extensions' workboards to manage team work. So we don't use each extension's default workboard. We should view these workboards occasionally to see if anyone outside the team is doing project management in them.

Definition of Unbreak Now
A task is Unbreak Now if it meets one of the following criteria:


 * Anything that makes it impossible to use one of the following features: reading, posting new topics, replying, viewing history, or moderation
 * Any fatal error
 * Anything that impacts other teams' products in similar ways
 * Anything that breaks continuous integration, preventing patches from being merged

Definition of Done
For a software development task to be marked Resolved and Done, a patch must be:


 * Merged to master.
 * If it meets the definition of Unbreak Now: A cherry-pick must be created for all deployed branches that are affected. (There are always two branches deployed, but in some cases only one is affected).  In addition, the deployment of the cherry-picks must be scheduled.
 * If it is user-facing, it must also be product-reviewed. For items in the sprint, simply move the task to the Product Review column.

After all of the above is done, mark the task Resolved, then move it to the Done column. The Done column is a requirement of the Sprint extension, so that our Burndown charts are accurate.
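As a rough aid, the checklist above can be written as a single predicate. This Python sketch is not part of any team tooling: the `Task` fields and the branch names in the usage example are hypothetical, and it assumes that scheduling a deployment only matters when at least one deployed branch is affected.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical record of the facts the Definition of Done asks about."""
    merged_to_master: bool
    unbreak_now: bool = False
    user_facing: bool = False
    product_reviewed: bool = False
    affected_branches: set = field(default_factory=set)
    cherry_picked_branches: set = field(default_factory=set)
    cherry_pick_deploy_scheduled: bool = False


def is_done(task: Task) -> bool:
    """Return True when the task meets every applicable criterion."""
    if not task.merged_to_master:
        return False
    if task.unbreak_now:
        # Every affected deployed branch needs its own cherry-pick.
        missing = task.affected_branches - task.cherry_picked_branches
        if missing:
            return False
        # The cherry-picks must also have a scheduled deployment.
        if task.affected_branches and not task.cherry_pick_deploy_scheduled:
            return False
    if task.user_facing and not task.product_reviewed:
        return False
    return True


# Usage with made-up branch names: one affected branch still lacks a cherry-pick.
urgent = Task(
    merged_to_master=True,
    unbreak_now=True,
    affected_branches={"branch-a", "branch-b"},
    cherry_picked_branches={"branch-a"},
    cherry_pick_deploy_scheduled=True,
)
urgent_done = is_done(urgent)
```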

Old Bugzilla queries

 * Flow new bugs (2 weeks)
 * Flow changed open bugs (2 weeks)
 * Flow open bugs
 * Flow all bugs
 * Also, Bug Report - a nice grid view of Flow bugs (thanks StevenW)

Trello notes
Two boards:
 * Flow current iteration: every bug and story the engineers should get done in the current three-week "sprint".
 * Flow backlog: everything that isn't a story to get done in the next three weeks, namely design work (purple tag) for the next iteration, stories we're refining for future iterations, business analysis and community liaison work, and the backlog of all good ideas and maybe-someday.

So "backlog" also means "readiness", "prep", etc. Better names welcome.

Reach out to engineers for any BLOCKED cards, usually they're awaiting code review in Gerrit.

Thereafter the highest priority cards are at the top.

Gerrit notes

 * We need to review both Flow and Echo patches. Here's a dashboard for both; same dashboard for 1+ week old (this time period can be tweaked freely according to the search documentation).
 * TODO: Update this link, also need to review and monitor Thanks, PageTriage, MoodBar, WikiLove, etc. plus extensions written by the old Growth team such as GettingStarted.
 * Your commit message must have a line about testing. What unit test or browser test exercises the feature, or why there's no test.
 * Use comments to be clear in what you want. "I want bernie to review the data change before I'm comfortable +2ing"
 * +1 means "OK to merge but I want someone else to take a look"
 * Two +1s mean it's good enough to merge. The second person to approve a patch should either +2 it instead of giving a second +1, or explain why not in a comment.
 * Gerrit forgets +1s when someone submits a new patch set, so whoever does so should summarize the state, e.g. "mlitn +1'd patch 5 and this addresses his concerns, so need one more approval."
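The voting convention above can be sketched as a small Python function. It is only an illustration: `review_state` and the tuple layout are hypothetical, it looks only at votes on the current patch set (mirroring how +1s are lost on a new patch set), and it ignores negative votes, which these notes don't cover.

```python
def review_state(votes, current_patch_set):
    """Summarize review status from (patch_set, reviewer, score) tuples."""
    # Only votes on the current patch set count; a reviewer's later vote
    # on the same patch set replaces an earlier one.
    current = {}
    for patch_set, reviewer, score in votes:
        if patch_set == current_patch_set:
            current[reviewer] = score

    scores = list(current.values())
    if 2 in scores:
        return "approved"
    plus_ones = scores.count(1)
    if plus_ones >= 2:
        return "good enough to merge"
    if plus_ones == 1:
        return "needs one more approval"
    return "needs review"


votes = [
    (5, "mlitn", 1),
    (5, "bernie", 1),
    (6, "mlitn", 1),  # patch set 6 starts over; only this vote counts for it
]
state_for_patch_5 = review_state(votes, 5)
state_for_patch_6 = review_state(votes, 6)
```

In this example patch set 5 had two +1s, but patch set 6 has only one, which is why whoever uploads a new patch set should summarize the earlier approvals in a comment.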

Working on a long-lived branch

 * Create the branch in gerrit, e.g. frontend-rewrite
 * Check it out locally
 * modify .gitreview in it to add  and commit the fix
 * To update the branch to latest master
 * $ git checkout origin/frontend-rewrite; git merge origin/master; git review
 * git review will ask if you want to submit all the commits on the branch; it's OK to do so, these are the existing commits that were merged in, gerrit doesn't really submit them as new.
 * to set a labs machine to check out that branch, crontab -e and add
 * */5 * * * * cd /mnt/vagrant/mediawiki/extensions/Flow && git checkout frontend-rewrite && git pull --ff-only

Code hygiene

 * The Jenkins jobs for the extension do PHP and jshint checks, and run our PHP unit tests
 * To see the tests, look at the comments from jenkins-bot on a change, e.g. 115110 and follow the links to the CI run
 * we could enable PHP CodeSniffer ( phpcs --standard=path/to/codesniffer/MediaWiki . ), but it currently reports lots of warnings
 * Engineers must set up the pre-review and pre-commit git hooks:
 * make installhooks
 * these run similar tests upon git commit.
 * ToDo: everyone commit automation improvements (git commit hooks, local test scripts, Makefile commands, etc.) to and

Test
We have various tests (see ) but not great coverage. Before you make a big change
 * write tests for it
 * In, engineers broadly agree we should write more tests, and when fixing bugs should write tests that fail without the code that fixes the problem. But these are just Good intentions...
 * make phpunit, run browsertests

Browser testing
It's not hard to run Flow's browser tests, see Quality Assurance/Browser testing/Running tests. Every developer needs to understand the features we're testing and proactively fix tests as part of changes so the tests remain green. Assume that any front-end change will require updating tests. Just read the high-level test scenarios in tests/browser/features/*.feature.

To run tests yourself, read Quality Assurance/Browser testing/Running tests. In a nutshell:
 * It's a role in MediaWiki-Vagrant, or install rvm, then Ruby, gem, and bundle manager.
 * On your local wiki, add 'Talk:Flow QA' to  ; optionally you can create a "Selenium_user" account so your own username doesn't get all the Echo notifications.
 * Give the username running the tests (yourself or "Selenium_user") block-user, suppress and delete rights in order to test these features. In a typical configuration, use Special:UserRights, and add the user to the "administrator" and "oversight" groups.
 * To run the tests
 * in MW-Vagrant against MW-Vagrant: enter make vagrant-browsertests
 * locally against a web server such as ee-flow:
 * change the variables above to run the browser tests against your own wiki
 * the default browser is Firefox (note Firefox 32 is broken), there's also BROWSER=phantomjs
 * you can append :  to run the test with that line number, e.g. ... bundle exec cucumber features/flow_logged_in.feature:24
 * if you just supply features, all browser tests will run, and some will fail.
 * The browser tests assume an existing Flow QA board with existing posts, so on a new wiki run the tests three times and thereafter they should stabilize.

Browser tests are annotated with the browser(s) and server(s) on which they should pass, such as @chrome @en.wikipedia.beta.wmflabs.org @firefox @internet_explorer_10 @phantomjs @test2.wikipedia.org (and with other annotations). Continuous integration passes the appropriate tags to cucumber to run only the tests with these tags, e.g.
 ... bundle exec cucumber --tags @en.wikipedia.beta.wmflabs.org --tags @firefox
If a browser test fails, check its annotations; maybe it's not expected to work in your configuration. Conversely, if it passes, make sure it has annotations for your browser.
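The tag matching described above can be sketched in Python. This assumes that repeated --tags options are combined with AND, so a scenario runs only if it carries every requested tag; `should_run` and the scenario tag sets are hypothetical and are not part of cucumber itself.

```python
def should_run(scenario_tags, required_tags):
    """Return True if the scenario carries every tag passed via --tags."""
    return all(tag in scenario_tags for tag in required_tags)


# Tags a CI job might pass, taken from the example command above.
ci_tags = ["@en.wikipedia.beta.wmflabs.org", "@firefox"]

# Hypothetical scenarios with their annotations.
annotated_for_beta_firefox = {"@en.wikipedia.beta.wmflabs.org", "@firefox", "@chrome"}
annotated_for_test2_only = {"@test2.wikipedia.org", "@firefox"}

runs_on_beta = should_run(annotated_for_beta_firefox, ci_tags)
runs_on_test2_only = should_run(annotated_for_test2_only, ci_tags)
```

A scenario that lacks the annotation for your server or browser is silently skipped rather than failed, which is why a passing test should also carry the annotations for the configuration you ran it in.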

Running more than one test often fails on phantomjs version 1.9.7 because its WebDriver doesn't reset session state between tests, a known issue.

A big risk of a change is it degrades user experience on one of many browsers none of us use. To try Flow out on alternate browsers, see Browser testing and design tools on officewiki.

Performance
Flow is slow. It has to get faster, it mustn't get slower.
 * Use the Network tab in your browser's developer tools
 * See Flow bugs tagged performance TODO
 * View > Source of a page includes a line
 * We have profile hooks
 * TODO
 * We have some monitoring hooks at http://graphite.wikimedia.org
 * in its left navigation, expand Graphite > MediaWiki > Flow, and FlowHooks
 * so for example http://graphite.wikimedia.org/render/?target=MediaWiki.FlowHooks.onPerformAction.tavg&from=20140205& (this hook also gets called when we don't run, so the average would be way off)
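A render URL like the one above can be assembled programmatically, which helps when comparing several hooks. This is a minimal Python sketch: `render_url` is a hypothetical helper, and only the `target`, `from`, and `format` query parameters of Graphite's render endpoint are used.

```python
from typing import Optional
from urllib.parse import urlencode

GRAPHITE_RENDER = "http://graphite.wikimedia.org/render/"


def render_url(metric: str, since: str, fmt: Optional[str] = None) -> str:
    """Build a Graphite render URL for one metric starting at `since`."""
    params = [("target", metric), ("from", since)]
    if fmt is not None:
        # e.g. "json" to get data points instead of a rendered graph
        params.append(("format", fmt))
    return GRAPHITE_RENDER + "?" + urlencode(params)


graph_url = render_url("MediaWiki.FlowHooks.onPerformAction.tavg", "20140205")
data_url = render_url("MediaWiki.FlowHooks.onPerformAction.tavg", "20140205", "json")
```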

Labs machines
The Collaboration team has several labs instances in the editor-engagement project. Do not check out git files on either machine as root
 * ee-flow is an old instance with the mediawiki-install::labs role and extensive local mods, so we have to manually maintain it (see wikitech:Help:Single_Node_MediaWiki)
 * User  has a crontab on ee-flow that pulls master of extensions/Flow and extensions/Echo.  The /etc/motd message that you see when you ssh to ee-flow should tell you if this is running.
 * flow-tests is a newer instance with the role::labs::vagrant, so you can sort of administer it like a MediaWiki-Vagrant host (see Labs-vagrant), e.g. labs-vagrant enable-role and labs-vagrant git-update

We store changes to LocalSettings.php and other files on these labs instances in a common git repo that they all mount: ee-flow$ /srv/mediawiki/orig/LocalSettings.php -> /data/project/git_files/ee-flow/orig/LocalSettings.php

so commit changes to these files to git.

Security
Read and follow Security_for_developers and its accompanying Security checklist for developers. The Manual:MediaWiki Security Guide has background on these topics.

Templating security
Chris Steipp commented on Flow's templating security review bug: "please follow my suggestion about policy (here), and make sure the team has a policy that,
 * If substitutions are used in html attributes, those attributes must be quoted with double quotes.
 * Make sure any SafeString objects have their escaping as close to the creation of the SafeString as possible, and that should be as close to the output as possible. It would be really helpful if I don't have to trace back more than 1 (or at most 2) function calls to see the escaping."

Refactoring

 * If it feels like more scope than a bug fix, make a card for it and ask spage to add it to the iteration.
 * Engineers need to campaign for their ideas – bring up major code changes in the "daily" standup, referring to the story card

Retrospectives
Currently at the Flow_retrospective etherpad. We've also tried dogfooding Flow by doing retrospectives on a Flow Team Retrospective Flow board. In late 2013 we did some retrospectives in the Afterparty etherpad.

During the meeting, start by adding issues you want to discuss in one of the three sections ("What worked", "What didn't work", "What's still confusing"). Then, if you're interested in discussing someone else's issue, add +1.

Meetings

 * Collaboration team calendar resource: In Google Calendar, add  to Other calendars, see Calendars page on office wiki for help.

Absences
If you're going on vacation or will be absent for 1/2-day,
 * 1) Add the absence to the Google Collaboration calendar, so others see it. (add Collaboration calendar to My calendars and show it)
 * 2) Invite yourself to the event, so others will see you're busy when they try to schedule you
 * 3) Decline the meetings while you're out so others don't wait for you.