Core Platform Team/PET Work Processes/Clinic Duty
Current Clinic Duty Rotation
The goal of the team is to provide space to work on important maintenance and tech debt tasks that are not covered by existing projects, to ensure that inbound Unbreak Now (UBN) and reactive tasks are handled in a timely manner, and to provide urgent support needed by other teams.
The Clinic Duty Team is tasked primarily with handling inbound reactive work, including:
- Handling of Unbreak Now (UBN) tasks within the team's scope.
- Triaging incoming issues on the team workboard.
- Handling external and internal requests for code review.
- Fixing regressions and simple, high-priority bugs.
- Progressing ongoing maintenance work.
- Participating in the train log triage meeting every Wednesday.
This section needs EM input!
The Clinic Duty team works on a continuous basis, with members transitioning in and out as needed. This is coordinated by an Engineering Manager, with 2 to 3 team members working on Clinic Duty at any time.
Daily check-ins are conducted asynchronously, and a meeting is held once a week for check-in and task triage.
Members should know at least a week in advance if they are going to rotate onto Clinic Duty, particularly for planning around holidays and vacations.
Engineers should perform #Initial task triage daily, and should devote much of their remaining time to #External review and #Internal review.
Initial task triage
Team members are responsible for initial triage of all tasks in the Inbox columns of the Core Platform Team workboard and the Clinic Duty Team workboard. This is done daily, if not more frequently.
While it's impossible to account for every possibility, the outcomes of this triage may include:
- Moving requests to #External review.
- Moving the task to the Triage Meeting Inbox column of the Core Platform Team workboard, for discussion during the weekly triage meeting.
- Moving the task to other columns of the Core Platform Team workboard, if they clearly belong there.
- Moving the task to other CPT workboards, e.g. Green Team or one of the Initiative workboards, if they clearly belong there.
- Claiming the task for #Internal work.
  - Avoid cookie-licking! Team members should avoid claiming more work for Clinic Duty than they can complete in a reasonable period of time.
- Untagging CPT, if the task is mistagged. This may also be done for old tasks with no real activity that were tagged automatically by Herald due to an unrelated action (e.g. someone subscribing or unsubscribing).
- Asking another team member (on Clinic Duty or off) for help in triaging.
Team task triage
The team will have a weekly meeting to process tasks in the Triage Meeting Inbox column of the Core Platform Team workboard. This meeting should be attended by engineers on Clinic Duty, at least one Engineering Manager, and at least one Product Manager.
The purpose of this meeting is to triage tasks where questions exist around scope, resourcing, and priority. Again, it's impossible to account for every possibility, but outcomes of this triage may include:
- Assignment of the task (to an engineer, PM, or EM) for investigation.
- Assignment of the task to an engineer for implementation within Clinic Duty.
- Scheduling of the task for one of the team planning meetings.
- Moving the task to Feature Requests to Review or Future Initiatives.
- Moving the task to other columns, including Volunteer Needed, Tracking/Watching, or Icebox.
- Untagging CPT from the task.
If, after the meeting, the Triage Meeting Inbox is not empty, another meeting should be scheduled later in the week to finish the triage.
External review
Tasks with patches from volunteers or other teams are tracked on the CPT External Code Reviews workboard.
Patches needing review progress through the board as follows:
- Tasks start in Review Needed.
- Once a review is given, they move depending on the review:
  - If merged or +1 with no further CPT review expected, move to Review Completed.
  - If -1 or otherwise needing more work, move to In Progress.
- If you want additional review from another team member, you should ask them (directly or on the task or patch).
- A task in In Progress can move back to Review Needed once the -1 has been addressed, via comments or new patchsets.
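The column moves above can be sketched as a small lookup table in Python. This is an illustration only, not tooling the team uses: the `Column` and `Review` enums and the `next_column` function are hypothetical names, and in practice the moves are made by hand on the workboard.

```python
from enum import Enum


class Column(Enum):
    # Column names as given in the list above.
    REVIEW_NEEDED = "Review Needed"
    IN_PROGRESS = "In Progress"
    REVIEW_COMPLETED = "Review Completed"


class Review(Enum):
    # Hypothetical labels for the review outcomes described above.
    MERGED = "merged"
    PLUS_ONE_FINAL = "+1, no further CPT review expected"
    MINUS_ONE = "-1 or otherwise needing more work"
    ADDRESSED = "-1 addressed via comments or new patchsets"


# (current column, review outcome) -> next column
TRANSITIONS = {
    (Column.REVIEW_NEEDED, Review.MERGED): Column.REVIEW_COMPLETED,
    (Column.REVIEW_NEEDED, Review.PLUS_ONE_FINAL): Column.REVIEW_COMPLETED,
    (Column.REVIEW_NEEDED, Review.MINUS_ONE): Column.IN_PROGRESS,
    (Column.IN_PROGRESS, Review.ADDRESSED): Column.REVIEW_NEEDED,
}


def next_column(current: Column, outcome: Review) -> Column:
    """Return the column a task moves to; stay put if no rule applies."""
    return TRANSITIONS.get((current, outcome), current)
```

For example, `next_column(Column.REVIEW_NEEDED, Review.MINUS_ONE)` gives `Column.IN_PROGRESS`. Any combination not listed leaves the task where it is, which mirrors the fact that the list above only describes these four moves.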
Tasks needing input or advice (but not implementation) from Core Platform are also tracked on the Clinic Duty workboard in the Discussing column. They are removed from the board once the needed input or advice has been provided.
Internal work
Work internal to the Clinic Duty team is tracked on the Clinic Duty workboard.
Tasks are added to this board when a team member claims or is assigned the task. All tasks on the board should be assigned to a team member (in the Phabricator sense).
- The next task a team member intends to pick up may be placed in Ready.
- Any tasks a team member is actively working on should be in Doing.
- Tasks where further progress depends on external factors (other than waiting for a deployment) should be in Blocked Externally.
- Tasks needing review by other team members should be in Waiting for Review.
  - If the review results in a -1, the task should be moved back to Doing.
  - If the review results in a +2, the task should be moved to the appropriate later column.
- Tasks waiting for deployment, either via the train or manually, should be in Waiting for Deployment.
- Tasks where all Clinic Duty work is done should be moved to Done, or if appropriate may be untagged.
Team members on Clinic Duty are expected to look to the Waiting for Review column on that workboard to provide each other with the reviews that are needed for progress.
This is currently something of a dumping ground. The stuff in here should be cleaned up and formalized somehow.
- The Engineering Manager is responsible for updating #Current Clinic Duty Rotation on this page as assignments change. Or else we should figure out some other way to handle it.
- Should we give more guidance in this section for EMs/PMs pushing work, on choosing between CD and other things not to be named yet? Or is #Responsibilities enough?
- Another source of reactive work is looking for production errors (see wikitech:Performance/Runbook/Kibana_monitoring)
To measure the impact of the CD team's work, we want to try to get a snapshot of the current state of:
- Number of backlog tasks
- Number of unsized tasks
- Number of unprioritised tasks
- Number of tasks created vs resolved in the last X days
- Number of "untouched" tasks
- Average age of tasks
- Average response time on UBN/Reactive tasks
- Average response time on code review (CR) requests from outside of CPT
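As a rough sketch of how such a snapshot might be computed, the Python below works over a list of task records. Everything here is an assumption made for illustration: the `Task` fields are hypothetical and do not correspond to Phabricator's data model, "untouched" is taken to mean no activity for `untouched_days`, and `window_days` stands in for the "X days" above. The code review response time is omitted, but would follow the same pattern as the UBN response time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Task:
    # Hypothetical record; field names are not Phabricator's.
    created: datetime
    last_activity: datetime
    resolved: Optional[datetime] = None
    priority: Optional[str] = None  # None means unprioritised
    size: Optional[int] = None  # None means unsized
    first_response: Optional[datetime] = None
    is_ubn: bool = False


def snapshot(tasks, now, window_days=30, untouched_days=90):
    """Compute the snapshot metrics for a list of Task records."""
    window_start = now - timedelta(days=window_days)
    stale_before = now - timedelta(days=untouched_days)
    open_tasks = [t for t in tasks if t.resolved is None]
    # Hours from creation to first response, for UBN tasks that got one.
    ubn_response_hours = [
        (t.first_response - t.created).total_seconds() / 3600
        for t in tasks
        if t.is_ubn and t.first_response is not None
    ]
    return {
        "backlog": len(open_tasks),
        "unsized": sum(t.size is None for t in open_tasks),
        "unprioritised": sum(t.priority is None for t in open_tasks),
        "created_in_window": sum(t.created >= window_start for t in tasks),
        "resolved_in_window": sum(
            t.resolved is not None and t.resolved >= window_start
            for t in tasks
        ),
        "untouched": sum(t.last_activity < stale_before for t in open_tasks),
        "avg_age_days": (
            mean((now - t.created).days for t in open_tasks)
            if open_tasks
            else None
        ),
        "avg_ubn_response_hours": (
            mean(ubn_response_hours) if ubn_response_hours else None
        ),
    }
```

Taking such a snapshot at regular intervals with the same `now`-relative windows would let the numbers be compared over time; the thresholds themselves are placeholders that the team would need to agree on.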