Team Practices Group/Measuring Types of Work

Updated status

This pilot has ended, and Team Practices has no concrete plans to continue work in this area. Upper management is not currently asking for these classifications, although there are continuing discussions about categorizing work. The latest categories are "Strategic" (in service of a WMF strategy) and "Core" (other necessary work).

Some teams are continuing to use the #worktype-maintenance and #worktype-newfunctionality tags for their own purposes.

Pilot Study on Maintenance Fraction

Several teams were asked to measure their maintenance fraction for at least two weeks. We created new Phabricator tickets to facilitate this. Five teams have confirmed numerical results, varying from 25% to 76% maintenance (compared to new functionality). We identified many definitions of Maintenance, some orthogonal and some conflicting. We did not reach consensus on a single definition. Several teams are continuing to track this data.

Results by team (fields: % Maintenance, Metric, Methodology, Details, Data Range; empty fields are omitted):

VisualEditor
  • % Maintenance: 37%
  • Metric: Task Points
  • Methodology: Tasks pointed and tagged at triage
  • Details: Def: "Maintaining the production service at a suitable quality level"[1]
  • Data Range: Jun 18 to Sep 30 2015

Collaboration
  • Methodology: Tag tasks at review
  • Details:
    • Def: Unbreak/High prio tasks for existing feature, triggered by external change
    • Method: Tag at product review for tickets closed Aug 21st and after
  • Data Range: Aug 21, 2015

Discovery
  • % Maintenance: 25-50% (for Cirrus)
  • Metric: Task Count
  • Details:
    • Cirrus Search: All work completed between 2015-08-26 and 2015-10-31 has been tagged as either maintenance or new.
    • Maps: All new during the pilot period, since first release was late September.
    • WDQS: All new during the pilot period, since first release was late September.
    • Analysis: Not tracking, but vast majority new
    • UX: Too new to have data during the pilot period
  • Data Range: 2015-08-26 to 2015-10-31

Release Engineering
  • % Maintenance: 76%
  • Metric: Task Count
  • Methodology/Details: Https://
  • Data Range: Sep 1 to 21 2015

Services
  • Methodology: Daily Manual estimation over sample period
  • Data Range: 2 weeks ca Sep 2015

Reading Infrastructure
  • % Maintenance: 55%
  • Metric: Hours
  • Methodology: Time tracking over sample period
  • Details: Tracked hours over 2 week sample. "could be directly associated with a quarterly goal for the team or the individual."
  • Data Range: Oct 20 - Nov 2 2015

Mobile Apps
  • Details: Apps teams have tagged backlog tasks, and have incorporated tagging into their triaging and estimation meetings (squeezing both ends of the tube). Some changes were done in batch based on criteria, so some gray-area tasks like research spikes may need tag review in the future. As the iOS team uses a kanban-esque system, with a separate dev board, it might take a little while to see tasks entering that board and worked on. The Android team should see things entering sprints immediately.

Fr-tech
  • % Maintenance: 38%/33%/?
  • Metric: Task Points
  • Methodology: Measured by points for two sprints ca July-Aug 2015.
  • Details: This is expected to vary strongly by season. "individuals on the team have differing opinions" on category definitions. Some work was "unplanned" but not "Maintenance".
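The figures above were produced with different metrics (task count, task points, hours). As a rough illustration only, and not a record of how any team actually computed its number, a maintenance fraction could be derived from tagged tasks as in the sketch below. The tag names come from the "Updated status" note on this page; the `Task` class, the `maintenance_fraction` helper, and the sample tasks are hypothetical. The sketch assumes each task carries at most one of the two tags and that untagged tasks are left out of the calculation.

```python
from dataclasses import dataclass

# Tag names as mentioned in the "Updated status" note on this page.
MAINTENANCE_TAG = "worktype-maintenance"
NEW_TAG = "worktype-newfunctionality"


@dataclass
class Task:
    """Hypothetical stand-in for a tracked task; not a Phabricator API object."""
    title: str
    tags: frozenset = frozenset()
    points: float = 0.0  # story points, used only for a Task Points metric


def maintenance_fraction(tasks, weight=lambda task: 1.0):
    """Return maintenance / (maintenance + new), or None if nothing is tagged.

    `weight` selects the metric: the default counts tasks, while
    `lambda task: task.points` weights each task by its points.
    Tasks carrying neither tag are ignored (an assumption of this sketch).
    """
    maintenance = sum(weight(t) for t in tasks if MAINTENANCE_TAG in t.tags)
    new_work = sum(weight(t) for t in tasks if NEW_TAG in t.tags)
    total = maintenance + new_work
    if total == 0:
        return None
    return maintenance / total


# Made-up sample data, for illustration only.
sample = [
    Task("Fix broken deploy script", frozenset({MAINTENANCE_TAG}), points=3),
    Task("Add export dialog", frozenset({NEW_TAG}), points=5),
    Task("Patch for upstream API change", frozenset({MAINTENANCE_TAG}), points=2),
    Task("Untagged meeting notes", frozenset(), points=1),
]

by_count = maintenance_fraction(sample)
by_points = maintenance_fraction(sample, weight=lambda task: task.points)
```

The same made-up tasks give different fractions depending on the metric (two of three tagged tasks, but five of ten tagged points), which is one reason the per-team percentages above are not directly comparable.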

Dimensions of Interest

Maintenance vs New work

Category A: Lights On

"Keeping the lights On"[2]

Something essential will fail immediately (days? hours?) if this work is not done.

Category B: Maintenance

Anything which keeps search ticking, but doesn't relate directly to our goals

Something essential will fail or degrade fairly soon (weeks? months?), or will cost much more to fix later, if this work is not done now.

Category C: New Projects

"Investment/New Projects"[2]

Functionality that is not currently available.

Can a team be their own customer? If a team delivers functionality to themselves, to make their own work easier/cheaper/better, is that category C, or B?

So this includes *all* new features, even if they are relatively minor tweaks to existing functional areas?



Tech Debt

responsive correction


Endogenous vs Exogenous

Category D

Supporting others

Developers/maintainers and upstream projects, and whoever (if anyone) is tasked with fixing bugs in extensions/code that no existing team or individual is currently (officially) working on.

Category E

Internal to team goals

Interrupt vs planned

Category J

Interrupt. Work that is added to a to-do list (or just done) on a few days' notice or less; work that is done before work that is both planned and high-priority.

Strategic vs Not Strategic

Category F

Most strategic in terms of achieving our mission

Fulfills a quarterly goal.

Category G

Not Strategic. Shouldn't be doing it.

Does not fulfill a quarterly goal.

Lila's Buckets for infrastructure teams[3]

Category H

Supporting others

Category I

Prototyping / research


17 Aug 2015 meeting

Meeting Goals

1) Figure out how to meet the request for teams to report on maintenance fraction

1a) Clear standards across groups

2) Figure out what teams want to measure for their own use

3) Make sure we know how to proceed

3a) Clarify vs other dimensions


Why is Terry asking for the three categories?
  • WMF-level planning, budgeting
  • Katie: you want to do this to justify your hiring needs.
  • David: Can also use this [ongoing maintenance cost] to justify sunsetting things.

Why is Lila asking for her categories, which are slightly different?
  • Maybe Lila wants some of this to figure out which section a team belongs in?
  • Lila's categories in the notes are specific feedback to one team or role, not everybody

Is this all Engineering, or all Foundation?


What should we call Maintenance vs New Work, as opposed to the other possible dimensions?

Proposed: Maintenance Fraction.

Can we differentiate between A and B?
  • Greg - no
  • Katie - sort of - sort of have AB/Interrupt and AB/Non-Interrupt
  • B is debt
  • Does anyone have a rule for differentiating A from B that could work for most teams?
  • no

Is there stuff that is inward-facing that belongs to C?
  • no?

Does C include prototyping new things, researching existing things?
  • We probably have a lot of edge cases and maybe should do card sorting or other offline activity to get our collective judgment in sync

Proposed definition for C:
  • C is work that produces functionality that a non-Foundation person would see/use.
  • This doesn't work for teams with no external customers. Those teams can try counting other Foundation teams as their Category C customers.

How are we measuring?
  • We will create a new tag to track this.
  • Some teams are mostly AB or mostly C. To make it easier, they can tag just AB or just C, and assume all other tasks are the other category.
  • Joel can create custom reports on other query-able factors (VE uses project column already)
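The single-tag shortcut described above (tag only AB or only C, and treat every other task as the other category) can be sketched as follows. This is an illustration under assumptions, not an agreed implementation: the labels "AB" and "C" are placeholders, since tag names had not been chosen at this meeting, and `infer_category` is a hypothetical helper.

```python
# Placeholder labels; the real tag names were still to be proposed.
OTHER_CATEGORY = {"AB": "C", "C": "AB"}


def infer_category(task_tags, tagged_category):
    """Infer a task's category for a team that applies only one tag.

    task_tags: the set of labels on the task.
    tagged_category: the one category the team tags explicitly ("AB" or "C").
    A task without that tag is assumed to belong to the other category.
    """
    if tagged_category not in OTHER_CATEGORY:
        raise ValueError("tagged_category must be 'AB' or 'C'")
    if tagged_category in task_tags:
        return tagged_category
    return OTHER_CATEGORY[tagged_category]


# A mostly-C team tags only its AB work; everything else counts as C.
ab_task = infer_category({"AB", "search"}, tagged_category="AB")
untagged_task = infer_category({"search"}, tagged_category="AB")
```

One consequence of this rule is that a task nobody reviewed is silently counted in the default category, so a team using the shortcut would need to tag consistently for the resulting fraction to mean anything.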

Next Steps:
  • Joel: confirm definition (and check with Terry).
  • All: Send suggestions for names for tag AB and tag C.
  • Joel: continue/initiate email discussion on other categories
  • continue in email (inc. mailing list) until and unless another forum becomes better
  • All: Send edge case examples to Joel - how to measure follow-up discussion

Open Questions

If a team does maintenance on a tool the team doesn't own, should that count as maintenance? Or, in the new world, core? Does it matter if the Foundation "owns" the tool, or if it comes from another party? (Raised in context of Community Tech doing "fixes" work on non-WMF tools.)