Topic on Talk:Continuous integration/Codehealth Pipeline

Code coverage statistics are misleading

TJones (WMF) (talkcontribs)

Here's an example of a recent SonarQube report, which is shown on Gerrit as failing because only 67.9% of "new code" (not just code from my patch) is covered by tests. However, after merging this patch, it estimates that about 74.1% of code will be covered. Digging deeper, I found this report, which shows 96.5% coverage on my new code. I feel like the 67.9% number is effectively making my patch responsible for all the uncovered code in the repo. Is that what we want to do? I don't really feel qualified to write tests for some of the code outside the analysis/ directory I'm working in with this patch, so I could never get this to pass—and it might be a pain for my reviewers to review a bunch of completely unrelated tests at the same time, too.
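To make the gap concrete, here's a toy illustration (the line counts are made up, not taken from the actual report) of how a window-wide "new code" coverage number can be far below the patch-only number when the window also contains older, uncovered changes:

```python
# Hypothetical numbers: the 30-day "new code" window contains this patch
# plus earlier in-window changes by others that happen to lack tests.
patch_lines, patch_covered = 200, 193        # the patch itself: ~96.5% covered
other_new_lines, other_covered = 400, 214    # older in-window code: ~53.5% covered

# SonarQube-style window metric pools everything in the window together.
window_cov = (patch_covered + other_covered) / (patch_lines + other_new_lines)
patch_cov = patch_covered / patch_lines

print(f"window 'new code' coverage: {window_cov:.1%}")  # ~67.8%
print(f"patch-only coverage:        {patch_cov:.1%}")   # 96.5%
```

So a well-tested patch can still fail the gate purely because of what else landed in the same 30-day window.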

I think the new code coverage metric (96.5%, in this case) should be the pass/fail metric, while the before and after repo-level coverage (67.9% → 74.1%, in this case) would be very informative, but not part of the pass/fail (or maybe pass only if it increases). A link to the new-code report would also do more to encourage me to improve the coverage of my new code, and of the existing code that I'm working with and possibly more familiar with.

All that said, I love the SonarQube reports and I am very happy to have them available. I'd just like the snapshot shown on Gerrit to be more relevant and informative. Thanks!

Update: Ugh, I just realized that the 96.5% coverage report that I linked to is for that directory, which in this case happens to line up with my patch, but in general it is not that easy to get the info. Bummer. If coverage for the current patch is not available, showing the "67.9% → 74.1%" metric would still be useful and let us know we are moving in the right direction. (And if coverage for just the current patch is available, please tell me where to find it!)

GLederrey (WMF) (talkcontribs)

Having a good quality gate on code coverage is hard! At the moment, new code is defined as code created over the last 30 days. We can also configure that as code since the previous commit, which would give feedback on just your work. The drawback is that there are a lot of cases where it makes sense for an individual patch to have low coverage (some code makes less sense to test than other code). A sliding 30-day window is a compromise: it tries to care about the overall quality of a project, but with a focus on new code rather than old code that we're never going to touch (and so shouldn't invest in improving).
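For reference, that "new code" definition is a SonarQube project setting. A sketch of switching it from a day-based window to "previous version" via the Web API (the endpoint and setting key shown are the ones documented for SonarQube 7.x; newer versions use api/new_code_periods/set instead, and the host and project key below are placeholders):

```shell
# Sketch only: change the "new code" (leak) period for one project.
# Assumes a SonarQube 7.x instance and an auth token in $SONAR_TOKEN;
# verify endpoint and key names against your instance's /web_api docs.
SONAR_HOST="https://sonarqube.example.org"   # placeholder host
PROJECT_KEY="my-project"                     # placeholder project key

curl -s -u "${SONAR_TOKEN}:" \
  -X POST "${SONAR_HOST}/api/settings/set" \
  -d "key=sonar.leak.period" \
  -d "component=${PROJECT_KEY}" \
  -d "value=previous_version"
```

With the period set to previous_version, the "Coverage on New Code" condition would apply only to lines changed since the last analyzed version, which is closer to the per-patch number asked for above—at the cost of failing patches that legitimately have low coverage.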

I think we should be able to get the info about coverage of a specific patch, but I could not find it in the UI (at least not without reconfiguring the quality gates to only care about changes since the previous commit).
