Topic on Talk:Article feedback

More accurate rankings of ratings should consider number of reviewers

Readparse (talkcontribs)

I like the rating feature and I think it's well implemented. I knew this information must be aggregated somewhere, and I was happy to finally find the Article feedback dashboard. What I found was surprising.

At the moment I looked, the Forbes list of billionaires (2012) page was ranked number 1, and the Plastic page was ranked number 2. Not that individual rankings matter so much, because it's not a contest. But I think it is of tremendous value to see examples of what the community considers to be among the most valuable content on Wikipedia. And I think the Forbes list, while not a bad page, does not rise to that level. So I looked at the number of reviewers.

The Forbes article had about 20 reviewers at the time. The Plastic article had almost 500. I believe that a 4.85 with 500 reviewers definitely beats a 4.86 with 20 reviewers. The question is, what is the right way to factor the count into the algorithm? I'll look into whether there's a standard way to do this in the world of statistics, but maybe somebody here knows. Either way, I think it's a conversation worth having. I did a quick look through this Talk page to see whether this thread had already been started; I didn't see one, but I apologize if I missed it.
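
For what it's worth, one standard approach from statistics is a Bayesian average: shrink each article's mean rating toward the site-wide mean, in proportion to how few ratings it has. A minimal sketch in Python, assuming a 1-5 star scale; the site-wide mean of 3.9 and the prior weight of 100 are made-up numbers, not anything taken from the dashboard:

    def bayesian_average(item_mean, item_count, global_mean, prior_weight):
        # Articles with few ratings stay near global_mean; articles with
        # many ratings converge to their own observed mean.
        return (prior_weight * global_mean + item_count * item_mean) \
               / (prior_weight + item_count)

    # The two articles from the example above:
    print(bayesian_average(4.86, 20, 3.9, 100))   # Forbes list: ~4.06
    print(bayesian_average(4.85, 500, 3.9, 100))  # Plastic: ~4.69

With those assumed numbers, Plastic's 4.85 over 500 reviewers comfortably outranks the Forbes list's 4.86 over 20, which matches the intuition above.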

Readparse (talkcontribs)

After further consideration, it might make sense for there to be a threshold before an article appears on the list. There may already be one; otherwise you could end up with the highest-rated article being one that a single user gave all fives. The question is, what should the threshold be?

I submit that a "statistically valid sample size" should be determined. To do that, you need to have a denominator to start with. I think that denominator should be based not on the entire number of Wikipedia users, but perhaps on one of the following:

  • The total number of distinct users who have rated any page in that particular instance of Wikipedia
  • The total number of rating points awarded across all pages of that instance

By "instance," I mean a language or locale. For example, en.wikipedia.org.

So, to summarize: if I put up a page on trapeze cats and a few people rate it five stars (perhaps including me -- I don't know if that's possible), it doesn't show up on the list. But when that 600th user rates the page, the sample has become statistically valid, and it can appear on the list.
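
To make that concrete, here is one hypothetical shape the eligibility check could take, using the first denominator above (distinct raters on the instance); the fraction and floor values are placeholders, not proposals:

    def appears_on_list(article_rating_count, distinct_raters_on_instance,
                        fraction=0.0001, floor=25):
        # An article is listed only once its rating count reaches some
        # fraction of all distinct raters on this instance, with a hard
        # floor so tiny instances still require a minimum sample.
        threshold = max(floor, fraction * distinct_raters_on_instance)
        return article_rating_count >= threshold

    # With 6,000,000 distinct raters, the threshold works out to 600,
    # matching the "600th user" example above.
    print(appears_on_list(599, 6_000_000))  # False
    print(appears_on_list(600, 6_000_000))  # True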

Happy to talk more about this with anybody who is interested. I think the ratings are really useful and really interesting.

WhatamIdoing (talkcontribs)

I'm not sure that you understand the purpose of the dashboard. It is designed to show abnormal rating patterns (very high or very low) within the last 24 hours.
