More accurate rankings based on ratings should consider the number of reviewers
After further consideration, it might make sense for there to be a threshold a page must meet before it appears on the list. There may already be one; otherwise you could end up with the highest-rated article being one that was given all fives by a single user. The question is, what should the threshold be?
I submit that a "statistically valid sample size" should be determined. To do that, you need a denominator to start with. I think that denominator should be based not on the total number of Wikipedia users, but perhaps on one of the following:
- The total number of distinct users who have rated any page in that particular instance of Wikipedia.
- The total number of points awarded in ratings on all pages of that instance.
By "instance," I mean a language or locale. For example, en.wikipedia.org.
So, to summarize: if I put up a page on trapeze cats, and a few people rate it as five stars (perhaps including me -- I don't know if that's possible), it doesn't show up on the list. But once, say, the 600th user rates the page, the ratings have become statistically valid, and it can appear on the list.
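As a rough illustration of how such a threshold could be derived (not a claim about how the feature actually works), the standard finite-population sample-size formula could turn the chosen denominator into a per-instance minimum rating count. The population figure below is invented for the example, and the function name is my own:

```python
import math

def min_ratings(population, z=1.96, p=0.5, margin=0.05):
    """Finite-population sample-size formula: the minimum number of
    raters needed before a page's average is treated as statistically
    valid. `population` is the proposed denominator, e.g. the total
    number of distinct users who have rated any page on this instance.
    Defaults: 95% confidence (z=1.96), maximum variance (p=0.5),
    and a 5% margin of error."""
    numerator = population * z**2 * p * (1 - p)
    denominator = margin**2 * (population - 1) + z**2 * p * (1 - p)
    return math.ceil(numerator / denominator)

# Hypothetical: 50,000 distinct raters on en.wikipedia.org
print(min_ratings(50_000))  # 382
```

Note that the result is fairly insensitive to the denominator once the instance is large (it approaches 385 for this confidence level and margin), so the practical debate is really about choosing the margin of error and confidence level.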
Happy to talk more about this with anybody who is interested. I think the ratings are really useful and really interesting.