Thread:Talk:Article feedback/Irrelevant/reply (56)


 * "What ALL the statistics tools show is that every time ONE person views a non-celebrity article, TEN people ARE viewing a celebrity article."

I've seen no stats tools that show this. I've been asking around, and nobody seems to be aware of any stats tool that actually provides traffic stats by general subject. If you've got one, then I've got several people who want to know where you've been hiding it.

Let's say that 10% of Wikipedia articles are about celebrities, and that half of the high-traffic articles are about celebrities (and, therefore, that half of the high-traffic articles are not about celebrities). That suggests that half of the traffic to the high-traffic pages goes to celebrity articles (and half does not).

But we have said that celebrities are only 10% of Wikipedia articles. So celebrities get half of the high-traffic views from a small subset of articles. The flip side is that if 10% of articles make up 50% of the high-traffic articles, then they make up less than 10% of the moderate- and low-traffic articles. This means that the other 90% of articles account for more than 90% of the traffic on non-high-traffic pages.

The net result is that the 10% of articles about celebrities account for more than 10% of page views, but probably less than (probably much less than) 20% of the total traffic, because their "extra" views on the top end are dwarfed by the 90% of normal articles. (If we assume a simple distribution, the celeb articles, assumed to be 10% of total articles, have basically no chance of accounting for more than 15% of page views/traffic, but I have no idea how well the simple distribution conforms to reality.)

What you're missing is this: The English Wikipedia gets 5.7 billion page views a month. Together, the top 1000 articles (excluding special pages) account for less than 10% of that traffic. If half of those are about celebrities, then the highest-traffic celebrity pages account for less than 5% of total page views, and their share of the other 90% of traffic is disproportionately low (because they are overrepresented in the top-traffic articles, they are automatically underrepresented in the rest of the traffic categories).
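
To put rough numbers on that, here's a back-of-the-envelope sketch in Python. It is only an illustration of the arithmetic above, not output from any stats tool. The function and variable names are mine, the 10% and 50% figures are the hypothetical ones from this post, and treating the celebrity share of the remaining traffic as equal to their 10% share of articles is the "simple distribution" assumption (the argument above says the real figure should be lower, so this is a ceiling).

```python
def celebrity_share_upper_bound(top_share, celeb_share_of_top, celeb_share_of_rest):
    """Rough ceiling on the fraction of all page views going to celebrity articles.

    top_share           -- fraction of all traffic going to the top-traffic articles
    celeb_share_of_top  -- fraction of that top traffic going to celebrity articles
    celeb_share_of_rest -- fraction of the remaining traffic going to celebrity articles
    """
    top_part = top_share * celeb_share_of_top
    rest_part = (1.0 - top_share) * celeb_share_of_rest
    return top_part + rest_part


# Figures from this thread; all but the first two are assumptions, not measurements.
MONTHLY_VIEWS = 5.7e9        # English Wikipedia page views per month
TOP_SHARE = 0.10             # top 1000 articles get less than 10% of traffic
CELEB_SHARE_OF_TOP = 0.50    # ASSUMED: half of the top traffic is celebrity articles
CELEB_SHARE_OF_REST = 0.10   # ASSUMED ceiling: no more than their 10% share of articles

share = celebrity_share_upper_bound(TOP_SHARE, CELEB_SHARE_OF_TOP, CELEB_SHARE_OF_REST)
print(f"Celebrity share of all page views: at most about {share:.0%}")
print(f"That is roughly {share * MONTHLY_VIEWS / 1e9:.1f} billion of "
      f"{MONTHLY_VIEWS / 1e9:.1f} billion monthly views")
```

With these inputs the ceiling comes out at 14% (5% from the top articles plus 9% from everything else), which is where the "no more than 15%" figure above comes from. That is nowhere near ten celebrity views for every non-celebrity view.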