Hit stats aggregation

Page views
What's available from each hit:
 * page name
 * referrer:
   * local wiki/page
   * foreign URL
 * client:
   * geoip lookup (country or city-resolution?)

Image views
What's available from each hit:
 * image name
 * thumbnail pixel width
 * page number [for pdf, djvu]
 * referrer:
   * local wiki/page
   * foreign URL
 * client:
   * geoip lookup (country or city-resolution?)

General

 * Aggregation time resolution?
 * View style?
 * How much to combine / make available?
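
As a rough illustration of the time-resolution question, here is a minimal sketch that rounds hit timestamps down to the start of a fixed-length period and counts hits per bucket. The function names, the (name, timestamp) tuple shape, and the sample data are all hypothetical; nothing here is an existing implementation.

```python
from collections import Counter

def bucket_start(timestamp, period):
    """Round a Unix timestamp down to the start of its aggregation period."""
    return timestamp - (timestamp % period)

def aggregate_hits(hits, period):
    """Count hits per (name, period start).

    `hits` is an iterable of (name, unix_timestamp) pairs (assumed shape).
    `period` is the bucket length in seconds, e.g. 300 or 3600.
    """
    counts = Counter()
    for name, ts in hits:
        counts[(name, bucket_start(ts, period))] += 1
    return counts

# Made-up sample hits
hits = [("Main_Page", 1000), ("Main_Page", 1100), ("Main_Page", 1300), ("Foo", 1250)]
five_minute = aggregate_hits(hits, 300)
hourly = aggregate_hits(hits, 3600)
```

Coarser periods can be derived from finer ones by re-bucketing, so storing only the finest resolution we care about and combining on demand is one option.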

Storage
Note that Domas has tended to recommend flat files for this kind of info; it can eat a lot of database space, as you've got lots of per-row overhead.

If we're sticking these in MySQL, we might want something like this:


 * img_stats
   * is_id (int) primary key
   * is_img (varchar) -> img_name (or we could add a damn id to the image table!)
 * img_stats_period
   * isp_id (int) primary key
   * isp_img (int) -> is_id
   * isp_timestamp: start time
   * isp_period (int): number of seconds covered by this time period [5 minutes, 1 hr, 7 days, whatev]
   * isp_hits (int) -- total hits

Now for regional breakdowns:
 * img_stats_region
   * isr_id (int) -> isp_id
   * isr_country char(2)
   * isr_hits (int)

Breakdown by thumb size:
 * img_stats_size
   * iss_id (int) -> isp_id
   * iss_size_min (int) <- break down into ranges since we allow open-ended sizes :P
   * iss_size_max (int)
   * iss_hits (int)
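
A minimal sketch of the range breakdown, assuming a hypothetical list of pixel boundaries (the notes don't fix any) and using None as the upper bound of the open-ended top range:

```python
import bisect
from collections import Counter

# Hypothetical upper bounds of each range, in pixels; not taken from the notes.
BOUNDARIES = [120, 180, 250, 500, 1000]

def size_range(width):
    """Map a thumbnail width to (size_min, size_max).

    size_max is None for widths above the last boundary (open-ended range).
    """
    i = bisect.bisect_left(BOUNDARIES, width)
    size_min = 0 if i == 0 else BOUNDARIES[i - 1] + 1
    size_max = BOUNDARIES[i] if i < len(BOUNDARIES) else None
    return size_min, size_max

def hits_by_size(widths):
    """Count hits per size range from an iterable of requested widths."""
    return Counter(size_range(w) for w in widths)

# Made-up sample of requested thumbnail widths
size_counts = hits_by_size([120, 121, 180, 800, 2000])
```

Each key of the resulting counter corresponds to one (iss_size_min, iss_size_max) pair, and its value to iss_hits.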

Breakdown by source?
 * img_stats_referer
   * isr_id (int) -> isp_id
   * isr_referer -> sr_id
   * isr_hits (int)


 * stats_referer
   * sr_id (int) primary key
   * sr_url varchar(255)
   * potentially some annotation abilities?
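
To sanity-check how the tables above fit together, here is a rough sketch that creates them in an in-memory SQLite database (standing in for MySQL) and runs one regional-breakdown query. Table and column names come from the lists above; the NOT NULL constraints, the integer Unix-seconds type for isp_timestamp, the nullable iss_size_max, and the sample rows are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE img_stats (
    is_id  INTEGER PRIMARY KEY,
    is_img VARCHAR(255) NOT NULL          -- -> img_name
);
CREATE TABLE img_stats_period (
    isp_id        INTEGER PRIMARY KEY,
    isp_img       INTEGER NOT NULL,       -- -> is_id
    isp_timestamp INTEGER NOT NULL,       -- start time (assumed: Unix seconds)
    isp_period    INTEGER NOT NULL,       -- seconds covered by this period
    isp_hits      INTEGER NOT NULL        -- total hits
);
CREATE TABLE img_stats_region (
    isr_id      INTEGER NOT NULL,         -- -> isp_id
    isr_country CHAR(2) NOT NULL,
    isr_hits    INTEGER NOT NULL
);
CREATE TABLE img_stats_size (
    iss_id       INTEGER NOT NULL,        -- -> isp_id
    iss_size_min INTEGER NOT NULL,
    iss_size_max INTEGER,                 -- assumed: NULL for the open-ended range
    iss_hits     INTEGER NOT NULL
);
CREATE TABLE stats_referer (
    sr_id  INTEGER PRIMARY KEY,
    sr_url VARCHAR(255) NOT NULL
);
CREATE TABLE img_stats_referer (
    isr_id      INTEGER NOT NULL,         -- -> isp_id
    isr_referer INTEGER NOT NULL,         -- -> sr_id
    isr_hits    INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample data: one image, one hourly period, two countries.
conn.execute("INSERT INTO img_stats (is_id, is_img) VALUES (1, 'Example.jpg')")
conn.execute("INSERT INTO img_stats_period VALUES (1, 1, 0, 3600, 10)")
conn.executemany(
    "INSERT INTO img_stats_region VALUES (?, ?, ?)",
    [(1, "US", 6), (1, "DE", 4)],
)

# Per-country hits for one image, busiest country first.
rows = conn.execute(
    """
    SELECT isr_country, isr_hits
    FROM img_stats_region
    JOIN img_stats_period ON isr_id = isp_id
    JOIN img_stats ON isp_img = is_id
    WHERE is_img = ?
    ORDER BY isr_hits DESC
    """,
    ("Example.jpg",),
).fetchall()
```

Every breakdown row hangs off a period row, so each extra dimension (region, size, referer) multiplies the row count per period; that is the per-row overhead concern noted under Storage.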

Counter history view
Survey of existing hit history UIs:

trendingtopics.org:



stats.grok.se:



Further aggregation and use cases

 * Aggregate page/image hits per category
   * (e.g. for letting GLAMs know how much their files are being used)
 * Identify the most active new pages
   * Compare these against lists of failed searches to see how well new activity is serving people on the smaller sites
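
The per-category aggregation could look roughly like this. It is a sketch only: the two input dicts (hits per file, categories per file) and the sample names are hypothetical, and a file in several categories counts toward each of them.

```python
from collections import defaultdict

def hits_per_category(hits_by_file, categories_by_file):
    """Sum file hits into every category the file belongs to.

    hits_by_file:       {file name: hit count}            (assumed shape)
    categories_by_file: {file name: iterable of categories} (assumed shape)
    Files with no known category are skipped.
    """
    totals = defaultdict(int)
    for name, hits in hits_by_file.items():
        for category in categories_by_file.get(name, ()):
            totals[category] += hits
    return dict(totals)

# Made-up sample data
hits_by_file = {"A.jpg": 10, "B.jpg": 5, "C.jpg": 7}
categories_by_file = {
    "A.jpg": ["Museum_X"],
    "B.jpg": ["Museum_X", "Maps"],
}
category_totals = hits_per_category(hits_by_file, categories_by_file)
```

Because files can sit in more than one category, the category totals will not necessarily add up to the overall hit count.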