Hit stats aggregation

From MediaWiki.org

Page views

What's available from each hit:

  • page name
  • referrer:
    • local wiki/page
    • foreign URL
  • client:
    • GeoIP lookup (country-level or city-level resolution?)
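
As a rough sketch of how the per-hit fields above might be carried around, here is one possible record type plus a helper that splits a referrer into "local wiki page" versus "foreign URL". The names PageHit, classify_referrer, make_hit and LOCAL_HOST are made up for illustration, the /wiki/ path prefix is an assumption about the local URL layout, and the GeoIP lookup is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urlsplit, unquote

# Hypothetical host name for the local wiki; not taken from the notes above.
LOCAL_HOST = "en.wikipedia.org"


@dataclass(frozen=True)
class PageHit:
    page: str                 # page name
    referrer_kind: str        # "local", "foreign", or "none"
    referrer: Optional[str]   # local page name or full foreign URL
    country: Optional[str]    # result of a GeoIP lookup done elsewhere


def classify_referrer(url: Optional[str]) -> Tuple[str, Optional[str]]:
    """Split a referrer into (kind, value)."""
    if not url:
        return ("none", None)
    parts = urlsplit(url)
    # Assumption: local pages live under /wiki/<Title> on LOCAL_HOST.
    if parts.hostname == LOCAL_HOST and parts.path.startswith("/wiki/"):
        return ("local", unquote(parts.path[len("/wiki/"):]))
    return ("foreign", url)


def make_hit(page: str, referrer_url: Optional[str],
             country: Optional[str] = None) -> PageHit:
    kind, value = classify_referrer(referrer_url)
    return PageHit(page, kind, value, country)
```

Reducing local referrers to a page name keeps them joinable against page titles, while foreign referrers stay as full URLs.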

Image views

What's available from each hit:

  • image name
  • thumbnail pixel width
  • page number [for pdf, djvu]
  • referrer:
    • local wiki/page
    • foreign URL
  • client:
    • GeoIP lookup (country-level or city-level resolution?)
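
For image hits, the name, thumbnail width and page number could probably be pulled out of the request path. The sketch below assumes thumbnail paths look like .../thumb/a/ab/Name.ext/220px-Name.ext, with an optional pageN- prefix for multi-page files; that layout is an assumption and should be checked against the real server configuration. ThumbHit and parse_thumb_path are hypothetical names.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import unquote


class ThumbHit(NamedTuple):
    image: str            # image name
    width: int            # thumbnail pixel width
    page: Optional[int]   # page number for pdf/djvu, else None


# Assumed thumbnail file name prefix: optional "page<N>-", then "<width>px-".
_PREFIX = re.compile(r"(?:page(\d+)-)?(\d+)px-")


def parse_thumb_path(path: str) -> Optional[ThumbHit]:
    """Return a ThumbHit for a thumbnail path, or None if it does not match."""
    segments = path.split("/")
    if "thumb" not in segments or len(segments) < 2:
        return None
    match = _PREFIX.match(segments[-1])
    if match is None:
        return None
    page = int(match.group(1)) if match.group(1) else None
    # The second-to-last segment is taken to be the original image name.
    return ThumbHit(unquote(segments[-2]), int(match.group(2)), page)
```

Hits on the original (non-thumbnail) file return None here and would need separate handling.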

General

  • Aggregation time resolution?
  • View style?
  • How much to combine / make available?
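
Whatever resolution gets picked, the bucketing itself is simple. A minimal sketch, assuming hits arrive as (name, Unix timestamp) pairs and periods are expressed in seconds; bucket_start, aggregate and rollup are illustrative names only.

```python
from collections import Counter


def bucket_start(ts: int, period: int) -> int:
    """Floor a Unix timestamp to the start of its period (period in seconds)."""
    return ts - (ts % period)


def aggregate(hits, period):
    """Count hits per (name, bucket start). `hits` yields (name, unix_ts)."""
    counts = Counter()
    for name, ts in hits:
        counts[(name, bucket_start(ts, period))] += 1
    return counts


def rollup(counts, period):
    """Combine finer buckets into coarser ones, e.g. 5 minutes into 1 hour."""
    coarse = Counter()
    for (name, start), n in counts.items():
        coarse[(name, bucket_start(start, period))] += n
    return coarse
```

Rolling up is only exact when the coarser period is a whole multiple of the finer one, which is one argument for collecting at a fine resolution and combining later.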

Storage

Note that Domas has tended to recommend flat files for this kind of info; stored in a database it can eat a lot of space, as you've got lots of per-row overhead.

If we're sticking these in MySQL, we might want something like this:

  • img_stats
    • is_id (int) primary key
    • is_img (varchar) -> img_name (or we could add a damn id to the image table!)
  • img_stats_period
    • isp_id (int) primary key
    • isp_img (int) -> is_id
    • isp_timestamp: start time
    • isp_period (int): number of seconds covered by this time period [5 minutes, 1 hr, 7 days, whatev]
    • isp_hits (int) -- total hits
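
To make the two tables above concrete, here is a sketch that uses an in-memory SQLite database as a stand-in for MySQL. The column names come from the outline; the NOT NULL and UNIQUE constraints, the use of Unix seconds for isp_timestamp, and the record_hits helper are all assumptions added for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE img_stats (
    is_id  INTEGER PRIMARY KEY,
    is_img VARCHAR(255) NOT NULL UNIQUE      -- -> img_name
);
CREATE TABLE img_stats_period (
    isp_id        INTEGER PRIMARY KEY,
    isp_img       INTEGER NOT NULL REFERENCES img_stats(is_id),
    isp_timestamp INTEGER NOT NULL,          -- start time (Unix seconds here)
    isp_period    INTEGER NOT NULL,          -- seconds covered by this period
    isp_hits      INTEGER NOT NULL DEFAULT 0,
    UNIQUE (isp_img, isp_timestamp, isp_period)
);
"""


def record_hits(db, img_name, start, period, hits):
    """Add `hits` to the (image, start, period) row, creating rows as needed."""
    db.execute("INSERT OR IGNORE INTO img_stats (is_img) VALUES (?)", (img_name,))
    (is_id,) = db.execute(
        "SELECT is_id FROM img_stats WHERE is_img = ?", (img_name,)
    ).fetchone()
    cur = db.execute(
        "UPDATE img_stats_period SET isp_hits = isp_hits + ? "
        "WHERE isp_img = ? AND isp_timestamp = ? AND isp_period = ?",
        (hits, is_id, start, period),
    )
    if cur.rowcount == 0:
        # No row for this period yet, so create it.
        db.execute(
            "INSERT INTO img_stats_period "
            "(isp_img, isp_timestamp, isp_period, isp_hits) VALUES (?, ?, ?, ?)",
            (is_id, start, period, hits),
        )


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
record_hits(db, "Example.jpg", 0, 3600, 5)
record_hits(db, "Example.jpg", 0, 3600, 2)     # same hour: added to the same row
record_hits(db, "Example.jpg", 3600, 3600, 1)  # next hour: new row
```

Keeping isp_period on every row lets 5-minute, hourly and weekly rows share one table, at the cost of the per-row overhead mentioned above.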

Now for regional breakdowns:

  • img_stats_region
    • isr_id (int) -> isp_id
    • isr_country char(2)
    • isr_hits (int)

Breakdown by thumb size:

  • img_stats_size
    • iss_id (int) -> isp_id
    • iss_size_min (int) <- break down into ranges since we allow open-ended sizes :P
    • iss_size_max (int)
    • iss_hits (int)
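
Mapping an arbitrary thumbnail width onto a fixed set of ranges could look something like the following. The boundaries in BOUNDS are invented for the example (the notes above do not fix any), and the top range is left open-ended by returning None as its maximum.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical range boundaries in pixels: each entry is the inclusive
# minimum of one range. The last range has no upper limit.
BOUNDS = [0, 121, 201, 401, 801]


def size_range(width):
    """Return (iss_size_min, iss_size_max) for a width; max is None at the top."""
    if width < 0:
        raise ValueError("width must be non-negative")
    i = bisect_right(BOUNDS, width) - 1
    lo = BOUNDS[i]
    hi = BOUNDS[i + 1] - 1 if i + 1 < len(BOUNDS) else None
    return (lo, hi)


def hits_by_size(widths):
    """Count hits per size range from an iterable of requested widths."""
    return Counter(size_range(w) for w in widths)
```

For example, size_range(120) falls in (0, 120) and size_range(121) in (121, 200), so the ranges never overlap.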

Breakdown by source?

  • img_stats_referer
    • isr_id (int) -> isp_id
    • isr_referer -> sr_id
    • isr_hits (int)
  • stats_referer
    • sr_id (int) primary key
    • sr_url varchar(255)
    • potentially some annotation abilities?
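
The point of the separate stats_referer table is that each URL is stored once and referenced by id. A small in-memory sketch of that idea, assuming URLs longer than 255 characters are simply truncated to fit sr_url; RefererTable and count_referers are hypothetical names, and a real version would do the lookup against the database.

```python
from collections import Counter


class RefererTable:
    """In-memory stand-in for stats_referer: maps sr_url to sr_id."""

    MAX_URL = 255  # matches sr_url varchar(255)

    def __init__(self):
        self._ids = {}

    def get_id(self, url):
        """Return the id for a URL, assigning the next free id if it is new."""
        key = url[: self.MAX_URL]
        if key not in self._ids:
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]


def count_referers(period_id, urls, table):
    """Return {(period id, referer id): hits} for one time period."""
    counts = Counter()
    for url in urls:
        counts[(period_id, table.get_id(url))] += 1
    return dict(counts)
```

Note that truncation merges distinct URLs that share their first 255 characters, so counts for very long URLs may be combined.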

Counter history view

Survey of existing hit history UIs:

trendingtopics.org:

[Screenshot: Page hit counter sample from trendingtopics-org.png]

stats.grok.se:

[Screenshot: Page hit counter sample-grok-se.png]

Further aggregation and use cases

  • Aggregate page/image hits per category
    • (eg for letting GLAMs know how much their files are being used)
  • Identify the most active new pages
    • ^ and compare against lists of failed searches to see how well new activity is serving people on the smaller sites
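
Both use cases above reduce to simple passes over per-title hit counts. A sketch, assuming the counts, category memberships and page creation times are already available as plain dicts; hits_per_category and most_active_new_pages are illustrative names, and the comparison against failed searches is not covered.

```python
from collections import Counter


def hits_per_category(hits, categories):
    """Sum hits per category.

    hits: {title: hit count}; categories: {title: iterable of category names}.
    """
    totals = Counter()
    for title, n in hits.items():
        for cat in set(categories.get(title, ())):
            totals[cat] += n
    return totals


def most_active_new_pages(hits, created, since, limit=10):
    """Rank pages created at or after `since` by hit count, highest first.

    created: {title: creation time}; titles without an entry are skipped.
    """
    new = [(t, n) for t, n in hits.items() if t in created and created[t] >= since]
    return sorted(new, key=lambda item: (-item[1], item[0]))[:limit]
```

A file in several categories is counted once in each, so per-category totals should not be added together to get an overall figure.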