User:Jeblad/Temporal statistics

Temporal statistics is an adaptation of the mw-core that allows statistics to be calculated for the last hours, days or weeks.

The basic configuration is intended for low-traffic sites that have no Squid servers and do not cache traffic data, that is, sites where $wgHitcounterUpdateFreq is less than or equal to 1. On high-traffic sites, where $wgHitcounterUpdateFreq is larger than 1, internal caching is used to reduce the load on the database server.
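The difference between the two modes can be illustrated with a minimal sketch. It is not the MediaWiki implementation: the class `BufferedHitCounter`, its `store` (standing in for the database) and its `pending` buffer are hypothetical names, and the flush rule (write once the buffer holds `update_freq` hits) is an assumption made for illustration.

```python
from collections import Counter


class BufferedHitCounter:
    """Hypothetical stand-in for hit counting with optional buffering."""

    def __init__(self, update_freq, store=None):
        self.update_freq = update_freq
        # 'store' plays the role of the database table in this sketch.
        self.store = store if store is not None else Counter()
        # Hits waiting to be written when buffering is enabled.
        self.pending = Counter()

    def hit(self, page):
        if self.update_freq <= 1:
            # Low-traffic mode: every hit is written immediately.
            self.store[page] += 1
            return
        # High-traffic mode: collect hits and write them in one batch.
        self.pending[page] += 1
        if sum(self.pending.values()) >= self.update_freq:
            self.flush()

    def flush(self):
        # Add all buffered counts to the store, then empty the buffer.
        self.store.update(self.pending)
        self.pending.clear()
```

With `update_freq` set to 3, two hits stay in the buffer and the third one triggers a single batched write, so the store is touched once instead of three times.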

In the basic configuration without Squids, no external infrastructure is needed to manage external log files. If there are Squids in front of the web servers, a maintenance script must be used to populate the database tables with statistics data. The data from the Squids will then be integrated into the same presentations as ordinary traffic.

The adaptation uses ring buffers to hold temporal statistics; by default each ring buffer is either absent or of length 2. One slot in the ring buffer accumulates statistics, and accumulation moves on to the next slot when the Unix time passes from one hour to the next, from one day to the next, or from one week to the next. The length of a ring buffer, less one, gives the maximum number of completed periods available at that resolution, because one slot is always occupied by the period still being accumulated. In the default configuration that means a single slot holds the previous whole hour, the previous whole day or the previous whole week.
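The slot selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch itself: `PERIOD_SECONDS`, `slot_index` and `RingCounter` are hypothetical names, the buffer lives in memory rather than in database columns, and the per-slot period stamp is an added assumption so that a slot left over from an earlier lap around the ring is never reported as current.

```python
# Seconds per period for each resolution. Weeks counted from the Unix epoch
# start on Thursday 00:00 UTC, because 1970-01-01 was a Thursday.
PERIOD_SECONDS = {"hour": 3600, "day": 86400, "week": 604800}


def slot_index(unix_time, resolution, ring_length):
    """Return the ring-buffer slot that accumulates hits at unix_time."""
    period = unix_time // PERIOD_SECONDS[resolution]
    return period % ring_length


class RingCounter:
    """Hypothetical in-memory stand-in for one page's ring buffer."""

    def __init__(self, resolution, ring_length=2):
        self.resolution = resolution
        self.counts = [0] * ring_length
        # Assumption: remember which period each slot holds, so stale data
        # from an earlier lap around the ring is never reported.
        self.periods = [None] * ring_length

    def hit(self, unix_time):
        period = unix_time // PERIOD_SECONDS[self.resolution]
        idx = period % len(self.counts)
        if self.periods[idx] != period:
            # The slot still holds an older period: reset it before reuse.
            self.counts[idx] = 0
            self.periods[idx] = period
        self.counts[idx] += 1

    def previous(self, unix_time):
        """Return the count for the last completed period, or 0 if unknown."""
        period = unix_time // PERIOD_SECONDS[self.resolution] - 1
        idx = period % len(self.counts)
        return self.counts[idx] if self.periods[idx] == period else 0
```

With the default length of 2, one slot accumulates the current hour while the other holds the previous whole hour, which is what `previous` reads.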

Only real pages get statistics; special pages have none. This should have few practical consequences.

Special page
There will be a special page to get statistics for a single page. The call syntax is Special:Pagestatistics/pagename, and it will display both the global statistics and the temporal statistics collected for that page.

Configuration
Additional configuration in the, with adaptions in

The configuration values are used in modulus operations to calculate the current slot. That means any change to the configuration will invalidate previously collected statistics. A maintenance script can be written to recalculate the slot indexes and reorganize the database accordingly.

If the length of a ring buffer is changed, the old table has to be kept temporarily, the new table created, and the statistics reorganized and moved back. This is the only general way to keep the data in the slots. If the ring buffer is shortened, the oldest data are dropped but the ring will be full. If the ring buffer is lengthened, the ring will only be partially filled.
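The reorganization step can be sketched as a remapping from old slot indexes to new ones. This is an illustration only: `resize_ring` is a hypothetical helper, the slots are a plain list rather than table columns, and it assumes that every old slot really holds the period its index corresponds to (no stale slots).

```python
def resize_ring(old_counts, new_length, current_period):
    """Move counts from a ring of one length into a ring of another length.

    current_period is the Unix time divided by the period length, i.e. the
    number of the period that is accumulating right now.
    """
    old_length = len(old_counts)
    new_counts = [0] * new_length
    # Walk backwards from the current period; only as many periods as fit
    # in both rings can be carried over.
    for age in range(min(old_length, new_length)):
        period = current_period - age
        new_counts[period % new_length] = old_counts[period % old_length]
    return new_counts
```

Shortening a ring of four slots to two keeps the two newest periods and fills the new ring completely; lengthening it to six carries over all four periods and leaves two slots empty until new data arrive.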

Altering the page table
The following shows how the database table page is changed to accommodate the new functionality. If longer time series are needed, the number of slots for,   and   must be increased accordingly.

Note that some code should be added to verify that the number of slots stays in accordance with the definitions of,   and  .
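One way such a check could look is sketched below. Everything specific in it is an assumption: the column naming scheme (`page_hour_0`, `page_hour_1`, and so on), the helper names `slot_indexes` and `verify_slots`, and the idea of passing the configured slot counts as a dictionary are all hypothetical and do not come from the patch.

```python
import re


def slot_indexes(columns, prefix):
    """Return the sorted slot numbers of columns named '<prefix>_<n>'."""
    pattern = re.compile(r"^" + re.escape(prefix) + r"_(\d+)$")
    found = []
    for name in columns:
        match = pattern.match(name)
        if match:
            found.append(int(match.group(1)))
    return sorted(found)


def verify_slots(columns, configured):
    """Return the prefixes whose columns do not match the configured count.

    'configured' maps a hypothetical column prefix to the number of slots
    the configuration expects, numbered 0 .. count - 1.
    """
    problems = []
    for prefix, expected in configured.items():
        if slot_indexes(columns, prefix) != list(range(expected)):
            problems.append(prefix)
    return problems
```

An empty result means the table and the configuration agree; any prefix returned points at a ring buffer whose columns need to be added or removed before statistics are collected.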

Note that the above reflects the database schema used for. If the numbers are changed, the database schema must be changed accordingly.

Altering the site_stats table
The following shows how the database table site_stats is changed to accommodate the new functionality. If longer time series are needed, the number of slots for,   and   must be increased accordingly.

Note that some code should be added to verify that the number of slots stays in accordance with the definitions of,   and  .

Note that the above reflects the database schema used for. If the numbers are changed, the database schema must be changed accordingly.

Altering the hitcounter table
The following shows how the database table hitcounter is changed to accommodate the new functionality.

Patch for Article.php
The patch is an adaptation of  to use additional columns in the database table for the page, typically named mw_page.

Note that the incViewCount function does not include the code for bulk updates.