User:DarTar/SandBox

=Feb 2011 Update=

Overview
In January 2011 en:User:DarTar joined the Article Feedback research team. We ran a new series of analyses on data from the Phase 1 Article Feedback tool to address a number of research questions that may help inform the design and implementation of the tool. The analysis at this stage is exploratory, and we will examine the preliminary findings further over the coming weeks. Feedback is always appreciated on the talk page.

The questions that we considered for the present study are the following:
 * Are ratings reliable indicators of article quality?
 * Are there correlations between measurable features of an article (size, number of citations, views, quality-related templates) and the volume/quality of ratings?
 * Do different classes of users (anonymous vs. registered) rate articles differently, and do users within the same class rate consistently?
 * What factors drive conversions (i.e. the decision to rate an article after visiting it)?
 * Are there significant changes over time in rating?
 * Do changes in article features produce shifts in ratings or rating volume?

As an initial case study, we decided to focus on "well sourced" ratings in particular, to try to understand the relation between the presence of citations or of source/citation needed templates (or lack thereof) on the one hand and the perceived quality of the article on the other.

The dataset
The sample consists of a total of 727 articles selected from the PPI project plus an additional list of articles related to special events. We collected ratings for articles in this sample between September 2010 and January 2011 (hereafter: "observation period"), for a total of 52,787 ratings, 94.3% of which were generated by anonymous users and 5.7% by registered users. The mean number of ratings is 72 per article but, as expected, the distribution of ratings per article is very skewed, as detailed below. In addition to the ratings available from the Article Feedback tool, we obtained the following data:
 * daily volume of article views (from http://stats.grok.se)
 * daily changes in article length (via the Wikipedia API)
 * daily changes in number of citations (via the Wikipedia API)
 * daily changes in the number of citation/source needed templates (via the Wikipedia API)
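
To illustrate how the last three features could be derived from the wikitext of a single revision, here is a minimal sketch. It assumes the wikitext has already been retrieved (for example via the Wikipedia API). The function name `article_features`, the regular expressions, and the list of template names are hypothetical choices made for illustration, not the ones used in this study.

```python
import re

# Assumption: one citation = one opening <ref ...> tag, including
# self-closing reuses such as <ref name="x" />. The trailing character
# class prevents a match on <references/>.
REF_TAG = re.compile(r"<ref[\s>/]", re.IGNORECASE)

# Assumption: these template names stand in for "citation/source needed"
# templates; the list actually tracked in the study is not given.
NEEDED_TEMPLATE = re.compile(
    r"\{\{\s*(?:citation needed|cn|fact|unreferenced|refimprove)\s*[|}]",
    re.IGNORECASE,
)


def article_features(wikitext):
    """Return length and citation-related counts for one revision."""
    return {
        # Length in bytes of the UTF-8 encoded wikitext.
        "length_bytes": len(wikitext.encode("utf-8")),
        # Number of opening <ref> tags.
        "citations": len(REF_TAG.findall(wikitext)),
        # Number of citation/source needed templates (hypothetical list).
        "needed_templates": len(NEEDED_TEMPLATE.findall(wikitext)),
    }
```

Calling this function on one revision per day and differencing consecutive results would give daily changes of the kind listed above.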

Article Length
The sample selected for this study is not a random sample of Wikipedia articles and as such it shouldn't be considered representative of Wikipedia articles at large. In particular, the sample includes articles that were already at a very mature stage at the beginning of the observation period (such as en:United States) as well as articles that were created from scratch and underwent a dramatic volume of edits during the observation period (such as the en:GFAJ-1 article). As a result, articles in the sample differ substantially in initial size and in how much they changed during the observation period, both in absolute terms (total number of bytes added) and in relative terms (proportion of bytes added with respect to the initial size). Figure 1 shows the distribution of initial article lengths (using logarithmic binning). Figure 2 shows the distribution of relative length change during the observation period (top) and the relation between relative length change and initial length (bottom).
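
The two length-change measures and the logarithmic binning mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, bins are taken as powers of 10 (the bin width used for the figures is not stated), and an article with an initial size of zero is given no relative change rather than an infinite one.

```python
import math
from collections import Counter


def length_change(initial_bytes, final_bytes):
    """Return (absolute change, relative change) in article length.

    Relative change is the proportion of bytes added with respect to
    the initial size; it is None when the initial size is zero
    (assumption: such cases are left undefined).
    """
    absolute = final_bytes - initial_bytes
    relative = absolute / initial_bytes if initial_bytes > 0 else None
    return absolute, relative


def log_bin_counts(values):
    """Count values in logarithmically spaced bins.

    Bin k covers the interval [10**k, 10**(k + 1)). Non-positive
    values have no logarithm and are skipped.
    """
    counts = Counter()
    for value in values:
        if value <= 0:
            continue
        counts[math.floor(math.log10(value))] += 1
    return dict(counts)
```

For example, an article that grows from 1000 to 1500 bytes has an absolute change of 500 bytes and a relative change of 0.5.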

Rating volume
Articles in the sample differ significantly in the volume of ratings they generate, with a strongly skewed distribution of the number of ratings per article across all four rating dimensions. Figure 3 shows a histogram with the distribution of the total number of ratings per article, with logarithmically spaced bins.



However, it's interesting to note that the completion rate for articles that get rated is very high, i.e. when people decide to rate an article, they consistently do so along all four dimensions. Figure 4 compares the volume of ratings per article along different dimensions (each dot represents an individual article) and shows a very strong linear correlation between the numbers of ratings an article receives on any two dimensions. To put it differently, it's very rare for an article to have a very high number of ratings in one dimension and only a few ratings in the other three.
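
The strength of a linear relation like the one in Figure 4 is commonly summarised with the Pearson correlation coefficient. The sketch below shows how it could be computed from per-article rating counts on two dimensions. The function name `pearson_r` and the sample counts in the usage note are hypothetical, and the text above does not say which statistic was used for the figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences.

    xs[i] and ys[i] are the numbers of ratings that article i received
    on two different rating dimensions.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of co-deviations and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

With made-up counts such as `pearson_r([10, 20, 30, 40], [12, 22, 32, 42])`, where the second dimension always receives two more ratings than the first, the coefficient is 1, i.e. a perfect linear relation. A value close to 1 across the real sample would correspond to the high completion rate described above.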