Topic on Talk:Wikistats 2.0 Design Project/RequestforFeedback/Round1/Detail page single wiki

How important is it to be able to download the data behind this page?

Milimetric (WMF) (talkcontribs)

What formats would you like besides CSV, TSV, and JSON?

NickK (talkcontribs)

Is it possible to have the good old tables for all wikis available somewhere? This is the simplest way to use the data. It would be great to be able to download tables like the one at https://stats.wikimedia.org/EN/TablesPageViewsMonthlyCombined.htm , without having to download them wiki by wiki.

Otherwise yes, it is really important (at least for me) to be able to download data. For instance, I recently did a study to find out what impacts page views more: an increase in bot edits or an increase in the number of active human users? This was quite simple as I could download (a) the pageview table for all wikis, (b) the active users table for all wikis, (c) the bot edits % for all wikis, (d) the article counts for all wikis. Then it was quite straightforward, as I had to process just 4 spreadsheets and could select the most prominent examples (e.g. wikis with the most bot edits, wikis with the highest page view growth) for each. It would have been far more difficult if I had had to download the data wiki by wiki.

To sum up: yes, easy downloading is essential for me.
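A minimal sketch of the kind of cross-wiki processing described above, assuming the all-wiki tables are available as CSV. The wiki names, column names, and numbers are made up for illustration and do not come from stats.wikimedia.org; only two of the four tables are shown.

```python
import csv
import io

# Hypothetical all-wiki tables, one row per wiki (illustrative numbers only).
PAGEVIEWS_CSV = """wiki,views_before,views_after
aa.wikipedia,1000,1500
bb.wikipedia,2000,2100
cc.wikipedia,500,450
"""

BOT_SHARE_CSV = """wiki,bot_edit_share
aa.wikipedia,0.60
bb.wikipedia,0.10
cc.wikipedia,0.35
"""


def read_table(text):
    """Parse CSV text into a dict of rows keyed by wiki name."""
    return {row["wiki"]: row for row in csv.DictReader(io.StringIO(text))}


def combine(pageviews, bot_share):
    """Join the two tables on wiki and compute relative page view growth."""
    combined = []
    for wiki, pv in pageviews.items():
        if wiki not in bot_share:
            continue  # skip wikis missing from either table
        before = float(pv["views_before"])
        after = float(pv["views_after"])
        combined.append({
            "wiki": wiki,
            "growth": (after - before) / before,
            "bot_edit_share": float(bot_share[wiki]["bot_edit_share"]),
        })
    return combined


rows = combine(read_table(PAGEVIEWS_CSV), read_table(BOT_SHARE_CSV))
top_growth = max(rows, key=lambda r: r["growth"])
top_bots = max(rows, key=lambda r: r["bot_edit_share"])
print(top_growth["wiki"], top_bots["wiki"])
```

With one file per metric covering all wikis, the join is a few lines; with one download per wiki, the same analysis would first need hundreds of files fetched and merged.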

Milimetric (WMF) (talkcontribs)

Good to know. So all-wiki data is important, and access to the raw data is important. We were talking about having an "advanced" interface for these kinds of use-cases, so we can keep the first thing people see more approachable. Do you have use-cases that would benefit from the simpler interface, or is most of your work cross-wiki?

NickK (talkcontribs)

Yes, of course. I cited them in some other section (sorry, can't find it with Flow, some Topic:Tkaqwertyuiop...): I also look at simple indicators, like the increase in the number of active users on Commons during Wiki Loves Earth, or the increase in new articles on one particular Wikipedia during an edit-a-thon. Graphs are useful for such indicators.

NickK (talkcontribs)

Oh, and one more example: I wanted to find the number of active users for all projects per language and find the most active languages per region. Once again, this is straightforward if you can download the data (the sum of a few tables plus the regions from https://stats.wikimedia.org/EN/Sitemap.htm ), but hardly possible without a download option.

We are not Google, which hides the underlying trend data and shows only relative graphs in Google Trends. Please make the data available for download, the same way one could easily copy the old tables. Or keep the old tables (no matter how ugly they are, having the data is essential).

MCruz (WMF) (talkcontribs)

It is important, especially for research purposes and comparative studies.

Erik Zachte (talkcontribs)

Let me testify that this feature was requested a lot for Wikistats 1.0, and I mean really a lot. Many people wanted to play with the data themselves. Sadly it never happened, as it would either complicate the code further or require a major rewrite. Wikistats 1.0 does data aggregation and presentation in one go (not a fortunate choice), so the code is already interspersed with output statements. Writing a flat CSV file alongside a not quite flat HTML table would make the code even harder to maintain.
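To illustrate the design point, here is a minimal sketch of keeping aggregation and presentation separate, so that a CSV export and an HTML table are both rendered from the same aggregated rows. This is not Wikistats code; the function names (`aggregate`, `to_csv`, `to_html`) and the sample data are hypothetical.

```python
import csv
import io
from html import escape

FIELDS = ["wiki", "month", "edits"]


def aggregate(edits):
    """Aggregation step: count edits per (wiki, month). Produces no output."""
    totals = {}
    for wiki, month in edits:
        totals[(wiki, month)] = totals.get((wiki, month), 0) + 1
    return [
        {"wiki": w, "month": m, "edits": n}
        for (w, m), n in sorted(totals.items())
    ]


def to_csv(rows):
    """Presentation step 1: flat CSV text from the aggregated rows."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_html(rows):
    """Presentation step 2: an HTML table from the same aggregated rows."""
    header = "".join(f"<th>{name}</th>" for name in FIELDS)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{escape(str(row[name]))}</td>" for name in FIELDS)
        + "</tr>"
        for row in rows
    )
    return f"<table><tr>{header}</tr>{body}</table>"


# Made-up edit events: (wiki, month) pairs.
sample = [
    ("aa.wikipedia", "2017-01"),
    ("aa.wikipedia", "2017-01"),
    ("bb.wikipedia", "2017-02"),
]
rows = aggregate(sample)
csv_text = to_csv(rows)
html_text = to_html(rows)
print(csv_text)
```

Because neither renderer touches the counting logic, adding another download format (TSV, JSON) would mean adding one more small function rather than threading more output statements through the aggregation code.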
