Topic on Talk:Quarry

199.111.226.142 (talkcontribs)

I'm looking for a query that can give me the user ID/name along with all activity associated with that user, such as comments and interactions with other users, as well as the text of edits made on pages.

199.111.226.142 (talkcontribs)

Even just the names of the tables that need to be joined would be good to know. I'll write the query myself.

Milimetric (WMF) (talkcontribs)
Cr29uva (talkcontribs)

Hey @Milimetric

Thanks a ton for your reply! Yes, I am trying to do something along the lines of the Detox research. I have a couple of questions; it would be great if you have any insight into them:

  1. The database layout page says there's a table called text which contains the "text content" of each revision. But when I try to access that table using Quarry, I get an error saying that the table doesn't exist. According to Manual:Text table, the text should be available in table form, so I wonder why the error says the table doesn't exist. Any idea? Ideally I'd get the data from a SQL-like table; my last resort would be to parse the XML dump.
  2. What would be the best way to extract conversations from article talk or user talk pages? I'm interested in analyzing user conversations and linking them to the probability of a user getting blocked in the future.
Milimetric (WMF) (talkcontribs)

@Cr29uva, yes, the text table isn't replicated to these databases because it's too big and it's hard to strip the private data out of it. The XML dumps are your best bet for now. We're close to releasing a new dataset that's more analytics-oriented, which may be of interest to you in the future. It will have user blocks as applied over the history of each user (so, for example, if someone has been blocked five times, it will have each of those periods with a start and an end). It won't have text initially either, but we're working on adding that over the next year.
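
For anyone going the XML dump route, here is a minimal sketch of how talk-page revisions might be pulled out with Python's standard library. It assumes the usual export layout (page elements holding title, ns, and revision children, with contributor and text inside each revision) and that namespaces 1 and 3 are article talk and user talk. The function names and the tiny SAMPLE document are made up for illustration, and the schema version in the sample's xmlns is an assumption; the code strips the XML namespace from tag names so the version should not matter.

```python
import io
import xml.etree.ElementTree as ET

# Namespace numbers for article talk (1) and user talk (3) pages.
TALK_NAMESPACES = {"1", "3"}


def local(tag):
    """Strip the '{namespace-uri}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def revision_record(title, rev):
    """Flatten one <revision> element into a plain dict (hypothetical shape)."""
    record = {"title": title, "rev_id": None, "timestamp": None,
              "user": None, "text": ""}
    for child in rev:
        name = local(child.tag)
        if name == "id":
            record["rev_id"] = child.text
        elif name == "timestamp":
            record["timestamp"] = child.text
        elif name == "contributor":
            # Registered users carry <username>; anonymous edits carry <ip>.
            for part in child:
                if local(part.tag) in ("username", "ip"):
                    record["user"] = part.text
        elif name == "text":
            record["text"] = child.text or ""
    return record


def iter_talk_revisions(source):
    """Yield one dict per revision of every talk page in an export file.

    `source` is a filename or a binary file object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title, ns, revisions = None, None, []
        for child in elem:
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "ns":
                ns = child.text
            elif name == "revision":
                revisions.append(child)
        if ns in TALK_NAMESPACES:
            for rev in revisions:
                yield revision_record(title, rev)
        elem.clear()  # drop the finished page's children to limit memory use


# Tiny made-up export used only to exercise the functions above.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>User talk:Example</title>
    <ns>3</ns>
    <id>1</id>
    <revision>
      <id>10</id>
      <timestamp>2017-01-01T00:00:00Z</timestamp>
      <contributor><username>Alice</username><id>7</id></contributor>
      <text>Hello there</text>
    </revision>
    <revision>
      <id>11</id>
      <parentid>10</parentid>
      <timestamp>2017-01-02T00:00:00Z</timestamp>
      <contributor><ip>192.0.2.1</ip></contributor>
      <text>Reply</text>
    </revision>
  </page>
  <page>
    <title>Example article</title>
    <ns>0</ns>
    <id>2</id>
    <revision>
      <id>12</id>
      <timestamp>2017-01-03T00:00:00Z</timestamp>
      <contributor><username>Bob</username><id>8</id></contributor>
      <text>Article body</text>
    </revision>
  </page>
</mediawiki>"""

for record in iter_talk_revisions(io.BytesIO(SAMPLE)):
    print(record["title"], record["rev_id"], record["user"])
```

Real dumps are usually bz2-compressed, so passing a file object from `bz2.open(path, "rb")` instead of the in-memory sample is one way to read them without unpacking first. Note that this only recovers whole revisions; turning successive revisions into individual comments would still need a diff step on top.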
