Topic on Talk:Quarry

Jump to navigation Jump to search
RhinosF1 (talkcontribs)
Milimetric (WMF) (talkcontribs)

Just to clarify, do you mean:

Let the list of people on that page be L. You want to find this set:

{user | user <- L and user has not made any edits on this wiki in 5 years}

What about other wikis? Also, what do you mean by "remove them", like remove them from the list?

To find users that haven't made any edits in one wiki you could do:

 select user_id
   from user
  where user_id in (... your list ...) 
    and user_id not in
        (select distinct rev_user
           from revision
          where rev_timestamp > '20140301000000'
        )
;

But that subquery would be impossibly slow on enwiki, so this is a heavy query. Not sure how to optimize it off the top of my head. Might just be better to make a temporary table somewhere out of the users that did edit in the past 5 years and query that.

RhinosF1 (talkcontribs)

Editors in last 5 years would also work. That query failed (https://quarry.wmflabs.org/query/34487). Just enwiki is what we need. We need to end up with a list of users who have/have not edited in the past 5 years and are on that page.

Milimetric (WMF) (talkcontribs)

you substituted "https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Apple_Inc./Subscribe" where I have (... your list ...) in the query, but that's not how Quarry works. Quarry just executes SQL, it doesn't now how to read the wiki and parse the user ids out of it. You'd have to do that manually, or build a query to get a list of user_ids based on user names. You can also make a list of user names that looks like this:

('user name 1', 'user name 2', ...) and pass that in the query above as `user_name in ('user name 1', 'user name 2')`.

RhinosF1 (talkcontribs)

Sorry for the late reply, The query was done in the end via production for us