Admin tools development/Global CheckUser

From mediawiki.org

Numerous parties have requested the ability to run CheckUser across all wikis simultaneously, which would make it easier for Stewards to locate and block spam bots.

Requirements

  • Current local tool exists at m:Special:CheckUser; provide a parallel one at m:Special:GlobalCheckUser.
  • Accessed through new permissions (global-checkuser and global-checkuserlog) - to be granted to the stewards, ombudsmen and staff global groups.
  • Will re-use local tool's code as much as possible to avoid code repetition/divergence.
  • Global CUs are logged on meta at m:Special:Log/pancheckuser, visible only to those with the global-checkuserlog right.
  • Each local CU is also to be logged on meta at m:Special:Log/gblcheckuser, visible only to those with the global-checkuserlog right.

Workflow

  • On m:Special:GlobalCheckUser, user uses tool as on m:Special:CheckUser.
  • Tool uses (where possible) existing functions to run CU on each wiki.
  • Results are returned to a single UI on Meta, split by wiki.
  • ? Potentially, results to return asynchronously as they come in.
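
The workflow above can be sketched roughly as follows. This is only an illustration, not the planned implementation: `run_local_check`, `global_check`, the wiki list and the simulated delays are all hypothetical stand-ins, and a real tool would call the existing local CheckUser code rather than a stub. The sketch shows one check fanned out to every wiki, results collected per wiki, and an optional callback fired as each wiki's results arrive.

```python
import asyncio

# Hypothetical wiki list and per-wiki delays; a real tool would read the farm configuration.
SIMULATED_DELAY = {"enwiki": 0.03, "dewiktionary": 0.01, "frwikibooks": 0.02}
WIKIS = list(SIMULATED_DELAY)


async def run_local_check(wiki: str, target: str) -> tuple[str, list[str]]:
    """Stand-in for running the existing local CheckUser query on one wiki."""
    await asyncio.sleep(SIMULATED_DELAY[wiki])  # simulate differing query times
    # Placeholder data: pretend frwikibooks has no activity for the target.
    hits = [] if wiki == "frwikibooks" else [f"{target} seen on {wiki}"]
    return wiki, hits


async def global_check(target: str, wikis: list[str], on_result=None) -> dict[str, list[str]]:
    """Run the check on every wiki concurrently and return results split by wiki."""
    tasks = [asyncio.create_task(run_local_check(w, target)) for w in wikis]
    results: dict[str, list[str]] = {}
    for finished in asyncio.as_completed(tasks):
        wiki, hits = await finished
        results[wiki] = hits
        if on_result is not None:
            on_result(wiki, hits)  # lets a UI fill in each wiki's section as it arrives
    return results


if __name__ == "__main__":
    asyncio.run(global_check("192.0.2.1", WIKIS, on_result=lambda w, h: print(w, h)))
```

Using a callback keeps the "single UI, split by wiki" requirement separate from the question of whether results are shown progressively or only once all wikis have answered.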

Questions

Logging

  • Where are log entries placed?
    • Global CU log - Hard for local wikis to see the global action
    • Local CU log - Could potentially be a lot of log spam and cause many questions
      • could place in local log and identify them as global CU as an FYI
Initial conclusion: An integrated global log on meta for all CU actions (global and local) to be logged in one central place. Possibly also a copy of this locally for local CUs to see activity that went over their wiki.
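
A minimal sketch of what this initial conclusion could mean in practice, assuming a simple in-memory model. `LogEntry`, `CheckUserLogs` and the `copy_global_to_local` switch are hypothetical names invented for illustration; nothing here reflects the real CheckUser log schema. Every check goes to the central log on meta, local checks stay in their own wiki's log, and global checks are copied locally only if that option is enabled, labelled so local CUs can tell them apart.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogEntry:
    performer: str
    target: str
    wiki: str        # wiki whose data was checked
    is_global: bool  # True if the check came from the global tool

    def label(self) -> str:
        """Marker shown in a local log so global checks read as an FYI."""
        return "global CU" if self.is_global else "local CU"


@dataclass
class CheckUserLogs:
    copy_global_to_local: bool = True  # the "possibly also a copy locally" option
    central: list[LogEntry] = field(default_factory=list)
    local: dict[str, list[LogEntry]] = field(default_factory=dict)

    def record(self, entry: LogEntry) -> None:
        # Every action, global or local, lands in the one central log on meta.
        self.central.append(entry)
        # Local checks always appear locally; global ones only if copying is enabled.
        if not entry.is_global or self.copy_global_to_local:
            self.local.setdefault(entry.wiki, []).append(entry)


if __name__ == "__main__":
    logs = CheckUserLogs()
    logs.record(LogEntry("StewardA", "192.0.2.0/24", "hewiki", is_global=True))
    logs.record(LogEntry("LocalCU", "ExampleUser", "hewiki", is_global=False))
    for entry in logs.local["hewiki"]:
        print(entry.label(), entry.performer, entry.target)
```

Turning `copy_global_to_local` off gives the variant in which local CUs see nothing of global checks and only holders of the global log right see the full picture.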

Comments:

  • Is there scope for a positive hit to be logged on the local wiki? It would be preferable, if possible, for a local CU to be able to see search hits for local data checks, negative hits are not as necessary. — billinghurst sDrewth 23:42, 14 February 2013 (UTC)
    I don't think that's possible as you can't easily see which results are positive and which not. I guess it's better to have it logged in both the global CU log and locally with the comment "global CU" or something like that. So that everyone will see the CUs on their own project & having all global CU logs separated from the local ones. Trijnstel (talk) 10:11, 15 February 2013 (UTC)
    700+ logged entries per search, that is going to get awfully ugly, awfully quickly. Some of the quiet wikis will get more traffic from CU checks than all edit counts combined, hence my thought bubble about successful. If it is too hard, then we leave it. — billinghurst sDrewth 15:00, 15 February 2013 (UTC)
    I worry that this will make local CU logs very messy and confused. Possibly we could have a local view of the global CU log that shows each GCU action in the local context, so that local CUs can see if a range was otherwise searched and follow-up with the Steward directly? Jdforrester (WMF) (talk) 19:17, 15 February 2013 (UTC)
  • Could you please explain the sentence "Logged to a single global CU log, which will include all CUs, local and global." - will the people with access to global checkuser (stewards I assume) be able to see the CU logs of all wikis including the global ones in one (this is what I would support as right now we're not able to check the local activity while that's part of our job) OR will all checkusers of all wikis be able to see the global CU log, which basically only contains the CU actions of stewards via global checkuser? Trijnstel (talk) 21:46, 14 February 2013 (UTC)
    The former; otherwise this would be releasing information to local CUs. Jdforrester (WMF) (talk) 21:49, 14 February 2013 (UTC)
    To clarify, if the tool allows cross-wiki non-pan-wiki searches (e.g., search all English language wikis, or all Wikipedias, or hewiki/hrwiktionary/huwikisource or whatever), then giving all CUs everywhere global-checkuserlog would give some local CUs information to which they do not have a need to know. It would get hugely messy to create a global group of log permissions for each wiki and grouping with inheritance. The alternative if we feel we must show local CUs these searches is to have a local copy of the relevant parts of the global CU log, which would also be messy. Jdforrester (WMF) (talk) 19:17, 15 February 2013 (UTC)
    Make it whatever is reasonable and practicable. If there is no easy means to show local results, that to me then becomes another factor about wikiset approach to exclude (see argument in #Scope). If they have CUs for the vast majority of issues, they will know whether to run checks or not, and the bigger wikis have enough people to coordinate it. 12:00, 16 February 2013 (UTC)

Interface

  • What does the global search result set look like? (Do we group by wiki, by action type, by IP, or by day?)
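
To make the grouping options concrete, here is a small sketch. The `Hit` record and `group_hits` helper are hypothetical and only assume that each result row carries a wiki, an action type, an IP and a day; the actual result schema is not specified on this page. The same flat list of rows can then be regrouped on any of the four keys without re-running the check.

```python
from collections import defaultdict
from dataclasses import dataclass

GROUP_KEYS = ("wiki", "action", "ip", "day")


@dataclass(frozen=True)
class Hit:
    wiki: str
    action: str  # e.g. "edit" or "login"; illustrative values only
    ip: str
    day: str     # ISO date, e.g. "2013-02-14"


def group_hits(hits: list[Hit], key: str) -> dict[str, list[Hit]]:
    """Group result rows by one of the candidate keys, preserving row order."""
    if key not in GROUP_KEYS:
        raise ValueError(f"unknown grouping key: {key!r}")
    grouped: dict[str, list[Hit]] = defaultdict(list)
    for hit in hits:
        grouped[getattr(hit, key)].append(hit)
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        Hit("enwiki", "edit", "192.0.2.1", "2013-02-14"),
        Hit("hewiki", "edit", "192.0.2.1", "2013-02-15"),
        Hit("enwiki", "login", "192.0.2.7", "2013-02-15"),
    ]
    for wiki, rows in group_hits(sample, "wiki").items():
        print(wiki, len(rows))
```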

Scalability

  • Are smaller limits needed given the potential to slow all wikis down at once?
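
One possible form of a "smaller limit" is a tighter cap on how wide an IP range may be for a global check than for a local one. The sketch below uses Python's standard `ipaddress` module; the prefix values in `LOCAL_MIN_PREFIX` and `GLOBAL_MIN_PREFIX` are invented for illustration and are not the tool's actual or proposed limits.

```python
import ipaddress

# Hypothetical minimum prefix lengths (a larger number means a narrower range).
LOCAL_MIN_PREFIX = {4: 16, 6: 32}
GLOBAL_MIN_PREFIX = {4: 20, 6: 48}


def range_allowed(cidr: str, is_global: bool) -> bool:
    """Return True if the range is narrow enough for the kind of check requested."""
    network = ipaddress.ip_network(cidr, strict=False)  # accepts a bare IP or a CIDR range
    limits = GLOBAL_MIN_PREFIX if is_global else LOCAL_MIN_PREFIX
    return network.prefixlen >= limits[network.version]


if __name__ == "__main__":
    for cidr in ("10.0.0.0/16", "10.0.0.0/24", "2001:db8::/32"):
        print(cidr, "local:", range_allowed(cidr, False), "global:", range_allowed(cidr, True))
```

Under these made-up limits a /16 would still be accepted for a local check but rejected for a global one, which is the kind of asymmetry the question is asking about.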

Scope

  • As stewards are only allowed to CU on projects without CUs (with exceptions of course), will it be possible to use this tool only on those wikis? Trijnstel (talk) 21:46, 14 February 2013 (UTC)
    • We could build the tool to respect a wikiset, but this would defeat a lot of the point of a pan-wiki tool. Jdforrester (WMF) (talk) 21:49, 14 February 2013 (UTC)
      • I agree, the policy empowers stewards to perform checks when needed for crosswiki issues, and the users of this tool would be Stewards and WMF staff, and obviously only for crosswiki issues, else there would be no point in using it. Should it be felt that the policy does not allow the usage of this tool, I suggest that the policy be updated. New tools being deployed might at times require updating of policies of course, but the fact that policies are not yet equipped to deal with new tools only means that they should be updated, not that tool development should be forever halted until policies allow it. In any case, I believe this tool is indeed allowed by the checkuser policy, as it would obviously only be used for crosswiki issues. Snowolf How can I help? 21:57, 14 February 2013 (UTC)
      • Yep, agreed with Snowolf. As a follow-up to my question: what if we check a /16 range and we find out that it has LOTS of traffic on multiple wikis, so much that the page doesn't load -- how can we prevent that? (besides checking a smaller range) I was thinking about a collapse tool so that you can 'open' up the different wikis. Mathonius suggested to use checkboxes, 1 for all projects, 1 for only wiktionaries/wikibooks/etc and 1 for only the projects without CUs (for example)? Trijnstel (talk) 22:06, 14 February 2013 (UTC)
        • The idea we were looking at for this (though as the Product Manager I shouldn't freelance too much on technical things!) is that each wiki's data would be pulled into a <div /> for each wiki, blanking out if there was no activity there. Obviously there will be load issues with using this tool, so we'd want to build it in a way that makes sense for you and avoids you wanting to re-load the same page. This could also potentially be progressive through different wikisets if you think that would make sense. Jdforrester (WMF) (talk) 22:28, 14 February 2013 (UTC)
    The primary function of the global tool requested for stewards is to deal with spambots and xwiki vandals, and this is primarily where there are no CUs. I am concerned with collateral damage aspects of a pan tool where a user is blacklisted on one wiki, however, can be of good reputation on another, or even could be operating under a local account with good edits; the purpose of stewards is not to be seeing data that comes from local only situations. So is it possible that we look to a global CU tool that respects wikisets, then implement two wikisets: one restrictive set for general use that respects local wikis with their CUs (this would be the standard setting for stewards), and an "all public wikis" wikiset for which the right can be applied on a short term "as needs" basis (conditional grant)? — billinghurst sDrewth 00:00, 15 February 2013 (UTC)
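
The two-wikiset idea in the last comment could look something like the sketch below. Everything in it is an assumption made for illustration: the wiki names are borrowed from examples elsewhere on this page, which wikis have local CUs is made up, and `allowed_wikis` with its expiry-based grant table is a hypothetical helper, not an existing wikiset API.

```python
from datetime import datetime, timedelta, timezone

# Illustrative data only; the set of wikis with local CUs here is invented.
ALL_PUBLIC_WIKIS = frozenset({"enwiki", "hewiki", "hrwiktionary", "huwikisource"})
WIKIS_WITH_LOCAL_CUS = frozenset({"enwiki"})

# Standard (restrictive) wikiset: only wikis that have no local CheckUsers.
RESTRICTED_WIKISET = ALL_PUBLIC_WIKIS - WIKIS_WITH_LOCAL_CUS


def allowed_wikis(user: str, grants: dict[str, datetime], now: datetime) -> frozenset[str]:
    """Return the wikiset a user may search; `grants` maps user to grant expiry."""
    expiry = grants.get(user)
    if expiry is not None and now < expiry:
        return ALL_PUBLIC_WIKIS   # short-term "as needs" conditional grant is active
    return RESTRICTED_WIKISET     # default setting for stewards


if __name__ == "__main__":
    now = datetime(2013, 2, 15, tzinfo=timezone.utc)
    grants = {"StewardA": now + timedelta(days=1)}
    print(sorted(allowed_wikis("StewardA", grants, now)))
    print(sorted(allowed_wikis("StewardB", grants, now)))
```

Because the grant carries an expiry, access to the wider set lapses without anyone having to remember to remove it.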