Talk:Anti-Harassment Tools/CheckUser Improvements

From mediawiki.org
Previous discussions about feature design

What do you think of the preliminary check idea?

  • Seems neat but not too different from the existing Special:CentralAuth. When I want to look up multiple accounts I'm usually more interested in seeing if there's any overlap in technical data. I also generally only care about enwiki, since that's the only place I can take any action.

    What you have here seems a better fit for a public-facing special page, maybe Special:CentralAuthMultiple or something. It would mostly be of use to stewards, not local CUs. MusikAnimal talk 19:47, 4 October 2019 (UTC)

  • I agree with what MusikAnimal said. -- zzuuzz (talk) 05:37, 8 October 2019 (UTC)
  • @MusikAnimal and Zzuuzz: The idea here was to provide the information Special:CentralAuth provides, but in one view for all of the linked accounts, instead of the CheckUser having to do the lookup for each account separately. We can definitely restrict the information to a single wiki (the wiki where the check is being run) and offer an option to view information for all wikis (for global cases). With that in mind, do you think this could be useful? -- NKohli (WMF) (talk) 12:23, 8 October 2019 (UTC)
  • @Ajraddatz: I don't have access to that special page but my understanding is that that interface, though it provides a similar level of information, is for when you need to lock multiple accounts. We are trying to surface all that information in CheckUser so basic information about the accounts can be viewed before deciding whether a check is warranted. Do you think it would help to tie in Special:MultiLock with CheckUser? -- NKohli (WMF) (talk) 19:00, 8 October 2019 (UTC)Reply
  • Comparing edit counts and registration dates could be helpful, but in my workflow I usually already know what accounts I want to check (and that there's strong enough evidence to warrant those checks). The hard part is identifying any crossover in technical data. So in my case I'd probably go straight to the second screen that you have listed. MusikAnimal talk 21:10, 8 October 2019 (UTC)Reply
  • It could be, as I've heard stories of users accidentally being checked (and even though I rarely used CU when I had access, I managed to mistype an IP range). But I would hesitate to make it mandatory for everyone. --Rschen7754 04:42, 9 October 2019 (UTC)Reply
  • The preliminary check page should be part of Special:CentralAuth (i.e. being allowed to supply multiple users) so that everyone can benefit from it. There is no sensitive information on the proposed mockup that justifies restricting it to checkusers. MER-C (talk) 13:00, 12 October 2019 (UTC)Reply
  • I think this is a useful tool. But as in the other comments, it should be a public tool. That public tool should have a link to perform a check on the accounts, only visible to checkusers (just as checkusers see a link to the checkuser tool on the contributions page). On a specific wiki, the tool should only show that specific wiki, with a link to meta to do a crosswiki check. With that, the local tool should link to the local contributions, while the crosswiki one should have a link to local contributions per wiki and a link to guc. Regarding guc, I think that should then not be an 'external' tool anymore, but integrated as a special page on meta. Akoopal (talk) 16:13, 13 October 2019 (UTC)
  • @MER-C and Akoopal: I agree that it is all public information and could be made available to all users. Our foremost priority is to make improvements to CheckUser. Testing this feature with a small group of users will allow us to gather quick feedback and iterate faster. Regarding making guc a special page, that is definitely highly desirable but it would be a very big project in itself. We want to focus our efforts towards reducing the biggest pain points of using CheckUser first. -- NKohli (WMF) (talk) 05:44, 17 October 2019 (UTC)Reply
  • @NKohli (WMF): Agreed to start testing small, but for the sake of separation, make it two separate pages I would say, this tool and the new CU tool. What about my comment having the crosswiki one on meta, and on the local wiki just offer it for that wiki, and link there to the normal local contributions instead? Akoopal (talk) 21:28, 17 October 2019 (UTC)Reply

How else could the CheckUser information be presented?

  • @Huji: This is definitely something we have been thinking about. We would need to coordinate with the Operations and Security teams to fulfill that request. While we do that groundwork, we hope to meanwhile improve existing CheckUser data presentation. -- NKohli (WMF) (talk) 12:29, 8 October 2019 (UTC)Reply
  • ...

What do you think of the information being displayed in the actual CheckUser step?

  • I assume this is a proposed makeover for the "Get IP addresses" view. This seems quite good; it would quickly tell me whether or not the accounts might be related, and whether I need to check individual IPs or ranges. I don't need a separate tab for each account I look up. Good stuff!

    For "Anything else?", we should include all the same information we have currently, such as block/lock status and user groups. Preferably "Activity" would show a full date range of editing activity, as it does currently. While I know this isn't a proper wireframe, I don't think a tabular format would work well given the amount of information and links we'd need to display. I personally find the status-quo to be readable enough. Note also we customize the links with w:MediaWiki:Checkuser-toollinks.

    I don't think this UI could replace "get users" (and obviously not "get edits"). We need to see individual IP users in addition to accounts, and the option to see everyone's edits. This helps to further conclude if there is a connection and whether hard blocks are appropriate. There are use cases to "get edits" by a single account or IP, too. MusikAnimal talk 23:00, 8 October 2019 (UTC)Reply

  • Make accessible full HTTP headers. The WMF is automatically supplied that information, and it may be useful in addition to UA/location. MER-C (talk) 13:02, 12 October 2019 (UTC)Reply

How can we improve the CheckUser logs to be more helpful with the above proposed improvements?

What are we missing?

Note: We are not adding any new information to the tool, as a first step. We are looking to improve access to the information already being presented in the tool. We want to be able to deliver value quickly and iterate based on the feedback we receive.

  • Making phab:T146837 a reality would be a dream come true. Even just doing exact matches (no wildcards) would be of great help. MusikAnimal talk 19:50, 4 October 2019 (UTC)Reply
  • As far as I can see, this is just replacing the initial steps - the 'preliminary check' is similar to the CentralAuth page, which is not usually important, and the 'Checkuser' section is just replacing the 'Get IP addresses' stage in the current tool. The latter is of course a very important starting point for looking at accounts, and I'm sure it can be improved, but is only ever part of the story. The all important bit is the 'Get edits' part, or 'ipcheck' as it's (perhaps) called here, which I don't see discussed at all. Maybe no changes, or something else is planned for it? That would be OK, but it's not clear from this page. The other important bit is the 'Get users' part of the current tool, which is indispensable for blocking large numbers of accounts - again not mentioned. The page does rather ominously mention 'showing all the necessary data in one interface', which I find difficult to imagine. So perhaps we are missing how this proposal fits into the rest of the picture? -- zzuuzz (talk) 05:37, 8 October 2019 (UTC)Reply
  • I see the ipcheck refers to the ipcheck tool, which is sometimes useful but just one of many. What the 'Get edits' part currently does is add the context for what is being done at the time, and the precise timeline of events. I'm also having difficulty imagining what the proposed data will look like for a prolific multi-account socker using multiple addresses in a range at multiple times. These can be picked up relatively easily with the current 'get edits' tool. -- zzuuzz (talk) 13:43, 8 October 2019 (UTC)Reply
  • @Zzuuzz: Thanks for the feedback. I'm curious to learn more about your workflow with CheckUser. In the several CheckUser interviews we did, we rarely saw Get edits being used. Get IP addresses and Get users were primarily used. When is it that you would use Get edits and what does it offer that the other views don't? The current view described on the page would list all the IPs used by all the accounts under question (and also identify other accounts using the same IPs). Could you give me a made up example of a case where you think the proposed views won't work? Thank you so much! :) -- NKohli (WMF) (talk) 16:14, 8 October 2019 (UTC)Reply
  • OK, I can accept I might be a bit of an outlier. I hardly ever use 'Get users', except to block many accounts after having got the edits. I wonder if one difference might be that I am rarely looking to confirm whether one user is another, instead I am mainly looking for other sockpuppet accounts and especially the collateral for IP blocks. In the language of this page I will more often do most profiling after the checking. I'm also looking a lot at ranges with multiple people. You can't say that one user (that you've come across) is another just because they use the same IP and user agent - you need to look at what they're doing and the timeline of events (things like IP switching and browser changes are only really clear in this view). Additionally you see all the data in one view (filter hits, account creation, deleted edits, password resets, ...). So I think some of this might be lost. I also have to repeat though, being able to block 50 accounts from the CU results with a couple of clicks is a Great Thing. -- zzuuzz (talk) 20:28, 8 October 2019 (UTC)Reply
  • I use "get edits" whenever I need a timeline. For example:
  • 2019-10-09 00:10 GermanyFan edits History of Germany
  • 2019-10-09 00:12 StarWarsFan01 edits Star Wars
  • 2019-10-09 00:13 StarWarsFan02 edits Star Wars
  • 2019-10-09 00:14 StarWarsFan03 edits Star Wars
  • 2019-10-09 00:14 GermanyFan edits History of Germany
  • 2019-10-09 00:15 StarWarsFan04 trips an edit filter on Star Wars
  • Everyone is on the same IP address and is indistinguishable from each other. However, GermanyFan has no interest in Star Wars whatsoever, and the various Star Wars fans don't care about the history of Germany. "Get edits" is useful in highlighting the Star Wars fans as potential socks and differentiating GermanyFan from them. Otherwise, I have to spend a lot of time in the Editor Interaction Analyser. Like Zzuuzz, I like being able to see a timeline of everything that has happened on an IP address or IP range. NinjaRobotPirate (talk) 10:57, 9 October 2019 (UTC)Reply
  • The workflow for another type of typical check looks something like this:
  • Vandal A is an obvious LTA sockpuppet - instant block.
  • We probably want to block their IP address because we've blocked a lot of their accounts recently (Vandal B), and also block (and revert!) the other 30 accounts they've probably created that we don't know about. Get IP addresses for User A - we find they are editing throughout a highly dynamic /39 IPv6 range.
  • Get edits for the range (or users, it doesn't really matter here). Examine the contributions and timelines for the range, to identify the other accounts.
  • This is a type of check that most CUs will have done and know how to do. But it's at this point I can't imagine what the proposed check or results will look like:
CheckUser
Username | Activity         | IP address                        | User agent            | Anything else?
Vandal A | August 12, 11:00 | fe80:0:0:1::1 (ipcheck) - 1 edit  | Chrome 65, Windows 10 |
Vandal B | August 12, 11:10 | fe80:0:0:c::c (ipcheck) - 1 edit  | Chrome 65, Windows 10 |
Linked accounts (below accounts were found associated with the IP addresses found above)
None?
  • @NinjaRobotPirate and Zzuuzz: This is extremely helpful, thank you! I understand the use case for Get edits much better now. Would it be helpful if the tool can generate the timeline for edits (similar to what NinjaRobotPirate made but with more info) based on the usernames and IPs that are input on the first screen? I understand right now the tool can only do it for one user/IP at a time. If that will be helpful, what kind of information would you expect from such a timeline? -- NKohli (WMF) (talk) 14:01, 11 October 2019 (UTC)Reply
  • I didn't quite understand the usefulness of "get edits" at first, either. The tool provides much more information than what I posted, of course, but I was just trying to give an uncomplicated example (I wouldn't want it streamlined down to what I posted). If by "first screen" you mean the table you labeled "preliminary check", I don't really know. It sounds like it would be useful, but it's difficult for me to visualize exactly what the results would be. I'm used to the current way of doing things. I hated the CheckUser UI at first, but it's grown familiar. NinjaRobotPirate (talk) 15:43, 11 October 2019 (UTC)Reply
  • @NinjaRobotPirate: No, the screen before that (don't have a mock yet), in step 1 where we take an input of all the usernames/IPs you want to look at. We can come up with a Get edits like interface for the edits by those users/IPs in the tool which can generate a timeline for you, similar to the one you posted. You wouldn't have to look up the Get edits for each user like you have to right now. What are the key things that you are looking for, when you use Get edits? I imagine if users are checking the same pages, that is something to make note of. What else do you do that can possibly be flagged by the tool automatically? (also @MusikAnimal: on this thread) -- NKohli (WMF) (talk) 17:04, 11 October 2019 (UTC)Reply
  • "Get edits" can be used for both usernames and IP addresses/ranges. On usernames, it's occasionally useful to see all of someone's edits annotated with their system configuration. You can see this information elsewhere but not at a glance. "Get edits" can also be used on IP ranges to see all edits made by all accounts on that IP range. If you know someone uses a really generic system configuration, "get users" might not be as useful as "get edits". "Get users" can tell you who all the Firefox users are on a Verizon IP range, but sometimes you don't really care about that. You're more concerned with finding Verizon customers who edit Star Wars articles regardless of browser. "Get edits" will tell you that. It's also useful for finding edit summaries. For example, it's easy to find Verizon customers who habitually call people fascists in their edit summary in "get edits". "Get users" is great for finding sock puppets based on technical matches, and sometimes I use it to find behavioral matches, too. The problem is that I tend to open up a lot of tabs if I use it this way. For example, I'll sometimes open a new tab for each suspicious-looking editor and skim over their contributions. If I already know the sockmaster and I can skim over their edits fast enough, this can be substantially faster than "get edits". "Get edits" returns a lot of useless information, especially on busy IP ranges. If I want to see Firefox users on Windows 7 who edit Star Wars from Verizon IP ranges, "get edits" will list them. However, it also lists everybody else, and there's no way to filter out the millions of Chrome users, Windows 10 users, or MacOS users. At least with "get users" I don't have megabytes of useless information cluttering my screen, even if it doesn't tell me anything about who's editing Star Wars. I'm not really sure if any of this answers your questions... I can try to think more about this later. NinjaRobotPirate (talk) 02:27, 12 October 2019 (UTC)Reply
  • I've always wanted some indication of the language, i.e., the Accept-Language or perhaps Accept headers. And I also think not using the enwikiUserName (or equivalent) cookie is a totally missed opportunity. -- zzuuzz (talk) 05:37, 8 October 2019 (UTC)
  • I'm not sure I'm on the same page as my colleagues. I find I often do things differently. That said, I have some features I'd like to see in the current interface. They mainly consist of adding the ability to sort and to filter. I'll do the Get IP addresses first:
    • Sort by IP rather than chronologically, although it's fine to sort chronologically within IP.
    • Give me IP ranges without my having to copy them into the text box at the bottom.
  • Get users:
    • Filter out blocked users.
    • Filter out unblocked users.
    • Filter out unregistered users.
    • Filter out no-edit users.
    • Filter out selected users.
    • Give me the ability to get date ranges for each user/IP, as opposed to just the overall date range for all their edits (this is sort of a blend of Get edits and Get users but without quite the busyness of Get edits).
    • Give me the ability to get log entries for the IPs (again something that is shown in Get edits but not in Get users).
    • Give me the ability to see if two or more registered users are not only using the same range but individual IPs.
  • I may add more to this list. I'm going by memory, and checking may bring more to mind.--Bbb23 (talk) 00:38, 9 October 2019 (UTC)Reply
  • ...
  • Long term, a graph representation of the relationships between IPs and accounts may be helpful, especially for complex investigations. MER-C (talk) 13:05, 12 October 2019 (UTC)Reply
  • Finally took the time to look; in general this looks OK. One thing I am wondering about: in the example I see the 'extra users' sessions for accounts found on the IPs. Will that be automatic, or would that be a next step? There are cases where you check two accounts by just checking their IPs, and if they don't match (not even as a range match), you neither need nor want to look up the IPs. What I would really like is for the 'get ip' case to already show the user agents, something you now need to check the IPs for. Furthermore, a handy way to check the range instead from the 'check ip' button is a must, I think. When dealing with IPv6 you should always check the /64 (although there are use cases for checking the IP itself). What might be nice is a dynamic table that expands, for the complicated cases. I have had cases where you find new users on the IPs, checking those users gives new IPs again, and on those IPs you again find other users. So a button 'check this ip and add the users to the table' would be nice. Again, it should not be automatic, as there might be users on dynamic ranges that you don't want to check, based on user agent. Akoopal (talk) 15:58, 13 October 2019 (UTC)
  • @Akoopal: Good point about making that check for other linked editors optional. We are still thinking about the UI for accepting an IP range in addition to an IP. Would you say the range should be automatically generated from the IP address? I will be sharing some design mockups early next week and would be really interested to hear your thoughts on that. -- NKohli (WMF) (talk) 05:44, 17 October 2019 (UTC)Reply
  • @NKohli (WMF): For the range checks, I think there can be different approaches for IPv4 and IPv6. For IPv6, a dropdown where you can select the /128, /64, /56 or /48, plus a field where you can specify a bigger range, will probably do. For this one, the /64 might even be the default. Of course it should zero the last 64 bits of the IP address to calculate the proper range. For IPv4, it will be much more difficult as there are so many different sizes in use. At first, just a field to fill in, defaulting to /32 (1 IP). Looking up the range with the IP registries might be a nice addition: offer that via a dropdown, with /32 as the default, the discovered range as the second option and the free-form field as the last one, or something like that. Does that make sense a bit? Akoopal (talk) 21:19, 17 October 2019 (UTC)
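
A minimal sketch of the range calculation described in the comment above, assuming Python's standard `ipaddress` module. The function name `range_for` and the defaults (/64 for IPv6, /32 for IPv4) are illustrative assumptions taken from the suggestion in this thread, not part of the CheckUser extension.

```python
import ipaddress
from typing import Optional


def range_for(ip: str, prefix: Optional[int] = None) -> str:
    """Return the CIDR range that contains `ip`, with the host bits zeroed.

    Defaults follow the suggestion in the thread (an assumption, not an
    existing CheckUser behaviour): /64 for IPv6, /32 (a single IP) for IPv4.
    """
    addr = ipaddress.ip_address(ip)
    if prefix is None:
        prefix = 64 if addr.version == 6 else 32
    # strict=False lets us pass a host address; the host bits are masked off.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network)


# Documentation-only example addresses (RFC 3849 / RFC 5737).
print(range_for("2001:db8:1:2:aaaa:bbbb:cccc:dddd"))      # 2001:db8:1:2::/64
print(range_for("2001:db8:1:2:aaaa:bbbb:cccc:dddd", 48))  # 2001:db8:1::/48
print(range_for("192.0.2.17"))                            # 192.0.2.17/32
```

A dropdown of prefix lengths would simply call the same function with a different `prefix`; a registry lookup for IPv4 is out of scope for this sketch.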

Pagination for busy range results

With the current checkuser tool, if a check is tried on a busy range it may exceed a maximum number for results and then only gives a list of IP addresses with the number of edits per address which is not very usable (failed check). It does not give proper results. A desired feature would be to paginate the results (Page 1, Page 2, etc. as necessary) for busy ranges.

One current way around the max number exceeded is to select a lower time frame from a pull down box but this shortens the 90 day period to either one month, two weeks or one week. That loss of data is sometimes undesirable depending on the case that you are working on. That is already built in as a feature of the current tool.

Another workaround to retain the full 90 day information involves splitting the network and running separate checks. For example (using Class B reserved range), if a /16 range (172.16.0.0/16) fails because it is too busy then you could split the network in half and run a check on 172.16.0.0/17 and then run a check on 172.16.128.0/17. If those fail then you can run four separate checks on 172.16.0.0/18, 172.16.64.0/18, 172.16.128.0/18 and 172.16.192.0/18. If that fails then you can run eight separate checks and so on. Some checkusers acquired their cu bit by being elected to Arbcom and may not have a strong networking background, so they aren't likely to try this second workaround. It would make it easier for all checkusers to have pagination on busy range results, which would forgo the need for workarounds.
⋙–Berean–Hunter—► ((⊕)) 21:41, 8 October 2019 (UTC)
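
The splitting workaround above can be sketched with Python's standard `ipaddress` module. The helper name `split_range` is hypothetical; it only produces the list of smaller ranges to check and does not talk to CheckUser.

```python
import ipaddress
from typing import List


def split_range(cidr: str, levels: int = 1) -> List[str]:
    """Split a range into 2**levels equal sub-ranges.

    levels=1 halves the range, levels=2 quarters it, and so on.
    """
    network = ipaddress.ip_network(cidr)
    return [str(subnet) for subnet in network.subnets(prefixlen_diff=levels)]


print(split_range("172.16.0.0/16"))
# ['172.16.0.0/17', '172.16.128.0/17']
print(split_range("172.16.0.0/16", levels=2))
# ['172.16.0.0/18', '172.16.64.0/18', '172.16.128.0/18', '172.16.192.0/18']
```

Each returned range would then be checked separately over the full 90 days, which is exactly the manual work that pagination would make unnecessary.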

SockFilter?

Would it be possible to put alerts on certain IP-ranges / UA data so certain editors get flagged? (I know, likely these are flags that only checkusers will see, maybe only to admins?)

The current situation on-wiki is that we see an editor with a recognised pattern (possibly through AbuseFilter), and that editor gets reported to CheckUsers to see the data behind the editor. That is often an always-too-late action: prolific sockers are already on another account, and you keep hunting. It must be possible to flag certain combinations so that if a sock performs an action on wiki they get matched against the pattern. Setting the filter should be at CheckUser discretion and only used on serial violators (and not to pre-emptively 'catch' editors). --Dirk Beetstra T C (en: U, T) 13:11, 9 October 2019 (UTC)
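
To make the idea concrete, here is a rough sketch of what matching an action against such flags could look like. Everything in it is hypothetical: `WatchRule`, `matching_rules` and the rule fields are invented for illustration, and no such feature exists in CheckUser or AbuseFilter as far as this thread states.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WatchRule:
    """A hypothetical CheckUser-defined flag: an IP range plus an optional UA fragment."""
    label: str
    cidr: str
    ua_contains: Optional[str] = None


def matching_rules(ip: str, user_agent: str, rules: List[WatchRule]) -> List[str]:
    """Return the labels of all rules that the given IP / user agent pair matches."""
    addr = ipaddress.ip_address(ip)
    hits = []
    for rule in rules:
        network = ipaddress.ip_network(rule.cidr)
        # Skip rules for the other IP version, then test range membership.
        if addr.version != network.version or addr not in network:
            continue
        # If the rule names a UA fragment, require a case-insensitive match.
        if rule.ua_contains and rule.ua_contains.lower() not in user_agent.lower():
            continue
        hits.append(rule.label)
    return hits


# Made-up rules using documentation-only address ranges.
RULES = [
    WatchRule("case-1", "198.51.100.0/24", "Chrome/65"),
    WatchRule("case-2", "2001:db8::/32"),
]

print(matching_rules("198.51.100.7", "Mozilla/5.0 (Windows NT 10.0) Chrome/65.0", RULES))
# ['case-1']
```

A match would only raise a flag for CheckUsers to review; as the comment above says, it should not be used to pre-emptively catch editors.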

Get IP addresses vs Get Users vs Get Edits

@MusikAnimal, Zzuuzz, and NinjaRobotPirate: and others - I tried to list the various features and use cases for the three Get options in CheckUser. Does the below seem accurate? What other use cases do you have for these views? I'd appreciate your help in teasing those out. Thank you. -- NKohli (WMF) (talk) 05:56, 12 October 2019 (UTC)

This looks pretty good to me. It could probably be doubled in size after thinking for a week about every use case possible, but it's a good overview. NinjaRobotPirate (talk) 18:54, 13 October 2019 (UTC)Reply
Yes, this sums it up nicely. A few minor corrections: For "Get IP addresses", a date range should be shown for activity (from available data), not just the latest action. I also think all views show the block status, current or previous. Next, the number shown next to usernames/IPs I believe is a count of logged actions as well as edits (or at least it should be, if it isn't). Finally, "Get edits" can be used on accounts too. I think you've got all the primary use cases covered. Best, MusikAnimal talk 21:39, 14 October 2019 (UTC)Reply

Get IP addresses

Features
  • Shows the IP addresses associated with a user account.
  • Shows the timestamp for the latest activity on each IP address
  • Shows the number of edits made by the IP along with a number for total edits made by accounts operating on that IP address, indicated by something like: [2] (~5 by all users)
  • Provides links for tools to run checks on an IP
Use cases
  • Used to get a quick overview of a user account activity
  • To immediately see if this might be a big sock farm if the number of edits from the same IP is high
  • Run various checks for the listed IPs with the help of the tools linked under the IP address to find out the location information, if it is behind a Tor node or VPN etc.
  • ...
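
As a rough illustration of the summary this view produces, the sketch below builds the per-IP counts in the "[2] (~5 by all users)" style from a list of log rows. The row layout, the sample data and the function name `ip_summary` are assumptions made for the example; the date range follows MusikAnimal's correction above rather than showing only the latest action.

```python
from collections import Counter
from typing import List, Tuple

# (actor, ip, timestamp) rows; made-up sample data with documentation-only IPs.
LOG = [
    ("Vandal A", "198.51.100.7", "2019-08-12 11:00"),
    ("Vandal A", "198.51.100.7", "2019-08-12 11:05"),
    ("Vandal B", "198.51.100.7", "2019-08-12 11:10"),
    ("Vandal B", "198.51.100.7", "2019-08-12 11:12"),
    ("Someone", "198.51.100.7", "2019-08-11 09:00"),
    ("Vandal A", "203.0.113.9", "2019-08-10 08:00"),
]


def ip_summary(user: str, log: List[Tuple[str, str, str]]) -> List[str]:
    """List the IPs used by `user`, most recently active first."""
    total_per_ip = Counter(ip for _, ip, _ in log)  # actions by all users per IP
    own = {}
    for actor, ip, ts in log:
        if actor != user:
            continue
        entry = own.setdefault(ip, {"count": 0, "first": ts, "last": ts})
        entry["count"] += 1
        # Timestamps are 'YYYY-MM-DD HH:MM' strings, so they sort lexicographically.
        entry["first"] = min(entry["first"], ts)
        entry["last"] = max(entry["last"], ts)
    ordered = sorted(own.items(), key=lambda kv: kv[1]["last"], reverse=True)
    return [
        f"{ip} ({e['first']} to {e['last']}) [{e['count']}] (~{total_per_ip[ip]} by all users)"
        for ip, e in ordered
    ]


for line in ip_summary("Vandal A", LOG):
    print(line)
# 198.51.100.7 (2019-08-12 11:00 to 2019-08-12 11:05) [2] (~5 by all users)
# 203.0.113.9 (2019-08-10 08:00 to 2019-08-10 08:00) [1] (~1 by all users)
```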

Get Users

Features
  • Shows the IP editors and user accounts editing from a given IP or IP range.
  • For each record -
    • Shows activity time period (start - end) and number of edits (denoted by something like [20])
    • Offers an option to run a WHOIS check on an IP
    • Offers an option to look at talk/contribs for an account
    • Shows IPs and user agents associated with the editor
  • Allows one to select IP editors and user accounts from the list and to block them (along with some block options)
Use cases
  • Used to find sleeper accounts/other socks that are created from the same IP or IP range
  • ...
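
A minimal sketch of the grouping this view performs, assuming log rows of the form (actor, IP, user agent, timestamp). The data layout, the sample rows and the name `users_in_range` are illustrative assumptions, not the extension's actual schema or code.

```python
import ipaddress
from typing import Dict, List, Tuple

# (actor, ip, user_agent, timestamp); made-up rows with documentation-only IPs.
LOG = [
    ("Vandal A", "198.51.100.7", "Chrome 65, Windows 10", "2019-08-12 11:00"),
    ("Vandal B", "198.51.100.12", "Chrome 65, Windows 10", "2019-08-12 11:10"),
    ("Vandal B", "198.51.100.12", "Chrome 65, Windows 10", "2019-08-12 11:20"),
    ("198.51.100.40", "198.51.100.40", "Firefox 69, Linux", "2019-08-11 09:00"),
    ("Bystander", "203.0.113.9", "Safari 13, macOS", "2019-08-12 10:00"),
]


def users_in_range(cidr: str, log: List[Tuple[str, str, str, str]]) -> Dict[str, dict]:
    """Group all accounts and IP editors seen inside `cidr`."""
    network = ipaddress.ip_network(cidr, strict=False)
    users: Dict[str, dict] = {}
    for actor, ip, ua, ts in log:
        addr = ipaddress.ip_address(ip)
        if addr.version != network.version or addr not in network:
            continue
        record = users.setdefault(
            actor, {"ips": set(), "uas": set(), "first": ts, "last": ts, "count": 0}
        )
        record["ips"].add(ip)
        record["uas"].add(ua)
        record["first"] = min(record["first"], ts)  # start of activity period
        record["last"] = max(record["last"], ts)    # end of activity period
        record["count"] += 1
    return users


for name, record in users_in_range("198.51.100.0/24", LOG).items():
    print(name, record["first"], "-", record["last"], f"[{record['count']}]")
```

The filters requested earlier on this page (blocked, unregistered, no-edit users) would be additional predicates applied to the returned records.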

Get Edits

Features
  • For a given IP or IP range, it displays a timeline of edits and log actions by the user accounts or IP editors operating from that IP or IP range.
  • Dates are in descending order (latest first).
  • Records are grouped by date. Each record displays:
    • If it is a log record.
    • Links to diff and history if not a log action/record.
    • Timestamp of activity
    • Page (if edit action)
    • Editor
    • Info about editor privileges
    • Info about whether editor was previously blocked
    • Edit summary of the edit
    • IP address
    • User-agent string
Use cases
  • Used to identify if two users are the same based on their activity pages, IPs, UAs
  • Sometimes used to figure out if a single user is behaving suspiciously based on their activity
  • Identify other accounts/editors from an IP range that might be behaving in a similar fashion as a known sock
  • ...
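
The ordering and grouping described above can be sketched as follows, using the made-up accounts from NinjaRobotPirate's example earlier on this page plus one invented row from the previous day. The tuple layout and the function name `timeline` are assumptions for illustration only.

```python
from itertools import groupby
from typing import List, Tuple

# (timestamp, editor, kind, page); sample rows, not real data.
EVENTS = [
    ("2019-10-08 23:50", "GermanyFan", "edit", "History of Germany"),
    ("2019-10-09 00:10", "GermanyFan", "edit", "History of Germany"),
    ("2019-10-09 00:12", "StarWarsFan01", "edit", "Star Wars"),
    ("2019-10-09 00:13", "StarWarsFan02", "edit", "Star Wars"),
    ("2019-10-09 00:14", "StarWarsFan03", "edit", "Star Wars"),
    ("2019-10-09 00:15", "StarWarsFan04", "log", "Star Wars"),  # tripped an edit filter
]


def timeline(events: List[Tuple[str, str, str, str]]) -> List[Tuple[str, list]]:
    """Sort events latest-first and group them by calendar date."""
    ordered = sorted(events, key=lambda event: event[0], reverse=True)
    # groupby needs sorted input; the first 10 characters are the YYYY-MM-DD date.
    return [
        (day, list(rows))
        for day, rows in groupby(ordered, key=lambda event: event[0][:10])
    ]


for day, rows in timeline(EVENTS):
    print(day)
    for ts, editor, kind, page in rows:
        marker = "(log) " if kind == "log" else ""
        print(f"  {ts[11:]} {marker}{editor}: {page}")
```

Feeding the events of several usernames and IPs into one call yields the combined timeline discussed in the thread, rather than one lookup per user.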

Wikimedia specific tool links messages

MediaWiki:Checkuser-toollinks and MediaWiki:Checkuser-userlinks-ip often have to be customised locally due to either the default links provided not working or not being enough. Having Wikimedia overrides for those via the WikimediaMessages extension would help. I don't think we want to push, e.g. wmflabs tools on the CheckUser extension itself for all MediaWiki installs outside Wikimedia to use. Further local customisation will still be allowed of course, but at least those who have to do crosswiki checks such as stewards will actually have some tools that work. Thank you. —MarcoAurelio (talk) 09:32, 22 October 2019 (UTC)Reply

@MarcoAurelio: Right. I heard similar feedback from MusikAnimal so we will be keeping a way for those links to be overridden if desired. I hope at some point in the future we can have an integrated service in MediaWiki so we do not need users to go to external tools to look up the information they need. -- NKohli (WMF) (talk) 14:13, 28 October 2019 (UTC)Reply

Blocking from CheckUser

@MusikAnimal, Zzuuzz, Ajraddatz, NinjaRobotPirate, MarcoAurelio, Huji, Akoopal, and Rschen7754: We have been working on preparing a set of mocks for a redesigned CheckUser experience that we will be sharing with this group soon. Meanwhile, we have a question about the blocking feature built into the Get users view.
1) How often do you use the blocking feature in CheckUser?
2) We are thinking about decoupling CheckUser functionality from blocking and providing a better, integrated experience with Special:Block instead. What do you think about this idea?
Thanks in advance! -- NKohli (WMF) (talk) 05:20, 31 October 2019 (UTC)Reply

1) I don't know exactly. I don't use it every day, but it's a lifesaver when I do use it. I would guess several times a week. There are a couple long-term vandals on English Wikipedia who create dozens of new accounts before we catch them. The cleanup is very frustrating, but having the CU/blocking process cleanly integrated like this makes it easier. 2) In general, I doubt non-checkusers need this functionality, though it might be useful for blocking a large, organized group of vandals who attack the same article. I can't promise that I won't complain if this slows down my workflow, but if it's easier for you to maintain the code by doing this, I'll take that into account before complaining. NinjaRobotPirate (talk) 05:49, 31 October 2019 (UTC)Reply
I'll echo what NinjaRobotPirate said. 1) I would say I use it less often, but when I do I usually block anywhere from 10 to 50 (or more) accounts. The convenience is amazing. The alternative is 50 browser tabs and many hundreds of mouse clicks, and that's really exhausting and time-consuming. 2) If you're actually going to provide a better experience with the same or better abilities elsewhere, then I'm sure that needs no comment. But I'll comment just in case. I can understand if you'll find it easier to remove it from the CU extension. I don't think this facility will be very useful to non-CUs - I'm sure some batch-block scripts have probably been written for admins on enwiki but they're not common. The practical alternative would be username exporting and either Javascript blocking or a new ability for multiple usernames at Special:Block. It's certainly not my preference to remove the integration of this feature, but I'm sure we could get by. -- zzuuzz (talk) 08:11, 31 October 2019 (UTC)Reply
I use the feature a lot, because via scripting we have it integrated with Special:MultiLock as well. Please keep the feature around. Thanks! —MarcoAurelio (talk) 11:05, 31 October 2019 (UTC)Reply
I use the block feature rarely, but I agree with Ninja that when we do need it, it is a life saver.
The decoupling idea is not necessarily bad. One option would be to allow a user to "select" (i.e. check the checkbox next to) a bunch of users in the CU interface, have a button that copies those usernames to the clipboard, and then have an interface in Special:Block (say, Special:MultiBlock) that would take a list of usernames and allow you to block them all at once.
Putting my MW developer hat on, from a code perspective it also makes sense to abstract out the multiuser block feature from CheckUser and make it a core feature (or its own extension, kind of like the Nuke extension). Huji (talk) 19:18, 31 October 2019 (UTC)Reply
@NinjaRobotPirate, Zzuuzz, MarcoAurelio, and Huji: Thanks for all the input! It is helpful to know that the feature is useful. After talking to the engineers, I will circle back with you all to discuss potential options and we can collectively decide what is most useful for you. Thank you. -- NKohli (WMF) (talk) 18:07, 5 November 2019 (UTC)Reply

Update and mockups

@MusikAnimal, Zzuuzz, Ajraddatz, NinjaRobotPirate, MarcoAurelio, Huji, Akoopal, and Rschen7754: Hi all! I shared an update on the project page. It outlines some of the things we have learnt from this page so far that we have designed some mockups around. These are early-stage mockups and there are a lot of details to work out. My biggest question from you all is whether these make sense at a high-level and provide the information you seek from the tool. Looking forward to your feedback.
If you're interested in participating in a usability test for the mockups, please let me know below or over email. Thanks. -- NKohli (WMF) (talk) 21:56, 5 November 2019 (UTC)Reply

Happy to participate in a usability test. – Ajraddatz (talk) 22:34, 5 November 2019 (UTC)Reply
What's involved in usability tests? The mockups look OK to me. NinjaRobotPirate (talk) 23:55, 5 November 2019 (UTC)Reply
The mock-ups look good to me as a first step. Depending on the level of involvement for the usability test (the time it takes, does it have to be over Skype or something or can it be done through email, how near in the future is it planned for) I may be able to participate as well.
Some additional thoughts (for now or for later);
  • I would really like the timeline to be also presented as a graph.
  • Some of the features of DEWKIN (such as punch card) may also be good to have.
  • I still think that the actual UA string should be exposed somehow. (It can be hidden by default)
  • The aggregate data that is shown for IPs ("13 from all users") should also be shown for UAs.
Best, Huji (talk) 00:33, 6 November 2019 (UTC)Reply
Looking good! I concur with Huji that the full user agent should be accessible somehow. I really like that it infers the browser/OS for you, but this can't always be done as UAs can be completely arbitrary. Also ideally I could get directly to Compare/Timeline tabs, similar to how it is now with radio buttons for which view you want. I see the value in the CentralAuth-style view for stewards but I doubt I'd use it much. In my case I'm just trying to find accounts with technical overlaps and/or check for collateral damage. That said a lot may be missing with the "Include all users who are using the same IPs" option, unless you can somehow have it go by the ranges. I doubt you could without producing too much noise, so perhaps it could just compute ranges for me based on the surfaced IPs (same as copying pasting IPs into the range calculator), and have a link to check just those range(s). That'd be amazing! That'd shave off a lot of time copying/pasting IPs. Finally, I worry about the table format. What you have presented is super clean, but a lot of stuff is missing such as the user groups (positions of trust might tell me this user is likely unrelated), block/lock status (current and previous), the date range of activity (I see only a single date in each row), basic links like talk/block/contribs, and of course the community-maintained links -- all of this may not confine nicely into a table. Perhaps we could have a little expand/collapse link to reveal this? Otherwise I think the current "Get users" view could work well as it's already designed to show data for multiple users. The only difference would be that you could query for multiple users/IPs from the beginning, which is a great improvement! Kind regards MusikAnimal talk 03:30, 6 November 2019 (UTC)Reply
Hmm. I share some views of MusikAnimal above, and I see some improvements from the previous proposal. I just have a few comments. 1) I think these examples could benefit from some real world data (anonymised obviously) of some real prolific sockpuppeteers, based on real CU activity. Users "Apples" and "Oranges" is one thing, but show me (and please practise on) a 20-account LTA on a /39 range with plenty of collateral, and preferably some other examples. 2) I agree the UA information should not always be permanently reduced from its current format. Some details are important. I'm of the opinion there should be more of this user-provided information available where it's useful (a tooltip might be an idea). 3) IP addresses are missing from the timeline view. Considering a CU operating on ranges, it can be useful to see IP sticky/switching behaviour - especially with IPv6. You could say that's now incorporated in the "Compare" view, but I'm not sure that's an improvement. And I hope one of the options for "Type of activity" is "All". Thanks. -- zzuuzz (talk) 19:38, 6 November 2019 (UTC)Reply


Thanks all! To summarize, I'm hearing these as feedback:

  • Timeline as a graph
    • @Huji: Can you elaborate more on this? What kind of a graph are you envisioning? With the highlighting feature you would be able to highlight a set of pages for example and all those rows will be highlighted, showing all the users who have edited those pages.
  • Aggregate data for UAs
    • I believe this should be doable but I will check with the engineers on this.
  • Show the complete UA string
    • This will be the case. The first iteration of the tool will not have the parsed UA string. Once we build that, we will ensure there is still a way to access the complete UA string. As part of this, we hope we can also flag non-standard user-agent strings in the UI.
  • Compute ranges automatically to check on
    • @MusikAnimal: This should be doable. My concern is how do we present the information usefully if there are too many results to show. If you search for a range and 1000 users turn up, is it useful to show all 1000 of them? Should we try to limit them and ask the user to try a smaller range? I'm also concerned about the technical limits we'll run into which will make loading results slow.
  • Table format
    • I do want to make sure we show everything that is currently available in CU. I think the table can be changed to include everything in the same UI. I'll discuss more with Prateek (our designer) about this.
  • Real world data
    • This is a good point and something we are considering doing. I will try to share an updated mock with some real world examples (anonymized) within the next week or two. If you have an example in mind (not a very extreme one as stress-testing will come later, this is for the mocks only) please let me know privately or on CU wiki.
  • IP addresses on timeline view
    • Should be added, yes.
  • IP sticky/switching behavior
    • If I understand you right, this should be achievable with the highlighting we are planning to build. You'll be able to highlight all rows in the timeline for a given user or IP or page edited etc.
  • All as a type of activity
    • Yes, that will be one of the types.

Thanks. -- NKohli (WMF) (talk) 02:21, 7 November 2019 (UTC)Reply
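
The "compute ranges automatically" idea discussed above can be illustrated with a minimal sketch. It is not the tool's implementation: it only assumes the goal is the smallest single range covering all surfaced IPs, which is what pasting them into a range calculator gives. The function name `smallest_common_range` and the sample addresses are hypothetical.

```python
import ipaddress


def smallest_common_range(addresses):
    """Return the smallest single network containing every given address.

    Hypothetical helper: mirrors what a manual range calculator does,
    not what Special:Investigate does internally.
    """
    ips = [ipaddress.ip_address(a) for a in addresses]
    if not ips:
        raise ValueError("need at least one address")
    if len({ip.version for ip in ips}) != 1:
        raise ValueError("cannot mix IPv4 and IPv6 addresses")

    lo, hi = min(ips), max(ips)
    # The highest bit in which the lowest and highest address differ
    # decides how many leading bits the whole set has in common.
    differing_bits = (int(lo) ^ int(hi)).bit_length()
    prefix = lo.max_prefixlen - differing_bits
    # strict=False masks off the host bits of `lo` for us.
    return ipaddress.ip_network(f"{lo}/{prefix}", strict=False)


print(smallest_common_range(["198.51.100.7", "198.51.101.200", "198.51.100.90"]))
# prints 198.51.100.0/23
```

A single covering range can be far wider than needed when the IPs are scattered, which is the "too many results" concern raised in the reply to MusikAnimal.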

@Ajraddatz, NinjaRobotPirate, and Huji: thank you for expressing interest in the usability tests! They shouldn't take more than 20 minutes, and can be done at your leisure - no explicit scheduled interview needed. These tests will involve recording your screen as well as audio, if you have a microphone enabled. Please email me, clo@wikimedia.org, for more details (and this goes for anyone else who might be interested in participating in the usability tests). —User:CLo (WMF) (talk) 18:40, 7 November 2019 (UTC)Reply


With regard to the timeline, what I am thinking is something like this:

In many cases, one of the first steps of running CU is to determine if the accounts are editing on similar days or not. As I pointed out, the Punch Card feature of DEWKIN can also be helpful (e.g. to see if the users edit during similar times of the day). Huji (talk) 20:18, 7 November 2019 (UTC)Reply
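
As a rough illustration of the punch card idea, the sketch below buckets each account's actions by weekday and hour and lists the slots two accounts share. It is only an assumption about what such a view would compute; the function names and the sample timestamps are made up.

```python
from collections import Counter
from datetime import datetime, timezone


def punch_card(timestamps):
    """Count actions per (weekday, hour) slot in UTC; Monday is weekday 0.

    Expects timezone-aware datetimes (naive ones would be read as local time).
    """
    card = Counter()
    for ts in timestamps:
        utc = ts.astimezone(timezone.utc)
        card[(utc.weekday(), utc.hour)] += 1
    return card


def shared_slots(card_a, card_b):
    """Slots in which both accounts were active at least once."""
    return sorted(set(card_a) & set(card_b))


UTC = timezone.utc
account_a = punch_card([
    datetime(2019, 11, 4, 9, 15, tzinfo=UTC),
    datetime(2019, 11, 4, 9, 50, tzinfo=UTC),
    datetime(2019, 11, 5, 22, 5, tzinfo=UTC),
])
account_b = punch_card([
    datetime(2019, 11, 4, 9, 30, tzinfo=UTC),
    datetime(2019, 11, 6, 3, 0, tzinfo=UTC),
])
print(shared_slots(account_a, account_b))  # prints [(0, 9)]
```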

Looking for your feedback on interactive mockup[edit]

@MusikAnimal, Zzuuzz, Ajraddatz, Rschen7754, MER-C, Akoopal, Huji, MarcoAurelio, NinjaRobotPirate, Bbb23, Berean Hunter, and Beetstra: Pinging everyone who's chimed in on this page in the past. Hi all, I want to share some interactive mockups that our team has been using to do our series of user tests up until now. I want to share these with you to give you a sense of the direction we are going in for the new tool and get your feedback. You can find the mocks on this link. The mocks are along the tool outline we initially set out with.

Please keep in mind that because they are just mockups, not everything will work and you will probably find some bugs. Also, some things have changed since the last time this mockup was updated. This mostly involves design changes. There are also some features in the works that are not present in the mocks here. For example, we are working on incorporating MusikAnimal's feedback to allow looking up a range in the Checkuser interface (mock here).

With that said, I would like your feedback in the following categories. Thanks so much for your time. -- NKohli (WMF) (talk) 01:49, 8 May 2020 (UTC)Reply

What do you like?[edit]

What concerns you?[edit]

  • The UI is pretty different. I like that all the users on an IP address highlight when you hover over that IP address, but this isn't so obvious unless you are hovering over an IP address. I wonder if there's a way to make it a bit more obvious at a glance, without having to hover over anything. It looks like we'll be able to sort by IP address. I suppose that will make it fairly obvious. NinjaRobotPirate (talk) 05:33, 8 May 2020 (UTC)Reply
    @NinjaRobotPirate: That's true, the highlighting feature isn't obvious until the user "discovers" it. Our hope is that it will become apparent to the users as they start looking through the data and after once or twice, that feature will become obvious to them. We will also add ample documentation for the feature in the help docs. As for sorting by IP addresses, seems like there are technical challenges to doing that. We might not be able to make it work for all use cases. The engineers are still talking about the technical possibilities for making it work. -- NKohli (WMF) (talk) 02:59, 12 May 2020 (UTC)Reply
  • This design is also IP/user based and can make it even harder for us to ever achieve UA-based queries (see phab:T146837) Huji (talk) 12:16, 8 May 2020 (UTC)Reply
    @Huji: I don't think having the design be IP/user based makes the possibility of having UA-based search hard. It's possible that in the future there could be a button to expose all users using a given user-agent, much like the button we have to expose all users using a given IP address. -- NKohli (WMF) (talk) 02:59, 12 May 2020 (UTC)Reply
  • The mock-up doesn't show how things will look when a user has more than one UA for the same IP (see phab:T26411 and phab:T170508) Huji (talk) 12:16, 8 May 2020 (UTC)Reply
    @Huji: Ah. There will be a separate row in the table for each different IP, UA and activity time period. I think that got missed in the mocks. -- NKohli (WMF) (talk) 02:59, 12 May 2020 (UTC)Reply
  • 1) The block interface, particularly for IP addresses. Half the time, we're probably going to be hard-blocking the IP. The default block length of one week seems arbitrary. And are we blocking all the IPs that have been used? All accounts found? It's difficult to tell how this works without a fully working dataset. A particular concern I have is about privacy. Checkusers will often (not always) try and obfuscate any connection between accounts and IP addresses, perhaps by staggering the blocks or using other smoke and mirrors. This interface, by blocking IPs and accounts at the same time, will make this connection obvious by default. I suspect I would be inclined to remove IP addresses from this interface. It's not often that we will want to mass-block multiple IP addresses (and we will not want to add the same tags). 2) I do hope for the "Dropdown with IP ranges" we can use custom ranges, like the /23 ranges used by BT IPv4, or the /40 ranges used by Verizon IPv6. -- zzuuzz (talk) 22:02, 8 May 2020 (UTC)Reply
    @Zzuuzz and Berean Hunter: . Thanks for pointing this out. The blocking interface will actually look different from the one in the mocks. As we were thinking about the best way to allow checkusers to block, we came to the conclusion that it's best if we offer checkusers the same UI as Special:Block but with the additional options that Checkuser blocking form has. There will be an option for the checkuser to select users/IPs they want to block and they will be taken to a special page called Special:InvestigateBlock, which is a modification of Special:Block in a way that allows blocking multiple users. This block form will be pre-filled with the users the checkuser selected. The checkuser can submit the form after checking the relevant options and setting the time period they want. It's not in the mocks. My apologies about that. You can find a couple screenshots and more detail in phab:T248530.
    Regarding your second question, @Zzuuzz: , in the initial version we were considering having a default list of ranges. We're also being cautious because the bigger ranges we allow, the longer it will take to fetch the data, especially given that multiple users are being looked up. If you were to make a default list of possible ranges, which ones would you include? Thanks. -- NKohli (WMF) (talk) 02:59, 12 May 2020 (UTC)Reply
    I can understand the need for caution, but I also understand the 5,000 record limit of the current tool is one of the most common complaints about it, and some of us very often deal with large busy ranges. To answer the question though, a compromise I would suggest is that if you can't make the IPv4 prefix multiples of 2, which I would strongly prefer, then at least make it multiples of 4. For IPv6, please double those numbers. We can probably get by with minimum range sizes (maximum prefixes) of /24 and /64 respectively (aside from /32 and /128). -- zzuuzz (talk) 06:15, 12 May 2020 (UTC)Reply
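
    To make the suggestion above concrete, here is a minimal sketch of how a default list of ranges could be generated from it. The longest prefixes (/24 and /64) and the single-address entries (/32 and /128) come from the comment; the `widest` starting prefix is an assumption, since no upper bound on range size was stated, and both function names are hypothetical.

```python
import ipaddress


def candidate_prefixes(version, step, widest):
    """Prefix lengths that are multiples of `step`, from `widest` down to the
    smallest suggested range (/24 for IPv4, /64 for IPv6), plus one address."""
    longest = 24 if version == 4 else 64
    single = 32 if version == 4 else 128
    return [p for p in range(widest, longest + 1) if p % step == 0] + [single]


def enclosing_ranges(address, step, widest):
    """Every candidate range that contains `address`, widest first."""
    ip = ipaddress.ip_address(address)
    return [
        ipaddress.ip_network(f"{ip}/{p}", strict=False)
        for p in candidate_prefixes(ip.version, step, widest)
    ]


# `widest=16` is an assumed cut-off for illustration only.
print(candidate_prefixes(4, 2, 16))  # prints [16, 18, 20, 22, 24, 32]
print([str(n) for n in enclosing_ranges("198.51.100.7", 4, 16)])
```

    With multiples of 4 the IPv4 list shrinks to /16, /20, /24 and /32, which leaves out ranges such as the /23 mentioned earlier; that is the trade-off between the two step sizes.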

What else would you like to see?[edit]

  • Highlighting non-standard UAs (see phab:T234980) Huji (talk) 12:16, 8 May 2020 (UTC)Reply
  • Essentially, go through phab:T237486 and bring as many of them into life in the mock-up as possible. Huji (talk) 12:16, 8 May 2020 (UTC)Reply
  • I believe this may be covered by what Huji has linked above but, yes, having the pertinent whois and geolocation info display in one place would be great. Displaying which ones are proxies would be great too but if not done automatically then having a link to autofill and complete the proxy checking would be second best. It's better than cutting and pasting in a different tab which is what we do now. One of the big goals is to reduce the number of browser tabs that we have to have open.
    ⋙–Berean–Hunter—► ((⊕)) 20:59, 8 May 2020 (UTC)Reply
    @Berean Hunter: Good news, we are working on bringing the IP information into the user interface, not just for checkusers but more broadly on the project. You can read more about it here. I would love to hear your thoughts on the talk page discussion. It's a few months down the line. It involves vetting and contracting with external IP services to gather the data, ensure reliability etc. There's a few challenges to that work. I will be posting a project page for it within the next few days and I'll let you know when I do. -- NKohli (WMF) (talk) 03:04, 12 May 2020 (UTC)Reply

This section is to collect feedback on the new special page. We're looking to hear what you like, what you don't like and what else you'd like to see. The new Checkuser tool is accessible at Special:Investigate. The documentation for it can be found at Special Investigate.

What's good?[edit]

What's not good?[edit]

What is missing?[edit]

Any other feedback[edit]

Adding and removing users from an investigation[edit]

I have started an investigation on three users. It is not conclusive, so I did some more digging (behavioral, outside of Special:Investigate) and want to add a fourth user to the comparison. There does not appear to be any way to do this. It should be possible to edit/add to the list of users/IPs. ST47 (talk) 22:54, 8 October 2020 (UTC)Reply

@ST47: That's been added to the list of features we want to deliver on. We will come up with mocks and share them with the community soon. Thanks. -- NKohli (WMF) (talk) 00:16, 14 October 2020 (UTC)Reply
@NKohli (WMF) and ST47: This indeed would be an awesome addition. Being able to keep adding users and IPs to an investigation would be of great help. Thanks. —MarcoAurelio (talk) 15:50, 18 November 2020 (UTC)Reply

Missing IP addresses[edit]

I performed a check on the same user using both Special:Investigate and Special:CheckUser. Special:Investigate failed to detect one of the user's two IP addresses. It appears that Special:Investigate does not check all of the same types of entries as Special:CheckUser. Users with the appropriate access can find a testcase here. ST47 (talk) 23:00, 8 October 2020 (UTC)Reply

It also fails to detect abuse filter logs, for example, it is claiming that en:Special:Contributions/Meckma is stale. ST47 (talk) 01:23, 9 October 2020 (UTC)Reply
It sounds like a major flaw if it doesn't show all the IPs that the older CheckUser tool does. Since we cannot test on enwiki ourselves to compare the output, would you mind pasting the outputs on checkuserwiki so that I can look at them? That will help us determine the issue. I appreciate all the help.
The issue with abuse filter logs not showing up has been mentioned above. I'll make a note of it and we'll get to it as soon as we can. Thanks. -- NKohli (WMF) (talk) 00:32, 14 October 2020 (UTC)Reply
@NKohli (WMF): I expect that it's that all of the logs aren't being displayed in the table. User creation, abuse filter, password reset, login, email... ST47 (talk) 21:36, 16 October 2020 (UTC)Reply

I don't know if it's the same problem, but in my case I checked a single new account that has no edits, and the "IPs & User agents" tab says "there are no results". However I do see the IP and UA for the account creation on the Timeline tab. I would expect that information to be listed on the "IPs & User agents" tab too. To clarify my use case, as a steward I usually run checks on loginwiki where users cannot make edits. MusikAnimal talk 00:22, 16 October 2020 (UTC)Reply

I think the problem is that "account created" and "login" (and maybe "mail", too) entries are missing. I ran queries using both tools and got a different number of IPs (on user A):

| Tool | Query | User A | User B |
|---|---|---|---|
| classic | Get IP addresses | 3 rows | 11 rows |
| investigate | | 2 rows | 11 rows |

Unfortunately, the row which was missing from the Special:Investigate output but present in the classic tool indicated a perfect match between the users, one that would otherwise be missed. This could be a major flaw. The missed IP did not have any edits, only an automated user creation and a login.  « Saper // talk »  20:44, 24 March 2021 (UTC)Reply

Unclear cues for when additional data is available for an IP address[edit]

I performed a check, and got back several IP addresses. One IP was used by two users, and with different UAs, so it appears three times in the table. It is annotated with "[1 edit](~5 from all users)". This gives the impression that additional information is available for that IP address. However, adding that IP to the investigation returns no new data. There should be a clear and easily visible cue to indicate that a given IP address has additional edits associated with it, but which are not part of the current investigation. (Similarly, if I check an IP, there should be a cue for usernames which have additional edits which are not yet part of the current investigation.) ST47 (talk) 23:59, 8 October 2020 (UTC)Reply

Not possible to add an IP range to an investigation[edit]

I performed a check on two users, and got a list of three IP addresses in what I recognize as a /23 range. I want to run a check on that range. It does not appear to be possible to do this without starting a brand new investigation. ST47 (talk) 23:59, 8 October 2020 (UTC)Reply

+1. Very strongly needed to add the entire range to the investigation — NickK (talk) 19:00, 14 October 2020 (UTC)Reply
+1. Because of that I'm having to use the new and old tool simultaneously. Rafael (stanglavine) msg 21:25, 16 October 2020 (UTC)Reply
+1 —MarcoAurelio (talk) 15:50, 18 November 2020 (UTC)Reply

Not possible to highlight (for copy/paste) information from table[edit]

I performed a check, and I want to copy the IP address from the table in order to run a separate check (or to block them, or whatever). However, you appear to be blocking copy/paste from the table. Attempting to highlight text by clicking and dragging has no effect. This affects the username, IP, and user agent columns. ST47 (talk) 23:59, 8 October 2020 (UTC)Reply

I'll have more to add tomorrow, but this was an extreme pain in the ass to get proper results over to the SPI and to other tools to check. The sooner this can be fixed, the better. -- Amanda (aka DQ) 02:31, 9 October 2020 (UTC)Reply
@ST47 and DeltaQuad: It's not intentional but a bug. It's tracked in task T261646. We'll get to it as soon as possible. My apologies. -- NKohli (WMF) (talk) 04:54, 9 October 2020 (UTC)Reply
Seconding that this is important, but because some LTAs use dedicated phone models that show up in UA and I want to copy the model to search in results Amir (talk) 05:09, 13 October 2020 (UTC)Reply
@ST47, DeltaQuad, and Ladsgroup: This has now been fixed! You should be able to select text from the table as expected. Sorry for my late response here. I got tied up in a few different things. Let me know if you find any further issues. Thanks. -- NKohli (WMF) (talk) 05:40, 18 November 2020 (UTC)Reply

Middle click to open in new tab doesn't work on tool links[edit]

I was trying to open the contributions for an IP by middle clicking in the dropdown menu. However, this has no effect. I see that you are forcing those links to open in a new tab by default. This is a questionable practice at best - it should be the user's choice, not the designer's, how to organize their browser. However, there is no reason to block middle-click from working. ST47 (talk) 00:09, 9 October 2020 (UTC)Reply

Thanks for the feedback, ST47. Filed this as task T265439. -- NKohli (WMF) (talk) 00:46, 14 October 2020 (UTC)Reply

Not possible to set the number of items in IP&UA tab[edit]

On the IP&UA tab, if there are too many IPs/accounts in some range, only 50 items appear, and I must click Next Page to view the rest. Having to check each page individually is inconvenient. It would be good to be able to choose the number of items, or to show them all at once. --Sotiale (talk) 00:52, 9 October 2020 (UTC)Reply

Yes, this definitely should not be paginated. We routinely need to look at more than 50 IP/user/UA combinations at the same time. ST47 (talk) 01:18, 9 October 2020 (UTC)Reply
You may increase the number of items to show per page by going to the "Recent changes" tab in Special:Preferences, and changing the option "Number of edits to show in recent changes, page histories, and in logs, by default:" (these are the English translations, obviously). DWalden (WMF) (talk) 12:13, 14 October 2020 (UTC)Reply
Sorry, I don't want to do that. I frequently browse RC on mobile, and increasing it to (say) 500 to force a "un-paginated view" is not realistic because my phone will just stop thinking for a while whenever I see RCs, page histories, and other logs. Please find some other ways. — regards, Revi 12:40, 14 October 2020 (UTC)Reply
@DWalden (WMF), ^ — regards, Revi 20:18, 15 October 2020 (UTC)Reply
Another temporary workaround is to add to the URL &limit=n, where n is any number up to 1000. E.g. Special:Investigate/IPs_%26_User_agents&limit=500&token=.... Note this will not be preserved across different tabs (e.g. going to Timeline). DWalden (WMF) (talk) 09:01, 19 October 2020 (UTC)Reply
Yeah, that unrelated setting certainly isn't an adequate fix. Just show the entire table on a single page, so that we can Ctrl-F and copy-paste and sort. There is no reason to paginate it, we can scroll down the page. ST47 (talk) 21:38, 16 October 2020 (UTC)Reply
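
For anyone scripting the `&limit=n` workaround mentioned above, a minimal sketch follows. It only rewrites the query string of a URL you already have; the host name and the `PLACEHOLDER` token value are made up, and the helper name `with_limit` is hypothetical. The cap of 1000 is the figure quoted in the comment.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

MAX_LIMIT = 1000  # upper bound quoted in the workaround above


def with_limit(url, n):
    """Return `url` with its `limit` query parameter set to n (clamped to 1..1000)."""
    parts = urlsplit(url)
    # Drop any existing limit so the parameter appears exactly once.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "limit"
    ]
    query.append(("limit", str(max(1, min(n, MAX_LIMIT)))))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URL: the real token is whatever your own session produced.
example = ("https://wiki.example.org/w/index.php"
           "?title=Special:Investigate/IPs_%26_User_agents&limit=50&token=PLACEHOLDER")
print(with_limit(example, 500))
```

As noted above, the parameter is not carried over when switching tabs, so it has to be reapplied on each tab.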

Only one pin can be set[edit]

The function to highlight a UA or IP with a pin is useful, but its usefulness is halved because only one UA or IP can be pinned at a time. Is it possible to set multiple pins? --Sotiale (talk) 00:57, 9 October 2020 (UTC)Reply

+1, it is useful as it is, but it would be even more so if you could "categorise" different data into different pins/colours. --Base (talk) 07:55, 9 October 2020 (UTC)Reply
@Base and Sotiale: Thanks for the feedback. It gives us a lot of joy to hear that you found the highlighting feature helpful. We will do some thinking around how we can make multiple pins happen and get back to you as soon as we can. Thanks. -- NKohli (WMF) (talk) 00:17, 14 October 2020 (UTC)Reply

What is the default sort of the IPs and user agents table?[edit]

I assumed it was by date descending, but that does not appear to be the case. File:Investigate_sort.png. The new sockpuppet would be the first row if I ran Special:CheckUser, but it's buried in the middle of the table for Special:Investigate. ST47 (talk) 01:33, 9 October 2020 (UTC)Reply

Also, if the table exceeds one page (50 rows), it is impossible to sort. Unfortunately this is unusable. ST47 (talk) 01:42, 9 October 2020 (UTC)Reply
I see the problem you are talking about. We do have a task to allow sorting to work across tabs but it proved to be a harder technical challenge than we had imagined. The underlying infrastructure is built in such a way that if we choose to allow sorting across tabs, results will take much longer to load and will quite likely time out often. One intermediary solution is to set the limit on number of rows in the result to be higher. This is the preference under your Special:Preferences > Recent changes tab. It's called Number of edits to show in recent changes, page histories, and in logs, by default. You can see up to 1000 rows on the same page and as long as the results are restricted to one page, you will be able to sort. I'm sorry to not have a better response for you. -- NKohli (WMF) (talk) 00:55, 14 October 2020 (UTC)Reply

Timeout function?[edit]

I noticed that when a certain amount of time has passed, clicking [Next page] automatically sends me back to the first interface. At first I thought it was my mistake, but when I tried again, the same thing happened. Perhaps it is set to time out automatically after a certain period. Does this tool have an automatic timeout? --Sotiale (talk) 02:29, 9 October 2020 (UTC)Reply

@Sotiale: It does not time out unless there is no activity for 24 hours. What likely happened is that you got logged out as part of the mass-logout that was done by the security team to log all users out last week. You should not encounter this problem again. Please let us know if you do. Thanks. -- NKohli (WMF) (talk) 00:50, 14 October 2020 (UTC)Reply

URL parameters[edit]

If you are using GET parameters to fill the tool's fields (for example when linking from the contributions page), the parameters "user" and "period" don't work.

For example https://meta.wikipedia.org/w/index.php?title=Special:CheckUser&user=DR&reason=SomeReason&period=14 fills both "User" and "Reason" fields, but https://meta.wikipedia.org/w/index.php?title=Special:Investigate&user=DR&reason=SomeReason&period=14 fills only "reason". --DR (talk) 14:22, 9 October 2020 (UTC)Reply

@DR: &targets= seems to work for me. stwalkerster (talk) 14:48, 9 October 2020 (UTC)Reply
`targets=` doesn't appear in the URL after the check has started. So this may work for creating a new "Investigation", but doesn't solve the inability to add users or IP ranges to an investigation that is already started. ST47 (talk) 21:40, 16 October 2020 (UTC)Reply
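
For link generators (for example a user script on a contributions page), here is a minimal sketch of building a prefilled link with the `targets` parameter reported above to work. The host name and the helper name `investigate_link` are hypothetical; only `title`, `targets` and `reason` are used because those are the only parameters this thread reports as being filled in, and only a single target is passed because the format for several targets is not described here.

```python
from urllib.parse import urlencode


def investigate_link(index_php, target, reason):
    """Build a link that prefills Special:Investigate for one target.

    Assumes `targets` and `reason` are honoured as described in this thread.
    """
    query = urlencode({
        "title": "Special:Investigate",
        "targets": target,
        "reason": reason,
    })
    return f"{index_php}?{query}"


# Hypothetical wiki host, with the user and reason from the example above.
print(investigate_link("https://wiki.example.org/w/index.php", "DR", "SomeReason"))
```

This only covers starting a new investigation, so it does not address ST47's point about adding users to one that is already running.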

More precise timestamps[edit]

Could the tool list the time as well as the dates? Currently it only lists the date range but wouldn't it be more useful if it listed something like "9 October 2020, 12:20 (UTC) - 9 October 2020, 14:40 (UTC)"? Regards SoWhy 16:14, 9 October 2020 (UTC)Reply

@SoWhy: That sounds like a good idea. I'll document it in a task and we'll get around to it. Thanks. -- NKohli (WMF) (talk) 01:20, 14 October 2020 (UTC)Reply

Link to edits / Show edits directly in table[edit]

The table currently displays "~X edits" but there seems to be no way to show just those edits. When you click on "Contributions" in the drop down, it shows only contributions for this IP address. Shouldn't it rather display the user's contributions filtered by start and end timestamp? Better yet, can't they be displayed in the table with an expand toggle? Regards SoWhy 16:14, 9 October 2020 (UTC)Reply

@SoWhy: This was one of the challenging things for us to figure out. The table lists the username, IP address, user-agent and the activity time. So when we show contributions there are multiple ways one could imagine them to show up:
  • for that user account
  • for that IP address
  • for that user account when editing from that IP address
  • for that user account with that user-agent
  • for that IP address with that user-agent
  • for that user account + IP address + user agent combination
Linking to IP contribs was easy because it could be placed in a way that was intuitive. I'd love to hear more about what you think is most valuable for your workflow and how we can display it in a good way. By the way I love your username. :) -- NKohli (WMF) (talk) 01:03, 14 October 2020 (UTC)Reply
This old thing? Thanks! :-D
This is the current output when checking my own IP:
| Username | IP | User agent | Date range |
|---|---|---|---|
| SoWhy | [IP removed] [1 edit](~2 from all users) | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:76.0) Gecko/20100101 Firefox/76.0 | 14 October 2020 - 14 October 2020 |
| Yhwos | [IP removed] [1 edit](~2 from all users) | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36 | 14 October 2020 - 14 October 2020 |
It displays my main account and my alt account separately. Since it already does that, shouldn't the "Contributions" drop-down link next to the IP offer the possibility to show all contributions for the username listed in the first cell? At least, it might be useful if the drop down offered links both to the IP contributions and the user contributions, no? Regards SoWhy 08:14, 14 October 2020 (UTC)Reply

Several requests[edit]

I didn't want to create 6-7 separate sections, so I'm listing all requests in one.

  1. IP Ranges (up to /16 for IPv4 and up to /32 for IPv6) should be grouped together for the same user like this:
    192.168.1.0/16
    192.168.1.5
    192.168.1.82
    192.168.1.133
  2. UAs on the same IP need to be grouped. It's pointless for a CU to have these separated out for each IP, especially if the only difference is a version change.
  3. IPs from the same user should also be grouped together, maybe with a thicker border than just the separation by IPs. This may need discussion, but when a user is on 6-7 ranges, the table gets crowded and unorganized very fast, and it is hard not to be distracted or confused.
    Firstly, thanks for all the feedback, Amanda. I love the idea of grouping records. The only downside will be that sorting might not work as expected or remove grouping settings if selected by the user. We'll brainstorm this some more in the team and get back to you. -- NKohli (WMF) (talk) 01:29, 14 October 2020 (UTC)Reply
  4. User links are not set for IPs, making user scripts like markblocked, popups and other things useless.
    This makes sense. We should fix this. -- NKohli (WMF) (talk) 01:29, 14 October 2020 (UTC)Reply
  5. Timeline formatting should match the formatting of get edits in Special:CheckUser. The way it is now is distracting and hard to read.
    We purposely chose to format the timeline just like RecentChanges in order to maintain familiarity for new checkusers who found the older 'get edits' interface hard to parse. Do you think the new interface is missing any information that the older one provided? Or any other visual cues that you liked? -- NKohli (WMF) (talk) 01:29, 14 October 2020 (UTC)Reply
    @NKohli (WMF): Specifically, I was hoping just for the IP/UA to maintain a different line from the actual log, with a different font so it's definitely distinguishable. I'd be happy to explain more by email about why and how because there are specifics that involve me that I don't wish to put out here. -- Amanda (aka DQ) 13:21, 16 October 2020 (UTC)Reply
  6. User links are also missing from timeline.
    Again, makes sense to fix. I'll create a task for it. -- NKohli (WMF) (talk) 01:29, 14 October 2020 (UTC)Reply

I am happy to create a table and screenshot it to show what I mean with the grouping or even the links if that helps at all. Feel free to comment inline inside my comment too. -- Amanda (aka DQ) 01:03, 10 October 2020 (UTC)Reply

@NKohli (WMF): If you could just link any tasks you create for these, I'd be interested in following them directly. :) -- Amanda (aka DQ) 13:21, 16 October 2020 (UTC)Reply
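
The grouping in request 1 can be sketched as follows. This is only an illustration under stated assumptions: it buckets one user's IPs by a fixed prefix length (/16 for IPv4 and /32 for IPv6, the widest sizes named in the request), and the function name `group_by_range` and the extra sample address are hypothetical.

```python
import ipaddress
from collections import defaultdict


def group_by_range(addresses, v4_prefix=16, v6_prefix=32):
    """Group IPs under the range of the given prefix length that contains them."""
    groups = defaultdict(list)
    for text in addresses:
        ip = ipaddress.ip_address(text)
        prefix = v4_prefix if ip.version == 4 else v6_prefix
        # strict=False turns "address/prefix" into the enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        groups[network].append(ip)
    # Sort by IP version first so IPv4 and IPv6 networks are never compared.
    ordered = sorted(groups.items(), key=lambda item: (item[0].version, item[0]))
    return {network: sorted(ips) for network, ips in ordered}


grouped = group_by_range(["192.168.1.82", "192.168.1.5", "10.0.0.1", "192.168.1.133"])
for network, ips in grouped.items():
    print(network)
    for ip in ips:
        print("   ", ip)
```

Note that the library normalises the range to its network address, so the three addresses from the request appear under 192.168.0.0/16.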

Table export[edit]

If I want to export a table, I can do it only once. If I change the sorting, there is no way to export with the new sorting. This is quite annoying: by mistake I clicked on wikitable export without sorting, then sorted in the correct order, but I had to reload the page to export again — NickK (talk) 19:56, 14 October 2020 (UTC)Reply

Indicate number of users behind the IP

This is a request for something that doesn't exist in Special:CheckUser either. On the "IPs & User agents" tab, it would be neat if it indicated the number of users behind the IP. So instead of saying "~10 from all users", it could say "~10 from 5 users". This is helpful because for instance if it's just one user (the user I looked up), I know I don't need to investigate further because the Timeline will show all activity for that user. MusikAnimal talk 00:55, 16 October 2020 (UTC)Reply
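
For illustration, the aggregation requested above could be sketched as follows. This is a minimal sketch that assumes each logged action is available as an (ip, user) pair; the record layout and the function names are hypothetical and are not taken from the CheckUser extension.

```python
from collections import defaultdict


def summarize_ip_activity(actions):
    """Map each IP to (number of actions, number of distinct users).

    `actions` is an iterable of (ip, user) pairs (hypothetical layout).
    """
    counts = defaultdict(int)
    users = defaultdict(set)
    for ip, user in actions:
        counts[ip] += 1
        users[ip].add(user)
    return {ip: (counts[ip], len(users[ip])) for ip in counts}


def describe(ip, summary):
    """Render the label in the style proposed above, e.g. '~10 from 5 users'."""
    total, n_users = summary[ip]
    noun = "user" if n_users == 1 else "users"
    return f"~{total} from {n_users} {noun}"
```

With a label like this, a row that reads "from 1 user" would signal that the Timeline for the looked-up account already covers all activity on that IP.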

"~N from all users" appears to be wrong[edit]

I've been testing on my own account. Under the "IPs & User agents" tab, it says "[8 edits] (~296 from all users)". First, I think it should indicate the number of all logged actions, not just edits. For instance if they've made 0 edits, I am misled into thinking activity occurs only for other users on that IP, when in fact the "~296 from all users" is just the single user I looked up.

Now, for the actual bug: the 296 in my example seems to be wildly incorrect. No one has edited/logged in using my IP except me and my bot, and I have only made ~11 actions in the last week. The number should match the number of items on the Timeline tab if I were to investigate the IP, I think. MusikAnimal talk 01:13, 16 October 2020 (UTC)Reply

The count appears to be correct for me - it's counting the edits made by myself and by my bot, and while I agree that it should be counting log entries (it completely misses m:Special:CentralAuth/ST47ProxyBot) the "[X edits]" counts for each row do add up to the total "(~2723 from all users)" exactly. Are you counting your bot? ST47 (talk) 21:54, 16 October 2020 (UTC)Reply
In fact, they should be exact - this isn't an "EXPLAIN SELECT" like Special:CheckUser, they're actually running a SELECT COUNT(*) for each IP (thank christ they're caching it, but still, imagine a user on T-Mobile who has used 100+ IPs over the last 90 days, that's a lot of queries). ST47 (talk) 21:58, 16 October 2020 (UTC)Reply
There is a single edit made using my bot under my IP in this time range, but that is all. The results I'm seeing suggest 297 actions happened on that IP, and if I run a check on that IP, neither the old "Get edits" nor the "Timeline" in Special:Investigate shows anywhere near this number. In the old CU, the number I see matches the number of results under "Get edits" exactly, which is what I would expect. MusikAnimal talk 23:49, 16 October 2020 (UTC)Reply
Bear in mind that the "(~n from all users)" is not limited by the Duration parameter. It will get actions going back 90 days. "[n edits]" is limited by the Duration parameter. DWalden (WMF) (talk) 09:15, 19 October 2020 (UTC)Reply
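
The difference between the two windows described above can be sketched like this. It is a toy model only: the (ip, user, timestamp) record layout, the function name, and the `RETENTION` constant are assumptions made for illustration, and the real counts come from database queries rather than in-memory lists.

```python
from datetime import datetime, timedelta

# Assumed 90-day lookback, as stated in the comment above.
RETENTION = timedelta(days=90)


def count_actions(actions, ip, now, duration=None, user=None):
    """Count actions on `ip` that fall inside the lookback window.

    `duration=None` means the full retention window; `user=None` means
    all users. `actions` holds (ip, user, timestamp) tuples (hypothetical).
    """
    window = duration if duration is not None else RETENTION
    cutoff = now - window
    return sum(
        1
        for a_ip, a_user, ts in actions
        if a_ip == ip and ts >= cutoff and (user is None or a_user == user)
    )
```

Under this model, "[n edits]" corresponds to a call with both `duration` and `user` set, while "(~n from all users)" corresponds to a call with neither, so the second number can be much larger than a one-week check would suggest.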

I want to add that I get lots of "[8 edits] (~8 from all users)" (the two numbers being the same) in my CU while in the old code, when the two are the same, it doesn't show them which makes it much easier to spot IPs that are being used by more than one user. Can this follow the old code and omit showing "(~N from all users)" when the number is the same? Amir (talk) 15:27, 19 October 2020 (UTC)Reply
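
The display rule requested above could look roughly like this. The function name is hypothetical and the sketch only mirrors the label text quoted in this thread, not the actual message strings in the extension.

```python
def format_counts(user_edits, all_user_actions):
    """Build the count label, omitting the all-users part when it adds nothing."""
    noun = "edit" if user_edits == 1 else "edits"
    label = f"[{user_edits} {noun}]"
    # Only show the second number when it differs, so shared IPs stand out.
    if all_user_actions != user_edits:
        label += f" (~{all_user_actions} from all users)"
    return label
```

Rows where only the first number appears would then be the ones used by a single account within the compared range.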

WMFTimeoutException

With the advent of being able to enter multiple IPs/users, I suppose this is always a possibility. In my first test, I checked myself, my staff account and my bot on enwiki, and only for the last week. Even with my bot included, that amounts to only a few hundred actions, so I'm a little surprised it timed out. Next I checked only my bot account, and it still timed out :( My bot is hosted on Toolforge where the IP changes a lot, so I suspect Special:Investigate is running a bunch of queries on each of those IPs, and that's what's slowing it down. But in this case there were only 35 different IPs in the last week. I imagine this scenario could easily happen when checking wide IP ranges, too. I would suggest maybe doing something like Special:CheckUser does where it first runs an EXPLAIN to estimate how expensive it's going to be, and if it's too much it shows limited info. I certainly should be able to at least get the list of IPs/UAs, and if I have to start a new investigation for each IP one by one (like Special:CheckUser), that's OK.

Also, the issue I pointed out above might be related, where Special:Investigate is reporting a much higher number of actions for an IP than there actually are. Perhaps Special:Investigate is doing more scanning than it needs to.

If it helps your team with testing, I consent to you running checks on MusikBot. They're all Toolforge IPs anyway, which are publicly known. (actually a single action from my IP is in there too, but I don't mind you seeing it :) MusikAnimal talk 01:42, 16 October 2020 (UTC)Reply
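
The "estimate first, then fall back to limited info" idea suggested above could be sketched as follows. Everything here is hypothetical: `estimate_rows` and `fetch_details` stand in for whatever cost estimate (for example an EXPLAIN-based row estimate) and per-IP query the extension would actually use, and `row_budget` is an arbitrary threshold.

```python
def investigate(ips, estimate_rows, fetch_details, row_budget=5000):
    """Return per-IP details when the estimate is cheap, else only the IP list.

    `estimate_rows(ip)` returns an estimated row count for that IP;
    `fetch_details(ip)` runs the expensive per-IP lookup. Both are
    caller-supplied placeholders.
    """
    ips = list(ips)
    estimated = sum(estimate_rows(ip) for ip in ips)
    if estimated > row_budget:
        # Too expensive: skip the per-IP queries and report limited info.
        return {"limited": True, "ips": ips, "details": None}
    return {
        "limited": False,
        "ips": ips,
        "details": {ip: fetch_details(ip) for ip in ips},
    }
```

In the limited case the checkuser would still get the list of IPs and could start a new investigation for each one, which matches the fallback described above.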

@MusikAnimal: Thanks for this report! We aren't quite sure why it's timing out. It certainly isn't supposed to. We will look into this and get back to you soon. Sadly, we are not allowed to run tests on enwiki itself. Does MusikBot run on testwiki too? -- NKohli (WMF) (talk) 22:14, 29 October 2020 (UTC)Reply
It has in the past, but there's no recent data. I suspect you'll run into this issue with any Toolforge bot that edits regularly. You could try setting up a test bot that edits in the bot's userspace, or something. I just ran another test (on enwiki) and noted the exception ID is 3066779a-7880-4ec4-977c-392be0e1d2d5. From Logstash it looks like the timeout happened in MediaWiki\CheckUser\CompareService->getTotalEditsFromIp(). MusikAnimal talk 17:13, 30 October 2020 (UTC)Reply

More aggregation

Currently, it aggregates pretty loosely, producing lots and lots of rows that I'm not much interested in, especially when I want a summary report for the checkuser wiki. It would be great to have an option to give a much more summarized version of the table. Thank you! Amir (talk) 20:23, 21 October 2020 (UTC)Reply

Accounts without edits or actions are not displayed on "IPs & User agents"

I don't know if this has been reported before, but accounts that don't have any edits or filter log entries (and have only an account creation log entry) do not appear on the "IPs & User agents" tab; they only appear on the "Timeline". I think they should appear in both. Rafael (stanglavine) msg 17:31, 27 October 2020 (UTC)Reply

I mentioned this above but it deserves its own section. This is important for stewards who use loginwiki to run checks, since no one has any edits there. MusikAnimal talk 02:42, 28 October 2020 (UTC)Reply

Tour message too WMF specific

Hello. Unless these are going to be phased out in the near future, I suggest that we modify MediaWiki:Checkuser-investigate-tour-copywikitext-desc and related messages in order to remove jargon specific to WMF such as CUWiki which surely does not exist for all other MediaWiki installs elsewhere. If needed, we can have a custom message and override using the WikimediaMessages extension. Thanks. —MarcoAurelio (talk) 15:48, 18 November 2020 (UTC)Reply

Some i18n issues

task T268379, task T268380. —MarcoAurelio (talk) 12:16, 21 November 2020 (UTC)Reply

The first task is a dupe of task T41013 created by me in 2012. It looks like the schema change patches are there already for review. Given that they're pre-Investigate, I assume we need to make sure they'd work with the new system. The last task is now resolved thanks to Ammarpad. —MarcoAurelio (talk) 10:49, 23 November 2020 (UTC)Reply

Wording tweak

When the tour says "don't worry, we'll create a separate checkuser log for each user" or something like that, I imagine the author meant to say "we'll create a separate checkuser log entry". But I like the interface otherwise, very professional. Enterprisey (talk) 12:11, 21 December 2021 (UTC)Reply