Talk:User account types

Are temp users unregistered/anonymous?
@Matma Rex, @Whatamidoing (WMF), @RHo (WMF): I've been looking into this terminology since Data Engineering is discussing how to capture the new account type in their schemas and, specifically, whether a field named  should be true or false for temp users. This is in a table that is meant to be made public on Cloud Services in the future, so it seems important to have the most consistent terminology possible.

Initially, I thought that it should be false, since to me "anonymous" is a synonym of "unregistered" and, from a MediaWiki-internal point of view, temp accounts will be registered in the user table. This table supports that view.

However, after looking at how people are using the terms now, I think the vast majority of folks would choose to define "anonymous" (and "unregistered" and "logged-out") to mean either temp or IP. Here are some points:


 * The main Meta page uses both "unregistered" and "non-logged-in" to mean temp or IP ("IP masking hides the IP addresses of unregistered editors", "In the future, before a non-logged-in user completes an edit, they will be informed that their edits will be attributed to a temporary account.", etc.)
 * When I look at the discussion page on Meta, all the people using "anonymous" use it in a way that includes temp accounts ("Should IP Masking [be] a veil for anonymous users, but still transparent for patrollors?", "anonymous users using temporary accounts can never use the very same temporary account when on multiple devices").
 * Comparing the temp and IP columns in your very helpful table, their capabilities are very similar to each other and very distinct from registered users, so they naturally form a group.

I realize this does not totally mesh with the in-code terminology. For example, from :


 * User::isRegistered will return true for all registered accounts, including temporary accounts.
 * User::isAnon will return false for such temporary accounts.
 * User::isTemp will return true exclusively for temporary accounts.
 * User::isNamed will return true exclusively for registered accounts that are not temporary accounts.

According to this, only IP accounts are anonymous. However, we are also deviating from the in-code terminology when we say that temp users are not registered, so I think we have already lost perfection in terminology.
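The four method descriptions quoted above can be sketched as a small truth table. This is only an illustrative Python model, not MediaWiki code: `AccountType`, `UserModel`, and `truth_table` are hypothetical names, and the return values simply restate the quoted descriptions.

```python
from dataclasses import dataclass
from enum import Enum


class AccountType(Enum):
    IP = "ip"
    TEMP = "temp"
    NAMED = "named"


@dataclass(frozen=True)
class UserModel:
    """Hypothetical stand-in for a MediaWiki user; not the real User class."""

    account_type: AccountType

    def is_registered(self) -> bool:
        # True for all registered accounts, including temporary accounts.
        return self.account_type in (AccountType.TEMP, AccountType.NAMED)

    def is_anon(self) -> bool:
        # False for temporary accounts, so only IP users are "anon" in code.
        return self.account_type is AccountType.IP

    def is_temp(self) -> bool:
        # True exclusively for temporary accounts.
        return self.account_type is AccountType.TEMP

    def is_named(self) -> bool:
        # True exclusively for registered accounts that are not temporary.
        return self.account_type is AccountType.NAMED


def truth_table() -> dict:
    """Map each account type to (is_registered, is_anon, is_temp, is_named)."""
    table = {}
    for account_type in AccountType:
        user = UserModel(account_type)
        table[account_type.value] = (
            user.is_registered(),
            user.is_anon(),
            user.is_temp(),
            user.is_named(),
        )
    return table
```

In this model the temp row is the only one where "registered" in code and "unregistered" in everyday usage disagree.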

So, based on this, I'm inclined to update the table to make "anonymous", "unregistered", and "logged-out" an umbrella term for IP and temp users and then add information about the cases where this doesn't match the User methods. What do y'all think?

cc @MKampurath (WMF), @Milimetric (WMF). Neil Shah-Quinn (WMF) (talk) 01:27, 18 May 2023 (UTC)


 * I agree – I think the term "unregistered" applies to both IP and temp users. I wrote this page before people started really talking about them much, so it didn't match real world usage.
 * We should document it that way despite the problem you noted that in the code, we treat temp users as a "subtype" of registered users; that's just a quirk for reasons of backwards-compatibility.
 * I would avoid the term "anonymous" when possible. It was widely used to describe IP users in the past, even though they were less anonymous than the registered users in some ways. Temp users are more anonymous now, but you try explaining that. I expect that folks will use it for both IP and temp users despite my advice ;), so we should document it on this page, but I would rather not use it in any APIs, databases, etc.
 * The term "logged-out" is a bit weird, since you can log out of your temporary account. But it's a synonym of "unregistered" in practice, and there are people who have registered, but just edit logged-out, and I don't think we have a better term for this. Matma Rex (talk) 12:37, 18 May 2023 (UTC)
 * @Neil Shah-Quinn (WMF), by, say, February 2024, I hope that all three account types will be in use on WMF wikis. Is there a chance that you will want to use this to compare and contrast all three?  If so, we could consider replacing   entirely, with   and  .  Using "anonymous" to describe an editor whose real-world location may be readily discernible has long been disputed by English-speaking volunteers.  Perhaps we should retire that imprecise and misleading language entirely.
 * (@Danilo.mac, it's possible that this discussion would be interesting to you, since it touches on how some account data is stored.) Whatamidoing (WMF) (talk) 19:55, 19 May 2023 (UTC)
 * From my point of view (as a volunteer), when a user uses a username that they chose at registration, they have a name, which we can trust they will always use if they are a good-faith contributor. When a user is identified by an IP or a number they didn't choose (temp user), they don't have a name; they are anonymous. Even if they are a good-faith contributor, we cannot trust that all of that user's contributions are aggregated on the same contributions page, or that all the warnings the user received are on the same talk page. We cannot be sure of everything that user did because they don't have a fixed name. That is what I think about the term anonymous. For me, it would be confusing not to call a temp user anonymous. I didn't know other people had other meanings for that term. Danilo.mac (talk) 23:15, 19 May 2023 (UTC)
 * To add to my opinion, about how the data is stored: I don't like the idea of storing temp users in the user table. I have tools that use  to get the registered users (users who filled in the registration form and have fixed usernames). It would make more sense to me if actor_user were NULL for temp users, if we used   to get only temp users, and if there were a separate table to store temp user data. But that is the opinion of someone who only develops tools on Toolforge and does not contribute directly to the MediaWiki code, and I haven't read the Phab tasks, so I may be missing some details that make that idea impossible. Danilo.mac (talk) 01:04, 20 May 2023 (UTC)
 * In case this helps - we are adding a  column to the user table which you could use to differentiate between temp users and other (registered) users. Here's a ticket that explains this work: https://phabricator.wikimedia.org/T333223
 * Would that help with your tools? -- NKohli (WMF) (talk) 13:05, 21 May 2023 (UTC)
 * +1 to what @Matma Rex and @Whatamidoing (WMF) said. I have no attachments to "non-logged-in" terminology - I think we could replace it by "unregistered" on the project pages. And I would like us to refrain from using "anonymous" for the reasons MatmaRex mentioned. I like the suggestion to replace  with   and  . NKohli (WMF) (talk) 12:59, 21 May 2023 (UTC)
 * Thanks everyone for the input! It sounds like everyone agrees that temp users should be considered a type of unregistered/logged-out/anonymous users. I've updated the table to reflect this and to hopefully make it a bit easier to read.
 * I get what y'all are saying about the problems with the term "anonymous". I have heard about these problems before and, if I had been in charge of naming, I probably would have chosen  for that reason. I'm not sure if it's worth deprecating the existing field, but definitely when we add fields to make sure we can distinguish all three types of users, we will use the preferred terminology (unregistered, IP, temp, registered). This page makes it a lot easier to do that! Neil Shah-Quinn (WMF) (talk) 20:01, 22 May 2023 (UTC)
 * I've also just updated the table to document the behavior of the four User functions from MediaWiki. Neil Shah-Quinn (WMF) (talk) 20:09, 22 May 2023 (UTC)
 * Seeing as temp users are registered users, I think it would be incorrect to call them 'unregistered'. However, perhaps changing the definition of 'anonymous' could be worthwhile. Anonymous could include both 'unregistered' and 'IP/temp' user.
 * This would be simple from a data perspective, since there is no concept of 'anonymous' in MW data models, just 'registered' and 'not registered'. Historically, anonymous == not registered, but perhaps the code should be changed so that User::isAnon returns true for temp accounts. Ottomata (talk) 21:08, 22 May 2023 (UTC)
 * See also T336176 - MediaWiki user types Ottomata (talk) 21:10, 22 May 2023 (UTC)
 * @Ottomata as I mentioned in my first post, I came into this discussion with the same idea, that temp users should be considered registered because they have rows in the  table and because   returns true for them.
 * But @Milimetric (WMF) and @MKampurath (WMF) had a different idea: that temp users should be considered unregistered, because from a user point of view they have almost everything in common with IP users and almost nothing in common with traditional registered users (as @Danilo.mac said above). Also, from a semantic point of view, they have not, in fact, gone through the registration process!
 * Eventually, I came around to their point of view and came here to see if others agreed as well. @NKohli (WMF) and @Matma Rex did, which is why I changed the page to reflect it.
 * From this perspective, the rows in the  table are an irrelevant implementation detail and   returning true is just a "quirk for reasons of backward compatibility", as Matma Rex said.
 * Overall, I think it makes sense to declare that temp users are unregistered; to make code, documentation, and data reflect this wherever possible; and to document it as a historical quirk whenever that isn't possible. Neil Shah-Quinn (WMF) (talk) 00:32, 23 May 2023 (UTC)
 * Aye, I think your conclusion of what the term should mean makes sense. But, having the same term mean opposite things will end up confusing a lot of people. It might be worth gathering support from more MediaWiki core devs, and deciding and documenting this decision there, rather than just making it externally.  T336176 - MediaWiki user types might be a good place to do that.
 * We just finalized the page_change schema and stream, which will have   and   for Temp users, in data that we use to generate new dumps, generate search indexes,  and in event streams we will publish publicly. Ottomata (talk) 00:43, 23 May 2023 (UTC)
 * @Ottomata I see what you're saying and I agree that it's important that everyone be on the same page here.
 * The user type ticket has a lot of other issues mixed in, so I wouldn't go there. It seems like the discussion is already flowing here, so how about we attract some more participation here? Post on the Phab tickets you mentioned, ping the relevant Foundation teams, maybe even post on the wikitech list? Neil Shah-Quinn (WMF) (talk) 00:51, 23 May 2023 (UTC)
 * Accountless? :) Elitre (WMF) (talk) 09:40, 23 May 2023 (UTC)
 * +1 2603:7000:8B07:3111:44A9:9100:9E1C:6B7B 10:29, 23 May 2023 (UTC)
 * Oops, hah that was me replying 'anonymously' from my phone. :p Ottomata (talk) 11:53, 23 May 2023 (UTC)
 * FYI, we are going to remove is_registered from the page change event schema. Ottomata (talk) 16:08, 24 May 2023 (UTC)
 * If  and   cannot be changed for backward-compatibility reasons, what about deprecating them? A deprecation process would also eventually break backward-compatibility, but this breakage would result in PHP errors instead of silently subtly different behavior, and people using them would be warned now (if they use a capable source code editor) but hit by the breakage only later. —Tacsipacsi (talk) 13:33, 23 May 2023 (UTC)
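The difference between deprecating a method and silently changing its meaning can be sketched in Python. This is an analogy only, not the MediaWiki deprecation mechanism: `is_registered` and `is_named` are hypothetical helper functions, and the "non-zero user ID" rule is the one mentioned elsewhere in this thread.

```python
import warnings


def is_named(user_id: int, is_temp: bool) -> bool:
    """Hypothetical replacement: a user-table row that is not a temp account."""
    return user_id > 0 and not is_temp


def is_registered(user_id: int) -> bool:
    """Hypothetical deprecated helper that keeps its old meaning until removal."""
    warnings.warn(
        "is_registered() is deprecated; use is_named() or a temp-account check",
        DeprecationWarning,
        stacklevel=2,
    )
    # Old semantics preserved: any user with a non-zero ID counts as registered,
    # so existing callers get a warning now rather than different results.
    return user_id > 0
```

Callers are told about the problem while the return value stays the same; the breakage only comes later, when the deprecated helper is removed.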
 * > what about deprecating them?
 * I think this is a great idea. This should probably be something that the IP Masking project drives.  @NKohli (WMF)?
 * I was just considering removing the  boolean from the page change event altogether.  One could still check for user_id > 0 to get the same semantics. Ottomata (talk) 14:03, 23 May 2023 (UTC)
 * I agree with deprecating if possible. My suggestion would be to deprecate and keep:
 * would be true only when the IP of the editor is revealed
 * would be true only when the user account is temporary (the  flag is set in the user table)
 * With these two methods we could look at any revision past, present, or future, and have a very clear way to tell what type of user is involved. (I'm ignoring other stuff like  for the purpose of this conversation) Milimetric (WMF) (talk) 15:20, 23 May 2023 (UTC)
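A minimal sketch of how two such checks would be enough to tell the three user types apart. The parameter names `ip_visible` and `is_temp` are hypothetical placeholders for the two methods proposed above, not real MediaWiki identifiers.

```python
def classify_user(ip_visible: bool, is_temp: bool) -> str:
    """Derive the user type from two mutually exclusive flags.

    ip_visible: the editor's IP address is what appears in the revision history.
    is_temp:    the edit is attributed to a temporary account.
    """
    if ip_visible and is_temp:
        # The two flags describe different user types, so both cannot hold.
        raise ValueError("a user cannot be both an IP user and a temp user")
    if ip_visible:
        return "ip"
    if is_temp:
        return "temp"
    # Neither flag set: a traditional registered account.
    return "registered"
```

For example, `classify_user(False, True)` gives `"temp"`, and a revision with neither flag set is attributed to a registered user.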
 * "Revealing" an editor's IP is new jargon for a non-CheckUser finding out what IPs have been used by a given temp account, so please re-word that first bullet point before it gets recorded anywhere official/permanent. Whatamidoing (WMF) (talk) 18:12, 23 May 2023 (UTC)
 * @Whatamidoing (WMF): When a user is editing without logging in, currently, their IP is recorded in the revision history and is visible on the "View history" tab. Is there better terminology to refer to that specifically? Milimetric (WMF) (talk) 18:46, 1 June 2023 (UTC)
 * Mentioned this to @Daniel Kinzler (WMDE) and he reminded me that MediaWiki is not getting rid of IP users, only WMF is. We can't deprecate or change the semantics of User::isRegistered, as it will continue to be used outside of WMF. Ottomata (talk) 19:04, 23 May 2023 (UTC)
 * Why couldn’t we? According to the table,, so wherever currently   is used,   can be used instead. I assume   and   will be available on any MediaWiki install, regardless of whether IP masking is enabled or not, simply   will always be false if it’s not enabled. —Tacsipacsi (talk) 19:46, 23 May 2023 (UTC)
 * Indeed. Another alternative for  is , which would probably be the best in code that wants to check "this user has a row in the user table" (which I'd argue is the real meaning of  ). Matma Rex (talk) 12:36, 24 May 2023 (UTC)
 * I agree that third party use of  shouldn't affect deprecation plans.  There's a proper replacement that makes dealing with the new concepts (temp users) easier.  If your MW instance doesn't intend on enabling temp users, the new abstractions still help you as you install extensions and operate in the ecosystem. Milimetric (WMF) (talk) 18:52, 1 June 2023 (UTC)
 * Instead of deprecating, could we just change the semantics of User::isRegistered? If a MediaWiki install does not use Temp users, nothing will change.  But, if temp users are enabled, then
 * ? Seems like that would be backwards compatible?  @NKohli (WMF)? Ottomata (talk) 19:07, 23 May 2023 (UTC)
 * No, changing the semantics is exactly the subtle yet backward-incompatible change we want to avoid. Code may make all sorts of assumptions about  and   that no longer hold, for example that these users have no entries in the user table, or that their names are parseable as IPv4 or IPv6 addresses. —Tacsipacsi (talk) 19:46, 23 May 2023 (UTC)
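One of the assumptions mentioned above, that an anonymous user's name parses as an IP address, can be illustrated with a short Python sketch. `ip_version_of_anon` is a hypothetical legacy-style helper, and the temp-style username in the usage note is a made-up placeholder, not the real temp account name format.

```python
import ipaddress
from typing import Optional


def ip_version_of_anon(username: str, is_anon: bool) -> Optional[int]:
    """Legacy-style code that assumes every anonymous username is an IP address."""
    if not is_anon:
        return None
    # Raises ValueError as soon as "anonymous" starts to include users
    # whose names are not IP addresses.
    return ipaddress.ip_address(username).version
```

With IP users this works: `ip_version_of_anon("192.0.2.1", True)` returns 4. If the meaning of "anon" were widened to cover temp users, a call such as `ip_version_of_anon("temp-example-1", True)` would raise `ValueError` even though the calling code itself never changed.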
 * Yeah you are right. I think this is why it might be hard to deprecate too though.  Of course it is possible, but my understanding from Daniel is that IP masking explicitly needed temp users to be 'registered' users, so that a bunch of existing code and tools (Talk pages, others?) didn't have to be changed.
 * But, you make a good point. @Neil Shah-Quinn (WMF) What about just using the existing terms isNamed vs isTemp, and avoiding all references to isRegistered/Anonymous? Ottomata (talk) 12:24, 24 May 2023 (UTC)
 * Honestly, I don't think that would be a great outcome. I'm really interested in making the general terminology (as opposed to the very granular in-code terminology) here as simple and accessible as possible, and adding a totally new term goes against that. As much as possible, I want these terms to be used consistently in high-level metrics, datasets, documentation, the interface, user conversations, and so on, and it's much, much easier to have a few developers change their definition of "registered" than to have a huge group of users replace a familiar term ("registered") with an unfamiliar one ("named").
 * I know you're under time pressure here because you're working on, but I'm willing to try to drive this conversation to a broader consensus, and I hope that can be done in no more than a week. Neil Shah-Quinn (WMF) (talk) 19:53, 24 May 2023 (UTC)
 * FWIW, we are going to remove is_registered from mediawiki/page/change to avoid this issue for now, so from my perspective we have more time. Ottomata (talk) 20:15, 24 May 2023 (UTC)
 * All makes sense. IMO a worse outcome is different conceptions of a singular concept.
 * You might try to keep mediawiki data out of high level metrics and documentation, but if dataset/metric pipeline developers have to add code that does, I think you are going to have a hard time ensuring that this is true. Ottomata (talk) 20:18, 24 May 2023 (UTC)
 * Changing the semantics of a method like User::isRegistered is really problematic - if anything, we should deprecate the method, and replace it with something that has a better name and clearer semantics. However, it should be noted that this method is used a **lot** - codesearch finds 390 files that contain references to it in production across core and various extensions. This is work that will have to be explicitly resourced.
 * That said, I think "registered" is the smaller problem. The use of "anon(ymous)" is much harder. The User::isAnon method isn't used as much as User::isRegistered, but the term anon is used in public APIs. Changing public APIs is much harder and slower than deprecating PHP methods.
 * Beyond the names, there are features that currently rely on making distinctions based on whether the user ID is 0 or not - e.g. the "registered user" filter on Special:RecentChanges. This would have to be re-implemented to work with the new logic (which may require changes to the database structure to allow for efficient filtering). Do we somewhere have a list of features that would need to be re-implemented to account for the changed meaning of registered/anonymous? DKinzler (WMF) (talk) 13:30, 26 May 2023 (UTC)
 * I agree. And changing the API can cause bots and external tools to malfunction. I don't know how far the removal of the "anon" terminology would go, but we have many uses beyond the  and API. The database has a   column, where temp users would be anon. JavaScript has , which is used in many gadgets and scripts. There is the MediaWiki configuration variable  , and extension configuration variables like   and  . And many documentation pages say "anonymous users" to mean those users who didn't fill in the registration form. Danilo.mac (talk) 14:19, 28 May 2023 (UTC)
 * The big question is - what will break more things, treating "temp" users as "anon" in the API, or not treating "temp" users as "anon" in the API? We will have to pick one, but neither option seems to be particularly appealing. DKinzler (WMF) (talk) 18:27, 28 May 2023 (UTC)
 * The tools that use the API to get "anon" users usually want to get the logged-out users, so they probably will not break if temp == anon. And the same with js gadgets and scripts that use . Danilo.mac (talk) 13:28, 30 May 2023 (UTC)
 * Yeah, similar to API users, I (and other data folks) are guessing downstream datasets will want to keep the umbrella "anonymous" abstraction on top of both IP and Temp users. But internally we're distinguishing temp users with the new   column (thank you for that).  Because in the beginning, users of wikistats or other dashboards won't have any use for the distinction.  Then we'll study the new data coming in, and if it makes sense to pass on the distinction, we'll update all downstream pipelines. Milimetric (WMF) (talk) 19:09, 1 June 2023 (UTC)
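The umbrella approach described above, where pipelines keep the three-way distinction internally but downstream dashboards only see "anonymous" versus "registered", might look roughly like this. The labels and the `edits_by_umbrella` helper are hypothetical and are not taken from any existing pipeline.

```python
from collections import Counter
from typing import Iterable

# Internal user types are kept distinct; downstream consumers see the umbrella.
UMBRELLA = {
    "ip": "anonymous",
    "temp": "anonymous",
    "registered": "registered",
}


def edits_by_umbrella(edit_user_types: Iterable[str]) -> Counter:
    """Count edits per umbrella category, given the internal type of each edit."""
    return Counter(UMBRELLA[user_type] for user_type in edit_user_types)
```

If the distinction later proves useful downstream, only the mapping needs to change, since the internal data already separates IP and temp users.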