Talk:Edit check

So glad to see!
@ESanders (WMF), @PPelberg (WMF), I'm very glad to see that this project exists! If implemented well, I think it will have a strong positive impact on the new editor experience. Please let me (and others!) know when you're looking for community feedback, and I'll be happy to share thoughts on specific ideas/prototypes! Cheers, {{u|Sdkb}} talk 18:29, 24 January 2023 (UTC)


 * hi @Sdkb! I'm glad that you're glad to see the Editing Team prioritizing work on this ^ _ ^
 * Please let me (and others!) know when you're looking for community feedback, and I'll be happy to share thoughts on specific ideas/prototypes!
 * Thank you for offering as much; we're going to need it!
 * In fact, since we're here, I wonder: can you think of specific Senior Contributors or groups of Senior Contributors that you think we ought to prioritize reaching out to first? I've added a bit more context below about the project and what we're currently looking to learn at this stage.
 * All of the above aside, thank you for dropping by! You doing so was the reminder I needed to prioritize posting a status update about this work which I'll do on the project page before this week is over.
 * Seeking Input from Seniors
 * Okay, this is the "context" I referred to above...
 * In the coming weeks, I expect us to be ready to share an initial idea for the user experience of the first "check" we're prioritizing work on: a check that will present people who are attempting to add new content without a corresponding reference with a call to action to add one. This design work is happening in T325711.
 * While we'll be keen to hear what Senior Contributors think of the user experience we're proposing to present to newcomers and Junior Contributors, we're thinking some of the most valuable things we have to learn from y'all in this moment are things like:
 * "In what – if any – ways do Senior Contributors think the proposed mobile UX flow could be adjusted/augmented to increase the ease and efficacy with which they can review edits people will be making with Edit Check?"
 * "Which parts of the feedback system Edit Check is proposing would Senior Contributors value being able to configure on a per-project basis?" [i][ii]
 * i. This "configuration" bit is particularly top-of-mind for the team as we're seeing Edit Check as having two distinct, and complementary, dimensions: 1) the user experience newcomers and Junior Contributors will see and 2) the configuration tools/options Senior Contributors will have access to that will enable them to customize this user experience so that it guides people to take actions that align with project policies and conventions.
 * ii. For more details about how we're thinking about consulting with experienced volunteers, please see T327330. PPelberg (WMF) (talk) 22:33, 24 January 2023 (UTC)
 * ...thank you for dropping by! You doing so was the reminder I needed to prioritize posting a status update about this work which I'll do on the project page before this week is over.
 * ✅: Edit Check Update: 27 January 2023 PPelberg (WMF) (talk) 19:32, 27 January 2023 (UTC)
 * @PPelberg (WMF), thanks for sharing all that! I reviewed the project page, phabricator tickets, and Figma mockups for desktop and mobile. I responded on Phabricator to some of your prompts there.
 * I think you're on the right track with who to reach out to — w:WT:Teahouse and w:WT:Growth Team features are good options, and I'd add somewhere more general like w:WP:VPI to bring in some folks who might have good advice but who we might not have thought to ask.
 * Regarding your questions:
 * If I'm understanding this right, you're looking to make sure that the mobile implementation of EditCheck doesn't make mobile editing annoying for experienced editors. Well, when I need to make an edit on mobile, my workflow (and that of most other experienced editors I know) is to just switch to desktop mode, since the mobile mode is so deeply lacking in functionality it's preferable to suffer through the tiny fonts/buttons of desktop on mobile. I know has been interested in mobile editing, so they might be able to offer more.
 * There are lots of things we might want to configure. We care more about referencing for certain types of articles, such as those on living people and those on medical topics, so we'd likely want a more aggressive implementation for those than others. Of note for you all, there's currently a tag for new users adding unreferenced material to a BLP. Regarding this comment you made on Phabricator (Iterate on the "Citoid" edit card to include some kind of guidance that prompts people to reflect on whether the source they're considering adding is likely to be one other volunteers deem acceptable), we'll absolutely want control over the source list there, so that we can modify it as RSP changes, and ideally we'll want to be able to provide context/specific conditions for (sometimes-)unreliable sources, as it's far from just a binary reliable/unreliable switch. And we'll want to be able to tweak all the guidance text that shows up at the different steps. Going more broadly, I think the true wiki way would be to allow us to come up with custom filters (just as we currently do abuse tags) and provide custom feedback. This would probably mean leaving out some of the more advanced functionality, but I think the community would be able to come up with many good uses that'd be more effective than our current approach when we notice a persistent error of trying to expand guidance (leading to creep).
 * One other thought (for now): Many editors who are contributing uncited material are providing original research, so we'll want to identify and provide guidance to them. The Upload Wizard comes to mind for its approach to a similar problem. When you're contributing a local file, it asks you whether it's a free work, in a fair use category, or doesn't fit either of the categories above. If you choose the last option, it comes back with Please don't upload it as it's very likely a copyright violation. For here, the question we want to make folks adding information answer is "How do you know this?" If it's from a particular source, we of course want the source. But if not, we want to give editors a way to express that by saying "it's something I know from my personal experience". And we want to then explain to them why that's original research and why they need to either find a source for it, or if it's private information that doesn't have a public source, why they shouldn't contribute it.
 * Cheers, {{u|Sdkb}} talk 06:47, 28 January 2023 (UTC)
 * First off, thank you for investing the time and energy into thinking about these prompts and making this thinking easy for us to understand and engage with, @Sdkb!
 * I'm going to respond to the specific points you raised in discrete comments so that we can more easily explore the distinct points you are raising here.
 * Before that, thank you for being patient with me.
 * Note: I've also seen the comments you posted on T327330 (thank you!); I'll respond there directly. PPelberg (WMF) (talk) 00:57, 11 February 2023 (UTC)
 * I think you're on the right track with who to reach out to — w:WT:Teahouse and w:WT:Growth Team features are good options, and I'd add somewhere more general like w:WP:VPI to bring in some folks who might have good advice but who we might not have thought to ask.
 * Excellent. You affirming the instincts we had leads me to think something like, "Okay, we're on the right track." PPelberg (WMF) (talk) 00:58, 11 February 2023 (UTC)
 * If I'm understanding this right, you're looking to make sure that the mobile implementation of EditCheck doesn't make mobile editing annoying for experienced editors.
 * While we're open to hearing input on the user experience, at this stage, we're primarily seeking to learn things like the following from experienced volunteers:
 * How can you envision this general approach going wrong? What fundamental assumptions/constraints might this project be at risk of "running up against"?
 * What logic/heuristic do experienced volunteers think ought to cause the reference check to get triggered? Thank you for responding to this in T327330#8566397; I'm going to follow up with you there.
 * What kinds of editors do experienced volunteers think the reference check should be enabled for? E.g. All logged out editors? All logged out editors *and* editors who have made fewer than (insert number here) cumulative edits? Etc.
 * How might experienced volunteers value being able to evaluate/audit how Edit Check is performing and the impact it is having so that they can make improvements to it?
 * What other kinds of checks can experienced volunteers imagine being useful? I see you shared a collection of wonderful ideas in T327330#8581224 (thank you!). Similar to "2.", I'm going to respond in Phabricator.
 * Note: we do NOT anticipate the in-editor Edit Check user experience to disrupt experienced editors because we:
 * Assume experienced editors are unlikely to use the visual editor on mobile to edit. See data
 * Assume experienced editors are likely to intuitively add references when they are adding new content to the wikis and thus, be unlikely to trigger the reference check
 * Think we might start by only enabling Edit Check for people who are logged out and have made fewer than, let's say, 500 edits or something like that. Exact number TBD. Although, writing this out led me to realize we need a task to hold ourselves accountable for deciding this: T329340.
 * PPelberg (WMF) (talk) 01:01, 11 February 2023 (UTC)
 * Responding to the numbered questions:
 * I think there is some risk for false positives. E.g. if the tool tells people to add citations for the plot section of a film, that'll violate w:MOS:PLOTCITE. There is also some risk for false negatives. This could happen if newcomers become used to the tool, and then start expecting it to let them know whenever they need to add a reference. Then, when the tool misses an instance, they might falsely think that they don't need to add anything because edit check didn't flag it. More generally, I think the difficulty of this project is that it's trying to communicate Wikipedia's policies/guidelines succinctly at points when newcomers need them, but those guidelines are both inherently complex and not designed to be machine-readable. Because of that, identifying when they come into play will be a challenge.
 * (See Phabricator discussion)
 * We always want even the newest of newcomers to be incentivized to create an account, so I would not want to see it enabled only for logged-out editors. Once the tool is mature, I could see it being enabled for everyone if it's good enough about not having false positives — but of course there would be an option in the settings for anyone to turn it off. For the beta stage, I think something like fewer than 15 days or 100 edits might capture most of the target user base. I think you should also make it an opt-in beta feature, though, since some experienced editors like myself are going to want to try it out to provide feedback.
 * This could be tricky, as our normal way of checking things is to look at edits, but this tool of course intervenes before an edit is made. I think trying it out for ourselves will be very helpful, as we'll be able to see how accurate its suggestions are.
 * (See Phabricator discussion)
 * Cheers, {{u|Sdkb}} talk 18:42, 11 February 2023 (UTC)
 * Well, when I need to make an edit on mobile, my workflow (and that of most other experienced editors I know) is to just switch to desktop mode, since the mobile mode is so deeply lacking in functionality it's preferable to suffer through the tiny fonts/buttons of desktop on mobile.
 * Understood. I appreciate you naming the differences in what functionality is and is not available within the desktop and mobile editing experiences.
 * While we are aware of this, the Editing Team is not currently prioritizing work on feature parity.
 * I recognize that me saying the above offers no relief. Tho, it is important to me that, at a minimum, you know what the Editing Team is and is not focusing on in the near-term. PPelberg (WMF) (talk) 01:03, 11 February 2023 (UTC)
 * Yeah, certainly. I definitely didn't mean it as an accusation that you ought to focus more on mobile editing. (And indeed, I'm not sure how fruitful that might be; there are some tasks just inherently not well-suited to mobile.) {{u|Sdkb}} talk 18:46, 11 February 2023 (UTC)
 * There are lots of things we might want to configure. We care more about referencing for certain types of articles, such as those on living people and those on medical topics, so we'd likely want a more aggressive implementation for those than others.
 * Understood, and I feel energized hearing that you anticipate people wanting to be able to configure the checks themselves!
 * But back to what you shared, two follow-up questions for you:
 * Would it be accurate for me to think that you'd expect to be able to configure the check based on the category in which the article someone is editing belongs to?
 * Can you say a bit more about what you mean by "aggressive implementation" in this context? Asked another way: can you share the kinds of 'rules' you'd imagine wanting the reference check to operate on and that you think volunteers would value being able to 'tune' based on the category an article belongs to?
 * PPelberg (WMF) (talk) 01:03, 11 February 2023 (UTC)
 * On (1), yes. We might also want to configure the check based on project tags (which add categories to the talk pages) or even Wikidata information.
 * Another way we might want to configure is based on article quality, which would also be reflected in talk page categories. For instance, we might decide to tone down the suggestions on a featured article, which has already received extensive human review.
 * On (2), I'm thinking of it in terms of the confidence level of the tool. So, for example, if the tool thinks there's an 80% chance that adding a reference is needed, we might decide that that's high enough that we want to display the check at a BLP page, but not if it's a normal page. {{u|Sdkb}} talk 18:51, 11 February 2023 (UTC)
 * Of note for you all, there's currently a tag for new users adding unreferenced material to a BLP.
 * Oh, great spot. I've added this tag/filter (Special:AbuseFilter/history/686) to the Edit Check project page. Although, please boldly edit that page if I've linked to a tag/filter that differs from the one you had in mind. PPelberg (WMF) (talk) 01:05, 11 February 2023 (UTC)
 * Noting this comment you made on Phabricator — Iterate on the "Citoid" edit card to include some kind of guidance that prompts people to reflect on whether the source they're considering adding is likely to be one other volunteers deem acceptable — we'll absolutely want control over the source list there, so that we can modify it as RSP changes, and ideally we'll want to be able to provide context/specific conditions for (sometimes-)unreliable sources, as it's far from just a binary reliable/unreliable switch.
 * Great call; I'm glad you named the nuance and configurability such feedback would require.
 * I've added a note to T276857 (the ticket where we'll explore adding feedback of this sort) so that I can hold us accountable to revisiting this conversation when we prioritize work on it. PPelberg (WMF) (talk) 01:05, 11 February 2023 (UTC)
 * And we'll want to be able to tweak all the guidance text that shows up at the different steps.
 * Understood. PPelberg (WMF) (talk) 01:06, 11 February 2023 (UTC)
 * Going more broadly, I think the true wiki way would be to allow us to come up with custom filters (just as we currently do abuse tags) and provide custom feedback. This would probably mean leaving out some of the more advanced functionality, but I think the community would be able to come up with many good uses that'd be more effective than our current approach when we notice a persistent error of trying to expand guidance (leading to creep).
 * I feel inspired reading the above! Primarily because it sounds like you and us (the Editing Team) are seeing similar potential in this work…
 * Where – to double confirm – "potential" in this context refers to the ability for experienced volunteers to write custom filters/checks (e.g. similar to those you've listed in T327330#8581224) in ways that:
 * New and inexperienced volunteers will intuitively understand and be equipped with the tools/workflows they need to apply them
 * Experienced volunteers can audit and iterate upon
 * I'd value knowing if you sense any gaps between the potential you articulated above and what I responded with.
 * Note: I still think there is a lot for us to collectively execute and validate before I'd feel comfortable "promising" the potential we seem to both see, but yes, the Editing Team is definitely thinking about this first check as a proof of concept for an open-ended system that we can collaboratively shape and expand. PPelberg (WMF) (talk) 01:08, 11 February 2023 (UTC)
 * Yep, that's correct! {{u|Sdkb}} talk 18:52, 11 February 2023 (UTC)
 * One other thought (for now): Many editors who are contributing uncited material are providing original research, so we'll want to identify and provide guidance to them.
 * Before inquiring more deeply about the Upload Wizard, would it be accurate for me to understand what you're describing above as the following?
 * In the event that someone declines to add a source, the edit check user experience ought to present them with next steps (that experienced volunteers at individual projects ought to be able to configure) that align with the best practices outlined in WP:No original research.
 * In the meantime, I've filed T329406. PPelberg (WMF) (talk) 01:09, 11 February 2023 (UTC)
 * Yep, that's correct! {{u|Sdkb}} talk 18:53, 11 February 2023 (UTC)
 * @Sdkb, are you saying that if someone adds material without adding a source, then it's likely impossible for anyone to find a reliable source that could support the new material? That hasn't been my experience, but perhaps it varies by subject area.  For example, consider this edit that turned up in my (volunteer) watchlist the other day.  Would you call that "original research"? Whatamidoing (WMF) (talk) 21:11, 14 February 2023 (UTC)
 * @Whatamidoing (WMF), I'm not saying that. It's certainly always easiest for the person adding info to provide the source when they add it, but others can always do so later. For the category of editors who don't add a source, there are two subcategories: the ones where there is a public source but it's just not provided, and the ones where there is no public source (e.g. "I know Jane Smith lives in Glendale because she's my neighbor"). It's the latter subgroup that is likely original research. {{u|Sdkb}} talk 21:56, 14 February 2023 (UTC)
 * @Sdkb, I am thinking about this: "Many editors who are contributing uncited material are providing original research".
 * Original research is "material—such as facts, allegations, and ideas—for which no reliable, published sources exist". "Exist" means they're available to be found in the real world.  So if you write "Jane Smith lives in Glendale" because she's your neighbor, but it happens that Jane Smith has posted on her Facebook account that she lives in Glendale, then that's not original research.  Jane's Facebook post is a reliable, published source for her residence, and therefore a reliable, published source exists for this material, even though it's not cited.  (If, on the other hand, Jane is like me in shunning all anti-social media, then it might or might not be OR.)
 * Looking at it from the POV of the software, I think that OR is not a useful lens for evaluating content. The software can identify whether material is cited, but it won't be able to identify whether uncited material either should be (WP:V) or could be (WP:OR) cited.  Consequently, I'm thinking that it's simpler to focus on the concept of cited/uncited instead of "policy violations". Whatamidoing (WMF) (talk) 22:28, 16 February 2023 (UTC)
 * If it helps to have a real-world example, this is from earlier today.
 * If I understand correctly, you're saying that it's very hard for the software to determine whether or not a reliable source exists for content contributed without a source, so it's therefore hard to say whether or not it's original research. That makes sense.
 * Basically, I imagine the flow for editors who decline to provide a source to be something like this. First, we nag them and say, "you should really provide a source, it'll make you less likely to be reverted". For the portion that still decline, we want to know why. For some, they might be unwilling to be bothered. For some, there might be a source but the novice editor can't manage to find it (we want to divert them to somewhere where experienced editors can help with that). And for some, there might be no source available because they're contributing original research. I think you make a good point that it can be hard for a novice editor to know whether they're in the second-to-last group or the last group. But we want to find ways to provide appropriate guidance to everyone. {{u|Sdkb}} talk 22:45, 16 February 2023 (UTC)
 * I don't think it would be possible for the software to differentiate that kind of edit from someone doing a basic copyedit. It's pretty easy to identify "added a whole paragraph".  It's kind of hard to identify "added a new sentence" (How many sentences is "In 1939 Mr. Smith went to Washington, D.C."?).  It's really hard to differentiate "added new information" from "fixed grammar". Whatamidoing (WMF) (talk) 00:02, 18 February 2023 (UTC)
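
The configuration ideas raised in this thread (who sees the reference check, and how readily it appears on sensitive articles) can be sketched in code. This is only an illustrative sketch in Python: every class, function, and field name is hypothetical, the 15-day/100-edit limits and the 80% confidence figure come from Sdkb's comments above, and the 0.90 and 0.95 thresholds are invented placeholders. It is not Edit Check's actual implementation or configuration format.

```python
from dataclasses import dataclass

# Hypothetical illustration only: none of these names or numbers are
# Edit Check's real API or settings. The 15-day / 100-edit limits and the
# 0.80 figure are taken from the discussion; 0.90 and 0.95 are placeholders.


@dataclass
class Editor:
    logged_in: bool
    account_age_days: int
    edit_count: int
    opted_in: bool = False  # e.g. an experienced editor who enabled a beta feature


@dataclass
class Article:
    is_blp: bool = False       # biography of a living person
    is_medical: bool = False
    is_featured: bool = False  # already received extensive human review


def is_target_editor(editor: Editor, max_days: int = 15, max_edits: int = 100) -> bool:
    """Logged-out editors, opted-in editors, and newcomers under either limit."""
    if not editor.logged_in or editor.opted_in:
        return True
    return editor.account_age_days < max_days or editor.edit_count < max_edits


def confidence_threshold(article: Article) -> float:
    """A lower threshold means the check is shown more readily ("more aggressive")."""
    if article.is_blp or article.is_medical:
        return 0.80  # sensitive topics take priority over featured status
    if article.is_featured:
        return 0.95  # tone suggestions down on heavily reviewed articles
    return 0.90


def should_show_reference_check(editor: Editor, article: Article, confidence: float) -> bool:
    """Show the check only to target editors, and only when the tool is confident enough."""
    return is_target_editor(editor) and confidence >= confidence_threshold(article)
```

With these placeholder values, a newcomer whose edit is scored at 80% confidence would see the check on a BLP but not on an ordinary article, which matches the example given above. A per-project configuration could expose the limits and thresholds rather than hard-coding them.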

Commenting since I was pinged (thanks!). If you're interested in my thoughts about mobile editing, I wrote a piece for the Signpost last month. There are some interesting comments on the related talk page from other Wikipedians. If there's anything you want to ask me, feel free. Clovermoss (talk) 15:20, 28 January 2023 (UTC)


 * hi @Clovermoss – it's great to arrive in a conversation with you after appreciating some of the work you've done from a distance. Now, as someone who is experienced with and has developed nuanced thoughts around Wikipedia's mobile editing experience, I'd value hearing what you think of the mobile experience we're envisioning for Edit check.
 * I expect us to have some mockups to share in the next week or so (maybe sooner)...I'm thinking I'll ping you here once we've published them along with some prompts/questions to serve as feedback guides. How does that sound to you? PPelberg (WMF) (talk) 00:23, 15 February 2023 (UTC)
 * Feel free to reach out whenever you have something you'd like to share. I still have access to my old phone so I can actually observe what it looks like from two devices if you think that'd be useful. Clovermoss (talk) 04:33, 15 February 2023 (UTC)

I responded at the ticket and have found this discussion, which is perhaps a better place for it. Mathglot (talk) 00:13, 12 February 2023 (UTC)


 * hi @Mathglot! Based on what you've written on Phabricator, I have the sense that we (you, @Sdkb, and the broader Editing Team) are thinking about Edit check in similar ways which I feel quite energized about!
 * You can expect direct responses from me to the feedback you shared in T327330#8607428 before this week is over. In the meantime, I wanted you to know that the feedback you shared is by no means too late. In fact, it's arriving at just the right time ^ _ ^ PPelberg (WMF) (talk) 00:18, 15 February 2023 (UTC)
 * Thanks for the update. Just wanted to confirm that I am interested, and to let you know that I'm less experienced here at mw:, so if you ping here and don't hear from me within a day or two, definitely try an en-wiki ping; I won't knowingly ignore responses from you (or anyone) here. Mathglot (talk) 23:06, 16 February 2023 (UTC)
 * Understood! And just so you know, the response I committed to sharing before today was over will need to come next week instead...today got away from me ^ _ ^ PPelberg (WMF) (talk) 01:52, 18 February 2023 (UTC)

Above somewhere you asked about who might be contacted, and I know that User:Cullen328 has a page on smartphone editing and may be interested in this topic. (It's too far up to try to figure out the right indent/spot to add it, so just tacking it on here.) Mathglot (talk) 23:37, 16 February 2023 (UTC)
 * Thanks for the ping. I have been an active editor on English Wikipedia for almost 14 years and 98% of my contributions for the past 12 years have been made using smartphones. I have been an administrator for 5-1/2 years and have carried out many thousands of administrative actions on my phone. My strongly held perspective is that all sites and apps offered by the WMF for the purpose of editing should be fully functional. Wikipedia is a collaborative project and new editors become productive and committed to the projects when they are fully able to collaborate and interact with their colleagues. In my view, all of the software offered to mobile editors is inadequate in a variety of well-documented ways. That is why I use the so-called "desktop" site on my phone, because it is fully functional and enhances collaboration. "Desktop" in this context, in my view, is an inaccurate moniker that discourages mobile editors from using readily available fully functional collaborative software on their phones. I am aware that my view of this matter is often castigated and ridiculed, but I believe that my own long record of contributing high quality, well-referenced content, and fully participating in many "behind the scenes" areas of the encyclopedia serves as a refutation of those criticisms. To be frank, I am not much interested in placing band-aids on software that I believe impedes collaboration. If these fixes help a little bit, that is fine, I suppose. I take a much broader view. Cullen328 (talk) 00:37, 17 February 2023 (UTC)
 * @Mathglot, @Cullen328, and anyone else: You should see a "Subscribe" button at the top of this thread.  If you click it, you'll get Echo/Notifications for every new (signed) comment added. Whatamidoing (WMF) (talk) 00:04, 18 February 2023 (UTC)

Dark Continent
The project has a focus on editors from Sub-Saharan Africa. But there seem to be several issues including:


 * 1) As noted, "people from Sub-Saharan Africa represent only 1% of active unique editors".
 * 2) Sub-Saharan Africa is different in nature to the West, being lower in resources, development and infrastructure

The proposition is to add a feature to the Visual Editor to nag editors about adding references. But how does this address the geographical issue? Sub-Saharan Africa is not richly endowed with libraries, museums, quality press and other GLAM institutions which will help with formal citations, is it? Instead, it has different traditions such as oral history. So, nagging Sub-Saharan Africa about expectations based upon the experience of the more privileged 99%, doesn't seem appropriate.

For example, consider a topic in the English Wikipedia: Furra -- a legendary queen. This was started by a BBC journalist with Ethiopian heritage. I assisted her in getting this started at a BBC 100 event and then did some follow-up work, taking it through the DYK process. I was able to find some sources to cite in our customary way but it wasn't easy and I suppose that the work might seem quite difficult or strange to people in that region. There might be some good sources in the Sidama language and oral tradition but they will not be easy to access here.

So, the proposal doesn't seem a good fit with the objective. Perhaps it should be turned around so that the 99% of our editors in the developed world, like me, get the nagging, while those in Sub-Saharan Africa are exempted from this.

Andrew Davidson (talk) 13:12, 23 February 2023 (UTC)


 * Quite interesting point you suggested relating how people will reach out to library or reliable sources: if not par to Sub-Saharan editors, us in ja, or in the global North but in the East, also suffer from finding good sources to reach. We are gifted with decent libraries to each prefecture, dotted Japan in 47+ cities. But as far as you are yet to learn how to hunt books in libraries, numbers are numbers, not treasure troves.
 * I am green to my ears that Sub-Saharan peers are given the chance to explore the five-pillar supported effort and try to explore reliable sources with peers. And I am ambitious to see how things turn out, as Sub-Saharan experience would be a great forerunner for us on jawp.
 * One thing aside from how many libraries you have near you. Isn't this an opportunity to demonstrate that you can't write un-sourced materials on Wikipedia? Our weak point in jawp is that newer editors need to learn that and Five Pillars, and realize we are not on a chat board or SNS. Of course, some topics such as anime or voice actors, pop singers, attract newer editors; they come jumping in to write/change whatever they heard/saw, and such pages naturally end up in edit wars. We senior editors fail to show them how a digital encyclopedia should be handled. Or we old-school editors don't understand the best route to find backing information for those very new and developing artists/genres of entertainment.
 * One question.
 * Seriously, should such reliable sources be physically located on the Continent? Or documented in local language only? Written in local language and any chance stored across the ocean? My search for good ukiyo-e woodblock prints of 18th-19th century Japan often leads me to the States or UK museums/collections. Then I need to correct captions on Commons, as references on each artist are a bit richer in ja, and misunderstanding can be solved with reliable sources.
 * Well, replace prints with sculpture/costume abroad, and references local, and that formula might apply to any part of the world. Human history had many explorers/scholars coming to the end of the world (on the map projection they had used), who picked up whatever they found curious and brought that evidence/those collections to their home in the global North, to the West. That is not that bad, or in Tokyo alone, we, too, had collected items of anthropology from around Japan, then lost many of them to fires/earthquakes in the 20th century. Or hymns of the 16th-century Jesuits were orally handed down until the 1860s among those hidden Christians in southern Japan (orrassio in local term), in 16th century Catalan dialect.
 * Let's hope we have backed up important human knowledge in material form at distant places. And local wisdom or non-material knowledge is still handed down among us somewhere on this planet, including among Sub-Saharan peers. Or invite unwritten wisdom to be put into texts. Are there any more dreamer like me? Omotecho (talk) 13:12, 24 February 2023 (UTC)
 * Suzuki Harunobu - Evening Snow on the Heater.jpg
 * Good points. Me, I live in London, which has many GLAM institutions and so I recently started an article after seeing an exhibit at our Science Museum.  But multi-cultural topics can be challenging.  For another example, I started an article to explain a ukiyo-e print when it was featured as a baffling Japanese picture on the English Wikipedia (right) without any narrative to explain the subject.  That has now been expanded into a detailed explanation of the series -- an elaborate parody or pastiche of Chinese art.  People in one culture may find it difficult to understand such intricate conventions, culture and history of another... Andrew Davidson (talk) 14:05, 24 February 2023 (UTC)
 * Wonderful I encounter somebody who cares about ukiyo-e, and so appreciating to focus on Harunobu, a kind of mysterious artist to me.
 * I imagine it takes very deep understanding that many art forms in Japan had entertained ideas to have your tongue in cheek kind of resistance against the Bakuhan system. Or brocade on the back of cotton kimono and cheat the luxury ban imposed on you by the government/feudal lords stocking for wars that would never break.
 * Then, that kind of everyday people's spirit against the rulers would be a global theme, won't it be? How about in Sub-Saharan area?
 * I am thrilled to imagine that had happened somewhere...
 * FYI, regarding the image of the cotton on those lacquerwares, I will scan NDL biblio database. Kindly, Omotecho (talk) 13:43, 25 February 2023 (UTC)
 * Some of the major interest areas in Africa seem to be pop culture (like everywhere else), current politics (like everywhere else) and businesses information. As a result of the interest areas, the sources tend to be available online.   Whatamidoing (WMF) (talk) 22:06, 28 February 2023 (UTC)
 * But not always. For example, consider the town of Azia which is in Nigeria.  I picked this up on prod patrol and then had some difficulty defending it.  The best source is a book.  To access this myself, I had to visit the British Library.  That's one of the best reference libraries in the world but it's in London and so not readily available to most editors.  The original editor used the book but then got burnt by using it too faithfully.  This is a common issue -- damned if you do and damned if you don't.
 * Note, by the way, that one of the editors involved in this incident -- SpinningSpark -- has recently died. My impression is that the original corps of veteran editors who established much of the current thinking about such matters is now dying off or being driven off.  There is much that can and has been said about this.  I don't want to digress but it may be relevant as background.
 * Andrew Davidson (talk) 23:24, 28 February 2023 (UTC)
 * Picked up the term British Library: @Whatamidoing (WMF), hi, do you see we are able to expand like the following, and I wonder whose team under the WMF umbrella does the best job: in my case, I'd depend much on Wikipedia Library;
 * can we request the WL sponsor establishment and set a number of temporary accounts on Wikipedia Library; they can expire automatically on the due date of each initiative?
 * Or any matching page where those needing book research in a book/journal will be paired with those WM editors, who are lucky enough to hold accounts, esp at those publishers whose resources focus on each initiative? If not writing or content proofreading, book research itself is an area some people are very resourceful.
 * Wikimedia Library has partner establishments who set the total number of library cards or free-of-charge log-in accounts. That is why editors are in a queue for vacancies, and those who hold accounts are encouraged to offer and do book research when requested. I have encountered only two jawp editors who were aware of WL, so I translated the page into jawp. As Wangari Muta Maathaï has said, Mottainai! (waste of resources)
 * Omotecho (talk) 04:58, 1 March 2023 (UTC)
 * Mottainai!? It's a small world as that was a remarkably controversial topic on the English Wikipedia -- see Talk:Mottainai and its archive. Andrew Davidson (talk) 10:16, 1 March 2023 (UTC)

Where will this sit in the publish process?
Timing wise, is Edit check going to be before the contributor attempts to publish, or will it interrupt the publish process? Other items in this workflow are pre-publish items like disambiguation resolution, and interpublish items like CAPTCHA and Abuse Filter. Xaosflux (talk) 15:44, 28 February 2023 (UTC)


 * I think the answer is "yes". Nothing's written in stone (or even in code ;-) yet, but the notion is that it could trigger:
 * after you've added some content (e.g., a whole paragraph) and/or
 * when you click the Publish changes… button.
 * The first option runs the risk of interrupting you. The second option runs the risk of people cancelling their completed edits.
 * I think that the very first version will not trigger frequently. (If there's something totally broken about the experience, you don't want it interfering with all the edits.)  So – on the first day – it might not trigger on a page like w:en:User:WhatamIdoing/Christmas candy, because it contains just one short paragraph (115 bytes long) plus some lists.  But eventually I would expect it to trigger when a short paragraph is added. Whatamidoing (WMF) (talk) 21:37, 28 February 2023 (UTC)
 * From the early mockups on Figma, it also looked like there may be a difference between mobile and desktop, with mobile interrupting the publish process and desktop coming up as you edit. Sdkb (talk) 19:01, 3 March 2023 (UTC)
 * The team decided recently that the mobile and desktop versions are probably going to be different. It's more work, but it could produce a more natural feel in the different platforms. Whatamidoing (WMF) (talk) 18:57, 24 March 2023 (UTC)
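
The two trigger points described in this thread (after a block of content has been added, or when Publish is clicked) can be sketched roughly as follows. This is only an illustration under stated assumptions, not how Edit Check is implemented: `AddedParagraph`, `should_trigger`, `paragraphs_to_flag`, the 200-character threshold, and the per-platform event table are all hypothetical. The threshold is picked only so that a 115-byte paragraph like the one mentioned above would not trigger.

```python
from dataclasses import dataclass

# Hypothetical threshold: the thread only says that very short additions
# (such as a 115-byte paragraph) might not trigger at first.
MIN_NEW_CHARS = 200

# Hypothetical mapping, loosely based on the early mockups mentioned above:
# mobile checks at publish time, desktop can also check while editing.
TRIGGER_EVENTS = {
    "desktop": {"paragraph_added", "publish_clicked"},
    "mobile": {"publish_clicked"},
}


@dataclass
class AddedParagraph:
    text: str
    has_reference: bool


def should_trigger(paragraph: AddedParagraph, min_chars: int = MIN_NEW_CHARS) -> bool:
    """True if the new paragraph is long enough and has no reference."""
    return len(paragraph.text) >= min_chars and not paragraph.has_reference


def paragraphs_to_flag(paragraphs, event: str, platform: str):
    """Return the paragraphs a reference check would flag for this event."""
    if event not in TRIGGER_EVENTS.get(platform, set()):
        return []
    return [p for p in paragraphs if should_trigger(p)]
```

With these assumptions, an uncited 300-character paragraph is flagged on desktop as soon as it is added, but on mobile only once Publish is clicked. That is the trade-off noted above: the first option risks interrupting the editor, the second risks abandoned edits.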

This was discussed at the meeting today. I liked the suggestion that the process might be a background activity rather than an intrusive dialog. This would highlight issues in the text as the user edited or created it. Background shading or annotation would appear, drawing attention to issues such as spelling errors, missing citations, plagiarism, grammar and so on. An example of a tool which works in this way is Grammarly. I've not used this myself but it seems successful and so may be familiar to our users. We might copy the method of operation of such popular tools so that it would seem intuitive. Andrew Davidson (talk) 22:49, 3 March 2023 (UTC)


 * It looks like the first version will be more like "intrusive dialog" and less like "background activity", but the hope is to adjust that in the future. Whatamidoing (WMF) (talk) 02:12, 17 March 2023 (UTC)
 * brings up an excellent point, below, regarding the privacy implications of this sort of approach.
 * A background task inherently means evaluating unpublished data — that's a privacy minefield, unless the analysis is performed entirely client-side without ever sending even a single byte of content back to the Wikimedia servers. (Which, for certain of those suggestions — e.g. plagiarism — is basically impossible.)
 * If data is evaluated server-side, then it would have to be immediately discarded because retaining users' private data for analysis / review purposes is a huge red flag.
 * Spiriting away unpublished content from the client edit interface would probably also run afoul of the GDPR, unless the user is given a (clearly-explained) choice and explicitly affirms their opt-in to data-sharing, before any such sharing takes place.
 * Even if the data collected were to be managed in a fully-anonymized fashion with all identifying metadata removed, it wouldn't make any difference. As points out, what we can't know for sure is that the content itself contains no sensitive personal data. So it just can't be collected or stored, full stop. Not anywhere, not for any length of time.
 * Storing potentially-sensitive data would be a bad idea, even if the only person with access was — forget about making any of it semi- or totally-public. Not having access to any of the actual content processed by Edit Check seems like it would hamper analysis of the tool's accuracy/performance.
 * All of these issues go away, once the user has signaled their intent to publish the edit interface's contents by hitting the "Publish" button. But for anything that occurs prior to that point, it's a completely different story.
 * Have any privacy experts (and/or the Evil WMF Lawyers™) been involved in discussions about Edit Check's features and design? The time to invite them to the table is yesterday. FeRDNYC (talk) 08:54, 4 May 2023 (UTC)
 * This objection seems excessive. If users have such extravagant privacy requirements then these should be addressed at the outset rather than waiting for the Publish interaction, which is not a good place for it, as discussed below. Andrew Davidson (talk) 09:16, 4 May 2023 (UTC)
 * I would not consider this an "extravagant" privacy requirement. Do you expect the recipients of your emails to be able to read them before you click "Send"? Suffusion of Yellow (talk) 21:04, 4 May 2023 (UTC)
 * As noted below, we already are sending the content of unsaved edits to servers. Open up your browser's network monitor, and look for "stashedit". But as far as I know, that's discarded after one minute. I see no problem with sending back to user, and only the user, the result of some evaluation. It's allowing anyone but the user access to this result, or the text that was evaluated, that is the problem. Suffusion of Yellow (talk) 21:01, 4 May 2023 (UTC)
 * I could check in with the Legal team about this, if/when that ever happens. While storing the data might be useful during development (e.g., to see whether the evaluation is happening correctly), I'm less certain that it would be wanted in the long-term.
 * As for whether it's legally acceptable: Do users in Europe see autocomplete search suggestions on major web search engines?  Do you remember being "given a (clearly-explained) choice" or "explicitly affirm[ing] their opt-in to data-sharing" for that?  AIUI sending the data back to the servers for evaluation is how that happens, and if it's okay for Google to do that without clearly explaining the mechanism and letting you opt out, then it's going to be okay for other websites to do that, too.  I therefore doubt that it is a legally insurmountable problem. Whatamidoing (WMF) (talk) 18:33, 5 May 2023 (UTC)
 * Re "storing the data might be useful during development": sure, on a separate wiki (beta or whatever) where people have a lower expectation of privacy, so long as it's clearly disclosed. On a production wiki? At a minimum, it should only be visible to NDA'd developers, and deleted-for-real (not "archived") when you're finished debugging. Yes, someone, somewhere at Google, has access to your autocomplete suggestions. But imagine if one day Google started sharing that history with all your contacts. That goes beyond "legally acceptable or not". That's creepy.
 * But each wiki is going to have its own highly customized "edit checks", right? Each time the equivalent of a local EFM updates one of the "edit checks", they need some way to know they haven't made a mistake. Looking for an NDA'd developer to go sifting through a log obviously does not scale. Hopefully, it will be enough just to look at the finished edits. Suffusion of Yellow (talk) 20:03, 5 May 2023 (UTC)
 * "Hopefully, it will be enough just to look at the finished edits." Ooh, that just made me realize something. Because the Edit Check data is eventually working towards published content, it would be really easy to de-anonymize. Say the Edit Check log shows that some anonymous user was writing some text, which Edit Check flagged and suggested they change in some way. They do so, and the updated text becomes part of their edit when published. Well, the information on who made that edit is public, which means we know who received that suggestion, and we've now tied the input data back to them as well. Yeah, there's no such thing as anonymous Edit Check data. FeRDNYC (talk) 19:34, 7 May 2023 (UTC)
 * (...For published edits. If the user ultimately decides not to publish their edit, then they haven't "outed" themselves as the same person who interacted with Edit Check.) FeRDNYC (talk) 19:40, 7 May 2023 (UTC)
 * They could still out themselves by the content of the edit, e.g. "as I just told you, Bob", or "I made [this diff] because", etc. Absolutely nothing must be logged anywhere, except "this finished edit matched this edit check", which is something anyone with time on their hands could have worked out for themselves anyway. Suffusion of Yellow (talk) 19:45, 7 May 2023 (UTC)
 * That's a good question, regarding autocomplete. AFAIK it does work in Europe, although (just like in the US) you can opt out of it. However, Google sets strict limits on how that data is managed — it's quickly aggregated, so that the only strings their servers store are the ones that have been entered by thousands of users (if not more), which no longer makes it personal data. They've also been sued more than once over the results of those autocompletions.
 * And the enhanced spell checking feature in Chrome (which sends your typed input back to their servers) is explicitly default-off and opt-in only, even though it promises to anonymize and not retain the data, all out of privacy concerns. (Much greater ones than are involved here, of course, since it has access to everything you type in any input field. Including our editing interfaces.) FeRDNYC (talk) 19:11, 7 May 2023 (UTC)
 * This is slightly off-topic. The issue of sending the text to servers is moot. The "stash edit" feature has been doing this for at least half a decade. It's why huge pages sometimes seem to save instantly; the server had already quietly parsed the page and even checked it against edit filters, and was just waiting for your confirmation. The reply tool with which you typed this response was sending data to the servers so it could show you a live preview. And of course "show preview" and "show changes" were server-side from the very beginning, and never explicitly disclosed that fact. The issue is storing the data, and worse, sharing it. Suffusion of Yellow (talk) 19:33, 7 May 2023 (UTC)
 * I agree that it's probably possible to do this (send information to the servers, and even to store it briefly) legally. I can ask Legal to particularly look at that.
 * But the bigger question, and one that Legal might not accept, is sharing the information. But if editors can't see the information, that might limit their ability to test their own checks.  The underlying software could be made as open-ended as Special:AbuseFilter, and it could be made as pre-packaged as Growth's Newcomer Home Page.  But if it's open-ended, I'd hope that editors are verifying that it's really catching what it should be.  There is always a risk that when you try to prevent one bad edit, you will accidentally lose two good edits as well.  If you can't see what's happening, you might not know that a given check is causing more harm than benefit. Whatamidoing (WMF) (talk) 22:32, 20 May 2023 (UTC)
 * My objection goes beyond "is this legal?" See my example about Alice and Bob below. It's not just questionably legal, it's toxic. Either you have to be worried at every keystroke that your innermost thoughts are going to be shared, or you get yelled at for something you never intended to publish. Suffusion of Yellow (talk) 22:56, 20 May 2023 (UTC)
 * I agree that it's a significant worry. It's one thing to have a couple of people with a contractual obligation to maintain confidentiality who are checking something on many edits, for a particular purpose ("Your call may be recorded for quality purposes...").
 * It's quite another thing to have an editor with no obligations, and who might be in a dispute with you, reading the words you typed before you publish them, including things you did not publish.
 * I would not like to see this be possible. But:  There are consequences for this decision.  Being able to see what editors do before they publish (e.g., right now, they can count how many editors click on a button in the editing toolbar) helps developers know whether a tool is working (e.g., nobody clicks on that button; people who click on that button abandon their edits; clicking on that button results in fewer reversions...).  Having that visibility into pre-publication actions improves the tools (e.g., nobody clicks on that button → change the label on the button, so users know what the button does).
 * Perhaps if editors can't (and they normally shouldn't!) have the information that is necessary to find out whether the tool is working, then they also shouldn't be creating brand-new tools that ought to use this sometimes-private information for testing purposes.
 * As a simple example, imagine that someone wants to encourage new articles to be well-sourced. This well-intentioned person creates a "check" that guesses whether the article is about a living person and requires the citation of six sources, and that only sources from a pre-approved list are counted.
 * What they will notice is:
 * All new articles about living people contain at least six citations from sources on the pre-approved list. 🙌
 * They might even notice that:
 * Fewer articles are created.
 * Fewer new articles end up deleted during the first 30 days.
 * What they will not necessarily see is:
 * Fewer articles about dead people are created, because the heuristic for detecting whether the subject is alive is poor.
 * Fewer articles that aren't about people at all are created, because the heuristic for detecting whether the subject is about a person is poor.
 * Articles about living people don't get created, because the editors couldn't figure out how to format the citations in a way that was recognizable to the new "check".
 * Articles about living people don't get created, because not all of the good sources were included in the pre-approved list.
 * The pre-approved list of sources is focused on English-speaking content, so the creation of articles about people from non-English-speaking countries is particularly badly affected.
 * More articles about certain borderline subjects (e.g., minor politicians, newly hired athletes, first-time actors) are being created.
 * It doesn't trigger when a redirect is turned into an article, so editors bypass the new "check" by creating articles as redirects first and only then adding article content.
 * ...and so forth.
 * When the Editing team creates software, they look for these kinds of bad outcomes. This sometimes requires more information than is visible in a diff.  If you don't feel that you could trust volunteer devs to have the pre-publication information that is necessary to notice such problems, do you also think we should limit the "check" system so that they can't (independently) write their own, potentially inadequately tested, versions? Whatamidoing (WMF) (talk) 18:56, 23 May 2023 (UTC)
 * As in, have to file a phab ticket every time we think a "check" might need tweaking? And wait the usual one week to fifteen years to get it fixed? No, that would be terrible in its own way. It would effectively give the WMF a channel to tell users the right way to write content. Traditionally, there's been a "separation of powers" and that would be a major breach. I think a combined approach will work:
 * First, on every popup, have a prominent "report error" button. Not some ad hoc system like we have with edit filters, but built right into the extension. When they click that, it's clearly disclosed that now the contents of the edit form will be made public (and also CC-BY-SA, you are editing logged out, blah blah blah). They can click "cancel" if there's something private or copyrighted.
 * Second, as discussed above, a log of those checks which still match when the user clicks "publish". I understand the worry that this won't be adequate, but really, from working with warn-only edit filters, I think that maybe 50% of the time, even when the filter is correct, people just ignore the warning. And it's probably about 95% when the filter is wrong. In your example above, nearly everyone creating an article about a dead person will think, "oh, stupid buggy system", ignore the popup, and publish anyway. And then we'll have the data we need.
 * And no, I don't want either of these logs to be limited to "volunteer devs" or any special group. With edit filters, we value transparency; and only mark filters private if there's a good reason to do so. We certainly don't mark "good-faith" edit filters (like en:Special:AbuseFilter/869) private. Suffusion of Yellow (talk) 19:39, 23 May 2023 (UTC)
 * I don't think you were around for the Article Feedback Tool, which was removed by community demand in 2014. I can recommend reading w:en:User:Risker/Risker's checklist for content-creation extensions in general (FYI @Risker), and I'm sure that a feedback tool inside the editing window won't result in as much extra work for oversighters as the ones shown to readers did, but it could still be a lot.
 * Also, feedback tools usually have a pretty poor signal:noise ratio. Some people will report errors on correctly functioning software (e.g., spammers), and most newcomers just trust that the software is correct, or they'll click past it and decide that it's not worth their time to report the problem.  The false positives and false negatives are both significant.
 * And, of course, this is assuming that the problem is fairly obvious to the end-user. If the check says "This appears to be an article about a living person, and it appears to have zero sources.  Sources are required for all regular Wikipedia articles" with options like  and  and, then that might result in people getting past it when the check is buggy.
 * But if it says instead "This article must have more sources. You cannot publish this unless you add citations to reliable sources", with the only possible response being, then I would expect a very different reaction from the users. Whatamidoing (WMF) (talk) 18:23, 24 May 2023 (UTC)
 * With edit filters, the tradition is to almost never disallow an edit unless it is disruptive. Certainly I wouldn't want Edit Check to be any different. It should offer suggestions, and nothing more. If something really should be disallowed, we can continue to use AbuseFilter. Suffusion of Yellow (talk) 18:39, 24 May 2023 (UTC)
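
The hypothetical living-person check used as an example in this thread (six citations, counted only if they come from a pre-approved list) could look roughly like the sketch below. Everything here is an assumption made for illustration: `APPROVED_DOMAINS`, `count_approved`, `check_new_article`, and the example domains are invented, and the function only returns a suggestion rather than blocking the edit, in line with the warn-only tradition described above.

```python
from typing import List, Optional
from urllib.parse import urlparse

# Hypothetical pre-approved list and threshold, taken from the example
# in the thread rather than from any real configuration.
APPROVED_DOMAINS = {"example.org", "example.com"}
REQUIRED_SOURCES = 6


def count_approved(citation_urls: List[str]) -> int:
    """Count citations whose host is on the pre-approved list."""
    count = 0
    for url in citation_urls:
        host = urlparse(url).hostname or ""
        if host.removeprefix("www.") in APPROVED_DOMAINS:
            count += 1
    return count


def check_new_article(looks_like_living_person: bool,
                      citation_urls: List[str]) -> Optional[str]:
    """Return a suggestion message, or None if the check does not match."""
    if not looks_like_living_person:
        return None
    if count_approved(citation_urls) >= REQUIRED_SOURCES:
        return None
    return "This appears to be about a living person; consider adding more sources."
```

An article citing six sources that are all reliable but not on the list still gets the message, and a wrong guess for `looks_like_living_person` misfires in the same way. Neither failure is visible from the published diffs alone, which is the blind spot the example in the thread is pointing at.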

Clarify please: warning for SpamBlacklist warning
✅ As it says: Warn when the url added as reference is registered in the SpamBlacklist, and thus prevent the warning from appearing when saving the page, does it mean the SpamBlacklisted url is removed automatically? Omotecho (talk) 09:56, 4 March 2023 (UTC)


 * @Omotecho, that might be possible in the future, but I think it would probably just show a message and (I hope) highlight the place where the link is located. Whatamidoing (WMF) (talk) 02:11, 17 March 2023 (UTC)
 * I see, and case closed. Thank you helping me (: Omotecho (talk) 03:24, 17 March 2023 (UTC)

Privacy
It's suggested that we are going to be able to "Review the edits people who are shown Edit Check are making". But the user hasn't clicked "Publish" yet. Publish means "make public". Until they click that, it's private. The contents of the edit form might contain all sorts of private information: cut & paste fails of passwords or personal info, ill-thought-out personal attacks, copyrighted text, and so on. The user hasn't agreed for it to be shown to anyone, even users with special rights. So how will we be able to find and adjust broken filters/checks? Suffusion of Yellow (talk) 22:27, 13 April 2023 (UTC)


 * Editing a complex text like an article will naturally tend to require some interaction. For example, I commonly use the visual editor to generate citations from URLs and it is good practice to do this as one goes along.  If such interaction causes privacy concerns then these are best dealt with at the outset, before the user starts editing.  The user should be warned of any such implications in advance so that they don't waste their time or risk their privacy.  It's no good waiting for a final publish interaction.  As that may come at the end of a lengthy editing session, the user will tend to be weary at this point and disinclined to read through detailed formalities and legal warnings.  If these are important, they are best done when the user is fresh. Andrew Davidson (talk) 09:09, 4 May 2023 (UTC)
 * It's not the information being sent to the servers that I'm worried about. Edit stashing has been doing that for years. But a stashed edit is kept for what, one minute? And even if it's kept for longer, only some NDA'ed developer is ever going to see it. But what's being proposed here would seem to be making public (or at least sharing with some huge group, like admins) the contents of the edit box, before the user intended to make it public.
 * For example, suppose Alice types "Bob, you're a total moron", regrets that a few seconds later, hits backspace, and types "Bob, I respectfully disagree". Should Bob be able to access Alice's original comment?
 * A disclaimer won't help here; we're talking about fundamentally changing the way we interact with Wikipedia. To start, it'll no longer be possible to preview anything you don't want public. It'll no longer be possible to keep copyrighted text in the edit form, for easy reference. We'll have to be on edge every time we hit CTRL-V.
 * I can see a few solutions, though:
 * Only log the edit if the problem was still there when they clicked "publish". That would be incomplete data, but possibly enough.
 * Provide a "report problem" button. When the user clicks that, warn them then that they're about to make a public post, and they should remove any copyrighted material, etc.
 * Suffusion of Yellow (talk) 20:50, 4 May 2023 (UTC)
 * In the early days, "review" will mean "review via Special:RecentChanges". The first step will be a new item in Special:Tags (which was recently created, but which just picks up the addition of any 50 characters, so not really functional right now) that shows finished edits that EditCheck should have triggered on.  We'll be able to make sure that it's (usually) picking up the right things before the tool is shown to any editors at all. Whatamidoing (WMF) (talk) 18:37, 5 May 2023 (UTC)
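
The first solution proposed in this section (only log a check if the problem is still there when the user clicks "publish") can be sketched as below. This is a minimal illustration under stated assumptions, not a description of Edit Check's logging: `CheckMatch`, `log_matches_at_publish`, and the `no-reference` check are hypothetical names. The point of the sketch is that the log record holds only a check ID and a revision ID, so no pre-publication text is ever stored.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CheckMatch:
    """Log record: which check matched which published revision.

    Deliberately has no field for the edit text or any earlier draft.
    """
    check_id: str
    revision_id: int


def log_matches_at_publish(final_text: str,
                           revision_id: int,
                           checks: Dict[str, Callable[[str], bool]]) -> List[CheckMatch]:
    """Evaluate checks only against the text the user chose to publish."""
    return [
        CheckMatch(check_id, revision_id)
        for check_id, matches in checks.items()
        if matches(final_text)
    ]


# Hypothetical check: flags text that contains no <ref> tag at all.
example_checks = {"no-reference": lambda text: "<ref" not in text}
```

In the Alice and Bob example above, only "Bob, I respectfully disagree" would ever reach `log_matches_at_publish`; the retracted draft is never an input. The cost, as noted in the discussion, is that the data is incomplete: edits abandoned because of a faulty check leave no trace.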