Talk:XTools


About this board

This page is a feedback forum for XTools. For reporting bugs, it's preferred that you use Phabricator.

If the issue is urgent and you're unable to use Phabricator, feel free to ping one of the active maintainers.

Boehm (talkcontribs)
MusikAnimal (talkcontribs)

I am working to fix this. Thanks for the report.

Reply to "articleinfo-authorship"
JamesLucas (talkcontribs)

A user's XTools page has a "Top edited pages" section which lists that user's nine most edited pages for each namespace, where "edited" is measured in edit counts. I find this to be a good mechanism for reviewing talkspace edits, but for mainspace I'd be equally or more interested in being able to see pages by authorship percentage. Is this technically feasible (or already available somehow)? Cheers!

MusikAnimal (talkcontribs)

Unfortunately this is not feasible :( The Authorship tool relies on an external service where we can't do arbitrary, large-scale queries like you suggest. Sorry!

JamesLucas (talkcontribs)

Interesting reading; it prompted me to test the processing time of a heavily edited article like Che Guevara (22 sec) against that of a randomly selected article (2 sec), so I now better appreciate the complexity of the tool. I guess a top-authorship table would be feasible only if it were displaying cached data of authorship info that XTools was slowly scraping. It might be cool, but it'd be full of holes and constantly out-of-date. Regardless, thanks for the response!

Reply to "User's top authored pages"
Calbow (talkcontribs)

According to the list of editors by highest edit counts, there are fewer than 50 editors above the 400k edits threshold, yet it would not surprise me if these were among the most searched in your tools. I was wondering whether, since there are so few, it would be possible for the edits/pages created for those users to be calculated infrequently (e.g. once a month) and then have that monthly snapshot be served to those searching for them, rather than just a message saying the editor has >400k edits?

Just a suggestion, thank you for creating/maintaining these tools.

MusikAnimal (talkcontribs)

Everything is computed on-demand, so this wouldn't be an option. Ideally we would simply limit processing to just 400,000 edits (phab:T182182); the issue there is we likely only care about the most recent 400,000 edits, and that is what makes the query so slow. I think going by time frame would be more doable, say the past year. The problem with that is someone could make a million edits in that year alone. So we need some very specialized logic...

Calbow (talkcontribs)

I figured my suggestion wouldn't be feasible somehow. I agree that doing things by time frame would be ideal, hopefully it becomes possible in the future.

Furfur (talkcontribs)

I very much like the EditCounter tool, but I would find it even more useful if one could calculate a "weighted article count", i.e. the total kilobytes of created article content.

There are authors who create lots of tiny articles (1-2 kb in size) and thus have an article count of several thousand. Other authors put some work into their articles which often have a much larger size.

The "kilobyte count" would be a helpful counterpart to the pure article count, which may be misleading when judging contributions.

MusikAnimal (talkcontribs)

Hello! If it helps, you could use the Pages Created tool and sort by size (either original size or current size). We can add an "average size" column in the summary section. Would that help? Should it be averaging the original or current size?

Furfur (talkcontribs)

Hello MusikAnimal, thanks for your answer. Of course an average size (showing the original size, not what may have been added by other users) would be helpful – at least better than the current situation. Nevertheless I would still find it helpful to have a summary of all the created content in kilobytes. Currently we have a summary (the number of articles), but this does not realistically reflect the work a user has invested.

MusikAnimal (talkcontribs)

Oh, I think you mean a sum of the sizes of all pages. That we can do; we'll include both the sum and the average. I have created phab:T229578.

Furfur (talkcontribs)

Yes, that's right, I would like to see a sum of my work (and that of other users) :).

Reply to "EditCounter"

Count of minor/ip/bot/reverted edits for page?

NHarateh (WMF) (talkcontribs)

Hello,


First of all, thank you for creating such a useful tool & API! Is there any way I could get the count of minor/ip/bot/reverted edits for a given page? I can see those counts included in the UI but I don't see them returned in any of the page requests that I tried. I see that this is where you build the `articleinfo` response https://github.com/x-tools/xtools/blob/6d8f265ec0aecf6f3a6909b742cc4cda0d28d94f/src/AppBundle/Controller/ArticleInfoController.php#L238 and this is where you get respective edit counts https://github.com/x-tools/xtools/blob/19bcac6775a8302273a4314bbfe1b0e753458255/src/AppBundle/Model/ArticleInfo.php#L485.

MusikAnimal (talkcontribs)

Thank you for the kind words! We can definitely add an option or something to get the number of minor and IP edits. Getting the number of reverted edits however I think will require combing over the entire history revision by revision, which will be too slow for an API endpoint. The revert count is very much an approximate figure, anyway. I would not use it for research purposes.

Reply to "Count of minor/ip/bot/reverted edits for page?"
Davidbena (talkcontribs)

I am the creator of the page "Erich Brauer," but when any onlooker checks the edit history and goes back to the earliest date, the article is listed as being created by User:Mag2k. The reason for this discrepancy is that, before I created the article, User:Mag2k had already made a "Redirect" for a different article, entitled "Arik Brauer," but he had used the name "Erich" for his redirect. How can I alleviate this problem, and have the article "Erich Brauer" shown in my own list of articles created?

MusikAnimal (talkcontribs)

This is phab:T182183. Unfortunately it is a difficult problem to solve. MediaWiki has no formal log of when a redirect became an article. As you can see at https://en.wikipedia.org/w/index.php?title=Erich_Brauer&action=info, MediaWiki also claims Mag2k as the page creator. It's simply looking for the oldest revision. XTools works in the same way. There's a proposed hacky workaround at phab:T190065, but I can't make any promises. The issue is really with MediaWiki, not XTools. Sorry!

Reply to "Erich Brauer"

TopEdits tool not working correctly?

Summary by MusikAnimal

Resolved

Ceyockey (talkcontribs)

I was looking for the specific edits by Blueboar and Boracay Bill on the page https://en.wikipedia.org/wiki/Wikipedia_talk:Reliable_sources/Archive_21 and both queries returned 0 edits to that page, which is patently wrong -- see this section where at least they have each edited once --> https://en.wikipedia.org/wiki/Wikipedia_talk:Reliable_sources/Archive_21#Overuse_of_%22third-party%22_in_nutsell_and_intro_paragraphs_causing_problems . Maybe "archive" pages are excluded from the index? Thanks for input.

Ceyockey (talkcontribs)

OK - I found the problem. You need to search the ORIGINAL page from which archives are produced in order to find edits by users which are shown in ARCHIVES.

MusikAnimal (talkcontribs)

Correct, the edits themselves were made to the original page, not the archive page. The archive pages are merely a copy/paste of the original text. See the revision history.

Make pie chart consistent with table (Page History: Authorship)

Summary by MusikAnimal

Deployed

Minderbinder (talkcontribs)

First of all, thank you for providing such a versatile tool. The German language Wikipedia community are discussing the provision of a deep link to the Authorship section of the Page History tool right now, to be displayed alongside every article. Should the vote come to pass, I would expect to see an increased load on the Page History tool in a few weeks' time. So I hope that there is some caching mechanism.

I would like to ask for a change in the Authorship section. Right now, the rendering of the pie chart takes only the percentages of the first ten contributors into account. If an article has a long tail distribution of contributors, this gives the wrong impression. The top ten contributors to the article Angela Merkel have contributed less than 40% of the current total to the article. Yet the pie chart makes it look as if the #1 contributor has contributed more than a quarter to the article, not 11.7 %. Could you please change the rendering of the pie chart so that the remaining other contributors (lower ranked than #10) get one collective slice of the pie, being as large as their combined total? This pie section could be gray, as this connotes lack of detail. In the Angela Merkel example, this slice of the pie would be 61.5 %, or about two thirds. The ten named top contributors would get proportionally smaller slices.
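To illustrate the request, here is a minimal sketch of the arithmetic in Python. Only the 11.7 % figure and the 61.5 % remainder come from the Angela Merkel example; the other nine percentages and the function name `pie_slices` are made up for illustration, and this is not how XTools actually renders the chart.

```python
def pie_slices(top_percentages, total=100.0):
    """Return (label, percentage) slices: one per named contributor,
    plus a single collective "Others" slice for everyone else."""
    others = round(total - sum(top_percentages), 1)
    slices = [(f"#{rank}", pct)
              for rank, pct in enumerate(top_percentages, start=1)]
    if others > 0:
        slices.append(("Others", others))
    return slices


# Hypothetical top-ten authorship percentages. Only the first value (11.7)
# is taken from the example; the rest are invented so that they sum to 38.5.
top_ten = [11.7, 6.0, 4.5, 3.8, 3.0, 2.6, 2.2, 1.9, 1.5, 1.3]

slices = pie_slices(top_ten)

# Share that contributor #1 appears to have when the pie is drawn
# from the top ten only, with no "Others" slice.
share_without_others = top_ten[0] / sum(top_ten) * 100

print(slices[-1])
print(round(share_without_others, 1))
```

With these numbers, a pie drawn from the top ten alone gives the #1 contributor roughly 30 % of the circle, while a pie that includes the collective slice shows the actual 11.7 %.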

Thank you! (PS: I am not sure whether I get a feedback or ping through this site, so if you want to contact me, better try my de:WP talk page.)

MusikAnimal (talkcontribs)

Hey! There is a dedicated page you could use to show authorship information, e.g. https://xtools.wmflabs.org/articleinfo-authorship/de.wikipedia.org/Neaira%20%28Hetäre%29 . This will show all contributors; however, there are caveats: (a) it is limited to 10 colours, which repeat; (b) I will soon limit it to the top 500 editors or so, because if there are more it sometimes fails to load. I assume this is not a problem for you.

At any rate, yes for the main Article Info page where we only show the top 10 contributors, we can add a slice for the remaining contributors, as you suggest. I'll look into implementing this soon.

Thanks for the suggestions!

MusikAnimal (talkcontribs)
Minderbinder (talkcontribs)

Hello MusikAnimal, a big thank you for the change and incredibly fast deployment. The changed graphic is exactly as I had hoped for. Your work has been well received by the authors in de:WP discussing this topic.

MusikAnimal (talkcontribs)

@Minderbinder My great pleasure :) Regarding caching -- unlike the rest of XTools, the Page History tool actually doesn't cache most data (phab:T208543). This is a caveat of its implementation. However we can easily cache the authorship stats. So my question for you is if it would suffice to link only to the dedicated authorship page, e.g. https://xtools.wmflabs.org/articleinfo-authorship/de.wikipedia.org/Angela%20Merkel, and not the full results? This would be faster for you, and less strain on the XTools servers. If the community wants the full Page History results, that is okay too :) Most of the time it will be no problem, but any high-traffic page such as your Village Pump may be very slow to process or fail entirely.

Another thing I wanted to mention: While I greatly appreciate the praise, the authorship stats you see are fetched from a third-party service called WikiWho. Their superb algorithm provides around 95% accuracy. They should get full credit for this :)

Finally, take note of the path-style URL format that XTools uses. Basically, your link should not replace spaces with + signs ("Foo+bar"); instead, use normal percent-encoding like "Foo%20bar". If you are using the {{urlencode:Foo bar}} parser function, just use {{urlencode:Foo bar|PATH}}. The other option is to pass in the page title via query string, e.g. https://xtools.wmflabs.org/articleinfo-authorship/de.wikipedia.org?page=Foo+bar.
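For anyone building these links outside of wikitext, the two URL styles could be sketched in Python roughly as follows. The helper names `path_style_url` and `query_style_url` are made up for this example, and passing `safe=""` (which also percent-encodes any "/" in the title) is an assumption rather than something XTools is confirmed to require.

```python
from urllib.parse import quote, urlencode

BASE = "https://xtools.wmflabs.org/articleinfo-authorship"


def path_style_url(project, title):
    """Put the title in the URL path: spaces become %20, never '+'."""
    # safe="" is an assumption; it percent-encodes '/' as well.
    return f"{BASE}/{project}/{quote(title, safe='')}"


def query_style_url(project, title):
    """Pass the title as the `page` query parameter, where '+' for a
    space is acceptable."""
    return f"{BASE}/{project}?{urlencode({'page': title})}"


print(path_style_url("de.wikipedia.org", "Angela Merkel"))
print(query_style_url("de.wikipedia.org", "Foo bar"))
```

The first call produces the same kind of link as the Angela Merkel example, and the second produces the `?page=Foo+bar` form.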

Minderbinder (talkcontribs)

@MusikAnimal Thank you for your helpful implementation hints. I will not be changing the GUI to include the link myself, that is left to a group of interface-admins. The formal vote on this change runs until May 8. Though there is currently a 3:1 majority for providing the deep-link to your statistics, it would be premature to discuss implementation details right now. I will point the interface-admins to this discussion after the vote has been tallied.

I like the idea of a dedicated authorship page, both to enable caching and to avoid information overload. Can I make a suggestion though: In the non-dedicated section (i.e. https://xtools.wmflabs.org/articleinfo/de.wikipedia.org/Angela%20Merkel#authorship) the table is cut off after the tenth contributor, in line with the pie chart. The contributions from rank 11 on are summarized in one line, providing the number of remaining contributors and their total contribution in terms of characters and percentage. That is not the case for the dedicated page (i.e. https://xtools.wmflabs.org/articleinfo-authorship/de.wikipedia.org/Angela%20Merkel ), which renders the table with a different cut-off at rank #500. That makes for a very long page, and effectively prevents the viewing of the pie chart when smaller (mobile) screens are used. Who is going to scroll down 500 lines? Besides, for most articles the lower ranked contributions can be for something as mundane as inserting a wiki link etc. So I would suggest bringing the cut-off of the dedicated page in line with the section of the main page. The last line with contributors from rank 11 should be expandable, so if someone clicks on it, a full table should be rendered. This would also help with caching, I imagine: Each authorship stats page would have to hold about 24 data items only.

On WikiWho: I am all for giving credit where credit is due, so I will look into contacting them.

CheckUsers in Admin Stats

Summary by MusikAnimal

This should be fixed now

Mz7 (talkcontribs)

I noticed that the CheckUser group isn't shown in the "User groups" column of admin stats. I was wondering whether there was a reason for this. Seems relevant to include, especially if we're looking for active checkusers on other projects, for example.

MusikAnimal (talkcontribs)

It's supposed to show it but this must have broken when we reworked that tool a while back. Filed a task at phab:T213119

Error querying Wikiwho API: Unknown

Summary by MusikAnimal

Fix has been deployed

MisterSynergy (talkcontribs)

As User:Minderbinder already reported earlier on this page, the German Wikipedia community held a vote on the question whether the WikiWho tool in XTools should be linked from each article, in order to make article authorship prominently visible to readers. The vote ended five days ago, and the link was subsequently added to the desktop UI page footer via de:MediaWiki:Wikimedia-copyright. Pretty much directly after the link was added, the tool started to fail showing authorship data; instead, it displays an error message Error querying Wikiwho API: Unknown for most of the requests.

There are several users on German Wikipedia complaining about this problem, and there is some speculation that there may be simply too many requests so that a request quota to the WikiWho server might be exceeded most of the time.

Can you please give some insight into the problem? What can be done to fix this situation?

(I have no idea whether someone else has already contacted you; if that is the case, please link to related discussions.)

Magiers (talkcontribs)

Hello, to add from my observations: It seems "Authorship" always has problems during the daytime (in Germany), while it starts to work in the evening/night. So it does not seem to be broken completely, but temporarily every day. I would also be pleased if someone could give insights or maybe even had a solution to the problem. Thanks!

Count Count (talkcontribs)

According to api.wikiwho.net (not linked due to spam filter) there are API limits in place:

"Currently, there is a limit of 2000 requests/day for unregistered users, and also a 60 requests/minute limit for all users."

It is possible that we are now running into either one of those.
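As a rough illustration of how a client could stay under both quoted limits, here is a minimal sliding-window sketch in Python. The class and function names are hypothetical, and nothing here reflects how XTools or WikiWho actually handle throttling; only the two limits (60 requests/minute, 2000 requests/day) come from the quote above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` recorded requests per `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of recorded requests

    def has_room(self):
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        return len(self.calls) < self.max_calls

    def record(self):
        self.calls.append(self.clock())


def try_request(limiters):
    """Record a request only if every limiter still has room."""
    if all(limiter.has_room() for limiter in limiters):
        for limiter in limiters:
            limiter.record()
        return True
    return False


# The two limits quoted above.
per_minute = SlidingWindowLimiter(max_calls=60, period=60.0)
per_day = SlidingWindowLimiter(max_calls=2000, period=86400.0)

if try_request([per_minute, per_day]):
    pass  # safe to send one request to the external API here
```

A caller that gets `False` back would wait, or serve cached data, instead of sending the request and hitting the error.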

MusikAnimal (talkcontribs)

Hey, sorry for the late reply. I think your assessments are probably right... regardless I will contact the WikiWho maintainers and get this sorted out. I am pretty sure they support this initiative. I am on holiday right now so I may not get back to you for at least a few more days. As far as I can tell XTools is not suffering in any way, so it's up to you if you want to disable the link in the meantime. Apologies for the disruption!

ToBeFree (talkcontribs)
Magiers (talkcontribs)

Thank you MusikAnimal for looking into this topic. I am sure that after living so long without the credits of the authors, de-wp can live some days longer without a working tool. So first enjoy your holiday. But XTools is affected too: When the API is not working, the "authorship" section in XTools is empty too (and the Top-Editors are not a surrogate, because they don't deliver useful metrics about authorship). It would be great if the API limits could be deactivated, or maybe even, as suggested, the tool could be transferred to our own responsibility. Greetings.

MusikAnimal (talkcontribs)

This should be fixed now! \o/ Apologies for the long wait. All the best,