Manual talk:Edit throttling/Meta Archive

The following discussion has been transferred from Meta-Wiki.
Any user names refer to users of that site, who are not necessarily users of MediaWiki.org (even if they share the same username).

Moved from the article core by Evan
Moved to talk page by w:User:Chinasaur

Leaky bucket implementation

Rather than an explicit limit, why not give each user/IP a leaky bucket initialized to allow 10 edits when the first edit from that user happens - also, you could have several buckets, to limit edits per hour to 100, as well as limiting edits in a minute to e.g. 10. Pakaran 18:06, 6 Feb 2004 (UTC)

For those unfamiliar with the concept, the idea is that each edit requires a "token." Tokens not yet used are stored in a "bucket" (in practice, an integer variable), which "overflows" and wastes tokens when it reaches a specific value; tokens are added every N seconds. This is a common way to control excessive load in routers and such. We could have one bucket for the short term - for example a limit of 12 edits in 30 seconds, with one token added every 5 seconds and a limit of 6 in the bucket (someone check my math) - and another bucket for a long-term limit, e.g. per hour. Pakaran 18:10, 6 Feb 2004 (UTC)
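The bucket described above can be sketched in a few lines. This is a minimal illustration only, not code from MediaWiki: the class name and parameters are invented for the example, using Pakaran's figures (a capacity of 6 tokens, one token added every 5 seconds, bucket starting full).

```python
import time

class TokenBucket:
    """Token-bucket rate limiter as described above: tokens accrue at a
    fixed rate up to a capacity, and each edit consumes one token."""

    def __init__(self, capacity, refill_seconds, now=time.monotonic):
        self.capacity = capacity              # bucket "overflows" past this
        self.refill_seconds = refill_seconds  # one token added every N seconds
        self.tokens = capacity                # start full, per the proposal
        self.now = now                        # injectable clock, for testing
        self.last = now()

    def allow_edit(self):
        # Accrue fractional tokens for the time elapsed, discarding overflow.
        elapsed = self.now() - self.last
        self.tokens = min(self.capacity,
                          self.tokens + elapsed / self.refill_seconds)
        self.last = self.now()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With these numbers a full bucket allows 6 immediate edits, and the 5-second refill adds at most 6 more over 30 seconds, so no more than 12 edits fit in any 30-second window, matching the figures in the comment above.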

See Meatball:SurgeProtector. Known technique.

Uh... that page seems to be about all kinds of protection against "surges", not particularly for edit floods, and different techniques for doing that. Ward's wiki seems to have some kind of edit flooding protection based on percentage of recent changes, which is pretty interesting. --Evan 03:40, 7 Feb 2004 (UTC)

Recent changes protection

Note also that for most Wikipedians, Special:Recentchanges is still an important way of getting to know what is going on at the moment. (When a wiki becomes very active, it becomes impossible to keep up with the activity from Recentchanges, and people rely more on their Watchlist, as I observe.) An edit throttle would prevent a newbie from cluttering Recentchanges by making too many edits to the same page in a short time, rather than using preview or writing at length in one edit.

The edit throttle is, for this reason, one of the strongly desired features among some Japanese Wikipedians. Tomos 03:27, 20 May 2004 (UTC)

Proxied robots?

But using proxy servers, wouldn't it be possible for a bot to randomize or at least distribute its attacking IPs and thus effectively bypass the throttle? --w:User:Chinasaur

Sure, but at this level of vandalism you are certainly looking at a systematic effort for commercial gain (or at the very least ideological fervor). To gain access to this range of IPs, one either needs to control a network of trojaned proxy servers or have upstream access to the MIT internet link (i.e. have administrator access to a class A network). This means that the vandal is adding the same text (or similar text) to all his edits; no one goes to this trouble just to put dirty words on a lot of pages. MediaWiki already has a feature to allow certain URLs (or arbitrary character strings??) to be added to a list of banned edits.
Not true. There are hundreds of lists of free anonymous proxy servers on the net; you just need to grab a list and tell the bot to connect through it, thus bypassing all the throttle security. Any 14-year-old kid with limited VB knowledge can work out how to bypass the throttle. It's sad, but true. The bot can even spoof the IP; you can see a lot of download accelerators doing this. It's a big pain for people running download services. --Thomas
For proxy blocking, see m:Proxy blocking. Thanks. --Matt57 22:16, 28 October 2006 (UTC)

"Turing tests" for all edits?

Adding "Turing tests" (e.g. pictures of garbled text that the user must type) as a requirement for all edits should stop all robots except very sophisticated ones, and be of little inconvenience to users, except that it would block out blind users. Blind users could, however, be accommodated by allowing slow registered editing without tests.

This would be a great feature to have, especially for anonymous users. I've been getting frustrated by spam on my wiki lately. Thanks!
If you want to disable those bots, use Bad Behavior, the link to which is given below in John's post. If you want to stop spam, use this. --Matt57 22:14, 28 October 2006 (UTC)

Bad Behavior Extension

I'm wondering if anyone can tell me how the proposed edit throttling compares with what the Bad Behavior extension ([1]) already does? Thanks! - John

Hi John, I just ran a test on a wiki I help administer, which has both SpamBlacklist and Bad Behavior (BB). BB checks for agents; e.g. if people use programs to flood a wiki, BB will catch that. It doesn't catch a troll sitting and editing a page every 5 seconds. SpamBlacklist only checks for spam. BB and SpamBlacklist don't do edit throttling. Throttling can be very useful for watching out for new users, and I believe it can be easy to develop. It seems vandal bots could help here, but really, throttling can reduce damage and suspend an IP or user for a certain time if there are too many edits from it. That would be a great thing to have. --Matt57 22:12, 28 October 2006 (UTC)

Developing Edit Throttling

Some people here are talking about Edit Throttling (ET). I think it should be fairly simple to make. It could be based on the SpamBlacklist extension. It would check a certain page containing a list of "safe" users who do not need any kind of monitoring, and then apply this kind of control (the link above explains what this is):

&AddRateLimit?(10*60, 10, 'minutes', 200, 50, 80, 20);

All we need to do is record the username in a little ET database and note down the time that has elapsed since they made their last post. Put in other details as well, like the number of edits in a certain time, etc. It should be simple, and it will help watch out for nasties who get a new username and start vandalizing. And then, if someone violates the policies, their editing status is disabled for 24 hours or so to reduce the damage.

First, we need a log of all edit activity in the last hour:

  • Username
  • Current Time
  • Time since they registered on the wiki

That's really it.

From this we can compute the needed variables and put them into a new table. This will record edit monitoring for each user:

  • Username
  • Time since last edit was done
  • Number of edits made by that username in the last hour

Global monitoring:

  • Number of all edits done within an hour (for example, if it's a group of vandals)
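The tables above can be sketched as a small throttle check. This is a hypothetical illustration, not the SpamBlacklist extension's API: all names and limits are invented, and in-memory dictionaries stand in for the database tables the proposal describes.

```python
import time
from collections import defaultdict, deque

# Hypothetical limits; a real extension would read these from a config page.
USER_EDITS_PER_HOUR = 60
GLOBAL_EDITS_PER_HOUR = 2000
SUSPENSION_SECONDS = 24 * 3600
SAFE_USERS = {"Admin"}            # "safe" users exempt from monitoring

user_edits = defaultdict(deque)   # username -> timestamps of edits in last hour
global_edits = deque()            # timestamps of all edits in last hour
suspended_until = {}              # username -> time the suspension ends

def _trim(q, now):
    """Drop timestamps older than one hour from the front of a queue."""
    while q and now - q[0] > 3600:
        q.popleft()

def check_edit(username, now=None):
    """Return True if this edit is allowed; suspend the user on violation."""
    now = now if now is not None else time.time()
    if username in SAFE_USERS:
        return True
    if suspended_until.get(username, 0) > now:
        return False                      # still serving a 24-hour suspension
    q = user_edits[username]
    _trim(q, now)
    _trim(global_edits, now)
    if len(q) >= USER_EDITS_PER_HOUR or len(global_edits) >= GLOBAL_EDITS_PER_HOUR:
        suspended_until[username] = now + SUSPENSION_SECONDS
        return False
    q.append(now)
    global_edits.append(now)
    return True
```

The per-user queue gives "number of edits made by that username in the last hour", the global queue gives "number of all edits done within an hour", and the suspension map implements the 24-hour disable described above.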

Can someone PLEASE make this thing?

--Matt57 22:20, 28 October 2006 (UTC)

some users batch connect

There are all kinds of users. E.g., I'm posting this via a queued Perl LWP::UserAgent POST command through WWWOFFLE on my once-a-week connection to the net, where I burst-post many comments I've stored up over the week. It's more like emailing to a page, to maximize expensive connect time. --User:Jidanni 2006-11-16

Anon only version

As far as I know (I don't know whether others have observed this), anon users on average edit less frequently than logged-in users, and may use a different IP address each time. An anon-only version would therefore be much less visible to human editors than one that also applied to logged-in users. This would be sufficient provided that account creation were well protected by an effective form of captcha, and/or logged-in users were required to specify a valid email address before editing any pages. I would suggest that maybe this system could dampen the possible edit rate every time the user tries to edit more frequently than allowed. Therefore, AOL-ers, for example, will have to log in to edit pages. Myrt|comments 08:22, 20 November 2006 (UTC)