Manual talk:Combating spam


Success Stopping Spammers with Cloudflare

I'm sharing an idea I recently deployed that has stopped spammers completely, at least for the past few days. You will need to sign up for a free Cloudflare account and do the required setup to integrate it with your site. There are plenty of other guides on the Internet for that, so I won't reinvent the wheel here (I tried to post a link to a good help article, but external linking is blocked). After your site is set up with Cloudflare:

  1. Navigate to the Cloudflare dashboard for your site's configuration
  2. Navigate to the Page Rules tab
  3. Create a new Page Rule with the URL www.yoursiteURL.com/w/index.php?title=Special*, and below that select "Security Setting" with the value "I'm Under Attack".
  4. Save and deploy this rule, and make sure it is at the top of the list if you have any other page rules.

This forces any request matching that pattern, which includes the Create Account and Log In pages, to be subjected to the Cloudflare DDoS challenge. So far that has defeated all spammers. I chose to enable it only on those pages so as to minimize the impact on my users. You can also enable it site-wide, but then every visitor will get the challenge. The issue on my site has only been spammers creating accounts and then editing pages with the new account, and this method stops that completely.
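(If you'd rather script this than click through the dashboard, here is a rough sketch using Cloudflare's v4 Page Rules API; CF_ZONE_ID and CF_API_TOKEN are placeholders for your own credentials, and everything here is illustrative rather than an official recipe.)

  <?php
  // Hedged sketch: create the "I'm Under Attack" rule via the Cloudflare API.
  $body = json_encode( [
      'targets' => [ [
          'target' => 'url',
          'constraint' => [
              'operator' => 'matches',
              'value' => 'www.yoursiteURL.com/w/index.php?title=Special*',
          ],
      ] ],
      'actions' => [ [ 'id' => 'security_level', 'value' => 'under_attack' ] ],
      'priority' => 1,
      'status' => 'active',
  ] );
  $ch = curl_init( 'https://api.cloudflare.com/client/v4/zones/' . CF_ZONE_ID . '/pagerules' );
  curl_setopt_array( $ch, [
      CURLOPT_POST => true,
      CURLOPT_RETURNTRANSFER => true,
      CURLOPT_HTTPHEADER => [
          'Authorization: Bearer ' . CF_API_TOKEN,
          'Content-Type: application/json',
      ],
      CURLOPT_POSTFIELDS => $body,
  ] );
  echo curl_exec( $ch );
  curl_close( $ch );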

I also follow some other good practices using Cloudflare, such as Firewall Rules that block connections by GeoIP from China, Russia, Ukraine, and North Korea, plus a few other small things. Hope this helps some of you! --Caseyparsons (talk) 13:53, 26 June 2019 (UTC)

I could kiss you. This worked great. I have the rule set to 'www.yoursiteURL.com/w/index.php?title=Special:CreateAccount*' so that I can use the other special pages without the Cloudflare challenge. --Absurdmike (talk) 22:58, 28 August 2021 (UTC)

Pages on meta.wikimedia.org

Were you not aware of these pages?

Seems like a lot of duplicated effort creating this explanation.

Could we move those onto MediaWiki.org?

-- Halz 10:11, 9 October 2007 (UTC)

By the way, I am responsible for writing most of the text on those pages, and would be delighted if more people read them and linked to them (not because I wrote them, but because I want to increase awareness of the issues/remedies). So if they're more likely to be found on here, let's move them here.
I am also very happy for people to redistribute that content, so if we wanted to move those pages into the Public Domain Help Pages, that would be fine by me. I'm not the sole contributor. Others made various minor tweaks, so maybe there are legal problems with stripping the license... but I did write the bulk of the text.
-- Halz 10:55, 26 March 2008 (UTC)
It's a nice thought, but anti-spam information is probably out of the public domain help pages' scope (considering their target audience). Would other wikis really benefit from duplicating that content (though they still can under the GFDL)? —Emufarmers(T|C) 02:21, 27 March 2008 (UTC)

Anti-spam features is on this wiki now, and yes, its content is highly duplicative of information that is also on this page. I'm also tempted to remove much of the commentary on "rel=nofollow", as (1) it is already on by default in standard MediaWiki installations (so there's no useful purpose served by instructing here on how to turn it off) and (2) it doesn't stop spam. --Carlb 05:31, 8 September 2009 (UTC)

Edit Filtering

With reference to the section on edit blocking, is there a way to block a specific string inside new page titles? For instance, I'm getting a lot of spam that has "jobs" or "job" in the title. Can I block new pages with this specific text in the title? FrankCarroll 05:57, 6 February 2011 (UTC)

Importing the spam list

There's a little hitch with the described process. I followed the process to the letter and uploaded it to my wiki, though there still seemed to be bots registering from IPs banned with that extension. After some searching, the only thing I can find is that php.net mentions that require() and include() need a 'proper' PHP file to work, and the described process doesn't mention the trailing ?> that I would assume should be added.

Does the wiki just skip incomplete extensions in such cases, and is that why bots can still register, or is there something else going on? Reiisha 05:07, 15 March 2011 (UTC)

No, ?> is not required; only <?php is. What makes you think bots are registering with IPs you've blocked? Have you used CheckUser to check? —Emufarmers(T|C) 06:29, 15 March 2011 (UTC)
I did use CheckUser and looked up the IPs in the PHP file. They're still registering with them. Reiisha 13:25, 15 March 2011 (UTC)
Since I added in the closing ?>, the extension started working correctly... Reiisha 23:07, 18 March 2011 (UTC)
I'm using MW 1.18, and tested things by blocking my own IP address. It turns out that I could still register, but that I only had read access. Once I took my IP address out of $wgProxyList and refreshed the page, I could edit again. Mr3641 19:32, 13 September 2011 (UTC)
There's a known bug where $wgProxyList has not been working since MW 1.18. What I've noticed on my own site is that blocked IPs can create accounts, but these accounts don't have permission to edit, and they pile up on your Recent Changes. If you look at the bug report, you can see an initial patch that I submitted. There may be a better way to fix this, but this has at least got things working for me again. --Mr3641 (talk) 09:04, 12 September 2012 (UTC)
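(For later readers, a minimal well-formed blacklist file of the kind this thread is about might look like the sketch below; the file name is illustrative. Note that PHP's closing ?> tag is optional, and it is often omitted deliberately so that trailing whitespace can't leak into the output, so its absence shouldn't by itself break the include.)

  <?php
  // bannedips.php -- pulled in from LocalSettings.php with:
  //   require_once "$IP/bannedips.php";
  // Example addresses only; $wgProxyList is a core MediaWiki setting
  // holding IPs that are denied editing.
  $wgProxyList = [
      '192.0.2.1',
      '198.51.100.23',
  ];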

combating human spammers who create accounts

I run a MediaWiki site and require users to create an account; I use a basic captcha and the various blacklist techniques listed on this website. I still get some new users who take the effort to create an account and then make spam pages, so perhaps these are people who operate manually rather than automated spambots. If these are indeed live people, then other than IP blacklists, which are not perfect, it is hard to catch them. But I've noticed that their account names always end in a number, like Mira2025, Alina1917, Vikulay2039 or Maryinkina1996. Therefore, one idea is to create an extension that doesn't allow usernames containing numbers. Of course, you need to explain this clearly on the account creation page, and it is a slight inconvenience to legitimate users (on my wiki this is no problem, since the number of legitimate users is currently about 10 and will never grow beyond 100, or 1000 at most, so there is no shortage of usernames; obviously, for Wikipedia, allowing numbers in usernames is essential). This might make things more difficult for spammers. It would also be possible to collect a list of blacklisted usernames.

Clearly, not a perfect solution, but just an idea to toss around.— Preceding unsigned comment added by 78.240.11.120 (talkcontribs) 20:45, 25 January 2012

There are already extensions to do so, and they're mentioned on the page. You can blacklist such usernames with TitleBlacklist; you can run all sorts of checks on them with AbuseFilter. Before doing so, try the simpler solutions explained on this page that you didn't mention. Nemo 20:02, 25 January 2012 (UTC)
TitleBlacklist looks a bit complicated; if anybody has a snippet of code to block usernames that end with numbers, please share. Most of the spam I see arrives exactly as stated above.--MAHR88 (talk) 03:09, 5 October 2016 (UTC)
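(A rough sketch, assuming Extension:TitleBlacklist is installed: a line like the one below on the MediaWiki:Titleblacklist page should reject new account names ending in two or more digits. The pattern is illustrative, so test it before relying on it.)

  # Entries are one regex per line, matched against the whole title;
  # the <newaccountonly> attribute restricts this one to account creation.
  .*\d{2,} <newaccountonly>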

I had this issue. AkismetKlik resolved it. I use 1.18, so as long as your install is current, you should be good to go for it. 68.191.162.116 18:16, 19 September 2012 (UTC)

Spam in wiki

Our wiki citywiki is being thoroughly spammed, despite using reCAPTCHA and other extensions. Could anyone please give some advice? Thanks!

There's some more advice on Manual:Combating vandalism. You could try adding Extension:SimpleAntiSpam to see if it has any effect on bots, as it's completely harmless for humans. Nemo 11:19, 16 February 2012 (UTC)
I've had the same experience with spammers getting through reCAPTCHA. What has surprisingly worked (so far) is the QuestyCaptcha feature of Extension:ConfirmEdit. To make things unambiguous for my users, I show them a word in all caps and tell them to type it in the box. Compared to reCAPTCHA, this seems like something that spammers could easily figure out, but it has cut off the torrent of spam I was getting. I'm also able to leave my wiki open to anonymous edits, since the spammers seem to all want to create accounts. I'm not sure how long this situation will last, but this strategy has worked quite well. --Mr3641 (talk) 12:09, 16 February 2012 (UTC)
I'm also using the same captcha plus most of the anti-spam extensions we could get to work on my wiki. The thing is, these guys are just walking right past all that. We're getting between 8 and 15 spammers a day signing up and posting their spam to their user pages; they make an article with the same title as their username and post the same message, and sometimes they post to their talk pages. I've noticed they all seem to use long usernames, which usually have two names and a bunch of numbers, something too long or complicated for a regular user to create. What I think is causing the rapid increase is that I've heard someone created a wiki-spamming program that goes by the name of Extreme Wiki Poster, which seems to have become rather popular within the spammer community. Popular enough that there are others who are selling wiki website links to be uploaded to the program for spamming purposes. Is anyone coming up with something to defeat this? Is there something that can be coded, or could we hack the program and find something we can use to block it? (Please note: like I said, my wiki is running the majority of anti-spam extensions, blacklists, and also a regex anti-spam word blocker in the LocalSettings.php file.) Brothejr (talk) 22:11, 22 February 2012 (UTC)
Wow, Extreme Wiki Poster is indeed pure evil. We really need to do something about this. --Mr3641 (talk) 06:42, 23 February 2012 (UTC)
One way to deal with these guys is to read their FAQs. From http://www.deathbycaptcha.com/user/faq:
Q: Can I upload CAPTCHAs in Russian, or not in English in general?
A: Better not to, we don't have solvers able to read non-English CAPTCHAs yet.
That sounds nice. However, there is one main problem with that: the vast majority of my editors don't read or type in Russian or other non-Latin scripts, so switching to Russian captchas would keep regular people from logging in. Maybe someone could create an anti-spam feature similar to StopForumSpam's add-on for the SMF forum: when a user creates an account, their username, email, and IP address are compared against StopForumSpam's database. If there is any match, or the user is doing something questionable when creating their account, the user is flagged and unable to do anything until the admin approves the account. The add-on also makes it easy for the admin to spot the spammer and delete the account before it even has a chance to spam, and it has a reporting feature that helps the database grow and makes the spammer's life all the more painful. Very rarely is a regular user flagged by this feature, and even then it's easy to rectify. About the only way I've seen spammers get past the add-on is if they are using brand-new usernames, emails, and IPs, and even then it isn't long before those are flagged too. Maybe we should create an extension that does something similar for wikis. (The StopForumSpam one is not that heavy on server processing and runs quite smoothly.) This would be very helpful for a lot of wikis and would improve their battle against spammers. Brothejr (talk) 11:29, 23 February 2012 (UTC)
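(As a very rough sketch of the lookup described above: StopForumSpam does expose a query API, and a helper along these lines could be the starting point. The function name is hypothetical, and you would still need to wire it into MediaWiki's account-creation hooks for it to do anything.)

  <?php
  // Hypothetical helper: ask api.stopforumspam.org whether an IP is listed.
  function isListedOnStopForumSpam( $ip ) {
      $url = 'https://api.stopforumspam.org/api?json&ip=' . urlencode( $ip );
      $data = json_decode( file_get_contents( $url ), true );
      // The API sets 'appears' to 1 when the IP is in its database.
      return !empty( $data['ip']['appears'] );
  }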
Just as an update: a month ago I mentioned above how I was using the QuestyCaptcha feature of Extension:ConfirmEdit to prevent spam bots. It's still working incredibly well; I get maybe one spam edit per week and still allow people to edit without creating a user account. This is a significant improvement over the 10+ spam edits per day I was getting. If you look on YouTube, you can find some videos of Extreme Wiki Poster in action. It seems that the key is to adjust your wiki slightly so that their automated scripts fail to create an account. These guys seem to be interested in spamming the largest number of wikis with the least amount of effort, and it doesn't seem to be worth their time to check why they weren't able to write to a particular wiki when they've already spammed hundreds of others. --Mr3641 (talk) 08:21, 15 March 2012 (UTC)
Yeah, I just added Extension:ConfirmEdit and set up QuestyCaptcha. It took some fooling around to get the settings right to achieve what I was looking for. It works quite well at slowing these guys down. Another thing I added as an extra layer that also seems to block some of them is Extension:RudeProxyBlock. I installed it, and once a week I download the latest spammer proxy IP list from StopForumSpam along with a couple of other anti-spam sites and update RudeProxyBlock with those proxy IPs. I can see that a bunch of spammers have been blocked by that extension. Worth a try too as an extra layer of protection. Brothejr (talk) 13:59, 17 March 2012 (UTC)
The problem with QuestyCaptcha is that the number of questions it offers is so small that it's easy for spammers to construct a complete database. Extension:Asirra's database is much larger and more dynamic, which makes it better. My wiki was getting human-assisted spam (farming CAPTCHAs out to humans), so none of these solutions were sufficient, but Extension:AbuseFilter was highly effective because the automated part of the spam consistently used usernames or article titles fitting a unique pattern. Dcoetzee (talk) 01:45, 12 April 2012 (UTC)
You can add or change the questions all you want. I generally shift them every so often to keep the spammers guessing. Plus, while they may be able to use humans to get around the captcha during account creation, they are too lazy to create and edit the pages themselves and resort to using their spam programs. So if you also attach the captcha to every edit, you'll stop them dead in their tracks, as the spamming programs they use cannot handle captchas during the edit process, only during account creation. Since I started using QuestyCaptcha not only for account creation but also for all edits, I have seen a dramatic decrease in spammer activity. Heck, because of attaching captchas to edits, I can still allow anonymous editing. About the only spammer activity I still see now is from the ones using humans to get past the account-creation captcha, but they are still stopped by the edit captcha. It's to the point that I don't even need to use Extension:AbuseFilter. Brothejr (talk) 12:04, 28 May 2012 (UTC)
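(For reference, a minimal LocalSettings.php sketch of the QuestyCaptcha setup described in this thread, assuming a reasonably modern MediaWiki with the bundled ConfirmEdit extension; the question is obviously just an example and should be replaced with your own.)

  wfLoadExtensions( [ 'ConfirmEdit', 'ConfirmEdit/QuestyCaptcha' ] );
  $wgCaptchaClass = 'QuestyCaptcha';
  // Show a word in capitals and ask users to type it back.
  $wgCaptchaQuestions[] = [
      'question' => 'Type the following word: WOMBAT',
      'answer'   => 'WOMBAT',
  ];
  // Trigger the CAPTCHA on account creation AND on every edit, which is
  // what stops the bots that farm account creation out to humans.
  $wgCaptchaTriggers['createaccount'] = true;
  $wgCaptchaTriggers['edit'] = true;
  $wgCaptchaTriggers['create'] = true;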

I've been flooded with Chinese spam that evades the CAPTCHA but includes no links, which is bizarre, as I can't see any reason for the spam without them, and the text is random phrases. It also somehow evades SpamRegex: "gucci", "bottega" and "jimmy choo" (amongst others) are banned but somehow still get through. Hogweard (talk) 13:08, 20 July 2012 (UTC)
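(One possible explanation for banned words slipping through, offered as a guess: $wgSpamRegex is a single PCRE, and without the /i modifier it matches case-sensitively, so "Gucci" would sail straight past a lowercase "gucci" pattern. A case-insensitive sketch:)

  // The trailing /i makes the match case-insensitive; extend the
  // alternation with whatever terms you need to block.
  $wgSpamRegex = '/gucci|bottega|jimmy\s*choo/i';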

massively blocking or removing user accounts

Hello,

I used several solutions proposed here, with IP blacklisting and regular-expression filtering. Nevertheless, spammers have created a lot of user accounts and continue to use them. What I would like is a way to block editing for these accounts (there are several thousand of them, so manual blocking isn't usable). Do you know of any scripts or extensions to do that? --Psychoslave (talk) 12:03, 10 April 2012 (UTC)

Use Extension:CheckUser to hard-block the underlying IP addresses. CheckUser can also accomplish mass-blocking.--Jasper Deng (talk) 17:40, 10 April 2012 (UTC)

Merge with Anti-spam features?

This page has quite a lot of overlap with Anti-spam features, and probably even more now that I've expanded that page (not sure why I didn't expand this one instead). Could we consider replacing this page with that one, based on the current version, or merging them in some fashion? Dcoetzee (talk) 19:21, 10 April 2012 (UTC)

Yes, they should be merged. I think the resulting page should probably be in the "Manual" namespace, regardless of which title you choose. —Emufarmers(T|C) 05:52, 18 April 2012 (UTC)

Tornevall

Is Tornevall even being updated anymore? I have checked their forums, and they're filled with spam; the latest IP removal request was responded to months ago. --Sovereign92 (talk) 19:43, 2 July 2012 (UTC)

',\n'

Separating the IPs with ',\n' doesn't work. You must replace , with ',' so that the IP ban list works. Best regards, 217.5.205.2 10:08, 21 October 2012 (UTC)
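(To illustrate the point, here is a hedged sketch of converting a comma-separated list into a usable blacklist file. The file names are illustrative; the key step is exactly the replacement described above, so that each IP ends up as its own quoted PHP string.)

  <?php
  // Read a raw comma-separated list, e.g. "192.0.2.1,198.51.100.23,...".
  $raw = trim( file_get_contents( 'bannedips.txt' ) );
  // Replace each bare comma with ',\n' -- quote, comma, newline, quote --
  // and wrap the ends so every IP becomes its own quoted string.
  $quoted = "'" . str_replace( ',', "',\n'", $raw ) . "'";
  file_put_contents(
      'bannedips.php',
      "<?php\n\$wgProxyList = [\n" . $quoted . "\n];\n"
  );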

CPU usage; IP blacklists

Spambots can sometimes affect performance a lot, if countermeasures are too weak or too expensive. This should be addressed somehow in the page.[1] [2] --Nemo 16:44, 19 May 2013 (UTC)

An answer I got: «It seems that the DNS blacklists are probably fine. They will introduce some delay, but largely your name resolver will cache that information, so performance really shouldn't be that problematic. I was using the method of downloading the Stop Forum Spam database and then turning that into a call in LocalSettings.php, and after looking into it I really wouldn't recommend that approach.» [3] --Nemo 08:16, 24 May 2013 (UTC)
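(For reference, the DNS blacklist approach endorsed above is built into core MediaWiki; a minimal LocalSettings.php sketch follows, with the list hosts as examples you should vet yourself.)

  // Check anonymous editors against DNS blacklists before accepting edits.
  $wgEnableDnsBlacklist = true;
  $wgDnsBlacklistUrls = [
      'xbl.spamhaus.org',    // Spamhaus exploits list (example)
      'dnsbl.tornevall.org', // Tornevall, discussed elsewhere on this page
  ];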

Mediawiki Needs a NORMAL Admin

With options that allow you to combat such things. This script is all over the place, and the spam bots are ridiculous. I haven't had mine up for three weeks and I get 3 to 4 bots a day that I have to ban, delete content from, and block. I'm spending more time combating spam than adding data.

Then your documentation for everything is garbage. None of it can be understood unless you are a geek. On top of all that, there is no community or forum where you can interact with others to learn. It's complete crap.

Hello, thanks for your comment. The anti-spam tools available in MediaWiki are actually more effective than in most software solutions, IMHO, if you consider that it's designed to provide completely open wikis. Surely it's not as easy as, say, WordPress, but the main problem is usually that system administrators and wiki administrators don't know how to use them, due to our poor documentation and even poorer default configuration (we're trying to work on both).
The documentation on this page is mostly rather easy, I think: probably you were confused by some of the most technical sections, which are actually unneeded. I've now rearranged them; hopefully this makes it easier to find what you need and to understand it. Please let us know what's unclear. I don't know what you mean by "forum" here, but there are several Communication venues where you can ask for help. As this page says, you can ask on mediawiki-l or Project:Support desk; if you prefer a typical forum, the closest thing is probably StackExchange.[4] [5] [6] [7] --Nemo 10:47, 3 December 2013 (UTC)

To what end?

A section describing motive would definitely help. Why do crackers want to fill what's most likely a low-traffic MediaWiki installation with spam? I installed MediaWiki as a pilot two years ago and absent-mindedly left it running. Yes, I realize I'm an idiot, so we can move on. There's absolutely no traffic on this site besides a newfound, extremely active spammer community. About 1000 IP addresses are going to town filling up my MySQL database with endless varieties of v1agr4 sales pitches. I've been trying to figure out why and ended up here.

I disabled my wiki a week ago; I literally moved the /var/lib/mediawiki directory. I'm still getting thousands of requests per day. Are these requests from spammers/content providers, or are they from clients/content readers? Was my MediaWiki installation being used as some sort of data relay or sharing hub? A lot of the content is written in the goofy under-the-radar kr4dwr1te style, which might imply the content is ultimately destined for e-mail. If so, all of this possibly adds up to the notion that a number of cracked e-mail servers owned by legitimate businesses were using my MediaWiki installation to get the latest and greatest e-mail content tweaked to get around mail filters... Is any of this correct? If so, perhaps it can be added to the page.

The other possibility, that spammers just want to crap on MediaWiki installations and irritate the user base, doesn't seem to make sense. It's a theory that suffers from the virus-killing-its-host syndrome. — Preceding unsigned comment added by 132.239.16.168 (talkcontribs)

If we perfectly knew the intent and reward mechanisms of malice, we would already have defeated it. So, yes, your question makes little sense. --Nemo 16:45, 17 July 2014 (UTC)
"I don't know" is more concise and accurate. I think "we" is more than a little liberal. — Preceding unsigned comment added by 132.239.16.168 (talkcontribs)
You're simply being attacked by spambots. They target MediaWiki installations, and sometimes just any website where one can submit a form (they don't care if it's a search box, a contact form, blog comments, etc.). They don't care whether sites are high-traffic ones or simply test ones. They're usually powered by botnets. I guess that once they succeed in spamming your site, it's added to their database and they keep spamming it even if it's offline. --Ciencia Al Poder (talk) 11:11, 20 July 2014 (UTC)

Spambots still create user accounts

Even if you take effective measures to block the spambots, they still successfully fill the wiki with an excess of fake user accounts. --Rob Kam (talk) 10:57, 14 December 2017 (UTC)

You should have a captcha in place on the account creation form. If they still circumvent this, there's nothing more to do, really, unless you check which IPs they originate from and start blocking them (if you find out they're open proxies), which you can do at the webserver level by denying them access entirely or only on POST requests. --Ciencia Al Poder (talk) 10:35, 15 December 2017 (UTC)
...and? Is that a problem?
On my wiki, the spambots can't even figure out how to edit their own user talk page. They literally just create an account, and that's all. 203.96.214.236 20:43, 23 September 2018 (UTC)

Actual effectiveness for mechanisms mentioned

I've had the dubious honor of running a few MediaWiki sites that together received thousands of spam edits every day. Through trial and error, I think I'm able to provide a good overview of what works and what doesn't. First off, this doesn't really apply to Wikipedia and other highly active websites - it's more for those smaller instances where there isn't going to be an administrator available 24/7 and a more automated solution is needed.

  • rel=nofollow has no impact on spam. These are spam bots, and they don't bother checking. I think most of these spam edits are made by "SEO" companies who just need to show their clients 1000 or so backlinks from PR 5 websites or something in their final report. Whether the link actually does anything isn't of concern to the spammer, and the client purchasing the spammer's services doesn't know any better.
  • CAPTCHAs are extremely helpful IF you use the right ones. Questy is the most effective one by far, since spammers can't blanket-bomb your website; any spam bot would need to be customized. I'd say a few dozen unique questions are enough. The new reCAPTCHA is effective, but there are going to be paid CAPTCHA-cracking services using actual humans in third-world countries available once the demand is high enough. None of the math ones are useful, and it's trivial to write a bot to bypass those anyway.
  • IP address blacklists are pretty worthless. You just can't keep up, since it's really easy to get a new IP (e.g. a DigitalOcean droplet or some other VPS) for spamming a few times before changing. Loading hundreds of thousands of IPs also noticeably slowed down edit save times for me. Instead of loading an IP blacklist, consider using Cloudflare or a similar service; they'll filter the request by an IP's reputation before it reaches MediaWiki. If you already use Cloudflare, then there's no point in having any sort of generic IP blacklist.
  • $wgSpamRegex and other pattern-based anti-spam features aren't needed if you already have the abuse filter set up and configured. I'd recommend disallowing new users (remember to set custom $wgAutoConfirmAge and $wgAutoConfirmCount values first) from adding external links; there's a sketch of those thresholds after this comment.
  • Disallow new users from creating new pages containing external links.

The actual solution you deploy would really depend on your website though - if it's quite normal for legitimate new users on your website to post external links, then obviously an abuse filter catching outbound links won't be a good idea. I guess the key thing is to use a unique CAPTCHA - you'd be forcing customized attacks against your website rather than letting spammers blanket-bomb you. Visamucus (talk) 21:42, 16 May 2018 (UTC)
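(A minimal sketch of the autoconfirm thresholds mentioned in the list above; the values are illustrative, and you'd pair this with an AbuseFilter rule disallowing non-autoconfirmed users from adding external links.)

  // New accounts must be at least four days old and have ten edits
  // before they leave the "new user" bucket the abuse filter targets.
  $wgAutoConfirmAge = 4 * 24 * 3600; // seconds
  $wgAutoConfirmCount = 10;          // edits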

Thanks, really helpful insights. Yaxu (talk) 07:57, 26 November 2018 (UTC)
Cloudflare has lists of bad IPs which might be useful, but unfortunately their pre-canned list only seems to be available on the paid "pro" version. The free version is capable of adding IPs to the firewall one by one, either through their API or manually from the web interface at Websites → (pick any site) → Security → WAF → Tools. Odd that no one's done anything to use the API to automate blacklisting IPs at the firewall? 204.237.89.128 02:01, 29 April 2023 (UTC)
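(It is at least scriptable. As a hedged sketch: Cloudflare's v4 API exposes per-zone IP Access Rules, so a function like the one below, with placeholder credentials, could be called from whatever harvests your blocked IPs. This is an illustration, not a tested tool.)

  <?php
  // Hypothetical sketch: add a blocking IP Access Rule for one address.
  // CF_ZONE_ID and CF_API_TOKEN are placeholders for your own values.
  function cloudflareBlockIp( $ip ) {
      $ch = curl_init( 'https://api.cloudflare.com/client/v4/zones/'
          . CF_ZONE_ID . '/firewall/access_rules/rules' );
      curl_setopt_array( $ch, [
          CURLOPT_POST => true,
          CURLOPT_RETURNTRANSFER => true,
          CURLOPT_HTTPHEADER => [
              'Authorization: Bearer ' . CF_API_TOKEN,
              'Content-Type: application/json',
          ],
          CURLOPT_POSTFIELDS => json_encode( [
              'mode' => 'block',
              'configuration' => [ 'target' => 'ip', 'value' => $ip ],
              'notes' => 'MediaWiki spambot',
          ] ),
      ] );
      $result = curl_exec( $ch );
      curl_close( $ch );
      return $result;
  }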

Bulk deletion and cleanup

"Cleanup scripts or bulk deletion (Extension:Nuke) of existing posts from recently-banned spambots..."

It might be worth adding a mention of Extension:SmiteSpam and Extension:BatchDelete here, as both are useful cleanup tools. Nuke only deletes one user's garbage at a time, and then only if the rubbish is still on Special:RecentChanges. That's rather limiting, since many of the worst spambot IPs on Extension:StopForumSpam and Extension:Antispam-style blacklists operate by brute-forcing multiple CAPTCHA attempts to bulk-create user accounts, dumping one user page and a few mainspace pages of linkspam with each before moving on. SmiteSpam is like Nuke on steroids by comparison, despite the bugs in its implementation - and the few bits which slip under the radar (usually short stubs or pages with no external links) still have to be killed manually, making the BatchDelete extension rather useful. There's no single answer to the problem of net.abuse, so listing all three cleanup tools is best. 204.237.89.128 01:32, 29 April 2023 (UTC)
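(Core also ships a maintenance script that pairs well with these: maintenance/deleteBatch.php deletes every page listed in a plain-text file, one title per line. A typical invocation, with illustrative names, looks something like this:)

  php maintenance/deleteBatch.php -u WikiSysop -r "Spam cleanup" spam-pages.txt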

Allowlist IP/IPs? Edits & Account Creation - Anti-Spam

It would solve most of my MediaWiki moderation issues if I could whitelist IPs. The primary MediaWiki I maintain supports a community centered around a physical space, and I would like to be able to allow anonymous contributions from there, as well as unrestricted account creation from there. I don't find any mention of the possibility of *improving* reputation by IP in any of the spam-limiting extensions. Mcint (talk) 20:17, 15 December 2023 (UTC)
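(One partial option, offered as a sketch: if the physical space has a stable address range, ConfirmEdit can at least skip CAPTCHAs for it via its IP whitelist setting. The range below is illustrative, and this doesn't lift other restrictions such as account-creation throttles.)

  // Assumes Extension:ConfirmEdit; requests from this range (illustrative)
  // are never shown a CAPTCHA.
  $wgCaptchaWhitelistIP = [ '203.0.113.0/24' ];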