Anti-spam features


 * See also Manual:Combating spam and Manual:Combating vandalism, where many of the same issues are addressed.

MediaWiki provides the following features to reduce the problem of wiki spam.

Note that many of these features are not activated by default. If you run a MediaWiki installation on your own server or host, you are the only one who can make the necessary configuration changes. By all means ask your users to help watch out for wiki spam (and do so yourself), but these days spam can easily overwhelm a small wiki community, so it helps to raise the bar a little. Note, however, that none of these solutions is completely spam-proof. Always check 'recent changes' (Special:Recentchanges) periodically!

$wgSpamRegex
To prevent a spammer from saving edits with problematic content, use the variable '$wgSpamRegex' (in very old versions of MediaWiki, the setting was called '$wgSpamBlacklist'). Set the variable in LocalSettings.php (overriding the value in DefaultSettings.php) to a regular expression that matches any URLs (or parts of URLs) you do not want users to link to. You can also match any other content you wish to ban. Users whose edit matches are shown an explanatory message indicating which part of their edit text is not allowed.

Place a line like this somewhere in your LocalSettings.php file. This prevents any mention of 'online-casino', 'buy-viagra', 'adipex' or 'phentermine'. The '/i' at the end makes the match case-insensitive.
 * Simple example $wgSpamRegex setting:
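
A minimal sketch of such a setting, built from the terms described in this section. The exact CSS patterns at the end are an assumption chosen for illustration; adapt them to the hiding tricks you actually see on your wiki.

```php
// LocalSettings.php
// Reject any edit whose text matches one of these alternatives.
// The last two alternatives (hidden or tiny scrolling blocks) are
// illustrative assumptions, not an exhaustive list of CSS tricks.
$wgSpamRegex = '/online-casino|buy-viagra|adipex|phentermine|adult-website\.com|display:\s*none|overflow:\s*auto;\s*height:\s*[0-4]px;/i';
```

The dot in 'adult-website\.com' is escaped so that it matches a literal dot rather than any character.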

The example also prevents any reference to 'adult-website.com'. Clearly this kind of setting provides an easy way to get rid of a particular spammer if they keep coming back to your wiki.

Finally, the example also blocks certain CSS style attributes which have been used to hide spam in many attacks. Unfortunately there are many workarounds a spammer can use, but for the time being this will get them off your back.

This is only a simple example. See $wgSpamRegex documentation for more detail.

Longer spam blacklists
The above approach will become too cumbersome if you attempt to block more than a handful of spammy URLs. A better approach is to have a long blacklist identifying many known spamming URLs, in a more readable format (not a single regular expression). To achieve this, you will need to use the SpamBlacklist extension. With this, you can allow some of your users to edit the blacklist on a wiki page, and you can fetch updates from external sources.
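
As a rough sketch, enabling the extension on a MediaWiki version that supports extension registration usually comes down to one line. This assumes the extension files are already in extensions/SpamBlacklist; older releases load extensions differently, so check the extension's own documentation.

```php
// LocalSettings.php
// Assumption: a MediaWiki version that provides wfLoadExtension().
wfLoadExtension( 'SpamBlacklist' );

// Entries can then be maintained on the wiki page MediaWiki:Spam-blacklist,
// one regular-expression fragment per line.
```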

Spam cleanup script
Blacklisting spam words or spammer domain names prevents future spam, but it does not get rid of existing spam. In fact, if you allow existing spam to remain, the blacklist may interfere with people attempting to make legitimate edits. It is important to clean up as well as add to the blacklist. You can do this by hand or, if spam is widespread, you may find the spam cleanup script (cleanup.php) useful. After you update the spam blacklist, the script scans the entire wiki and, wherever it finds matching spam, reverts the page to the latest spam-free revision.

Procedure:
 * 1) Copy cleanup.php to the extensions/SpamBlacklist folder.
 * 2) Log in to your server over SSH (for example, using PuTTY).
 * 3) Navigate to the extensions/SpamBlacklist subdirectory.
 * 4) Type "ls" (or "dir") to confirm that cleanup.php is in the directory.
 * 5) Type "php cleanup.php" to run the script.

CAPTCHA images
The ConfirmEdit extension will confirm that an edit is being made by a human, and not a spam bot. It does this by forcing users to type the text from a CAPTCHA image. By default this is only triggered if they have added a URL as part of their edit. The displayed message reads...

"Your edit includes new URL links; as a protection against automated spam, you'll need to type in the words that appear in this image".

Captchas have some disadvantages in terms of accessibility and inconvenience to your real human users. They also will not completely spam-proof your wiki: for a start, they do nothing to stop human spammers.
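
A minimal configuration sketch for the behaviour described in this section, assuming a MediaWiki version with extension registration. Note that the CAPTCHA type shown to users depends on which ConfirmEdit module you enable; an image CAPTCHA needs additional setup that is not shown here.

```php
// LocalSettings.php
// Assumption: a MediaWiki version that provides wfLoadExtension().
wfLoadExtension( 'ConfirmEdit' );

// Challenge edits that add new external links.
$wgCaptchaTriggers['addurl'] = true;

// Optional assumption: also challenge account creation,
// which slows down automated sign-ups.
$wgCaptchaTriggers['createaccount'] = true;
```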

Proxy blocker
As of version 1.4.1, MediaWiki has proxy blocking. The idea is to prevent the use of open proxies. Most spammers use open proxies to obscure their identity, and to avoid IP address bans. It enables them to access a wiki, and make edits, from many different IP addresses.
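
In older releases the proxy check was switched on with a single setting, sketched below. Treat this as an assumption about your version: the setting was dropped from later releases, so check the configuration documentation for the MediaWiki version you run before relying on it.

```php
// LocalSettings.php (older MediaWiki releases only)
// Ask MediaWiki to check whether an editor is coming through an open proxy.
// Assumption: your release still ships this setting.
$wgBlockOpenProxies = true;
```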

rel=nofollow link attribute
MediaWiki uses the rel=nofollow link attribute by default (this is configurable; see Manual:$wgNoFollowLinks for details). It tells search engines not to follow external links added by users, which makes spammy links much less valuable. Note that this does not prevent spam. Spammers generally don't notice the difference and will abuse your wiki anyway, but they benefit much less from it.

By default, it is put on all external links, plus log and history pages. See NoIndexHistory. Note that putting it on all external links is a rather heavy-handed anti-spam tactic, which you may decide not to use (by switching off the rel=nofollow option). See Nofollow for a debate about this. It is good to have this as the installation default, though: administrators who are not thinking about spam problems will tend to have the option enabled.
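
If you decide the default is too heavy-handed for your wiki, the option can be switched off with the setting named above. A minimal sketch:

```php
// LocalSettings.php
// Stop adding rel="nofollow" to external links (the default is true).
$wgNoFollowLinks = false;
```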

Lock down (lazy solution)
You can disallow editing by anonymous users. This forces users to create an account with a username and to sign in every time before editing. A more extreme option (with better spam protection) is to create a "gated community" in which new users (and spammers) cannot create an account themselves. They have to request one from you.

People often naively suggest lock-down as the best solution to wiki spam. It does reduce spam, but it is a poor and lazy solution, because it massively inconveniences real users. Having to choose a username and password is a big turn-off for many people. The wiki way is to be freely and openly editable. This "soft security" approach is one of the key strengths of the wiki concept. Are you going to let the spammers spoil that?

...if so, you can easily lock down your MediaWiki installation as follows:

Add the following to your LocalSettings.php:
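
A minimal sketch of the setting that disallows anonymous editing, using the standard group permissions array ('*' stands for all users, including those not logged in):

```php
// LocalSettings.php
// Users who are not logged in may no longer edit pages.
$wgGroupPermissions['*']['edit'] = false;
```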

Note that this only reduces spam. These days MediaWiki installations are routinely targeted by more advanced spam bots that can register accounts automatically, so with this setting alone you will end up with a lot of bogus user accounts (where the name is just a set of random letters) in the database. You should combine it with a CAPTCHA extension such as ConfirmEdit, which can keep bots out.

To take the lock down idea to extremes, MediaWiki allows you to create a "gated community" where new users can't even register without asking you to set up an account for them. To do this, add the following to your LocalSettings.php:
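
A minimal sketch of the gated-community setting. It assumes the default permissions for other groups are left in place, so that administrators can still create accounts on request:

```php
// LocalSettings.php
// Nobody can register an account for themselves; accounts must be
// created by a user whose group still has the 'createaccount' right.
$wgGroupPermissions['*']['createaccount'] = false;
```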

See Manual:User rights and Manual:Preventing access for more information.

Other ideas
This page lists features which are currently included, or available as patches, but on the discussion page you will find many other ideas for anti-spam features which could be added to MediaWiki, or which are under development.

There is now also a 'Spam Filter' project, dedicated to the task of building more effective spam filtering for MediaWiki.