Extension:Check Spambots

Description
Spambot Search Tool is an automated script that may be configured to query the following databases:
 * 1) fSpamlist - fspamlist.com
 * 2) StopForumSpam - stopforumspam.com
 * 3) Sorbs - sorbs.net
 * 4) Spamhaus - spamhaus.org
 * 5) SpamCop - spamcop.net
 * 6) ProjectHoneyPot - projecthoneypot.org
 * 7) Bot Scout - botscout.com
 * 8) DroneBL - dronebl.org
 * 9) AHBL - ahbl.org
 * 10) Undisposable - undisposable.net
 * 11) Tor Project - torproject.org

Most of these databases list known spambot or open-proxy IP addresses; a few also list e-mail addresses or user names. Undisposable identifies bugmenot users and free e-mail addresses intended for one-time use, while the Tor Project list identifies proxies which are part of the TOR network. ProjectHoneyPot (if enabled) requires an API key from projecthoneypot.org, and Bot Scout (if enabled) is limited to a small number of enquiries per day unless a key is obtained from botscout.com.
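The exact option names in config.php vary between releases of the Spambot Search Tool, so the fragment below is only a hypothetical illustration of enabling checks and supplying the two API keys mentioned above; consult the comments in the shipped config.php for the real names:

```php
<?php
// Hypothetical config.php fragment - option names are assumptions,
// not the actual identifiers used by the Spambot Search Tool package.
$use_sfs = true;                  // StopForumSpam - no key required
$use_hp  = true;                  // ProjectHoneyPot - requires an API key
$hp_key  = 'your-honeypot-key';   // obtain from projecthoneypot.org
$use_bsc = true;                  // Bot Scout
$bsc_key = 'your-botscout-key';   // optional; lifts the daily query limit
$use_tor = false;                 // skip the Tor exit-node check
```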

There are two versions of the original Check Spammers script. One is a standalone program, as deployed at http://temerc.com/Check_Spammers; the second is check_spammers_plain.php, a simplified version of the script that can be used for forums, blogs, guestbooks or other web forms that allow users to comment or post. It returns true or false, depending on whether or not the user is listed in the databases.
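As a sketch of how a web form might call the plain version (the exact function name and argument order exported by check_spammers_plain.php may differ between releases, so treat check_spammers() here as an assumption):

```php
<?php
// Hypothetical caller - assumes check_spammers_plain.php exposes a
// check_spammers($username, $email, $ip) function which returns true
// when the visitor matches one of the configured blacklists.
require_once dirname(__FILE__) . '/check_spammers_plain.php';

$ip = $_SERVER['REMOTE_ADDR'];
if (check_spammers($_POST['username'], $_POST['email'], $ip)) {
    die('Posting refused: address or name appears on a spam blacklist.');
}
// ...otherwise continue saving the comment or post...
```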

The CheckSpambots.php script listed below is a wrapper which allows the Spambot Search Tool to be called as a MediaWiki extension. It is not part of the original package and is currently neither distributed nor supported by the original "check spammers" author.

Installation

 * 1) Download the source files (as listed below) to your "...extensions/CheckSpambots/" directory
 * 2) Edit config.php to indicate which servers you wish to check for spam blacklist information; add any API keys (if applicable)
 * 3) Edit your wiki's LocalSettings.php to load the extension
 * 4) Set $wgEnableSorbs = false; as it is no longer needed
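The LocalSettings.php addition in step 3 is typically a single require_once line; for example:

```php
<?php
# In LocalSettings.php, after the main MediaWiki includes:
require_once "$IP/extensions/CheckSpambots/CheckSpambots.php";

# Step 4: the built-in SORBS check is redundant once this extension runs
$wgEnableSorbs = false;
```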

CheckSpambots.php
CheckSpambots.php is a wrapper function and is MediaWiki-specific:
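A minimal wrapper along these lines hooks MediaWiki's edit filter and delegates to check_spammers_plain.php. The sketch below is not the original listing: the EditFilter hook signature matches older MediaWiki releases, and the check_spammers() function name is an assumption:

```php
<?php
# CheckSpambots.php - sketch of a MediaWiki wrapper, not the original listing.
# Assumes check_spammers_plain.php exposes check_spammers($name, $email, $ip).
if (!defined('MEDIAWIKI')) die('This file is a MediaWiki extension entry point.');

$wgHooks['EditFilter'][] = 'fnCheckSpambots';

function fnCheckSpambots($editor, $text, $section, &$error) {
    global $wgUser;
    // Paths must be absolute - see the bugfix notes under Spambot Search Tool.
    require_once dirname(__FILE__) . '/check_spammers_plain.php';
    $ip = wfGetIP();  // older MediaWiki helper for the client address
    if (check_spammers($wgUser->getName(), '', $ip)) {
        $error = 'Edit refused: your address is listed on a spam blacklist.';
    }
    return true;  // returning true lets MediaWiki display $error, if set
}
```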

Spambot Search Tool
The remainder of the code for this extension is available for download from the original author's site.

The files from the spambotsearchtool package which are required to deploy the script on MediaWiki are:
 * check_spammers_plain.php
 * config.php
 * en.php
 * functions.php

The following bugfixes are to be applied to check_spammers_plain.php before installation:
 * Addresses of all DNS blacklists must be specified with trailing dots (i.e. "example.org." not "example.org") as PHP may otherwise handle a "not found" (NXDOMAIN) condition by returning the server's own address - a false positive
 * Addresses of individual files (code or configuration) included from within the scripts must be changed to specify full pathnames instead of "./" (the current directory). MediaWiki invokes extensions with the current directory set not to ".../extensions" or ".../extensions/CheckSpambots" but, typically, to the directory in which LocalSettings.php or MediaWiki itself is installed.
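Both fixes amount to a few lines of PHP (the blacklist zone name below is illustrative only):

```php
<?php
// Trailing dot: query the DNSBL name as an absolute domain so the local
// resolver cannot append its own search suffix to a non-existent name.
$reversed = '4.3.2.1';  // client address 1.2.3.4 with octets reversed
$result = gethostbyname("$reversed.zen.spamhaus.org.");  // note the final dot
// gethostbyname() returns its argument unchanged when the lookup fails,
// so compare $result against the query string rather than trusting any
// returned address blindly.

// Path fix: build include paths from this file's own directory, because
// MediaWiki's working directory is where LocalSettings.php lives, not
// .../extensions/CheckSpambots/.
require_once dirname(__FILE__) . '/config.php';
```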

Most of the other files in the package are used to provide a stand-alone web-based user interface and are not necessary to deploy the Spambot Search Tool as a wiki extension.

Limitations
This extension provides no internal means of caching previous results (although, for DNS blacklist servers only, the local domain name server normally does this already). It also makes no provision for using downloadable lists of known spammers and open proxies (such as those offered by stopforumspam); the external lookups used to obtain the same information work, but may add a slight delay to the time taken to save a wiki edit.

There is currently no whitelist capability (some time could be saved by *not* checking edits by sysops or known, established users) and little or no provision for reporting new spambots which appear on-wiki back to the external blacklist maintainers.
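A whitelist along the lines suggested above could be a few lines added to the wrapper; the group names below are standard MediaWiki groups, but where to place the test depends on which hook the wrapper uses:

```php
<?php
// Possible whitelist: skip the blacklist lookups for established users.
function fnCheckSpambotsSkip($user) {
    // $user is a MediaWiki User object; sysops and bots are trusted here.
    $trusted = array_intersect(array('sysop', 'bot'), $user->getGroups());
    return count($trusted) > 0;
}
```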

This extension is also of no use against spambots which are not yet listed in the external blacklist databases, as it does not examine the content of the edit itself for suspicious text patterns, hidden text or links to known spam websites. Extensions such as Title Blacklist, SpamRegex, ConfirmEdit or ReCAPTCHA therefore remain necessary to handle problem cases such as the supposed "new" anon-IP user who wishes only to create pages packed with external links to half of *.ru (or to repeatedly break the Unicode on your existing pages, if your audience is *.ru), or to discuss «h3rb@l v1agra» ad infinitum under questionable, often-deleted page names such as ".../", ".../index.php", "Forum talk:..." and "Category talk:.../".