Extension:Check Spambots

From MediaWiki.org

MediaWiki extensions manual
CheckSpambots

Release status: beta
Implementation: Page action
Description: Checks the editor's IP address, e-mail and name against external known-spambot blacklists.
Author(s): MysteryFCM et al.
Latest version: 0.1.1 (2009-09-24)
MediaWiki: 1.16+
Database changes: No
License: No license specified
Download: See the code section
Hooks used: GetBlockedStatus, AbortNewAccount


The Check Spambots extension checks the editor's IP address, e-mail and name against external known-spambot blacklists.

Description

Spambot Search Tool is an automated script that may be configured to query the following databases:

  1. fSpamlist - fspamlist.com
  2. StopForumSpam - stopforumspam.com (duplicate of dnsbl.tornevall.org, per [1])
  3. Sorbs - sorbs.net
  4. Spamhaus - spamhaus.org
  5. SpamCop - spamcop.net
  6. ProjectHoneyPot - projecthoneypot.org
  7. Bot Scout - botscout.com
  8. DroneBL - dronebl.org
  9. AHBL - ahbl.org
  10. Undisposable - undisposable.net (server is currently down - leave this disabled)
  11. Tor Project - torproject.org

Most of these databases list known spambot or open proxy IP addresses. A few also list e-mail addresses or user names. Undisposable identifies bugmenot users and free e-mail addresses intended for one-time use; Tor lists proxies which are part of the Tor project. ProjectHoneyPot (if enabled) requires an API key from projecthoneypot.org, and Bot Scout (if enabled) is limited to a small number of enquiries per day unless a key is obtained from botscout.com.

Note: If a list is available in both web/API and DNS blacklist format, use of the DNSBL versions is preferred as most domain name servers will cache results, improving the response time of the script. This extension does not itself provide a mechanism to cache lookup results.
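To illustrate what a DNSBL lookup involves, here is a minimal sketch, not taken from Spambot Search Tool itself. A DNS blacklist is queried by reversing the octets of the IPv4 address and appending the blacklist's zone; a listed address resolves to an A record, an unlisted one does not resolve. The function names and the zone dnsbl.example.org are placeholders chosen for this example; substitute the zone published by the list you actually use.

```php
<?php
/**
 * Build the DNSBL query name for an IPv4 address: the octets are reversed
 * and the blacklist zone is appended (192.0.2.1 -> 1.2.0.192.<zone>).
 * Returns null if $ip is not a valid IPv4 address.
 * (Hypothetical helper, not part of Spambot Search Tool.)
 */
function dnsblQueryName( $ip, $zone ) {
    if ( filter_var( $ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4 ) === false ) {
        return null;
    }
    return implode( '.', array_reverse( explode( '.', $ip ) ) ) . '.' . $zone;
}

/**
 * Return true if the address has an A record in the given zone,
 * which is how a DNSBL signals that the address is listed.
 * (Hypothetical helper; performs a live DNS query when called.)
 */
function isListedInDnsbl( $ip, $zone ) {
    $query = dnsblQueryName( $ip, $zone );
    if ( $query === null ) {
        return false;
    }
    // The trailing dot makes the name fully qualified, so the resolver
    // does not append a local search domain.
    return checkdnsrr( $query . '.', 'A' );
}
```

Because the answer comes back through the ordinary resolver, a repeated query for the same address within the record's time-to-live is normally served from the name server's cache, which is the speed advantage described in the note above.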

There are two versions of the original Check Spammers script. One is a standalone program, as deployed on http://temerc.com/Check_Spammers. The second is check_spammers_plain.php, a simplified version of the script that can be used for forums, blogs, guestbooks or other web forms that allow users to comment or post. It returns true or false, depending on whether the user is listed in the databases.

The CheckSpambots.php script listed below is a wrapper that allows Spambot Search Tool to be called as a MediaWiki extension. It is not part of the original package and is currently neither distributed nor supported by the original "check spammers" author.

Installation

  • Copy the code from the Code section into a file named CheckSpambots.php and place it, together with the required Spambot Search Tool files, in a directory called Check Spambots in your extensions/ folder. If this extension is in a Git repository, developers should clone the repository instead.
  • Add the following code at the bottom of your LocalSettings.php:
require_once "$IP/extensions/Check Spambots/CheckSpambots.php";
  • Done! Navigate to "Special:Version" on your wiki to verify that the extension is successfully installed.

Code

CheckSpambots.php

CheckSpambots.php is a wrapper function and is MediaWiki-specific:

<?php
/**
 * MediaWiki extension wrapper for installation of it-mate.co.uk's spambotsearchtool
 *
 * An extension for blocking users names and IP addresses based on external 
 * blacklist servers. Requires check_spammers_plain.php and related scripts from 
 * http://support.it-mate.co.uk/?mode=Products&act=DL&p=spambotsearchtool
 *
 * @file
 * @ingroup Extensions
 */
 
if ( !defined( 'MEDIAWIKI' ) ) {
        die( 'This is an extension to MediaWiki and cannot run standalone.' );
}
 
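// Load the Spambot Search Tool from this extension's own directory,
// independent of the include_path and the current working directory.
require_once dirname( __FILE__ ) . '/check_spammers_plain.php';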
$wgHooks['GetBlockedStatus'][] = 'CheckSpambots::check_edit';
$wgHooks['AbortNewAccount'][] = 'CheckSpambots::check_newuser';
 
$wgExtensionCredits['other'][] = array(
    'name' => 'Check Spambots',
    'version' => '0.1.1',
    'author' => 'MysteryFCM et al.',
    'description' => 'Blocks known spambots based on external DNS and HTTP blacklists',
    'url' => 'https://www.mediawiki.org/wiki/Extension:Check_Spambots',
);
 
class CheckSpambots {
 
    /**
     * On edit/save: get user's IP/name/e-mail if known
     * then call checkSpambots() from SpambotSearchTool
     *
     * @param $current_user User: current user
     */
    public static function check_edit( $current_user ) 
    {
        wfProfileIn( __METHOD__ );
 
        $ip_to_check = wfGetIP();
        $mail = '';
        $name = '';
 
        if ( $current_user instanceof User ) 
        {
           $name = $current_user->getName();
           $mail = $current_user->getEmail();
        }
 
        if (checkSpambots($mail,$ip_to_check,$name))
        {
           $current_user->mBlockedby = wfMsg( 'proxyblocker' );
           $current_user->mBlockreason = wfMsg( 'proxyblockreason' );      
        }
 
        wfProfileOut( __METHOD__ );
        return true;
    }
 
   /**
    * On registration: perform the check on the new user's e-mail/name/IP
    *
    * @param $current_user User: account being created
    * @return bool false to abort account creation, true to allow it
    */
   public static function check_newuser( $current_user ) {
        wfProfileIn( __METHOD__ );
 
        $ip_to_check = wfGetIP();
        $mail = '';
        $name = '';
 
        if ( $current_user instanceof User ) 
        {
           $name = $current_user->getName();
           $mail = $current_user->getEmail();
        }
 
        if (checkSpambots($mail,$ip_to_check,$name))
        {
                global $wgOut;
                $returnTitle = Title::makeTitle( NS_SPECIAL, 'Userlogin' );
                $wgOut->errorPage( 'blacklistedusername', 'blacklistedusernametext' );
                $wgOut->returnToMain( false, $returnTitle->getPrefixedText() );
                wfProfileOut( __METHOD__ );
                return false;
        }
 
        wfProfileOut( __METHOD__ );
        return true;
    }
}

Spambot Search Tool

The remainder of the code for this extension is available from the original author's site at http://support.it-mate.co.uk/?mode=Products&act=DL&p=spambotsearchtool.

The only files from the spambotsearchtool package required to deploy the script on MediaWiki are check_spammers_plain.php and the related scripts it depends on.

Most of the other files in the package provide a stand-alone web-based user interface and are not necessary for deploying Spambot Search Tool as a wiki extension.

The check_spammers_plain.php script needs to be v0.39 (29/09/2009) or later. One minor patch was employed to disable any direct output to the screen unless a spambot is detected. The code segment to be disabled is:

 if($spambot == true){
     echo 'TRUE';
 }else{
     echo 'FALSE';
 }

These lines appear after all of the individual checks are complete, and were removed in this example because direct output to the browser will break the display formatting used by MediaWiki - a cosmetic issue.
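If you would rather not patch check_spammers_plain.php, one possible alternative (an assumption for illustration, not something the original wrapper does) is to discard the script's output with PHP's output buffering. The helper name callSilently() is hypothetical.

```php
<?php
/**
 * Call $check with the given arguments while discarding anything it echoes.
 * Returns whatever $check returns.
 * (Hypothetical helper, not part of the extension or of Spambot Search Tool.)
 */
function callSilently( $check, array $args ) {
    // Start a buffer so that echo output is captured instead of sent.
    ob_start();
    try {
        $result = call_user_func_array( $check, $args );
    } catch ( Exception $e ) {
        // Drop the buffer before passing the exception on.
        ob_end_clean();
        throw $e;
    }
    // Throw away whatever was printed and return the real result.
    ob_end_clean();
    return $result;
}
```

Inside check_edit() or check_newuser(), the call could then be written as callSilently( 'checkSpambots', array( $mail, $ip_to_check, $name ) ) in place of the direct call to checkSpambots().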

Limitations

This extension has no internal means of caching previous results (although, for DNSBL servers only, the local domain name server normally does this already). Nor can it use downloadable lists of known spammers and open proxies (such as those offered by stopforumspam); an external lookup obtains the same information but may add a slight delay to saving a wiki edit.
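As a rough sketch of what result caching could look like, the class below remembers each lookup result for a fixed number of seconds. Everything here is hypothetical: the class name, the method name and the plain array used as storage are assumptions for illustration. The array only lasts for one request, so a real deployment would need a persistent store shared between requests.

```php
<?php
/**
 * Hypothetical helper: remembers the result of a blacklist lookup for
 * $ttl seconds so that repeated checks do not query the external
 * services again. Storage is a plain array, used here as a stand-in
 * for a persistent cache.
 */
class SpambotLookupCache {
    private $lookup;
    private $ttl;
    private $entries = array();

    /**
     * @param callable $lookup function taking ( $mail, $ip, $name ) and
     *   returning true if listed, e.g. 'checkSpambots'
     * @param int $ttl lifetime of a cached result in seconds
     */
    public function __construct( $lookup, $ttl ) {
        $this->lookup = $lookup;
        $this->ttl = $ttl;
    }

    /**
     * Return the cached result if it has not expired yet;
     * otherwise perform the lookup and store its result.
     */
    public function isListed( $mail, $ip, $name, $now = null ) {
        if ( $now === null ) {
            $now = time();
        }
        $key = md5( $mail . "\n" . $ip . "\n" . $name );
        if ( isset( $this->entries[$key] ) && $this->entries[$key]['expires'] > $now ) {
            return $this->entries[$key]['listed'];
        }
        $listed = (bool)call_user_func( $this->lookup, $mail, $ip, $name );
        $this->entries[$key] = array(
            'listed' => $listed,
            'expires' => $now + $this->ttl,
        );
        return $listed;
    }
}
```

A short lifetime limits how long a stale answer can persist, which matters in both directions: a newly listed spambot is missed until the entry expires, and a delisted address stays blocked for the same period.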

There is currently no whitelist capability (some time could be saved by having the code *not* check edits by sysops or known, established users) and little or no provision for sending feedback to the external blacklist maintainers as new spambots appear on-wiki. A whitelist could be added trivially; for instance, if you already have a 'skipcaptcha' permission controlling ConfirmEdit and want to reuse it, add the following to the beginning of check_edit():

    global $wgUser;
    if ( $wgUser->isAllowed( 'skipcaptcha' ) ) {
        return true;
    }

There is no check on the body of the text being edited; this extension determines whether the posting IP is a known bot, but does not check added external links to see if they contain known spam URLs. That latter task is done using extensions like SpamBlacklist (which can be extended to block links to blacklisted domains in article text).

There is also a risk of false positives, depending on the blacklist sources chosen. Many lists are intended primarily to target other forms of net.abuse (such as spam e-mail) and users on the same local net as a PC compromised by spammers may find themselves unexpectedly blocked from editing if these lists are used as a check on spambots.

This extension is not of use for dealing with any spambots not yet on the external blacklist databases. Title Blacklist, SpamRegex, Bad Behaviour, AbuseFilter, ConfirmEdit or ReCAPTCHA therefore remain necessary as a means of handling problem cases such as the supposed "new" anon-IP user who wishes only to create pages packed with external links to half of *.ru (or repeatedly break the Unicode on your existing pages if your audience is *.ru) or discuss «h3rb@l v1agra» ad infinitum under questionable, often-deleted page names such as ".../", ".../index.php", "Forum talk:..." and "Category talk:.../".

See also