Extension:ConfirmEdit

The ConfirmEdit extension lets you use various CAPTCHA techniques to try to prevent spambots and other automated tools from editing your wiki, and to foil automated login attempts that try to guess passwords.

ConfirmEdit ships with several modules for generating CAPTCHAs.

Some of these modules require additional setup work:
 * MathCaptcha requires both the presence of TeX and, for versions of MediaWiki after 1.17, the Math extension;
 * FancyCaptcha requires running a preliminary setup script in Python;
 * and reCAPTCHA requires obtaining API keys.

Caveats: CAPTCHAs reduce accessibility and cause inconvenience to human users. In addition, they are not 100% effective against bots, and they will not protect your wiki from spammers who are willing and able to use human labor to get through the CAPTCHAs. You may wish to use ConfirmEdit in conjunction with other anti-spam features. Regardless of the solution you use, if you have a publicly-editable wiki it's important to keep monitoring the "Recent changes" page.

Installation
ConfirmEdit may not work if used with a MediaWiki version different from the one specified when downloading via the "Extension distributor".

CAPTCHA types
There are numerous different CAPTCHA types included with ConfirmEdit.

QuestyCaptcha
This module presents a question and the user supplies the answer. You provide the questions in the configuration. This module has proven to be a strong mechanism against spam bots. It should also be more accessible, as textual questions can be read by text-to-speech software, allowing visually impaired users (but not bots) to answer correctly.

Add the following to LocalSettings.php to enable this CAPTCHA, editing the Q&A:

It will randomly choose a question from those supplied. The minimum is one.
 * QuestyCaptcha is case-insensitive. If the answer is "Paris" and the user writes "paris", or if the answer is "paris" and the user writes "Paris", it will still work.
 * If the answer has a special character like "ó", you may supply one answer with "ó" and another without, just in case. For example, if the answer is "canción", you can also accept "cancion" in case the user types it without the accent.
 * The answer must be easy for a human interested in your wiki to work out, but not for an automated program. Ideally, it should not be contained in the text of the question; you can also edit the CAPTCHA help messages and provide the solution to the CAPTCHA there.
 * Change the questions when, or if, they start proving ineffective; this may never happen if your wiki is not specifically targeted.
 * Never reuse questions already used by you or others in the past: spambots are known to remember a question and its answer indefinitely once they have broken it.
 * You can get even smarter, with questions like «What is the output of "date -u +%V`uname`|sha256sum|sed 's/\W//g'"?».
 * Other dynamic QuestyCaptcha questions are possible too. Do not use an exact copy of published dynamic questions, since those have been cracked by spammers. However, other dynamic questions in the same style are highly effective.
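
The matching rules described above (random choice of question, case-insensitive comparison, several accepted spellings per question) can be sketched in Python. This is only an illustrative model of the behaviour, not the extension's PHP code; the question set, the function names, and the trimming of surrounding whitespace are assumptions made for the example.

```python
import random

# Hypothetical question set: each question maps to its accepted answers.
QUESTIONS = {
    "What is the capital of France?": ["Paris"],
    "What is the Spanish word for 'song'?": ["canción", "cancion"],
}


def pick_question(questions, rng=random):
    """Choose one of the configured questions at random."""
    return rng.choice(sorted(questions))


def is_correct(questions, question, response):
    """Compare case-insensitively against every accepted answer.

    Stripping surrounding whitespace is an assumption of this sketch.
    """
    given = response.strip().casefold()
    return any(given == answer.casefold() for answer in questions[question])


question = pick_question(QUESTIONS)
print(question in QUESTIONS)
print(is_correct(QUESTIONS, "What is the capital of France?", "paris"))
```

Listing both "canción" and "cancion" keeps the check a plain string comparison, so no accent-stripping logic is needed.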

ReCaptcha
This module uses the "reCAPTCHA" widget/service. In addition to providing a CAPTCHA, it performs a valuable service because it helps to digitize old books.

To use this module, first obtain a public and private key for your wiki from the reCAPTCHA service.

Add the following to LocalSettings.php:


 * ReCaptcha is only included in the ConfirmEdit versions bundled with MediaWiki 1.18 and above. Earlier versions do not have the reCAPTCHA PHP files.
 * Unfortunately, as of 2011, some spammers appear to have figured out a way to bypass it, either through character recognition or by using humans. For that reason, it is not necessarily recommended.
 * Part of the weakness of the ReCaptcha module is that ConfirmEdit doesn't include any penalty mechanism, so spam bots can simply keep trying to bypass the CAPTCHA until they get through. This is an issue that is strongly worth addressing in some way.
 * ReCaptcha creates a partial dependence on user-side JavaScript, as well as a requirement for sound or graphics. It also localises poorly in most less-often used languages.
 * Regardless of its strengths or weaknesses, reCAPTCHA can't be implemented on Wikimedia wikis because it produces a third-party dependency.

ReCaptcha (NoCaptcha)
The new generation of ReCaptcha, called NoCaptcha, was introduced by Google in December 2014 and reduces the need for humans to solve a CAPTCHA. Based on client-side JavaScript (which can be controlled by neither the user nor the administrator), reCAPTCHA tries to identify the site user as a human by analyzing their browsing behavior on the page. The user then has to click an "I'm not a robot" checkbox and, in the best case, doesn't have to do anything further to prove they are human. In some cases, however, the user still has to solve a CAPTCHA image.

This module implements the new ReCaptcha NoCaptcha solution in ConfirmEdit. You still need a public and a secret key (which you can retrieve from the ReCaptcha admin panel) and install the plugin with:

There is an additional configuration option for this module,  (default:  ), which, if set to true, sends the IP address of the current user to a Google server while verifying the CAPTCHA. You can improve privacy for your users by keeping this set to "false". Remember, however, that this module adds client-side JavaScript loaded directly from a Google server, which can already collect the user's IP address (combined with other data) and cannot be limited by a configuration option.

SimpleCaptcha (calculation)
This is the default CAPTCHA. This module provides a simple addition or subtraction question for the user.

Add the following lines to LocalSettings.php in the root of your MediaWiki to enable this CAPTCHA:

Note that displaying a trivial maths problem as plaintext yields a CAPTCHA that can be trivially solved by automated means; as of 2012, sites using SimpleCaptcha were receiving significant amounts of spam and many automated registrations of spurious new accounts. VisualMathCaptcha is also relatively easily defeated. Wikis currently using this as the default should therefore migrate to one of the other CAPTCHAs.
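
To illustrate why a plaintext sum offers so little protection, here is a minimal Python sketch of how little code an automated client needs to answer it. The question format ("7 + 5", "9 - 4") and the function name are assumptions for the example; the text SimpleCaptcha actually renders may differ.

```python
import re


def solve_plaintext_sum(question):
    """Parse a question such as '7 + 5' or '9 - 4' and return the result."""
    match = re.fullmatch(r"\s*(\d+)\s*([+-])\s*(\d+)\s*", question)
    if match is None:
        raise ValueError(f"unrecognised question: {question!r}")
    left = int(match.group(1))
    operator = match.group(2)
    right = int(match.group(3))
    return left + right if operator == "+" else left - right


print(solve_plaintext_sum("7 + 5"))
print(solve_plaintext_sum("9 - 4"))
```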

FancyCaptcha
This module displays a stylized image of a set of characters. The Python Imaging Library must be installed in order to create the set of images initially, but isn't needed after that (can be installed with  in most environments).


 * 1) Add the following lines to LocalSettings.php in the root of your MediaWiki installation:
   * MediaWiki version is 1.25 or newer:
   * MediaWiki version is older than 1.25:
 * 2) In LocalSettings.php, set the variable $wgCaptchaDirectory to the directory where you will store the CAPTCHA images. Below it, set $wgCaptchaSecret to your passphrase.
 * 3) Create the images by running the following, where:
   * font is a path to some font, for instance AriBlk.TTF.
   * wordlist is a path to some word list, for instance /usr/share/dict/words. (Note: on Debian/Ubuntu, the 'wbritish' and 'wamerican' packages provide such lists. On Fedora, use the 'words' package.)
   * key is the exact passphrase you set $wgCaptchaSecret to. Use quotes if necessary.
   * output is the path where the images should be stored (defined in $wgCaptchaDirectory).
   * count is how many images to generate.
   * An example, assuming you're in the extensions/ConfirmEdit directory (font location from Ubuntu 6.06, probably different on other operating systems):
   * If you are not satisfied with the words you've generated, you can simply remove the images and create a new set. Comic_Sans_MS_Bold.ttf seems to generate relatively legible words, and you could also edit the last line of captcha.py to increase the font size from the default of 40.
 * 4) Put the images you get into the captcha directory in your installation.
 * 5) Edit your wiki's LocalSettings.php: specify the full path to your captcha directory in $wgCaptchaDirectory and the secret key you used while generating the CAPTCHAs in $wgCaptchaSecret.

See also Generating CAPTCHAs for how Wikimedia Foundation does it.

How to avoid common problems running Python
 * 1) Install the most recent version of Python Imaging Library (PIL).
 * 2) Install Python into a folder with a short name, such as C:\Python\
 * 3) Create a folder such as C:\Ex and place the files CAPTCHA.py, FONT.ttf and LIST.txt in it.
 * 4) To execute easily, run the following example as a batch file:
 C:\python\python.exe C:\Ex\CAPTCHA.py --font C:\Ex\FONT.ttf --wordlist C:\Ex\LIST.txt --key=YOURPASSWORD --output C:\Ex\ --count=20

Using Pillow instead of PIL
You can use the Pillow library (can be installed with  or   in most environments) instead of the (old) Python Imaging Library (PIL) by changing the following lines in the file captcha.py (included in the extension folder of ConfirmEdit):

Change these lines in captcha.py (cf. Porting existing PIL-based code to Pillow):

To this:

MathCaptcha

 * This requires the Math extension to be installed. Until MediaWiki 1.18 this was part of MediaWiki; in later versions you need to install it manually. See also Extension:Math

This module generates an image using TeX to ask a basic math question.

Set the following to enable this CAPTCHA:

See the readme file in the math folder to install this captcha.

Outside extensions
See each extension's documentation for how to install and configure it.

VisualMathCaptcha
The extension VisualMathCaptcha (now abandoned) had been used in conjunction with ConfirmEdit, but is easily defeated and not worth the effort.

KeyCaptcha
KeyCAPTCHA can be one of the more effective alternatives to ConfirmEdit, but at a price: it requires the user's browser to support both JavaScript and one of HTML5 or Flash in order to solve a graphical jigsaw puzzle. This breaks accessibility for visually impaired users, and does not work in text-only or text-to-speech browsers.

KeyCAPTCHA is proprietary and dependent on an external server; if the company withdraws the product, new user registrations break immediately.

Configuration
ConfirmEdit introduces a 'skipcaptcha' permission type to $wgGroupPermissions. This lets you set certain groups to never see CAPTCHAs. All of the following can be added to LocalSettings.php.

Defaults from ConfirmEdit.php:

To skip captchas for users that confirmed their email, you need to set both:

There are five "triggers" on which CAPTCHAs can be displayed:
 * 'edit' - triggered on every attempted page save
 * 'create' - triggered on page creation
 * 'addurl' - triggered on a page save that would add one or more URLs to the page
 * 'createaccount' - triggered on creation of a new account
 * 'badlogin' - triggered on the next login attempt after a failed one. Requires $wgMainCacheType to be set to something other than   in your , if in doubt the following will always work. Note that   does not trigger captchas on API login, but instead blocks them outright until   expires.

The default values for these are:

The triggers,   and   can be configured per namespace using the   setting. If there is no  for the current namespace, the normal   apply. So suppose that in addition to the above  defaults we configure the following:

Then the CAPTCHA will not trigger when adding URLs to a talk page, but users will need to solve a CAPTCHA any time they try to edit a page in the project namespace, even if they aren't adding a link.
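
The fallback rule described above (a per-namespace value wins, otherwise the global trigger applies) can be modelled with a short Python sketch. The dictionaries below are illustrative stand-ins for the PHP settings; the global values and the use of plain strings for namespaces are assumptions chosen to reproduce the talk and project example, not values read from the extension.

```python
# Assumed global trigger values, for illustration only.
CAPTCHA_TRIGGERS = {
    "edit": False,
    "create": False,
    "addurl": True,
    "createaccount": True,
    "badlogin": True,
}

# Hypothetical per-namespace overrides matching the example in the text.
NAMESPACE_TRIGGERS = {
    "talk": {"addurl": False},
    "project": {"edit": True},
}


def captcha_triggers(action, namespace):
    """Return True if the action should show a CAPTCHA in this namespace."""
    overrides = NAMESPACE_TRIGGERS.get(namespace, {})
    if action in overrides:
        # A namespace-specific setting takes precedence.
        return overrides[action]
    # Otherwise fall back to the global trigger for this action.
    return CAPTCHA_TRIGGERS.get(action, False)


print(captcha_triggers("addurl", "talk"))
print(captcha_triggers("edit", "project"))
```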

URL and IP whitelists
It is possible to define a whitelist of known "good" sites for which the CAPTCHA should not kick in, when the 'addurl' action is triggered.

Sysop users can do this by editing the system message page called MediaWiki:Captcha-addurl-whitelist. The expected format is a set of regexes, one per line. Comments can be added with a # prefix. You can see an example of this usage on OpenStreetMap.

This set of whitelist regexes can also be defined using the $wgCaptchaWhitelist config variable in LocalSettings.php, to keep the value(s) a secret.
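
As a rough sketch of how such a whitelist behaves, the following Python example parses a list in the format described above (one regex per line, # comment lines) and decides whether a set of added URLs still needs a CAPTCHA. The patterns, URLs and function names are hypothetical, and Python's regex dialect differs in details from the PHP regexes the extension uses.

```python
import re

# Hypothetical whitelist text in the one-regex-per-line format.
WHITELIST_TEXT = r"""
# Sites we trust
^https?://([a-z]+\.)?wikipedia\.org/
^https?://www\.openstreetmap\.org/
"""


def parse_whitelist(text):
    """Compile one regex per line, skipping blank lines and # comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line))
    return patterns


def needs_captcha(added_urls, whitelist):
    """True if at least one added URL matches none of the whitelist regexes."""
    return any(
        not any(pattern.search(url) for pattern in whitelist)
        for url in added_urls
    )


whitelist = parse_whitelist(WHITELIST_TEXT)
print(needs_captcha(["https://en.wikipedia.org/wiki/CAPTCHA"], whitelist))
print(needs_captcha(["http://spam.example/buy"], whitelist))
```

A single non-whitelisted URL is enough to require the CAPTCHA here, so a spam link cannot ride along with trusted ones.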

Some other variables you can add to LocalSettings.php (these are described more thoroughly in the code comments):
 * $wgCaptchaWhitelistIP - List of IP ranges to allow to skip the CAPTCHA (you can also use MediaWiki:Captcha-ip-whitelist ; see below for details).
 * $ceAllowConfirmedEmail - Allow users who have confirmed their e-mail addresses to post URL links

MediaWiki:Captcha-ip-whitelist can be used to change the whitelisted IP addresses and IP ranges on-wiki. Entries should be separated by newlines. Leading and trailing whitespace is allowed, but a line containing any other character besides a valid IP address or range is ignored. For example, a line with only  is considered valid but   will be ignored.
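
The line-by-line rule above can be approximated in Python with the standard ipaddress module. This is a sketch of the described behaviour only: the sample addresses are documentation ranges chosen for the example, the function names are made up, and treating ranges as CIDR notation is an assumption.

```python
import ipaddress

# Hypothetical page content: two valid entries and two lines that are ignored.
WHITELIST_PAGE = """
   192.0.2.0/24
2001:db8::/32
192.0.2.7 # office
not-an-ip
"""


def parse_ip_whitelist(text):
    """Keep lines that are exactly one IP address or CIDR range."""
    networks = []
    for line in text.splitlines():
        candidate = line.strip()  # leading/trailing whitespace is allowed
        if not candidate:
            continue
        try:
            networks.append(ipaddress.ip_network(candidate, strict=False))
        except ValueError:
            # Any extra character makes the whole line invalid, so skip it.
            continue
    return networks


def is_whitelisted(ip, networks):
    """True if the address falls inside any whitelisted network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


networks = parse_ip_whitelist(WHITELIST_PAGE)
print(len(networks))
print(is_whitelisted("192.0.2.55", networks))
```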

Regular expressions
The global variable $wgCaptchaRegexes accepts an array of regexes to be tested against the page text; the CAPTCHA is triggered if any of them matches.
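
A minimal Python sketch of this "any regex matches" check is shown below. The patterns and the function name are hypothetical examples, and the real setting takes PHP regexes rather than Python ones.

```python
import re

# Hypothetical patterns an administrator might consider suspicious.
CAPTCHA_REGEXES = [
    re.compile(r"cheap\s+pills", re.IGNORECASE),
    re.compile(r"display\s*:\s*none", re.IGNORECASE),
]


def text_triggers_captcha(page_text, regexes=CAPTCHA_REGEXES):
    """True if any configured regex matches somewhere in the page text."""
    return any(regex.search(page_text) for regex in regexes)


print(text_triggers_captcha('<div style="display: none">hidden</div>'))
print(text_triggers_captcha("An ordinary paragraph about maps."))
```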

Wikimedia projects
For example, Wikimedia Foundation wikis use FancyCaptcha with a custom set of images and the default configuration, modified by what follows.

This means only unregistered and newly registered users have to pass the CAPTCHA.

EmergencyCaptcha mode
Additionally the shortcut named  is designed for use in a limited number of emergency situations, for instance in case of massive vandalism or spam attacks: it changes the default trigger values (see above) into the following:

As a result, all anonymous and new users also have to solve a CAPTCHA before saving an edit or creating a new page, in addition to the situations that normally trigger one.

Rate-Limiting
ConfirmEdit supports rate limiting of failed CAPTCHA attempts through $wgRateLimits. For more information about $wgRateLimits and how to set it up, read the manual; the action key is.

Test plan
See ConfirmEdit Test Plan.

Patch for even more spam protection
This is a patch to allow experienced users to add external links without solving a CAPTCHA, regardless of whether they have the skipcaptcha permission. A user is considered trusted if they have a large number of edits.

This patch also prohibits new users from adding any external links at all. This should help a lot in tackling spam, because the whole purpose of spam is to add such links (spammers call it "link building"), and spam is almost always added by newly created users.

Configuration:

Apply the patch to extensions/ after unpacking ConfirmEdit 1.2. If you want to deviate from the defaults, add this to LocalSettings.php:

The patch isn't as well honed as it could be, for example user messages aren't localized. Also, the refusal for newbies to add external links applies no matter which permissions a user has. Other than that, it appears to work just fine. For a wiki using this patch, see http://reprap.org.

Markus "Traumflug" Hitter, September 2013

Experience with this patch
After two months with this patch, we are still waiting for the first spam edit. Apart from spambots still creating accounts, misuse of our wiki has completely disappeared.

Legitimate users apparently understand the error message. There have been no complaints, but useless edits made to raise the edit count occasionally appear. Typically, these users revert their useless edits without maintainer intervention, exactly as planned.

-- Traumflug@reprap.org

Authors
The basic framework was designed largely by Brion Vibber, who also wrote the SimpleCaptcha and FancyCaptcha modules. The MathCaptcha module was written by Rob Church. The QuestyCaptcha module was written by Benjamin Lees. The reCAPTCHA module was written by Mike Crawford and Ben Maurer. Additional maintenance work was done by Yaron Koren.