CAPTCHA

This request for comment discusses improvements to, or replacements for, the current MediaWiki CAPTCHA.

Background
CAPTCHAs (short for "Completely Automated Public Turing test to tell Computers and Humans Apart") are used on Wikimedia wikis, via the ConfirmEdit extension, ostensibly to prevent spam and deter spammers. On most wikis, a user may encounter a CAPTCHA when creating an account, creating a new page, or adding an external link to a page.

On pt.wiki, the CAPTCHA is also "temporarily" shown on every edit by unregistered and new users, allegedly to reduce vandalism (see discussion and bug 41745).

There are a number of problems with the current CAPTCHA implementation.


 * They are only available in English (bug 5309): the words used by our CAPTCHAs, however they are created, should be in the user's language. An unknown number of new users and edits from non-English speakers are lost as a result.
 * They violate accessibility principles (bug 4845).
 * They don't effectively prevent bots from spamming.

Replacing CAPTCHA with a honeypot
One possibility for avoiding localization issues with the CAPTCHA is simply to remove it and replace it with a honeypot.
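
A honeypot is typically a decoy form field that human users never see (for example, because it is hidden with CSS) but that automated form-fillers tend to populate. The following is a minimal sketch of the server-side check, not ConfirmEdit code; the field name `homepage_url` and the function `looks_automated` are hypothetical names chosen for illustration.

```python
from typing import Mapping

# Hypothetical name for the decoy field. It is assumed to be hidden from
# human users with CSS, so any non-empty value suggests an automated submission.
HONEYPOT_FIELD = "homepage_url"


def looks_automated(form: Mapping[str, str]) -> bool:
    """Return True when the hidden decoy field was filled in."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


# A human leaves the hidden field empty; a naive bot fills in every field.
human_submission = {"username": "Alice", "homepage_url": ""}
bot_submission = {"username": "x9k2", "homepage_url": "http://spam.example"}
```

Because nothing is shown to the user, this approach needs no localization. It only deters bots that fill in forms indiscriminately; a bot written specifically for the wiki could skip the decoy field.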

A homegrown reCAPTCHA clone
Write a version of reCAPTCHA that uses document images that have been processed by MediaWiki's ProofreadPage extension for Wikisource: WikiCAPTCHA. In other words, a CAPTCHA that feeds data to ProofreadPage to augment its OCR processing. You might build on [//github.com/CristianCantoro/wikicaptcha existing code]. It is worth noting that "reCAPTCHA hold no specific patents for the technology behind their text CAPTCHA algorithms (At least none they discuss on their website or are able to be found on the US Patents & Trademark Office site)", according to one blogger.

Filed as bug 32695.

Also discussed at Wikimania 2012 in the presentation "Wikicaptcha: a ReCAPTCHA-like solution for Wikisource".
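
As a rough illustration of how a reCAPTCHA-style scheme could feed transcriptions back to OCR, the sketch below assumes the commonly described design: each challenge pairs a control word whose text is already known with a word the OCR could not read, the control word decides whether the user passes, and answers for the unknown word are accepted only once enough users agree. This is not the wikicaptcha code; the function names and the thresholds `min_votes` and `min_share` are assumptions made for this example.

```python
from collections import Counter
from typing import List, Optional


def passes_control(expected: str, typed: str) -> bool:
    """The user passes the CAPTCHA if the known control word matches."""
    return typed.strip().lower() == expected.strip().lower()


def consensus(votes: List[str], min_votes: int = 3,
              min_share: float = 0.75) -> Optional[str]:
    """Return the agreed transcription of the unknown word, or None.

    A transcription is accepted only when at least `min_votes` answers were
    collected and the most common answer holds at least `min_share` of them.
    """
    normalized = [v.strip().lower() for v in votes if v.strip()]
    if len(normalized) < min_votes:
        return None
    word, count = Counter(normalized).most_common(1)[0]
    return word if count / len(normalized) >= min_share else None
```

Only answers from users who passed the control word would be counted as votes; an accepted transcription could then be offered as a correction to the OCR text.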