CAPTCHA/zh

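CAPTCHAs (an abbreviation of "Completely Automated Public Turing test to tell Computers and Humans Apart") are used on Wikimedia wikis through an extension, ostensibly as a way to stop spam and deter spammers. On most wikis, users encounter a CAPTCHA when creating a new account, creating a new page, or adding external links to a page.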

On pt.wiki, in 2008–2013 the CAPTCHA was also "temporarily" shown on every edit of unregistered and new users, allegedly to reduce vandalism (see discussion and 41745).

The current CAPTCHA implementation has several problems.


 * They are only available in English (5309): the words used by our CAPTCHAs, however they are created, should be in the user's language. An unknown number of new users and edits from non-English speakers are lost as a result.
 * They violate accessibility principles (4845).
 * They don't effectively prevent bots from spamming.



Possible future alternatives


Image CAPTCHAs
Image CAPTCHAs require no text input, which helps with mobile and internationalisation issues. Some ideas based on image CAPTCHAs:


 * Find the different one (view prototype). Several images from the same category (e.g., people) are shown mixed with one image from a different category (e.g., cat). Humans should be able to recognise which is the different one. Note that in this case, the question is always the same (find the different one) and the categories used are not exposed to the user.
 * Find all images of a kind (view prototype). Images from two or more categories are presented together. The user is explicitly asked to find all the images of a given type (e.g., all images of people wearing glasses).
 * Tag images (view prototype). The user is presented with images that contain some tagged elements and options to pick the correct tag (e.g., is it a bird? is it a plane?).
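The first idea above ("find the different one") can be sketched roughly as follows. This is only an illustration under stated assumptions: the `images_by_category` mapping, the function names, and the token scheme are hypothetical and are not taken from any of the prototypes. The point it shows is that the categories and the answer stay on the server, and the client only receives the shuffled images.

```python
import random
import secrets


def make_odd_one_out(images_by_category, n_same=5, rng=None):
    """Build a 'find the different one' challenge.

    images_by_category is a hypothetical dict mapping a category name to a
    list of image identifiers. Returns (token, shown_images, answer_index).
    Only shown_images would be sent to the client; answer_index is kept
    server-side, keyed by the token.
    """
    rng = rng or random.SystemRandom()  # OS-backed randomness, not seedable

    # The majority category must have enough images to fill the grid.
    majority_candidates = [
        name for name, images in images_by_category.items()
        if len(images) >= n_same
    ]
    if not majority_candidates:
        raise ValueError("no category has enough images")
    majority = rng.choice(majority_candidates)

    # The odd image comes from any other non-empty category.
    odd_candidates = [
        name for name, images in images_by_category.items()
        if name != majority and images
    ]
    if not odd_candidates:
        raise ValueError("need at least two non-empty categories")
    odd = rng.choice(odd_candidates)

    shown = rng.sample(images_by_category[majority], n_same)
    answer_index = rng.randrange(n_same + 1)
    shown.insert(answer_index, rng.choice(images_by_category[odd]))

    token = secrets.token_urlsafe(16)
    return token, shown, answer_index


def check_answer(expected_index, submitted_index):
    """The user passes only by picking the position of the odd image."""
    return submitted_index == expected_index
```

In a real deployment the image identifiers sent to the client would also need to be opaque, so that a file name or URL does not reveal the category.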

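The difficulty lies in producing such images and the verification data without giving spambots a way to exploit them. You need a very large set of CAPTCHAs (ideally hundreds of thousands), otherwise an attacker can easily enumerate your CAPTCHA database. If you use a public image host (such as Commons) or a public data source (such as Commons categories), an attacker may be able to map each CAPTCHA back to its image source and find a way to break the CAPTCHA.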



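Using a honeypot instead of a CAPTCHA
One way to avoid the CAPTCHA localisation problem is to stop using CAPTCHAs altogether and replace them with a honeypot.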
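As a minimal sketch of one common form of honeypot: the form includes an extra field that is hidden from human users (for example with CSS), so a human leaves it empty while a naive bot that fills in every field gives itself away. The field name `website_url` and the function name below are hypothetical, chosen only for illustration.

```python
# Hypothetical name of the hidden form field; humans never see it.
HONEYPOT_FIELD = "website_url"


def looks_like_bot(form_data):
    """Return True when the hidden honeypot field was filled in.

    form_data is assumed to be a dict of submitted field names to string
    values. A missing or whitespace-only value counts as empty.
    """
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())
```

Because the user is never asked to read or type anything, nothing needs to be localised. A bot written specifically for the site can learn to skip the field, so this mainly filters generic spam bots.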



A home-grown reCAPTCHA clone
Write a version of reCAPTCHA that uses document images that have been processed by MediaWiki's ProofreadPage extension for Wikisource: WikiCAPTCHA. In other words, a CAPTCHA that feeds data to ProofreadPage to augment its OCR processing. You might build on [//github.com/CristianCantoro/wikicaptcha existing code]. It is worth noting that "reCAPTCHA hold no specific patents for the technology behind their text CAPTCHA algorithms (At least none they discuss on their website or are able to be found on the US Patents & Trademark Office site)", according to one blogger.

This was also discussed at Wikimania 2012 in the presentation "Wikicaptcha: a ReCAPTCHA-like solution for Wikisource".

The advantage of this approach is that we can turn the effort currently wasted on solving CAPTCHAs into a benefit for a Wikimedia project (Wikisource), and that we can start with a limited data set. Following the reCAPTCHA approach, we could create a bootstrap data set, then show people a mix of CAPTCHAs with known and unknown solutions, using the known ones for verification and the unknown ones to generate more data. This is not easy, however, and it should get significant focus in the project if the resulting CAPTCHA system is to be of any practical use.
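The bootstrap idea can be sketched as follows. This is an illustration under assumptions, not the WikiCAPTCHA design: the class name, the `votes_needed` threshold, and the case-insensitive comparison are all hypothetical simplifications. Each challenge pairs one image with a known transcription and one without; the known one decides pass or fail, and answers for the unknown one are collected until enough users agree.

```python
from collections import Counter


class WordPairCaptcha:
    """Pair a known image with an unknown one, reCAPTCHA-style."""

    def __init__(self, known, votes_needed=3):
        self.known = dict(known)   # image id -> verified transcription
        self.votes = {}            # image id -> Counter of proposed answers
        self.votes_needed = votes_needed  # hypothetical agreement threshold

    @staticmethod
    def _norm(text):
        # Simplification: compare case-insensitively, ignoring outer spaces.
        return text.strip().lower()

    def submit(self, known_id, known_answer, unknown_id, unknown_answer):
        """Return True if the user passes; record a vote for the unknown image."""
        if self._norm(known_answer) != self._norm(self.known[known_id]):
            # Failed the control image: discard the unknown answer too.
            return False

        if unknown_id not in self.known:
            counter = self.votes.setdefault(unknown_id, Counter())
            counter[self._norm(unknown_answer)] += 1
            text, count = counter.most_common(1)[0]
            if count >= self.votes_needed:
                # Enough agreement: promote to the known set.
                self.known[unknown_id] = text
                del self.votes[unknown_id]
        return True
```

Only answers from users who pass the control image are counted, which is what lets the known set grow from a small seed. A real system would also need to handle images that never reach agreement and to preserve case and punctuation for the OCR data.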

Accessibility
The accessibility of our current CAPTCHA is extremely poor. If the user has impaired eyesight or uses a screen reader, the text-based CAPTCHA is almost entirely inaccessible to them. A handful of our larger wikis address this via a volunteer-run account request system. Alternatives like image CAPTCHAs still violate accessibility principles (4845), so an option such as an audio CAPTCHA should be considered.