
Other related discussions triggered by GSoC 2014: .

Possible solution to Multilingual, usable and effective captchas

AalekhN (talkcontribs)

While researching, I found that Wiki Commons could act as a database for a wide range of images. However, there are no tags associated with the images, and the categories an image is placed in are highly unreliable. For example, the category "cat" contains several vague images that do not show a cat at all; they are merely filed under that category. When images are retrieved, these vague images could be treated as correct answers, which creates a problem for any captcha that uses them as options. I figured out three ways to tackle the problem:

1) Use image recognition: determine whether a particular image contains the object being asked for (for example, if the captcha asks the user to select the cat, the image recognition tool determines whether an image offered as an option contains a cat). This tool can be built in Python with the opencv library, since PHP's GD library is not ideal for image recognition and is also much slower.

2) Use clip art: we can build a database of clip art of various objects, combine them, and ask questions accordingly. For example, we can take clip art of a monkey, a cat and a tiger, make combinations of them, and ask the user to select the image with the tiger and the monkey.

3) Use the first five images of a given category as options, although this would leave us with very few options.
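
To make option 2 a bit more concrete, here is a minimal sketch of how the clip-art combinations and the matching question could be generated. Everything in it is an assumption for illustration: the function name `make_clipart_question`, its parameters, and the idea that each option is simply a tuple of object names that would later be rendered as one composite image. It says nothing about how hard the result would be for a bot.

```python
import random
from itertools import combinations


def make_clipart_question(objects, n_options=3, group_size=2, rng=random):
    """Build one clip-art captcha question (hypothetical helper).

    objects    -- names of the available clip-art objects
    n_options  -- how many composite images to offer
    group_size -- how many objects are combined in each composite image
    rng        -- random source; pass random.Random(seed) for repeatable output
    """
    # Every distinct group of objects is one possible composite image.
    combos = list(combinations(sorted(objects), group_size))
    if n_options > len(combos):
        raise ValueError("not enough distinct combinations for that many options")

    options = rng.sample(combos, n_options)  # distinct composites to display
    target = rng.choice(options)             # the composite the user must pick
    question = "Select the image with " + " and ".join(target)
    return question, options, options.index(target)


question, options, answer = make_clipart_question(["monkey", "cat", "tiger"])
print(question)
print(options, "-> correct option index:", answer)
```

With only three objects there are just three possible pairs, so a real database would need many more objects before the options stop repeating.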

In my opinion, image recognition is the best solution to the problem, since it would also help us with annotations, as mentioned here. Another advantage is that it could become an important pillar for Wiki Commons, as I personally feel that its database is not well organized.

Your suggestions are welcome. Cheers ~aalekhN

Pginer-WMF (talkcontribs)

Captchas need to be hard to solve by machines. If machine recognition can be used to infer the categories when creating the captcha, it may be also used to solve it.

The captcha system does not need to pick from all possible images on Commons. It is fine to use just a subset that works, and even to make the system improve over time (discarding the images for which users asked to reload the captcha or failed to solve it).
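
A rough sketch of the "improve over time" part, assuming the captcha code can report one of three outcomes per displayed image. The class name `CaptchaImagePool`, the outcome labels and both thresholds are made up for illustration and are not part of any existing extension. Note that failures caused by bots would also count against an image here, so a real system would need to account for that.

```python
from dataclasses import dataclass


@dataclass
class ImageStats:
    shown: int = 0
    reloads: int = 0
    failures: int = 0


class CaptchaImagePool:
    """Tracks per-image outcomes and drops images that confuse users."""

    def __init__(self, image_ids, min_shown=20, max_bad_ratio=0.5):
        self.stats = {image_id: ImageStats() for image_id in image_ids}
        self.min_shown = min_shown          # do not judge an image too early
        self.max_bad_ratio = max_bad_ratio  # tolerated share of reloads + failures

    def record(self, image_id, outcome):
        """outcome is one of 'solved', 'reloaded' or 'failed' (assumed labels)."""
        stats = self.stats[image_id]
        stats.shown += 1
        if outcome == "reloaded":
            stats.reloads += 1
        elif outcome == "failed":
            stats.failures += 1

    def is_usable(self, image_id):
        stats = self.stats[image_id]
        if stats.shown < self.min_shown:
            return True  # not enough data yet, keep showing it
        bad = stats.reloads + stats.failures
        return bad / stats.shown <= self.max_bad_ratio

    def usable_images(self):
        return [image_id for image_id in self.stats if self.is_usable(image_id)]
```

The captcha generator would then draw only from `usable_images()`, so images that users keep reloading or failing gradually drop out of the subset.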

This post was posted by Pginer-WMF, but signed as Pginer.

Gmansi~mediawikiwiki (talkcontribs)


These are some approaches I can think of instead of a text-based captcha.

One is the image idea, where users are asked to spot the odd one out, as demonstrated, or to find all the similar images, as mentioned here.

Also, a picture with a part chipped out could be shown, with chipped-out pieces given as options, like finding the missing part of a jigsaw puzzle.

The image that is shown would itself be the correct option.

The other options could be rotated versions of it, which would not be so easy for a bot to match (unless it ran some digital image-processing algorithm and matched the color gradients or something like that).

This is a good option for people who do not know English or are illiterate, and who might not understand questions like "Is this a bird, a plane, Superman?" after being shown a picture.

Tell me what you think.

(Sorry for uploading those images to imgur; I don't know how to put them on the wiki. Hope that is OK.)

This post was posted by Gmansi~mediawikiwiki, but signed as Gmansi.

Nemo bis (talkcontribs)
Reply to "Captcha: a newer idea"
Nemo bis (talkcontribs)
Reply to "change captcha questions"

Bauernegro - What system are we using!?

CFCF (talkcontribs)

I work quite extensively on several different Wikipedias, and CAPTCHAs are normally only a minor nuisance for me. I was importing templates to the Zulu Wikipedia when I was met by the CAPTCHA code BAUERNEGRO. What is this? Are we not applying any filtering at all on the words we use? This was after already being prompted to write gipsydick. This is extremely offensive, and not acceptable from a major website like Wikipedia or from MediaWiki.

Screencapture of the incident.
Nemo bis (talkcontribs)

There is a blacklist, but it is very rough.

CFCF (talkcontribs)

Any way to expand it to include words such as negro, gipsy and dick?

Nemo bis (talkcontribs)

Sure, you can send patches: for instance I have one which does that among other things, gerrit:121255 (based on Wiktionary).

However, it may take months or years for the new blacklist to take effect, as it's only used when the captcha images are regenerated. We also don't have control over what sort of dictionary WMF uses.
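
For illustration only, this is roughly what a substring check at generation time could look like. It is not the code in the patch mentioned above or in the captcha generator; the function names and the three-word list (taken from the words named in this thread) are assumptions, and a real list would be much longer.

```python
# Illustrative blacklist, using only the words named in this thread.
BLACKLIST = {"negro", "gipsy", "dick"}


def contains_blacklisted(candidate, blacklist=BLACKLIST):
    """Return True if any blacklisted word occurs anywhere inside the candidate."""
    lowered = candidate.lower()
    return any(bad in lowered for bad in blacklist)


def filter_candidates(candidates, blacklist=BLACKLIST):
    """Keep only the captcha strings that contain no blacklisted substring."""
    return [c for c in candidates if not contains_blacklisted(c, blacklist)]


print(filter_candidates(["BAUERNEGRO", "gipsydick", "housetable"]))
```

Checking the whole generated string rather than each dictionary word matters, because the offensive part can sit in the middle of a longer captcha. The downside of substring matching is that it also rejects harmless words that happen to contain a listed string.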

Reply to "Hand gesture CAPTCHA"

Concept: Digitizing for Wikisource

Amgine (talkcontribs)

One thing we might consider is a variation on reCaptcha's early goal: let's use WMF captchas to help digitize scanned texts for Wikisource.


  • Scanned texts (DjVu?) are processed through OCR.
  • OCR issues are identified (e.g. a scanned 'word' is caught by the spell check as a misspelling, and its image region is clipped for use in a captcha).
  • One of the two images presented in the captcha is drawn from the pool of OCR issues; the 'solution' for this image should match a spelling dictionary or fuzzy-match the OCR text. Solutions are stored until a statistically significant percentage of results are exactly the same.
  • The other of the two images presented in the captcha must match its known solution exactly.
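
The flow in the list above could be sketched like this. All names (`OcrIssue`, `check_captcha`) and thresholds are hypothetical; the fuzzy match uses Python's standard `difflib`, and the fixed vote count plus agreement ratio is only a simple stand-in for a real "statistically significant" test. The spelling-dictionary check is left out.

```python
from collections import Counter
from difflib import SequenceMatcher


class OcrIssue:
    """One clipped image region whose OCR text looked wrong."""

    def __init__(self, ocr_text, min_votes=5, min_agreement=0.8, min_similarity=0.5):
        self.ocr_text = ocr_text
        self.min_votes = min_votes            # answers needed before deciding
        self.min_agreement = min_agreement    # share that must be identical
        self.min_similarity = min_similarity  # fuzzy-match floor against the OCR text
        self.answers = Counter()

    def add_answer(self, answer):
        """Store an answer only if it roughly resembles the OCR text."""
        answer = answer.strip()
        similarity = SequenceMatcher(None, answer.lower(), self.ocr_text.lower()).ratio()
        if similarity >= self.min_similarity:
            self.answers[answer] += 1

    def consensus(self):
        """Return the agreed transcription, or None if there is no agreement yet."""
        total = sum(self.answers.values())
        if total < self.min_votes:
            return None
        text, count = self.answers.most_common(1)[0]
        return text if count / total >= self.min_agreement else None


def check_captcha(known_solution, known_answer, issue, issue_answer):
    """Pass the captcha only on the known image; collect the other answer as a vote."""
    if known_answer.strip() != known_solution:
        return False  # failed the control image, so ignore the unknown one too
    issue.add_answer(issue_answer)
    return True
```

Only answers from users who got the control image right are counted, which is what keeps random or bot input out of the transcription pool.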
Nemo bis (talkcontribs)
Reply to "Asirra broken?"
Nemo bis (talkcontribs) : anything of use? It seems to require Flash; it shows 5 icons (like clock, woman, camera, sunglasses, key) and names one of them, asking for it to be clicked. The name could be translated, but I have no idea how such a thing can work; probably most of the protection comes from the inability of spambots to do anything in Flash?

AalekhN (talkcontribs)

Yes, this could be a good alternative. I just experimented with it on my local computer and am thinking of including it in my proposal. :)

Reply to "visualCaptcha"
Nemo bis (talkcontribs)
AalekhN (talkcontribs)

This captcha system is roughly equivalent to the type of captcha currently in use, so it won't provide any additional protection from bots. Also, the audio captcha in use can be made more convenient by improving the screen reader, as mentioned here:

Reply to "PHP CAPTCHA"
Pastakhov (talkcontribs)

Take a look at an interesting idea: MotionCAPTCHA.

I'm not sure that it's ready to use (at least it should work on the server side), but I think it is possible to use this technology to combat bots.

I think a successful implementation would need:

  • A picture generator (in order to prevent reuse of a response)
  • An entropy analyzer (it may be possible to find a filter formula that separates human entropy from a computer's)
  • Maybe something else; expert opinion and brainstorming are needed
Nemo bis (talkcontribs)

Is there any research backing the concept?

Pastakhov (talkcontribs)

I have no skills in this area. It's just an idea to brainstorm. Maybe it can inspire someone.

Nemo bis (talkcontribs)

I didn't ask you to write a paper :) If you can do a Google Scholar search, it would be helpful.

Reply to "MotionCAPTCHA"