Image completion captchas

 * Public URL: https://www.mediawiki.org/wiki/User:RuchirangaW/Image_completion_captchas
 * Announcement: (link to the announcement of your proposal at the wikitech-l mailing list)

Name and contact information

 * Name: Thanuditha Ruchiranga Wickramasinghe
 * Email: truchiranga@gmail.com
 * IM networks: Google Talk: truchiranga@gmail.com; Google+: truchiranga@gmail.com; Skype: Thanuditha Ruchiranga
 * LinkedIn: Thanuditha Ruchiranga
 * GitHub: https://github.com/Ruchiranga
 * Location: Galle, Sri Lanka
 * Typical working hours: Weekdays: 6 pm - 12 midnight; weekends: full day

Synopsis
The CAPTCHAs currently used in Wikimedia projects are mostly broken, which can lead to security threats and vulnerabilities. Furthermore, CAPTCHAs built from English letters can hardly be considered multilingual. Users also find it very frustrating when one or more letters in a CAPTCHA are hard to identify. My project idea focuses on designing multilingual CAPTCHAs that are very easy for a human to solve but very difficult for a bot to solve.

Proposed idea

The captcha can be made progressively more complex in several phases, as shown below. Each approach offers a different level of security, and the optimum approach is to be discussed and selected during the Community Bonding period. The optimum approach would be the one that gives a sufficient level of security for the least overhead in preparing the image captcha.

Phase 1:

The captcha to solve will be an incomplete image as shown here.

The position of the missing square in the original image can be selected at random. The removed piece can be placed under the remaining image along with some non-matching pieces (let us consider 4 non-matching pieces for the moment). The user has to choose the piece that matches the removed part of the original image.
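The Phase 1 steps can be sketched roughly as follows. This is only an illustration, not a settled design: images are assumed to be plain 2D lists of grayscale values, and the names `make_challenge`, `crop` and `HOLE` are hypothetical helpers introduced for this example.

```python
import random

HOLE = -1  # assumed marker for removed pixels (real pixels are 0..255)


def crop(image, top, left, size):
    """Return the size x size square whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def make_challenge(image, decoy_sources, size, rng=None):
    """Cut a random square out of `image` and mix it with decoy pieces.

    Returns (incomplete_image, choices, correct_index, (top, left)).
    """
    rng = rng or random.Random()
    # Pick a random position for the missing square.
    top = rng.randrange(len(image) - size + 1)
    left = rng.randrange(len(image[0]) - size + 1)
    correct_piece = crop(image, top, left, size)

    # Copy the image and blank out the removed square.
    incomplete = [row[:] for row in image]
    for y in range(top, top + size):
        for x in range(left, left + size):
            incomplete[y][x] = HOLE

    # One non-matching piece is cut from each decoy source image.
    candidates = [correct_piece]
    for source in decoy_sources:
        d_top = rng.randrange(len(source) - size + 1)
        d_left = rng.randrange(len(source[0]) - size + 1)
        candidates.append(crop(source, d_top, d_left, size))

    # Shuffle by index so the correct piece can be tracked reliably.
    order = list(range(len(candidates)))
    rng.shuffle(order)
    choices = [candidates[i] for i in order]
    correct_index = order.index(0)
    return incomplete, choices, correct_index, (top, left)
```

In such a design the server would presumably keep `correct_index` on its own side (for example in the session) and send only the incomplete image and the shuffled choices to the user.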

Phase 2:

Since image puzzle solving algorithms have been appearing lately, the above approach might be vulnerable. As a solution, the candidate pieces can be attached to the original image so that the whole captcha is displayed as one single image. This way, it would be difficult for a bot to identify the possible pieces to be placed. The pieces do not necessarily need to be of the same size, which makes identifying them much more complex for a computer program.
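A minimal sketch of the single-image idea, under the same assumption that images are 2D lists of grayscale values. The functions `paste`, `compose_single_image` and `region_at`, and the idea of keeping a server-side list of piece regions to interpret a user's click, are assumptions made for illustration only.

```python
def paste(canvas, tile, top, left):
    """Copy `tile` onto `canvas` with its top-left corner at (top, left)."""
    for dy, row in enumerate(tile):
        canvas[top + dy][left:left + len(row)] = row


def compose_single_image(incomplete, pieces, gap=2, background=255):
    """Draw the incomplete image and a row of pieces on one canvas.

    Pieces may have different sizes. Returns (canvas, regions), where each
    region is (top, left, height, width) of one piece on the canvas.
    """
    strip_width = sum(len(p[0]) for p in pieces) + gap * (len(pieces) + 1)
    width = max(len(incomplete[0]), strip_width)
    strip_height = max(len(p) for p in pieces)
    height = len(incomplete) + gap + strip_height + gap

    canvas = [[background] * width for _ in range(height)]
    paste(canvas, incomplete, 0, 0)

    regions = []
    y = len(incomplete) + gap
    x = gap
    for piece in pieces:
        paste(canvas, piece, y, x)
        regions.append((y, x, len(piece), len(piece[0])))
        x += len(piece[0]) + gap
    return canvas, regions


def region_at(regions, y, x):
    """Return the index of the piece containing pixel (y, x), or None."""
    for index, (top, left, height, width) in enumerate(regions):
        if top <= y < top + height and left <= x < left + width:
            return index
    return None
```

Because only the flattened canvas would be sent to the client, the piece boundaries stay on the server; a click position could then be checked against `regions` with `region_at`.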

Phase 3:

This idea can be made even more secure by placing the choice pieces not just below the image but at random positions on top of the original incomplete image, once again producing one final single image. This too makes it much harder for a bot to identify the choices.

Phase 4:

In choosing pictures for this, we can try to select pictures that have somewhat noticeable gradients (i.e. some noticeable edges). This is because, for a plain picture such as one showing the blue sky, it would be difficult even for a human to find the exact match for a certain position.
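One simple way to check for noticeable edges is to average the absolute differences between neighbouring pixels. The sketch below assumes grayscale images stored as 2D lists; `edge_score`, `has_enough_edges` and the threshold value are hypothetical and would need tuning on real Commons images.

```python
def edge_score(patch):
    """Mean absolute difference between horizontally and vertically
    adjacent pixels; 0.0 for a completely flat patch."""
    height, width = len(patch), len(patch[0])
    total = 0
    count = 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                total += abs(patch[y][x + 1] - patch[y][x])
                count += 1
            if y + 1 < height:
                total += abs(patch[y + 1][x] - patch[y][x])
                count += 1
    return total / count if count else 0.0


def has_enough_edges(patch, threshold=10.0):
    """Assumed rule: accept a patch only if its edge score reaches the threshold."""
    return edge_score(patch) >= threshold
```

A plain sky-like patch scores close to zero and would be rejected, while a patch with clear edges scores high and would be accepted.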

Phase 5:

The level of security can be increased further by modifying the incomplete image too. We can lay some text across the missing patch in the incomplete image so that it becomes difficult to find the gradients at the edges of the missing part. Additionally, we can make the user type the text that is laid over the remaining image. That text can also be made hard for a bot to read, just like the words in current captchas, by making it wavy and adding other effects to it.

The detail in the whole captcha image can also be reduced by laying white stripes at regular intervals over the final captcha image. This makes processing the captcha image much harder for a bot or a computer program.
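The striping step could look something like the sketch below, again assuming a grayscale image stored as a 2D list. The function name `overlay_stripes` and the `period` and `thickness` parameters are illustrative assumptions; the proposal does not fix the stripe spacing or direction.

```python
def overlay_stripes(image, period=4, thickness=1, white=255):
    """Return a copy of `image` where, in every block of `period` rows,
    the first `thickness` rows are painted white."""
    striped = [row[:] for row in image]
    for y, row in enumerate(striped):
        if y % period < thickness:
            striped[y] = [white] * len(row)
    return striped
```

Smaller `period` values or larger `thickness` values remove more detail, so they would have to be balanced against how easily a human can still recognise the picture.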


 * Possible mentors: Pau Giner, User:Emufarmers

Deliverables

 * Develop effective, far less vulnerable, highly secure and user-friendly captchas, using Wikimedia Commons as the database of original images for captcha generation.
 * Develop a more effective and secure captcha generation system than the one currently in use.
 * Develop a system for selecting images from the Wikimedia Commons image database that meet the requirements for an image to be eligible for use in a captcha. This might include certain image processing algorithms.
 * Develop a system to bring plain text to a desired level of unreadability, again using image processing techniques.
 * Develop a system that produces the final captcha image by combining all the aforementioned strategies.

Project Schedule

 * Community bonding period - Planning and deciding on the best-suited approach with the mentors as well as other interested community members.
 * 2 weeks - Getting familiar with the Wikidata API and collecting the necessary information on how to use Wikimedia Commons as a database for generating captchas.
 * 1 week - Working out how to test the work done.
 * 2 weeks - Developing a filter for selecting the most suitable images for a captcha out of all the images in Wikimedia Commons.
 * 3 - 4 weeks - Developing a system that, given a certain image, produces a captcha image with the desired level of security.
 * 2 weeks - Ensuring proper integration of the developed systems with MediaWiki.
 * 2 weeks - Testing and documentation work.

Participation
I can communicate through Skype and Gmail. I will maintain a repository on GitHub where I will regularly commit what I develop. I will also stay in regular contact with the mentors and on the Wikimedia developer mailing lists.

About you
Education completed or in progress

I am Ruchiranga Wickramasinghe, a first-year undergraduate at the University of Moratuwa, Sri Lanka, studying Computer Science and Engineering. I have mastered Java, C and C++, I have basic knowledge of HTML, and I am currently learning JavaScript, PHP and CSS. My interests are mainly in Image Processing, Algorithms and Artificial Intelligence, and I have a great passion for contributing to the open source community.

For more information, you can refer to my user page on MediaWiki: Ruchiranga Wickramasinghe

How did you hear about this program?

I first heard about this program through a GSoC meetup held at our university, the GSoC Sri Lanka Meetup, and also through the DecadeOf GSoC program held at our university (University of Moratuwa) in the presence of Mr. Chris DiBona (Director of Social Impact and Open Source, Google Inc.), Ms. Mary Radomile (Program Manager, Google Inc.), Ms. Stephanie Taylor (Program Manager, Google Inc.) and Mr. Rohan Jayaweera (Sri Lanka Country Consultant for Google Inc.).

Will you have any other time commitments, such as school work, another job, planned vacation, etc., during the duration of the program?

Since we do not have a summer or a summer vacation in Sri Lanka, my university academic work has to be carried out alongside the program. However, I am confident that I can manage both workloads and meet the requirements on time as planned.

We advise all candidates eligible to Google Summer of Code and FOSS Outreach Program for Women to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?

Past experience
I am honestly a newbie to the FOSS world. As far as I understand, this movement will make a remarkable impact on the quality of life of millions of people around the globe. So I see no reason why I should not be a part of something as big as this and serve the betterment of the future of mankind. I hope that this notion will keep me focused on contributing to FOSS projects throughout my career.
 * Please describe your experience with any other FOSS projects as a user and as a contributor
 * Please describe any relevant projects that you have worked on previously and what knowledge you gained from working on them (include links):


 * What project(s) are you interested in (these can be in the same or different organizations)?