User:Drakefjustin/Fill-in-the-blanks

Comments and feedback are welcome.

Identity
 * Name: Justin Drake
 * Email: drakefjustin (gmail.com account)
 * Project title: Fill-in-the-blanks

Contact/working info
 * Timezone: UTC +1 (Should be in France this summer.)
 * Typical working hours: 10:00-13:00; 14:30-19:00; 21:00-22:30
 * Skype: Randomblue12

Abstract
This project aims to make learning content from Wikipedia articles (and potentially Wiktionary) more interactive, in a very simple way. I propose to develop a fill-in-the-blanks extension for MediaWiki that automatically generates cloze tests. Within a selected portion of an article, the tool would (semi-randomly) remove at most one link per sentence and replace it with a blank input box. The key observation is that an article's wikilinks hold pertinent and localized information, ready to be exploited for active learning and quizzing. Automatic correction and scoring will be available, with the option for users to override decisions.

Implementation details
What? MediaWiki extension

Languages PHP and JavaScript

Links

 * Retrieving links: Relevant classes are ApiQueryLinks, SpecialWhatLinksHere and Linker.
 * Selecting links: Links will be selected at random. Not only is this easy to implement, but it also allows quizzes on the same article to vary. If time permits, I could find heuristics to gauge the "technicality" of a link, for example by counting the links to the target article: the higher the count, the more likely it is a common concept. The difficulty of the quiz can be adjusted using such metrics. The number of links removed also matters. For now, I expect that at most one link per sentence is a good balance. This could be made customizable.
 * Search instances of links: It is Wikipedia convention not to link every instance of a relevant concept within an article. I will therefore have to find the non-linked instances of linked concepts, maybe using ApiQuerySearch.
 * Checking links: Efficient correction can make the difference between a useful tool and a broken tool. As a first easy solution, I intend to allow the user, for each individual answer, to override the automatic correction if they feel their answer is correct. A next step is to improve the automatic correction. For example, for each link, redirects will be counted as valid answers since synonyms, variant spellings, common misspellings, etc. are often redirected to the corresponding article. The API can retrieve the list of redirects to an article easily.
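
As a rough sketch of the selection and checking steps above, the following JavaScript picks at most one link per sentence at random and accepts an answer if it matches the link target or one of its redirects. The function names (selectLinks, normalize, isAccepted) and the data shape (each sentence given as an array of link titles) are assumptions made for illustration, not part of any existing MediaWiki code. The list of redirects is assumed to have been fetched separately through the API.

```javascript
// Hypothetical data shape: each sentence is an array of the link titles it contains.

// Pick at most one link per sentence. `rng` must return a number in [0, 1);
// it is a parameter so that the choice can be made deterministic in tests.
function selectLinks(sentences, rng = Math.random) {
  return sentences.map((links) => {
    if (links.length === 0) {
      return null; // nothing to blank out in this sentence
    }
    const index = Math.floor(rng() * links.length);
    return links[index];
  });
}

// Normalize a title or an answer for comparison: underscores become spaces,
// runs of whitespace are collapsed, and case is ignored.
function normalize(text) {
  return text.replace(/_/g, ' ').replace(/\s+/g, ' ').trim().toLowerCase();
}

// An answer is accepted if it matches the link target or any redirect to it.
function isAccepted(answer, target, redirects = []) {
  const accepted = new Set([target, ...redirects].map(normalize));
  return accepted.has(normalize(answer));
}
```

For example, with the target "United Kingdom" and the redirect list ["UK"], both "united kingdom" and "uk" would be accepted. Anything else would be left to the user's manual override.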

UI

 * Highlighting text: I want users to be able to select the text they want to be quizzed on right from the Wikipedia article. Highlighting text is an easy and natural solution, so ArticleHighlight could be useful.
 * Identifying sentences: I need to identify the sentence structure within an article for two reasons. The first is that at most one link per sentence should be removed. The second is to prevent the user from selecting fragments of sentences: the basic "quiz unit" is a sentence. Some work has been done on sentence-level editing.
 * Input boxes: The blank input boxes should all have the same width. Code from InputBox and RemoveRedlinks could be useful.
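
To make the sentence and blank handling above concrete, here is a minimal sketch in JavaScript that works on raw wikitext. It assumes a deliberately naive sentence splitter (abbreviations such as "e.g." would be split wrongly) and a regular expression that only handles simple [[Target]] and [[Target|label]] links. The names splitSentences and blankSentence are hypothetical, and the fixed-length blank string stands in for a fixed-width input box.

```javascript
// Matches [[Target]] and [[Target|label]] wikilinks (simple cases only).
const WIKILINK = /\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/g;

// Naive splitter: a sentence ends at '.', '!' or '?' followed by whitespace.
function splitSentences(text) {
  return text.split(/(?<=[.!?])\s+/).filter((s) => s.length > 0);
}

// Replace the link at position `linkIndex` within the sentence by a blank,
// and render every other link as plain text (its label, or its target).
// Returns the quiz text together with the expected answer (the link target),
// or a null answer if the sentence has no link at that position.
function blankSentence(sentence, linkIndex, blank = '_____') {
  let seen = -1;
  let answer = null;
  const text = sentence.replace(WIKILINK, (match, target, label) => {
    seen += 1;
    if (seen === linkIndex) {
      answer = target.trim();
      return blank;
    }
    return label || target;
  });
  return { text, answer };
}
```

Using the link target rather than the displayed label as the expected answer is a design choice in this sketch. It keeps the answer aligned with the article title, which is also what redirects point to.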

Scoring

 * Instant feedback: Have small logos such as Green tick.svg and Red x.png in the top right corner using ArticleEmblems.
 * Logging of results: This doesn't have to be secure. That is, users can cheat themselves if they wish. Logging of results can be done locally on each user page, e.g. at User:example/QuizResults/.
 * Show score: Potentially use space in sidebar with SkinBuildSidebar. Pop-out might also be an option.
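
The scoring ideas above could be prototyped along these lines. This is only a sketch: the result record ({ correct, overridden }), the function names summarize and toLogLine, and the log line format are all assumptions, and actually saving the line to a user subpage is left out.

```javascript
// Each result is a hypothetical record: { correct: boolean, overridden: boolean }.
// `correct` is the automatic verdict; `overridden` means the user marked the
// answer as right by hand.
function summarize(results) {
  const total = results.length;
  const right = results.filter((r) => r.correct || r.overridden).length;
  // Count only overrides that changed a negative automatic verdict.
  const overrides = results.filter((r) => !r.correct && r.overridden).length;
  const percent = total === 0 ? 0 : Math.round((100 * right) / total);
  return { total, right, overrides, percent };
}

// Render a summary as one wikitext list item, suitable for appending to a
// results subpage. The date is formatted as YYYY-MM-DD in UTC.
function toLogLine(articleTitle, summary, date = new Date()) {
  const day = date.toISOString().slice(0, 10);
  return `* ${day}: [[${articleTitle}]] ${summary.right}/${summary.total} (${summary.percent}%)`;
}
```

Keeping the override count separate from the score makes it possible to see later how often the automatic correction was wrong.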

About me
I'm a fourth-year student in mathematics at the University of Cambridge, UK. I completed my undergraduate degree there, and I am currently enrolled in a Master's program called "Part III". This is probably my last year as a student. In particular, I should be completely free this summer and able to commit fully to this project.

I don't have any experience programming for MediaWiki, but I use MediaWiki through Wikipedia a lot. I am relatively tech-savvy and I know PHP and JavaScript, although most of my programming experience is in C for my coursework. This will certainly be a great learning experience, and I look forward to it.

I am especially enthusiastic as this relates to a team project which grew out of the start-up weekend in Cambridge. Two others and I plan to build a much fuller (and smarter!) automatically generated interactive learning framework for Wikipedia, using other pieces of structured data, such as infobox templates. MediaWiki mentoring through the GSoC would provide a significant boost for our project!

Deliverables
This project (as described in the abstract) is definitely doable in the given time frame. Having said that, it can be made "arbitrarily difficult" by adding fancy features, such as:
 * 1) Improved automatic correction with synonyms and variant names, variant spellings and misspellings, stemming, etc.
 * 2) Automatic recording of quiz results for signed-in users. (E.g. store results in user/Example/Fill-in-the-blanks/Results.)
 * 3) Fine content selection, such as per-sentence selection in one click.
 * 4) Customization of density of links to remove, level of randomness, instant feedback, skinning, etc.
 * 5) Advanced scoring and statistics.
 * 6) Identify problems with an article or its linkage by finding statistical anomalies.

On top of the coding-related difficulties, this project has to produce something pedagogically useful, which I'm not taking for granted. I intend to test the extension periodically and fine-tune it according to the feedback I receive.

Project schedule
First half of summer


 * Focus on writing code for the Links and UI sections discussed above.
 * Reach a working product, with basic functionality.
 * Start asking for feedback, and rethink the best way forward for the second half of the summer.

Second half of summer


 * Reach a useful product, which is easy to use.
 * Write code for the Scoring section and some more fancy features (from the above list).
 * Tweak and optimize, write some documentation, and rethink the best way forward for the future.

Participation
While preparing the proposal during the past week, I found that reaching the MediaWiki community for help is relatively easy. I've used IRC and the mailing lists, and I found this list of developers useful. Users Jeblad, Apexsharma, JanPaul123, MaxSem and others have kindly offered their help. I think communicating progress through this wikipage (or similar) is my favorite option. I'd post links to my code regularly, and encourage the community to provide feedback. To complement the code, screenshots and other documentation can help keep people interested. I will also get in touch with educators to test the tool. In particular, I know how to reach the Wikipedia and Wikiversity communities if necessary (Help desk, Village Pump, etc.). Finally, the list of extensions and the class reference have been useful.

Mock-ups