GSOC 2013 Improvement of glossary tools

=Project=
 * Name: Improvement of glossary tools
 * Public URL: http://www.mediawiki.org/wiki/User:Zhenya
 * Announcement: http://lists.wikimedia.org/pipermail/wikitech-l/2013-April/068991.html
 * Bugzilla report: https://bugzilla.wikimedia.org/show_bug.cgi?id=47981

= Identity =
 * Name: Yevheniy Vlasenko (CV)
 * Position: Student
 * Location: Ukraine, Chernihiv
 * Typical working hours: 9:00 - 18:00 (EEST) (06:00 - 15:00 UTC, 23:00 - 08:00 PDT)

University: Chernihiv State Technological University (Ukraine)

Contact information:
 * IRC: Zheee
 * GMail: eu.vlasenko[at]gmail.com
 * Skype: Zhenya_987

=Synopsis=

Presently there are two extensions for handling glossaries in MediaWiki: Lingo and Semantic Glossary. Lingo searches each viewed wiki page for occurrences of the terms defined in the glossary and, when a term is hovered over with the mouse, shows a tooltip with its definition. It uses a dedicated wiki page to store the terms and definitions of the glossary. Semantic Glossary is an extension to Lingo that uses semantic data stored in a Semantic MediaWiki store instead of Lingo's dedicated wiki page. This project aims to fix bugs in Lingo and Semantic Glossary and to improve their functionality.

=Deliverables=

This section will be developed to document considerations and decisions on design and implementation during the course of the project.

Support for multiple definitions per term
Current situation: The user can specify one definition for each term. To specify a second definition, a completely new glossary entry is necessary.

Goal: Users shall be able to enter more than one definition per term. When the tooltip appears, it shall display all definitions.
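
The goal above amounts to storing a list of definitions per term instead of a single string. The following is a minimal Python sketch of that idea, not Lingo's actual code (Lingo is written in PHP); the function names `build_glossary` and `tooltip_text` and the numbered tooltip layout are assumptions made for illustration.

```python
from collections import defaultdict


def build_glossary(entries):
    """Group (term, definition) pairs so each term maps to all its definitions."""
    glossary = defaultdict(list)
    for term, definition in entries:
        # Skip exact duplicates so the tooltip does not repeat itself.
        if definition not in glossary[term]:
            glossary[term].append(definition)
    return dict(glossary)


def tooltip_text(glossary, term):
    """Return all definitions of a term; number them when there are several."""
    definitions = glossary.get(term, [])
    if len(definitions) == 1:
        return definitions[0]
    return "\n".join(f"{i}. {d}" for i, d in enumerate(definitions, start=1))


# Sample data invented for this sketch.
entries = [
    ("API", "Application programming interface"),
    ("API", "Active pharmaceutical ingredient"),
    ("SMW", "Semantic MediaWiki"),
]
glossary = build_glossary(entries)
```

With this structure, `tooltip_text(glossary, "API")` returns both definitions as a numbered list, while a term with one definition is shown unchanged.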

Support for inflection
Current situation: Inflected forms of terms have to be set manually:
 * in Lingo with the help of special markup
 * in Semantic Glossary with the help of semantic properties

Goal: Inflected forms of terms shall be recognized automatically, e.g. by using stemmers.
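
One way to picture stemmer-based matching is to reduce both the glossary terms and the words of the page to a stem and compare the stems. The sketch below uses a deliberately naive, English-only suffix stripper as a stand-in for a real stemmer; the suffix list and the names `naive_stem` and `find_terms` are assumptions for illustration only.

```python
import re

# Hypothetical, English-only suffix list; a real stemmer has many more rules.
SUFFIXES = ("ies", "ing", "ed", "es", "y", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def find_terms(text, glossary_terms):
    """Return (word as written, glossary term) pairs for inflected matches."""
    stems = {naive_stem(term): term for term in glossary_terms}
    found = []
    for match in re.finditer(r"\w+", text):
        term = stems.get(naive_stem(match.group()))
        if term is not None:
            found.append((match.group(), term))
    return found


matches = find_terms(
    "Glossaries show tooltips for linked terms.",
    ["glossary", "tooltip", "link"],
)
```

Here "Glossaries", "tooltips" and "linked" are matched to "glossary", "tooltip" and "link" without any manually entered inflected forms. Such a crude rule set will also produce false matches, which is why a language-specific stemmer would be needed in practice.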

Customizing the tooltip by using a template
Current situation: The tooltip displays the definition of the term exactly as specified in the glossary, without any formatting.

Goal: It shall be possible to specify a template to be used to format definitions. The mechanism for specifying the template needs to be investigated. Possibilities could be specification with the glossary entry, specification with the viewed page or wiki-wide specification in a wiki setting variable.
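
The three candidate places for specifying a template could be combined by falling back from the most specific to the most general. The Python sketch below illustrates this with the standard `string.Template` class in place of a MediaWiki template; the precedence order (entry, then page, then wiki-wide setting), the placeholder names `$term` and `$definition`, and all function names are assumptions, since the proposal leaves the mechanism open.

```python
from string import Template

# Hypothetical wiki-wide default, standing in for a wiki setting variable.
WIKI_DEFAULT_TEMPLATE = "$term: $definition"


def pick_template(entry_template=None, page_template=None,
                  wiki_template=WIKI_DEFAULT_TEMPLATE):
    """Return the most specific template available (assumed precedence)."""
    for candidate in (entry_template, page_template, wiki_template):
        if candidate:
            return Template(candidate)
    # No template configured anywhere: show the bare definition.
    return Template("$definition")


def format_definition(term, definition, **templates):
    """Format one definition with whichever template applies."""
    template = pick_template(**templates)
    return template.substitute(term=term, definition=definition)
```

For example, `format_definition("SMW", "Semantic MediaWiki", page_template="[$term] $definition")` yields `[SMW] Semantic MediaWiki`, while the same call without a page template falls back to the wiki-wide default.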

Ability to turn off the recognition of glossary terms in certain places
Current situation: The marking up of terms can be turned off for whole pages by using a magic word (__NOGLOSSARY__) or for portions of a page by wrapping them in an HTML element (e.g. a span) and setting its class to "noglossary".

Goal: On the viewed wiki page there shall be a button "This is not a term". When the button is pressed, the word shall not be highlighted as a term for this page in this place. The wiki shall remember that the word should not be highlighted as a term even if the page has been modified. An intermediate step would be to first provide a wiki tag to turn off the marking up for portions of a page.
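
The intermediate step, a tag that switches off marking for a portion of the page, can be sketched as follows. The tag name `<noglossary>` is hypothetical (the proposal only says "a wiki tag"), and wrapping a term in `[[...]]` merely stands in for the real tooltip markup.

```python
import re

# Hypothetical tag name; the proposal does not fix one.
NOGLOSSARY = re.compile(r"<noglossary>(.*?)</noglossary>", re.DOTALL)


def mark_terms(text, terms):
    """Mark glossary terms everywhere except inside <noglossary> blocks."""
    if not terms:
        # Nothing to mark: only unwrap the protected blocks.
        return NOGLOSSARY.sub(lambda m: m.group(1), text)

    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
    )

    def mark(segment):
        # "[[term]]" is a placeholder for the real tooltip markup.
        return pattern.sub(lambda m: "[[" + m.group() + "]]", segment)

    parts, pos = [], 0
    for block in NOGLOSSARY.finditer(text):
        parts.append(mark(text[pos:block.start()]))  # text before the block
        parts.append(block.group(1))                 # protected text, unmarked
        pos = block.end()
    parts.append(mark(text[pos:]))                   # text after the last block
    return "".join(parts)
```

Given the input `API here, <noglossary>API there</noglossary>, API again` and the term `API`, only the first and last occurrences are marked. Remembering a suppressed occurrence across later edits of the page, as the goal requires, is a harder problem that this sketch does not address.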

Context support (for Semantic Glossary)
Current situation: All terms and definitions in the glossary are used when marking up a viewed page. If a term has more than one definition, all definitions are shown.

Goal: It shall be possible to specify which definition (if any) is applicable for each occurrence of a term. A mechanism for specifying the applicable definition needs to be investigated. Possible solutions could be to specify the applicable definition per occurrence of the term or to specify in the glossary with each definition the context where it is applicable, e.g. category or namespace of the viewed page.
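
The second option, attaching a context to each definition in the glossary, could look like the following Python sketch. The `Definition` class, its field names, and the rule that an empty set means "applies everywhere" are assumptions for illustration, not part of Semantic Glossary.

```python
from dataclasses import dataclass


@dataclass
class Definition:
    text: str
    namespaces: frozenset = frozenset()  # empty: valid in any namespace
    categories: frozenset = frozenset()  # empty: valid in any category


def applicable(definitions, namespace, page_categories):
    """Return the texts of the definitions that fit the viewed page."""
    result = []
    for d in definitions:
        ns_ok = not d.namespaces or namespace in d.namespaces
        cat_ok = not d.categories or bool(d.categories & set(page_categories))
        if ns_ok and cat_ok:
            result.append(d.text)
    return result


# Sample data invented for this sketch.
api_definitions = [
    Definition("Application programming interface",
               categories=frozenset({"Software"})),
    Definition("Active pharmaceutical ingredient",
               categories=frozenset({"Pharmacy"})),
    Definition("Help-only meaning", namespaces=frozenset({"Help"})),
]
```

On a main-namespace page in the category "Software", only the first definition would be shown; on a page that matches no context, the tooltip would show none.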

Integration with FlaggedRevs/ApprovedRevs
Current situation: The most recent version of the glossary is used.

Goal: Integration with FlaggedRevs/ApprovedRevs, i.e. only approved revisions of the glossary (or of terms) shall be taken into account for marking up viewed pages.
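
In essence, the glossary would be read from the newest approved revision rather than from the newest revision. The sketch below shows that selection on plain dictionaries; it does not use the FlaggedRevs or ApprovedRevs APIs, and the `approved` flag and the function name are assumptions for illustration.

```python
def glossary_revision(revisions, require_approval=True):
    """Pick the revision to read the glossary from.

    `revisions` is a list of dicts with the keys 'id' and 'approved',
    ordered from oldest to newest (a hypothetical data shape).
    """
    for revision in reversed(revisions):
        if revision["approved"] or not require_approval:
            return revision
    # No approved revision exists yet.
    return None


# Sample history invented for this sketch.
history = [
    {"id": 1, "approved": True},
    {"id": 2, "approved": True},
    {"id": 3, "approved": False},
]
```

With this history, the approved-only mode selects revision 2, while `require_approval=False` reproduces the current behaviour and selects revision 3.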

=Timeline=

Precoding period (May 27 - June 17)

 * May 27-June 17: reading the documentation, creating profiles in Gerrit, familiarizing myself with the code

First mid-term period (June 17 - July 29)

 * Writing automated tests for the most important existing features (June 17 - June 21)
 * Ability to turn off the recognition of glossary terms in certain places. Phase 1: tags (June 24)
 * Support for multiple definitions per term (June 25 - June 26)
 * Support for inflection. Phase 1: Support for synonyms in Semantic Glossary (June 27 - June 28)
 * Context support: Ask the users which definition they meant (July 1 - July 12)
 * Ability to turn off the recognition of glossary terms in certain places. Phase 2: visual interface for turning off (July 15 - July 19)
 * Integration with FlaggedRevs/ApprovedRevs for Lingo (July 22 - July 26)

Second mid-term period (July 30 - September 16)

 * Customizing the tooltip by using a template (July 30 - August 5)
 * Context support (Terms for specified namespaces, categories, …) (August 6 - August 9)
 * Support for inflection. Phase 2: automatic support using stemmers (pilot with Russian) (August 12 - August 23)
 * Optimizing performance, testing and bug fixing (August 26 - September 16)

End of the project: September 16

 * (OPTIONAL) Investigate integration of FlaggedRevs/ApprovedRevs with Semantic Glossary

=About me=

I am a postgraduate student at Chernihiv State Technological University (Ukraine), Department of Computer Science. My fields of interest are web development, graphics and game development.

I have good knowledge of the following technologies: Java, NodeJS, PHP, Objective-C, HTML, CSS, SQL, Ruby, JavaScript (jQuery), WebGL, Canvas, XML, JSON, MySQL, MongoDB.

I have skills in the field of vector and 3D graphics.

I also have an understanding of C++/C, Qt and Assembler.

I know how to use version control systems.

=Participation=

I plan to communicate with my mentor via email and Skype, as it is a very fast way to reach each other. I will write reports on "what I have done" and "what I'm planning to do", with all the sources.

I have a good knowledge of English and Russian so it will be easy for me to have phone or video sessions. One of the mentors is Russian.

To connect with the community I can join the IRC chat or use a MediaWiki mailing list.

Mentors: Stephan Gambke and Yury Katkov.

=Past open source experience=

Google Summer of Code 2011 (June 2011 – August 2011)

Project «MediaWiki Extension: SocialProfile - UserStatus feature»

Organization: Wikimedia Foundation

Description: The aim of the project was to create a UserStatus feature for the SocialProfile extension. It allows users to post short "status updates" on their user profile page. See project description.

Technologies: PHP, MySQL, JS, jQuery, HTML/CSS

Chernihiv State Technological University (January 2012 – May 2012)

Science project «Grid Computing management tools»

Description: The project provides an API for collecting statistics in Grid networks and a JS library for building GUI web applications for modeling and submitting tasks.

Technologies: C/C++, ARC, XML, JS, HTML

Google Summer of Code 2012 (June 2012 – August 2012)

Project «jQuery mobile application for TagTeam»

Organization: Berkman Center for Internet & Society at Harvard University

Description: The aim of the project was to create a mobile interface for the JSON API built into TagTeam with the help of jQuery Mobile.

Technologies: jQuery, JavaScript, AJAX, JSON, HTML/CSS, Ruby.