WikidataEntitySuggester/Proposal

An Entity Suggester that recommends relevant properties and values for an item and improves the sort order in the entity selector

 * Public URL: Entity Suggester
 * Bugzilla report: Entity Suggester : Bug #46555, Entity Selector sort : Bug #45351
 * Announcement:

Name and contact information

 * Name: Nilesh Chakraborty
 * Email: nilesh@nileshc.com
 * IRC or IM networks/handle(s):
 * jabber: nilesh@nileshc.com
 * freenode nick: nileshc
 * Location: Kolkata, India (UTC+05:30)
 * Typical working hours: 15:00–24:00 IST (9:30–18:30 UTC)

Summary
Wikidata authors spend a considerable amount of time finding the right properties and the right values for them. This project aims to make that task easier. Its goal is three-fold: (i) suggest properties relevant to the context, which depends on the item being edited; (ii) suggest values for the recommended properties, or for a new property that the author starts with; and (iii) make the sorting mechanism of the entity selector smarter, so that more relevant properties appear at the top. A collaborative filtering approach will be used to suggest properties and to do the sorting. Suggesting values calls for a more granular approach, tailored to each type of property.
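As a rough illustration of the collaborative filtering idea for property suggestion, the sketch below ranks candidate properties by how often they co-occur with the properties an item already has. It is a toy under stated assumptions, not the planned implementation: the function names, the property labels and the four-item data set are all hypothetical, and simple co-occurrence counting stands in for whatever a full recommendation engine would compute.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(items):
    """Count how often each ordered pair of properties appears on the same item."""
    cooc = defaultdict(Counter)
    for props in items.values():
        for a, b in permutations(set(props), 2):
            cooc[a][b] += 1
    return cooc


def suggest_properties(cooc, existing, top_n=3):
    """Rank properties missing from an item by co-occurrence with those it has."""
    existing = set(existing)
    scores = Counter()
    for prop in existing:
        # .get avoids creating empty entries for properties never seen before.
        for other, count in cooc.get(prop, {}).items():
            if other not in existing:
                scores[other] += count
    # Highest score first; ties broken alphabetically so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [prop for prop, _ in ranked[:top_n]]


# Hypothetical toy data: item id -> set of property labels used on that item.
ITEMS = {
    "item_a": {"instance of", "date of birth", "place of birth", "occupation"},
    "item_b": {"instance of", "date of birth", "occupation"},
    "item_c": {"instance of", "country", "population"},
    "item_d": {"instance of", "country", "population", "capital"},
}

COOC = build_cooccurrence(ITEMS)

if __name__ == "__main__":
    print(suggest_properties(COOC, {"date of birth"}))
    print(suggest_properties(COOC, {"instance of", "country"}))
```

The same scores could in principle drive the entity selector's sort order, by listing higher-scoring properties first. Value suggestion is deliberately left out, since it needs per-property-type handling.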

Benefit to MediaWiki
This project will make adding a new item to Wikidata much easier and more efficient for authors: they will receive real-time recommendations for properties and values instead of having to come up with every property themselves. The ordering of properties under an item will also be improved.

About you
I am a third-year undergraduate student of computer science, pursuing my B.Tech degree. In short, I love programming, and it's pretty much what I do all day when I'm not busy with something else! I have unending enthusiasm for anything related to big data, data mining, machine learning and recommendation engines, and I enjoy researching those topics because I'm passionate about them.

Finding the idea of building an entity suggester for Wikidata on the MediaWiki GSoC ideas page was pure serendipity. If I could build something that makes the job easier for Wikidata authors and lets them work more efficiently, it would be nothing short of fabulous. Since I have thorough experience with recommendation engines (both Apache Mahout and Myrrix), I believe I can use my skills to the fullest and make the entity suggester quite possibly "the most awesomest wiki enhancement ever". :-)

Participation
I will post weekly or bi-weekly on my blog at nileshc.com about my progress on the project and the status of the milestones, and I will communicate with my mentor and the community via the [wikidata-l@lists.wikimedia.org wikidata-l] or [wikitech-l@lists.wikimedia.org wikitech-l] mailing lists.

Honestly, though, I'm not much of a blogger; I prefer to focus on the work itself, with only a moderate amount of interaction.

I will use GitHub to host the source code and will share the repository link with my mentors and the community once it is set up.

Past open source experience
Honestly, I do not have a lot of published open source code. I am currently working on a Facebook friend suggester that recommends friends based on the semantic similarity of their interests. Previously, as part of a team of four, I worked on building an online interactive social college magazine from scratch (using Java EE/JSF, WebSphere and a DB2 server) and designed its database schema. Unfortunately, it was never completed. The database schema and use-case diagrams I designed are available here.

Any other info
Currently, I am reading the MediaWiki technical manual, studying the data model and learning how to develop extensions.