WikidataEntitySuggester

This is a prototype for the Entity Suggester's first and second objectives: suggesting properties and values for a new item in Wikidata. I'll be working on adding this entity suggester to Wikidata and improving the sorting order of the entity selector for Google Summer of Code 2013.

As of now, Myrrix is used to build a basic model. The optimal values of lambda and the number of features that I found with ParameterOptimizer are not being used yet; I need to do more experimentation before adopting them.

It's an initial prototype written in Java and PHP, using Myrrix' Java API and Guzzle. The Java backend is a Myrrix instance, plus a couple of custom wrapper servlets that are used to push data into the Myrrix instance and get recommendations from it. The PHP client is built on top of Guzzle and exposes a neat PHP API that can be used to query the backend.

Setting it up is easy: fire up Tomcat with the backend WAR file and run a few commands, then use the PHP API to get suggestions from it. I have also included a standalone command-line client JAR. After building from source, you can find it here:

Wiki Pages
Please read these pages in sequence to learn how to set everything up and how it works. The instructions are for Ubuntu, so it should be fairly easy to follow them and set this up on Labs.
 * How to set everything up on Linux (must read!)
 * CSV file explanation
 * Using the PHP client (also contains examples)
 * Using the command line client (also contains examples)
 * Which class does what

Acknowledgements

 * 1) Byrial for sharing the programs used to generate database tables from the Wikidata data dump. The property statistics have also been very helpful. I have written a couple of SQL scripts to generate the CSV files required by the Entity Suggester. The C programs are slow, and their char arrays often break, so they are neither portable nor robust. Therefore, I'm migrating to Python MapReduce scripts (run through Hadoop streaming) that I'm writing myself to parse the wiki dumps.
 * 2) bcc-myrrix - it's a PHP client for Myrrix built on top of Guzzle. I used its code and modified it to suit my needs for the Entity Suggester PHP Client.
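
To give a rough idea of what a Hadoop streaming script for the CSV generation mentioned above could look like, here is a minimal sketch. It is not the actual parsing code. It assumes a hypothetical, already flattened input with one tab-separated `item<TAB>property<TAB>value` record per line, and it emits `item,property,count` rows. The real dump format and the CSV layout the Entity Suggester expects are described in the wiki pages and may differ. The function names `map_lines` and `reduce_lines` are made up for illustration.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'item<TAB>property' for each well-formed input record.

    Assumes (hypothetically) tab-separated input: item, property, value.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[0] or not fields[1]:
            continue  # skip malformed records instead of crashing
        yield f"{fields[0]}\t{fields[1]}"


def reduce_lines(sorted_lines):
    """Reducer: count each (item, property) key and emit one CSV row per key.

    Relies on the input being sorted by key, which Hadoop streaming
    guarantees between the map and reduce phases.
    """
    keys = (line.rstrip("\n") for line in sorted_lines)
    for key, group in groupby(keys):
        item, prop = key.split("\t")
        count = sum(1 for _ in group)
        yield f"{item},{prop},{count}"


if __name__ == "__main__":
    # The same script serves as mapper or reducer, chosen by one argument.
    steps = {"map": map_lines, "reduce": reduce_lines}
    if len(sys.argv) == 2 and sys.argv[1] in steps:
        for out in steps[sys.argv[1]](sys.stdin):
            print(out)
```

Saved under a hypothetical name such as `pairs.py`, the script can be tried locally without Hadoop by piping a sample file through `python pairs.py map`, then `sort`, then `python pairs.py reduce`. The `sort` step stands in for the shuffle phase that Hadoop streaming performs between mapper and reducer.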

Progress Reports
I'll be maintaining monthly and weekly reports on this page.