WikidataEntitySuggester/Progress

= Monthly reports =
I'll divide each report into "Things done" and "Things to do" sections: the former covers what I did over the month, the latter my immediate goals.

== June ==
Things done:
 * 1) Researched techniques for recommending values.
 * 2) Finished the documentation for the Wikidata Entity Suggester prototype; I am in the middle of posting it on this page and its sub-pages.
 * 3) Set up the Gerrit repository and got access to an m1.large Labs instance, wikidata-suggester.
 * 4) In May I had written MapReduce scripts in Python for use with Disco, to replace the C programs that Byrial shared (which parse the wiki dump and generate a CSV file and database tables), since the C programs sometimes broke when some fields exceeded certain limits. Disco has an Erlang dependency, so I decided to adapt the scripts to run on Hadoop through Hadoop Streaming. I have configured Hadoop on the wikidata-suggester server and tested the scripts.
 * 5) Transferred the prototype code to Gerrit. I have yet to push the PHP client.
 * 6) Made a few changes to my GSoC proposal page to reflect the new developments (addition of the wiki pages, the extension page, etc.).
 * 7) Partially deployed the prototype on the Labs server; this should be finished in a couple of days. The instance now has a public IP, and I have opened a few ports to monitor Hadoop, Myrrix, etc.
 * 8) Created the extension page for the Entity Suggester.
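
To illustrate the Hadoop Streaming approach mentioned in item 4, here is a minimal sketch of a mapper and reducer in Python. It is not the code from the repository: the input format (one tab-separated item/property pair per line), the function names <code>map_lines</code> and <code>reduce_lines</code>, and the sample data are all assumptions made for illustration. The point it shows is that malformed rows are skipped rather than breaking the run, and that the reducer relies on its input arriving sorted by key, as Hadoop Streaming delivers it.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper step: emit 'item<TAB>property' for each well-formed line.

    Assumed (hypothetical) input: one 'item_id<TAB>property_id' pair per line.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2 or not all(fields):
            continue  # skip malformed rows instead of crashing
        item_id, property_id = fields
        yield f"{item_id}\t{property_id}"


def reduce_lines(sorted_lines):
    """Reducer step: turn key-sorted mapper output into one CSV row per item."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for item_id, group in groupby(pairs, key=lambda pair: pair[0]):
        # Deduplicate and sort the properties so the row is deterministic.
        properties = sorted({prop for _, prop in group})
        yield ",".join([item_id] + properties)


# Local simulation of map -> shuffle/sort -> reduce on made-up sample data.
sample = ["Q1\tP31\n", "Q2\tP21\n", "Q1\tP279\n", "bad line\n", "Q2\tP31\n"]
rows = list(reduce_lines(sorted(map_lines(sample))))
for row in rows:
    print(row)
```

Under Hadoop Streaming, each step would instead live in its own script that reads <code>sys.stdin</code> and prints to standard output; the <code>sorted()</code> call here only stands in for the sort that Hadoop performs between the two steps.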

Things to do:
 * 1) Add some functionality to the MapReduce scripts to create database tables.
 * 2) Finish deploying the prototype on labs.
 * 3) Receive feedback, ask for new ideas/features.
 * 4) Do more research on recommendation techniques and case-based reasoning, and write code.