MediaWiki Developer Meet-Up 2009/Notes/WikiWord


 * By user:Duesentrieb


 * WikiWord extracts a thesaurus from Wikipedia.
 * Homepage: http://brightbyte.de/page/WikiWord
 * thesis extract: http://brightbyte.de/page/WikiWord/Excerpt
 * Navigator: http://toolserver.org/~daniel/wikiword/wikiword.php
 * Thesaurus supplies relations:
 * term <-> concept (meaning relation)
 * concept <-> concept (related, similar, broader, narrower)
 * concepts = wiki articles
 * terms = title, redirect, anchor text, sort key, etc.
 * multilingual
 * concepts from multiple Wikipedias combined
 * terms in multiple languages referring to one concept
 * useful for indexing, disambiguation
 * plan: multilingual image search for Commons (German blog post)
 * ideas for improvement:
 * get magic names and patterns from pywikipediabot config
 * use incremental updates as much as possible
 * look at co-occurrence in paragraphs, look at co-co-occurrence
 * for image search: index by image caption (used images)
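
The relations listed above (term <-> concept, concept <-> concept, terms in several languages pointing at one concept) can be sketched as a small in-memory structure. This is only an illustration and not WikiWord's actual schema or code: the class `Thesaurus`, its methods, the `INVERSE` table, the concept ids, and the sample terms are all hypothetical.

```python
from collections import defaultdict

# Hypothetical inverse table for the concept <-> concept relations.
INVERSE = {
    "broader": "narrower",
    "narrower": "broader",
    "related": "related",
    "similar": "similar",
}


class Thesaurus:
    """Toy model of the thesaurus relations; not WikiWord's real data model."""

    def __init__(self):
        # (language, term) -> concept ids: the "meaning" relation
        self.meanings = defaultdict(set)
        # (concept id, relation name) -> concept ids
        self.relations = defaultdict(set)

    def add_term(self, lang, term, concept):
        """Record that `term` (title, redirect, anchor text, ...) can mean `concept`."""
        self.meanings[(lang, term.lower())].add(concept)

    def relate(self, source, relation, target):
        """Store a concept <-> concept relation together with its inverse."""
        self.relations[(source, relation)].add(target)
        self.relations[(target, INVERSE[relation])].add(source)

    def concepts_for(self, lang, term):
        """All concepts a term may refer to (more than one means it is ambiguous)."""
        return set(self.meanings.get((lang, term.lower()), ()))

    def terms_for(self, concept, lang):
        """All terms in `lang` that refer to `concept`, sorted."""
        return sorted(
            term
            for (term_lang, term), concepts in self.meanings.items()
            if term_lang == lang and concept in concepts
        )

    def translate(self, term, source_lang, target_lang):
        """Map a term to terms in another language by going through shared concepts."""
        return {
            concept: self.terms_for(concept, target_lang)
            for concept in self.concepts_for(source_lang, term)
        }


def build_example():
    """Build a tiny thesaurus from made-up sample data."""
    thesaurus = Thesaurus()
    thesaurus.add_term("en", "Bank", "bank_finance")
    thesaurus.add_term("en", "Bank", "river_bank")
    thesaurus.add_term("de", "Kreditinstitut", "bank_finance")
    thesaurus.add_term("de", "Ufer", "river_bank")
    thesaurus.relate("bank_finance", "broader", "financial_institution")
    return thesaurus
```

Going from a term to its concepts and back to terms in another language is the kind of lookup a multilingual image search would need: a query in one language resolves to language-independent concepts, which can then be matched against captions in any language.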