Extension:Semantic Drilldown

Description
Semantic Drilldown is an extension to MediaWiki that provides a page for drilling down through a site's data, using categories and filters on semantic properties. It is heavily tied in with the Semantic MediaWiki extension, and is meant to be used for structured data that has semantic markup. Having Semantic MediaWiki installed is a precondition for the Semantic Drilldown extension; the code will not work without it.

The "Browse data" page is the heart of the extension. At the top, it shows all the top-level categories in the wiki (i.e., the categories that are not subcategories of another category), along with the number of pages within each one. Each category name is a link to a drilldown for the pages in that category. The drilldown lets the user select additional constraints to limit the number of results. These constraints come in two types:


 * Subcategories - if the category has any subcategories, those will show up in a "Subcategory" row. Each one will be a link that will let the user only view the pages that belong to that subcategory. The resulting drilldown page will include links for all the filters for the top-level category, and links for any subcategories of that subcategory. You can thus use the "Browse data" page to navigate through an entire category tree.


 * Filters - filters based on semantic properties can be manually added for any top-level category. Each such filter gets its own row within the constraints area, to let the user limit the results to only those pages that have a given value for that semantic property. There are four ways to set the possible values for a filter:
    * From property values - the default method; the filter simply shows all current values for this property.
    * Pages in a category - the filter's values can be all the pages that belong, either directly or through a subcategory, to a category.
    * By date range - results are grouped into date ranges, based on a specified time period.
    * Set manually - the filter's values can be set manually. For properties that are numbers, a numerical range can also be set for a filter value, instead of a single number.

In the display of the filter in the drilldown, values that do not have any results for them will not be displayed. The filter will also show two additional values: "Other" and "None". Pages that show up for "Other" are those that have a value for that filter's property other than one of the pre-specified values. Pages that show up for "None" are those that have no value for that property. "Other" and "None", like other filter values, will not show up if there are no results for them.

After any amount of clicking on different subcategories and filters, the user will be able to see, in the page header near the top of the page, the set of subcategories and filters they have clicked on, which together constrain the current results. The user can remove any constraint by clicking on the "x" next to its name in the header.

See below for instructions on how to define such filters, and how to add them to categories.

Code and download
You can download the Semantic Drilldown code in either one of these two compressed files:


 * semantic_drilldown_0.4.7.tar.gz
 * semantic_drilldown_0.4.7.zip

You can also download the code directly via SVN from the MediaWiki source code repository, at http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/SemanticDrilldown/. From a command line, you can do this by running "svn checkout" followed by that URL.

To view the code online, including version history for each file, you can go here.

Installation
After you've obtained a 'SemanticDrilldown' directory (either by extracting a compressed file or downloading via SVN), place this directory within the main MediaWiki 'extensions' directory. Then, in the file 'LocalSettings.php' in the main MediaWiki directory, add an 'include_once' line for the extension's 'SD_Settings.php' file (found in its 'includes' directory) somewhere below the calls for the Semantic MediaWiki extension (both the main 'include_once' line and the 'enableSemantics' line).
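A minimal sketch of that line, assuming the directory layout listed under "Code structure" (SD_Settings.php inside 'includes') and MediaWiki's standard $IP install-path variable; adjust the path if your installation differs:

```php
# Semantic MediaWiki must already be included and enabled above this point.
include_once( "$IP/extensions/SemanticDrilldown/includes/SD_Settings.php" );
```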

You may also wish to change the number value for the new "Filter" namespace, defined in SD_Settings.php; by default it is set to 170.

Also, if you have any custom namespaces declared, you should set the variable $sdgNamespaceIndex to 170 in the file 'LocalSettings.php', before the 'include_once' call for Semantic Drilldown.

(Or, instead of 170, whatever number you want the "Filter" namespace set to.)

NOTE: The definition of $sdgNamespaceIndex and the call to SD_Settings.php must be placed after the initialization of any custom namespace definitions in LocalSettings.php. Otherwise, the Filter namespace will not initialize.
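The ordering described in the note above can be sketched as follows. The custom namespace shown here (the constant NS_PROJECTS, its index 100 and its name) is purely hypothetical; only $sdgNamespaceIndex, the value 170 and SD_Settings.php come from the text above, and the include path assumes the directory layout listed under "Code structure".

```php
# 1. Hypothetical custom namespace, declared first.
define( 'NS_PROJECTS', 100 );
$wgExtraNamespaces[NS_PROJECTS] = 'Projects';

# 2. Index for the "Filter" namespace (170 is the default named above).
$sdgNamespaceIndex = 170;

# 3. Only then load Semantic Drilldown's settings file.
include_once( "$IP/extensions/SemanticDrilldown/includes/SD_Settings.php" );
```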

Languages supported
Semantic Drilldown has full support for English, Taiwanese Chinese, Mainland Chinese, German and Persian, and partial support for Afrikaans, Arabic, Belarusian, Bulgarian, Catalan, Danish, Dutch, Esperanto, Finnish, French, West Frisian, Galician, Low German, Greek, Hawaiian, Hindi, Hungarian, Icelandic, Indonesian, Interlingua, Javanese, Khmer, Kotava, Manx, Malayalam, Marathi, Nahuatl, Northern Sotho, Norwegian Bokmål, Norwegian Nynorsk, Occitan, Ossetic, Pashto, Polish, Portuguese, Brazilian Portuguese, Ripuarian, Romanian, Russian, Seeltersk, Serbian Cyrillic, Silesian, Spanish, Swedish, Tajik, Telugu, Tetum, Turkish, Upper Sorbian, Vietnamese, Volapük and Zazaki.

Author
Semantic Drilldown was written by Yaron Koren, reachable at yaron57 -at- gmail.com.

Version
Semantic Drilldown is currently at version 0.4.7.

The version history is:


 * 0.1 - December 10, 2007 - Initial version


 * 0.2 - December 17, 2007 - 'CreateFilter' page added; improved property lookup for SMW 1.0; Persian-language support added; small bug fixes
 * 0.2.1 - December 20, 2007 - Added "Uses time period" and "Requires filter" properties; header includes link to category; German-language support added
 * 0.2.2 - December 26, 2007 - Filters with only a property set have their possible values retrieved dynamically, instead of from "Allows value" settings


 * 0.3 - January 8, 2008 - Language files changed to be updated by Betawiki, the MediaWiki translation wiki; look of main page updated, and name changed from "View data" to "Browse data"
 * 0.3.1 - January 9, 2008 - Bug fixes; "Browse data" look updated; language support added for Arabic, Bulgarian, Dutch, Finnish, French and Upper Sorbian
 * 0.3.2 - January 15, 2008 - Language support added for Slovak; Persian support improved
 * 0.3.3 - January 24, 2008 - Language support added for Galician, Greek, Hungarian, Kotava, Occitan and Seeltersk
 * 0.3.4 - January 31, 2008 - Bug fix for SMW 1.0 support; language support added for Portuguese; improvements to other languages
 * 0.3.5 - February 6, 2008 - Bug fixes for SMW 1.0 and others
 * 0.3.6 - February 12, 2008 - Bug fixes for page names with apostrophes and for numerical-range filters; language support added for Afrikaans and Volapük
 * 0.3.7 - February 25, 2008 - Language support added for Khmer, Northern Sotho, Norwegian Bokmål, Swedish, Telugu, and improved for other languages
 * 0.3.8 - March 3, 2008 - URL structure improved for top-level categories; "?_cat=CategoryName" is now just "/CategoryName"
 * 0.3.9 - March 13, 2008 - Improved display of boolean values; language support added for Esperanto, Icelandic, Marathi, Pashto, Polish and Tajik


 * 0.4 - April 24, 2008 - Handling added for selecting multiple values per filter (using "or"); display improved for header (better use of bolding), filters (hiding and tag-cloud display enabled) and drilldown results (results are now in category-page style); categories list can be removed with new "_single" URL query variable; drilldown page titles can be set for a specific category, and default title now includes the name of the category; fixed language-value handling for MW versions before 1.11; language support added for Danish, Hindi, Manx, Norwegian Nynorsk, Ossetic, Russian, Serbian Cyrillic, Silesian, Tetum and Vietnamese
 * 0.4.1 - April 27, 2008 - Display of "remove filter" images simplified, and overall CSS improved, to work better on non-Firefox browsers; handling fixed for subcategory names with apostrophes; some SQL modified to get around some permissions restrictions
 * 0.4.2 - May 8, 2008 - Sorting of drilldown results now uses "default sort" value when available; language support added for Javanese and Malayalam
 * 0.4.3 - May 30, 2008 - Fixed display of Boolean 'no' value; fixed page display for categories with no pages; language support added for Indonesian and Ripuarian
 * 0.4.4 - June 23, 2008 - Added compatibility with SMW 1.2's new database structure; language support added for Catalan, Low German, Hawaiian, Turkish and Zazaki
 * 0.4.5 - July 2, 2008 - Updated initialization of special pages to use autoloading of classes and language values
 * 0.4.6 - August 1, 2008 - Bug fixes for SMWSQLStore2; added explanatory line at top of 'BrowseData' page; language support added for West Frisian, Interlingua, Nahuatl, Brazilian Portuguese, Romanian
 * 0.4.7 - August 28, 2008 - Another SMWSQLStore2 bug fix; fix for tag-cloud display; improved some SQL structure; grouping added for special pages; language support added for Belarusian and Spanish

Special pages
The extension defines three "special" MediaWiki pages:


 * Special:BrowseData - displays a drilldown interface for browsing all the data on the site. (See example of page)
 * Special:CreateFilter - lets a user create a new filter. (See example of page)
 * Special:Filters - lists all filter pages on the site. (See example of page)

Code structure
The following are the directories and files in the Semantic Drilldown extension:

/includes


 * SD_AppliedFilter.php - defines the class SDAppliedFilter, which represents a filter in conjunction with a value.
 * SD_Filter.inc - defines the class SDFilter, which represents a filter on a single property.
 * SD_GlobalFunctions.php - functions and constants used by the rest of the Semantic Drilldown code.
 * SD_Settings.php - various settings for Semantic Drilldown.

/languages


 * SD_Language.php - parent class for all language files
 * SD_LanguageDe.php - German-language text
 * SD_LanguageEn.php - English-language text
 * SD_LanguageFa.php - Persian-language text
 * SD_LanguageZh_cn.php - Mainland-Chinese-language text
 * SD_LanguageZh_tw.php - Taiwanese-Chinese-language text
 * SD_Messages.php - Display messages for all languages

/skins


 * SD_main.css - main CSS file for Semantic Drilldown

/specials


 * SD_CreateFilter.php - defines the 'CreateFilter' special page
 * SD_Filters.php - defines the 'Filters' special page
 * SD_BrowseData.php - defines the 'BrowseData' special page

Getting started
Before you set up Semantic Drilldown, you should have all the data structures on your site set up - properties/attributes/relations, categories, templates, and, if you're using them, forms. See the Semantic Forms "Getting started" section for more on how to set these up.

After all this is done, and you've added some actual data, you should take the following steps:


 * Drill down through the data. As soon as you've installed Semantic Drilldown, you can go to the "BrowseData" special page and see what the category structure looks like on your site. There you can see what filters are needed or would be helpful for each category.


 * Create filters. Every filter you want to be able to drill down with has to be defined separately, on a page in the "Filter:" namespace. The easiest way to create filters is using the 'CreateFilter' special page (see above).


 * Add filters to categories. To add a filter to a category, simply add the tag [[Has filter::Filter:filter-name]] to the category's page. Filters should only be added to top-level categories.

Filter settings
Within the page for each filter, you should place semantic tags that define the filter. The allowed tags are:


 * Covers property - The only mandatory tag. Specifies which relation, attribute or property this filter applies to.
 * Gets values from category - States that the possible values for this filter are all the pages that are members of a certain category.
 * Has value - Adds a specific value allowed for this filter. You can add as many such tags as you want to a filter.
 * Uses time period - Used for date filters; indicates the period of time that values are divided into (options are "Month" and "Year").
 * Has label - Specifies the name of the filter which will be displayed on the screen.
 * Requires filter - States that a filter should only be displayed for users if they have already selected a value from the specified filter.

If none of "Gets values from category", "Has value" or "Uses time period" is defined for a filter, the extension will simply list all current values of the property for the given set of pages.

Example
Here is the relevant part of the source code for the 'Sources' category page at Discourse DB:

This category uses the filters [[Has filter::Filter:Type]], [[Has filter::Filter:Circulation]] and [[Has filter::Filter:Country]].

And here is the source code for each of the three relevant filters - Type:

This filter covers the property [[Covers property::Attribute:Publication type]].

Circulation:

This filter covers the property [[Covers property::Attribute:Has circulation]]. It has the values [[Has value:=< 100,000]], [[Has value:=100,001 - 250,000]], [[Has value:=250,001 - 500,000]], [[Has value:=500,001 - 1,000,000]] and [[Has value:=> 1,000,000]].

and Country:

This filter covers the property [[Covers property::Attribute:Is published in country]]. It has the values [[Has value:=Australia]], [[Has value:=Great Britain]] and [[Has value:=United States]].

Setting drilldown page title
You can manually set the title of the drilldown page for any specific category, by adding the special property "Has drilldown title" to the category's page. For example, a page called "Category:Cities" could contain a tag of the form [[Has drilldown title::title-text]], where title-text is the desired title.

Tag-cloud-style display of filter values
You can set the drilldown page to show the values for each filter in "tag-cloud" style, where the size of each value's name is dependent on the number of results it has. To do this, you need to add two values to your LocalSettings.php file, "$sdgFiltersSmallestFontSize" and "$sdgFiltersLargestFontSize"; these represent the font size of the names of the least-popular and most-popular filter values, respectively, in pixels. Here is an example:
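A minimal sketch of these two settings in LocalSettings.php; the sizes 9 and 25 are illustrative assumptions, not documented defaults:

```php
# Font size, in pixels, for the least-popular filter values (assumed value).
$sdgFiltersSmallestFontSize = 9;
# Font size, in pixels, for the most-popular filter values (assumed value).
$sdgFiltersLargestFontSize = 25;
```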

Setting the display of drilldown results
By default, the list of results, or pages that match the current set of filters, is displayed in three columns, with a maximum of 100 results per page. You can change both of these by adding two values to your LocalSettings.php file, "$sdgNumResultsColumns" and "$sdgNumResultsPerPage". To set the page to show just one column, with 250 results per page, add the following:
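The one-column, 250-results configuration described above would look roughly like this, using only the variable names and values given in the text:

```php
# Show drilldown results in a single column instead of the default three.
$sdgNumResultsColumns = 1;
# Show up to 250 results per page instead of the default 100.
$sdgNumResultsPerPage = 250;
```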

Removing the list of categories
You may want to have the drilldown page show the data only for one category, and not include the list of categories that lets the user click away from the current one. To enable that, just add the string "_single" anywhere in the URL query string; this will remove the list of categories. An example of such a URL is "Special:BrowseData/Cars?_single".

Excluding categories from the drilldown
You may want certain categories to not show up in the main top-level list of categories. There is no way to completely exclude such a category. However, you can create a new category, called "Other" or anything else, and make each such category a member of that new category. That way, all such categories will only be viewable as subcategories of that "Other" category, instead of cluttering up the list of categories at the top.

Sites that use Semantic Drilldown
Here are some sites that use Semantic Drilldown in conjunction with Semantic MediaWiki:


 * Ani-Jobs
 * Discourse DB
 * GMAT Club
 * Mikomos
 * Placeography - Histories and stories about buildings and places
 * Technical Presentations
 * TheMusicSnob.com
 * Verwaltungskooperation - Cooperation in Public Administration
 * WeCoWi

You can see an alternate listing of sites that use Semantic Drilldown on the Semantic MediaWiki Community Wiki, an SMW-based wiki that contains additional information on each wiki.

Mailing list
Semantic Drilldown has no mailing list of its own, but you can use the Semantic MediaWiki mailing list, semediawiki-user, for any questions, suggestions or bug reports on Semantic Drilldown. You must be a member of the list to post.

Hosting
Currently only one wiki hosting site offers support for Semantic Drilldown: Referata, created and run by Yaron Koren. Wikis on Referata can use Semantic MediaWiki, Semantic Forms, Semantic Drilldown and a variety of related extensions; basic usage is free.

Bugs and feature requests
You can submit bug reports and requests for new features at MediaWiki's Bugzilla, here.

The current list of known bugs and requested features for Semantic Drilldown can be found here.

Contributing patches to the project
If you have found and fixed a bug, or written code for a new feature, please create a patch by going to the main "SemanticDrilldown" directory and typing:

svn diff >descriptivename.patch

Then go to the relevant bug report in Bugzilla, or create one if one doesn't exist (note, again, that Bugzilla is used for both bugs and feature requests), and attach this patch file to it.

If, for any reason, you don't wish to use Bugzilla, feel free to simply send this patch, with a description, to Yaron Koren.

Translating
Translation of Semantic Drilldown is done through Betawiki. The translation for this extension can be found here. To add language values or change existing ones, you should create an account on Betawiki, then request permission from the administrators to translate a certain language or languages on this page (this is a very simple process). Once you have permission for a given language, you can log in and add or edit whatever messages you want to in that language.