Extension:Cargo/Querying data

From MediaWiki.org
Jump to: navigation, search
Cargo - navigation
Basics Main pageExtension:Cargo (talk) · Download and installationExtension:Cargo/Download and installation · Quick start guideExtension:Cargo/Quick start guide · Other documentationExtension:Cargo/Other documentation · SMW migration guideExtension:Cargo/SMW migration guide
Using Cargo Storing dataExtension:Cargo/Storing data · Querying dataExtension:Cargo/Querying data (Display formatsExtension:Cargo/Display formats) · Browsing dataExtension:Cargo/Browsing data · Exporting dataExtension:Cargo/Exporting data · Other featuresExtension:Cargo/Other features
Resources for help Common problemsExtension:Cargo/Common problems · Known bugs and planned featuresExtension:Cargo/Known bugs and planned features · Getting supportExtension:Cargo/Getting support
About Cargo Authors and creditsExtension:Cargo/Authors and credits · Version historyExtension:Cargo/Version history · Sites that use CargoExtension:Cargo/Sites that use Cargo · Cargo and Semantic MediaWikiExtension:Cargo/Cargo and Semantic MediaWiki · FAQExtension:Cargo/FAQ

Querying of data within Cargo can be done via two functions, #cargo_query and #cargo_compound_query, as well as a special page, Special:ViewTable. The Special:ViewTable page provides a simple interface that lets you set the parameters for a query. The #cargo_compound function essentially calls two or more queries at the same time, then displays their results together.

#cargo_query[edit]

The #cargo_query function is essentially a wrapper around SQL, with a few modifications. It is called with the following syntax:

{{#cargo_query:
tables=table1=tableAlias1, table2=tablesAlias2, etc.
|join on=table1.fieldA = table2.fieldB,table2.fieldC=table3.fieldD, etc.
|fields=field1=fieldAlias1,field2=Alias2, etc.
|where=table1.fieldE="some value" AND/OR etc.
|group by=table1.fieldG
|having=table1.fieldG="some value", etc.
|order by=table2.fieldF, etc.
|limit=some number
|intro=some text
|outro=some text
|default=some text
|more results text=some text
|no html
|format=format
...additional format-based parameters
}}

The first eight parameters should look fairly familiar to anyone experienced with SQL SELECT queries:

  • tables= - corresponds to the FROM clause (with the optional "alias" values corresponding to AS clauses); "table=" can be used instead.
  • join on= - corresponds to the JOIN...ON clauses.
  • fields= - corresponds to the SELECT clause (with the optional "alias" values corresponding to AS clauses). Its default value is "_pageName".
  • where= - corresponds to the WHERE clause.
  • group by= - corresponds to the GROUP BY clause.
  • having= - corresponds to the HAVING clause (similar to WHERE, but applies to values computed for "groups").
  • order by= - corresponds to the ORDER BY clause, which sorts the answer set. Its default value is "_pageName ASC" which does an ascending sort on page name.
  • limit= - corresponds to the LIMIT clause.

The next three parameters are conditional, being applied depending on whether or not there are results:

  • intro= - sets text that goes right before the query results; applied only if there are results.
  • outro= - sets text that goes right after the query results; applied only if there are results.
  • default= - sets text that goes in place of the query results, if there are no results. The default value is "No results", in the language of the user. To not have any text appear, just set "default=".

The last set of parameters is:

  • more results text= - sets text that goes after the query display, to link to additional results. The default value is "More...", in the language of the user. To not have any text appear, just set "more results text=".
  • no html - specifies that this query does not include any HTML in its results. This is important for embedded queries (a #cargo_query call contained within another one, or within a different parser function). Such embedded calls often fail due to parsing; placing 'no html' within the inner queries tends to fix the problem.
  • format= - sets the format of the display (see Display formats).
  • There can also be additional allowed parameters, specific to the chosen display format. Again, see Display formats for the possible values of both the "format" parameter and these additional parameters.

Of all these parameters, "tables=" is the only required one; although if "tables=" holds multiple tables, "join on=" is required as well.

Examples[edit]

The following query gets all the cities on the current (fictional) wiki and their populations, and displays them in a table:

{{#cargo_query:
tables=Cities
|fields=_pageName=City,Population
|format=table
}}

The following query gets only cities in Asia, and displays the city name, the country name and the population, all in a dynamic, JavaScript-based table:

{{#cargo_query:
tables=Cities,Countries
|join on=Cities.Country=Countries._pageName
|fields=Cities._pageName=City,Countries._pageName=Country,Cities.Population
|where=Countries.Continent="Asia"
|format=dynamic table
}}

The following query gets all countries in the wiki, and the number of cities in each one that have a page in the wiki, for each country that has more than two cities; it then displays that set of numbers with a bar chart, with the highest number of cities at the beginning:

{{#cargo_query:
tables=Cities,Countries
|join on=Cities.Country=Countries._pageName
|fields=Countries._pageName=Country,COUNT(*)
|group by=Countries._pageName
|having=COUNT(*) > 2
|order by=COUNT(*) DESC
|format=bar chart
}}

#cargo_compound_query[edit]

You may want to have the results of more than one query within the same display; this is possible using the #cargo_compound_query function. The main usage for such "compound queries" is to display multiple sets of points on a map or calendar, with a different icon (in maps) or a different color (in calendars) for each set; though it's also possible to use it to show multiple result sets within simpler formats like tables and lists.

#cargo_compound_query is called by passing in the sets of parameters of one or more calls to #cargo_query, with the overall sets separated by pipes, and the internal parameters now separated by semicolons.

Here is an example call, that would show two groups of pages in a map, with a different icon for each:

{{#cargo_compound_query:
tables=Restaurants;where=City=Lisbon;fields=_pageName,Address,Rating,Coords;icon=Fork.jpg
|tables=Bars;where=City=Lisbon;fields=_pageName,Address,Hours,Coords;icon=Martini.png
|format=openlayers
}}

Essentially, each subquery functions as its own "parameter". These subqueries have their own sub-parameters that are a subset of the allowed parameters of #cargo_query: only the parameters directly related to querying - "tables", "join on", etc. - are allowed, plus two more parameters related to display in calendars and maps: "color" and "icon".

Besides the subqueries, the only allowed other parameters for #cargo_compound_query are "format", plus whatever additional parameters there are for the chosen format.

Display formats[edit]

Data returned from a Cargo query can be displayed in many different ways, including lists, tables, charts, calendars, maps and other visualizations. See Display formats for a full listing.

The "HOLDS" command[edit]

SQL's own support for fields that contain lists/arrays is unfortunately rather poor. For this reason, #cargo_declare creates an additional, helper table for each field that holds a list of values. Additionally, #cargo_query supports a new, SQL-like command, "HOLDS", within the "join on=" and "where=" parameters, that makes querying on such data easier. Instead of having to manually include the helper table in each such call, you can use "HOLDS" to simplify the syntax; it is what is known as "syntactic sugar" for the true, more complex, SQL.

You can use "HOLDS" within the "where=" parameter to find all rows whose list field contains a certain value. To use our earlier example, if we have a table called "Books" that contains a field, "Authors", holding a list of authors, we can use the following #cargo_query call to get all books written or co-written by Leo Tolstoy:

{{#cargo_query:
tables=Books
|fields=_pageName=Book,Authors
|where=Authors HOLDS "Leo Tolstoy"
}}

This call is equivalent to this more complex one:

{{#cargo_query:
tables=Books,Books__Authors
|join on=Books._ID=Books__Authors._rowID
|fields=_pageName=Book,Books.Authors__full=Authors
|where=Books__Authors._value = "Leo Tolstoy"
}}

Similarly, you can use "HOLDS" within "join on=" to join two tables together based on values within list fields. For example, if information about authors is stored within its own database table, "Authors", and you wanted to display a table of books, their authors, and those authors' dates of birth, you could have the following call:

{{#cargo_query:
tables=Books,Authors
|join on=Books.Authors HOLDS Authors._pageName
|fields=Books._pageName,Books.Authors,Authors.Date_of_birth
}}

This is equivalent to the explicit call of:

{{#cargo_query:
tables=Books,Books__Authors,Authors
|join on=Books._ID=Books__Authors._rowID,Books__Authors._value=Authors._pageName
|fields=Books._pageName,Books__Authors._value,Authors.Date_of_birth
}}

"HOLDS LIKE"[edit]

There is an additional command, "HOLDS LIKE", that maps the SQL "LIKE" command onto all of a list of values; it works just like "HOLDS". For instance, to get all books written or co-written by anyone with "Leo" in their name, you could call:

{{#cargo_query:
tables=Books
|fields=_pageName=Book,Authors
|where=Authors HOLDS LIKE "%Leo%"
}}

The "NEAR" command[edit]

Like arrays, coordinates are not well-supported overall by relational databases. For that reason, similarly to arrays, coordinates have special handling for both storage and querying. For the case of coordinates, if you want to query on coordinates, the recommended approach is to use the "NEAR" command, which like "HOLDS" is a virtual command, defined by Cargo.

"NEAR" finds all the points near a specified set of coordinates, within a specified distance. The coordinates and distance must be placed in parentheses, separated by commas; and the distance must be in either kilometers (specified as "kilometers" or "km") or miles (specified as "miles" or "mi").

For instance, if there is a table called "Restaurants", holding a list of restaurants, and it contains a field called "Coords" holding the coordinates of each field, you could call the following query to display all restaurants (and some relevant information about them) within 10 kilometers of the Piazza San Marco in Italy:

{{#cargo_query:
tables=Restaurants
|fields=_pageName=Restaurant,Address,Rating,Coords
|where=Coords NEAR (45.434, 12.338, 10 km)
}}

The "MATCHES" command[edit]

For fields of type "Searchtext", you can do a standard text search on their contents using the "MATCHES" command. This is equivalent to a MATCH ... AGAINST call in MySQL; this command is currently available only for MySQL.

For most wikis, the only field of type "Searchtext" will be the automatically-generated field _pageName._fullText (see Storing page data). A simple query, to get all the pages in the wiki containing the word "meeting", might look like:

{{#cargo_query:
table=_pageData
|fields=_pageName=Page,_fullText=Search results
|where=_fullText MATCHES 'meeting'
}}

You can see more complex demonstrations of the MATCHES functionality here.

Using SQL functions[edit]

You can include native functions from whichever database system you're using within #cargo_query, in the "fields", "join on" and "where" parameters. For the sake of security, the set of allowed SQL functions is defined in a global variable, $wgCargoAllowedSQLFunctions. The default set is:

  • Math functions: 'COUNT', 'FLOOR', 'CEIL', 'ROUND', 'MAX', 'MIN', 'AVG', 'SUM', 'POWER', 'LN', 'LOG'
  • String functions: 'CONCAT', 'LOWER', 'LCASE', 'UPPER', 'UCASE', 'SUBSTRING', 'FORMAT'
  • Date functions: 'NOW', 'DATE', 'YEAR', 'MONTH', 'DAYOFMONTH', 'DATE_FORMAT', 'DATE_ADD', 'DATE_SUB', 'DATEDIFF'

If you want to use any additional functions, you can enable them by adding lines to LocalSettings.php like this one, after the inclusion of Cargo:

$wgCargoAllowedSQLFunctions[] = 'CURDATE';

You can find documentations on all of these functions online. For MySQL, for example, you can see the string functions explained here, and date functions explained here. Below are some examples of how SQL functions can be used, all assuming a MySQL system.

Custom link text
You can use CONCAT() to create custom link text for both internal and external links. Example:
{{#cargo_query:table=Newspapers
|fields=CONCAT( '[[', _pageName, '|View page]]' ) = Newspaper, Circulation, CONCAT( '[', URL, ' View URL]' ) = URL
}}
Removing page links
Conversely, you can use CONCAT() to remove links to values, for fields of type "Page". By default, such values are displayed as links, but you may want to instead display them as just strings. The CONCAT() function is probably the easiest way to do that. If the "Author" field here is of type "Page", then to display author values as just strings, you could call the following:
{{#cargo_query:tables=Blog_posts
|fields=_pageName,CONCAT(Author)
}}
This works because, as long as what is being displayed is not simply the field name, #cargo_query will not apply any of that field's special handling.
If the field holds a list of values, you should instead call "CONCAT(fieldName__full)". So the call could look like this:
{{#cargo_query:tables=Blog_posts
|fields=_pageName,CONCAT(Topics__full)
}}
Date filtering
You can use date functions like DATEDIFF() to get items with a date within a certain range. Example:
{{#cargo_query:tables=Blog_posts
|fields=_pageName,Author,Date
|where=DATEDIFF(Date,NOW()) >= -7
|order by=Date DESC
}}
Truncating strings
You can use string functions like LEFT() or SUBSTRING() to trim strings. The following example also uses CONCAT() and IF() to append an ellipsis, only if the string value (a quote) has been truncated.
{{#cargo_query:tables=Authors
|fields=_pageName=Author, CONCAT( LEFT( Quote, 200 ), IF( LENGTH( Quote ) > 200, "...", "" ) )=Quote
}}

Configuration[edit]

There are several settings you can add to LocalSettings.php to change the display of query results:

  • $wgCargoDefaultQueryLimit - the number of results to show for #cargo_query if no limit is set (default is 100)
  • $wgCargoMaxQueryLimit - the maximum allowed number of results for #cargo_query (default is 5000)
  • $wgCargoAllowedSQLFunctions - the set of allowed SQL functions (see "Using SQL functions", above)
  • $wgCargoHideNamespaceName - an array of namespaces that, if a page from that namespace is displayed among the query results, the namespace part of the page name is hidden (by default, the array holds just NS_FILE)
  • $wgCargoDecimalMark - the character that separates between the main number and the decimal part for numbers on this wiki (depends on the wiki's country of origin) (default is ".")
  • $wgCargoDigitGroupingCharacter - the character that separates digits (usually thousands) for numbers on this wiki (default is ",")

If any of these are added to LocalSettings.php, they should be placed below the inclusion of Cargo.