Extension:Cargo/Querying data

Querying of data within Cargo can be done in a number of ways:
 * Via two parser functions, #cargo_query and #cargo_compound_query
 * At the page Special:ViewData, which provides a simple interface that lets you set the parameters for a query
 * Via a query written in Lua, using the Scribunto extension
 * Simply by SQL calls, if you have direct access to the Cargo database.

The #cargo_query and #cargo_compound_query functions are explained on this page; #cargo_compound_query function essentially calls two or more queries at the same time, then displays their results together. The interface at Special:ViewTable provides the same parameters that #cargo_query does.

For help querying Cargo data within Lua/Scribunto, see Other features#Lua support.

There is no documentation here for how to do direct SQL querying of Cargo data; hopefully it is self-explanatory.

#cargo_query
The #cargo_query function is essentially a wrapper around SQL, with a few modifications. It is called with the following syntax:

The first eight parameters should look fairly familiar to anyone experienced with SQL SELECT queries:
 * - corresponds to the FROM clause (with the optional "alias" values corresponding to AS clauses); " " can be used instead.
 * - corresponds to the JOIN...ON clauses.
 * - corresponds to the SELECT clause (with the optional "alias" values corresponding to AS clauses). Its default value is " ".
 * - corresponds to the WHERE clause.
 * - corresponds to the GROUP BY clause. Note that DISTINCT does not work in Cargo (except within a function like COUNT) so in most cases you must use group by to eliminate duplicates.
 * - corresponds to the HAVING clause (similar to WHERE, but applies to values computed for "groups").
 * - corresponds to the ORDER BY clause, which sorts the answer set. Its default value is " " which does an ascending sort on page name.
 * - corresponds to the LIMIT clause.
 * - corresponds to the OFFSET clause.

The next three parameters are conditional, being applied depending on whether or not there are results:
 * - sets text that goes right before the query results; applied only if there are results.
 * - sets text that goes right after the query results; applied only if there are results.
 * - sets text that goes in place of the query results, if there are no results. The default value is "No results", in the language of the user. To not have any text appear, just set "default=".

The last set of parameters is:
 * - sets text that goes after the query display, to link to additional results. The default value is "More...", in the language of the user. To not have any text appear, just set "more results text=".
 * - specifies that this query does not include any HTML in its results. This is important for embedded queries (a #cargo_query call contained within another one, or within a different parser function). Such embedded calls often fail due to parsing; placing 'no html' within the inner queries tends to fix the problem.
 * - sets the maximum number of characters to display for any field value; if a value is larger than this number of characters, it is replaced with a link reading "Show" that, when clicked, displays the entire text.
 * - sets the format of the display (see Display formats).
 * There can also be additional allowed parameters, specific to the chosen display format. Again, see Display formats for the possible values of both the "format" parameter and these additional parameters.

Of all these parameters, "tables=" is the only required one; although if "tables=" holds multiple tables, "join on=" is required as well.

Examples
The following query gets all the cities on the current (fictional) wiki and their populations, and displays them in a table:

The following query gets only cities in Asia, and displays the city name, the country name and the population, all in a dynamic, JavaScript-based table:

The following query gets all countries in the wiki, and the number of cities in each one that have a page in the wiki, for each country that has more than two cities; it then displays that set of numbers with a bar chart, with the highest number of cities at the beginning:

#cargo_compound_query
You may want to have the results of more than one query within the same display; this is possible using the #cargo_compound_query function. The main usage for such "compound queries" is to display multiple sets of points on a map or calendar, with a different icon (in maps) or a different color (in calendars) for each set; though it's also possible to use it to show multiple result sets within simpler formats like tables and lists.


 * 1) cargo_compound_query is called by passing in the sets of parameters of one or more calls to #cargo_query, with the overall sets separated by pipes, and the internal parameters now separated by semicolons.

Here is an example call, that would show two groups of pages in a map, with a different icon for each:

Essentially, each subquery functions as its own "parameter". These subqueries have their own sub-parameters that are a subset of the allowed parameters of #cargo_query: only the parameters directly related to querying - "tables", "join on", etc. - are allowed, plus two more parameters related to display in calendars and maps: "color" and "icon".

Besides the subqueries, the only allowed other parameters for #cargo_compound_query are "format", plus whatever additional parameters there are for the chosen format.

Display formats
Data returned from a Cargo query can be displayed in many different ways, including lists, tables, charts, calendars, maps and other visualizations. See Display formats for a full listing.

The "HOLDS" command
SQL's own support for fields that contain lists/arrays is unfortunately rather poor. For this reason, #cargo_declare creates an additional, helper table for each field that holds a list of values. Additionally, #cargo_query supports a new, SQL-like command, "HOLDS", within the "join on=" and "where=" parameters, that makes querying on such data easier. Instead of having to manually include the helper table in each such call, you can use "HOLDS" to simplify the syntax; it is what is known as "syntactic sugar" for the true, more complex, SQL.

You can use "HOLDS" within the "where=" parameter to find all rows whose list field contains a certain value. To use our earlier example, if we have a table called "Books" that contains a field, "Authors", holding a list of authors, we can use the following #cargo_query call to get all books written or co-written by Leo Tolstoy:

This call is equivalent to this more complex one:

Similarly, you can use "HOLDS" within "join on=" to join two tables together based on values within list fields. For example, if information about authors is stored within its own database table, "Authors", and you wanted to display a table of books, their authors, and those authors' dates of birth, you could have the following call:

This is equivalent to the explicit call of:

"HOLDS LIKE"
There is an additional command, "HOLDS LIKE", that maps the SQL "LIKE" command onto all of a list of values; it works just like "HOLDS". For instance, to get all books written or co-written by anyone with "Leo" in their name, you could call:

The "WITHIN" command
The "WITHIN" command can be applied to hierarchy fields which hold a single value. This command let you match against not just a value, but all of its children values, as defined in the hierarchy.

For example, let’s say there is a table called "Recipes", and among its fields is one called "Main_ingredient", which is defined as a hierarchy. You can do the following query to look for recipes having a main ingredient that is a root vegetable:

"HOLDS WITHIN"
The "HOLDS WITHIN" command can be applied to hierarchy fields which hold a list of values. As you might imagine from the name, it is a combination of "HOLDS" and "WITHIN" - it finds all entries that have at least one match for a value and/or one of its children values.

For example, let’s say there is a table called "Movies", which contains a field called "Genres", which is defined as a hierarchy. To get all the movies that are defined as being, at least in part, a "Drama" or one of its sub-genres, you could run the following query:

The "NEAR" command
Like arrays, coordinates are not well-supported overall by relational databases. For that reason, similarly to arrays, coordinates have special handling for both storage and querying. For the case of coordinates, if you want to query on coordinates, the recommended approach is to use the "NEAR" command, which like "HOLDS" is a virtual command, defined by Cargo.

"NEAR" finds all the points near a specified set of coordinates, within a specified distance. The coordinates and distance must be placed in parentheses, separated by commas; and the distance must be in either kilometers (specified as "kilometers" or "km") or miles (specified as "miles" or "mi").

For instance, if there is a table called "Restaurants", holding a list of restaurants, and it contains a field called "Coords" holding the coordinates of each field, you could call the following query to display all restaurants (and some relevant information about them) within 10 kilometers of the Piazza San Marco in Italy:

The "MATCHES" command
For fields of type "Searchtext", you can do a standard text search on their contents using the "MATCHES" command. This is equivalent to a MATCH ... AGAINST call in MySQL; this command is currently available only for MySQL.

For most wikis, the only field of type "Searchtext" will be the automatically-generated field _pageData._fullText (see Storing page data). A simple query, to get all the pages in the wiki containing the word "meeting", might look like:

You can see more complex demonstrations of the MATCHES functionality here.

Using SQL functions
You can include native functions from whichever database system you're using within #cargo_query, in the "fields", "join on" and "where" parameters. For the sake of security, the set of allowed SQL functions is defined in a global variable, $wgCargoAllowedSQLFunctions. The default set is:
 * Math functions: 'COUNT', 'FLOOR', 'CEIL', 'ROUND', 'MAX', 'MIN', 'AVG', 'SUM', 'POWER', 'LN', 'LOG'
 * String functions: 'CONCAT', 'LOWER', 'LCASE', 'UPPER', 'UCASE', 'SUBSTRING', 'FORMAT'
 * Date functions: 'NOW', 'DATE', 'YEAR', 'MONTH', 'DAYOFMONTH', 'DATE_FORMAT', 'DATE_ADD', 'DATE_SUB', 'DATEDIFF'

If you want to use any additional functions, you can enable them by adding lines to LocalSettings.php like this one, after the inclusion of Cargo:

You can find documentations on all of these functions online. For MySQL, for example, you can see the string functions explained here, and date functions explained here. Below are some examples of how SQL functions can be used, all assuming a MySQL system.

Custom link text
You can use CONCAT to create custom link text for both internal and external links. Example:

Removing page links
Conversely, you can use CONCAT to remove links to values, for fields of type "Page". By default, such values are displayed as links, but you may want to instead display them as just strings. The CONCAT function is probably the easiest way to do that. If the "Author" field here is of type "Page", then to display author values as just strings, you could call the following:

This works because, as long as what is being displayed is not simply the field name, #cargo_query will not apply any of that field's special handling.

If the field holds a list of values, you should instead call "CONCAT(fieldName__full)". So the call could look like this:

Date filtering
You can use date functions like DATEDIFF to get items with a date within a certain range. Example:

Truncating strings
You can use string functions like LEFT or SUBSTRING to trim strings. The following example also uses CONCAT and IF to append an ellipsis, only if the string value (a quote) has been truncated. Example:

Displaying hard-coded (non-Cargo) values
You can use nested IF statements to display custom data that does not come from Cargo within the query results. Example:

Configuration
There are several settings you can add to LocalSettings.php to change the display of query results:
 * - the number of results to show for #cargo_query if no limit is set (default is 100)
 * - the maximum allowed number of results for #cargo_query (default is 5000)
 * - the set of allowed SQL functions (see "Using SQL functions", above)
 * - an array of namespaces that, if a page from that namespace is displayed among the query results, the namespace part of the page name is hidden (by default, the array holds just NS_FILE)
 * - the character that separates between the main number and the decimal part for numbers on this wiki (depends on the wiki's country of origin) (default is ".")
 * - the character that separates digits (usually thousands) for numbers on this wiki (default is ",")

If any of these are added to LocalSettings.php, they should be placed below the inclusion of Cargo.