Extension:Cargo/Querying data

Querying of data within Cargo can be done via two functions, #cargo_query and #cargo_compound_query, as well as a special page, Special:ViewTable. The Special:ViewTable page provides a simple interface that lets you set the parameters for a query. The #cargo_compound function essentially calls two or more queries at the same time, then displays their results together.

#cargo_query
The #cargo_query function is essentially a wrapper around SQL, with a few modifications. It is called with the following syntax:

The first eight parameters should look fairly familiar to anyone experienced with SQL SELECT queries:
 * tables= - corresponds to the FROM clause; "table=" is an alias for it.
 * join on= - corresponds to the JOIN...ON clauses.
 * fields= - corresponds to the SELECT clause (with the optional "alias" values corresponding to AS clauses). Its default value is "_pageName".
 * where= - corresponds to the WHERE clause.
 * group by= - corresponds to the GROUP BY clause.
 * having= - corresponds to the HAVING clause (similar to WHERE, but applies to values computed for "groups").
 * order by=</tt> - corresponds to the ORDER BY clause. Its default value is "_pageName ASC</tt>".
 * limit=</tt> - corresponds to the LIMIT clause.

The next three parameters are conditional, being applied depending on whether or not there are results:
 * intro=</tt> - sets text that goes right before the query results; applied only if there are results.
 * outro=</tt> - sets text that goes right after the query results; applied only if there are results.
 * default=</tt> - sets text that goes in place of the query results, if there are no results. The default value is "No results", in the language of the user. To not have any text appear, just set "default=".

The last set of parameters is:
 * more results text=</tt> - sets text that goes after the query display, to link to additional results. The default value is "More...", in the language of the user. To not have any text appear, just set "more results text=".
 * format=</tt> - sets the format of the display.
 * There can also be additional allowed parameters, specific to the chosen display format. See "Display formats" below for the possible values of both the "format" parameter and these additional parameters.

Of all these parameters, "tables=" is the only required one; although if "tables=" holds multiple tables, "join on=" is required as well.

Examples
The following query gets all the cities on the current (fictional) wiki and their populations, and displays them in a table:

The following query gets only cities in Asia, and displays the city name, the country name and the population, all in a dynamic, JavaScript-based table:

The following query gets all countries in the wiki, and the number of cities in each one that have a page in the wiki, for each country that has more than two cities; it then displays that set of numbers with a bar chart, with the highest number of cities at the beginning:

#cargo_compound_query
You may want to have the results of more than one query within the same display; this is possible using the #cargo_compound_query function. The main usage for such "compound queries" is to display multiple sets of points on a map or calendar, with a different icon (in maps) or a different color (in calendars) for each set; though it's also possible to use it to show multiple result sets within simpler formats like tables and lists.


 * 1) cargo_compound_query is called by passing in the sets of parameters of one or more calls to #cargo_query, with the overall sets separated by pipes, and the internal parameters now separated by semicolons.

Here is an example call, that would show two groups of pages in a map, with a different icon for each:

Essentially, each subquery functions as its own "parameter". These subqueries have their own sub-parameters that are a subset of the allowed parameters of #cargo_query: only the parameters directly related to querying - "tables", "join on", etc. - are allowed, plus two more parameters related to display in calendars and maps: "color" and "icon".

Besides the subqueries, the only allowed other parameters for #cargo_compound_query are "format", plus whatever additional parameters there are for the chosen format.

Display formats
The "format=" parameter lets you set in which format to display the results. The Cargo extension supports the following formats:

Lists

 * list</tt>: Displays results in a delimited list.
 * Parameters:
 * delimiter</tt> - sets the separator character (default is comma)


 * ul</tt>: Displays results in a bulleted list.
 * Parameters:
 * columns</tt> - sets the number of columns (default is 1)
 * Example: Discourse DB latest items


 * ol</tt>: Displays results in a numbered list.
 * Parameters:
 * columns</tt> - sets the number of columns (default is 1)


 * category</tt>: Displays results in the style of MediaWiki categories.
 * Parameters:
 * columns</tt> - sets the number of columns (default is 3)

More complex text displays

 * <tt>template</tt>: Displays results using a MediaWiki template that formats each row of results. The template in question must use unnamed parameters, i.e. it must refer to its parameters as, etc.
 * Parameters:
 * <tt>template</tt> - specifies the name of the template to use (mandatory)
 * Example: Discourse DB "position" page (queries here, template here)


 * <tt>embedded</tt>: Shows the full text of each queried page (only the first field of the query is used). (No parameters.)


 * <tt>outline</tt>: Shows results in an outline format.
 * Parameters:
 * <tt>outline fields</tt> - holds a comma-separated list of the query's fields; these fields are used to define the outline, by which the results are grouped together (mandatory)
 * Example: Discourse DB opinion items outline


 * <tt>tree</tt>: Shows results in a tree format, in which a single field defines all the relationships between "child" pages and "parent" pages.
 * Parameters:
 * <tt>parent field</tt> - holds the name of this connector field (mandatory)


 * <tt>table</tt>: Displays results in a table. (No parameters.)


 * <tt>dynamic table</tt>: Displays results in a "dynamic" table that includes sorting, pagination and searching, using the DataTables JavaScript library. (No parameters.)
 * Example: Discourse DB opinion items


 * <tt>tag cloud</tt>: Shows results in a "tag cloud" format, where the number corresponding to each string value dictates the font size for that string.
 * Parameters:
 * <tt>min size</tt> - the percentage of normal font size that the smallest text displays will have (default is 80)
 * <tt>max size</tt> - the percentage of normal font size that the largest text displays will have (default is 200)
 * <tt>template</tt> - specifies the name of a template to use to display each result
 * Example: MWstake tags

Image displays

 * <tt>gallery</tt>: Displays a gallery of images, in the style of the MediaWiki &lt;gallery&gt; tag. The images must be files that were uploaded to the wiki; they can either be the pages that are directly queried (if the image pages call a Cargo-based template), or fields, of type "File", of other pages.
 * Parameters:
 * <tt>mode</tt> - sets the display mode of the gallery; can be <tt>traditional</tt> (the default), <tt>nolines</tt>, <tt>packed</tt>, <tt>packed-overlay</tt> or <tt>packed-hover</tt>. See here for a demonstration of these options.

Time-based displays

 * <tt>calendar</tt>: Displays results in a calendar, using the FullCalendar JavaScript library.
 * Parameters:
 * <tt>width</tt> - sets the width of the calendar (default is 100%)
 * <tt>start date</tt> - sets the date at which to display the calendar (default is the current date)
 * <tt>view</tt> - sets the starting display of the calendar - options are 'day', 'week', or 'month' (default is month)
 * <tt>color</tt> - sets the color in which to show the names of events; useful within #compound_query (default is set by the FullCalendar library)
 * Additionally, a field within the query with the name or alias "name" will set the name of each event to the value of that field for that event.
 * Example: Discourse DB opinion calendar


 * <tt>timeline</tt>: Displays results in a timeline, using the SIMILE Timeline library.
 * Parameters:
 * <tt>height</tt> - sets the height of the timeline (default is 350px)
 * <tt>width</tt> - sets the width of the timeline (default is 100%)
 * Examples: Discourse DB topic (see bottom of page), MWstake meetings

Numerical displays

 * <tt>bar chart</tt>: Displays results in a bar chart (with horizontal bars), using the NVD3 JavaScript library.
 * Parameters:
 * <tt>height</tt> - sets the height of the bar chart (default is based on the number of bars)
 * <tt>width</tt> - sets the width of the bar chart (default is 100%)
 * Example: Discourse DB most popular authors

Maps

 * <tt>googlemaps</tt>: Displays results in a map, using the Google Maps service.
 * Parameters:
 * <tt>height</tt> - sets the height of the map (default is 400px)
 * <tt>width</tt> - sets the width of the map (default is 700px)
 * <tt>icon</tt> - sets a custom icon to be used to display points; value must be the name of a file that has been uploaded to the wiki. This is especially useful within #compound_query.
 * <tt>zoom</tt> - sets the zoom level, i.e. an integer value from around 0 to around 20, with higher numbers being more zoomed in (default is based on the area of the set of points being displayed).
 * Example: Discourse DB map test


 * <tt>openlayers</tt>: Displays results in a map, using the OpenLayers service.
 * Parameters: same as for <tt>googlemaps</tt>.
 * Example: Discourse DB map test (see bottom of page)

More complex displays

 * <tt>exhibit</tt>: Displays results in a browsable interface, using the SIMILE Exhibit library/service.
 * Parameters:
 * <tt>view</tt> - sets the view(s) that will be displayed, separated by comma if more than one. Valid values are <tt>map</tt>, <tt>tabular</tt> and <tt>timeline</tt>. If not set, the view(s) will be set based on the types of the fields in the query.
 * <tt>facets</tt> - sets the fields to be used for the facets/filters, separated by comma if more than one. A maximum of three are allowed. If not set, the first three fields of the result are used.
 * <tt>datalabel</tt> - sets the label that refers to the data. Default is "Item".`


 * for the "timeline" view:
 * <tt>end</tt> - sets the name of the field holding the end time for each event (if any)
 * <tt>color</tt> - sets the name of the field used to color-code the markers (if any)
 * <tt>topunit</tt> - sets the unit for the top band: <tt>millisecond</tt>, <tt>second</tt>, <tt>minute</tt>, <tt>hour</tt>, <tt>day</tt>, <tt>week</tt>, <tt>month</tt>, <tt>year</tt>, <tt>decade</tt>, <tt>century</tt>, <tt>millennium</tt>
 * <tt>toppx</tt> - sets the width, in pixels, for each interval in the top band
 * <tt>bottompx</tt> - sets the width, in pixels, for each interval in the bottom band


 * Example: Programas Educativos Exhibit demo

Export
Three export-based display formats are defined: <tt>csv</tt>, <tt>json</tt> and <tt>excel</tt>. See Exporting data for documentation on these. In addition, the "exhibit" format displays an orange toolbox that lets you download the current data in various export formats: BibTex, RDF/XML, JSON, Semantic Wikitext and TSV.

If no format is specified, <tt>list</tt> is the default format if there is only a single field being displayed in the results, while <tt>table</tt> is the default format if there is more than one field.

The "HOLDS" command
SQL's own support for fields that contain lists/arrays is unfortunately rather poor. For this reason, #cargo_declare creates an additional, helper table for each field that holds a list of values. Additionally, #cargo_query supports a new, SQL-like command, "HOLDS", within the "join on=" and "where=" parameters, that makes querying on such data easier. Instead of having to manually include the helper table in each such call, you can use "HOLDS" to simplify the syntax; it is what is known as "syntactic sugar" for the true, more complex, SQL.

You can use "HOLDS" within the "where=" parameter to find all rows whose list field contains a certain value. To use our earlier example, if we have a table called "Books" that contains a field, "Authors", holding a list of authors, we can use the following #cargo_query call to get all books written or co-written by Leo Tolstoy:

This call is equivalent to this more complex one:

Similarly, you can use "HOLDS" within "join on=" to join two tables together based on values within list fields. For example, if information about authors is stored within its own database table, "Authors", and you wanted to display a table of books, their authors, and those authors' dates of birth, you could have the following call:

This is equivalent to the explicit call of:

"HOLDS LIKE"
There is an additional command, "HOLDS LIKE", that maps the SQL "LIKE" command onto all of a list of values; it works just like "HOLDS". For instance, to get all books written or co-written by anyone with "Leo" in their name, you could call:

The "NEAR" command
Like arrays, coordinates are not well-supported overall by relational databases. For that reason, similarly to arrays, coordinates have special handling for both storage and querying. For the case of coordinates, if you want to query on coordinates, the recommended approach is to use the "NEAR" command, which like "HOLDS" is a virtual command, defined by Cargo.

"NEAR" finds all the points near a specified set of coordinates, within a specified distance. The coordinates and distance must be placed in parentheses, separated by commas; and the distance must be in either kilometers (specified as "kilometers" or "km") or miles (specified as "miles" or "mi").

For instance, if there is a table called "Restaurants", holding a list of restaurants, and it contains a field called "Coords" holding the coordinates of each field, you could call the following query to display all restaurants (and some relevant information about them) within 10 kilometers of the Piazza San Marco in Italy:

Using SQL functions
You can include native functions from whichever database system you're using within #cargo_query, in the "fields", "join on" and "where" parameters. For the sake of security, the set of allowed SQL functions is defined in a global variable, $wgCargoAllowedSQLFunctions. The default set is:
 * Math functions: 'COUNT', 'FLOOR', 'CEIL', 'ROUND', 'MAX', 'MIN', 'AVG', 'SUM', 'POWER', 'LN', 'LOG'
 * String functions: 'CONCAT', 'LOWER', 'LCASE', 'UPPER', 'UCASE', 'SUBSTRING', 'FORMAT'
 * Date functions: 'NOW', 'DATE', 'DATE_FORMAT', 'DATE_ADD', 'DATE_SUB', 'DATEDIFF', 'DATE_FORMAT'

If you want to use any additional functions, you can enable them by adding lines to LocalSettings.php like this one, after the inclusion of Cargo:

You can find documentations on all of these functions online. For MySQL, for example, you can see the string functions explained here, and date functions explained here. Below are some examples of how SQL functions can be used, all assuming a MySQL system.


 * Custom link text: You can use CONCAT to create custom link text for both internal and external links. Example:


 * Removing page links: Conversely, you can use CONCAT to remove links to values, for fields of type "Page". By default, such values are displayed as links, but you may want to instead display them as just strings. The CONCAT function is probably the easiest way to do that. If the "Author" field here is of type "Page", then to display author values as just strings, you could call the following:


 * This works because, as long as what is being displayed is not simply the field name, #cargo_query will not apply any of that field's special handling.
 * If the field holds a list of values, you should instead call "CONCAT(fieldName__full)". So the call could look like this:


 * Date filtering: You can use date functions like DATEDIFF to get items with a date within a certain range. Example:


 * Truncating strings: You can use string functions like LEFT or SUBSTRING to trim strings. The following example also uses CONCAT and IF to append an ellipsis, only if the string value (a quote) has been truncated.

Configuration
There are several settings you can add to LocalSettings.php to change the display of query results:
 * <tt>$wgCargoDefaultQueryLimit</tt> - the number of results to show for #cargo_query if no limit is set (default is 100)
 * <tt>$wgCargoMaxQueryLimit</tt> - the maximum allowed number of results for #cargo_query (default is 5000)
 * <tt>$wgCargoAllowedSQLFunctions</tt> - the set of allowed SQL functions (see "Using SQL functions", above)
 * <tt>$wgCargoDecimalMark</tt> - the character that separates between the main number and the decimal part for numbers on this wiki (depends on the wiki's country of origin) (default is ".")
 * <tt>$wgCargoDigitGroupingCharacter</tt> - the character that separates digits (usually thousands) for numbers on this wiki (default is ",")

If any of these are added to LocalSettings.php, they should be placed below the inclusion of Cargo.