Extension:Cargo/Cargo and Semantic MediaWiki

Semantic MediaWiki (SMW) is an extension to MediaWiki that lets you store and query data. It has a large number of spinoff extensions (around 30 active ones) that make use of it, and together turn an SMW-based system into something resembling a full-fledged, easy-to-use data framework.

The Cargo extension was consciously designed to mimic the full system of SMW and many of its spinoff extensions, in its syntax options and overall interface. In a few cases, code itself has been copied over as well, though in a modified form. In all, Cargo provides some or all of the functionality of seven extensions from the SMW "family": Semantic MediaWiki, Semantic Result Formats, Maps, Semantic Drilldown, Semantic Compound Queries, Semantic Internal Objects and Semantic Scribunto.

Design differences
If Cargo is essentially just a clone of SMW and some other extensions, why was it created in the first place? And why should anyone use it? Cargo differs from SMW in a number of ways that give it some advantages.

Philosophically, Cargo differs from SMW in three main ways:
 * Cargo ties data storage directly to templates. In SMW, semantic values can be placed anywhere on the page, even though in practice they're usually confined to templates; but in Cargo, it is the template itself that is responsible for storing its data.
 * Cargo stores its data in as simple a fashion as possible, using standard database tables to hold tabular data, while SMW uses a database to represent "triples" of data.
 * Though this is a more minor difference, Cargo is less customizable than SMW and its spinoff extensions, opting instead to base display settings on the data itself.
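
The second difference above, tabular storage versus "triples", can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module and a made-up Books table; the table names, column names and sample data are all assumptions for illustration, and neither layout is the actual schema used by Cargo or SMW.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Tabular layout (hypothetical): one row per item, one column per field.
conn.execute("CREATE TABLE Books (page_name TEXT, author TEXT, year INTEGER)")
conn.execute("INSERT INTO Books VALUES ('Dune', 'Frank Herbert', 1965)")

# Triple layout (hypothetical): one row per (subject, property, value) statement.
conn.execute("CREATE TABLE triples (subject TEXT, property TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [("Dune", "author", "Frank Herbert"), ("Dune", "year", "1965")],
)

# The same question against each layout: who wrote the book published in 1965?
tabular_answer = conn.execute(
    "SELECT author FROM Books WHERE year = 1965"
).fetchall()

# With triples, each additional property in the question needs another self-join.
triple_answer = conn.execute(
    """
    SELECT a.value
    FROM triples AS y
    JOIN triples AS a ON a.subject = y.subject
    WHERE y.property = 'year' AND y.value = '1965'
      AND a.property = 'author'
    """
).fetchall()
```

Both queries return the same author, but the tabular version is a plain single-table SELECT, which is the kind of simplicity the list above describes.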

The first two differences in particular allow Cargo's storage and querying code to be much simpler than that of SMW. Cargo lets users make near-direct use of SQL "SELECT" statements, which means that a custom query language does not need to be defined or supported. It also means that Cargo's own code for displaying query results in various formats can be significantly simpler than the corresponding code in SMW, SRF etc. And it means that the setup and maintenance work for administrators can be simpler. Cargo, a single extension, can take the place of about 15 extensions: the seven extensions listed above, plus another seven or so "library" extensions required by Semantic MediaWiki, like DataValues.

Features checklist
The table below shows the main set of functionality that SMW-based sites tend to make use of, and how it is, or is not, available in a Cargo-based system.

Advantages of Cargo
The previous section covered the ways in which Cargo does and does not measure up to Semantic MediaWiki's abilities. But there are some things that Cargo can do better, or which SMW currently cannot do at all. These are listed below.

More powerful querying
The use of near-direct SQL enables Cargo to run queries that are not easily possible in SMW. These include:
 * Finding blank values. You can get the set of pages that do not have a value for some field; this is not possible with SMW.
 * Sorting with blank values. With SMW, if you sort on a particular property, pages that have a blank value for that property will not be displayed in the results. With Cargo, blank values are handled in the same way as all other values.
 * String operations. With Cargo, you can do string operations and comparisons within queries, like finding all rows that have a value for some field with exactly five characters.
 * Complex logical combinations of AND, OR and NOT.
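
As a rough sketch of what these four kinds of query look like in SQL, the example below runs them against an in-memory SQLite database using Python's sqlite3 module. The Cities table, its columns and its rows are invented for illustration, and SQLite stands in for whatever database the wiki actually uses; in Cargo itself such conditions would be passed through its query syntax rather than written as raw SQL like this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; None is stored as SQL NULL, i.e. a blank value.
conn.execute(
    "CREATE TABLE Cities (page_name TEXT, country TEXT, population INTEGER)"
)
conn.executemany(
    "INSERT INTO Cities VALUES (?, ?, ?)",
    [
        ("Paris", "France", 2100000),
        ("Lagos", "Nigeria", None),
        ("Quito", None, 2800000),
        ("Toronto", "Canada", 2900000),
    ],
)


def names(sql):
    """Run a query and return the first column of each row as a list."""
    return [row[0] for row in conn.execute(sql)]


# Finding blank values: pages with no value for a field.
blank_country = names("SELECT page_name FROM Cities WHERE country IS NULL")

# Sorting with blank values: rows with a blank population stay in the results.
by_population = names("SELECT page_name FROM Cities ORDER BY population")

# String operations: names with exactly five characters.
five_chars = names(
    "SELECT page_name FROM Cities WHERE LENGTH(page_name) = 5 ORDER BY page_name"
)

# Logical combination of AND, OR and NOT.
combined = names(
    """
    SELECT page_name FROM Cities
    WHERE (country = 'France' OR population > 2500000)
      AND NOT page_name = 'Toronto'
    ORDER BY page_name
    """
)
```

Where blank values land in a sorted result (first or last) depends on the database system, so the sketch only relies on them being present.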

Easier data structure setup

 * No properties. Cargo does not use properties and property pages; thus, its data structure is considerably smaller, since properties can easily make up 95% or more of the pages in an SMW data structure.
 * No subobjects. As noted above, with SMW you need to use either subobjects or "internal objects" (essentially the same thing) to store an array of data within a page. In Cargo, all data is stored the same way, eliminating this complication.
 * Automatic display of all data. From the page Special:CargoTables, users can click through and see a tabular display of all the Cargo data. In SMW, queries would have to be created manually to show all this data.
 * Automatic drilldown filters. In Cargo, filters for drilldown are set automatically, based on the fields in each table and their types. In SMW (really Semantic Drilldown), each such filter has to be defined manually.

Faster performance
Cargo uses a simple database structure, instead of Semantic MediaWiki's more complex, custom DB structure (assuming a triplestore is not used), so one might expect Cargo's querying to be at least somewhat faster than SMW's. One small-scale test comparing the two has been run; you can see the details at the page Performance testing. In this test, Cargo querying was around 50% faster than SMW querying.

Other

 * Full text searching. If you are using MySQL, you can do a standard text search on the text of pages, on the text of files (PDF only), and potentially on other fields as well, within queries. This is not possible with Semantic MediaWiki.
 * Full text search within drilldown. Similarly, in Cargo's Special:Drilldown page, there is a search input for searching on the contents of pages and uploaded PDF files. This is not possible with Semantic Drilldown.
 * Easier querying by outside systems. Both SMW and Cargo provide an API for querying their contents by external systems. But with Cargo, you can also have such systems query the database tables directly, if they have the proper permissions. (This is also doable in SMW, but extracting data from its tables through direct SQL queries is difficult.)
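
For the API route mentioned in the last item, an external system might assemble a request URL along the following lines. This is a minimal sketch: the wiki address, the Books table and the Author field are hypothetical, the helper function is not part of Cargo, and the parameter names follow Cargo's "cargoquery" API action as commonly documented, so they should be checked against the installed version. The sketch only builds the URL and does not send any request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def cargo_query_url(api_url, tables, fields, where=None, limit=50):
    """Build a URL for a Cargo API query (hypothetical helper, not part of Cargo)."""
    params = {
        "action": "cargoquery",
        "format": "json",
        "tables": tables,
        "fields": fields,
        "limit": limit,
    }
    if where:
        params["where"] = where
    # urlencode percent-escapes spaces, commas and other special characters.
    return api_url + "?" + urlencode(params)


# Hypothetical wiki, table and field names.
url = cargo_query_url(
    "https://wiki.example.org/w/api.php",
    tables="Books",
    fields="_pageName,Author",
    where="Author IS NULL",
)

# Decode the query string again to confirm the parameters survive encoding.
decoded = parse_qs(urlsplit(url).query)
```

A system with direct database access could skip the API entirely and run ordinary SELECT statements against the Cargo tables, subject to the permissions noted above.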