Extension talk:Cargo

[BUG?] Recategorisation and _pageData table
I have a feeling that when the Replace Text extension has been used, the pages disappear from the _pageData table, or at least the pages don't show up on Cargo queries on that table. I haven't tested this out though, but can do if you think it would be useful. Jonathan3 (talk) 08:48, 1 May 2021 (UTC)

I've had a think again. I was using Replace Text to recategorise pages. I then recategorised some remaining pages manually (by editing the page) which had the same effect. I don't think it's to do with the Replace Text extension after all.

I think what's happening is this. Page X is in categories A, B, C and D. I replace. Cargo now realises that page X is in category E, but has forgotten that it's also in categories B, C and D. _pageData._categories only contains "E". I am 99% sure of this from what I've seen of the effects of recategorisation, though I've not yet tried to reproduce it deliberately. Jonathan3 (talk) 09:15, 1 May 2021 (UTC)

A further observation. Page X again is in categories A, B, C and D. If I remove (talk) 09:39, 1 May 2021 (UTC)


 * It's true that categories are handled differently from all the other fields within _pageData - that's because MediaWiki does category setting via jobs, so it doesn't happen instantaneously on page save. I don't know what's happening with this particular bug, though. Does it only happen when the page is modified via Replace Text? Yaron Koren (talk) 03:55, 3 May 2021 (UTC)


 * I've had a look using the query below, and changing the categories to see what it says.


 * When the job queue is empty, it works fine. When I add/remove/edit a category, the query prints the correct sentence as soon as the page is saved.
 * When the job queue is not empty (e.g. when I edit a template) it does not work but, instead, does as described above.
 * Then, even when the job queue is empty again after running runJobs.php, it still doesn't "catch up" and show the correct categories, even when the page is purged. At that stage, I need to recreate the _pageData table, or (unrealistically) remove the category definitions, save the page, add them back to the page, and save the page again.
 * I don't think Replace Text is relevant except that it fills up the job queue, which in turn stops Cargo from working properly. Jonathan3 (talk) 06:54, 3 May 2021 (UTC)

I've had a think about this. The following code in CargoHooks.php, function addOrRemoveCategoryData, called by the CategoryAfterPageAdded or CategoryAfterPageRemoved hooks, is trying to get the page's existing categories from the _pageData table:

When the job queue is not running, it works fine, and returns an array of the page's categories from the _pageData table. The next parts of the code either remove or add the new category to that array.

When the job queue is running, it returns an empty array even when the _pageData table contains the correct category information. This in turn leads to the problems described above: (1) when you add a category, it ends up being the only category in the Cargo table; and (2) when you remove a category, there end up being no categories in the Cargo table.
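The two symptoms follow directly from starting with an empty array. Here is a minimal sketch in Python (not Cargo's PHP code); the function name `add_or_remove_category` and its arguments are hypothetical stand-ins for what addOrRemoveCategoryData does with the array it reads back.

```python
def add_or_remove_category(existing, category, is_add):
    """Return the category list that would be written back to the table."""
    categories = list(existing)
    if is_add:
        if category not in categories:
            categories.append(category)
    else:
        categories = [c for c in categories if c != category]
    return categories


stored = ["A", "B", "C", "D"]

# Lookup succeeds: the stored list is the starting point.
print(add_or_remove_category(stored, "E", is_add=True))   # ['A', 'B', 'C', 'D', 'E']
print(add_or_remove_category(stored, "A", is_add=False))  # ['B', 'C', 'D']

# Lookup wrongly returns an empty array: every other category is lost.
print(add_or_remove_category([], "E", is_add=True))   # ['E']
print(add_or_remove_category([], "A", is_add=False))  # []
```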

Maybe I should add that I have  and run the job queue as a cron job and/or at the command line. The problem seems to appear when runJobs.php is running (not merely when there are jobs in the queue). Jonathan3 (talk) 22:11, 7 May 2021 (UTC)


 * I think maybe it's getting the wrong $rowID (i.e. one which doesn't yet exist in the _pageData table). Jonathan3 (talk) 22:40, 7 May 2021 (UTC)


 * OK here is my last guess for now. I'm filling the job queue by making a minor edit to a Cargo template. Maybe while the job queue is running, the _ID in _pageData for the pages (or at least the page in question) changes... when the code above wrongly gets an empty array the _ID has increased by 2... this'll be why when the code above looks up the _ID there is nothing in _pageData__categories... if you look in the database, the old orphaned _rowID rows remain in _pageData__categories. Maybe the answer is to work on a replica database for both queries, or turn them into a single query using a join, then when updating use the real database having checked for any new row ID. Or don't let the _ID in _pageData for the pages ever change. Jonathan3 (talk) 22:59, 7 May 2021 (UTC)


 * Having slept on it, it's clear that it's getting the correct (new) row id for the page data table but it's not matching up with the same row id in the categories full table (which still has the old row id).


 * There's a function that stores page data but ignores categories. I think it deletes the page data table row before recreating it. Maybe this is when the row id changes? Maybe instead of ignoring categories it should at least ensure the categories full row id matches the page data row id? Jonathan3 (talk) 07:41, 8 May 2021 (UTC)

Maybe the _rowID of _pageData___categories should be the page's ID (which stays the same even when a page is moved) instead of being _pageData's _ID field (which seems to change sometimes). Otherwise you'd need to keep them in sync somehow in the Cargo code. Jonathan3 (talk) 15:11, 8 May 2021 (UTC)

Stream of consciousness collapsed......................................................................


 * There's a lot to go over here. Could it be that the main _pageData storage code (called by the PageSaveComplete hook) is being called after the category info storage code (called by the two hooks you listed before)? If so, that would definitely explain what you're seeing - and if that's the case, I can't think of any way around it at the moment. Yaron Koren (talk) 18:03, 12 May 2021 (UTC)


 * More stream of consciousness...


 * Might it be the other way round? Basically addOrRemoveCategoryData checks a page's _pageData._ID row and associated _pageData__categories._rowID rows - but sometimes (caused by the job queue running) by the time it does that the page's _pageData._ID has changed and the _pageData__categories._rowID has not. So it gets the correct, new _pageData._ID but can't use it to obtain anything from _pageData__categories because _ID is no longer equal to _rowID. It then bases the add/remove on an empty array (or once, I think _ID coincided with an orphaned _rowID with category rows, so instead of an empty array it was just completely wrong).


 * Possible alternative solutions might be:
 * Use _pageID as the link between _pageData and the _pageData__categories rows (e.g. either set _ID=_pageID and _rowID=_pageID, or just set _rowID=_pageID and change the code) as _pageID (I think) rarely/never changes.
 * When onPageSaveComplete deletes the _pageData row it should either:
 * keep a record of the _ID/_rowID and send it to storeValuesForPage so that if _ID changes so does _rowID and the _pageData__categories rows therefore continue to be linked with the newly-created _pageData row.
 * delete the _pageData__categories rows too, and storeValuesForPage/storeAllData should recreate those rows from the MW database.


 * Jonathan3 (talk) 21:47, 12 May 2021 (UTC)
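The hypothesis in this thread (the main row is deleted and recreated with a new _ID while the category rows keep the old _rowID) can be modelled with a toy example. This is a sketch in Python under that assumption, not Cargo's actual storage code; `PageDataModel`, `key_by_page_id` and the page ID 42 are made up for illustration. The flag switches the link between the two tables from the row's _ID to the page ID, which is the first alternative listed above.

```python
import itertools


class PageDataModel:
    """Toy model of a main table plus its category helper table."""

    def __init__(self, key_by_page_id=False):
        self.key_by_page_id = key_by_page_id
        self._ids = itertools.count(1)
        self.page_data = {}    # page ID -> _ID of the page's main row
        self.categories = []   # (link key, category) pairs, like _rowID rows

    def store_page(self, page_id):
        # Delete and recreate the main row; the helper rows are not touched.
        self.page_data[page_id] = next(self._ids)

    def _link_key(self, page_id):
        return page_id if self.key_by_page_id else self.page_data[page_id]

    def set_categories(self, page_id, categories):
        key = self._link_key(page_id)
        self.categories = [(k, c) for k, c in self.categories if k != key]
        self.categories.extend((key, c) for c in categories)

    def get_categories(self, page_id):
        key = self._link_key(page_id)
        return [c for k, c in self.categories if k == key]


def categories_after_resave(model):
    model.store_page(page_id=42)
    model.set_categories(42, ["A", "B", "C", "D"])
    model.store_page(page_id=42)  # re-save: the main row gets a new _ID
    return model.get_categories(42)


print(categories_after_resave(PageDataModel()))                     # []
print(categories_after_resave(PageDataModel(key_by_page_id=True)))  # ['A', 'B', 'C', 'D']
```

In the first run the old category rows are still present in `categories`, but nothing points at them any more, which matches the orphaned rows described above.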


 * That seems implausible to me, though maybe I haven't thought about it enough. I still like my theory - do you think it could be right? Yaron Koren (talk) 02:13, 13 May 2021 (UTC)


 * I'm not sure. Which is the implausible part? I don't know what in the job queue changes the _ID (could be something odd about my wiki) but I did some debugging and the logs show the other stuff happening (changed _ID and unchanged _rowID wrongly getting empty array). I gave up when I couldn't fix it so wasn't 100% thorough though. Jonathan3 (talk) 06:34, 13 May 2021 (UTC)

Duplicate rows again
Did we ever get to the bottom of what causes duplicate rows?

I've just run the following query:

It showed four pages which had two rows each. Doing a null edit (edit and save with no change) gets rid of a duplicate row. Purging the page does not.

It would be good if Special:CargoTables would identify whether a table has duplicate rows.

Alternatively, can you suggest an improved Cargo query that identifies duplicate rows across various tables on a wiki? Maybe using Scribunto (which I downloaded this week...)

Maybe I should do it with plain PHP and get it to email me when it finds duplicates.

Or maybe the answer is to run the various recreate table scripts nightly?

I'm using MW1.34 and Cargo code from about a week ago.

Thanks. Jonathan3 (talk) 21:52, 4 May 2021 (UTC)

I've put that query (modified to make it look nicer) into a template, and notice that in each table, the pages with duplicate rows are mainly pages I've not even looked at for ages, and also that there are never more than two identical rows. Jonathan3 (talk) 22:00, 4 May 2021 (UTC)

To contradict something above, a file I uploaded tonight appears on the list now... every time I edit the File: page the count (of rows) increases by one... Jonathan3 (talk) 22:51, 6 May 2021 (UTC)

Do you think maybe the duplicate rows thing could be linked with the recategorisation problem? E.g. a List's __fieldname rows get orphaned when their _rowID no longer is the _ID of its main table, so a proposed new main table row has an incorrect fieldname__full field which means that the "is this a duplicate row?" check wrongly thinks it's different. I've recreated all my tables recently so can't check... Jonathan3 (talk) 20:29, 8 May 2021 (UTC)
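For the "do it with a script" idea, the core check is small once the rows of a table have been fetched. The sketch below is in Python rather than PHP and assumes the rows arrive as a list of dictionaries; `find_duplicate_rows` is a hypothetical helper, and the example field `Name` is made up. Rows count as duplicates when every field except the row ID matches.

```python
from collections import Counter


def find_duplicate_rows(rows, ignore=("_ID",)):
    """Return (row contents, count) pairs for rows stored more than once."""
    counts = Counter()
    for row in rows:
        # Compare rows on everything except the ignored fields.
        key = tuple(sorted((k, str(v)) for k, v in row.items() if k not in ignore))
        counts[key] += 1
    return [(dict(key), n) for key, n in counts.items() if n > 1]


rows = [
    {"_ID": 1, "_pageName": "Page A", "Name": "Alpha"},
    {"_ID": 2, "_pageName": "Page B", "Name": "Beta"},
    {"_ID": 3, "_pageName": "Page A", "Name": "Alpha"},  # same data, new _ID
]

for contents, count in find_duplicate_rows(rows):
    print(contents["_pageName"], count)
```

A nightly job could run this per table and send a notification only when the returned list is non-empty.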

Disable storing on User namespace
Hello again!

So, as the header suggests, I'm trying to find a way to disable  and   on the User namespace. Is that possible? Currently I have some very important templates with a lot of data, and it would be very awkward to query the pages and randomly find rows with junk data.

I think it's possible to disallow certain templates to be rendered with AbuseFilter, but I'm not sure if there's another way or if that will work for most of the time. Unfortunately, I can't guarantee that all users will respect the rules by just throwing an alert.

Thanks! Lakelimbo (talk) 20:02, 5 May 2021 (UTC)


 * I don't think there's a way to automatically disable #cargo_store for certain namespaces... I suppose you could use #if and the variable to only call #cargo_store for certain namespaces. Or you could check the namespace in the query, like "where=_pageNamespace != 2". Yaron Koren (talk) 20:15, 5 May 2021 (UTC)
 * I see. Could you take this as a suggestion, then? Would be awesome if we could just simply turn off any storage of data on specific namespaces. Thank you! Lakelimbo (talk) 01:54, 6 May 2021 (UTC)


 * Why are people putting these templates in the User namespace (presumably, in their own user page, or a subpage of it) if they're not supposed to? Is it for testing purposes, or do they just not know? My concern with shutting off certain namespaces entirely is that the usefulness of it might be limited - since there could be certain templates that you do want people to put in user pages, or in other namespaces. (Actually, there's probably more of an argument for being able to shut off storage in other namespaces, like "Talk", where there's no real reason to store data.) Yaron Koren (talk) 17:06, 6 May 2021 (UTC)
 * I don't know why, but it does happen. However, I imagined it being possible to disable within the  itself, not as a "global parameter" (something like  ). But I do understand why this may raise concerns. In any case, thanks again. Lakelimbo (talk) 12:47, 7 May 2021 (UTC)
 * That's an idea also. Actually, the more I think about it, the more I think a global variable like $wgCargoDisabledNamespaces (or $wgCargoEnabledNamespaces) could make sense, precisely for all those non-content namespaces. Or would only a parameter like "ignore namespace" work for this case? Yaron Koren (talk) 18:15, 7 May 2021 (UTC)
 * In my case specifically, I think just "ignore namespace" would work, but making something like  would be very useful I suppose, especially for wikis that heavily rely on Cargo. And sorry for not responding yesterday, I was a bit busy. Lakelimbo (talk) 20:41, 9 May 2021 (UTC)


 * Hey Lakelimbo if they're putting a template there and not a direct call to, inside of the template you can do either   or even more generally  . I do this on my wiki for every single template with a cargo store, with a list of explicitly allowed namespaces (in Lua not wikicode but same idea). This allows sandboxing without any problematic stores, and you can still have some other stores in the User namespace if you needed. Though if they're writing   directly it would still be a problem. But, AbuseFilter can also prohibit saving completely, instead of just a warning, if you wanted to try that. --RheingoldRiver (talk) 02:56, 31 May 2021 (UTC)
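The two approaches discussed in this section (only storing from allowed namespaces, or filtering by namespace at query time) come down to the same check. The following is a rough sketch of that logic in Python, not wikitext or Lua; `should_store`, `filter_rows` and the allow-list contents are hypothetical. Namespace 2 is the User namespace, as in the "_pageNamespace != 2" filter above, and 0 is the main namespace.

```python
MAIN_NAMESPACE = 0
USER_NAMESPACE = 2

# Hypothetical allow-list: only pages in the main namespace store data.
ALLOWED_NAMESPACES = {MAIN_NAMESPACE}


def should_store(namespace_number, allowed=ALLOWED_NAMESPACES):
    """Store-time check: only store when the page's namespace is allowed."""
    return namespace_number in allowed


def filter_rows(rows, excluded=(USER_NAMESPACE,)):
    """Query-time check: drop rows that came from excluded namespaces."""
    return [row for row in rows if row["_pageNamespace"] not in excluded]


rows = [
    {"_pageName": "Real item", "_pageNamespace": 0},
    {"_pageName": "User:Someone/sandbox", "_pageNamespace": 2},
]

print(should_store(USER_NAMESPACE))                   # False
print([r["_pageName"] for r in filter_rows(rows)])    # ['Real item']
```

An allow-list means a newly added namespace stores nothing until someone opts it in, whereas the query-time filter leaves the junk rows in the table and has to be repeated in every query.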

[SOLVED] Date display
I'm wrestling with date displays in infoboxes using Cargo dates. The Page Values page displays the date as I intend (January 1, 1883) but the infobox insists on displaying 1883-1-1. When I try to format the date within the infobox (using MW date formatting), I can get full dates to work, but partial dates (for instance only the year) insert TODAY'S date info in the missing sub-fields. That's no good. Is there a date formatting procedure that I'm missing? Here's an example: http://wiki.martenet.com/index.php/Pratt_Library_Branch_4 You can see the page values link displays the full month and year (I have the $wgAmericanDates set to "true"). But the infobox is listening to something else, apparently. --Parma100 (talk) 00:37, 12 May 2021 (UTC)


 * You could check the __precision field for the date: Extension:Cargo/Storing data.


 * Having said that, here's what I did within a date-displaying template:




 * Jonathan3 (talk) 06:56, 12 May 2021 (UTC)
 * Thanks for that. I think it would be more precise to depend on the __precision field in this date template. However, I can't seem to find how to retrieve that field within a query template. I see on a 2015 thread where Yaron says "I just added in (re-added, really) the ability to display date precision fields in the query. Is that by itself enough to get this working, though? Or would you also need the ability to call "CASE", or something like it, in the "fields=" parameter?", but I don't see how to incorporate that. The internal date translations work well enough for me--I just can't find how to grab the __precision field to make the magic happen. What am I missing? --Parma100 (talk) 14:14, 13 May 2021 (UTC)


 * If your date field is called Startdate you'd use Startdate__precision, I think. Jonathan3 (talk) 14:20, 13 May 2021 (UTC)
 * That's what I figured. I seem to be getting null values, and that makes me think the fetch from the database isn't occurring. Those "hidden" fields (coord lat & long, this precision value, etc) aren't mentioned in the actual fields listing, and I'm not able to get any values, so I'm wondering if they're not being fetched. I've checked the actual MySql db, and the values are there, so storage is working. I can't figure out the retrieval. --Parma100 (talk) 14:31, 13 May 2021 (UTC)
 * It works now, with no processing at all, although I don't know why. It has something to do with being edited in the form. If I modify the (correct) date in the form, then save and go back to the page, the correct format appears -- month and year only, if no day is specified, and year only, if that is what is specified. I'm not going to argue with success, but I don't know why that edit vehicle works while the original form input does not. And this is without any of the processing you kindly shared above. Go figure.--Parma100 (talk) 20:16, 13 May 2021 (UTC)
 * Maybe an upgrade in Cargo since you last saved the pages? I've just done a quick query on a table with date fields and the dates all show properly, according to their precision... I had expected the missing parts to be filled with 1s (e.g. 2000 to show as something like 2000-01-01, May 2000 to be 2000-05-01). What happens if you just do a null edit on the page, or edit something unrelated? Maybe if you recreate the table it'll sort it out for every page. Jonathan3 (talk) 21:35, 13 May 2021 (UTC)
 * Actually I think it is due to the evolving nature of my coding here. Early on I used Cargo, but not Page Forms (because I didn't understand what Page Forms might be doing behind the scenes) and just used the editor for infoboxes inside the VisualEditor. That, of course will allow you to change the text, and Cargo picked up those changes. All good. But I don't think the secondary fields like __precision were getting updated as well. When I checked the MySql table, I saw lots of NULLS, but not all, in the __precision fields. I (incorrectly, now, I think) supposed that everything was getting saved properly. Later, I incorporated the Page Forms editor, abandoning the VisualEditor editing for infoboxes. Yesterday I stumbled on the fact that editing one of the "wrong" entries in Page Forms, and then saving it, automatically corrected the display. I went through and changed all of the incorrect displayed entries, and now they're all good. So I think the silver bullet is editing them in Page Forms, not using VisualEditor. This Cargo package is a well-thought-out suite of programs, and got me close to the finish line with a minimum of effort. It deserves high praise. The biggest problem may be the "nut behind the wheel!" --Parma100 (talk) 13:26, 14 May 2021 (UTC)
 * It's great that you got it working! Although it should work the same whether the template calls are being edited with Page Forms or VisualEditor (or by hand); so I don't know what the problem was here. Yaron Koren (talk) 18:09, 19 May 2021 (UTC)
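For anyone who does need to format dates by precision themselves, the logic is a simple switch on the precision value, so that missing parts are never filled in with today's date. This is an illustrative sketch in Python, not template code. The numeric precision codes below are an assumption and should be checked against the values actually stored in your own __precision column; `format_date` is a hypothetical helper that produces the American-style format mentioned above.

```python
MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)

# Assumed precision codes; verify them against your own table.
DATE_AND_TIME, DATE_ONLY, MONTH_ONLY, YEAR_ONLY = 0, 1, 2, 3


def format_date(year, month=1, day=1, precision=DATE_ONLY):
    """Show only the parts of the date that the precision says are known."""
    if precision == YEAR_ONLY:
        return str(year)
    if precision == MONTH_ONLY:
        return f"{MONTHS[month - 1]} {year}"
    # DATE_ONLY and DATE_AND_TIME both show the full date here (time is ignored).
    return f"{MONTHS[month - 1]} {day}, {year}"


print(format_date(1883, 1, 1))                        # January 1, 1883
print(format_date(1883, 1, 1, precision=MONTH_ONLY))  # January 1883
print(format_date(1883, 1, 1, precision=YEAR_ONLY))   # 1883
```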

[SOLVED] Compound, compound queries
Is it possible to get a compound, compound query without resorting to SQL? I'd like to fetch records which satisfy Criteria A AND either Criteria B OR Criteria C. I thought I could wrap the B or C code in parentheses, as below, but that doesn't fly. The parentheses trigger an error. What is the method of combining these tests? --Parma100 (talk) 19:08, 18 May 2021 (UTC)


 * This probably doesn't help you but I've used the above to create materially the same query with a table of mine and it worked all right (I changed the ='{{PAGENAME parts to <> just to get a result on a test page). Maybe something else is causing the error. It's not really a compound query without the second tables= line but I guess that's next. Jonathan3 (talk) 23:05, 18 May 2021 (UTC)


 * Thanks for the hint. Apparently more "nut behind the wheel" stuff. I'd labeled one of my fields incorrectly. The code does work with the parentheses. --Parma100 (talk) 16:33, 19 May 2021 (UTC)
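Why the parentheses matter: AND binds more tightly than OR, so "A AND B OR C" is read as "(A AND B) OR C". The sketch below shows the difference in Python, whose `and`/`or` follow the same precedence; the records and field names are invented for illustration and are not from the wiki in question.

```python
records = [
    {"name": "r1", "type": "Survey", "surveyor": "Smith", "client": "Jones"},
    {"name": "r2", "type": "Survey", "surveyor": "Brown", "client": "Smith"},
    {"name": "r3", "type": "Deed", "surveyor": "Brown", "client": "Smith"},
    {"name": "r4", "type": "Survey", "surveyor": "Brown", "client": "Jones"},
]


def grouped(r):
    # A AND (B OR C)
    return r["type"] == "Survey" and (
        r["surveyor"] == "Smith" or r["client"] == "Smith"
    )


def ungrouped(r):
    # A AND B OR C, which is evaluated as (A AND B) OR C
    return r["type"] == "Survey" and r["surveyor"] == "Smith" or r["client"] == "Smith"


print([r["name"] for r in records if grouped(r)])    # ['r1', 'r2']
print([r["name"] for r in records if ungrouped(r)])  # ['r1', 'r2', 'r3']
```

Without the parentheses, r3 slips through even though it fails criterion A.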

Maintenance script to switch in Cargo table
To work round the duplicates problem, I'm going to recreate all the Cargo tables each night.

It would be good if I could use the  parameter then, once the existing maintenance scripts have run, use another script to switch in the newly created replacement tables.

In the meantime I'll just choose an "Ungodly hour" to recreate the tables and hope nobody is using the website then :-) Jonathan3 (talk) 22:10, 24 May 2021 (UTC)

Annoyingly, after not using the  parameter, for the first time there are duplicates pretty much straight after running the scripts :-( Usually I use that parameter and the recreation scripts clear the duplicates. Jonathan3 (talk) 22:48, 24 May 2021 (UTC)


 * ...or how about a new parameter  which means the script creates a replacement table and at the end switches it in? It would mean that the Cargo tables are only incomplete for a split second and save the bother of using the web interface and all those clicks to switch the tables in :-) Jonathan3 (talk) 17:58, 25 May 2021 (UTC) ... and might avoid the duplicates problem - switching in a replacement table seems to avoid the problem - maybe because the job queue doesn't interfere with it? Jonathan3 (talk) 09:46, 30 May 2021 (UTC)
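The "build a replacement, then switch it in at the end" idea can be sketched independently of Cargo. The following is a toy model in Python, not a maintenance script; `recreate_with_swap` and the table contents are hypothetical. The point is only that readers keep seeing the old table during the slow rebuild, and the switch itself is a single step.

```python
def recreate_with_swap(tables, name, build_rows):
    """Build the replacement fully, then switch it in with one assignment."""
    replacement = list(build_rows())  # slow part; tables[name] is untouched
    old = tables.get(name)
    tables[name] = replacement        # the switch-in
    return old


tables = {"Books": [{"_pageName": "Old page"}]}
seen_during_build = []


def build():
    # While the rebuild is in progress, queries still see the old table.
    seen_during_build.append(list(tables["Books"]))
    yield {"_pageName": "Page A"}
    yield {"_pageName": "Page B"}


old = recreate_with_swap(tables, "Books", build)
print(seen_during_build[0])  # [{'_pageName': 'Old page'}]
print(tables["Books"])       # [{'_pageName': 'Page A'}, {'_pageName': 'Page B'}]
```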

_pageData and Extension:UserMerge
When UserMerge deletes a User: page, the _pageData details aren't updated fully (unsurprisingly I suppose). I did a query out of curiosity and the deleted page appeared at the top of the list, probably because it is still in the Cargo table but doesn't have a date field any more.

Thanks. Jonathan3 (talk) 20:22, 25 May 2021 (UTC)


 * P.S. This resolves itself when the _pageData table is recreated. Jonathan3 (talk) 09:01, 28 May 2021 (UTC)

[SOLVED] Cargo and Extension:Popups
I'd like to use Extension:Popups, but most of my pages these days are formed from Cargo template queries, and Extension:TextExtracts, which gets the page extract for the popup, returns "..." (nothing plus a standard "..."). It would be great if somehow the popup text extract could be based on a Cargo field - so when it extracts text from a page, it checks whether there's a Cargo template, and if so returns the contents of the predetermined field for that template. I think this may be a relevant link: Extension:Popups. Is this something you'd consider adding? Thanks. Jonathan3 (talk) 08:59, 28 May 2021 (UTC)


 * It works fine for me now after I got rid of "div" within "ExtractsRemoveClasses" in Extension:TextExtracts's extension.json file. Jonathan3 (talk) 15:27, 1 June 2021 (UTC)

[SOLVED] Cargo and Scribunto/Lua
For my first Lua script I'd like to display the contents of a Cargo table in the following format (where Name and About are fields):

Name
 * About
 * About

Name
 * About

Name
 * About
 * About
 * About

I imagine something similar has been done before so wonder if anyone can point me to anything. Thanks in anticipation :-) Jonathan3 (talk) 06:33, 29 May 2021 (UTC)

I've made a start by modifying the example on Extension:Cargo/Other_features but it seems only to be returning the first row:

It prints "1..." but I'd have expected "1... 2... 3... 4... " etc. Where am I going wrong? Thanks. Jonathan3 (talk) 07:10, 29 May 2021 (UTC)

I don't usually ping people but User:RheingoldRiver I heard you on Yaron's podcast and wonder whether you would be willing to share your expertise here :-) Jonathan3 (talk) 07:53, 29 May 2021 (UTC)


 * Got the following code to work:




 * I changed it to make it more generic so hope no errors crept in. I'd be interested to hear whether I could have done things better. Thanks. Jonathan3 (talk) 22:43, 29 May 2021 (UTC)


 * Glad you got it working. Check out, normally you'll do something like  .   iterates through an array in order, while   iterates through the elements of a table in arbitrary order. (ipairs is "indexed pairs"). If you're not using the   variable, a common convention is to use   instead, so.


 * Though I would recommend, which is what I do on Leaguepedia and as a result is pretty standard on Gamepedia, so if you look for code examples there, you'll see this convention used. (Also a direct benefit is that   and   look more different than   and   so it's harder to misread on accident haha) --RheingoldRiver (talk) 02:48, 31 May 2021 (UTC)


 * Thank you very much. I'll look into this. Jonathan3 (talk) 21:45, 31 May 2021 (UTC)
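The grouping asked about at the top of this section (each Name once, followed by its About values as bullets) is shown here as a sketch in Python rather than Lua, so it illustrates the logic only and is not the working module from the thread. It assumes the query result arrives as an ordered list of rows with `Name` and `About` fields; `render_grouped` is a hypothetical helper.

```python
def render_grouped(rows):
    """Print each Name once, followed by its About values as bullet lines."""
    grouped = {}
    for row in rows:  # rows are visited in order, like an indexed loop
        grouped.setdefault(row["Name"], []).append(row["About"])

    lines = []
    for name, abouts in grouped.items():  # dicts keep insertion order
        lines.append(name)
        lines.extend(f" * {about}" for about in abouts)
        lines.append("")
    return "\n".join(lines).rstrip()


rows = [
    {"Name": "First", "About": "one"},
    {"Name": "First", "About": "two"},
    {"Name": "Second", "About": "three"},
]

print(render_grouped(rows))
```

Note that iterating over every row, rather than returning from inside the loop, is what avoids the "only the first row is printed" symptom described earlier in the section.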

Merge similar cells and "striping" of rows
Each separate column is alternately grey or white in its background. This is all right for normal tables, but when "merge similar cells" is used it can create a patchwork effect. Would it be possible for the background colour to reset as soon as there is a full unmerged row? Thanks. Jonathan3 (talk) 22:08, 29 May 2021 (UTC)

Leaflet clustering
Is there some trick for getting leaflet to cluster markers? Adding |cluster=yes or |cluster=on does not seem to have any effect, although the identical code works for googlemaps. For instance does not cluster any markers. If I change the format to googlemaps it works. Do I need to add a cluster extension to leaflet? I see on the maps page that clustering is available. Just not sure how, within a cargo context, to get it to work. --Parma100 (talk) 01:26, 4 June 2021 (UTC)


 * Yes, clustering only works for the "googlemaps" format. Does the Maps extension support clustering for Leaflet? If so, that must involve some code that would need to be added to Cargo. Yaron Koren (talk) 17:39, 4 June 2021 (UTC)


 * It seems so. See https://maps.extension.wiki/wiki/Leaflet_SMW_queries. There are a number of clustering parameters specified.--Parma100 (talk) 18:26, 4 June 2021 (UTC)