Extension talk:External Data

accessing the internal variable (__json)
I need some help. Background: I'm working on an extension that uses a web call to retrieve a JSON object and converts it into concise HTML code. The extension now works properly but lacks caching. The JSON object is complex, containing multiple nested arrays with varying numbers of elements, and it took me a while to program it in PHP.

My current line of thought is to transfer the entire JSON object (which is stored in the internal variable __json) to my extension.

Question 1) How does one access the internal variables? The code below doesn't work.

Question 2) Do you have any comments or suggestions on my approach?


 * first bullet:
 * second bullet:

Harro Kremer (talk) 21:20, 3 January 2023 (UTC)
 * Since __json is not a string but a Lua table, it is only accessible from Lua (Scribunto is required) and not from wikitext: Alexander Mashin (talk) 03:46, 9 January 2023 (UTC)

Using External Data in a Template
I want to fill a custom infobox template with data from External Data. The data will reside in a CSV file whose name matches the non-namespace part of the importing page.

However, this tries to fetch data from a CSV file named "myinfobox" instead. What is a better way to achieve this?


 * Is the template called "infobox myinfobox", or just "myinfobox"? And do you see the problem on the "mypage" page, or right on the template page? Yaron Koren (talk) 00:34, 10 January 2023 (UTC)
 * My template page is called "Vorlage:Infobox_VM". The problem occurs on "mypage", as the template page doesn't really have any visible content, nor does it have the CSV providing the information. "mypage" tries to read from a CSV called Vorlage:Infobox_VM.csv instead of mypage.csv.

Alexander Mashin (talk) 02:11, 10 January 2023 (UTC)
 * This looks like a strange way to invoke a template: . Why not just ? And, as said above, fetching data from  is what is to be expected on the template page itself, unless the code is wrapped with .


 * Yes, you're right; I tried being verbose for clarity. On my page I have only  and that seems to work well enough. I'm not really concerned about which CSV is being read on the template page itself, as that page isn't really supposed to be looked at. I want "mypage" to look at mypage.csv through the template; that's what doesn't work.


 * What are the settings for the data source  in   ?
 * That path and all the files in it are world-readable. That stuff works in principle; I am able to pull info from the CSV, just not via the page name in the template.

Alexander Mashin (talk) 10:59, 14 January 2023 (UTC)
 * I could not reproduce the issue on my MediaWiki installation. It looks like you have some sophisticated wiki code there that substituted  too early.
 * I solved the problem this way: on "mypage" I call  and on the template page I use that variable to find the right file. 24.134.95.253 15:51, 17 January 2023 (UTC)

Slightly different csv data files, only one works
I have two 48-line CSV files, each under 10 kB, with only a minor difference between them, but only one can be read by External Data in 1.39.

Simplified case https://johnbray.org.uk/expounder/Extdataproblem1 uses

get_web_data: url=https://files.johnbray.org.uk/Documents/Expounder/Q/532/9928/datagood and then get_web_data: url=https://files.johnbray.org.uk/Documents/Expounder/Q/532/9928/databad

checking that  has something from a line of the CSV file. datagood works, but databad does not. The difference between the files is a few characters on one line, and the good file is actually longer than the bad one.

< "item",+1996-04-05T00:00:00Z,+1996-04-08T00:00:00Z,"","in Heathrow wit h, , , , , ,","","","","","","",51.4673,-0.4529

> "item",+1996-04-05T00:00:00Z,+1996-04-08T00:00:00Z,"","in Heathrow wit h, , , , AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBA,","","","","","","",51.4673,-0.4529

Both files are < 10k in size, 48 lines, and the good file is actually longer than the bad

The output of wc datagood databad:

 48  338 9245 datagood
 48  341 9180 databad

Vicarage (talk) 14:05, 13 January 2023 (UTC)
Alexander Mashin (talk) 10:45, 14 January 2023 (UTC)
 * Neither of the CSV files is formed perfectly from PHP's point of view. The problems start at line 7 (one-based). For , this causes the automatically recognised delimiter to be  rather than . You can overcome this by adding .
 * Thanks for the quick response. That cured the problem, both for my trivial and full cases. Vicarage (talk) 11:00, 16 January 2023 (UTC)
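The fix can be illustrated with Python's csv module (a sketch, not the extension's actual PHP code): once the delimiter is passed explicitly instead of being auto-detected, a quoted field that is itself full of commas and spaces parses as a single value, just as in the problematic line above.

```python
import csv
import io

# A line similar to the problematic one above: one quoted field contains
# commas and spaces, which can skew automatic delimiter detection.
# Passing delimiter=',' explicitly removes the ambiguity.
line = '"item",+1996-04-05T00:00:00Z,"in Heathrow wit h, , , ,",51.4673,-0.4529\n'

rows = list(csv.reader(io.StringIO(line), delimiter=','))
print(rows[0])
# → ['item', '+1996-04-05T00:00:00Z', 'in Heathrow wit h, , , ,', '51.4673', '-0.4529']
```

The quoted field survives intact as one value, rather than being split at its internal commas.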

What is the best practice to fetch many values (>300) from the same place?
Is it better to use the legacy method, like #get_web_data, and fetch all the values at once, then display them using #external_value; or is it better to use the new method, #external_value with the source parameter, 300 times? What performance considerations might there be?

Jeremi Plazas (talk) 17:51, 17 January 2023 (UTC)
Alexander Mashin (talk) 03:22, 18 January 2023 (UTC)
 * If you use caching, the difference is not that big. Using  will save cache lookups, but the legacy mode will stop working once MediaWiki is upgraded to use Parsoid, since Parsoid does not guarantee parsing order. The optimal solution is to handle data fetching and display within one Lua function, where you can save the fetched data into a variable and display it later.
 * Thanks, we'll look into Lua. We do have caching set up, so the standalone method might be fine, now that you've helped us iron out the kinks. Thanks again for the help! Jeremi Plazas (talk) 17:54, 19 January 2023 (UTC)
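The "fetch once, display many times" idea behind the Lua suggestion can be sketched in plain Python (the names here are invented for illustration; the real solution would be a Scribunto module): the data source is retrieved on the first lookup, and all subsequent value lookups are served from a variable.

```python
# Sketch of the trade-off discussed above: one fetch serves all 300 lookups.
# fetch_data is a stand-in for whatever actually retrieves the external file.

def fetch_data():
    # Pretend this is an expensive network call; count how often it runs.
    fetch_data.calls += 1
    return {"name": "Widget", "price": "9.99", "stock": "42"}
fetch_data.calls = 0

_cache = {}

def external_value(key):
    """Return one value, fetching the whole data set only on first use."""
    if "data" not in _cache:
        _cache["data"] = fetch_data()
    return _cache["data"].get(key)

values = [external_value(k) for k in ["name", "price", "stock"]]
print(values, fetch_data.calls)  # three values, but only one fetch
```

With 300 independent #external_value calls the fetch (or at least a cache lookup) would happen per value; holding the data in one variable amortises it to a single retrieval.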

Some JSONPath queries don't retrieve results
I have a JSON file with information about network interfaces in it. I am retrieving this information with #get_file_data and two JSONPath instructions. However, only one of them seems to be executed/filled with data.

This is my JSON: and this is the template code I use to retrieve the data. I call this template in the following fashion: where netnames is just an array, usually with only one entry like , and filename points to the correct file.

You could try a regular expression: . Also, you can get the bulleted list without a template: . Alexander Mashin (talk) 06:23, 19 January 2023 (UTC)
 * If  is an array, I don't know how transcluding it within single quotes in a JsonPath query could work. Neither , nor   is a working JsonPath.   would be. I would suggest replacing   with  , but this does not seem to be implemented.
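The regular-expression fallback suggested above can be sketched in Python. The "name" key and the sample interface data are assumptions for illustration (the original JSON is not shown); the pattern would need adjusting to the real field name.

```python
import re

# Hypothetical JSON text with network-interface entries; instead of a
# JSONPath query, pull every "name" value straight out of the raw text.
raw = ('{"interfaces": ['
       '{"name": "eth0", "ip": "10.0.0.1"}, '
       '{"name": "eth1", "ip": "10.0.0.2"}]}')

names = re.findall(r'"name"\s*:\s*"([^"]*)"', raw)
print(names)  # → ['eth0', 'eth1']
```

This sidesteps JSONPath entirely, at the cost of being fragile if the key appears at other nesting levels.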

Strange bug parsing CSV with pipe character
I have a field called  which is called in . All of the values have at least one pipe character, and the page cuts off everything before and including the first pipe character. The strange thing is that it only happens if  is located in a specific place.

The page can be seen at https://comprehensibleinputwiki.org/wiki/Mandarin_Chinese/Videos and the external data is at https://comprehensibleinputwiki.org/wiki/Data:Mandarin_Chinese/Videos. If you look at the wiki source of the first link, I have  twice, one of them in a hidden div. If you view the source of the page, the first one is missing part of the value, while the hidden one is complete. Dimpizzy (talk) 20:08, 22 January 2023 (UTC)
Alexander Mashin (talk) 02:01, 23 January 2023 (UTC)
UPD: Or, you can add a second  before . Alexander Mashin (talk) 07:21, 23 January 2023 (UTC)
 * Add  to.
 * It didn't seem to change anything. I changed it to:
 * Dimpizzy (talk) 03:00, 23 January 2023 (UTC)
 * At least, the videos are displayed now. If the current problem is the trimmed titles in the "Title" column, it is not directly related to the extension. The beginning of the title is treated as attributes to the  tag by the MediaWiki parser. Wrap  with , like this: , to see the first chunk of the title.
 * That worked, thanks! I didn't notice any issues on my end with the videos not displaying before, but good to know! Dimpizzy (talk) 09:36, 23 January 2023 (UTC)
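The underlying hazard here is that "|" is MediaWiki's template and table argument separator, so a literal pipe inside a value swallows everything before it. One common way to protect such values before they reach wikitext is to escape the pipe as its HTML entity; the helper and sample title below are invented for illustration (the thread's actual fix is the elided wrapping shown above).

```python
# Sketch: escape literal pipes so MediaWiki won't treat them as
# template/table argument separators when the value is transcluded.

def escape_pipes(value: str) -> str:
    """Replace each literal '|' with the HTML entity &#124;."""
    return value.replace("|", "&#124;")

title = "Learn Chinese | Episode 3 | Beginner"
print(escape_pipes(title))
# → 'Learn Chinese &#124; Episode 3 &#124; Beginner'
```

The entity renders as a visible "|" but is inert to the template parser, so the full title survives.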

Parameter parsing problems
Hi. I use External Data to retrieve data from a PostgreSQL database. In most cases I use prepared statements, and I noticed that the passing of parameters seems not to work correctly. Here is a self-contained test to visualize what I mean.

In the database I have a table with a single column of type text:

SELECT * FROM public.test;

 txt
 a simple text
 another simple text
 a text with, a comma in it

Notice that the lines contain spaces and in one case a comma.

Then I have a search function that receives a parameter of type text and returns a set of text:

SETOF TEXT public.mw_test(p_search TEXT)

The configuration in LocalSettings.php for this looks like this:

$wgExternalDataSources['wikidoc'] = [
    'server' => 'xxx',
    'type' => 'postgres',
    'name' => 'xxx',
    'user' => 'xxx',
    'password' => 'xxx',
    'prepared' => [
        'test' => 'SELECT mw_test FROM public.mw_test($1);'
    ]
];

In the wiki page the snippet is as follows:

It simply displays what it finds on a line.

What happens is that the list of parameters cannot contain a comma. The snippet as is above works fine and returns:

a text with, a comma in it

But something like this does not:

The error is "Fehler: Es wurden keine Rückgabewerte festgelegt."

It is clear that a comma is used to separate parameters, and that is the reason why this does not work. My question is: how can I pass the whole string "with, a" as a single parameter?

I tried enclosing it in single and double quotes, but this did not help. It leads to this exception:

[6882cc0e426ebdb6cf6911bc] /w/index.php?title=IT/IT_Infrastructure/KOFDB_Uebersicht&action=submit TypeError: EDParserFunctions::formatErrorMessages: Argument #1 ($errors) must be of type array, null given, called in /home/wiki/application/w/extensions/ExternalData/includes/EDParserFunctions.php on line 98

Any idea what I could do to solve this? Help is very appreciated. Thanks

It looks like I found a way to solve this: I can enclose the whole string in round parentheses and it works.

Alexander Mashin (talk) 06:10, 11 February 2023 (UTC)
 * An interesting workaround; however, upgrade to be able to use double quotes.
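The failure mode and the parentheses workaround can be sketched in plain Python (this is an illustration, not the extension's actual parameter parser): a naive split on commas breaks "with, a" into two parameters, while a splitter that ignores commas inside parentheses, as the parameter list apparently allows, keeps the wrapped value whole.

```python
# Sketch: split a parameter list on commas, but not inside parentheses.
# Explains why  p_search=(with, a)  survives while  p_search=with, a  splits.

def split_params(s: str) -> list[str]:
    parts, depth, cur = [], 0, []
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(cur).strip())  # top-level comma: new param
            cur = []
        else:
            cur.append(ch)
    parts.append("".join(cur).strip())
    return parts

print(split_params("p_search=with, a"))    # → ['p_search=with', 'a']
print(split_params("p_search=(with, a)"))  # → ['p_search=(with, a)']
```

Supporting double-quoted values would be the cleaner fix, which is what the suggested upgrade provides.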

composer.json
When trying to install dependencies via Composer, the current requirement says  on the REL1_39 branch. The latest version is currently 2.5.4, which doesn't meet this requirement. Could this be updated to a more permissive requirement? Prod (talk) 18:29, 24 February 2023 (UTC)


 * This is being addressed in T330485. Prod (talk) 19:39, 13 March 2023 (UTC)