Extension talk:External Data

From mediawiki.org


Iterating over one variable while keeping another one in the same state

I retrieve data from JSON files for a certain host. Among that data are a list of MAC addresses (one or more items) and the host name (always exactly one item).

I then try to iterate over the list of MAC addresses to call a template that uses each MAC address to look up the IP addresses on that interface.

However, I need to supply the host name to the template too, so that it knows which JSON file to look the values up in. I can’t get this to work properly.

This is an annotated example of the calls I’ve tried:


 {{#for_external_table:|
  ### this produces an empty hostname, if macaddresses has more than one item
  {{Interface|hostname={{{hostname}}}|hwaddress={{{macaddresses}}}}}

  ### this also produces an empty hostname if macaddresses has more than one item, but will also fail if macaddresses has only one item, because the external value will be evaluated to an empty string on the second iteration (bug?)
  {{Interface|hostname={{#external_value:hostname}}|hwaddress={{{macaddresses}}}}}

  ### this also doesn’t work, because the external value will be empty the same moment that name is empty, because they’re both iterated over
  {{Interface|hostname={{{name|{{#external_value:name}}}}}|hwaddress={{{macaddresses|}}}}}
 }}
 

Is there a way to achieve this or will this have to be implemented first?

I could imagine something like {{{hostname.keep}}} to mark those variables that don’t get iterated, or just returning the same value over and over if the iteration variable isn’t an iterable.

That's a tricky one. You may have to use the Variables extension, to set some variable to the value of #external_value, and then use that variable within #for_external_table. Although maybe the best solution is to move all this code into a Lua module, using Scribunto - it will give you much more flexibility. Yaron Koren (talk) 12:59, 1 May 2024 (UTC)
How to deal with data columns of different heights might have been an interesting topic for discussion some time ago. Now, however, the obvious solution for this and some other issues is Lua. The Variables extension should no longer be used at all, because the parsing order is now uncertain.
Alexander Mashin talk 08:17, 13 May 2024 (UTC)
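
To make the "columns of different height" problem concrete, here is a minimal Python sketch. It is not External Data code: the host name, the MAC addresses and both helper functions are hypothetical, and it only models the behaviour reported in this thread (a one-value column runs out on the second iteration) next to the "keep the same value" behaviour the original poster asks for.

```python
from itertools import zip_longest

# Hypothetical data: one hostname, several MAC addresses (columns of unequal height).
columns = {
    "hostname": ["web01"],
    "macaddresses": ["aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02"],
}

def iterate_rows(cols):
    """Row-by-row iteration; shorter columns run out and yield ''."""
    names = list(cols)
    for values in zip_longest(*cols.values(), fillvalue=""):
        yield dict(zip(names, values))

def iterate_rows_keeping_scalars(cols):
    """Same iteration, but a one-value column repeats its value on every row."""
    height = max(len(v) for v in cols.values())
    widened = {k: v * height if len(v) == 1 else v for k, v in cols.items()}
    return iterate_rows(widened)

plain = list(iterate_rows(columns))                 # hostname is '' on row 2
kept = list(iterate_rows_keeping_scalars(columns))  # hostname repeated on row 2
```

In a Lua module the same idea applies: read the host name once, then loop over the MAC addresses and pass both values to the template on each pass.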

Using multiple JOIN ON parameters in ED

Hi, I am having trouble getting a query to work.

I have translated my problem to the public Rfam database (docs.rfam.org/en/latest/database.html) so you can reproduce the problem.

This is the query that I'm aiming for:


SELECT family.description,clan.description FROM clan
JOIN clan_membership on clan.clan_acc = clan_membership.clan_acc
JOIN family ON family.rfam_acc = clan_membership.rfam_acc;

It gives me 544 rows (I copied the Rfam DB a while ago, so there might be more rows now).

This is the get_db_data call that I have on a page:


 {{#get_db_data: db = Rfam
  |join on=clan_membership.clan_acc = clan.clan_acc
  |join on=family.rfam_acc = clan_membership.rfam_acc
  |from=clan
  |data=clan_description=clan.description,family_description=family.description
 }}
 

I have also tried this as the data line: |data=clan_description=clan.description AS cdesc,family_description=family.description AS fdesc. The AS makes it into the query (I checked), but it doesn’t help.

I then call this to list the results:


 {{#for_external_table:|
 ;{{{clan_description}}}
  : {{{family_description}}}
 }}
 

The list is empty.

The SQL that is being produced by ED looks like this:

SELECT clan.description,family.description FROM `clan`,family`
JOIN `clan_membership` ON ((family.rfam_acc = clan_membership.rfam_acc))

So this query has only one JOIN, plus the "((" and "))", which I have never seen before (I'm not an SQL pro though, so that might not mean anything). The query lists both clan and family in the FROM clause, and again I'm not sure whether it is right to state that explicitly in a join statement. The query returns two columns called "description" unless I add an alias for the columns in the |data= parameter, but that doesn’t make it work. Running this query on the DB directly gives 66430 rows (with a lot of duplicates), but limiting it to a few rows with LIMIT=25 doesn’t make it work either.

So, how do I make this work?

Edit: This is the current code I have, after discussing with Yaron Koren:


 {{#get_db_data: db = Rfam
  |join on=clan.clan_acc=clan_membership.clan_acc,join on=family.rfam_acc = clan_membership.rfam_acc
  |from=clan
  |data=clan_description=clan.description,family_description=family.description <!-- empty result -->
 }}
 
I think it would just need to be join on=clan_membership.clan_acc = clan.clan_acc, family.rfam_acc = clan_membership.rfam_acc. Yaron Koren (talk) 19:10, 29 April 2024 (UTC)
Alright, that gives me the correct query in the logs, but no results appear in for_external_table. 24.134.95.253 20:50, 29 April 2024 (UTC)
Some values appear when I remove the table names in the data: line, but then I get duplicates, because there is a clan_acc column in each table. 24.134.95.253 21:00, 29 April 2024 (UTC)
If you run that correct query directly in the database, does it return results? Yaron Koren (talk) 01:44, 30 April 2024 (UTC)
Yes, running the query in the DB directly returns two columns called "description" and the corresponding values, so the joins do work, but now I cannot get the data into MediaWiki because of the identical column names. 77.22.6.114 09:14, 30 April 2024 (UTC)
Can't you do "clan_description=clan.description, family_description=family.description"? Yaron Koren (talk) 13:42, 30 April 2024 (UTC)
I mean, yes, that’s what I had first, but it doesn’t show any data. 77.22.6.114 19:39, 30 April 2024 (UTC)
Okay, good news! We just checked in a fix, here, so that now the "AS" keyword is handled correctly, which is what was needed here. So now if you do "|data=clan_description=clan.description AS clandesc, family_description=family.description AS familydesc", it should (hopefully) work. Note that the "AS" aliases don't matter, as long as they exist and they're different from one another. Yaron Koren (talk) 16:20, 3 May 2024 (UTC)
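
Why distinct aliases matter can be reproduced outside the wiki. The following Python sketch uses an in-memory SQLite database rather than the real Rfam MySQL database, and the three sample rows are made up, so it only illustrates the general effect: when two result columns share the name "description", any code that keys a row by column name keeps only one of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clan (clan_acc TEXT, description TEXT);
    CREATE TABLE family (rfam_acc TEXT, description TEXT);
    CREATE TABLE clan_membership (clan_acc TEXT, rfam_acc TEXT);
    -- Made-up sample rows, for illustration only.
    INSERT INTO clan VALUES ('CL00001', 'sample clan');
    INSERT INTO family VALUES ('RF00005', 'sample family');
    INSERT INTO clan_membership VALUES ('CL00001', 'RF00005');
""")

JOINS = """
    FROM clan
    JOIN clan_membership ON clan.clan_acc = clan_membership.clan_acc
    JOIN family ON family.rfam_acc = clan_membership.rfam_acc
"""

def rows_by_column_name(select_list):
    """Run the join and key every row by the column names the database reports."""
    cur = conn.execute("SELECT " + select_list + JOINS)
    names = [d[0] for d in cur.description]
    return [dict(zip(names, row)) for row in cur.fetchall()]

# Both columns are reported as "description", so one value overwrites the other.
unaliased = rows_by_column_name("clan.description, family.description")
# Distinct aliases keep both values addressable.
aliased = rows_by_column_name(
    "clan.description AS clandesc, family.description AS familydesc"
)
```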

Deprecated : strtolower(): Passing null to parameter #1 ($string) of type string is deprecated

Setup
  • MediaWiki 1.39.6 (0e03068) 2024-03-11T16:26:30
  • PHP 8.1.2-1ubuntu2.14 (apache2handler)
  • MariaDB 10.6.16-MariaDB-0ubuntu0.22.04.1
  • External Data 3.4-alpha (20a6b7f) 2024-03-15T09:43:16
Issue

Deprecated : strtolower(): Passing null to parameter #1 ($string) of type string is deprecated in /../w/extensions/ExternalData/includes/EDParsesParams.php on line 117

-- [[kgh]] (talk) 17:49, 15 March 2024 (UTC)

I need the wikicode of the parser function you call and the relevant $wgExternalDataSources[] entry (with sensitive information censored out). Also, when did the warning appear: after upgrading MediaWiki, after upgrading External Data, or after adding a new data source or parser function call?
Alexander Mashin talk 03:42, 16 March 2024 (UTC)
The extension's $wgExternalDataSources configuration parameter is at its default. On pages with this issue we are using the extension with calls like this one:
{{#display_external_table:
   source=https://example.org/w/images/9/9c/Export_slice_123.csv
  |format=CSV with header
  |header lines=1
  |start line=3
  |end line=10
  |data= mynr=recordnumber, priref=orig_performance_ref, pc=productcode
  |template=Test Template
}}
|}
I cannot tell if this was an issue before the upgrade, since I had not enabled logging for the wiki before.
Backtrace
[4b34db6f537fa5e035bcd426] /wiki/Test_Page   PHP Deprecated: strtolower(): Passing null to parameter #1 ($string) of type string is deprecated
#0 [internal function]: MWExceptionHandler::handleError()
#1 /../w/extensions/ExternalData/includes/EDParsesParams.php(117): strtolower()
#2 /../w/extensions/ExternalData/includes/EDParsesParams.php(55): EDConnectorBase::paramsFit()
#3 /../w/extensions/ExternalData/includes/connectors/EDConnectorBase.php(234): EDConnectorBase::getMatch()
#4 /../w/extensions/ExternalData/includes/connectors/EDConnectorBase.php(248): EDConnectorBase::getConnectorClass()
#5 /../w/extensions/ExternalData/includes/EDParserFunctions.php(88): EDConnectorBase::getConnector()
#6 /../w/extensions/ExternalData/includes/EDParserFunctions.php(115): EDParserFunctions::get()
#7 /../w/extensions/ExternalData/includes/EDParserFunctions.php(204): EDParserFunctions::fetch()
#8 /../w/extensions/ExternalData/includes/EDParserFunctions.php(428): EDParserFunctions::emulateGetExternalData()
#9 /../w/extensions/ExternalData/includes/EDParserFunctions.php(487): EDParserFunctions::actuallyDisplayExternalTable()
#10 /../w/includes/parser/Parser.php(3439): EDParserFunctions::doDisplayExternalTable()
#11 /../w/includes/parser/Parser.php(3124): Parser->callParserFunction()
#12 /../w/includes/parser/PPFrame_Hash.php(275): Parser->braceSubstitution()
#13 /../w/includes/parser/Parser.php(2953): PPFrame_Hash->expand()
#14 /../w/includes/parser/Parser.php(1609): Parser->replaceVariables()
#15 /../w/includes/parser/Parser.php(723): Parser->internalParse()
#16 /../w/includes/content/WikitextContentHandler.php(301): Parser->parse()
#17 /../w/includes/content/ContentHandler.php(1721): WikitextContentHandler->fillParserOutput()
#18 /../w/includes/content/Renderer/ContentRenderer.php(47): ContentHandler->getParserOutput()
#19 /../w/includes/Revision/RenderedRevision.php(266): MediaWiki\Content\Renderer\ContentRenderer->getParserOutput()
#20 /../w/includes/Revision/RenderedRevision.php(237): MediaWiki\Revision\RenderedRevision->getSlotParserOutputUncached()
#21 /../w/includes/Revision/RevisionRenderer.php(221): MediaWiki\Revision\RenderedRevision->getSlotParserOutput()
#22 /../w/includes/Revision/RevisionRenderer.php(158): MediaWiki\Revision\RevisionRenderer->combineSlotOutput()
#23 [internal function]: MediaWiki\Revision\RevisionRenderer->MediaWiki\Revision\{closure}()
#24 /../w/includes/Revision/RenderedRevision.php(199): call_user_func()
#25 /../w/includes/poolcounter/PoolWorkArticleView.php(91): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#26 /../w/includes/poolcounter/PoolWorkArticleViewCurrent.php(97): PoolWorkArticleView->renderRevision()
#27 /../w/includes/poolcounter/PoolCounterWork.php(162): PoolWorkArticleViewCurrent->doWork()
#28 /../w/includes/page/ParserOutputAccess.php(299): PoolCounterWork->execute()
#29 /../w/includes/page/Article.php(714): MediaWiki\Page\ParserOutputAccess->getParserOutput()
#30 /../w/includes/page/Article.php(528): Article->generateContentOutput()
#31 /../w/includes/actions/ViewAction.php(78): Article->view()
#32 /../w/includes/MediaWiki.php(542): ViewAction->show()
#33 /../w/includes/MediaWiki.php(322): MediaWiki->performAction()
#34 /../w/includes/MediaWiki.php(904): MediaWiki->performRequest()
#35 /../w/includes/MediaWiki.php(562): MediaWiki->main()
#36 /../w/index.php(50): MediaWiki->run()
#37 /../w/index.php(46): wfIndexMain()
#38 {main}
[[kgh]] (talk) 17:08, 18 March 2024 (UTC)

Warning : Cannot modify header information - headers already sent by

Setup
  • MediaWiki 1.39.6 (0e03068) 2024-03-11T16:26:30
  • PHP 8.1.2-1ubuntu2.14 (apache2handler)
  • MariaDB 10.6.16-MariaDB-0ubuntu0.22.04.1
  • External Data 3.4-alpha (20a6b7f) 2024-03-15T09:43:16
Issue

Warning : Cannot modify header information - headers already sent by (output started at /var/www/html/w/extensions/ExternalData/includes/EDParsesParams.php:117) in /../w/includes/WebResponse.php on line 75

-- [[kgh]] (talk) 17:51, 15 March 2024 (UTC)

I assume this error message is just due to the one you reported above. Yaron Koren (talk) 18:08, 15 March 2024 (UTC)
I cannot tell. The wiki is using Tweeki and the respective pages look really messy. [[kgh]] (talk) 18:11, 15 March 2024 (UTC)
It could be connected, though this issue does not appear every time I get the deprecation issue reported above. Here is a backtrace:
#0 [internal function]: MWExceptionHandler::handleError()
#1 /../w/includes/WebResponse.php(75): header()
#2 /../w/includes/OutputPage.php(2732): WebResponse->header()
#3 /../w/includes/OutputPage.php(2891): OutputPage->sendCacheControl()
#4 /../w/includes/MediaWiki.php(922): OutputPage->output()
#5 /../w/includes/MediaWiki.php(562): MediaWiki->main()
#6 /../w/index.php(50): MediaWiki->run()
#7 /../w/index.php(46): wfIndexMain()
#8 {main}
Cheers --[[kgh]] (talk) 17:17, 18 March 2024 (UTC)

Accessing identically named subkeys

Hi! I'm trying to use External Data to get data from an API that returns (part of) its data as {"midweekMeetingTime":{"weekday":2,"time":"18:30:00"},"weekendMeetingTime":{"weekday":7,"time":"12:00:00"}}, where the subkey names are the same in different arrays. Is there a way to specify which value to set a variable to (i.e. weekend_day=weekendMeetingTime.time,midweek_day=midweekMeetingTime.time) without using a Lua module? While I believe it would work in Lua, I'd prefer to keep it in wikitext, as most other parameters work and making a module for just this seems like a waste. Thanks in advance! PixDeVl (talk) 20:37, 1 May 2024 (UTC)

It looks like the "use jsonpath" parameter would be helpful for this case; see here. Yaron Koren (talk) 21:03, 1 May 2024 (UTC)
Oh, thanks! Would you happen to know of any existing uses of the extension with this parameter that I could refer to? There seems to be some weirdness going on: the parser shows the expression working as intended (i.e. $..midweekMeetingTime.time; the double dot is there since the API returns the array in a list, [{...}]), while the extension gives an undefined error. So I want to compare it with the syntax used by others, although I'll be the first to admit I may simply be misunderstanding or miswriting something. Thanks again for the tip and for making such a useful extension! PixDeVl (talk) 22:18, 1 May 2024 (UTC)
Sure - all three example queries here use JSONPath. I hope this helps uncover the problem... Yaron Koren (talk) 13:06, 2 May 2024 (UTC)
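
For readers unsure what an expression such as $..midweekMeetingTime.time is expected to select, here is a hand-rolled Python sketch. It is not the JSONPath engine that External Data uses, and the helper names find_key and select are hypothetical; it only walks the sample payload quoted above (wrapped in a list, as the poster describes) and picks the two identically named "time" subkeys by their parent key.

```python
import json

# Sample payload shaped like the one quoted in this thread, wrapped in a list.
payload = json.loads(
    '[{"midweekMeetingTime":{"weekday":2,"time":"18:30:00"},'
    '"weekendMeetingTime":{"weekday":7,"time":"12:00:00"}}]'
)

def find_key(node, key):
    """Yield every value stored under `key` at any depth (like JSONPath `$..key`)."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)

def select(doc, parent, child):
    """Rough equivalent of `$..parent.child`."""
    return [p[child] for p in find_key(doc, parent)
            if isinstance(p, dict) and child in p]

midweek_time = select(payload, "midweekMeetingTime", "time")
weekend_time = select(payload, "weekendMeetingTime", "time")
```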

#store_external_table rendering raw

Setup
  • MediaWiki 1.39.7 (2ba7e95) 2024-05-13T11:45:16
  • PHP 8.1.2-1ubuntu2.17 (apache2handler)
  • MariaDB 10.6.16-MariaDB-0ubuntu0.22.04.1
  • External Data 3.4-alpha (c23dc0d) 2024-05-18T10:11:41
Issue

The #store_external_table parser function is rendering raw, i.e., instead of being invisible on the page, the user sees {{#store_external_table:Is fruit in |Has name={{{name}}} |Has color={{{color}}} |Has shape={{{shape}}} }}. Despite this, the parser function still does its job, i.e., it stores the subobjects holding the annotations. [[kgh]] (talk) 18:25, 5 June 2024 (UTC)

Yes, the #store_external_table parser function was removed from External Data - I need to release version 3.4, to make it an official removal. I don't think it's storing data - my guess is that the SMW data you're seeing was already there beforehand. Yaron Koren (talk) 19:04, 5 June 2024 (UTC)
Thanks for the info. How is information stored for SMW in new releases, i.e., what replaced this parser function? Is #get_web_data doing this now or do I need to create a template that maps the data to the properties? [[kgh]] (talk) 19:09, 5 June 2024 (UTC)
You now need to do it via a template, yes. Yaron Koren (talk) 19:18, 5 June 2024 (UTC)
Ah, ok. Thank you for confirming. It looks like one needs to use #display_external_table for this. [[kgh]] (talk) 19:20, 5 June 2024 (UTC)

Suggestion: add cookies to getWebData in Lua

Hi. I have played around with getWebData, especially to see the potential of calling my private wiki's API. I already use getDbData, but I think the two can be complementary.

The problem is that anonymous users cannot read my private wiki (they have to create an account) and getWebData only allows for anonymous connections. I have tried to log in to my wiki using getWebData the same way my custom bot does, but it doesn't work. I get the error "Unable to continue login. Your session most likely timed out." and I reckon this is because the session ID is passed as a cookie, which is ignored here.

Furthermore, once you are logged in, you have a cookie to avoid logging in every time. That could also work with other sites.

So my suggestion/request is a way to handle cookies with getWebData in Lua (I don't think it would be useful in the template version). I propose the following:

  • The Lua getWebData would return three values instead of two: result, error and cookies (so that it doesn't break existing code, which takes only result and error)
  • The Lua getWebData would accept a new field in its table argument named "cookies" as a table of cookies which would be passed along with the HTTP request.

That way, we could get cookies from an HTTP request and send them to another request, if necessary, without breaking any existing code.

Thanks. Steff-X (talk) 13:21, 16 June 2024 (UTC)

  • You could try declaring an ExternalDataBeforeWebCall hook that somehow obtains your cookie and appends it with $options['headers']['Cookie'] .= ";your_cookie=$cookie";.
    Note to self: add or remember a less global way to add callbacks to External Data sources.
    Alexander Mashin talk 14:18, 16 June 2024 (UTC)
    Thanks Alex, that worked.
    For other people interested, here's how to do it:
    • In Firefox or Chrome, connect to the wiki you're interested in
    • Display the cookies. The method depends on your browser.
    • Copy the name and value of the cookies ending with UserID, UserName and Token
    • In LocalSettings.php, add the following code, substituting the cookies' name and value by yours (don't miss the .= (dot-equal) sign and the final semi-colon):
    *:$wgHooks['ExternalDataBeforeWebCall'][] =
    *:function ( string $method, string &$url, array &$options, array &$errors ): bool {
    *:	// Only add the cookies when the request goes to this wiki's API.
    *:	if ( $url === 'https://your.wiki/api.php' ) {
    *:		$options['headers']['Cookie'] .= ";mediawikiUserID=5;mediawikiUserName=myUserName;mediawikiToken=(censored)";
    *:	}
    *:	// Always return a bool, as the declared return type requires.
    *:	return true;
    *:};
    
    Unfortunately, anything more complicated than that is beyond my knowledge of PHP. That is why I still think that an easy way to handle cookies would be beneficial.
    Thanks again for the tip. Steff-X (talk) 12:11, 18 June 2024 (UTC)
    • If you get your cookie outside of MediaWiki, your method is overcomplicated. Just set $wgExternalDataSources['https://your.wiki/api.php']['options']['headers']['Cookie'] = 'mediawikiUserID=5;mediawikiUserName=myUserName;mediawikiToken=(censored)';
      Alexander Mashin talk 12:56, 18 June 2024 (UTC)
      Well, I can't get it to work.
      I always get the error: Error sending API request: SyntaxError: JSON Parse error: Unrecognized token '<'
      And I don't know where it comes from, because the error shows up even if I disable "format" = "json" in the request and everything related to JSON in my code. Steff-X (talk) 14:32, 18 June 2024 (UTC)
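
As a side note on the header value itself, the Python sketch below shows one way to assemble a Cookie header from name/value pairs. It is independent of External Data: the helper cookie_header is hypothetical, and the cookie names and values are the placeholders from this thread. It assumes the conventional "name=value; name=value" layout and avoids a leading separator when no cookie was set beforehand.

```python
def cookie_header(cookies, existing=""):
    """Join name/value pairs as 'name=value; name=value', keeping any existing header."""
    pairs = [f"{name}={value}" for name, value in cookies.items()]
    if existing:
        # Drop stray separators around the existing value before joining.
        pairs.insert(0, existing.strip("; "))
    return "; ".join(pairs)

# Placeholder names and values taken from the example above; substitute your own.
cookies = {
    "mediawikiUserID": "5",
    "mediawikiUserName": "myUserName",
    "mediawikiToken": "(censored)",
}
header = cookie_header(cookies)
appended = cookie_header(cookies, existing="session=abc;")
```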

This extension documentation really lacks some examples in Lua

Hi everyone. This extension is great in its capabilities but I think it really lacks some code examples in Lua/Scribunto.

The usage notice says that "there is one-to-one correspondence between parser functions retrieving data and Lua functions evident from their names". OK cool, but the syntax is entirely different between the two, especially when you just start using Scribunto/Lua because your wiki template keeps failing!

I think the documentation should have some advanced examples in Lua/Scribunto (to show the power of Lua vs Templates) and I'm ready to participate. Steff-X (talk) 13:47, 16 June 2024 (UTC)

  • A few examples, perhaps too advanced:
    • Module:Chrono used to show this. Funnily, it drills into Cargo tables;
    • Module:Tzdata, which accesses timedatectl and is used to parse date / time strings;
    • Module:WikiList that parses a JSON from GitHub;
    • My own example from Phabricator;
    • Module:External_data/new -- this module assembles external data split over several web pages with links to each other, e.g. a long list, from only one URL, linking the first page (or any page, from which there is a path to other pages).
Alexander Mashin talk 14:34, 16 June 2024 (UTC)

Issue with caching of data

Hi there,

Great extension, we use it a lot. We have somewhat recently adopted Lua to fetch data from another private wiki onto a public wiki, via a whitelisted Special:Ask page and the mw.ext.externalData.getExternalData function. It had been working well until recently, when we realized that any changes on the private wiki were taking quite a lot of time to show up on the public wiki (from many hours to a few days). We've tried everything we could think of to reduce any kind of caching to 0, but no luck. And the effect seems to be compounded: External Data has some invisible level of caching, and then Lua does too, it seems. We tried:

  • $wgExternalDataSources['*']['min cache seconds'] = 0
  • We've tried the standalone, non-Lua function: #external_value with cache seconds=0
  • In Lua we first had mw.loadData() with a big call to mw.ext.externalData.getExternalData to bring the data to the page once and then parse it into a large template.
  • Then in Lua we tried individual little calls to mw.ext.externalData.getExternalData without using mw.loadData().
  • We've turned off all kinds of cache everywhere we could think of, including client side, etc.

No luck! Any ideas?

Jeremi Plazas (talk) 16:03, 22 July 2024 (UTC)

    1. Setting the cache period to zero would be a bad idea, because it would make both wikis vulnerable to DoS attacks. Set it to at least one minute, or more if Special:Ask takes longer than that to run;
    2. Whether or not you use Lua or mw.loadData() is not relevant.
    3. I suggest that you set $wgExternalDataSources['(Full URL of Special:Ask with query; or that wiki's hostname, e.g., example.org)']['min cache seconds'] = 60;. That should be enough.
    4. The only additional caching level that External Data introduces is the table ed_url_cache. It can contain cache entries with an expiration time in the future, set when the cache expiration period was longer. You may want to have them removed.
Alexander Mashin talk 02:01, 23 July 2024 (UTC)
Thanks for the reply! We tried your suggestions but our problem persists. We set $wgExternalDataSources as per your example and also truncated the ed_url_cache table. Still, we have pages with data that is a couple of days stale. We can't quite figure out at what level this is happening. DB? Jeremi Plazas (talk) 09:39, 23 July 2024 (UTC)
Our current config is:
wfLoadExtension( 'ExternalData' );
$wgExternalDataSources['https://research.tsadra.org/index.php']['min cache seconds'] = 5;
$wgExternalDataSources['*']['always allow stale cache'] = false;
But even with 5 seconds, the entry in ed_url_cache does not get updated.
Jeremi Plazas (talk) 10:29, 23 July 2024 (UTC)
This may be an issue with the MediaWiki cache. To make sure, open https://your-wiki.org/wiki/Page_with_ED?action=purge.
Alexander Mashin talk 11:50, 23 July 2024 (UTC)
Also, the key to $wgExternalDataSources should be either the full precise URL of the data source, including what goes after ?, or the domain name (e.g., research.tsadra.org), or the second-level domain (tsadra.org), or an asterisk.
Alexander Mashin talk 11:53, 23 July 2024 (UTC)
Thanks, yes, we tried purging the page many times. We also tried all the different URL types (full, just the domain); nothing does it. It looks like a deeper level of caching is going on, possibly the ed_url_cache table not updating...
So we have a little test page that we whitelisted on this wiki site we're working on to illustrate:
https://bca.tsadra.org/index.php/Test_external_data
If it helps.
Jeremi Plazas (talk) 14:18, 23 July 2024 (UTC)
@Alex Mashin - Any new thoughts on this? Thanks! Jeremi Plazas (talk) 15:35, 1 August 2024 (UTC)
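
Point 4 of Alexander's list (cache entries whose expiration time was set while a longer period was configured) can be pictured with a toy cache. The Python sketch below is not External Data's implementation; the class UrlCache and its fields are hypothetical. It only shows that when the expiry is stored with the entry, lowering the setting afterwards does not shorten the life of entries that already exist. It does not explain the remaining staleness reported above after the table was truncated.

```python
class UrlCache:
    """Toy cache that stores an absolute expiry time with each entry."""

    def __init__(self, cache_seconds):
        self.cache_seconds = cache_seconds
        self.entries = {}  # url -> (value, expires_at)

    def get(self, url, fetch, now):
        entry = self.entries.get(url)
        if entry and now < entry[1]:
            return entry[0]  # still considered fresh
        value = fetch()
        self.entries[url] = (value, now + self.cache_seconds)
        return value

cache = UrlCache(cache_seconds=86400)            # one day, the old setting
first = cache.get("u", lambda: "old", now=0)     # fetched and cached until t=86400
cache.cache_seconds = 5                          # setting lowered afterwards
second = cache.get("u", lambda: "new", now=60)   # old entry is still served
cache.entries.clear()                            # comparable to emptying the cache table
third = cache.get("u", lambda: "new", now=61)    # fetched again
```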

working with self signed certificates

Is there any way to allow self-signed certificates while querying web pages?

I get the error:

Error while fetching the URL (German original: "Fehler beim Abruf der URL"): SSL certificate problem: self-signed certificate in certificate chain

130.149.196.182 15:01, 30 July 2024 (UTC)

Bug when data contains "umlauts"?

MediaWiki 1.39, newest source of the extension.

I'm using

{{#get_db_data:
   db=db
   |from=table
   |where=wiki_pageid=43
   |data=Titel=title
}}

title contains

AA - some Text überprüfen

error is:

[a718c169286b5857d6612120] /index.php/Test_ExternalData_error ValueError: mb_convert_encoding(): Argument #3 ($from_encoding) must specify at least one encoding

Backtrace:

from /var/www/html/extensions/ExternalData/includes/connectors/EDConnectorDb.php(159)

#0 /var/www/html/extensions/ExternalData/includes/connectors/EDConnectorDb.php(159): mb_convert_encoding(string, string, string)
#1 /var/www/html/extensions/ExternalData/includes/connectors/EDConnectorDb.php(140): EDConnectorDb::processField(string)
#2 /var/www/html/extensions/ExternalData/includes/connectors/EDConnectorDb.php(102): EDConnectorDb->processRows(Wikimedia\Rdbms\MysqliResultWrapper, array)
#3 /var/www/html/extensions/ExternalData/includes/EDParserFunctions.php(90): EDConnectorDb->run()

Probably a problem with the "umlauts"? 91.64.64.183 09:25, 1 August 2024 (UTC)

I have a feeling that your External Data is not up to date.
Alexander Mashin talk 15:57, 1 August 2024 (UTC)
A fresh
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/ExternalData
gives me the exact same error. 91.64.64.183 11:10, 2 August 2024 (UTC)

Getting faulty data with #get_web_data: / #get_file_data:

Hello dear people. I have been having issues with this extension for a few days now and my head is soon to be bursting ))

I have the following problem. I need to get external .json files (or .csv) and use those to fill up pages on my wiki.

However, the problem lies in the consistency of the data I gather. To explain this issue, I had best show you an example.

First, I will show you how the JSON I gather seems to be faulty when I try to filter it.

{{#get_web_data:
    source=https://uralez.de/foaop/planner/ammo.json
    |format=json
    |data=Name,ProtoId,Weight,PicInv,Ammo_Caliber,Ammo_Caliber_EXT,Ammo_DmgMult,Ammo_DmgDiv,Tier,Poison,Ammo_AcMod,Ammo_DRMod,Ammo_DTMod,Ammo_SubShots,Weapon_DmgType_0,Weapon_DmgMin_0,NoCrit,Weapon_DmgType_1,Weapon_DmgMin_1,Weapon_BleedStr,Ammo_Radius,Special,Fuse,Weapon_Extra_0,Weapon_Extra_1,Weapon_Spread_0
    |filters=ProtoId={{{id|14017}}}
}}

In this example, the id 14017 does not have a lot of data. This is what is written inside that .json for that particular id:

{
"Name":"Paper Cartridges",
"ProtoId":14017,
"PicInv":"art/ammo/papercartridges.png",
"Weight":35,
"Ammo_Caliber":1,
"Ammo_Caliber_EXT":44,
"Ammo_DRMod":20,
"Ammo_DTMod":0,
"Weapon_DmgType_0":1,
"Weapon_DmgMin_0":250,
"Weapon_BleedStr":150,
"Tier":1
},

However, running the script on my MediaWiki (https://uralez.de/foaop/wiki/index.php/Template:Ammo_Infobox, in case this is of any importance) will provide vastly different results for variables that this ID does not have, but others do.

For example it also gives me 'Subshots=5', even though that particular ID has no Subshots variable to begin with! In other words, getting json and filtering it out for an ID fills all the "unused" variables with values from other IDs. This should not happen, but it does, and I do not know why.

Trying to fix this issue, I hoped that converting the file to a .csv and instead gathering information from there could be a solution. It only partially was. Trying to get the same data from the .csv has the issue, that the first row of the .csv is skipped and not read out for any of the variables.

So having this for example:

{{#get_web_data:
    source=https://uralez.de/foaop/planner/ammo.csv
    |format=csv with headers
    |data=Name,ProtoId,Weight,PicInv,Ammo_Caliber,Ammo_Caliber_EXT,Ammo_DmgMult,Ammo_DmgDiv,Tier,Poison,Ammo_AcMod,Ammo_DRMod,Ammo_DTMod,Ammo_SubShots,Weapon_DmgType_0,Weapon_DmgMin_0,NoCrit,Weapon_DmgType_1,Weapon_DmgMin_1,Weapon_BleedStr,Ammo_Radius,Special,Fuse,Weapon_Extra_0,Weapon_Extra_1,Weapon_Spread_0
    |filters=ProtoId={{{id|14017}}}
}}

Will fill every variable BUT the "Name" variable. If I modify the .csv to put another variable to the front, it will ignore that variable. So in this case, no matter what id I will filter for - the "Name" variable will NOT be populated.

Both issues plague me - I do not know what I am doing wrong, and this extension seems to be such a core part of reading external .json / .csv files that I do not even know of a different approach. If anyone could provide me with some help - or perhaps tell me what might be wrong - I would be very grateful. Thanks to anyone reading this. Zmejaa (talk) 23:00, 20 February 2025 (UTC)

  • First of all, the recommended syntax now is like this (I left only the first few columns and skipped the filter). Note that there is only one parser function, and no data parameter:
{|
! Name !! ProtoId !! Weight !! PicInv !! Ammo_Caliber !! Ammo_Caliber_EXT !! Ammo_DmgMult !! Ammo_DmgDiv
{{#for_external_table:|
{{!}}-
{{!}} {{{Name}}} {{!}}{{!}} {{{ProtoId}}} {{!}}{{!}} {{{Weight}}} {{!}}{{!}} {{{PicInv}}} {{!}}{{!}} {{{Ammo_Caliber}}} {{!}}{{!}} {{{Ammo_Caliber_EXT}}} {{!}}{{!}} {{{Ammo_DmgMult}}} {{!}}{{!}} {{{Ammo_DmgDiv}}}
     |source = https://uralez.de/foaop/planner/ammo.json
     |format = json
}}
|}
  • From the output of this wikicode the problem will be obvious. On heterogeneous records (that is, records with optional fields) the extension has always worked counter-intuitively, since it emulates a sort of column-based storage, in which columns are not dependent on each other. Empty fields are filled with values from the next records.
    Since you want only one record, a possible solution is to use JSONPath. Note the default em dashes for missing properties:
{|
! Name !! ProtoId !! Weight !! PicInv !! Ammo_Caliber !! Ammo_Caliber_EXT !! Ammo_DmgMult !! Ammo_DmgDiv
{{#for_external_table:|
{{!}}-
{{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].Name|—}}} {{!}}{{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].ProtoId|—}}} {{!}}{{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].Weight|—}}} {{!}}{{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].PicInv|—}}} {{!}}{{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].Ammo_Caliber|—}}} {{!}}{{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].Ammo_Caliber_EXT|—}}} {{!}}{{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].Ammo_DmgMult|—}}} {{!}}{{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].Ammo_DmgDiv|—}}}
    |source = https://uralez.de/foaop/planner/ammo.json
    |format = json with jsonpath
}}
|}
  • Also, you could use Lua:
local id = 14017
local record = mw.ext.externalData.getExternalData {
	source = 'https://uralez.de/foaop/planner/ammo.json',
	format = 'json with jsonpath',
	data = { record = '$[?(@.ProtoId == ' .. tostring (id) .. ')]' }
}.record
  • Alexander Mashin talk 08:26, 21 February 2025 (UTC)
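
The column-based behaviour Alexander describes can be modelled in a few lines of Python. This is an illustration only, not the extension's code: the helpers columns_from and row are hypothetical, and the second record (ProtoId 14018) is invented so that one record has Ammo_SubShots and the other does not.

```python
records = [
    {"ProtoId": 14017, "Name": "Paper Cartridges"},                        # no Ammo_SubShots
    {"ProtoId": 14018, "Name": "Hypothetical Shell", "Ammo_SubShots": 5},  # invented record
]

def columns_from(records, fields):
    """Collect each field into its own column, skipping records that lack it."""
    return {f: [r[f] for r in records if f in r] for f in fields}

def row(columns, index, default="-"):
    """Read row `index` across the independent columns."""
    return {f: (col[index] if index < len(col) else default)
            for f, col in columns.items()}

cols = columns_from(records, ["ProtoId", "Name", "Ammo_SubShots"])
first_row = row(cols, 0)   # Ammo_SubShots slides up from the second record
second_row = row(cols, 1)  # and is missing where it actually belongs
```

Because each column is filled independently, the first row picks up Ammo_SubShots = 5 from the second record, which matches the symptom reported at the top of this thread.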
    Hello! Thank you for your input! This definitely seems to be the solution I was looking for. I have only one question remaining: using {{{id}}} does not seem to be supported inside the #for_external_table: function.
    $[?(@.ProtoId == {{{id (Illegal JSONpath $1)
    What are the options to make the variable usable inside #for_external_table?
    Using the ID like this works, of course:
    {{{$[?(@.ProtoId == 14017)].Name|—}}}
    But then I cannot have it generated (unless I take the Lua approach, of course). Zmejaa (talk) 09:52, 21 February 2025 (UTC)
    Small addendum. I have managed to resolve the issue with a small hack.
{|
! Name !! ProtoId
{{#for_external_table:|
{{!}}-
{{!}} {{{$[?(@.ProtoId == {{ProtoId|{{{id|14017}}}}})].Name|—}}} {{!}}{{!}} {{{$[?(@.ProtoId == {{ProtoId|{{{id|14017}}}}})].ProtoId|—}}}
    |source = http://localhost/foaop/planner/ammo.json
    |format = json with jsonpath
}}
|}
  • I created a ProtoId template which just outputs the id passed to it:
{{{1}}}
  • This basically.
    It looks extremely wrong and I am sure I might be able to fix the syntax, but for now this works. I hope. I will do some tests now.
    I thank you for your time and effort for going through my issue and providing me with help and support. Thank you again.
    I would of course not mind a solution to the syntax problem, but at least I can work with the hacky solution I have right now. <3 Zmejaa (talk) 11:03, 21 February 2025 (UTC)Reply
    I have managed to fix all the issues! I am very thankful for your input once again! I am very happy :)
    In case anyone has a similar problem, this is basically what I did:
{{#get_file_data:
    source=planner
    |file name=ammo.json
    |format=json with jsonpath
    |data=Name=$[?(@.ProtoId == {{{id|14017}}})].Name,ProtoId=$[?(@.ProtoId == {{{id|14017}}})].ProtoId,Weight=$[?(@.ProtoId == {{{id|14017}}})].Weight,PicInv=$[?(@.ProtoId == {{{id|14017}}})].PicInv
}}
  • Of course you can also do the same with get_web_data.
    Basically the "filtering" happens in "data" and it works flawlessly, remedying the issues I had beforehand, which Alexander Mashin explained (values in optional fields being filled from other .json entries). Zmejaa (talk) 12:32, 21 February 2025 (UTC)Reply
  • First of all, you should no longer use the {{#get_…_data:}} functions. They are deprecated, because in new versions of MediaWiki these functions can be called after {{#external_value:}}, etc.
    I have tested the following syntax for a template, using the data parameter, which is otherwise also deprecated:
{|
! Name !! ProtoId !! Weight !! PicInv !! Ammo_Caliber !! Ammo_Caliber_EXT !! Ammo_DmgMult !! Ammo_DmgDiv
{{#for_external_table:|
{{!}}-
{{!}} {{{Name|—}}} {{!}}{{!}} {{{ProtoId|—}}} {{!}}{{!}} {{{Weight|—}}} {{!}}{{!}} {{{PicInv|—}}} {{!}}{{!}} {{{Ammo_Caliber|—}}} {{!}}{{!}} {{{Ammo_Caliber_EXT|—}}} {{!}}{{!}} {{{Ammo_DmgMult|—}}} {{!}}{{!}} {{{Ammo_DmgDiv|—}}}
    |source = https://uralez.de/foaop/planner/ammo.json
    |format = json with jsonpath
    |data =
         Name = $[?(@.ProtoId == {{{id|14017}}})].Name,
         ProtoId = $[?(@.ProtoId == {{{id|14017}}})].ProtoId,
         Weight = $[?(@.ProtoId == {{{id|14017}}})].Weight,
         PicInv = $[?(@.ProtoId == {{{id|14017}}})].PicInv,
         Ammo_Caliber = $[?(@.ProtoId == {{{id|14017}}})].Ammo_Caliber,
         Ammo_Caliber_EXT = $[?(@.ProtoId == {{{id|14017}}})].Ammo_Caliber_EXT,
         Ammo_DmgMult = $[?(@.ProtoId == {{{id|14017}}})].Ammo_DmgMult,
         Ammo_DmgDiv = $[?(@.ProtoId == {{{id|14017}}})].Ammo_DmgDiv
}}
|}
  • The root of the issue is that the {{{…}}} macros in the now-recommended data-less syntax are not processed correctly when they contain template parameters. I will try to fix it.
    Alexander Mashin talk 05:29, 22 February 2025 (UTC)Reply
    Thank you very much! I have used your approach for my templates and it works wonderfully, just as expected. I do not use the deprecated functions (apart from the data parameter, because of the issue you outlined) and get exactly the results I want, as seen here:
https://uralez.de/foaop/wiki/index.php?title=Template:Weapon_Infobox
https://uralez.de/foaop/wiki/index.php?title=Template:Armor_Infobox
https://uralez.de/foaop/wiki/index.php?title=Template:Ammo_Infobox

For this once again a very big thank you!

  • While my initial question has been answered, I would like to ask whether it is possible to have #for_external_table list the .json entries without the aforementioned issue of values ending up in different entries. Basically, when I try to get all the "weapons" (in my case), it once again behaves unintuitively.

To show with an example, I mean the following:

{{{!}} class="wikitable sortable"
! Tier
! Type
! Name
! Picture
! Strength
! Damage Type
! Max Distance<br>(Attack 1)
! Max Distance<br>(Attack 2)
{{#for_external_table:|
{{!}}-
{{!}} {{{Tier}}}
{{!}} {{{Weapon_Skill_0}}}
{{!}} {{{Name}}}
{{!}} https://uralez.de/foaop/planner/{{{PicInv}}}
{{!}} {{{Weapon_MinStrength}}}
{{!}} {{{Weapon_DmgType_0}}}
{{!}} {{{Weapon_MaxDist_0}}}
{{!}} {{{Weapon_MaxDist_1}}}
    |source=https://uralez.de/foaop/planner/weapons.json
    |format = json with jsonpath
    |data =
Name=$[?(@)].Name,
PicInv=$[?(@)].PicInv,
Weapon_MinStrength=$[?(@)].Weapon_MinStrength,
Tier=$[?(@)].Tier,
Weapon_Skill_0=$[?(@)].Weapon_Skill_0,
Weapon_DmgType_0=$[?(@)].Weapon_DmgType_0,
Weapon_MaxDist_0=$[?(@)].Weapon_MaxDist_0,
Weapon_MaxDist_1=$[?(@)].Weapon_MaxDist_1
}}
{{!}}}

(For clarity: Weapon_MaxDist_1 is a value that not every entry has, and an entry that lacks it takes the value from another entry.) Is there a way to use just for_external_table and still avoid values jumping into other entries because of the way the extension stores them? Or would I once again have to resort to a hack (either converting the .json to a .csv specifically for the whole table, or some other shenanigan such as calling a template for each entry, which would however result in a MASSIVE number of calls to the target web server)? Zmejaa (talk) 13:16, 24 February 2025 (UTC)Reply

  • For a set of records not all of which have the same attributes set, the safest option is to use Lua: mw.ext.externalData.getExternalData 'https://uralez.de/foaop/planner/weapons.json'.__json. Note the syntactic sugar.
    Also, you definitely need to set up this extension's cache: it not only reduces the load on your server and the remote one, but also helps if the latter goes offline.
    Alexander Mashin talk 15:30, 24 February 2025 (UTC)Reply
    Understood! I had the extension cache already set up so I should be good in that regard.
    I tried to avoid Lua, not because I am not proficient with it (quite the contrary), but more because I thought it would add unnecessary risks, data handling and stress. But I guess the use of Lua is quite widespread in the MediaWiki community and shouldn't pose any grave issues :)
    I can't stress this enough: a warm thank-you once again. You have helped me with all the issues I had and allowed me to build a solution that works very well for me!
    o7 Zmejaa (talk) 15:37, 24 February 2025 (UTC)Reply
    If you upgrade the extension, you can now use the concise syntax for {{#for_external_table:}}, i.e., without data, even with template parameters in external variables' names and default values.
    Alexander Mashin talk 06:09, 26 February 2025 (UTC)Reply
    Thank you very much! )) Zmejaa (talk) 09:28, 28 February 2025 (UTC)Reply
    I noticed a small bug when using template parameters in external variables' names with default values.
    For example I have this code:
{{#for_external_table:|
{{{!}} class="wikitable" style="width:30%; float:right"
! colspan="1" {{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].Name|—}}}<br>Tier {{{$[?(@.ProtoId == {{{id|14017}}})].Tier|—}}}
{{!}}}
    |source = https://uralez.de/foaop/planner/ammo.json
    |format = json with jsonpath
}}
  • The last variable, `Tier`, is rendered as the default dash even though the value exists. If I then request a third variable (or even the same one again), it works. It seems that the last variable call is not handled properly.
    The following, for example, now shows a working Tier, but ProtoId is rendered as a dash because it is now the last query:
{{#for_external_table:|
{{{!}} class="wikitable" style="width:30%; float:right"
! colspan="1" {{!}} {{{$[?(@.ProtoId == {{{id|14017}}})].Name|—}}}<br>Tier {{{$[?(@.ProtoId == {{{id|14017}}})].Tier|—}}}<br>ProtoId {{{$[?(@.ProtoId == {{{id|14017}}})].ProtoId|—}}}
{{!}}}
    |source = https://uralez.de/foaop/planner/ammo.json
    |format = json with jsonpath
}}

Zmejaa (talk) 13:33, 7 March 2025 (UTC)Reply

Small addendum: #if calls inside the function do not work at all:
{{#if: {{{$[?(@.ProtoId == {{{id|14017}}})].PicInv|}}} | https://uralez.de/foaop/planner/{{{$[?(@.ProtoId == {{{id|14017}}})].PicInv}}} |No Image}}
even if, after the #if statement, you ask for a different kind of data. I suspect it has something to do with the order in which the statement is processed (first the #if, then the actual variable call, maybe?). Zmejaa (talk) 14:04, 7 March 2025 (UTC)Reply
I'm afraid I'm having the same kind of issue here. I'm querying JSON data with JSONPath and using #for_external_table to output it to the page. But at the end of the table, the final rows are missing data that has been wrongly assigned to earlier rows (two rows off). I updated ED to the latest master, but that did not solve anything. Rand(1,2022) (talk) 08:03, 3 April 2025 (UTC)Reply
  • You might want to provide a reproducible, publicly available example, in case there's another regression. But the chances are small. There is a fault in the underlying legacy architecture of the extension: when the data records' structure is not uniform, as is the case with JSON and XML, as opposed to CSV, the data columns, which are built for each variable independently, will be of different heights. It is not likely that this is going to be fixed soon. You should use Lua and the __json variable, which will preserve the structure of the JSON file.
    P.S. Now that I think of it more, I realise that it's not even a bug. When we have several XPaths, JSONPaths or plain field names, there is no reason to assume that they describe fields of the same data structure. The implied data record is a phantom, for none has ever been queried. External Data has no mechanism to query data records and then the fields of each record, other than Lua, where either the __json table can be indexed or, indeed, the required rows can be queried as a whole and each returned as a table.
    Alexander Mashin talk 12:01, 3 April 2025 (UTC)Reply
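The column-height problem described above can be reproduced outside the wiki. The following Python sketch is only an illustration under stated assumptions: the three sample records are invented, and the function names are hypothetical and do not reflect External Data's actual internals. It contrasts building one column per variable and joining by position with reading every field from its own record.

```python
# Invented sample standing in for weapons.json; the second record
# lacks the optional "Weapon_MaxDist_1" property.
WEAPONS = [
    {"Name": "Weapon A", "Weapon_MaxDist_1": 1},
    {"Name": "Weapon B"},
    {"Name": "Weapon C", "Weapon_MaxDist_1": 8},
]


def column(records, key):
    """Collect one variable's values independently, skipping records without it."""
    return [r[key] for r in records if key in r]


def rows_from_columns(records, keys, default="n/a"):
    """Join independently built columns by position (the error-prone way)."""
    columns = [column(records, k) for k in keys]
    height = max((len(c) for c in columns), default=0)
    return [
        tuple(c[i] if i < len(c) else default for c in columns)
        for i in range(height)
    ]


def rows_from_records(records, keys, default="n/a"):
    """Read every field from its own record, so gaps stay where they belong."""
    return [tuple(r.get(k, default) for k in keys) for r in records]
```

With this sample, rows_from_columns pairs "Weapon B" with the value 8 that belongs to "Weapon C" and leaves the last row empty, whereas rows_from_records keeps the gap on "Weapon B". In a wiki, the record-based approach corresponds to iterating over the __json table in a Scribunto module.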
It happens when at least one of the items in the array lacks a value or key-value pair. If that's not covered by ED, then so be it. I have now solved it using a temporary extension that converts JSON to a PHP array internally, plus some custom code to fetch the relevant data. Rand(1,2022) (talk) 11:02, 4 April 2025 (UTC)Reply
OK, but if you have Scribunto, it can already do it.
Alexander Mashin talk 11:11, 4 April 2025 (UTC)Reply
That's true, of course 👍 Rand(1,2022) (talk) 11:49, 4 April 2025 (UTC)Reply