Extension talk:External Data/Archive 2017 to 2018

Nested XML Content
Really like the extension - so much potential. I looked everywhere I could think of, but have not found a more detailed discussion of using external XML data. Perhaps somebody has figured out how to do this?

  red  purple  brown

Any way to specify local variable mappings for box color vs. fruit color? --12.167.75.11 21:16, 3 June 2010 (UTC)

External data showing non-cached data but semantic data incorrect
I'm pulling data from Twitter and marking it up with Semantic MediaWiki. The external data is updating, but the semantic properties aren't, not even when I hit the refresh tab. See:


 * http://ec2-174-129-9-164.compute-1.amazonaws.com/wiki/Twitter_searches
 * http://ec2-174-129-9-164.compute-1.amazonaws.com/wiki/Adeletiblier
 * http://ec2-174-129-9-164.compute-1.amazonaws.com/wiki/Special:Browse/Adeletiblier

Any idea why this is going on? --Ryan lane 03:11, 17 July 2009 (UTC)

Btw, if I actually modify the page, it updates the semantic data. Simple purges don't work though. --Ryan lane 03:15, 17 July 2009 (UTC)


 * Yes, that's a definite problem when using External Data in conjunction with SMW - the data, once retrieved, is stored in the SMW tables, and after that it becomes hard to update it. Refreshing pages won't work, because that doesn't lead to any database updates. At the moment, I think the best solution (and the one that some people already use) is to create a cron job to run the SMW script "SMW_refreshData.php" (which does update the database) at a regular interval, like once a day. I should add something to the extension page about that...


 * Ah, this solves my problem, thanks! --Ryan lane 13:51, 17 July 2009 (UTC)


 * By the way, your site's a great demonstration of the possibilities of External Data and SMW together - I might end up using it for demo purposes, assuming it stays up... Yaron Koren 13:44, 17 July 2009 (UTC)


 * Amazon EC2 is expensive ;). I put it up for demo purposes for a talk I'm giving at BarCampNOLA on community software development where SMW is the backend, SF is the frontend, templates are business logic, and Plotters and Gadgets are user writable javascript. It is unlikely I'll keep the site up unless I find some people who want to co-host (and co-pay). At some point I'll probably get a VPS. I have the database and wiki files saved, so I'll let you know the new site when I bring it up. --Ryan lane 13:51, 17 July 2009 (UTC)


 * Ah, okay. I'm glad to hear that you're spreading the word about the "extended SMW" system. By the way, you can host a wiki that uses all those extensions for free on Referata. Yaron Koren 14:19, 17 July 2009 (UTC)

Allowing non-REST APIs?
I'm sure I know the answer to this, but would it be possible for this to support non-REST APIs in the future? Something like SOAP or XML-RPC? --Ryan lane 13:57, 17 July 2009 (UTC)


 * I don't see why it wouldn't be possible... it just needs to be implemented, including coming up with a good way to express all the data being sent, via a parser function call (presumably something like #get_soap_data, etc.) Yaron Koren 14:21, 17 July 2009 (UTC)


 * Awesome. I may take a crack at this at some point in the future... --Ryan lane 17:37, 17 July 2009 (UTC)

Error when running SMW_refreshData.php
I'm getting the following error when I run SMW_refreshData.php:

Fatal error: Call to a member function getText on a non-object in /var/www/w/extensions/ExternalData/ED_ParserFunctions.php on line 20

Which is: $cur_page_name = $wgTitle->getText();

Do I have something misconfigured? --Ryan lane 20:39, 17 July 2009 (UTC)


 * Oh yes, sorry about that - this has come up as an issue before. There's a fix that needs to be applied to Semantic MediaWiki to get this working. In the file "/includes/storage/SMW_SQLStore2.php", under line 1278 ("foreach ($titles as $title) {"), you should add the following two lines:

global $wgTitle;
$wgTitle = $title;
 * I should add that to the documentation as well; and of course add the patch to SMW. Yaron Koren 15:30, 18 July 2009 (UTC)


 * Cool. That works thanks! --Ryan lane 03:10, 19 July 2009 (UTC)

Limit #get_external_data to specific URLs
For security reasons, it would be nice to limit the external data to trusted sites only. Would this be possible? If I gave a patch for this, would you accept? --Ryan lane 19:34, 27 July 2009 (UTC)


 * Hm, that's a good idea - how about an "$edgAllowExternalDataFrom" variable, that holds an array of domain names, akin to MediaWiki's $wgAllowExternalImagesFrom setting? I'd definitely accept that. Presumably, if the value were null, all URLs would be accepted. Yaron Koren 19:42, 27 July 2009 (UTC)


 * That sounds perfect to me. --Ryan lane 20:05, 27 July 2009 (UTC)
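
As a rough illustration of the whitelist idea discussed in this thread, here is a minimal sketch in Python (not the extension's PHP, and not its actual implementation). The function name `is_url_allowed` and the choice to accept subdomains are assumptions made for the example; only the "null means accept everything" rule comes from the proposal above.

```python
from urllib.parse import urlparse


def is_url_allowed(url, allowed_domains=None):
    """Return True if the URL's host matches one of allowed_domains.

    allowed_domains=None means no restriction, mirroring the proposal
    that a null setting accepts all URLs.
    """
    if allowed_domains is None:
        return True
    host = (urlparse(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower()
        # Assumption: a listed domain also covers its subdomains.
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

With `["discoursedb.org"]` as the list, a URL on discoursedb.org passes, while a look-alike host such as notdiscoursedb.org is rejected because the match is made on whole domain labels rather than on a plain string suffix.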

Internationalization
Looks like the error messages are echoed directly in English without i18n support. Can I send you a patch adding support?

Btw, I have commit access, do you prefer that I send patches, or commit and let you revert things that don't fit properly? Either way I'll ask first about changes I'd like to make. --Ryan lane 20:55, 27 July 2009 (UTC)


 * Yeah, that's true, I never translated all the error messages; which would be a good idea. If you have commit access, I think it's definitely better if you check in changes directly; although it's good of you to ask first (most people don't bother :) ). Yaron Koren 22:39, 27 July 2009 (UTC)

Nothing Displays
When I use the example on your site I get nothing displaying.

I tried running SMW_refreshData.php and had to apply the above patch, but I now get the following error: PHP Notice: Undefined offset:  1 in /opt/apps/itwiki/itwiki/extensions/ExternalData/ED_ParserFunctions.php on line 56

I have also tried accessing a json source and I get the following string displayed on the page: UNIQb8b7293751eb2cb-pre-00000000-QINU


 * My guess is that there's an error in your #get_external_data call - all parameters after the first two need to contain either '=' or '=='. You've definitely uncovered a bug, though - the error-handling should be more graceful. The second problem sounds bad, too - maybe it's caused by the same error? I hope so. Yaron Koren 13:42, 28 August 2009 (UTC)

I upgraded Semantic MediaWiki from 1.4.2 to 1.4.3 (MediaWiki 1.14). It now works with JSON external data, but not with the CSV Germany example. Is there a page listing tested versions of the dependent apps that I can check? -JS 12:42, 31 August 2009 (UTC)


 * Sorry, what do you mean by tested versions and dependent apps? The Discourse DB Germany page is, of course, a demo page... Yaron Koren 19:57, 31 August 2009 (UTC)

E.g., for version x of External Data, the following versions of MediaWiki, Semantic MediaWiki, etc. are needed or have been tested. I also had to add JSON support for PHP on my server. -JS 20:38, 31 August 2009 (UTC)


 * I see. No, there isn't... the CSV format is a special case, since in SMW 1.4.3 that format began to be printed out with a header row; so the format in #get_external_data should be 'csv with header' instead of 'csv'. Yaron Koren 21:53, 31 August 2009 (UTC)

I was experiencing this error earlier and was able to find the source of my problem. In my call for data from MySQL, the last line of the call is "|data=your_variable=table_column,etc=etc...". My mistake was that I had removed the "data=" from the call, so it read "|your_variable=table_column,etc=etc...". I hope this might help you. -Brandon Collins
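
To make the mapping syntax concrete, here is a minimal Python sketch of how a "local_variable=external_field" list could be parsed and validated. It is not the extension's code; `parse_mappings` is a hypothetical helper. The link to the "Undefined offset: 1" notice is only a guess in line with the reply above: splitting an entry on '=' yields a single piece when the '=' is missing, so reading the second piece fails.

```python
def parse_mappings(spec):
    """Turn 'local_var=external_field,other=field2' into a dict.

    Raises ValueError for any entry without '=' rather than failing
    later when the missing second piece is read.
    """
    mappings = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if "=" not in entry:
            raise ValueError("mapping {!r} is missing '='".format(entry))
        local_name, external_name = entry.split("=", 1)
        mappings[local_name.strip()] = external_name.strip()
    return mappings
```

Checking each entry up front gives a readable error message instead of a PHP-style notice deep inside the parsing code.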

Hi, back with another error, this time trying to use the LDAP source: [Thu Oct 15 22:56:59 2009] [error] [client 10.0.3.213] PHP Notice: Undefined offset: 1 in /wiki/extensions/ExternalData/ED_Utils.php on line 46, referer: http://wiki/index.php/External_Data_Testing

[Thu Oct 15 22:56:59 2009] [error] [client 10.0.3.213] PHP Warning: ldap_search [function.ldap-search]: Search: No such object in /wiki/extensions/ExternalData/ED_Utils.php on line 103, referer: http://wiki/index.php/External_Data_Testing

[Thu Oct 15 22:56:59 2009] [error] [client 10.0.3.213] PHP Warning: ldap_get_entries: supplied argument is not a valid ldap result resource in /wiki/extensions/ExternalData/ED_Utils.php on line 104, referer: http://wiki/index.php/External_Data_Testing thanks JS
 * Hi.... can you paste your #get_ldap_data call here? - Borofkin 06:03, 26 October 2009 (UTC)

thanks -JS

CSV example table is garbled
I am trying to use the csv example below and the rendered tables is garbled as shown below. --RichardMcMahon 00:48, 29 November 2009 (UTC)

http://discoursedb.org/wiki/Fruits_table

Data was retrieved from the fruits data page, using the External Data extension.

All fruits (retrieved from the URL http://discoursedb.org/wiki/Special:GetData/Fruits_data, with links added to the fruit names as a demonstration):


 * Hi - the instructions should be clearer about this, but all you need to do is to create a page called "Template:!" that contains a single character = "|". Yaron Koren 14:19, 29 November 2009 (UTC)


 * Thanks, That fixed my problem. --RichardMcMahon 15:28, 29 November 2009 (UTC)

Special characters in data values
Hi Yaron

I am running (again :) ) into a unique situation.

I need to create links using values from an external data source. Problem is : these values contain both spaces and '[]' characters. This breaks the wiki syntax for creating links.

I tried using '{{urlencode' on values, but it seems that command is not interpreted inside '#for_external_table:' (all I get is the literal string '{{urlencode:{{{term}}}}}' if 'term' is my variable from 'get_external_data').

Any idea on how to resolve this ? or is that something missing in External data itself ?

- Laurent Alquier


 * I ended up adding a piece of code to ED_ParserFunctions.php to apply 'urlencode' to values of variables with a '.url' prefix in their name. So '|Term.url=term' will encode values of 'term' while '|Term=term' will not. I can send you the patch if you want it. - Laurent Alquier


 * Thanks for the idea! I implemented the same concept in the latest version, 0.9.2, but in a different way - variables are mapped normally, but then called using "{{{term.urlencode}}}". Yaron Koren 03:01, 11 January 2010 (UTC)


 * Cool. I prefer this approach a lot more ! Thanks for including it in your code. - Laurent
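
For readers unsure what URL-encoding does to such values, here is a small Python sketch. It only illustrates the encoding itself, not the extension's ".urlencode" implementation; the base URL and the sample term are invented for the example.

```python
from urllib.parse import quote_plus


def build_link_target(base_url, term):
    """Append a URL-encoded term to base_url.

    Spaces become '+' and square brackets become %5B / %5D, so the
    result no longer contains characters that break wiki link syntax.
    """
    return base_url + quote_plus(term)


# Hypothetical base URL and term, for illustration only.
target = build_link_target("http://example.org/search?q=", "heart [MeSH] disease")
# target == "http://example.org/search?q=heart+%5BMeSH%5D+disease"
```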

JSON
Hello! I'm trying to import some data from the Google Translate API, which returns the results in JSON format. This is my code:

I don't know if the URL is well encoded for External Data. I have tried all the possibilities I can think of, and nothing showed. Could you give me any hints? Thank you very much.

-- Jaime


 * Ooh, sorry about that - you've encountered a bug, for JSON pages that contain tags that aren't all lowercase. This will be fixed in the next version of External Data, but for now you can fix it in your local version by adding a line to ED_Utils.php - above line 258, which is:

if( array_key_exists( $key, $retrieved_values )

...you should add the following:

$key = strtolower( $key );

Yaron Koren 04:21, 7 December 2009 (UTC)


 * thank you very much Yaron!! it worked :) -- Jaime
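
The effect of the one-line fix can be sketched in Python: normalising each key to lowercase before comparing means that mixed-case JSON keys are still found. This is an illustration of the idea only, not a port of ED_Utils.php; `collect_values` and the sample response are assumptions for the example.

```python
import json


def collect_values(node, wanted_key, found=None):
    """Recursively gather every scalar stored under wanted_key, ignoring case."""
    if found is None:
        found = []
    wanted = wanted_key.lower()
    if isinstance(node, dict):
        for key, value in node.items():
            # Lowercase the key before comparing, as in the fix above.
            if key.lower() == wanted and not isinstance(value, (dict, list)):
                found.append(value)
            collect_values(value, wanted_key, found)
    elif isinstance(node, list):
        for item in node:
            collect_values(item, wanted_key, found)
    return found


# Made-up response with a mixed-case key, for illustration only.
sample = '{"responseData": {"translatedText": "hola"}, "responseStatus": 200}'
values = collect_values(json.loads(sample), "translatedtext")
```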

Conflict with NTLM authentication
Hi Yaron

I discovered something you may want to add to 'Common problems'.


 * 1) get_external_data fails if you are using a remote wiki protected by Windows NTLM authentication scheme.

I had to exclude '/wiki/Special:GetData' from authentication in order to get it to work correctly.

This is probably due to the way the NTLM handshake works (it first sends two HTTP requests with status 401 before getting to the actual page).

- Laurent Alquier


 * Sorry, is this for any #get_external_data calls, or just ones to Special:GetData? Yaron Koren 04:03, 11 January 2010 (UTC)

Loading data from a JSON api
Hello Yaron!

I'm trying to load data from a json response but don't know how to manage an array. Here is an example of response:


 * {"responseData": {"results":[{"url":"http://www.koiora.net/wp-content/uploads/2009/04/perro.gif","visibleUrl":"www.koiora.net"},{"url":"http://zaragozaciudad.net/tejeros/upload/20080414122439-perro25.jpg","visibleUrl":"zaragozaciudad.net"}],"estimatedResultCount":"890000"}, "responseDetails": null, "responseStatus": 200}

1) How can I get the second url field?

2) With the following code I get one url field:



but it ends showing this way:


 * 

... When are the html tags added???

Thanks!!! - Jaime


 * Hi - for 1), you should use #for_external_table instead of #external_value. For 2), do you mean that that HTML is literally what you see on the screen? If so, that's bad... Yaron Koren 00:57, 22 January 2010 (UTC)


 * 1) perfect!!! 2) I've just realized that the html tag is there because mediawiki wraps the image url to show it; I'll have to avoid this behavior in order to load the data into a semantic form field.... Thank you very much! :) - Jaime
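
The difference between a single value and a table of values can be sketched in Python using the response quoted in this thread: each element of the "results" array is one row, and the second "url" is simply the value in the second row. This only mirrors the idea behind #for_external_table; the function name `result_rows` is an assumption for the example.

```python
import json

# The response from this thread, with balanced braces.
RESPONSE = '''{"responseData": {"results": [
  {"url": "http://www.koiora.net/wp-content/uploads/2009/04/perro.gif",
   "visibleUrl": "www.koiora.net"},
  {"url": "http://zaragozaciudad.net/tejeros/upload/20080414122439-perro25.jpg",
   "visibleUrl": "zaragozaciudad.net"}],
  "estimatedResultCount": "890000"},
 "responseDetails": null, "responseStatus": 200}'''


def result_rows(text):
    """Return one (url, visibleUrl) tuple per entry in the results array."""
    data = json.loads(text)
    return [(r["url"], r["visibleUrl"]) for r in data["responseData"]["results"]]


rows = result_rows(RESPONSE)
second_url = rows[1][0]  # the "url" field of the second result
```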

Not working? CSV #for_external_table
Hello - I've tried the following:

and then

We've created the Template:! and still we see no data - just an empty table. We've tried no local variable and name='1'; name=1; name="1"; etc. Nothing works. Are we doing something silly? Installation is confirmed and our MW is 1.13.2 --Robinson Weijman 14:37, 28 January 2010 (UTC)


 * Hi - it should be "name=1", without quotation marks - I should clarify that in the documentation. I would try simplifying the call - get rid of all the formatting, and see if that works. If it doesn't, what happens if you just call ""? If that doesn't work either, my guess is that the URL somehow isn't getting accessed. Yaron Koren 15:50, 28 January 2010 (UTC)


 * Thanks for the quick reply. I removed the quotes - did not help.  I tried just  - again nothing.  The URL did work with a previous implementation when we used iFrame - so it should be accessible via the wiki.  I just tried $wgHTTPTimeout = 20; - did not help either.  What formatting should I get rid of?  --Robinson Weijman 16:13, 28 January 2010 (UTC)


 * Okay, my strong guess is that that URL can't be accessed. It's the PHP code, i.e. the server on which the wiki sits, that has to access that URL - and just because you can see the URL doesn't mean that the server can. My guess is that that's somehow the issue. Yaron Koren 21:02, 28 January 2010 (UTC)


 * Thanks again. It's a UNIX server - how can I test your guess?  Strange that it worked fine via the iFrame extension.  --Robinson Weijman 08:16, 29 January 2010 (UTC)

With the iFrame extension, the server isn't accessing the URL. You could try running a PHP script on the server like the ones on this page... Yaron Koren 16:55, 29 January 2010 (UTC)


 * Thanks again! I'll check it out.  --Robinson Weijman 08:41, 1 February 2010 (UTC)
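
One way to check basic reachability from the wiki's server is a short script run on that machine. The sketch below uses Python rather than PHP, so it tests only whether the server can reach the URL at all; it says nothing about PHP-specific settings. The function name `check_url` is an assumption, and the 20-second timeout just echoes the $wgHTTPTimeout value tried above.

```python
import urllib.error
import urllib.request


def check_url(url, timeout=20):
    """Try to fetch url from this machine; return (ok, detail)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read()
        return True, "fetched {} bytes".format(len(body))
    except (urllib.error.URLError, ValueError, OSError) as err:
        # Covers DNS failures, refused connections, timeouts, HTTP errors
        # and malformed URLs.
        return False, "failed: {}".format(err)


# Usage on the server (replace the argument with the URL used in the wiki call):
#   print(check_url("http://..."))
```

If this fails on the server while the same URL opens in a browser, that points to the server's network access (firewall, proxy, DNS) rather than to the wiki markup.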

Retrieving Data via ssl - possible?
Hi,

I am trying to retrieve data from an external source using a non-trusted certificate via ssl (https://example.com/xyz.csv), thus generating no output. Are there any known issues or hints you can give me?

Thanks a lot for your efforts!

--82.113.106.207 20:27, 23 February 2010 (UTC)


 * Sorry, I don't know enough about SSL to answer that question, though you're probably right that that's what's causing the problem. I would try creating a PHP script, on the server running the wiki, that just retrieves the text of that URL. If you can get that working, send me the code of that script and I'll see if I can fit it into the extension somehow. Yaron Koren 00:32, 24 February 2010 (UTC)


 * Hi, solution was quite easy: I set $edgAllowSSL = false; in the ExternalData.php.


 * Yet another problem arose afterwards: the script needs to log into the wiki, as I have restrictive rights management. I am trying to do that using curl hardcoded into your extension; as soon as I get it working, I will post the solution here. --80.153.59.17 09:54, 24 February 2010 (UTC)


 * OK, to access a MediaWiki installation over SSL with read access restricted to registered users, I successfully did the following:
 * Created a user 'ScriptDummy' in the wiki
 * Created a file 'mwcookies.txt' and gave the apache (using Linux) rights to read and write the file
 * Replaced in ED_Utils.php (ED version 0.9.2)

return Http::get( $url, 'default', array(CURLOPT_SSL_VERIFYPEER => false) );
 * with

Http::post(
    "https:///index.php?title=Special:Userlogin&action=submitlogin&type=login",
    'default',
    array(
        CURLOPT_SSL_VERIFYPEER => false,
        CURLOPT_POST => 1,
        CURLOPT_POSTFIELDS => "wpName=&wpPassword=&wpLoginattempt=Login",
        CURLOPT_COOKIEJAR => "//mwcookie.txt",
        CURLOPT_COOKIEFILE => "//mwcookie.txt",
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => 1
    )
);
return Http::get(
    $url,
    'default',
    array(
        CURLOPT_SSL_VERIFYPEER => false,
        CURLOPT_COOKIEFILE => "//mwcookie.txt"
    )
);
 * Working fine - thanks Yaron! --80.153.59.17 11:05, 24 February 2010 (UTC)


 * Wow, okay. That's great that you got it working. I'll try to come up with some generic solution that allows for approaches like yours. Yaron Koren 17:56, 24 February 2010 (UTC)

{{#for_external_table:
Could this also be used to return multiple lines from an #ask query into different rows of a table? I can't for the life of me figure out how to pass an #ask query through a template into a real table. The only thing available seems to be the default. Having more options for a ResultFormat for custom tables would be nice. Dmoorevtedu 01:08, 18 March 2010 (UTC)


 * Sure, there's no reason why a wiki couldn't retrieve its own data. Yaron Koren 13:14, 18 March 2010 (UTC)

error "" is not a valid magic thingie for "get_external_data", "get_ldap_data", ...
Running the latest SVN version (r64385) on MW 1.16.0beta1, first I noticed that the assignment $edgIP = $IP. makes the assumption that ExternalData is installed as a subdirectory of $IP (=normally mw/extensions/). This assumption is not made in other extensions. Also, the error_log is full of "" is not a valid magic thingie for "get_external_data" etc. errors. Probably something with the parser or internationalization messages? I cannot track it down, any help very much appreciated. Thanks, Wolfgang Spraul 88.198.75.226 12:33, 30 March 2010 (UTC)

Update: I think the 'magic thingie' error messages are a consequence of the $edgIP setting. I had a symlink from extensions/ExternalData but probably symlinks aren't followed elsewhere. I think $edgIP should not be based off of $IP, it should be determined from the path of the current file (ExternalData.php). Hard-wiring $edgIP to my ExternalData folder fixes this for me for now... Wolfgang Spraul 88.198.75.226 13:14, 30 March 2010 (UTC)


 * Thanks for this bug report - this is now fixed (I hope) in SVN. Yaron Koren 15:57, 30 March 2010 (UTC)


 * Great! I can confirm that svn r64415 fixes the hardcoded path issue and the magic thingie errors that may result from it. Wolfgang Spraul 88.198.75.226 01:37, 31 March 2010 (UTC)


 * Excellent. Yaron Koren 16:43, 31 March 2010 (UTC)

str_getcsv can only parse one line, not an entire file (multi-line)
This might be another bug (r64415 now). In ED_Utils.php, if str_getcsv is present (>=PHP 5.3.0), it's called to parse the entire multi-line csv. However, according to the documentation and in reality, it can only parse one line. See http://php.net/manual/en/function.str-getcsv.php. For now I have just commented out that function and the code goes back through the old write-to-file loop. str_getcsv can be used but some sort of loop needs to go around it I think. There are many examples at the bottom of the PHP documentation page. Wolfgang Spraul 88.198.75.226 09:25, 1 April 2010 (UTC)


 * Thanks for letting me know about that - this has now been fixed, in the new version (1.0). Yaron Koren 20:56, 3 May 2010 (UTC)
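
The "loop around the parser" idea from this thread can be sketched in Python, whose csv reader works row by row over a multi-line string. This is not the fix that went into version 1.0, just an illustration of parsing a whole CSV text one record at a time; the sample data is invented.

```python
import csv
import io


def parse_csv_text(text):
    """Parse a whole multi-line CSV string into a list of rows."""
    # The reader yields one record per iteration, which plays the role of
    # the loop that a single-line parser needs around it.
    reader = csv.reader(io.StringIO(text, newline=""))
    return [row for row in reader if row]


# Invented sample with a quoted field that contains a comma.
sample = 'name,color\napple,red\n"plum, ripe",purple\n'
rows = parse_csv_text(sample)
```

Splitting the text on newlines and parsing each piece separately is the obvious alternative, but it breaks on quoted fields that contain line breaks, which is why the sketch hands the whole text to a record-aware reader.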

PHP warning when Caching is turned on.
Hi Yaron

I am using the latest version (1.0) and I am getting these errors from PHP when trying to retrieve cached external data queries:

Notice: Undefined variable: res in $IP\extensions\ExternalData\ED_Utils.php on line 334

Notice: Trying to get property of non-object in $IP\extensions\ExternalData\ED_Utils.php on line 335

I am looking at the code, but so far I am not sure what $res refers to.

- Laurent Alquier


 * That is odd - apparently that bug has been there for a while. What happens if you just comment out line 334? Yaron Koren 19:46, 5 May 2010 (UTC)
 * The cache seems to be working without warning when that line is commented out :) I guess not a lot of people are using cached queries (they should).
 * Another point worth mentioning is that the parameter for cached table name is the name of the table *before* adding any database prefix defined in LocalSettings.php - Laurent


 * Alright, cool. Yes, that's a good point as well. Yaron Koren 22:12, 5 May 2010 (UTC)