Extension talk:External Data/Archive 2014

#display_external_table:... lower case parameter names
Hi, I was in the process of updating the documentation for External Data but I got stuck with a question. It seems that  can only handle lower case parameter names that are defined in. Is this by design or is this a "bug"?

When you alter any of the letters in parameter to upper case like paraMeter  works just fine but   stops working. Example:

--Jongfeli (talk) 13:22, 30 January 2014 (UTC)


 * That definitely sounds like a bug, yes. Yaron Koren (talk) 16:49, 30 January 2014 (UTC)


 * Well Yaron, I was a little to fast with this one. We where not running the latest version of External Data and it was already fixed "ages" ago, thanks. Sorry for the "trouble" :) --Felipe (talk) 07:24, 31 January 2014 (UTC)


 * Great, that's a relief! Yaron Koren (talk) 13:02, 31 January 2014 (UTC)

Basic auth in conjunction with External Data?
First of all: great extension! So many possibilities! :-) As of now, I’m using External_data to access Semantic data on another wiki. Took me some time, but I think I tackled all of my problems - at least most of them. Next step for me is accessing xml and json data from some of our university’s web services. Problem is, I need basic auth to access these files. Is this even possible? If so, could anyone give me a clue as to where to start looking?

-- Tobi (talk) 10:43, 19 February 2014 (UTC)


 * Sorry for the very long delay on this. No, it's not possible - see below for one idea on how it could be implemented. Yaron Koren (talk) 12:27, 14 March 2014 (UTC)

ldap tabular result
Extension:External_Data says: It currently only handles a single row of results, and so is most appropriate for querying directories for data about individual users. For our use case we'd need tabular results. What needs to be done to modify the extensions behavior appropriately? --Seppl2013 (talk) 10:31, 22 February 2014 (UTC)


 * Sorry for the long delay. Unfortunately, I don't know much about the LDAP-accessing portion - all that code was written by someone else. Hopefully it's not hard to do. Yaron Koren (talk) 12:28, 14 March 2014 (UTC)

Using #get_web_data to retrieve data from file on the same wiki where login is required to read
If I understand things correctly, then if you have a wiki where users must log in in order to read, then the External Data won't be able to read the files that have been uploaded. I've seen the discussion here where you mention the only ways around this are to either create a web page to log into the wiki or use #get_db_data.

Is there some way I could patch the extension to get around this? One possibility that comes to mind is to create a separate account for the extension and then use these credentials for it to log in. This seems like a bit of a hack since the extension is part of the wiki but doesn't have permission to actually access it. Is there some other approach I should be looking into?

--ChrisDavis (talk) 12:56, 12 March 2014 (UTC)


 * It would be great if External Data could support something like that. It would most likely require using an account to log in, yes, as hack-ish as that sounds. I'm imagining something like the setup for #get_db_data, where the username and password, and possibly a login URL as well, are stored in LocalSettings.php for each site/wiki being accessed. I don't know hard or easy that would be to implement, especially given the different kinds of authentication setups different kinds of sites have. Yaron Koren (talk) 12:27, 14 March 2014 (UTC)

Use as RSS reader
I am thinking of using this as an RSS reader since existing RSS readers have various issues.

What's is the method to limit the number of posts with this extension? Acnetj (talk) 16:54, 23 March 2014 (UTC)


 * That's an interesting usage; if you can get it working, that's great. Unfortunately, there's no current way to limit results, although this would be easy to accomplish, by just adding a "limit=" parameter to #get_web_data, in the same way that #get_db_data has one now. Is that the only obstacle you know of to allowing External Data to serve as an RSS reader? Yaron Koren (talk) 17:49, 23 March 2014 (UTC)


 * I can get it working except that I cannot limit the data. The other thing is that the date and time from the RSS cannot be reformatted through existing functions.203.185.14.234 06:05, 24 March 2014 (UTC)


 * Right now I am switching to js based RSS reader through Widget and that gives me a better result.Acnetj (talk) 16:06, 24 March 2014 (UTC)


 * Alright, cool. Adding a "limit=" parameter sounds like a good idea in any case, though. Yaron Koren (talk) 16:26, 24 March 2014 (UTC)

Oracle database
I'm having problems setting up/verifying my connection to an external Oracle database. The database is local, Oracle XE 11g. I'm using BitNami MediaWiki 1.22.5. (And I'm a newbie at this)

This is what I have in LocalSettings.php: $edgDBServer['XE'] = "jdbc:oracle:thin:@//:1521/XE"; $edgDBServerType['XE'] = "oracle"; $edgDBName['XE'] = "XE";

This is my query:

I'm sure that the $edgDBServer value is wrong, but I have tried all kinds of combinations, with and without jdbc:oracle:thin, with and without /XE, and can't figure out what this should be. I either get an error about compiling php --with-oci8 option or "(Cannot contact the database server)". I can log into the database using PL/SQL Developer and SQL*Plus using the same user and password.

How do I figure out what the structure of this value should be?

Thanks! Parrotheadcali (talk) 15:46, 17 April 2014 (UTC)

#get_web_data - CSV Filters
So this might be a naive question but I have a CSV file with the following structure and data:

Here is what I have on the Wiki page:

So my question:

Is there a way to use a greater than; less than, etc filter instead of a filter that shows an equal comparison?

Thank you. Clifford.kasper (talk) 21:09, 8 May 2014 (UTC)


 * Unfortunately, no - the filtering capability of #get_web_data is quite limited. You'd have to do that kind of filtering on the server side (i.e., some kind of dynamic API to generate the CSV), or via SMW, by storing the CSV data as internal semantic data and then querying on that. Yaron Koren (talk) 21:26, 8 May 2014 (UTC)


 * Thanks for the information Yaron. --Clifford.kasper (talk) 13:50, 23 May 2014 (UTC)

Just copying every line of the input file to the page?
This is a great extension, but I've found some problems to make it work properly. I have a mediawiki which I'm feeding with the results of my PhD. I'd like to include the files with the raw results (csv), and then having them shown in a chart in the the page. So, I need just to copy every line in the input file into the page. To do this, I'm doing:





///////////example lines contained in the csv: ,80%,60%,40%,20% 0,1,3,5,8 ///////////

Which is not working. However, if I input directly the lines, it works (so, looks to me like a problem with externalData :S). In the page of the extension is stated that copying every row is supported directly by for_external_table, but nowhere is explained how to do it. Could somebody shed some light on this point?

Thank you.


 * I'm not familiar with &lt;pLines&gt;, but I'm guessing that it's a tag defined by a MediaWiki extension. The problem with such tag functions is that they're called before their contents are parsed - so what you're passing in to it is literally the string "#get_web_data...". Thankfully, there's an easy solution: you can use MediaWiki's built-in parser function #tag to call tag functions in a way so that their contents will be parsed correctly. Yaron Koren (talk) 12:35, 15 May 2014 (UTC)

There is a section on the pChart4mw extension page that shows how to use the charting functionality with parser functions. I think this will help you:


 * 1) Use the External Data call outside of the charting
 * 2) Use the Parser Function syntax

That should work for you. --Clifford.kasper (talk) 14:03, 23 May 2014 (UTC)

Issue with internal CSV lists since upgrade [RESOLVED]
My host just upgraded to PHP 5.4.4 and MySQL 5.5.37 and something odd happened after that. I'm using to get values internally from a CSV list on a wiki page (between includeonly tags). Those values are called through or  to produce a tabular output. This was working fine before the upgrade, but soon after this all values in the first column of each table went missing. It's possible to cheat one's way out of it by adding a dummy column in front, so in this case by adding a comma at the beginning of every row of the comma-separated list. Clearly something goes wrong with the way that list is read or parsed. Cavila (MW 1.22, MySQL 5.5.37-0, Php 5.4.4-14 squeeze, SMW 1.9.2, SF 2.7) 08:55, 20 May 2014 (UTC)


 * No idea - could you include here the calls you're making, and/or the data set? Yaron Koren (talk) 15:14, 20 May 2014 (UTC)

Yaron has looked into this and submitted a fix to ED_Utils.php (git link). Thanks for this! Cavila (MW 1.22, MySQL 5.5.37-0, Php 5.4.4-14 squeeze, SMW 1.9.2, SF 2.7) 10:38, 24 May 2014 (UTC)

{{#iferror: broken with this extension
@Yaron_Koren The error message is not wrapped in something with class="error" so I can't feed it through {{#iferror: and get useful results (demonstration).

Also, https://git.wikimedia.org/metrics/mediawiki%2Fextensions%2FExternal_Data which is linked from the main page does not work, and https://github.com/wikimedia/mediawiki-extensions-ExternalData is a mirror which I can't add issues to, so it's slightly tricky to find a better place to report this.


 * Oh, I never thought about that #iferror thing - that's a good point. The places to report bugs are here, or on Bugzilla, or on the SMW mailing list. Yaron Koren (talk) 01:08, 16 July 2014 (UTC)


 * Okay, reported on bugzilla (and signed up to SMW mailing list, but I figure reporting on both is excessive). --etesp (talk) 11:36, 16 July 2014 (UTC)


 * Yes, reporting it three times would probably be excessive. Yaron Koren (talk) 13:58, 16 July 2014 (UTC)

History
Does this extension offer a standard way to preserve old external data upon updating it, that is, to make  add new internal objects rather than overwrite old ones? If it does, will old objects be preserved after running ? Alex Mashin (talk) 08:27, 12 August 2014 (UTC)


 * No. Yaron Koren (talk) 14:48, 12 August 2014 (UTC)

Caching (probably) issue
Hi,

I just set up this extension on my wiki (MW 1.17.0). All works well, but I ran into an issue. I've set up a template with the following content


 * Template:GetIfSpeed

I then call to this template from within a normal page as. This returns to me the corresponding value for the id from backend db. No issues there.

However... if I call to this template twice within the same page, with two different IDs — IDs that should return two distinct values — I get the value returned by initial call as two output values. For example, consider the following.

The above inclusion in any page should yield two different values. Instead, I get the value returned by the first inclusion as two output values. Use one at time, I get different values.

I do have MagicNoCache (v1.3.0) and did try using __NOCACHE__ in both the template and the page in which I invoked the template. Result remains the same.

Any clues as to what is causing this, and how it can be fixed? --Wikimantis (talk) 18:16, 13 August 2014 (UTC)


 * No, it's not a caching issue - rather, the issue (I'm pretty sure) is the 2nd #get_db_data call doesn't override the data from the 1st one, but just appends to it; so the first value for that variable will remain the same both times. The way to get around it is to add " " to the end of the template. Yaron Koren (talk) 20:18, 13 August 2014 (UTC)


 * Oh this information is actually in the extension page itself also. Sorry didn't go through that completely. Thanks anyways. Works like a charm now :) --Wikimantis (talk) 04:53, 14 August 2014 (UTC)

This could be a bug.
I have encountered something weird.

Consider the following:

Which yields the following valid SQL query

This is fine. Nothing wrong here.

But watch what happens when I throw in a second column.

Which yields below SQL query. Obviously this SQL query would fail against the database SELECT CASE WHEN ps.ifInOctets_rate < 125056 THEN CONCAT(ROUND((ps.ifInOctets_rate*8)/1024,2)," Kbps") END, CASE WHEN ps.ifOutOctets_rate < 125056 THEN CONCAT(ROUND((ps.ifInOctets_rate*8)/1024,2)," Kbps") END FROM devices dv INNER JOIN `ports` p ON dv.device_id=p.device_id INNER JOIN `ports-state` ps ON p.port_id=ps.port_id WHERE dv.device_id=4 and p.ifType in ('ethernetCsmacd', 'pos' )

What I noticed is if I remove the nested parentheses from the second SQL field, extension variable (highlighted above in yellow) no longer exists in the generated SQL statement. Change this: CONCAT(ROUND((ps.ifInOctets_rate*8)/1024,2)," Kbps") To:  CONCAT(ROUNDps.ifInOctets_rate*8/1024,2," Kbps")


 * I'm surprised that any of that worked, actually - I haven't heard before of someone embedding SQL logic into #get_db_data. I'm glad it works to any extent. Just to clarify - once you make that parentheses change, does the whole thing work? Yaron Koren (talk) 15:29, 20 August 2014 (UTC)


 * Well no, it wouldn't work... because SQL would then have syntax errors without those parenthesis. Oh and this is just a tiny portion of SQL I'm parsing. All of which works, except for this particular bit. --Wikimantis (talk) 15:54, 20 August 2014 (UTC)


 * Oh, I see. I have no idea, then - maybe create stored procedures for some or all of that, to make the #get_db_data call simpler? Yaron Koren (talk) 20:55, 20 August 2014 (UTC)


 * Oh yeah I have already opted in for custom functions within MySQL to handle that.


 * In case you (or anyone else for that matter) are wondering what's up with these these queries, I am actually using External Data extension alongside graphviz extension to visualize the network devices we monitor with Observium monitoring tool. Basically I use SQL to generate dot language output and then pass the info to graphviz. For instance, I would have a template in wiki, which takes in the device_id of any of the devices I have in Observium, then uses SQL to construct Dot statements.




 * Something like above gives me this output within my wiki. And so long as Observium is doing it's 5 minute SNMP walks, I am able to get a near-realtime visualisation of the device. Outward color is port output, the other half is port input. Hovering mouse over the link gives additional info. External Data extension is wonderful :) --Wikimantis (talk) 15:46, 21 August 2014 (UTC)


 * Wow! That's a very cool usage. Yaron Koren (talk) 23:38, 21 August 2014 (UTC)

Bug with CSV file processing
I think I've discovered a bug with the CSV file processing. It looks like #get_file_data wants each line of data to contain the specific number of commas, or else it doesn't write data to each of the data columns.

For example, consider the following CSV file:

When we look at the data (conveniently here as a table), it displays as expected: {{{!}} class="wikitable" ! Header 1 ! Header 2 ! Header 3 ! Header 4 ! Header 5
 * || 1 || 1 || 1 ||
 * 2 || 2 || 2 || 2 ||
 * 3 || || 3 || 3 || 3
 * 4 || || 4 || || 4
 * || 5 || || 5 ||
 * 7 || 7 || || 7 ||
 * }
 * 4 || || 4 || || 4
 * || 5 || || 5 ||
 * 7 || 7 || || 7 ||
 * }
 * 7 || 7 || || 7 ||
 * }
 * 7 || 7 || || 7 ||
 * }

However, when we modify the CSV file to make it less strict (i.e. have lines that contain fewer than 4 commas), the following occurs (notice the blank line between row "1" and "2"):

When imported as a table, it displays a little different (notice the first column): {{{!}} class="wikitable" ! Header 1 ! Header 2 ! Header 3 ! Header 4 ! Header 5
 * || 1 || 1 || 1 ||
 * || 2 || 2 || 2 ||
 * 2 || || 3 || 3 || 3
 * 3 || || 4 || || 4
 * 4 || 5 || || 5 ||
 * || 7 || || 7 ||
 * 7 || || || ||
 * }
 * 4 || 5 || || 5 ||
 * || 7 || || 7 ||
 * 7 || || || ||
 * }
 * || 7 || || 7 ||
 * 7 || || || ||
 * }
 * 7 || || || ||
 * }

It appears that only the blank line was "seen" (for lack of a better word) only by the Header 1 data, which caused that column to shift down, while the other columns were unaffected. This is inconsistent with the recommendations for CSV processing in RFC4180. You can see in Excel, or other CSV viewer, that when you open the second CSV, the blank line causes all columns to display a blank line, even though no commas are present in that line.

Furthermore, this problem isn't related to just blank lines. If you leave off trailing commas:

Then the table shows up as: {{{!}} class="wikitable" ! Header 1 ! Header 2 ! Header 3 ! Header 4 ! Header 5
 * || 1 || 1 || 1 ||3
 * 2 || 2 || 2 || 2 ||4
 * 3 || || 3 || 3 ||
 * 4 || || 4 || ||
 * || 5 || || 5 ||
 * || 7 || || 7 ||
 * 7 || || || ||
 * }
 * || 5 || || 5 ||
 * || 7 || || 7 ||
 * 7 || || || ||
 * }
 * 7 || || || ||
 * }
 * }

Notice the first and last columns. However, this table should look just like the first table. --Nick2253 (talk) 16:02, 5 September 2014 (UTC)


 * So External Data's processing of CSV is less "forgiving" than other CSV parsers'. Is that a problem? Yaron Koren (talk) 16:32, 5 September 2014 (UTC)


 * I probably shouldn't have used "strict"; it's not exactly the right word. There is no "formal" definition for the CSV format.  However, the de facto standard that's allowed by every CSV viewer I've been able to find processes lines that contain insufficient separators as shown in the first example.  And a number of older programs that I've used (including some used by my company today) don't export trailing commas.  In some way, External Data needs to handle this possiblity:  either processing the CSV correctly, refusing to process the CSV, or warning the user.  And, unfortunately, a warning only comes up if the number of data entries is greater than the number of header rows.


 * Now, I've been looking over the code, and I think a small tweak to the  function would fix the problem.  I've tested this, and it works for my setup, but I'm not an expert at PHP programming, so please double check this. --Nick2253 (talk) 22:06, 5 September 2014 (UTC)


 * Oh, I didn't know that some CSV-exporting programs don't output trailing commas. That seems like a mistake, but if that's the way it is then so be it. That indeed is not very good PHP (no offense), in that it would produce lots of "notice" messages, but I know how to change the code to get it to work without trailing commas. Yaron Koren (talk) 22:43, 5 September 2014 (UTC)
 * No offense taken! I don't have a whole lot of experience with PHP, so I'm happy to defer to the experts like yourself.  And you're right about it being a mistake.  The RFC for CSV says that lines "should" have the same number of delimiters, but unfortunately "should" != "must", and lazy programmers rejoice.  Thanks for following up! --Nick2253 (talk) 22:57, 5 September 2014 (UTC)


 * Alright, I just added this capability into the code in Git, so hopefully it will work now. Thanks for the patch you created - it definitely saved me some time when I went to make the change. Yaron Koren (talk) 01:11, 8 September 2014 (UTC)

Setting a semantic property using an external value and reflecting it on Semantic Form
Hi Yaron, I'm calling a webservice which returns a value, which I then #set as a semantic property. It works great and the factbox reflects the change.

However, when I edit the page, the associated semantic form does not reflect the updated value of the semantic property. Any hints?

Thanks! --Jnatividad (talk) 02:06, 5 December 2014 (UTC)


 * Hi - sorry, I don't understand; what should the form show? Yaron Koren (talk) 02:56, 5 December 2014 (UTC)


 * I have a template that uses ExternalData to call a webservice that auto-tags a PDF. In the associated Semantic Form, there are classification fields that with with certain allowed values that are displayed as checkboxes.


 * Right now, I parse the webservice result and do a #set to these classification fields (e.g. Region: has valid values of Global, Africa, Arab States, Europe, etc.). And it all seems to work, as the Factbox displays the values I set.


 * However, when I click Edit, the checkboxes are not checked.


 * To be specific, in the template, I do:




 * where tagmappings is "Region=Asia and the Pacific;Africa;Arab States;Global" returned by the auto-tagger webservice.


 * It displays perfectly in the Factbox, but when I edit the article, the "Asia and the Pacific; Africa; Arab States and Global" checkboxes are not set.


 * Any hints? Thanks in advance! --Jnatividad (talk) 19:06, 5 December 2014 (UTC)


 * I'm still not sure I understand - do you want this region data to come from the web service, or from the wiki page? If it's the former, why is there a field for it in the form at all? Or do you want the form/template to be able to override the external data? Yaron Koren (talk) 21:13, 5 December 2014 (UTC)


 * They want the ability to specify values for the fields manually. But if the auto-tagging service detects tags with a high degree of confidence, they want it to set those tags automatically.  For example, they manualy tagged an article as in Region "Global", but the auto-tagging service detected mentions of several countries in Africa.  The desired behavior is for the Region "Africa" to be set as well (which works per the factbox).  But then, when they edit the form, the field Region is only set to "Global".  --Jnatividad (talk) 23:04, 5 December 2014 (UTC)


 * Okay, so it's a combination of data from the page and from outside. But let me remind you that a form is just a mechanism for editing wikitext. Perhaps the solution is just to put in more explanatory information into the form, above or below the input, saying something like "the region values you input will be combined with data from X"; you could even have a #get_web_data call in the form, so that the user can see what their inputted data will be combined with. Yaron Koren (talk) 23:29, 5 December 2014 (UTC)

Problem with "CSV with header"
Hello there, I have a problem with the Arrays created by using the "CSV with header" function at #get_file_data. Is there any possibility to make a connection between the values of a single line or at least extract a single value out of the whole table? Because right now I can only use the Arrays of the column values. --Miles Fides (talk) 14:42, 16 December 2014 (UTC)


 * You can simply put all the values together within the #for_external_table call, if I understand the question correctly... Yaron Koren (talk) 16:05, 16 December 2014 (UTC)


 * I don't only want to display the values in a table but use the lines of this table for further tasks. For this, I have to declare new variables as an array for each line but I don't know how to do it. Right now, the local variables just make an array with the values of a column, not a line. Do you know, what I mean? --Miles Fides (talk) 17:10, 16 December 2014 (UTC)


 * I guess the solution might depend on what you want to do with those variables, but the best solution might be to store all the data via SMW, using #store_external_table - then you can really do a lot of processing and manipulation of the data. Yaron Koren (talk) 18:16, 16 December 2014 (UTC)

Directory Path doesn't work
I got another Problem by the way. Using the #get_file_data with the file-path, I have no problems with opening a csv-File. With the directory path though, I always get an error message that the File I want to open is not allowed for the directory. For example: When I write ..."file=Example"... with the Declaration $edgFilePath['Example'] = "$IP/any_folder/test.csv"; it works But When I write ..."directory=Example |file name=test.csv"... with the Declaration $edgDirectoryPath['Example'] = "$IP/any_folder"; it doesn't work anymore...so how could I use the Directory Path? --Miles Fides (talk) 15:57, 16 December 2014 (UTC)


 * Sorry for the delay - I just looked into this now. The documentation should probably spell this out, but you need to have the directory name be the full path that shows up when you call "pwd" within that directory. If your directory path in LocalSettings.php starts with "~", for instance. it won't work. I'm guessing that's the issue for you - could that be it? Yaron Koren (talk) 04:37, 24 December 2014 (UTC)

sql literals and #switch: or #ifeq:
When selecting a literal e.g. date=formality='Sie' and then trying to use a parserfunction to work on the result with a parser function like switch:

I get "unknown formality" - which is strange since the formality retrieved should be "Sie" or "'Sie'". It's even displayed as "Sie" but the content seems to be something else. What might be going on here? I tried to debug this by switching on $wgDebugDumpSql = true; but this has no effect. How can I find out what the actual sql statement is that is produced?

I tried debugging using the #len: string function after setting $wgPFEnableStringFunctions = true; and I get a length of 0 for - so the string function thinks the value is empty ... --Special:Contributions/Wolfgang Fahl 12:43, 23 December 2014 (UTC)


 * I don't think you can set a variable to a literal in any of the #get_..._data functions, by using quote marks - it's an interesting idea, but as far as I know it's not supported. Yaron Koren (talk) 14:24, 23 December 2014 (UTC)
 * Hi Yaron - thanks for answering column='value' is perfectly valid SQL - i assume this is not the base issue. I worked around the problem by looping diffently - i am using {{#display_external_table: now and there the value is transferred correctly. The trouble was in {{#for_external_table: the two seem to behave differently - Wolfgang Fahl--91.40.113.127 18:18, 23 December 2014 (UTC)


 * I know "column='value'" is valid SQL, but #get_web_data is not simply a wrapper around SQL. Anyway, I'm glad you found a solution. Yaron Koren (talk) 20:38, 23 December 2014 (UTC)