Extension talk:External Data/Archive 2011

From mediawiki.org

PHP Offset error

Hi, I installed 1.3.2, and on my Mac with the MAMP stack all went well.

Using the same pages on Windows 2008 R2 with IIS 7.5, MediaWiki 1.16.4, MySQL 5.2.17 (FastCGI) and PHP 5.3.1, I changed the URL to use the fullurl parser function, and when submitting the changes I get the following:

PHP Notice: Undefined offset: 1 in D:\www\IIS7\http-root\ict-test-wiki\extensions\ExternalData\ED_Utils.php on line 343
PHP Notice: Undefined offset: 1 in D:\www\IIS7\http-root\ict-test-wiki\extensions\ExternalData\ED_Utils.php on line 343
PHP Notice: Undefined offset: 1 in D:\www\IIS7\http-root\ict-test-wiki\extensions\ExternalData\ED_Utils.php on line 343
PHP Notice: Undefined offset: 1 in D:\www\IIS7\http-root\ict-test-wiki\extensions\ExternalData\ED_Utils.php on line 343
PHP Notice: Undefined offset: 1 in D:\www\IIS7\http-root\ict-test-.... . . . wiki\extensions\ExternalData\ED_Utils.php on line 343
PHP Warning: Cannot modify header information - headers already sent by (output started at D:\www\IIS7\http-root\ict-test-wiki\extensions\ExternalData\ED_Utils.php:343) in D:\www\IIS7\http-root\ict-test-wiki\includes\WebResponse.php on line 16
PHP Warning: Cannot modify header information - headers already sent by (output started at D:\www\IIS7\http-root\ict-test-wiki\extensions\ExternalData\ED_Utils.php:343) in D:\www\IIS7\http-root\ict-test-wiki\includes\WebResponse.php on line 16
PHP Warning: Cannot modify header information - headers already sent by (output started at D:\www\IIS7\http-root\ict-test-wiki\extensions\ExternalData\ED_Utils.php:343) in D:\www\IIS7\http-root\ict-test-wiki\includes\WebResponse.php on line 16
PHP Warning: Cannot modify header information - headers already sent by (output started at D:\www\IIS7\http-root\ict-test-wiki\extensions\ExternalData\ED_Utils.php:343) in D:\www\IIS7\http-root\ict-test-wiki\includes\WebResponse.php on line 16
PHP Warning: Cannot modify header information - headers already sent by (output started at D:\www\IIS7\http-root\ict-test-wiki\extensions\ExternalData\ED_Utils.php:343) in D:\www\IIS7\http-root\ict-test-wiki\includes\WebResponse.php on line 16

I believe the changes are saved, because I can navigate back to that page, edit it again and see the changes in the page.

Hope you can help.


Someone else had essentially the same problem on the mailing list. The problem arises because not every line in the CSV being accessed has the same number of values; essentially, it's malformed CSV. Check that the URL specified is accessible and contains only CSV. Yaron Koren 19:12, 11 September 2011 (UTC)
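One way to check a file for the condition described above (rows with differing numbers of values) is sketched below in Python. This is only an illustrative check, not code from External Data; the function name `find_ragged_rows` and the sample data are made up for the example.

```python
import csv
import io


def find_ragged_rows(csv_text):
    """Return (line_number, field_count) for rows whose length differs from the first row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    expected = len(rows[0])
    return [(i, len(row)) for i, row in enumerate(rows, start=1) if len(row) != expected]


# Hypothetical sample: the third line is missing one value.
sample = "id,name,status\n1,alpha,open\n2,beta\n3,gamma,closed\n"
print(find_ragged_rows(sample))  # [(3, 2)]
```

An empty result suggests every row has the same number of values as the first one; it does not prove the URL returned CSV rather than, say, an HTML error page.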

strange error message

Notice: Use of undefined constant CURLOPT_SSL_VERIFYPEER - assumed 'CURLOPT_SSL_VERIFYPEER' in /var/lib/mediawiki/extensions/ExternalData/ED_Utils.php on line 400

I have allowed the IP address of the CSV file in LocalSettings.php.

I am using this code:

{{#get_web_data: url=
|format=CSV with header
|data=Incident ID=Incident ID,Open Time=Open Time,Update Time=Update Time,Resolved Time=Resolved Time,Alert Status=Alert Status,Asset Tag=Asset Tag,Assignment Group=Assignment Group,Brief Description=Brief Description
}}
{| class="wikitable sortable" width="95%"
! Incident ID
! Open Time
! Update Time
! Resolved Time
! Alert Status
! Asset Tag
! Assignment Group
! Brief Description
{{#for_external_table:<nowiki/>
{{!}}-
{{!}} {{{Incident ID}}}
{{!}} {{{Open Time}}}
{{!}} {{{Update Time}}}
{{!}} {{{Resolved Time}}}
{{!}} {{{Alert Status}}}
{{!}} {{{Asset Tag}}}
{{!}} {{{Assignment Group}}}
{{!}} {{{Brief Description}}}
}}
|}

It seems to happen when I try to put three or more imports in one page. It blanks the article completely as well. Anyone have any ideas?

Are you using the latest version (1.2.3)? I thought that problem was fixed in 1.2.3. Yaron Koren 14:35, 23 March 2011 (UTC)

Not Working

any idea why? it's installed in Special:Version, but it doesn't show anything (xml works fine): http://www.komentarz.org/pl/test

{{#get_web_data: url=http://www.komentarz.org/smw/api.php?action=query&list=threads&thpage=Dyskusja:oceny_i_opinie&format=xml |format=xml }}

You also need to set the "data" parameter. Yaron Koren 22:36, 2 February 2011 (UTC)

so if this is my xml feed, what data should i type? name of the xml node? http://www.komentarz.org/smw/api.php?action=query&list=threads&format=xml

{{#get_web_data: url=http://www.komentarz.org/smw/api.php?action=query&list=threads&format=xml |format=xml |data=thread=subject }}

I don't want to sound rude, but - have you read the documentation? Yaron Koren 00:59, 4 February 2011 (UTC)

Sure, I found this fragment, but I think a clear example of getting an RSS XML feed in the documentation would be helpful. I still can't get it to work. Your extension would be a perfect RSS reader (other extensions don't work or are unstable):

For data from XML sources, the variable names are determined by both tag and attribute names. For example, given the following XML text:

   <fruit type="Apple"><color>red</color></fruit>

the variable type would have the value Apple, and the variable color would have the value red. Similarly, the following XML text would be interpreted as a table of values defining two variables named type and color:

       <fruit type="Apple"><color>red</color></fruit>
       <fruit type="Kiwi"><color>brown</color></fruit>
Well, in any case, a more reasonable value for "data=" would be "data=thread=thread,subject=subject". Yaron Koren 23:37, 4 February 2011 (UTC)
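The rule quoted from the documentation above (variable names come from both tag names and attribute names) can be illustrated with a short Python sketch. It is a stand-in written for this page, not External Data's own parser; `collect_variables` is a hypothetical name, and the `<fruits>` wrapper element is added only so that the sample is well-formed XML.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def collect_variables(xml_text):
    """Map every attribute name and every leaf-tag name to the list of values found."""
    root = ET.fromstring(xml_text)
    values = defaultdict(list)
    for element in root.iter():
        # Attributes such as type="Apple" become a variable named "type".
        for attr_name, attr_value in element.attrib.items():
            values[attr_name].append(attr_value)
        # Leaf tags such as <color>red</color> become a variable named "color".
        text = (element.text or "").strip()
        if text and len(element) == 0:
            values[element.tag].append(text)
    return dict(values)


# The wrapper element is an assumption made for the example.
doc = """<fruits>
  <fruit type="Apple"><color>red</color></fruit>
  <fruit type="Kiwi"><color>brown</color></fruit>
</fruits>"""
print(collect_variables(doc))  # {'type': ['Apple', 'Kiwi'], 'color': ['red', 'brown']}
```

Under this reading, a mapping such as data=mytype=type,mycolor=color names the external variables type and color on the right-hand side; the local names on the left are free to choose.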
How about mediawiki's api?
<?xml version="1.0"?>
      <n from="Template:Warrior_Roots_EO_SNP_comparison_column" to="Template:Warrior Roots EO SNP comparison column" />
      <page ns="0" title="API" missing="" />
      <page pageid="1269" ns="10" title="Template:Warrior Roots EO SNP comparison column">
          <rev timestamp="2012-02-09T00:10:26Z" />

I want to pull rev_timestamp, but if I am understanding the documentation right, I will not be able to do it. When I use |data=rev_timestamp=rev_timestamp, no value is reported and I get an array error. I think this is because I would need the XML to be formed like this:

          <rev timestamp>
          </rev timestamp>

Is that correct? Thanks Hutchy68 14:45, 9 February 2012 (UTC)

I believe tag attributes are pulled too, not just tag values - did you try it? Yaron Koren 14:51, 9 February 2012 (UTC)
I think I did; it returns nothing except an error: XML error: 1 at line Array. Thanks Hutchy68 16:21, 9 February 2012 (UTC)

It looks like you're using the wrong URL - this is what you have, but that just shows an error message. It looks like you should change "XML" to "xml". Yaron Koren 18:03, 9 February 2012 (UTC)
Oops that's my fault for posting that here. In the actual query it is a lowercase xml at the end of the |url= which throws the 1 at line Array error. I've also tried the json format, which doesn't throw the error, but doesn't put the value into the variable either. 18:31, 9 February 2012 (UTC)
GOT IT! The api call was confusing it. I think it was trying to put the information into an array. Here is what I came up with and the value is now being passed to the local_variable.

Now the question is: if I call this 4 or 5 times for a page with 4 or 5 templates, how bad will the load time be? Thanks Hutchy68 19:30, 9 February 2012 (UTC)

Cool! Yaron Koren 19:31, 9 February 2012 (UTC)

Sort results

Any way to sort results returned from #get_db_data?

I don't think so, but it might definitely be worthwhile to have a way for users to add an "ORDER BY" command to the SELECT call... Yaron Koren 01:00, 4 February 2011 (UTC)
It certainly would! Could you provide that? Matheus Garcia 17:53, 15 February 2011 (UTC)
This option was added, in version 1.3. Yaron Koren 23:22, 3 May 2011 (UTC)


I am not sure how to ask this correctly. I have upgraded to PHP 5.3.5 which no longer supports mssql (from what I can tell). So I have downloaded the Microsoft support for SQL Server and PHP and added the necessary information to php.ini.

Does MediaWiki support the interface to SQL Server through the new sqlsrv library? The MediaWiki includes/db directory has the DatabaseMssql.php file, but the interface uses mssql_function calls and not sqlsrv_function calls.

So am I out of luck using PHP 5.3.5 and MS SQL Server?

Wolcott 20:32, 18 February 2011 (UTC)

Hi, I really don't know anything about MediaWiki's support for SQL Server, but it sounds like you might be out of luck - at least until MediaWiki's support gets improved. You could always submit a bug report for it... Yaron Koren 22:27, 18 February 2011 (UTC)

Well, after a little further research this morning I found Bugzilla report 22093, which discusses the new interface to Microsoft SQL Server. With my limited ability I was able to find the new code in the 1.17 trunk (includes/db/DatabaseMssql.php). It has the same file name as before, but different guts. I have not been able to find a 1.16 version, and I can't get the 1.17 version to work under 1.16 because of some interface changes. I have emailed Ryan Biesemeyer for any further suggestions with 1.16... Wolcott 15:25, 21 February 2011 (UTC)
Ryan Biesemeyer responded that the MediaWiki team still has some concerns about compatibility. He will try to see what it would take to backport to 1.16, but no promises. I have backported a minimal number of functions to make it work with my requirements ... Wolcott 17:19, 22 February 2011 (UTC)

get_db_data - Data parsing

I wasn't thinking about how you parse the data section before trying to format a date via the database. Issuing the query below via SQL Server produces Feb 22, 2011 instead of 2011-02-17T12:04:00Z

SELECT Title_Txt, CONVERT(VARCHAR(12), Documentation_Rev_Dt, 107) AS Rev_Dt FROM Tool WHERE Acronym_Txt ='ITEM'

But obviously, when trying to perform it via get_db_data, the parser messes things up, because it splits the key=value mappings on commas.




|where= Acronym_Txt ='ITEM'

|data=mstcName=Title_Txt,mstcRevDate=CONVERT(VARCHAR(12), Documentation_Rev_Dt, 107) AS Rev_Dt


Any thought on allowing us to define the separator? Would it be better to define it as a LocalSettings token, making it global, or to add it to #get_db_data: as separator so that it is unique to that query? Wolcott 18:49, 22 February 2011 (UTC)

Ah - good point. Let me look into it. Maybe with some simple parenthesis-matching, it'll still be possible to use commas. Yaron Koren 21:01, 22 February 2011 (UTC)
Yaron not a big hurry. I am able to use the #time parser to format the date the way I want to present it. Wolcott 01:41, 27 February 2011 (UTC)
I would be interested in this solution, too. Volker
This should now be working in the new version, 1.3 - commas within parentheses and quotes now get ignored by the parser. Yaron Koren 22:17, 22 April 2011 (UTC)
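The behaviour described here (commas inside parentheses and quotes are not treated as separators) can be sketched roughly as follows. This Python function is only an illustration of the idea; it is not the code that went into version 1.3, and `split_top_level` is a made-up name.

```python
def split_top_level(text, separator=","):
    """Split on separators that are outside parentheses and outside quotes."""
    parts, current = [], []
    depth = 0      # current parenthesis nesting level
    quote = None   # the quote character we are inside, if any
    for ch in text:
        if quote:
            current.append(ch)
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
        elif ch == "(":
            depth += 1
            current.append(ch)
        elif ch == ")":
            depth = max(depth - 1, 0)
            current.append(ch)
        elif ch == separator and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


data = "mstcName=Title_Txt,mstcRevDate=CONVERT(VARCHAR(12), Documentation_Rev_Dt, 107) AS Rev_Dt"
for mapping in split_top_level(data):
    print(mapping)
```

With the data value from the question above, this yields two mappings, and the CONVERT(...) expression stays in one piece.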

Evaluating inside #for_external_table

Can I evaluate a field within the #for_external_table parser?

I was trying to retrieve all the study names and descriptions based on my page name. That works just fine. I then was going to iterate over the data (multiple rows) and display the Study Name and if there is a description display it also.

|from=Tool_Study join Study_Ref on Tool_Study.Study_Id = Study_Ref.Study_Id
|where= Acronym_Txt ='{{PAGENAME}}'
|data=mstcStudyNm=Study_Ref.Study_Nm, mstcStudyDesc=Study_Ref.Study_Desc_Txt

== Studies (U)==
{{#for_external_table:<nowiki />
* {{{mstcStudyNm}}} {{#if: {{{mstcStudyDesc}}} | <i>[Description: {{{mstcStudyDesc}}}]</i>|}}

How can I test to see if the string is not empty inside the #for_external_table parser function... Wolcott 20:38, 2 March 2011 (UTC)

I don't know if this would work - I've never tried it. Yaron Koren 22:15, 2 March 2011 (UTC)
It doesn't. That is why I was asking. I was looking at the function in the code and trying to understand how you processed the information... Wolcott 00:19, 3 March 2011 (UTC)

Clearing Out Data

I built a template that calls the database based on a particular ID. A page calls the template 5 times, passing in different IDs. The problem was that the data does not clear itself out between calls, so the template was not working correctly. I added a new parser function, #clear_external_data:, that resets $edgValues to an empty array.

Thus at the end of the template I call #clear_external_data: to reset the data, since I am done with it. Everything works now. Is this something that you can add in the next version? Wolcott 10:25, 3 March 2011 (UTC)

Oh, interesting - that makes sense. Sure, can you send me the code, to yaron57@gmail.com? Yaron Koren 15:06, 3 March 2011 (UTC)
I think this is what I may need to help with my page. I am trying to build multiple different pages based on a template page, so the only thing on each page is a variable definition and a template call. The page seems to build correctly and pulls all the correct information from the different databases. However, when the values in the database change and you reload the pages, they are not updated. If I go to the template page, click edit and then resave, and go back to the page in question, it will be updated. Any help? Also, I thought that this clear data function might help. However, every time I try to use it, the wiki just prints it out like text. Would you mind showing an example so that I can see that I am using the correct syntax? --Mezada 17:39, 6 July 2011 (UTC)

ED (1.3.1)

Clearing Out Data is out of order. Apache log: "" valid magic thingie etc. Reason: an old ExternalData.i18n.magic.php, in which the default English entry for Clearing Out Data is missing. After adding this, everything is fine. hg_supname@web.de

parser function {{#clear_external_data:}} problem

We had a small problem but got it fixed. We run MediaWiki (1.16.5) on an MS Server 2003 server with Apache 2.2.

  • Problem: Same problem as other users: when {{#get_db_data:}} was called multiple times on the same page, the data of the previous call was not cleared and was included in the result set of the next call. Originally we used version 1.3.1, but in this version the parser function {{#clear_external_data:}} was not working (yet).
  • Solution: After contact with Yaron Koren (thanks) we upgraded to External Data extension version 1.3.2, but we also needed to enable the PHP extension for multi-byte strings. This was done by running the PHP Windows installer again and selecting the Change button, then enabling Multi-Byte Strings in the extensions list. When the installer is finished, restart Apache. After this the parser function {{#clear_external_data:}} worked like a charm!

Jongfeli 11:56, 3 November 2011 (UTC)

Parser function evaluations inside #for_external_table


I asked the question above, but have since dug a little deeper and wanted to go over it one more time, to see if you think there is any way to get this to work, or whether you can explain MediaWiki's process of evaluating pages.

Below is a simplified version of the page. We are retrieving multiple rows out of the database and attempting to interpret the data to build a table.

|where=grp_nm='Core Analysis Tools' and subgrp_nm='AB Tools'
{{#for_external_table:<nowiki />
{{{!}} border="1"
{{!}} [[MS:{{{toolAcron}}} | {{{toolAcron}}} ]]
{{!}} {{{PA}}} 
{{!}} {{#switch: {{{PA}}} | Campaign=C | N }}

The problem above is that the third column (the #switch parser function) is not being evaluated within #for_external_table. It is being evaluated beforehand, and the result is being passed into your parser function. I printed the value of the $expression variable and got the following.

2011-03-16 13:22:26  wikidac16-wac_: Expression - ?UNIQ760a39896f6503dc-nowiki-00000000-QINU?
{| border="1"
|- scope="col" width="130" 
| [[MS:{{{toolAcron}}} | {{{toolAcron}}} ]]
| {{{PA}}} 
| N

As you can see, 'N' is being passed into your parser function and not the actual code to be executed. Obviously, when the switch function is evaluated outside your function, {{{PA}}} is not set yet, so 'N' is the correct value.

Is there any way to delay the switch function from being evaluated until we are inside your code?

How does MediaWiki process pages?

Wolcott 13:51, 16 March 2011 (UTC)

After writing this I began to search MediaWiki for some possible solutions and found the extension Control Structure Functions. It no longer works after 1.12, but it mentions the wiki markup preprocessor and how code is evaluated. More research needed ... Wolcott 14:24, 16 March 2011 (UTC)

MediaWiki 1.17 mssql support

If you're using ExternalData to fetch data from a Microsoft SQL Server using the mssql PHP module, you have two choices to make it work in MediaWiki 1.17+.

  1. Install Microsoft Drivers for PHP for SQL Server (only works on Windows).
  2. Install Extension:MSSQLBackCompat (works the same way MediaWiki 1.16 worked, except that the DBServerType must be "mssqlold").

--Solitarius 00:26, 2 April 2011 (UTC)

Thanks for finding this out, and for adding it to the documentation. Yaron Koren 18:22, 3 April 2011 (UTC)

#for_external_table and [[Creates pages with form::]] issue

Not sure of an easy way to explain this, but here it goes.

I use External Data to retrieve a list of tools out of the external database and display them on a wiki page.

* [[MS:{{{mstcAcronymTxt}}}|{{{mstcAcronymTxt}}}]]

That works great (see below) [left upper table "AB Tools"]

But then we added a new tool in the external database and a red link showed up. So I decided to have a page automatically created if it was a red link by adding the following code.

First when I retrieve the data I build the list and then loop through it again adding the tools to a property.

* [[MS:{{{mstcAcronymTxt}}}|{{{mstcAcronymTxt}}}]]
[[MS Tools::MS:{{{mstcAcronymTxt}}}| ]]

Now the property "MS Tools" contains all the tools. I then edit the property page for "MS Tools", adding:

[[Has Type::Page]]
[[Creates pages with form::MS Registry Item]]

The MS Registry Form just calls a generic template, MS Registry Item, that uses the page name to retrieve detailed information about the tool and display it. So now when I add a new tool in the external database I see a red link (in the background a page is automatically being created, hopefully quickly), so when I click on the red link the page exists. Yea, not just yet.

When there is a red link in the "AB Tools" list, additional garbage characters are also shown...

Any thoughts?

Hi - any time you see printouts containing UNIQ/QINU, it means that something has gone wrong with the parser. I'm far from an expert on the MediaWiki parser, so I don't know what specifically is going wrong, but - why not have a single #for_external_table call, that displays the values and sets the property at the same time? That seems like it might have a better chance of working. Yaron Koren 22:08, 5 April 2011 (UTC)
Yea I tried that also. Same result with UNIQ, and an extra line feed is embedded. I will try some debugging and alternatives tomorrow at the office. I have had issues with the #for_external_table parser function where I can't do extra calls from within the parser function (see above).
* [[MS:{{{mstcAcronymTxt}}}|{{{mstcAcronymTxt}}}]]
[[MS Tools::MS:{{{mstcAcronymTxt}}}| ]]

So when you say "something has gone wrong with the parser" should I debug within the #for_external_table: function? If you look at the discussion "Parser function evaluations inside #for_external_table" above I had a problem with #switch inside the #for_external_table.

Wolcott 00:17, 6 April 2011 (UTC)

I really don't know how to debug it. What I meant, though, was just calling:
* [[MS Tools::MS:{{{mstcAcronymTxt}}}]]
Did you try that? Yaron Koren 00:24, 6 April 2011 (UTC)
Nope, I will try that in the morning. Wolcott 01:31, 6 April 2011 (UTC)
I still get the UNIQ tags embedded in the name when looping through the results using #for_external_table. I will add some debug tags to the parser function to try and determine what is happening. Just for kicks I performed a couple of other tests.
  • Removed the #for_external_table parser function and just hardcoded a couple of pages * [[MS Tools::MS:EADSIM|EADSIM]] and everything looked and worked fine.
  • Removed the #for_external_table parser function and just retrieved the first tool via * [[MS Tools::MS:{{#external_value:mstcAcronymTxt}}|{{#external_value:mstcAcronymTxt}}]] and everything looked and worked fine.
  • If I add back the parser function #for_external_table, but remove the command from the property page to [[Creates pages with form::MS Registry Item]] the list print fine without the UNIQ's, the link is red, but of course the page is not auto created. So what is the association with the parser function #for_external_table and [[Creates pages with form::MS Registry Item]] that is causing the issue.
Any thoughts on where to concentrate next? Wolcott 13:49, 6 April 2011 (UTC)
Ah, okay. I would try using a call to #set instead, then. Yaron Koren 15:32, 6 April 2011 (UTC)
Ok I got excited, but I can not get #set to evaluate within #for_external_table parser function.
{{#set:MS Tools=MS:{{{mstcAcronymTxt}}} }}
This is the same problem that I was having above trying to get another parser function to evaluate inside the #for_external_table parser function. If I hard code the #set to EADSIM the property is properly set. But by giving it the {{{mstcAcronymTxt}}} it is blank since it seems to be evaluated outside the #for_external_table and not during the iteration of the loop. Do you know of a way to delay the evaluation of parser functions or can you duplicate the problem in a simple External Data page? Wolcott 17:47, 6 April 2011 (UTC)
Yeah, I can see how that wouldn't work. I guess you could try calling #store_external_table instead of #for_external_table (you need the Semantic Internal Objects extension), but I'm not sure if that'll work either, for auto-creating pages. Yaron Koren 22:40, 6 April 2011 (UTC)

ldap-query (retrieving a result set)

Quote: Note that #get_ldap_data will only retrieve one result.

Is it planned to get a result set too? Example: Read all group-memberships of a specific user? (ldap-attribute: memberof).

--Rolze 12:00, 17 May 2011 (UTC)

No, there's no plan to add it in, unfortunately. Yaron Koren 14:48, 20 May 2011 (UTC)

problem in installation of external data extension

Dear all,

I am facing a problem configuring the External Data extension; I am using an Oracle database. Can anyone tell me how to configure it? Article insertion and deletion work fine, but searching and database-related extensions do not work properly. Can we say that MediaWiki can't support Oracle? Please help me get out of this problem.


I got error 942: ORA-00942: table or view does not exist.

I've never tried it myself, but I'm told that Oracle support works with MW>=1.16. What version of MediaWiki are you using? And maybe you typed the table name wrong, or something? Yaron Koren 11:49, 23 May 2011 (UTC)

The problem is the SQL file for Oracle, so I created it manually. How do we configure an extension that uses a .sql file in Oracle? I am also confused about the LocalSettings variables.

Here is what I configured in LocalSettings:

 $edgDBServer['ID'] = "localhost";
 $edgDBServerType['ID'] = "oracle"; 
 $edgDBName['ID'] = "employeesDatabase";
 $edgDBUser['ID'] = "system";
 $edgDBPass['ID'] = "123";

My database instance is orcl1, table name: test. Can we put anything in place of ID? There is one SQL file, called ed_url_cache, for MySQL, so how do I migrate this to Oracle? How do I convert MySQL to Oracle? Please help me solve this problem.


How do i use a property inside a #get_web_data url

I have a property named "Has symbol" on a page and would like to use this in a #get_web_data url. How should I format the url to insert the symbol into the url string of the web data query?

Very sincerely,

Mats 15:58, 21 June 2011 (UTC)

I don't know what you mean by "format the URL". Yaron Koren 20:25, 21 June 2011 (UTC)
If the URL is http://domain.com?symbol=IBM&blah=blah&format=csv, I'd like to take the property 'Has symbol' and dynamically insert it into the URL string where IBM is. I can't figure this out.
Mats 18:05, 22 June 2011 (UTC)
It depends on whether you're getting a single row of values, or a whole table, but if it's the former, did you try calling #external_value? Yaron Koren 04:17, 23 June 2011 (UTC)

Handling Nested Conflict XML Tags

Any ideas on how to handle a situation in an XML feed where inner elements have the same names as attributes in outer elements?

<?xml version="1.0" encoding="UTF-8"?>
  <box color="brown">
      <fruit type="Apple"><color>red</color></fruit>
      <fruit type="Plum"><color>purple</color></fruit>
  <box color="green">
      <fruit type="Kiwi"><color>brown</color></fruit>

How does one specify the fruit color rather than the box color in the data mapping? --Davepa 20:02, 4 August 2011 (UTC)

Hi - unfortunately, it can't be done yet. I'm planning to add a new function, probably called #get_xml_data, that lets you use an Xpath to set exactly which tag you're querying. The code for it has already been written (by others) - I just need to clean it up, and add it in - hopefully that can happen soon. Yaron Koren 06:35, 5 August 2011 (UTC)
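To show why a path-based query removes the ambiguity, here is a small Python sketch using the standard library's limited XPath support. It says nothing about how the planned function would be written or called; the closing tags and the `<boxes>` wrapper are assumptions added so that the sample from the question is well-formed.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the sample above; wrapper and closing tags are assumed.
doc = """<boxes>
  <box color="brown">
    <fruit type="Apple"><color>red</color></fruit>
    <fruit type="Plum"><color>purple</color></fruit>
  </box>
  <box color="green">
    <fruit type="Kiwi"><color>brown</color></fruit>
  </box>
</boxes>"""

root = ET.fromstring(doc)

# The "color" attribute on <box> and the <color> tag under <fruit>
# share a name, but a path tells them apart.
box_colors = [box.get("color") for box in root.findall("./box")]
fruit_colors = [el.text for el in root.findall("./box/fruit/color")]

print(box_colors)    # ['brown', 'green']
print(fruit_colors)  # ['red', 'purple', 'brown']
```

A name-only lookup for color would mix both lists together, which is the conflict described in the question.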

Old syntax for (LEGACY) reference

#get_external_data - CSV, GFF, JSON, XML

To get data from an external URL, call the following:


An explanation of the fields:

  • URL is the full URL of the CSV, GFF, JSON or XML file. (CSV, JSON and XML are standard data formats; GFF, or the Generic Feature Format, is a format for genomic data.)
  • format specifies the format of the file: it should be one of either 'CSV', 'CSV with header', 'GFF', 'JSON' or 'XML'. The difference between 'CSV' and 'CSV with header' is that 'CSV' is simply a set of lines with values; while in 'CSV with header', the first line is a "header", holding a comma-separated list of the name of each column.

The other parameters are divided into two types:

  • parameters containing a '==' are filters - they do additional filtering on the set of rows being returned. It is not necessary to use any filters; most APIs, it is expected, will provide their own filtering ability through the URL's query string.
  • parameters containing a '=' are mappings - they let you connect local variable names to external variable names. External variable names are the names of the values in the file (in the case of a header-less CSV file, the names are simply the indexes of the values (1, 2, 3, etc.)), and local variable names are the names that are later passed in to #external_value.
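The distinction between the two parameter types in the list above can be sketched in Python as follows. This only illustrates the '==' versus '=' rule as stated; it is not the extension's parsing code, and the function name and sample parameters are invented for the example.

```python
def classify_parameters(params):
    """Separate 'name==value' filters from 'local=external' mappings."""
    filters, mappings = {}, {}
    for param in params:
        if "==" in param:
            # Filter: keep only rows where the external variable has this value.
            name, value = param.split("==", 1)
            filters[name.strip()] = value.strip()
        elif "=" in param:
            # Mapping: local variable name on the left, external name on the right.
            local, external = param.split("=", 1)
            mappings[local.strip()] = external.strip()
    return filters, mappings


# Hypothetical parameters; "1" stands for the first column of a header-less CSV.
filters, mappings = classify_parameters(["status==open", "title=Brief Description", "first=1"])
print(filters)   # {'status': 'open'}
print(mappings)  # {'title': 'Brief Description', 'first': '1'}
```

The check for '==' has to come first, since every filter also contains a single '='.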

Loop Through Lines and get Values

Hi, I want to get some external data and write it into variables to use later in another hook in a template ;-)

Here is my syntax:

  {{ #vardefine: i | 0 }}
  {{#for_external_table: {{#vardefine:b_{{ #var: i }} | {{{bereich}}} }} {{ #vardefine: i | {{ #expr: {{ #var: i }} + 1 }} }} }}

 {{ #vardefine: max|{{ #var: i }}}} {{#var:max}} {{ #vardefine: i | 0 }} {{
  | {{ #ifexpr: {{ #var: i }} < {{ #var: max }} | true }}
  | {{#var:b_{{#var:i}}}} {{ #vardefine: i | {{ #expr: {{ #var: i }} + 1 }} }}

 {{ #vardefine: i | 0 }} {{
 | {{ #ifexpr: {{ #var: i }} < {{ #var: max }} | true }}
 | {{check_mk_views_Bereiche|Bereich={{#var:b_{{#var:i}}}}}}{{ #vardefine: i | {{ #expr: {{ #var: i }} + 1 }} }}

But it only shows one result, which is not correct... Ideas? --Dominik Sigmund 12:32, 30 September 2011 (UTC)

Wow... I really couldn't say what's going wrong here, but I'd suggest creating an extension to do everything after #get_db_data - wiki-text isn't really equipped to do this kind of complex programmatic stuff. If you do create an extension (assuming that's even a possibility), you can use the data retrieved by #get_db_data using the $edgValues global variable. Yaron Koren 15:01, 30 September 2011 (UTC)

Nested #arraymap does not split string.

the following template:


Location is returned as a string such as:

"Dutch Harbor,Juneau,Nome,Fairbanks,Anchorage" 

Which should be split by "," and joined by "\n\n" to produce:

Dutch Harbor

#arraymap works fine when used like this:


But in this case only the first value of location is returned for every iteration of the loop.

#arraymap breaks when I omit #external_value.

Any feedback is appreciated. --Beau B 05:38, 2 October 2011 (UTC)

Hi - you've encountered the major problem with #for_external_table, which is that it can't include parser functions, because those functions get called before the variables can get replaced. So in this case, #arraymap is getting called with the literal string "{{{location}}}", then returns that same string - and then the string gets replaced. I don't know of any solution to this problem - I remember looking into it a while ago. What I would suggest, if you want to store the data via SMW, is to use the #store_external_table function, which uses SIO to store the data - which is actually probably a better solution anyway, since this is a two-dimensional array of data. Yaron Koren 13:11, 3 October 2011 (UTC)
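The ordering problem described in this reply can be imitated with a toy Python model. The two functions below are simplified stand-ins, not MediaWiki or External Data code: `arraymap` only splits and rejoins a string, and `substitute` only replaces `{{{name}}}` placeholders.

```python
def arraymap(value, delimiter, new_delimiter):
    """Stand-in for #arraymap: split a string and rejoin it with another delimiter."""
    return new_delimiter.join(item.strip() for item in value.split(delimiter))


def substitute(template, row):
    """Stand-in for the per-row variable replacement done by #for_external_table."""
    for name, value in row.items():
        template = template.replace("{{{" + name + "}}}", value)
    return template


row = {"location": "Dutch Harbor,Juneau,Nome"}

# Order described above: the function runs first, on the literal placeholder,
# finds no comma to split on, and the value is only filled in afterwards.
early = substitute(arraymap("{{{location}}}", ",", "\n\n"), row)

# Order the poster wanted: fill in the value first, then split it.
late = arraymap(substitute("{{{location}}}", row), ",", "\n\n")

print(repr(early))  # 'Dutch Harbor,Juneau,Nome'
print(repr(late))   # 'Dutch Harbor\n\nJuneau\n\nNome'
```

In the first order the comma-separated string comes out unsplit, which matches the explanation that the function has already finished by the time the variable is replaced.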
Hello Yaron - I'd like very much to develop most of the data objects and text formatting within SMW. I've found, however, that accomplishing these tasks is much better defined in a programming language such as PHP, for example. The solution to the above problem, which I have implemented, is to format and parse the data via the external data script (PHP) and pass this data to MW via JSON. The result is perfectly formatted data with no need for #arraymap or any other parsing function. Beau B 18:53, 3 October 2011 (UTC)
That works too. Yaron Koren 19:16, 3 October 2011 (UTC)

Problem when data in database changes

Hi, we have a problem here: We created a simple database call:


{| class="FCK__ShowTableBorders sortable"  cellspacing="0" cellpadding="1" border="1"
! scope="col" | Hostname 
{{!}} {{{Hostname}}} 


But the problem is as follows: When we change the data in the database, the page in the wiki with this code is not updated after refreshing the page. Only when the page is edited and saved, the correct data is displayed...

Any solutions? Many Thanx, --Dominik Sigmund 07:30, 5 October 2011 (UTC)

Hi - I'm pretty sure the issue is MediaWiki caching. After no more than 24 hours or so, the correct value should show up - but if you want it to show up sooner than that, you can add on "?action=purge" to the URL to do an immediate refresh, or use the MagicNoCache extension so that that page is never cached. Yaron Koren 10:14, 5 October 2011 (UTC)
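For anyone scripting the refresh, adding action=purge to a page URL might look like the Python sketch below. The wiki URLs are placeholders, and `purge_url` is a hypothetical helper; note that when the URL already has a query string, the parameter has to be joined with "&" rather than "?".

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def purge_url(page_url):
    """Return page_url with action=purge added to (or replaced in) its query string."""
    parts = urlsplit(page_url)
    # Drop any existing action parameter so the result has exactly one.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "action"]
    query.append(("action", "purge"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URLs, not a real wiki.
print(purge_url("https://wiki.example.org/index.php?title=Hosts"))
# https://wiki.example.org/index.php?title=Hosts&action=purge
print(purge_url("https://wiki.example.org/wiki/Hosts"))
# https://wiki.example.org/wiki/Hosts?action=purge
```

Depending on the wiki's configuration, opening such a URL may show a confirmation step before the page is purged.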
OK, I installed the extension, now it works. Many Thanks :-) --Dominik Sigmund 13:55, 6 October 2011 (UTC)

Caching Bug

Caching of remote URLs is not working properly: once the entry is expired, the extension will insert a new cache entry instead of updating the existing one; but since there is no sorting when retrieving cache entries further access will never find the most recent records, and new data is always fetched as if it was expired.

The solution is to edit ED_Utils.php, changing lines 508-509 from the first block below to the second:

 // insert contents into the cache table
 $dbw->insert( $edgCacheTable, array( 'url' => substr( $url, 0, 254 ), 'result' => $page, 'req_time' => time() ) );


 // delete any existing old entry
 $dbw->delete( $edgCacheTable, array( 'url' => substr( $url, 0, 254 )));
 // insert contents into the cache table
 $dbw->insert( $edgCacheTable, array( 'url' => substr( $url, 0, 254 ), 'result' => $page, 'req_time' => time() ) );

--Angelf 12:41, 7 November 2011 (UTC)

Hi - thanks for this fix! I just added it in to the code on SVN. Yaron Koren 21:24, 7 November 2011 (UTC)
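The effect of the fix can be reproduced outside MediaWiki with a small Python and SQLite sketch. The table and column names follow the PHP snippet above (ed_url_cache is assumed to be the value of $edgCacheTable); the column types, the `store` and `lookup` helpers, and the sample URL are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; only the column names come from the snippet above.
conn.execute("CREATE TABLE ed_url_cache (url TEXT, result TEXT, req_time INTEGER)")


def store(url, page, now, delete_first):
    """Write a cache entry; optionally remove any older entry for the same URL first."""
    key = url[:254]  # mirrors substr( $url, 0, 254 )
    if delete_first:
        conn.execute("DELETE FROM ed_url_cache WHERE url = ?", (key,))
    conn.execute("INSERT INTO ed_url_cache VALUES (?, ?, ?)", (key, page, now))


def lookup(url):
    """Fetch one entry for the URL with no ORDER BY, as the report describes."""
    return conn.execute(
        "SELECT result, req_time FROM ed_url_cache WHERE url = ?", (url[:254],)
    ).fetchone()


url = "http://example.org/data.csv"

# Without the fix, every refresh adds another row for the same URL.
store(url, "old", 100, delete_first=False)
store(url, "new", 200, delete_first=False)
rows_without_fix = conn.execute("SELECT COUNT(*) FROM ed_url_cache").fetchone()[0]

# With the fix, stale rows are removed before the fresh one is written.
store(url, "newest", 300, delete_first=True)
rows_with_fix = conn.execute("SELECT COUNT(*) FROM ed_url_cache").fetchone()[0]

print(rows_without_fix, rows_with_fix)  # 2 1
print(lookup(url))                      # ('newest', 300)
```

Once only one row per URL exists, an unordered lookup can no longer return an expired entry.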

JSON format

I am attempting to pull data from a password protected semantic mediawiki using Extension:SMWAskAPI with the standard api and show it on another wiki. There is no example for use with the JSON format, can this be provided? Thanks. --Dgennaro 17:45, 13 December 2011 (UTC)

That extension is either obsolete, or soon to be obsolete, I think, because SMW either has or is getting its own "ask" API action. Though that might not really matter, since the JSON might be the same for both. Anyway, that's a good idea. Yaron Koren 19:04, 13 December 2011 (UTC)
Is the SMW 1.7 beta stable enough to use in production? Any idea on how close is it to being released? Thanks. --Dgennaro 19:08, 13 December 2011 (UTC)
Oh, yes - it's only available in 1.7. I think SMW 1.7 is stable enough... I use it on my wikis already. Yaron Koren 20:05, 13 December 2011 (UTC)
Started using SMW 1.7 beta
Could you please provide some basic documentation on how to use #get_web_data with JSON? Thanks :) Example below:
  "results:" {
          "Has Book Title": [
             "Book Title"
          "Has Book Author": [
             "Book Author"
        "fulltext": "Pagename"
        "fullurl": "http://fullurl.com"
I successfully got "Pagename" value from the "fulltext" parameter, but I am unable to pull the "Has Book Title" or "Has Book Author" values. --Dgennaro 19:41, 14 December 2011 (UTC)
That should work, in theory - though it might be failing due to a bug in JSON parsing in External Data that I just fixed a few days ago (see here). If possible, could you try getting the latest External Data code from SVN, and see if the problem still happens? Yaron Koren 19:23, 15 December 2011 (UTC)
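As a way of checking what such a response actually contains, independent of the wiki, a recursive key lookup can be sketched in Python. This is not External Data's JSON handling. The sample document is a guess at a complete, well-formed version of the truncated snippet above; in particular, the "Pagename" and "printouts" levels are assumptions.

```python
import json


def find_values(node, key):
    """Recursively collect every value stored under `key` in nested dicts and lists."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                # A list value contributes each of its items; anything else is one value.
                found.extend(value if isinstance(value, list) else [value])
            found.extend(find_values(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(item, key))
    return found


# Hypothetical, well-formed stand-in for the snippet quoted above.
text = """{
  "results": {
    "Pagename": {
      "printouts": {
        "Has Book Title": ["Book Title"],
        "Has Book Author": ["Book Author"]
      },
      "fulltext": "Pagename",
      "fullurl": "http://fullurl.com"
    }
  }
}"""
data = json.loads(text)

print(find_values(data, "fulltext"))        # ['Pagename']
print(find_values(data, "Has Book Title"))  # ['Book Title']
```

If a lookup like this finds the property values but the wiki does not, the difference is more likely in how array values are handled than in the key names.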
I made the changes, but it still did not work. --Dgennaro 17:22, 20 December 2011 (UTC)

Sorry - is there any way you could reproduce this problem on a public wiki? Unfortunately, Referata (like scratchpad.referata.com) doesn't have the SMWAskAPI extension, but maybe some other public wiki does... Yaron Koren 23:15, 20 December 2011 (UTC)

I took your advice and switched to Semantic MediaWiki 1.7 beta. I noted the placement of the switch above. My full problem is, I am attempting to use External Data to display a different password protected wiki's data....so to do this, I must use the api to log in and log out. In the api, using the XML format I run into an issue mentioned previously of different nested values with the same parameter name, in this case "value"....so this is why I switched over to JSON (and provided a snippet of the output code above). I would be happy to create an example...can "users" use the api on scratchpad and/or discourseDB? --Dgennaro 15:52, 21 December 2011 (UTC)
Oh, okay. Yes, anyone can use it. Yaron Koren 16:33, 21 December 2011 (UTC)