Extension talk:External Data/Archive 2017 to 2018

Error: no local variable "X" was set
Hi Yaron, I've just upgraded to the latest ED 1.8.2 (8e1d91c) and now almost all my wiki pages are full of "no local variable set" errors. See it in a wiki of your own: http://discoursedb.org/wiki/Germany Note that I don't use local variables in bare form but rather enclosed in if's like this:  and I still get "Error: no local variable "X" was set". Am I missing something? --Ioannis Protonotarios 02:05, 2 January 2016 (UTC)


 * Thanks for pointing that out - the issue wasn't with External Data, but with the data URLs being used; it appears that semanticweb.org no longer supports querying. I updated the data URLs, and now everything seems to be working again. I don't know what the issue you're seeing is - are the data URLs accessible? Yaron Koren (talk) 16:05, 3 January 2016 (UTC)


 * Exactly! The issue appears when the data URLs are not accessible at all. In previous versions of ED that was no problem, because if there was no response from the URL, then the variables would just have NULL values, so an  would be enough. But now, instead of a NULL value, I get this error message for every nonexistent value and I don't know how to handle it. I have about 10,000 pages that used to work fine and now they look more or less like this! --Ioannis Protonotarios 00:02, 4 January 2016 (UTC)


 * Do you know what version of ED you were using before? I don't think there have been any recent changes related to this error message. In any case, you can turn it off by adding "$edgExternalValueVerbose = false;" to LocalSettings.php. Yaron Koren (talk) 00:14, 4 January 2016 (UTC)
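For reference, the setting mentioned above goes into LocalSettings.php, below the line that includes External Data; a minimal sketch:

```php
// LocalSettings.php, after External Data is loaded.
// When false, #external_value prints nothing for an unset
// variable instead of the "no local variable was set" error.
$edgExternalValueVerbose = false;
```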


 * Yes, I was using ED 1.6.2 (1f1de17). The old site is still live so you can see the older version of the same page here.
 * OK! Problem solved! "$edgExternalValueVerbose = false;" did it! Now pages are back to normal! A million thanks for your support and for this great extension (and all of your extensions that I am heavily using for so long!) :) --Ioannis Protonotarios 00:27, 4 January 2016 (UTC)


 * Okay. That's great! Yaron Koren (talk) 01:54, 4 January 2016 (UTC)

Accessing complex objects
"Because MongoDB returns values in JSON that may be complex, and contain compound values, you can get data that is stored in such a way by separating field names with dots"

Is there any way (or any plans to support a way) of doing this for other complex values. For example, say I have this XML snippet:

 

Is there some way I could map all the attributes to dot-delimited field names? The XPath query I might use is . I'd really like to be able to access  by just referring to the external value  , for example, instead of having to manually specify each attribute.

Lewiseason (talk) 21:03, 14 January 2016 (UTC)
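For reference, the per-field XPath approach (the repetition this question is trying to avoid) looks roughly like this in ED, assuming a version that supports the "use xpath" parameter; the URL and field names here are invented:

```
{{#get_web_data:
url=http://example.com/items.xml
|format=XML
|use xpath
|data=price=/items/item/@price
}}

{{#external_value:price}}
```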


 * It would be nice if that were possible, though the code would have to change. The XML-handling code in ED doesn't parse the XML code; it just looks for the fields that have been specified. So the XML-handling code would have to start parsing everything, and turning the whole file read into an object - which is possible, but there's no plan to do it. Yaron Koren (talk) 16:45, 15 January 2016 (UTC)

Include data in search results
Is there any way to make the parsed external data qualify in search results? I found this extension: https://www.mediawiki.org/wiki/Extension:Lucene-search but it seems difficult to install and the instructions are for Linux and we're running Windows. Perhaps there are other search extension alternatives?


 * If you want to search external data then you should store them locally first using either Semantic MediaWiki or Cargo. After that you can do all kinds of semantic search. That's the general idea more or less. --Ioannis Protonotarios 21:44, 21 January 2016 (UTC)


 * I don't understand how Semantic MediaWiki would be able to store the data locally but I have been playing with Cargo and I understand the gist of it. I have some dummy data in there but is there a way to do a semantic search or query on it automatically when the user uses the regular search box? I want the page with the external content to be searchable like any other normal page. --Patrick


 * The only problem I see here is the "regular search box". I don't think that regular search is customizable. You need to create an alternate search page. If you do that, then you can do any kind of crazy search. But I don't think you can integrate semantic search into the regular MediaWiki search box, unless you use some kind of search extension, but we are past that already. This is what I do. I can describe my workflow, but only very roughly, because semantic storing inside MediaWiki is a whole new world. There are several things you need to do to make it work. So the general idea (very, very general) is that first you define a concept (e.g. books), which consists of a category, a template, a form, and several book properties. Inside the template is where the actual storing occurs. That is where you put the External Data function that takes data from an external source (such as a web API, for example). So, instead of giving each book values manually, you retrieve them externally first, and then you assign them to properties (which are the semantic equivalent of variables), and all this happens inside the template. If you do that, then each book page, besides the actual text content it may have, also has some semantic values stored in it (such as writer, year, etc.). Once you've done that, you can query these values in many ways. One of those ways, the one that concerns us here, is to create a query form. This form would be the alternate search page I mentioned earlier! I hope you got the general idea; I tried in a few words to describe something that actually needs several pages to be fully understood. P.S. Semantic values cannot be seen inside the page text because they are stored somewhere else inside a page, not in the content/text area. So the regular search cannot see them, users cannot see them, and not even Google can see them. The only way to get those values is to query them. That's why regular search is useless here. I hope this makes it clearer. 
--Ioannis Protonotarios 14:14, 26 January 2016 (UTC)


 * I think I may understand now: Cargo takes the data and physically copies it into a new table where it can be queried. The semantic method is different because the external data isn't being directly stored/queried, but instead feeds a system of tags, and it is the tags that can be queried. Here is a super simple example; please tell me if this is accurate. I could make a "person" template, and it reads in a list of users from a CSV file. Here is the super simplified part: I could then write the semantic template to tag all rows and all columns with a property of "external data value". Then I would have a "People" page that uses the template and lists all the person info in a table. On the semantic search page, if I searched for "John" with a property of "external data value", I would find the "People" page based on any field, including first, middle, or last name... Right? I realize I am selling myself short if I tag everything with "external data value", but I'm just trying to wrap my head around it. In real life it would be properties like "name", "birthday", "address", etc. Can you semantically search *all* properties? --Patrick
 * Edit: I am not able to "load" the external data via Cargo. I think it's because I can't use the #cargo_store function inside the #for_external_table function. How do I get my data into the Cargo tables so I can query it? I plan on posting this on the Cargo talk page too. --Patrick


 * Yes, you can store each field with a different field/property. See here for how to store External Data's data via Cargo. Yaron Koren (talk) 18:23, 27 January 2016 (UTC)
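Very roughly, that pattern looks like this (the table, template, and field names here are invented, and the Books table would also need a #cargo_declare call somewhere):

```
On the data page:

{{#get_web_data:
url=http://example.com/books.csv
|format=CSV with header
|data=title=Title,author=Author
}}
{{#display_external_table:
template=Book
|data=title=title,author=author
}}

In Template:Book:

{{#cargo_store:_table=Books
|Title={{{title|}}}
|Author={{{author|}}}
}}
```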


 * Aha! I didn't understand how #display_external_table and the template parameter worked. For some reason I thought that example was showing you how to use Cargo to create a data source then feed it back to the external data extension. However, looking at that example again knowing it's what I wanted helped me get it to work. I created my table template and when I made a page with the CSV all the rows showed up! Very exciting stuff thank you so much for your help, I would never have put all these pieces together without your guidance. I think the hard part is over, now I need to query it. :) --Patrick


 * (I started writing this answer before Yaron replied, so, since I wrote it, here you are; it may help you some more.) Yes, it's something like that. Actually, Cargo and SMW are more or less the same. They both store to the DB instead of the page. Their differences are that a) Cargo works only with templates, while SMW declarations can be used in any page, and b) SMW has more basic queries, while Cargo can create more complex queries, like SQL (using an SQL-like syntax) (and Cargo is 2x faster). Besides that, they do virtually the same job. In your example, if you want to store persons, then you will need a page for each person (and not one page for all; that would be Category:Persons). The page titles must be the persons' names. Imagine that a page title is your most basic variable, the name. Each person page must contain the template call . This should be the minimum content. Now, inside Template:Person you can either assign values manually to properties or get them from an external source, in your example from a CSV file. You can do that using either Cargo or SMW; only the syntax will be different, the result will be the same. So, if you do it the manual way, then every template variable will be assigned to some property (and in that case a person's page, e.g. the Ioannis Protonotarios page, will contain something like this:  ). If you do it with External Data, then inside the template you will use an External Data parser function that will get the country of  (e.g. Ioannis Protonotarios) from the CSV and assign it to the respective property. After you have created all the persons' pages, you can do any kind of queries you like. For example, you can search for persons from Greece, aged from 20 to 40, with names starting with "P" (for this you may need to implement some string functions too, but you get the idea). --Ioannis Protonotarios 20:36, 27 January 2016 (UTC)


 * That makes sense, thank you both very much for the help. I am new to the MediaWiki world and I'm setting up a data dictionary for the Business Intelligence team. Just a few days ago these extensions (and templates in general) were very overwhelming but I'm starting to catch on and I think these tools will allow me to add functionality I had never even thought of. If I do manage to get my external data to show up in the regular search box perhaps I will write an extension and share with the community so that I can be as helpful as you all have been! --Patrick


 * Great! The only thing I would add to what Ioannis said is that Cargo is 1.5x faster than SMW (according to one set of tests), not 2x faster. Yaron Koren (talk) 01:34, 28 January 2016 (UTC)

get_db_data with special character in where clause not working
Hello, I use #get_db_data within a template, with the page name as the where condition. On most of the pages this works fine, but on some pages with special characters there are no results.

This doesn't work on the page Großbritannien. I think this is because of the letter ß. Is there a debug option to show me the SQL syntax generated by the extension? Any ideas how to solve this?


 * There's no "debug" option, unfortunately, but you can put a typo in a field name or something, so that you get an error message that includes the full SQL. That's what I tend to do. Yaron Koren (talk) 16:57, 22 February 2016 (UTC)


 * Your tip has helped narrow down the error. The special character ß is transmitted incorrectly, as WHERE Country= 'Gro&Atilde;�britannien'. How can I fix this? TomyLee (talk) 07:17, 23 February 2016 (UTC)


 * So your guess was right. Switching from "PAGENAME" to "PAGENAMEE" might fix the problem, since the two are encoded differently. However, if you do switch, you'll probably have to add handling for spaces (for countries whose name contains a space), because PAGENAMEE turns those into "_". Calling REPLACE within the "where=" parameter might fix that problem. Yaron Koren (talk) 14:55, 23 February 2016 (UTC)
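A sketch of that suggestion (database, table, and field names invented); REPLACE here is the SQL function, evaluated on the database side, turning the underscores of PAGENAMEE back into spaces:

```
{{#get_db_data:
db=mydb
|from=countries
|where=Country=REPLACE('{{PAGENAMEE}}', '_', ' ')
|data=population=Population
}}
```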


 * This doesn't help, because the name will be sent as 'Gro%C3%9Fbritannien' to the SQL server. I also tested without PAGENAME, writing the country directly into the where clause, and it gives me the same wrong name. This also happens with countries containing the German characters ä ö ü, like Ägypten (Egypt). TomyLee (talk) 16:00, 23 February 2016 (UTC)


 * I don't know, then... the only solution I can think of is to change the encoding of your 'page' table, from 'latin1' (which is what I assume it is now) to 'utf8' or something like that. It's probably a good idea to back up your database first, though, if you're going to do that. Yaron Koren (talk) 22:08, 23 February 2016 (UTC)


 * I fixed it! I went the long way through the complete source code, starting from the first PHP call, and inserted a lot of prints ... into the code to see where the string gets corrupted. The solution is to insert the line  into the file ".\includes\db\DatabaseMssql.php" in the function "open" (for me it is on line 110). TomyLee (talk) 07:27, 24 February 2016 (UTC)


 * Wow! That's great. I'm glad you were able to find a fix, based on very little information. I wonder if this is an MSSQL/SQLServer-only issue... Yaron Koren (talk) 14:59, 24 February 2016 (UTC)

Manage access to "general use" data
Hi. My name is Oscar and I'm building a wiki with a lot of information based on public data.

There are some specific situations where you need "general use" data, which can be properly stored and optimized to avoid duplicating data, for example. Just think of a list of cities, countries or states with the proper relationships, which can be used and maintained in order to save several thousands of records, work, etc. I've installed the SemanticBundle package in order to manage this kind of data. So, I've set anonymous access to some data (which is perfectly legal, as indicated in http://dev.mysql.com/doc/refman/5.7/en/account-names.html) by specifying '' as the username part and no password (I've filtered the host part for the internal servers network). With these settings, I got the "incomplete" error for the ID. I've gone through the code and have commented out the part in EDUtils where these were checked, with this diff result:

187,188c187,188
<			if ( $db_server == '' || $db_name == '' ||
<				$db_username == '' || $db_password == '' ) {
---
>			if ( $db_server == '' || $db_name == '' ) {
> //				$db_username == '' || $db_password == '' ) {

so I can access my data, and I don't think this is a risk (you can, of course, disagree). So, I've set this user and pass:

$edgDBUser['t'] = "''";
$edgDBPass['t'] = "";

and it works for me. Please consider the change.


 * Oh, I didn't realize that usernames and passwords could be blank. Your change makes sense, then. Yaron Koren (talk) 17:31, 29 February 2016 (UTC)

Improve management and/or security
Just a contribution to help manage sources for ED: provide a specific separate file for extra private data, i.e. ExternalData_private.php. This is the diff for ExternalData.php:

61a62,66
>
> // A place to include private data
> if ( file_exists( dirname( __FILE__ ) . '/ExternalData_private.php' ) ) {
>	include_once dirname( __FILE__ ) . '/ExternalData_private.php';
> }

As an example, this file could have "w" pointing at the MediaWiki DB itself, together with (or replaced by) any DB to be used:

<?php

$edgDBServer['w']     = $wgDBserver;
$edgDBServerType['w'] = $wgDBtype;
$edgDBName['w']       = $wgDBname;
$edgDBUser['w']       = $wgDBuser;
$edgDBPass['w']       = $wgDBpassword;


 * I think it's better to do this from within LocalSettings.php. Yaron Koren (talk) 17:20, 29 February 2016 (UTC)

Access to an array in JSON
Hello, how can I read an array in JSON?

JSON Example : https://raw.githubusercontent.com/malahx/QuickSearch/master/QuickSearch.version

And I've tried this:

..

It results in: https://kerbalstuff.com/mod/472/QuickSearch 1.2.1 Error: no local variable "KSP_VERSION" was set.

How can I get access to the arrays MAJOR, MINOR, PATCH, which are in KSP_VERSION? Thanks.

--Malahx (talk) 01:56, 1 March 2016 (UTC)


 * Unfortunately, I don't think it's possible, given that there are multiple values there labeled MAJOR, MINOR and PATCH. If this were XML, you could use XPath, but for JSON, External Data doesn't offer anything comparable. Yaron Koren (talk) 02:58, 1 March 2016 (UTC)

Integration with Semantic Forms
Sorry if this isn't the right place. I've used ED for reading a DB with many records, but I'm not able to find a way (it might be my fault, of course) to "integrate" the existing data with Semantic Forms, so that you can add records to the DB in the same format. There are examples about storing, but if I've missed something about this issue, please let me know and/or provide better examples. Thanks in advance.


 * This may be what you're looking for. Yaron Koren (talk) 14:33, 10 March 2016 (UTC)

JSON-format source with unset properties
Sorry if it's not the right place to ask.

I'm using #get_web_data to get data to build a table, calling #display_external_table with a template. My JSON-formatted source retrieves several rows, each one with multiple properties. My problem is that each "row" does not necessarily have all the properties defined in data=, and this causes properties not defined in row 1 to be filled with values from a later row. For example:

Object 1: propertyA: true, propertyB: 12

Object 2: propertyA: false, propertyB: 66, propertyC: 1

What I expected:

What I get:

My temporary solution is to define ALL properties in each JSON object, but it's not a lasting solution for my problem. Am I doing something wrong? Is this expected behavior?
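For illustration, the temporary workaround of defining every property on every object might look like this (padding propertyC in the first object; whether an empty value behaves exactly like a missing one may depend on the ED version):

```json
[
  { "propertyA": true,  "propertyB": 12, "propertyC": "" },
  { "propertyA": false, "propertyB": 66, "propertyC": 1 }
]
```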

Thanks.


 * This is expected behavior, which is not to say that it's good behavior. It would of course be better if the code could identify missing values, but the way the code does its JSON parsing, I don't think that's possible, unfortunately. (Though I could be wrong, because it's been a long time since I looked at that code.) Yaron Koren (talk) 17:36, 18 April 2016 (UTC)
 * I'll format my JSON objects in a way my wiki doesn't need to rely on exist/no exist logic. Thanks for the quick response.

How to get remote access
Hi Yaron,

I use a virtual server (Ubuntu 14.04) at Strato (a German hosting provider). I would like to access the database to use #get_db_data. I tried setting $edgDBServer to localhost, 127.0.0.1, and xxx.strato.net, with and without port number 3306. The result from the server is always the same: cannot connect to database. User and password are as configured in phpMyAdmin.

Can you give me an idea of how I have to specify the correct URL? Thanks for your support.
 * Well, this was my mistake: I used two different virtual servers, sorry.

Where clause with text fields gives an error
Hi Yaron,

when I tried to execute get_db_data like

I get the message "Error: no local variable "num" was set." It seems the SQL statement gives an empty result. Name is a field of type text; Nummer is a field of type int.

So, when I tried this

I can use the name in the External Data statement.

What do I have to do to make the first statement succeed? Thanks for your support. --SnowlDD

Using FreeTDS: (Cannot access the database: Unknown error)
Hey guys,

My setup is MediaWiki 1.26.1 and PHP 5.4.16. I'm attempting to connect to a SQL Server 2008 database using the FreeTDS and MSSQLBackCompat extension. I'm getting the error "(Cannot access the database: Unknown error)" when trying to update a page with the following code:

My config in freetds.conf is as follows:

[SERVER]
host = SERVER.domain.com
port = 1433
tds version = 7.3

My config in ExternalData.php is as follows:

$edgDBServer['people'] = "SERVER";
$edgDBServerType['people'] = "mssqlold";
$edgDBName['people'] = "DATABASENAME";
$edgDBUser['people'] = "USERNAME";
$edgDBPass['people'] = "PASSWORD";

I can log in to the database using the command-line tsql with no problem, so I know that part works. I've tried $edgDBServer as SERVER and as SERVER.domain.com. Neither works.

I verified MSSQL is working through /phpinfo.php as well. If anyone can provide help, I would be so happy! I'm sure it's something simple I'm missing. --Chazbot7 (talk) 21:58, 24 June 2016 (UTC)


 * Are you literally putting 'db_name' in those LocalSettings.php settings? Because it seems like it should be 'people' instead. If that's not it, adding "$wgShowSQLErrors = true;" and "$wgShowExceptionDetails = true;" to LocalSettings.php may help clarify the error. Yaron Koren (talk) 01:22, 27 June 2016 (UTC)


 * Thanks for replying, I appreciate the help! My apologies, I changed the names of everything when posting here. I do have 'people' in my LocalSettings.php - I updated my example so it looks correct. I already had those options enabled, but I did add error_reporting( -1 ); and ini_set( 'display_errors', 1 ); and got the following:
 * Notice: Array to string conversion in /var/www/html/extensions/MSSQLBackCompat/DatabaseMssqlOld.php on line 79
 * Not sure if this is an actual error or just an advisory, but I'll dig into the code it's referring to and see what I can find. Thanks again for the help. --Chazbot7 (talk) 16:34, 27 June 2016 (UTC)


 * I ended up trying the ODBC method instead, which worked great! Thanks again for this extension, it is a lifesaver. --Chazbot7 (talk) 18:26, 27 June 2016 (UTC)

Problems with special characters when retrieving data from XML
Hello,

I'm having some serious problems when trying to retrieve information. I've got a web service which delivers information in XML format, and it now contains some special chars like "ä ö ü + @" etc. in between some XML tags:

I can retrieve all the values that have none of these characters in them, but when I try to retrieve one which contains them, the wiki stops showing the content at the exact moment when a special character would be displayed:

code: resulting in the following text on the wiki page:

Kurzbeschrieb Projekt zur Abl --> which should be "Projekt zur Ablösung der FA SomeApplication und..."

It seems that the extension has problems retrieving UTF-8 encoded data via XML, or am I messing up something? I've just updated to version 1.8.3 and am using MediaWiki 1.25.1 with PHP 5.3.29 and MySQL 5.5.43! Maybe my MediaWiki software is just outdated again?

Cause
The actual retrieval is fine. But the current implementation of getContent might not result in what you expect - whether this is intentional or not, I don't know. getContent is called multiple times for the same XML element if the value is non-ASCII (see xml-set-character-data-handler). So, your value ends up split up:

// pseudo code
description[0] = "Projekt zur Abl"
description[1] = "ösung der FA SomeApplication und..."

Current implementation of ED_Utils::getContent (14 July 2016)

Probably, the content should be added to the current value of the array's last element if the XML element is still the same.


 * If you add the line "$content = html_entity_decode( $content );" above line 50, does it make a difference? Yaron Koren (talk) 14:02, 14 July 2016 (UTC)
 * No, why would it? The passed parameter $content is only a part of the element's actual content; the first time getContent is called for an element, $content is the part up to the first non-ASCII character, the second time up to the second non-ASCII character (or to the end), and so on. So, decoding the partial content still leaves the content split up. What you need is something like "$edgXMLValues[$edgCurrentXMLTag][count($edgXMLValues[$edgCurrentXMLTag]) - 1] .= $content". Christian 09:54, 15 July 2016 (UTC)


 * Sorry for the long delay. I finally tried reproducing this problem, but couldn't - retrieving non-ASCII characters works correctly for me. My guess is that this is an encoding issue (UTF8 vs. UTF16, etc.) - the code tries to handle the encoding stuff correctly, on lines 866-869 of ED_Utils.php, but that code is only called if you have the mbstring PHP extension installed. Could it be that you don't have this extension? Yaron Koren (talk) 13:48, 19 September 2016 (UTC)
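The splitting described under "Cause" can be reproduced outside the extension. Here is a minimal sketch (not ED's actual code) using PHP's expat bindings, where the character-data handler may fire more than once for a single element:

```php
<?php
$chunks = [];
$parser = xml_parser_create();
// expat may invoke this handler several times for one element,
// e.g. around the character reference below; storing each call as a
// separate array entry (as ED does) splits the value up.
xml_set_character_data_handler( $parser, function ( $p, $data ) use ( &$chunks ) {
	$chunks[] = $data;
} );
xml_parse( $parser, '<d>Projekt zur Abl&#246;sung</d>', true );
xml_parser_free( $parser );
// $chunks likely holds the value in pieces, e.g.
// ['Projekt zur Abl', 'ö', 'sung'];
// implode( '', $chunks ) restores the full value.
```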

Telephone number interpreted as table caption
In a field retrieved via #get_web_data, I've got a telephone number "+41 XX XXX XX XX" which I want to display in a table. But somehow the telephone number gets interpreted as a caption, ending up on top of the table: code:

result:


 * Answer: The "+" of your international number in combination with the "|" creates the caption (see Help:Tables):

Adding a space between "|" and your telephone number should solve the problem.
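Concretely, the two variants parse very differently (see Help:Tables):

```
{| class="wikitable"
|+41 XX XXX XX XX
|}
```

Here "|+" starts a table caption, so the number lands above the table, whereas

```
{| class="wikitable"
| +41 XX XXX XX XX
|}
```

with a space after "|" produces an ordinary cell.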

Getting different datasets out of one file on a page
Hi! External Data is a really nice extension; I have already tested it successfully with data tables stored in an SQL database - everything works great. But I have wondered if it is possible to load one file completely at the top of every page and then have access to individual values of that list.

Let me give you a quick example to make it more understandable: Let's say I have a database with a list containing many cooking recipes, including their calories, preparation time, difficulty and so on. If I was interested in the data of one particular recipe, I could fetch the data using the right parser function and the |where=recipe='steak' parameter. This works fine for creating a nice infobox at the top of the page or similar, and I can also create a list of all recipes by dropping the where clause, or I can create a list by filtering out just the easy recipes, for example, without any problem.

But what if I wanted to make a list of my favorite recipes? As far as I know, I would have to make a table header and then a template where I create a single row for that table and use the get_web_data parser with a where clause searching for the required recipe. But this way, if I made a list containing a hundred recipes, I would call the parser and load the list at least a hundred times, which probably would put unnecessary pressure on my database.

Is there already a method to solve my problem more efficiently, or could one be added (e.g. by loading the list once at the beginning of the table or page and then having a parser function to grab the desired data out of it without communicating with the database), or is the way I described in fact very efficient and I just don't get how the extension functions (also likely ;))? --Robohendl (talk) 06:46, 10 August 2016 (UTC)


 * One option is to just have one call with a long "where" parameter - like "where=recipe='steak' OR recipe='fish' OR ...". I think that would work - would that solve your problem? Yaron Koren (talk) 13:03, 10 August 2016 (UTC)
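i.e., something along these lines (names invented); an SQL IN list should work equivalently, if the backing database supports it:

```
{{#get_db_data:
db=recipes-db
|from=recipes
|where=recipe='steak' OR recipe='fish'
|data=recipe=recipe,calories=calories
}}
```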


 * Thanks for the quick response! Yes, this indeed solves most of my problems; it's a great idea. But I also have one special case that I think I cannot solve with this solution: I want to add a short commentary to every row, in the form of an additional cell, like this:


{| class="wikitable"
! Recipe !! Calories !! Prep. Time !! Difficulty !! Comment
|- style="background: lightblue;"
| Steak
| 650
| 20 min
| 2/5
| style="background: lightgreen;" | My favorite dish!
|- style="background: lightblue;"
| Fish
| 320
| 25 min
| 3/5
| style="background: lightgreen;" | Very light meal!
|}


 * ...where the information in the blue cells is stored in the database, and the content of the green ones should be added on the page directly, because this varies/has to be able to vary on different pages, obviously. So I guess the input on the page could look like this:




 * Is this possible with the extension or do you have a solution in mind that helps me achieve this? If not I could try to find a different solution for this specific problem but nevertheless it would be very great if that would work! Thanks --Robohendl (talk) 15:49, 10 August 2016 (UTC)


 * This is doable, although with a hack. Here's what you can do: first, create a new wiki page, holding "CSV" (i.e., a vertical list) with all the comments, so it looks like this:

Comment
My favorite dish!
Very light dish!
 * There should be a blank line for any recipe that doesn't have a comment. Then, right above or below the #get_db_data call, put the following call:


 * ...though you'll have to modify that URL value to fit your wiki, and the name of the "comments" page you created.
 * Once that's done, you should be able to use the "Comment" field the same way you use all the fields from the #get_db_data call, in your #for_external_table (or #display_external_table) call. Yaron Koren (talk) 02:17, 11 August 2016 (UTC)
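A rough sketch of such a #get_web_data call, reading the comments page via action=raw (the URL and page name here are invented and need adjusting):

```
{{#get_web_data:
url=http://mywiki.example.com/index.php?title=Recipe_comments&action=raw
|format=CSV with header
|data=Comment=Comment
}}
```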


 * I think this can be helpful. It's probably not the best solution but it works and this it what counts, right? Again, thank you very much for your help and the quick responses! --Robohendl (talk) 11:12, 11 August 2016 (UTC)

get_db_data retrieves a max of 255 chars
Hi,

In my SQL DB we store up to 4k chars of text in a given field. #get_db_data retrieves only 255 chars. Is this something I can change?

Thanks, Reiner

Index undefined errors from ED_ParserFunctions.php
Whenever a query results in an empty set, line 494 of ED_ParserFunctions.php throws an undefined index error. What I propose is that this statement be wrapped inside an if-clause. It certainly works for me :-)

ExternalData and 2-Way SSL?
I'm currently setting up a wiki in an SSZ (secure service zone), but I have to receive data out of another network zone. Business policies now define that the connection between two different network zones has to be 2-way SSL.

So, is it possible to send a certificate alongside the #get_web_data request? Without it, I can't start a 2-way SSL connection.

Thanks for your answer.