Extension talk:Data Transfer

Error while uploading
Hi, My wiki uploads a file through the datatransfer extension. However, it always gives me the same error:

Error: the column 0 header, 'Gemeente[GemeenteCode]', must be either 'Title', 'Free Text' or of the form 'template_name[field_name]'

I do have a template Template:Gemeente that has a field GemeenteCode. (see code). So where can it go wrong?

< > This is the "Gemeente" template. It should be called in the following format:

Edit the page to see the template text.

Nieuws
< >

Error after installation
I have attempted to install the Data Transfer Extension, but when trying to go to the View XML page I get the following error: "Warning: Invalid argument supplied for foreach in C:\mediawiki\extensions\DataTransfer\specials\DT_ViewXML.php on line 189". This error is printed 7 times. The page looks fine below the errors. Is there an easy fix? Jascraig 12:17, 23 April 2008 (UTC)


 * Hi, thanks for the bug report. I think I know the problem - you have some pages that don't belong to any category, and Data Transfer wasn't expecting those. There's an easy fix, which I'll add to the next version of the extension. Yaron Koren 14:28, 23 April 2008 (UTC)


 * Yaron, Thanks, that fixed the problem. Now I'm having problems when I click Submit on the View XML page, instead of getting the XML I expect, I'm redirected back to my Wiki Main Page.  I am going to try and diagnose this locally, but if you have any ideas what could be causing this I would appreciate the information.  Thanks for the quick response. Jascraig 16:21, 23 April 2008 (UTC)

About dt_xml_* messages
I'm a translator at translatewiki.net. dt_xml_* software messages such as dt_xml_namespace are used in XML output as names of elements or attributes. They are, however, translatable. When you export XML files from one wiki, they can't be imported into a wiki in a different language, because the XML output is not portable. Is this intentional? --fryed-peach 15:01, 16 April 2009 (UTC)


 * Well, Data Transfer was never meant to help with transferring pages from one wiki to another, since Special:Export and Special:Import already handle that; it's meant only for transferring data with non-wiki systems; so the tags can say anything. Yaron Koren 16:19, 16 April 2009 (UTC)


 * Thank you for your quick response. The current documentation is not clear to me around interwiki-portability. But now I understand its design. --fryed-peach 06:03, 17 April 2009 (UTC)

CSV example needed
I am unclear on how to properly format a CSV file to have it accepted by this extension. I am attempting to import a CSV file exported from Excel or Open Office.

Without the proper headers, the import generates this error: Error: the column 0 header, 'Revised: June 23, 2009', must be either 'Title', 'Free Text' or of the form 'template_name[field_name]'


 * As the documentation says, it's actually pretty easy. All you have to do is prepare the header, in one line. For example, with the following CSV:


 * George, 22, 1977, He is a good guy
 * Julian, 24, 1979, He is a bad guy


 * The proper header would be:


 * Title, People[date], People[year], Free Text


 * You also have to create the template with the matching parameters. In this case that would be Template:People, and in there you should have something like "He is {{{date}}} years old, was born in {{{year}}}". And that's it.
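The header rule quoted in the error message can be checked before uploading. The following is only a minimal sketch in Python, not the extension's own code: the function name invalid_headers is made up for illustration, and it assumes that headers are compared after trimming surrounding spaces and that the English names 'Title' and 'Free Text' apply on your wiki.

```python
import csv
import io
import re

# Accepted forms, taken from the error message quoted in this thread:
# 'Title', 'Free Text', or 'template_name[field_name]'.
TEMPLATE_FIELD = re.compile(r"^[^\[\]]+\[[^\[\]]+\]$")


def invalid_headers(csv_text):
    """Return (column index, header) pairs that match none of the accepted forms."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header_row = next(reader, [])
    bad = []
    for index, header in enumerate(header_row):
        name = header.strip()  # assumption: surrounding spaces do not matter
        if name in ("Title", "Free Text") or TEMPLATE_FIELD.match(name):
            continue
        bad.append((index, name))
    return bad


good_csv = "Title, People[date], People[year], Free Text\nGeorge, 22, 1977, He is a good guy\n"
bad_csv = '"Revised: June 23, 2009"\nGeorge, 22, 1977, He is a good guy\n'

print(invalid_headers(good_csv))  # []
print(invalid_headers(bad_csv))   # [(0, 'Revised: June 23, 2009')]
```

A spreadsheet export that starts with a title or date line, as in the question above, fails this check on column 0, which matches the reported error.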

#ask queries treated as templates
Hi... the XML output seems to treat #ask queries as templates, so I get:


 *  TypeUserDescription 

Is that the expected behaviour? - Borofkin 04:39, 14 January 2010 (UTC)

Call to a member function getNamespace
I have a CSV-file with 5400 rows of information created in Openoffice Calc and saved as CSV. Just titles and a short free text on each row. When I try to import the CSV I get an error saying:

Fatal error: Call to a member function getNamespace on a non-object in /home/christoffer/public_html/wiki/includes/JobQueue.php on line 277

This error seems to have something to do with non-UTF-8 encoded characters, but as far as I know I don't have any "special" characters and my files are always saved with UTF-8 encoding. If I take some random rows from the file it works, but the whole file won't work. I have also tried to import a file with several non-ASCII characters like åäö, lots of quotation marks, commas and so on, and it also works. Is it the size of the file that is the problem (too big) or is it something else? --Squall Leonhart 22:37, 8 April 2010 (UTC)


 * Hi - if possible, could you try splitting this up into smaller files, and try importing those? At the very least, it'll help determine whether the problem is the file size, or some specific row(s); and maybe help isolate the offending rows, if any. Yaron Koren 00:47, 10 April 2010 (UTC)


 * I tried that now. I think I found the error. Some of the rows did contain characters like [], {} and ~ in the title column. When I removed them everything worked like a charm! --Squall Leonhart 13:16, 10 April 2010 (UTC)


 * Okay, that's good to know, thanks. I'll try to fix the code so that it handles those characters more gracefully. Yaron Koren 21:20, 11 April 2010 (UTC)


 * This has been fixed now in SVN, and it'll be fixed in the next version of Data Transfer as well, FYI. Yaron Koren 05:32, 13 April 2010 (UTC)
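For anyone on an older version who needs to find such rows in a large file, a short Python sketch like the following can list them. It is an illustration only: the character set is just the one reported in this thread, the function name is hypothetical, and it assumes the title is in the first column.

```python
import csv
import io

# Characters reported in this thread as breaking the import when used in a title.
SUSPECT_CHARS = set("[]{}~")


def rows_with_suspect_titles(csv_text, title_column=0):
    """Return (line number, title) for data rows whose title has a suspect character."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row (line 1)
    hits = []
    for line_number, row in enumerate(reader, start=2):
        if len(row) > title_column and SUSPECT_CHARS & set(row[title_column]):
            hits.append((line_number, row[title_column]))
    return hits


sample_csv = "Title,Free Text\nPlain page,ok\nList [draft],text\nNotes ~ 2010,text\n"
print(rows_with_suspect_titles(sample_csv))
# [(3, 'List [draft]'), (4, 'Notes ~ 2010')]
```

The line numbers count the header as line 1, so they match what a text editor shows, provided no field contains an embedded line break.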

Multiple templates & ImportCSV
One thing I've noticed - if you attempt to import a CSV file to create a page which has multiple instances of the same template, it silently just uploads the last one specified. eg in CSV:
 * Title, Template1[field1], Template1[field1]
 * Testpage, foo1, foo2

The workaround was to use ImportXML which does let you specify multiple versions of the same template. --92.236.50.225 17:05, 2 May 2010 (UTC)

EDIT: hmm... have also noticed that "&amp;" in field values doesn't get re-imported by ImportXML, even if correctly escaped... --92.236.50.225 20:38, 2 May 2010 (UTC)


 * The hack I've used is to create several identical templates, then rename them with the ReplaceText extension, like this:
 * Title, Template1[field1], Template2[field1]
 * Testpage, foo1, foo2
 * Template2 is exactly the same as Template1, and once you have imported the file you go to Special:ReplaceText and replace Template2 with Template1 (yeah, it's a tiny dirty hack XD)


 * Ooh, that's not bad! Yeah, I can't think of another solution for it. Yaron Koren 16:44, 12 May 2010 (UTC)
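Since the import drops the earlier values silently, it can help to check a file for repeated column headers first. A minimal Python sketch, with a made-up function name and the assumption that headers are compared after trimming spaces:

```python
import csv
import io
from collections import Counter


def duplicate_headers(csv_text):
    """Return the header names that appear more than once in the first row."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header_row = [header.strip() for header in next(reader, [])]
    counts = Counter(header_row)
    return sorted(name for name, count in counts.items() if count > 1)


sample_csv = "Title, Template1[field1], Template1[field1]\nTestpage, foo1, foo2\n"
print(duplicate_headers(sample_csv))  # ['Template1[field1]']
```

Any header this reports is a candidate for the Template2 renaming trick described above.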

Templates Inside of Templates
I run into a lot of XML parsing difficulties when it runs across a page with a template inside of a template. For instance, if the wikitext is:

Harmony
So we have ,

then the xml file gives

Home DemoMp3Aisumasen Home.mp3}}}</Field><Field Name="">=Harmony==

So we have Bridge</Field><Field Name="key">I</Field><Field Name="start">ii</Field><Field Name="4"></Field><Field Name="5">,

It then seems to interpret main body text as various fields, which runs into all sorts of weird problems from then on.

Douglas (May 25 2010)


 * That looks like a bug in the code. Though Data Transfer isn't really geared for templates-within-templates, so I'm not that surprised that it doesn't work. Yaron Koren 13:20, 26 May 2010 (UTC)

Template Calls with Special Characters
For instance, I sometimes have the line

as a parameter input. Example:

The XML then looks for a matching

Dunno if this is a bug in the code or a poor convention of mine.

Okay, I fixed this, so in case anyone wants to know: the convention between HTML and XHTML is that HTML permits the unclosed line break while XHTML requires that it be closed like

One more problem, however, arises with the use of

I use this in my template input all the time when I want a box, for instance, to display some terminal text. I'll write something to ignore equals signs when they are inside angle brackets.

-Douglas (May 26 2010)

overwrites existing pages
hi,

how can I prevent overwriting existing pages with this import-mechanism?

Thx. --Rolze 08:27, 15 June 2010 (UTC)


 * Unfortunately, you can't - but that would be a nice feature to have. Yaron Koren 12:21, 15 June 2010 (UTC)
 * Ok, so I - as a faithful semantic mediawiki fan and user - request that this feature gets on the roadmap of this fine extension! And that it won't be forgotten on the nice2have-wishlist! ;) --Rolze 21:29, 15 June 2010 (UTC)


 * I think this would be a nice feature, too. Cheers --kgh 22:42, 15 July 2010 (UTC)
 * still needed! Is there a possibility to get this on the roadmap? --Rolze 10:14, 22 April 2011 (UTC)


 * Hi - that feature's already in there, as of version 0.3.7 - you can "skip" pages that already exist. Yaron Koren 15:46, 22 April 2011 (UTC)

Bot flag is being ignored
Hi Yaron, so far the bot flag is being ignored by this extension. All the log actions of this extension are displayed on recent changes even though the importing user has a bot flag. This is not really a problem, but it would be nice if the extension accepts the bot flag in the future. Cheers --kgh 22:41, 15 July 2010 (UTC)

semicolons as separators
Hi, it is me again. :-) It would be very useful if semicolons were the separators of choice with this extension, since they are far rarer within texts. As it is, if texts containing a comma get imported, they are destined to ruin the respective imported page. Hmm, is it possible to use text markers like ' or " for texts? That would be a workaround. Cheers --kgh 22:48, 15 July 2010 (UTC)
 * Hi - well, CSV is a very standard format, unlike, say, "SSV". In CSV, values that contain commas are meant to be enclosed within quotes, which solves the problem. Yaron Koren 23:04, 15 July 2010 (UTC)
 * That is true and every user should stick to this to keep the format exchangeable without having to prepare the file for further use. Thank you and cheers --kgh 07:16, 16 July 2010 (UTC)
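If your data already exists as a semicolon-separated file, converting it to standard CSV with proper quoting is straightforward. Here is a small Python sketch using the standard csv module; the function name and the sample row are made up for illustration.

```python
import csv
import io


def semicolon_to_csv(text):
    """Convert semicolon-separated text to comma-separated CSV.

    The csv writer's default quoting (QUOTE_MINIMAL) wraps any value that
    contains a comma or a double quote in double quotes.
    """
    reader = csv.reader(io.StringIO(text), delimiter=";")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


sample = "Title;Free Text\nGeorge;He is a good guy, mostly\n"
print(semicolon_to_csv(sample))
# Title,Free Text
# George,"He is a good guy, mostly"
```

The value with the comma ends up in quotes, which is the form Yaron describes above.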

Enhancement to DataTransfer
Hi, a nice enhancement would be the ability to insert an individual description for the upload instead of the standard "CSV-Import". Another wish would be that the name of the file used for the import is picked up and displayed, too. To me the second request seems to be more important. Cheers --kgh 22:20, 19 July 2010 (UTC)


 * That's a good idea, and you asked at just the right time - I was in the middle of getting a new version ready for a release. This feature is now available (though only for CSV import right now, not for XML import) in the latest version, 0.3.6. Yaron Koren 16:31, 26 July 2010 (UTC)


 * Cool, thank you! Great work! I am very happy about this. I am mostly doing CSV imports, thus my urgent needs are fulfilled, which does not mean... :-) Cheers --kgh 08:51, 27 July 2010 (UTC)

SVN tag for 0.3.6 missing
The last SVN tag is 0.3.5 --RScheiber 22:40, 14 September 2010 (UTC)

Why does this extension exist?
RE: It should be noted that Data Transfer is not an ideal solution for backing up one's wiki, or transferring wiki pages from one MediaWiki site to another; for that, the much better solution is to use MediaWiki's built-in "Special:Export" and "Special:Import" pages.


 * So why have this extension at all if "Special:Export" and "Special:Import" is better? Adamtheclown 06:05, 24 November 2010 (UTC)


 * What do you do in case your data is not stored in xml? That is the reason for this extension. Cheers --kgh 11:28, 24 November 2010 (UTC)

Expected pages got mediawiki error
My error message: Since I don't see why this extension exists, since "Special:Export" and "Special:Import" already moves pages better, I will not try to get assistance to fix this problem. Adamtheclown 06:17, 24 November 2010 (UTC)


 * I think you've misunderstood the point of Data Transfer - it's for getting in data from any system other than MediaWiki. Here you seem to be using MediaWiki's XML, but Data Transfer has its own XML format that's different. Yaron Koren 15:03, 24 November 2010 (UTC)

mediawiki error
My error message: Fatal error: Call to undefined function wfloadextensionmessages in /home/zasve/public_html/oglasi/extensions/DataTransfer/specials/DT_ViewXML.php on line 18

Just when I approach http...www...zasve.net/oglasi/index.php?title=Posebno:Specialpages

$wgLanguageCode = "sr-el";

Bonzo 12:45, 16 December 2010 (UTC)


 * You must be using the very latest MediaWiki version. I'll fix that soon; for now, you can just delete that line in the code. Yaron Koren 17:50, 16 December 2010 (UTC)

Bonzo 18:00, 16 December 2010 (UTC)
 * Yes. version 1.9.2 . Thanks :).


 * Oh, you're using version 1.9.2? Never mind, then. :) That version's not supported by Data Transfer. Yaron Koren 18:36, 16 December 2010 (UTC)

File size
Yaron, hi! First, I love this extension! It really makes my life easier! I'm importing 48k articles from Excel. Everything is OK, but I'm doing some tests, and dividing the articles into multiple CSV files every time is painful. How can I make your extension accept bigger CSV files? Is that possible? Thanks, Filipe.


 * Hi - what's the error message you get when you try to upload a bigger file? Yaron Koren 04:59, 27 December 2010 (UTC)


 * It doesn't give a message. It returns to the same page like nothing has happened.


 * Hi - that's odd. What versions of MediaWiki and Data Transfer are you using? Yaron Koren 04:08, 5 January 2011 (UTC)


 * I had this problem and it turned out to be a PHP timeout. To solve it, increase the maximum execution time in php.ini and restart Apache.


 * 0.3.4. If my file is TOO big, it does that. I got a message once, when I removed some lines from the CSV file, that there were too many items. And if I remove even more, leaving like 1 thousand items, it works perfectly. Filipe, 7 January 2011 (UTC)


 * Okay, but what MediaWiki version? Yaron Koren 16:00, 7 January 2011 (UTC)


 * 1.16 (http://www.wikireceitas.com.br/Especial:Vers%C3%A3o) Filipe, 8 January 2011 (UTC)


 * Yaron, can you help me with the default tables I have to populate? I'm trying to make an Excel macro to do that for me. Filipe, 8 January 2011 (UTC)


 * Sorry, I don't know what you mean by that. Yaron Koren 15:00, 10 January 2011 (UTC)
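Until the size problem is tracked down, splitting a big file does not have to be done by hand. The following Python sketch cuts CSV text into chunks that each repeat the header row. The function name is hypothetical, and the chunk size that works on a given server is an assumption (the thread only mentions that about a thousand rows worked).

```python
import csv
import io


def split_csv(csv_text, rows_per_chunk):
    """Split CSV text into chunks of at most rows_per_chunk data rows each.

    Every chunk starts with a copy of the original header row.
    """
    if rows_per_chunk < 1:
        raise ValueError("rows_per_chunk must be at least 1")
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    chunks = []
    for start in range(0, len(data), rows_per_chunk):
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(data[start:start + rows_per_chunk])
        chunks.append(out.getvalue())
    return chunks


sample_csv = "Title,Free Text\nA,one\nB,two\nC,three\n"
for chunk in split_csv(sample_csv, 2):
    print(chunk)
```

Each returned string can be written to its own file and uploaded separately.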

Enhancement to DataTransfer
Hi, since the last version of this extension it is possible to add data to an existing page. However, it is not possible to add data to an existing template.

E. g. in case you want to add "World" to

you will get

instead of

since this extension does not recognize that the same template is already on the page.

There is a quite dirty workaround: remove }} from the respective pages with the ReplaceText extension and then import the line below. The disadvantage is that you get two new revisions per page and that it only works once.
 * Property2=World}}

Perhaps it is possible to enhance this extension to avoid this. Hopefully this is not a daydream. ;)

Cheers --[[kgh]] 14:34, 9 February 2011 (UTC)
 * This feature is very needed ;) Cheers, --Rolze (talk) 12:15, 4 February 2013 (UTC)

Importing Category pages
How do I import Category pages? I have a large number of these externally generated. If I try to generate my pages with a block beginning: <Namespace Name="Category"> <Page Title="catname"> <Free_Text> .... etc DataTransfer strips the Namespace information and creates the page within Main, not within Category.

If instead I try to reverse engineer it by using ViewXML on an existing Category page, it gives me a block beginning <Category Name="catname"> which includes the contents of every page in the category, which is not what I want. Marinheiro 10:00, 9 March 2011 (UTC)

This was easier than I thought (though I did get a bit misled by the Help page): just use Category:category name as the page title, eg.

<Page Title="Category:Inactive Resources"> <Free_Text>

</Free_Text> </Page> 217.169.30.146 22:53, 12 March 2011 (UTC)
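If the category pages are generated by a script, the Page element can be built with an XML library so that special characters in the text are escaped automatically. This is a Python sketch only: it produces just the Page element shown above (whatever elements your import file wraps around it are not covered here), and the function name and sample text are made up.

```python
import xml.etree.ElementTree as ET


def category_page_element(category_name, free_text):
    """Build a <Page> element whose title carries the Category: prefix."""
    page = ET.Element("Page", {"Title": "Category:" + category_name})
    body = ET.SubElement(page, "Free_Text")
    body.text = free_text  # ElementTree escapes &, < and > when serializing
    return ET.tostring(page, encoding="unicode")


print(category_page_element("Inactive Resources", "Resources no longer in use & archived."))
# <Page Title="Category:Inactive Resources"><Free_Text>Resources no longer in use &amp; archived.</Free_Text></Page>
```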

Version to download for MediaWiki 1.17
This is just a tip for anybody like me who was having trouble with this extension after upgrading to MediaWiki 1.17. Any files that I tried to upload just gave the "The file you uploaded seems to be empty. This might be due to a typo in the file name. Please check whether you really want to upload this file." error.

If you follow the "Download snapshot" link on the Data Transfer Extension page, and select Mediawiki 1.17.x, you will be given version 0.3.6 of this extension which is not compatible with Mediawiki 1.17. You have to either select "trunk" on the Download Snapshot page, or download the code directly from the main Extension page. The version of this extension in the Semantic Bundle is also not compatible with Mediawiki 1.17.

Ability to add a single property to many pages in an automated fashion
Hello -- I know that this question was already asked to a certain extent and I wanted to ask if any progress had been made or to ask if there was another work-around.

I also would like to be able to add a property and value to a given page after the page is already created. I was wondering if you could use the export function on each page and then modify each file to have the additional property/field and then to re-import all of those same pages.

Thanks in advance, Dc321 14:54, 18 September 2011 (UTC)


 * There's now what might be a better solution to that problem: the "autoedit" functionality in Semantic Forms, which is available as both a parser function and an API action (the latter might be more helpful if you have a lot of pages, and if you know at least a minimal amount of programming.) I would look into that one. Yaron Koren 21:01, 19 September 2011 (UTC)

Order of templates
Hello, did anyone find out in what order the templates are printed in the article text? To me it seems quite random, but maybe there is a system? rotsee 20:50, 12 October 2011 (UTC)

Database glitches?
I've noticed a few times that with CSV export (Data Transfer 0.3.9), some content can get duplicated on pages that are not related, except that they're in the same category and use the same template/form. If you've chosen "append to existing content", you can simply remove the spurious extras by hand. You have to be careful though if you choose "overwrite". Just so you know. Cavila MW 1.17, MySQL 5.5.16, Php 5.3.8 12:07, 15 April 2012 (UTC)

Import CSV not overwriting existing articles
I am using Data Transfer to upload a list of articles in a specific namespace with a csv file. After importing the csv, I tweaked some of the free text and reimported the file. The changes haven't shown up. I've been careful to have the overwrite options selected. Any ideas how to get the overwrite behavior I'm looking for? Davious (talk) 20:44, 17 May 2012 (UTC)

Oh. Wow, I'm sorry. I didn't realize the 1st import was still going on after 20 minutes. It's all good. Thanks.

Exporting contents of Category pages
Hello, I will need to export a lot of category pages (i.e. the category description pages, which happen to contain a lot of information structured by templates), and I realise that I will have to make some modifications to the code of Data Transfer to achieve that. Before I start: did anyone here do something like that already?

//rotsee (talk) 15:31, 8 June 2012 (UTC)


 * Turned out to be quite simple, once looking at the code. rotsee (talk) 07:22, 12 June 2012 (UTC)

Action log for data loads
Hello,

This is a fantastic extension, and I'm using it quite extensively. I load data via the CSV option, and it works very well. However, sometimes the data load does not work (that is, the "Import CSV" option states that the pages will be created, but no data change results). I'm trying to identify the source of the problem, but am unable to find any logs for this extension, beyond the Recent Changes page. How can I figure out what the source of my problem is? Patelmm79 (talk) 20:37, 8 October 2012 (UTC)


 * My guess is that, when that happens, it's because the move is still waiting in the job queue - you can check the "job" database table to see what's going on. Yaron Koren (talk) 20:46, 8 October 2012 (UTC)

XML Parsing Error
Hello. The extension is working like a charm but there is one little hiccup. When using Special:ViewXML and selecting a category and the Simplified format, the XML output contains parsing errors when the text in a property of the type String and/or Text contains < > etc. The faulty part of the output looks like this:

The problem is that in the Free_Text the < is replaced with:

This is not happening for the text between the Remark ( Property::Remark ) tags. I actually found the function in   handling these exceptions but is there a way to do the same for all properties or am I doing something wrong here? The property Remark has the type Text, I also tried String but with the same result. It makes sense that property of the type Text & String could contain < > [ ] etc. Best regards --Jongfeli (talk) 15:54, 30 October 2012 (UTC)


 * I have added some code to handle < & > in property values ($field_value). It works fine now.

I found a strange problem. When exporting with ViewXML I always got a "·" (dot at the top) at the end of the XML result (it actually is a "?"). I needed to run "Start updating data" in the Admin functions for Semantic MediaWiki, and then the XML file parsed without any problems. This probably has something to do with me changing the template used, but I am not sure yet. --Jongfeli (talk) 10:12, 31 October 2012 (UTC)


 * Thanks for finding that fix! I just added it to the code, and released it as a new version, 0.3.11 (along with some other changes). I don't know what's causing that ./? problem, though - hopefully it was just a temporary glitch. Yaron Koren (talk) 19:47, 5 November 2012 (UTC)
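For readers who post-process the exported XML themselves, the general idea behind the fix is that <, > and & in a value must be escaped before the value is placed between tags. The Python sketch below illustrates that idea only; it is not the PHP code that was added to the extension, and the helper name and sample value are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def field_xml(tag, value):
    """Wrap a value in an element, escaping &, < and > so the XML stays well-formed."""
    return "<{0}>{1}</{0}>".format(tag, escape(value))


raw_value = "use <b> tags & more"
xml_text = field_xml("Remark", raw_value)
print(xml_text)  # <Remark>use &lt;b&gt; tags &amp; more</Remark>

# Parsing the escaped output gives back the original value.
print(ET.fromstring(xml_text).text == raw_value)  # True
```

Without the escape call, the parser would treat the b tag inside the value as an unclosed element and reject the document.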

Status of Import CSV
Hello,

The import CSV function has saved a lot of time for data upload! But one thing that is a bit frustrating is importing CSV files without any result, and not knowing the reason for the failure. I keep an eye on the progress of the job via the "Recent Changes" page, but this is not ideal. What is the best way to monitor a job, so that I can see its status and the reason for any failure? Patelmm79 (talk) 15:44, 22 November 2012 (UTC)


 * You can check on the status of jobs in general by looking at the "job" table in the database. And if there are failures, you can create a log file and then check that. Yaron Koren (talk) 15:51, 22 November 2012 (UTC)


 * Much obliged Yaron. I've set this up as you've suggested, but suddenly the data load seems to be working fine, and I can't figure out what has changed.  But now that I can log, I can observe the behavior more closely.  Thank you! Patelmm79 (talk) 18:28, 22 November 2012 (UTC)

Clarifying behavior of Overwrite function
Hello, I wanted to clarify how one should expect the Overwrite function to work.

I tried to add data with a subset of fields onto a set of pages. The pages had existing data already, and I wanted to add information into new fields.

My CSV file contained only the list of fields that I wished to update. In such a case, I would expect that the fields that are not specified would not be touched.

However, upon import of the file using the "Overwrite" option, the extension imported the new data and overwrote all other fields with null values (again, I did not specify those fields in the CSV file).

"Append" option adds another instance of the data set with the imported values, even though I only want one data set.

I have not yet tried Skip, but before I did so, I wanted to be clear if this is how the "Overwrite" option should behave. Patelmm79 (talk) 18:34, 22 November 2012 (UTC)


 * "Overwrite" just overwrites what was there before - it ignores the previous content. Yaron Koren (talk) 18:39, 22 November 2012 (UTC)


 * May I suggest a modification to the behavior of the "overwrite" function, or perhaps a new option? Rather than "overwriting" for all fields, whether or not the field is actually specified in the source data file, it would be better to overwrite only those fields that are specified in the source data, and leave the other fields untouched.  From a practical perspective, this would make it much easier to populate new properties when they are created; instead of having to create a file with all data and then import that file, one would only have data for that property that will be updated. Patelmm79 (talk) 16:02, 21 March 2013 (UTC)


 * That's a very good idea - although it would probably be better as another option, maybe called "Limited overwrite". Yaron Koren (talk) 20:30, 22 March 2013 (UTC)
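To make the difference between the two behaviours concrete, here is a tiny Python illustration that treats a template call as a dictionary of fields. It is only a model of the discussion above, not code from the extension; "limited overwrite" is the proposed option, and the field names are invented.

```python
def overwrite(existing, imported):
    """Current behaviour as described above: previous content is ignored."""
    return dict(imported)


def limited_overwrite(existing, imported):
    """Proposed behaviour: only the fields present in the import are replaced."""
    merged = dict(existing)
    merged.update(imported)
    return merged


existing_fields = {"Name": "George", "Year": "1977"}
imported_fields = {"Age": "22"}

print(overwrite(existing_fields, imported_fields))
# {'Age': '22'}
print(limited_overwrite(existing_fields, imported_fields))
# {'Name': 'George', 'Year': '1977', 'Age': '22'}
```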

Database Import error

 * I am importing a very large XML file (~18000 pages) and I get an error after a while:

"INSERT INTO `recentchanges` (rc_timestamp,rc_cur_time,rc_namespace,rc_title,rc_type,rc_minor,rc_cur_id,rc_user,rc_user_text, rc_comment,rc_this_oldid,rc_last_oldid,rc_bot,rc_moved_to_ns,rc_moved_to_title,rc_ip, rc_patrolled,rc_new,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action, rc_params,rc_id) VALUES ('20130130101705','20130130101705','0','SA1961.084','1','0','6510','0','127.0.0.1',NULL,'6665', '0','0','0',,'127.0.0.1','0','1','0','0','0','0',NULL,,'',NULL)" from within function "RecentChange::save". Database returned error "1048: Column 'rc_comment' cannot be null (localhost)"

Any idea what could be causing this? I should mention that I am using runJobs.php to add the files; they seem to be added all right when I am refreshing the page.


 * What versions of MediaWiki and Data Transfer are you using? Also, to import the XML, are you using the page Special:Import or Special:ImportXML? Yaron Koren (talk) 14:26, 30 January 2013 (UTC)


 * Mediawiki 1.20.2 Data Transfer (Version 0.3.12). I am using Special:importXML, it went through quite a few successfully before throwing this error, then it keeps throwing it very regularly, even after reboot.

When importing with CSV, is there a size limit to how many characters per field?
I've had success importing other CSV files using DataTransfer, and I'm currently having trouble importing a CSV file with only 67 records. 57 of the records will import, but 10 of them generate the "500 Error" pages.

What is similar about the 10 records that won't import is that the character counts of a certain field are all over 6000.

The property I created to store these values is of Text, so I assumed it should be able to hold something that size.

Is there anything I can do to overcome this limit?

Thanks

Ancaster (talk) 07:51, 9 March 2013 (UTC)

Edited to add that I was able to make it work when I set $smwgLinksInValues = false. So it fixed my Import CSV problem, but now I need to learn more about the linking.
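To find out which records are affected before importing, a short Python sketch like this one lists the fields above a given length. The 6000-character threshold is simply the figure observed above, not a documented limit, and the function name and sample data are made up.

```python
import csv
import io


def long_fields(csv_text, limit=6000):
    """Return (line number, column header, length) for values longer than limit."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    hits = []
    for line_number, row in enumerate(reader, start=2):
        for column, value in enumerate(row):
            if len(value) > limit:
                name = header[column] if column < len(header) else str(column)
                hits.append((line_number, name, len(value)))
    return hits


sample_csv = "Title,Notes[text]\nA,short\nB," + "x" * 6001 + "\n"
print(long_fields(sample_csv))  # [(3, 'Notes[text]', 6001)]
```

Given that changing $smwgLinksInValues resolved the problem, the length may only correlate with the failures rather than cause them, so treat the output as a list of rows to inspect.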