Extension talk:Data Transfer/Archive 2008 to 2013


Error while uploading

Hi, I am uploading a file to my wiki through the Data Transfer extension. However, it always gives me the same error:

Error: the column 0 header, 'Gemeente[GemeenteCode]', must be either 'Title', 'Free Text' or of the form 'template_name[field_name]'

I do have a template, Template:Gemeente, that has a field GemeenteCode (see the code below). So where does it go wrong?


<noinclude>
This is the "Gemeente" template. It should be called in the following format:
<pre>
{{Gemeente
|Naam=
|ProvincieNaam=
|GemeenteCode=
|Inwoners=
|InwonerKlasse=
|WGRNaam=
|Twitteradres=
|URL=
}}
</pre>
Edit the page to see the template text.
</noinclude><includeonly>
{| class="wikitable"
! align="left" |Naam Gemeente
| align="left" | [[Naam::{{{Naam|}}}]]
|-
! align="left" |Naam Provincie
| align="left" |[[ProvincieNaam::{{{ProvincieNaam|}}}]]
|-
! align="left" |GemeenteCode
| align="left" |[[GemeenteCode::{{{GemeenteCode|}}}]]
|-
! align="left" |Aantal Inwoners
| align="left" |[[Inwoneraantal::{{{Inwoners|}}}]]
|-
! align="left" |Inwoner Klasse
| align="left" |[[GemeenteGrootteKlasse::{{{InwonerKlasse|}}}]]
|-
! align="left" |Regionaam
| align="left" |[[WGRNaam::{{{WGRNaam|}}}]]
|}
==Nieuws==
{{#ask: [[Category:Nieuws]] [[Gemeente::{{SUBPAGENAME}}]]
| sort=date
| order=descending
| format=broadtable
}}
{{#widget:Twitter Search
|query={{SUBPAGENAME}}
|title={{SUBPAGENAME}}
|caption=What people say about {{SUBPAGENAME}}
}}
{{#widget:Twitter
|user={{Twitteradres}}
|count=5
}}
[[Category:Gemeente]]
</includeonly>

Error after installation

I have attempted to install the Data Transfer Extension, but when trying to go to the View XML page I get the following error:
"Warning: Invalid argument supplied for foreach() in C:\mediawiki\extensions\DataTransfer\specials\DT_ViewXML.php on line 189". This error is printed 7 times. The page looks fine below the errors. Is there an easy fix? Jascraig 12:17, 23 April 2008 (UTC)Reply

Hi, thanks for the bug report. I think I know the problem - you have some pages that don't belong to any category, and Data Transfer wasn't expecting those. There's an easy fix, which I'll add to the next version of the extension. Yaron Koren 14:28, 23 April 2008 (UTC)Reply
Yaron, thanks, that fixed the problem. Now I'm having a problem when I click Submit on the View XML page: instead of getting the XML I expect, I'm redirected back to my wiki's Main Page. I am going to try to diagnose this locally, but if you have any ideas what could be causing this I would appreciate the information. Thanks for the quick response. Jascraig 16:21, 23 April 2008 (UTC)Reply

About dt_xml_* messages

I'm a translator at translatewiki.net. dt_xml_* software messages such as dt_xml_namespace are used in the XML output as the names of elements and attributes. They are, however, translatable. When you export XML files from one wiki, they can't be imported into wikis in other languages, because the XML output isn't portable. Is this intentional? --fryed-peach 15:01, 16 April 2009 (UTC)Reply

Well, Data Transfer was never meant to help with transferring pages from one wiki to another, since Special:Export and Special:Import already handle that; it's meant only for transferring data to and from non-wiki systems, so the tags can say anything. Yaron Koren 16:19, 16 April 2009 (UTC)Reply
Thank you for your quick response. The current documentation is not clear to me about interwiki portability, but now I understand its design. --fryed-peach 06:03, 17 April 2009 (UTC)Reply

CSV example needed

I am unclear on how to properly format a CSV file to have it accepted by this extension. I am attempting to import a CSV file exported from Excel or OpenOffice.

Without the proper headers, the import generates this error: Error: the column 0 header, 'Revised: June 23, 2009', must be either 'Title', 'Free Text' or of the form 'template_name[field_name]'

As the documentation says, it's actually pretty easy. All you have to do is prepare the header, in one line. For example, with the following CSV:
George, 22, 1977, He is a good guy
Julian, 24, 1979, He is a bad guy
The proper header would be:
Title, People[date], People[year], Free Text
You also have to create the template with the proper variables to receive the values. In this case, that would be Template:People, and in there you should have something like "He is {{{date}}} years old and was born in {{{year}}}". And that's it.
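Putting the pieces together, the complete CSV file for this example would be:

Title, People[date], People[year], Free Text
George, 22, 1977, He is a good guy
Julian, 24, 1979, He is a bad guy

Each row then becomes one wiki page, with the first column as the page title and the last column as its free text.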

#ask queries treated as templates

Hi... the XML output seems to treat #ask queries as templates, so I get:

<Template Name="#ask:_[[Category:Owned_entities]]_[[owner::School]]_[[status::Active]]"><Field Name="?is a">Type</Field><Field Name="?user">User</Field><Field Name="?description">Description</Field></Template>

Is that the expected behaviour? - Borofkin 04:39, 14 January 2010 (UTC)Reply

Call to a member function getNamespace()

I have a CSV file with 5400 rows of information, created in OpenOffice Calc and saved as CSV. Just titles and a short free text on each row. When I try to import the CSV I get an error saying:
Fatal error: Call to a member function getNamespace() on a non-object in /home/christoffer/public_html/wiki/includes/JobQueue.php on line 277
This error seems to have something to do with non-UTF-8 encoded characters, but as far as I know I don't have any "special" characters, and my files are always saved with UTF-8 encoding. If I take some random rows from the file it works, but the whole file won't work. I have also tried to import a file with several non-UTF-8 characters like åäö, lots of quotation marks, commas and so on, and it also works. Is it the size of the file that is the problem (too big) or is it something else? --Squall Leonhart 22:37, 8 April 2010 (UTC)Reply

Hi - if possible, could you try splitting this up into smaller files, and try importing those? At the very least, it'll help determine whether the problem is the file size, or some specific row(s); and maybe help isolate the offending rows, if any. Yaron Koren 00:47, 10 April 2010 (UTC)Reply
I tried that now. I think I found the error. Some of the rows did contain characters like [], {} and ~ in the title column. When I removed them everything worked like a charm! --Squall Leonhart 13:16, 10 April 2010 (UTC)Reply
Okay, that's good to know, thanks. I'll try to fix the code so that it handles those characters more gracefully. Yaron Koren 21:20, 11 April 2010 (UTC)Reply
This has been fixed now in SVN, and it'll be fixed in the next version of Data Transfer as well, FYI. Yaron Koren 05:32, 13 April 2010 (UTC)Reply

Multiple templates & ImportCSV

One thing I've noticed - if you attempt to import a CSV file to create a page which has multiple instances of the same template, it silently just uploads the last one specified, e.g. in CSV:

Title, Template1[field1], Template1[field1]
Testpage, foo1, foo2

The workaround was to use ImportXML which does let you specify multiple versions of the same template. --92.236.50.225 17:05, 2 May 2010 (UTC)Reply

EDIT: hmm... I have also noticed that "&amp;" in field values doesn't get re-imported by ImportXML, even if correctly escaped... --92.236.50.225 20:38, 2 May 2010 (UTC)Reply

The hack I've used is to create various templates, then rename them with the Replace Text extension, like this:
Title, Template1[field1], Template2[field1]
Testpage, foo1, foo2
Template2 is exactly the same as Template1, and once you have imported it you go to Special:ReplaceText and replace Template2 with Template1 (yeah, it's a tiny dirty hack XD)
Ooh, that's not bad! Yeah, I can't think of another solution for it. Yaron Koren 16:44, 12 May 2010 (UTC)Reply

Templates Inside of Templates

I run into a lot of XML parsing difficulties when it runs across a page with a template inside of a template. For instance, if the wikitext is:

{{Box|Home Demo|{{Mp3|Aisumasen Home.mp3}}}}

==Harmony==

So we have {{Bridge|key=I|start=ii}}, 

then the xml file gives

<Template Name="Box"><Field Name="1">Home Demo</Field><Field Name="2">Mp3</Field><Field Name="3">Aisumasen Home.mp3}}}</Field><Field Name="">=Harmony==

So we have Bridge</Field><Field Name="key">I</Field><Field Name="start">ii</Field><Field Name="4"></Field><Field Name="5">, 

It then seems to interpret main body text as various fields, which runs into all sorts of weird problems from then on.

Douglas (May 25 2010)

That looks like a bug in the code. That said, Data Transfer isn't really geared toward templates-within-templates, so I'm not that surprised that it doesn't work. Yaron Koren 13:20, 26 May 2010 (UTC)Reply

Template Calls with Special Characters

For instance, I sometimes have the line

<br>

as a parameter input. Example:

{{Template|param=I love you<br>...or do I?!}}

The XML then looks for a matching

</br>

Dunno if this is a bug in the code or a poor convention of mine.

Okay, I fixed this, so in case anyone wants to know: the difference between HTML and XHTML is that HTML permits the unclosed line break, while XHTML requires that it be closed, like

<br />

One more problem, however, arises with the use of

<syntaxhighlight lang="text" imline="div">...</syntaxhighlight>

I use this in my template input all the time when I want a box, for instance, to display some terminal text. I'll write something to ignore equals signs when they're inside angle brackets.

-Douglas (May 26 2010)

overwrites existing pages

hi,

how can I prevent overwriting existing pages with this import mechanism?

Thx. --Rolze 08:27, 15 June 2010 (UTC)Reply

Unfortunately, you can't - but that would be a nice feature to have. Yaron Koren 12:21, 15 June 2010 (UTC)Reply
Ok, so I - as a faithful semantic mediawiki fan and user - request that this feature gets on the roadmap of this fine extension! And that it won't be forgotten on the nice2have-wishlist! ;) --Rolze 21:29, 15 June 2010 (UTC)Reply
I think this would be a nice feature, too. Cheers --kgh 22:42, 15 July 2010 (UTC)Reply
still needed! Is there a possibility to get this on the roadmap? --Rolze 10:14, 22 April 2011 (UTC)Reply
Hi - that feature's already in there, as of version 0.3.7 - you can "skip" pages that already exist. Yaron Koren 15:46, 22 April 2011 (UTC)Reply

Bot flag is being ignored

Hi Yaron, so far the bot flag is being ignored by this extension. All the log actions of this extension are displayed on recent changes even though the importing user has a bot flag. This is not really a problem, but it would be nice if the extension respected the bot flag in the future. Cheers --kgh 22:41, 15 July 2010 (UTC)Reply

semicolons as separators

Hi, it is me again. :-) It would be very useful if semicolons were the separator of choice with this extension, since they are far rarer within texts. If texts containing a comma get imported, they are destined to ruin the respective imported page. Hmm, is it possible to use text markers like ' or " for texts? This would be a workaround. Cheers --kgh 22:48, 15 July 2010 (UTC)Reply

Hi - well, CSV is a very standard format, unlike, say, "SSV". In CSV, values that contain commas are meant to be enclosed within quotes, which solves the problem. Yaron Koren 23:04, 15 July 2010 (UTC)Reply
That is true and every user should stick to this to keep the format exchangeable without having to prepare the file for further use. Thank you and cheers --kgh 07:16, 16 July 2010 (UTC)Reply
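To illustrate the quoting convention described above, a row whose value contains commas would look like this (the template and field names here are just hypothetical examples):

Title, People[description]
George, "He is smart, friendly and helpful"

The quotes keep the commas inside a single field instead of splitting it into three columns.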

Enhancement to DataTransfer

Hi, a nice enhancement would be if one were able to insert an individual description for the upload instead of the standard "CSV-Import". Another wish would be that the name of the file used for the import is picked up and displayed, too. To me the second request seems to be more important. Cheers --kgh 22:20, 19 July 2010 (UTC)Reply

That's a good idea, and you asked at just the right time - I was in the middle of getting a new version ready for a release. This feature is now available (though only for CSV import right now, not for XML import) in the latest version, 0.3.6. Yaron Koren 16:31, 26 July 2010 (UTC)Reply
Cool, thank you! Great work! I am very happy about this. I am mostly doing CSV imports, thus my urgent needs are fulfilled, which does not mean... :-) Cheers --kgh 08:51, 27 July 2010 (UTC)Reply

SVN tag for 0.3.6 missing

The last SVN tag is 0.3.5 --RScheiber 22:40, 14 September 2010 (UTC)Reply

Why does this extension exist?

RE: It should be noted that Data Transfer is not an ideal solution for backing up one's wiki, or transferring wiki pages from one MediaWiki site to another; for that, the much better solution is to use MediaWiki's built-in "Special:Export" and "Special:Import" pages.

So why have this extension at all if "Special:Export" and "Special:Import" are better? Adamtheclown 06:05, 24 November 2010 (UTC)Reply
What do you do in case your data is not stored in xml? That is the reason for this extension. Cheers --kgh 11:28, 24 November 2010 (UTC)Reply

Expected pages got mediawiki error

My error message: [1] Since I don't see why this extension exists, given that "Special:Export" and "Special:Import" already move pages better, I will not try to get assistance to fix this problem. Adamtheclown 06:17, 24 November 2010 (UTC)Reply

I think you've misunderstood the point of Data Transfer - it's for getting in data from any system other than MediaWiki. Here you seem to be using MediaWiki's XML, but Data Transfer has its own XML format that's different. Yaron Koren 15:03, 24 November 2010 (UTC)Reply

mediawiki error

My error message: Fatal error: Call to undefined function wfloadextensionmessages() in /home/zasve/public_html/oglasi/extensions/DataTransfer/specials/DT_ViewXML.php on line 18

This happens just when I go to http...www...zasve.net/oglasi/index.php?title=Posebno:Specialpages

$wgLanguageCode = "sr-el";


Bonzo 12:45, 16 December 2010 (UTC)Reply

You must be using the very latest MediaWiki version. I'll fix that soon; for now, you can just delete that line in the code. Yaron Koren 17:50, 16 December 2010 (UTC)Reply
Yes, version 1.9.2. Thanks. :)

Bonzo 18:00, 16 December 2010 (UTC)Reply

Oh, you're using version 1.9.2? Never mind, then. :) That version's not supported by Data Transfer. Yaron Koren 18:36, 16 December 2010 (UTC)Reply

File size

Yaron, hi! First, I love this extension! It really makes my life easier! I'm importing 48k articles from Excel. Everything is OK, but I'm doing some tests, and dividing the articles into multiple CSV files every time is painful. How can I make your extension accept bigger CSV files? Is that possible? Thanks, Filipe.

Hi - what's the error message you get when you try to upload a bigger file? Yaron Koren 04:59, 27 December 2010 (UTC)Reply
It doesn't give a message. It returns to the same page as if nothing had happened.
Hi - that's odd. What versions of MediaWiki and Data Transfer are you using? Yaron Koren 04:08, 5 January 2011 (UTC)Reply
I had this problem and it turned out to be a PHP timeout; increase the maximum execution time in php.ini and restart Apache to solve it.
0.3.4. If my file is TOO big, it does that. I got a message once, when I removed some lines from the CSV file, that there were too many items. And if I remove even more, leaving like 1 thousand items, it works perfectly. Filipe, 7 January 2011 (UTC)
Okay, but what MediaWiki version? Yaron Koren 16:00, 7 January 2011 (UTC)Reply
1.16 (http://www.wikireceitas.com.br/Especial:Vers%C3%A3o) Filipe, 8 January 2011 (UTC)
Yaron, can you help me with the default tables I have to populate? I'm trying to make an Excel macro to do that for me. Filipe, 8 January 2011 (UTC)
Sorry, I don't know what you mean by that. Yaron Koren 15:00, 10 January 2011 (UTC)Reply
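For reference, the php.ini change mentioned above would look roughly like this (300 seconds is only an example value; pick whatever your imports need):

max_execution_time = 300    ; allow long-running CSV import requests to finish

Remember to restart Apache (or your PHP service) afterwards so the new value takes effect.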

Enhancement to DataTransfer

Hi, since the last version of this extension it is possible to add data to an existing page. However, it is not possible to add data to an existing template.

E. g. in case you want to add "World" to

{{Template1
|Property1=Hello
}}

you will get

{{Template1
|Property1=Hello
}}
{{Template1
|Property2=World
}}

instead of

{{Template1
|Property1=Hello
|Property2=World
}}

since this extension does not detect that the same template is already on the page.

There is a quite dirty workaround to do it: remove }} from the respective pages with the Replace Text extension and then import

|Property2=World}}

The disadvantage of this is that you get two new revisions per page and that it is only possible once.

Perhaps it is possible to enhance this extension to avoid this. Hopefully this is not a daydream. ;)

Cheers --[[kgh]] 14:34, 9 February 2011 (UTC)Reply

This feature is very needed ;) Cheers, --Rolze (talk) 12:15, 4 February 2013 (UTC)Reply

Importing Category pages

How do I import Category pages? I have a large number of these externally generated. If I try to generate my pages with a block beginning:

<Namespace Name="Category">
<Page Title="catname">
<Free_Text>
.... etc

DataTransfer strips the Namespace information and creates the page within Main, not within Category.

If instead I try to reverse engineer it by using ViewXML on an existing Category page, it gives me a block beginning <Category Name="catname"> which includes the contents of every page in the category, which is not what I want. Marinheiro 10:00, 9 March 2011 (UTC)Reply

This was easier than I thought (though I did get a bit misled by the Help page): just use Category:category name as the page title, e.g.

<Page Title="Category:Inactive Resources">
<Free_Text>
[[Category:Resources]]
</Free_Text>
</Page>

217.169.30.146 22:53, 12 March 2011 (UTC)Reply

Version to download for MediaWiki 1.17

This is a just a tip for anybody like me who was having trouble with this extension after upgrading to Mediawiki 1.17. Any files that I tried to upload just gave the "The file you uploaded seems to be empty. This might be due to a typo in the file name. Please check whether you really want to upload this file." error.

If you follow the "Download snapshot" link on the Data Transfer Extension page, and select Mediawiki 1.17.x, you will be given version 0.3.6 of this extension which is not compatible with Mediawiki 1.17. You have to either select "trunk" on the Download Snapshot page, or download the code directly from the main Extension page. The version of this extension in the Semantic Bundle is also not compatible with Mediawiki 1.17.

Ability to add a single property to many pages in an automated fashion

Hello -- I know that this question was already asked to a certain extent, and I wanted to ask if any progress had been made, or if there is another workaround.

I also would like to be able to add a property and value to a given page after the page is already created. I was wondering if you could use the export function on each page and then modify each file to have the additional property/field and then to re-import all of those same pages.

Thanks in advance, Dc321 14:54, 18 September 2011 (UTC)Reply

There's now what might be a better solution to that problem: the "autoedit" functionality in Semantic Forms, which is available as both a parser function and an API action (the latter might be more helpful if you have a lot of pages, and if you know at least a minimal amount of programming.) I would look into that one. Yaron Koren 21:01, 19 September 2011 (UTC)Reply
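For anyone reading this later, here is a rough sketch of the parser-function half of that suggestion, reusing the hypothetical People template from the CSV example further up this page (the form name and the parameter names are as I recall them from the Semantic Forms documentation, so treat them as assumptions and double-check there):

{{#autoedit:form=People
|target=George
|link text=Set year to 1979
|query string=People[year]=1979
}}

Clicking the generated link would then edit the existing page "George" through the "People" form, changing only that one field and leaving the rest of the page alone.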

Order of templates

Hello, did anyone find out in what order the templates are printed in the article text? To me it seems quite random, but maybe there is a system? rotsee 20:50, 12 October 2011 (UTC)Reply

Database glitches?

I've noticed a few times that with CSV export (Data Transfer 0.3.9), some content can get duplicated on pages that are not related, except that they're in the same category and use the same template/form. If you've chosen "append to existing content", you can simply remove the spurious extras by hand. You have to be careful though if you choose "overwrite". Just so you know. Cavila MW 1.17, MySQL 5.5.16, Php 5.3.8 12:07, 15 April 2012 (UTC)Reply

Import CSV not overwriting existing articles

I am using Data Transfer to upload a list of articles in a specific namespace with a csv file. After importing the csv, I tweaked some of the free text and reimported the file. The changes haven't shown up. I've been careful to have the overwrite options selected. Any ideas how to get the overwrite behavior I'm looking for? Davious (talk) 20:44, 17 May 2012 (UTC)Reply

Oh. Wow, I'm sorry. I didn't realize the 1st import was still going on after 20 minutes. It's all good. Thanks.

Exporting contents of Category pages

Hello, I will need to export a lot of category pages (i.e. the category description pages, which happen to contain a lot of information structured by templates), and I realise that I will have to do some modification to the code of Data Transfer to achieve that. Before I start: did anyone here do something like that already?

//rotsee (talk) 15:31, 8 June 2012 (UTC)Reply

Turned out to be quite simple, once I looked at the code. rotsee (talk) 07:22, 12 June 2012 (UTC)Reply


Action log for data loads

Hello,

This is a fantastic extension, and I'm using it quite extensively. I load data via the CSV option, and it works very well. However, sometimes the data load does not work (that is, the "Import CSV" option states that the pages will be created, but no data change results). I'm trying to identify the source of the problem, but am unable to find any logs for this extension, beyond the Recent Changes page. How can I figure out what the source of my problem is? Patelmm79 (talk) 20:37, 8 October 2012 (UTC)Reply

My guess is that, when that happens, it's because the move is still waiting in the job queue - you can check the "job" database table to see what's going on. Yaron Koren (talk) 20:46, 8 October 2012 (UTC)Reply
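For reference, a quick way to peek at that table from the database prompt would be something along these lines (standard MediaWiki table and column names; prepend your table prefix if the wiki uses one):

SELECT job_cmd, COUNT(*) AS pending
FROM job
GROUP BY job_cmd;

If the counts drop over time, the queue is simply still being worked through; if they stay put, running maintenance/runJobs.php usually clears it.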

XML Parsing Error

Hello. The extension is working like a charm, but there is only one little hiccup. When using Special:ViewXML and selecting a category and the Simplified format, the XML output contains parsing errors when the text in a property of the type String and/or Text contains < > etc. The faulty part of the output looks like this:

<Page>
	<ID>509</ID>
	<Title>Pagetitle</Title>
		<Templatename>
		    <Remark>Bla bla <20</Remark>
		</Templatename>
	<Free_Text id="1">Bla bla &lt;20</Free_Text>
</Page>

The problem is that in the Free_Text the < is replaced with:

&lt;

This is not happening for the text between the Remark ([[Property::Remark]]) tags. I actually found the function in DT_ViewXML.php handling these exceptions, but is there a way to do the same for all properties, or am I doing something wrong here? The property Remark has the type Text; I also tried String, but with the same result. It makes sense that properties of the type Text and String could contain < > [ ] etc. Best regards --Jongfeli (talk) 15:54, 30 October 2012 (UTC)Reply

I have added some code to handle < & > in property values ($field_value). It works fine now.
            if ( $field_has_name ) {
                $field_value = str_replace( '&', '&amp;', $field_value );
                $field_value = str_replace( '<', '&lt;', $field_value );  // New line
                $field_value = str_replace( '>', '&gt;', $field_value );  // New line
                if ( $simplified_format ) {
                    $field_name = str_replace( ' ', '_', trim( $field_name ) );
                    $text .= "<" . $field_name . ">";
                    $text .= trim( $field_value );
                    $text .= "</" . $field_name . ">";
                } else {
                    $text .= "<$field_str $name_str=\"" . trim( $field_name ) . "\">";
                    $text .= trim( $field_value );
                    $text .= "</$field_str>";
                }
                $field_value = "";
                $field_has_name = false;
            }

I found a strange problem. When exporting with ViewXML I always got a "·" (a raised dot) at the end of the XML result (it actually is a "?"). I needed to run "Start updating data" in the admin functions for Semantic MediaWiki, and then the XML file parsed without any problems. This probably has something to do with me changing the template used, but I am not sure yet. --Jongfeli (talk) 10:12, 31 October 2012 (UTC)Reply

Thanks for finding that fix! I just added it to the code, and released it as a new version, 0.3.11 (along with some other changes). I don't know what's causing that ./? problem, though - hopefully it was just a temporary glitch. Yaron Koren (talk) 19:47, 5 November 2012 (UTC)Reply

Status of Import CSV

Hello,


The import CSV function has saved a lot of time for data upload! But one thing that is a bit frustrating is importing CSV files without any result, and not knowing the reason for the failure. I keep an eye on the progress of the job via the "Recent Changes" page, but this is not ideal. What is the best way to monitor a job, where I can see its status and the reason for failure? Patelmm79 (talk) 15:44, 22 November 2012 (UTC)Reply

You can check on the status of jobs in general by looking at the "job" table in the database. And if there are failures, you can create a log file and then check that. Yaron Koren (talk) 15:51, 22 November 2012 (UTC)Reply
Much obliged Yaron. I've set this up as you've suggested, but suddenly the data load seems to be working fine, and I can't figure out what has changed. But now that I can log, I can observe the behavior more closely. Thank you! Patelmm79 (talk) 18:28, 22 November 2012 (UTC)Reply

Clarifying behavior of Overwrite function

Hello, I wanted to clarify how one should expect the Overwrite function to work.

I tried to add data with a subset of fields onto a set of pages. The pages had existing data already, and I wanted to add information into new fields.

My CSV file contained only the list of fields that I wished to update. In such a case, I would expect that the fields that are not specified would not be touched.

However, upon import of the file using the "Overwrite" option, the extension imported the new data and overwrote all other fields with null values (again, I did not specify those fields in the CSV file).

"Append" option adds another instance of the data set with the imported values, even though I only want one data set.

I have not yet tried Skip, but before I did so, I wanted to be clear if this is how the "Overwrite" option should behave. Patelmm79 (talk) 18:34, 22 November 2012 (UTC)Reply

"Overwrite" just overwrites what was there before - it ignores the previous content. Yaron Koren (talk) 18:39, 22 November 2012 (UTC)Reply


May I suggest a modification to the behavior of the "overwrite" function, or perhaps a new option? Rather than "overwriting" all fields, whether or not the field is actually specified in the source data file, it would be better to overwrite only those fields that are specified in the source data and leave the other fields untouched. From a practical perspective, this would make it much easier to populate new properties when they are created; instead of having to create a file with all the data and then import that file, one would only need data for the property that is being updated. Patelmm79 (talk) 16:02, 21 March 2013 (UTC)Reply
That's a very good idea - although it would probably be better as another option, maybe called "Limited overwrite". Yaron Koren (talk) 20:30, 22 March 2013 (UTC)Reply
I really like this idea too. I often need to make mass updates to just one or two fields and I do not want to overwrite user comments, other edits, etc. I was surprised to see it overwrite all my fields when I only included just a few changed fields in the csv upload. Would love to see this functionality included in this extension. I depend on this extension a lot and overall am very grateful for it. Ancaster (talk) 06:21, 22 April 2013 (UTC)Reply
Okay - I just added this feature in to the latest version of Data Transfer, 0.4; I think it'll be a nice addition. Yaron Koren (talk) 20:39, 2 May 2013 (UTC)Reply
Also another question / clarification of functionality--does the "Import CSV" currently replace ALL content on the page (including the "Free text") with what is in the CSV file? I seem to remember, during my previous data imports in previous versions, that the non-semantic content on the page was preserved, but I could be remembering incorrectly... 69.80.65.58 18:33, 25 March 2013 (UTC)Reply
Yes, it replaces everything. Yaron Koren (talk) 20:10, 25 March 2013 (UTC)Reply

Database Import error

I am importing a very large XML file (~18000 pages) and I get an error after a while:
"INSERT  INTO `recentchanges` (rc_timestamp,rc_cur_time,rc_namespace,rc_title,rc_type,rc_minor,rc_cur_id,rc_user,rc_user_text,
rc_comment,rc_this_oldid,rc_last_oldid,rc_bot,rc_moved_to_ns,rc_moved_to_title,rc_ip,
rc_patrolled,rc_new,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,
rc_params,rc_id) VALUES ('20130130101705','20130130101705','0','SA1961.084','1','0','6510','0','127.0.0.1',NULL,'6665',
'0','0','0','','127.0.0.1','0','1','0','0','0','0',NULL,'','',NULL)"
from within function "RecentChange::save".
Database returned error "1048: Column 'rc_comment' cannot be null (localhost)"

Any idea what could be causing this? I should mention that I am using runJobs.php to add the files; they seem to get added all right when I refresh the page.

What versions of MediaWiki and Data Transfer are you using? Also, to import the XML, are you using the page Special:Import or Special:ImportXML? Yaron Koren (talk) 14:26, 30 January 2013 (UTC)Reply
MediaWiki 1.20.2, Data Transfer version 0.3.12. I am using Special:ImportXML; it went through quite a few pages successfully before throwing this error, and then it kept throwing it very regularly, even after a reboot.

When importing with CSV, is there a size limit to how many characters per field?

I've had success importing other CSV files using Data Transfer, but I'm currently having trouble importing a CSV file with only 67 records. 57 of the records will import, but 10 of them generate "500 Error" pages.

What is similar about the 10 records that won't import is that the character counts of a certain field are all over 6000.

The property I created to store these values is of type Text, so I assumed it should be able to hold something that size.

Is there anything I can do to overcome this limit?

Thanks

Ancaster (talk) 07:51, 9 March 2013 (UTC)Reply

Edited to add that I was able to make it work when I set $smwgLinksInValues = false. So that fixed my Import CSV problem, but now I need to learn more about the linking.

Exporting just one page in XML

I figured out that I can export a group of pages with a link to Special:ViewXML?categories[CategoryName] or by using namespaces. Is there a way to get a single-page export with Special:ViewXML? | Jaider Msg 18:09, 30 April 2013 (UTC)Reply

There isn't, other than creating a category with only that page. Though it's an interesting idea... why would you want XML for only one page? Yaron Koren (talk) 18:33, 30 April 2013 (UTC)Reply
I would like to be able to convert an XML document (wiki page) to other formats with an XSLT stylesheet, starting from the Data Transfer export. Similarly to Special:ExportRDF/Working_with_MediaWiki_(2012), but with the actual page content (without RDF URIs, properties, datatypes, objects, etc.). So, it would be nice to have a Special:ViewXML where we could put one or more page titles and export them, like MediaWiki's Special:Export (but with your nice XML output :)). | Jaider Msg 19:08, 30 April 2013 (UTC)Reply
Okay, that makes sense. Yes, that would be a nice feature to have. I would think it makes sense to implement this with a "pages=" parameter, where one or more pages can be specified directly, in the same way that the MediaWiki API does it. It shouldn't be hard to implement it... you wouldn't happen to know PHP, would you? :) Yaron Koren (talk) 19:42, 30 April 2013 (UTC)Reply
It's great to hear that it shouldn't be hard to implement it, but, no, unfortunately, I wouldn't know PHP :( | Jaider Msg 20:05, 30 April 2013 (UTC)Reply
This was a well-timed request - I was just about to release a new version of Data Transfer, so I added this parameter (now called "titles="). If you get the latest version, it should work. Yaron Koren (talk) 20:33, 2 May 2013 (UTC)Reply
Wow! Awesome! That's great! Thanks a lot, Yaron! You are the best! This feature is really helpful :) | Jaider Msg 00:03, 3 May 2013 (UTC)Reply
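For anyone finding this thread later, the resulting usage presumably mirrors the category link mentioned at the top of this thread, something like the following (the pipe as a separator for multiple titles is an assumption based on the MediaWiki API comparison above; check the extension documentation for the exact syntax):

Special:ViewXML?titles=Working_with_MediaWiki_(2012)|Some_other_page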

White Screen on Special:ViewXML

Hello there. We are trying to make use of this awesome extension. MW 1.20.2, PHP 5.3.3, SMW 1.8.0.1, German language.

But unfortunately Special:ViewXML only shows a blank HTML page. Firebug doesn't show anything. No errors in the PHP log files. The CSV and XML import pages show fine, however. Haven't tried using them yet, though. Anybody any idea? Thank you very much.

EDIT: Oops. Looked into the wrong log file. :|

Errors: PHP Warning: in_array() expects parameter 2 to be array, string given in /GlobalFunctions.php on line 795 and PHP Fatal error: Call to a member function getParentCategoryTree() on a non-object in /DT_ViewXML.php on line 37 --Simon Fecke (talk) 09:48, 17 May 2013 (UTC)Reply

EDIT2: Ok, we found out $wgUrlProtocols is supposed to be an array and changed that. Still the same blank screen though. :-( But now only the second error appears in the error log (DT_ViewXML.php, line 37)

It sounds like you might have a bad entry in your "categorylinks" table, for some reason. Anyway, I think you can get around this problem by adding the following line above line 37:
if ( is_null( $title ) ) { continue; }
If that works for you, I'll add it to the code. Yaron Koren (talk) 12:49, 17 May 2013 (UTC)Reply
It worked!! Thank you very much Mr Koren! --Simon Fecke (talk) 12:01, 24 May 2013 (UTC)Reply

Error: Non-static method DTPageStructure::newFromTitle() should not be called statically

Hi Yaron, hi all,

The export special page of this extension is terrific and I have been able to make use of it on many occasions. However, when I try to import things, I keep getting the following error:

[12-Jun-2013 11:45:23 UTC] PHP Strict standards:  Non-static method DTPageStructure::newFromTitle() should not be called statically in /Applications/MAMP/htdocs/mediawiki-1.20.2/utest-windwiki/extensions/DataTransfer/specials/DT_ViewXML.php on line 216

I use MW 1.20.2 with the latest DataTransfer version 0.4. Is this a known error?

My example import XML is pretty simple:

<Pages>
    <Page Title="XML-Test">
      <Template Name="IstKomponenteVon">
        <Field Name="1">Monitor</Field>
        <Field Name="Verantwortlicher">Herbert</Field>
        <Field Name="Position">A5</Field>
      </Template>
      <Free_Text>
        <p> dieser Absatz wurde geaendert...,  id="1" auch bei Free_Text-Attribut rausgenommen
          <a href="/mediawiki-1.20.2/utest-windwiki/index.php?title=Kategorie:XML-Test&action=edit&redlink=1" class="new" title="Kategorie:XML-Test (page does not exist)">Kategorie:XML-Test</a>
        </p>
      </Free_Text>
    </Page>
</Pages>

Any hints will be appreciated, thanks in advance, --Achimbode (talk) 13:20, 12 June 2013 (UTC)Reply

Hi - it's known now. :) It's not really an error, just a PHP notice, but still that's annoying. I just fixed the problem in Git, so if you can get the latest Git code, the problem should go away. Yaron Koren (talk) 15:44, 12 June 2013 (UTC)Reply
Hi, thanks a lot for the fix. You are right, it is just a notice. But it was the only indicator I had - there is nothing imported.
Computers says:
Importing... 0 pages will be created from the XML file. 
But when I try to open the page, it says:
There is currently no text in this page. You can search for this page title in other pages, search the related logs, or edit this page. 
Same if I change the page name to XML-Test2. So: is there anything wrong about the XML? As there are no errors thrown or any other messages output on the page (like when you leave the namespace tag in there), I don't really know where to search for the problem...
A second message in the logs was: [12-Jun-2013 12:13:36 UTC] PHP Strict standards: call_user_func_array() expects parameter 1 to be a valid callback, non-static method SemanticESP::sespUpdateDataBefore() should not be called statically in /Applications/MAMP/htdocs/mediawiki-1.20.2/utest-windwiki/includes/Hooks.php on line 216, but this is gone now, too. --Achimbode (talk) 18:07, 13 June 2013 (UTC)Reply
Okay. My guess is that it's an issue with the XML - the "Free Text" part looks weird. Also, what's SemanticESP? Yaron Koren (talk) 20:21, 13 June 2013 (UTC)Reply
SemanticESP comes from the SemanticExtraSpecialProperties extension. I commented that out and now there are no more PHP errors - sorry, I should have looked it up. Thought it was coming from the DT extension. Computer now says
Importing... 1 pages will be created from the XML file. 
, but I still can't find any. Neither in the related logs, nor by entering the page directly, nor by listing all pages. But the output on top of the page said Expected </Page>, found </Free_Text>. So I simplified the XML to
<Pages>
    <Page Title="XML-Test3">
      <Template Name="IstKomponenteVon">
        <Field Name="1">Monitor</Field>
        <Field Name="Verantwortlicher">Herbert</Field>
        <Field Name="Position">A5</Field>
      </Template>
      <FreeText>
        <p> dieser Absatz wurde geaendert...</p>
      </FreeText>
    </Page>
</Pages>
with FreeText as the tag name, because it seemed to me the extension might have a problem with the underscore. The result was as one might have expected: Expected <Template>, got <FreeText> Expected <Template>, got <p> Expected </Page>, got </p> Expected </Page>, got </FreeText>, so I changed the tag name back to Free_Text again and got Expected </Page>, got </Free_Text>.
Finally, I removed the free text part completely and tried just importing a page with a template. Now there are no more reports about tag confusion on top of the page, no more php errors, but still no page to be found. Could you probably post an XML example that works for you?
Fortunately I found a solution for my actual problem that makes use of the DT export, but allows for the normal way of importing XML (Special:Import). Nevertheless, the import of this extension could be of great use for us in other projects and I'd love to get it to work. Have you tested it with MW 1.20.2? --Achimbode (talk) 14:07, 14 June 2013 (UTC)Reply
No... then again, I haven't tested it much with any MediaWiki version (I should). The "ImportCSV" option is much better tested, and is generally the easier solution anyway, though it might not be helpful for you. Yaron Koren (talk) 20:06, 14 June 2013 (UTC)Reply

Having trouble loading properties, categories and line breaks ("\n") - the import seems to ignore them

There seems to be something that I'm not getting. I want to upload pages, a line at a time, from .csv files into MediaWiki. I've loaded the semantic extensions. I'm able to load the .csv file into a template, but it seems to ignore the properties, the categories and even text features like '\n', simply treating it all as raw text. How do I change this behaviour? The files I've used for testing are: .csv:

Title,Import_template[namespace],Import_template[category],Import_template[policy_owner],Import_template[policy_name],Import_template[policy_description]

"sandpit:one","policy","[category:policy]\n[[status::active]]\n[[renewal
date::20/02/2014]]\n[[service::Administration]]\n[[speciality::all]]","My.Name","test_policy_name","==Policy
Description==\n\ntest_policy_description\n\nmore test policy
description"

What I get is:

title |
namespace | policy
category | [[category::[category:policy]\nactive\n20/02/2014\nAdministration\nall]]
policy_owner | My.Name
policy_name | test_policy_name
policy_description| =Policy
Description==\n\ntest_policy_description\n\nmore test policy
description
test_policy |

So it seems to be interpreting the input as if it were raw text, ignoring the categories, the [] and the '\n'. So, as you can see, I've tried to set properties and categories, but it doesn't seem to recognise them. The template I've set is:

<noinclude>
This is the "Import_template" template.
It should be called in the following format:
<pre>
{{Import_template
|title=
|namespace=
|category=
|policy_owner=
|policy_name=
|policy_description=
}}
</pre>
Edit the page to see the template text.
</noinclude><includeonly>{| class="wikitable"
! title
|  [[title::{{{title|}}}]]
|-
! namespace
| {{#arraymap:{{{namespace|}}}|,|x|[[namespace::x]]}}
|-
! category
| {{#arraymap:{{{category|}}}|,|x|[[category::x]]}}
|-
! policy_owner
|  [[name::{{{policy_owner|}}}]]
|-
! policy_name
|  [[text::{{{policy_name|}}}]]
|-
! policy_description
|  [[text::{{{policy_description|}}}]]
|-
! test_policy
| {{#ask:[[test_policy::{{SUBJECTPAGENAME}}]]|format=table}}
|}

[[Category:testing]]
</includeonly>

Fustbariclation (talk) 10:37, 29 December 2013 (UTC)Reply

There appear to be various things wrong with both your source CSV and your template definition; but the most important thing is that the source CSV shouldn't contain any category or property tags - it should just contain raw data. It's up to the template to assign properties and categories to that data. (Another major thing wrong: your "category" field seems to contain five different, unrelated pieces of data.) Yaron Koren (talk) 13:45, 29 December 2013 (UTC)Reply
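To make that suggestion concrete, here is a sketch of the same CSV with the wiki markup stripped out, so that each field holds one plain value (the values are just the ones from the example above; the template is then responsible for wrapping them in [[Category:...]] and [[property::...]] markup and for adding any line breaks):

Title,Import_template[namespace],Import_template[category],Import_template[policy_owner],Import_template[policy_name],Import_template[policy_description]
"sandpit:one","policy","policy","My.Name","test_policy_name","test policy description"

The status, renewal date, service and speciality values that were packed into the old "category" column would each get their own column, and their own template field, in the same way.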