Topic on Extension talk:RDFIO

Importing problems. Too much data causes unavailability to import?

00maiser00 (talkcontribs)

I am trying to do an RDF import, but whenever I submit too many statements, RDFIO seems unable to handle it. After about 3 minutes the page turns white, and I see that only about 15 statements were imported correctly.

If I import only a few statements (about 20), it works.

At the moment I am trying with about 250. It is still processing after about 5 minutes, so things seem to be going smoothly.

The problem is that I want to import about 2000. Is there a parameter I can configure to make this work, or is it a bug that occurs when a lot of data is submitted?

SHL (talkcontribs)

Hi! Thanks for testing it out! This is unfortunately a known problem. It has to do with the fact that a page write in MediaWiki is quite a heavy task, since text has to be indexed, many tables updated, etc.

We have been thinking about adding a batch import feature that would import a fixed number of triples per "run" and then either refresh the page to continue or run entirely from the command line, but we haven't got there yet.

00maiser00 (talkcontribs)

Ok, thanks! That would certainly work. I had to do the import in about 6 batches of about 250 statements each. I hope I can export the data directly from MySQL to reuse it in other databases, so that I do not have to do this each time!
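
The manual batching described above can be scripted. Here is a minimal sketch, assuming the statements are available as N-Triples (one statement per line, so a file can be cut at any line boundary). The function name `split_ntriples`, the output file names and the default batch size of 250 are illustrative assumptions and are not part of RDFIO:

```python
from itertools import islice
from pathlib import Path


def split_ntriples(src, out_dir, batch_size=250):
    """Split an N-Triples file into files of at most batch_size statements.

    Returns the list of batch file paths, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(src, encoding="utf-8") as fh:
        # Keep only statement lines: skip blank lines and '#' comments.
        statements = (
            line for line in fh
            if line.strip() and not line.lstrip().startswith("#")
        )
        index = 1
        while True:
            # Take the next batch_size statements (fewer for the last batch).
            batch = list(islice(statements, batch_size))
            if not batch:
                break
            path = out_dir / f"batch_{index:03d}.nt"
            path.write_text("".join(batch), encoding="utf-8")
            written.append(path)
            index += 1
    return written
```

For example, `split_ntriples("data.nt", "batches")` would write `batches/batch_001.nt`, `batches/batch_002.nt` and so on, each small enough to import in one go. This only works for line-based formats such as N-Triples; RDF/XML cannot be split by line like this.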

SHL (talkcontribs)

Have a look at the new commandline batch import script:

https://github.com/samuell/RDFIO/blob/master/maintenance/importRdf.php

If you have command-line access to the wiki, that might help, especially if you can raise the timeout settings for command-line PHP so that it does not time out too soon. The import is probably still just as slow, but this way it can hopefully run with a longer timeout.

Even better, if you have access to batch XML import in MediaWiki, is to use the new rdf2smw tool, which converts RDF N-Triples to a MediaWiki XML dump:

https://github.com/samuell/rdf2smw

(If you need to convert from RDF/XML to N-Triples, you could use the rdf2rdf tool, for example: https://github.com/knakk/rdf2rdf )

I plan to add support for more formats in rdf2smw shortly, too.

Best // Samuel

SHL (talkcontribs)

Just a ping that the importRdf.php script is now included in the latest RDFIO release!
