Extension talk:DumpHTML

Archives 

/Archive 1


80.167.128.233 (talkcontribs)

Is there a 1.27-compatible version? Or some other extension that performs the same task of exporting a whole wiki to a set of .html documents?

Currently resorting to HTTrack, which kind of works.

Reply to "New version of this?"

PHP Fatal error: Cannot access protected property ThumbnailImage

143.164.102.14 (talkcontribs)

Hello,

Can you help me with this error? I upgraded PHP, the wiki (to 1.24.1), and the extension, and dumpHTML no longer works.

/usr/bin/php /var/lib/mediawiki/extensions/DumpHTML/dumpHTML.php -d /var/www/wikidump/ --image-snapshot --group=user 

WARNING: destination directory already exists, skipping initialisation

Creating static HTML dump in directory /var/www/wikidump/.
Using database 127.0.0.1
Starting from page_id 1 of 736
Processing ID: 1
PHP Fatal error:  Cannot access protected property ThumbnailImage::$file in /var/lib/mediawiki/extensions/DumpHTML/dumpHTML.inc on line 1274

Regards, Balázs

Reply to "PHP Fatal error: Cannot access protected property ThumbnailImage"

Only executing dumpHTML on specific categories AND their sub categories

Zc5 (talkcontribs)

Is there a way to tell dumpHTML to convert only specific categories along with their subcategories? I do not want to use --categories because it converts ALL of the wiki's categories. I just want to convert one specific category along with ALL of its subcategories.
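
The thread records no DumpHTML option for this. As a minimal sketch of the selection step only, the code below collects a category and all of its subcategories with a breadth-first walk. The `subcats_of` mapping and the category names are hypothetical; on a live wiki the mapping could be filled from the MediaWiki API (`list=categorymembers` with `cmtype=subcat`). How the resulting list would be handed to dumpHTML is not covered here.

```python
from collections import deque


def category_closure(root, subcats_of):
    """Return root plus every category reachable from it, in breadth-first order."""
    seen = {root}
    queue = deque([root])
    order = []
    while queue:
        cat = queue.popleft()
        order.append(cat)
        for sub in subcats_of.get(cat, []):
            # Category graphs can contain cycles, so skip anything already visited.
            if sub not in seen:
                seen.add(sub)
                queue.append(sub)
    return order


# Hypothetical category tree, for illustration only.
example_tree = {
    "Category:Manuals": ["Category:Install", "Category:Admin"],
    "Category:Install": ["Category:Admin"],
    "Category:Admin": ["Category:Manuals"],  # deliberate cycle
    "Category:Unrelated": ["Category:Other"],
}

selected = category_closure("Category:Manuals", example_tree)
print(selected)
```

Tracking visited categories matters because MediaWiki does not prevent a category from being, directly or indirectly, its own subcategory.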

Daisystanton~mediawikiwiki (talkcontribs)
Reply to "Only executing dumpHTML on specific categories AND their sub categories"

Subpages are not exported correctly - PNGs are copied as 0 byte files

Livxtrm~mediawikiwiki (talkcontribs)

I have noticed two problems with the DumpHTML extension in combination with the latest version of MediaWiki. MediaWiki version: 1.20.3. Version of DumpHTML: rev 115794 from MediaWiki SVN.

The first problem is that if you use subpages (slashes in URLs), the links are broken in the static output. The generated pages contain the wrong number of ".." components in their relative links. This might be resolved by setting 'munge-title' to md5, which would keep the extra slashes out of the file name, but I didn't try it (and I don't want to: I want the titles left alone, not converted to a hash).

Why is there no way to output pages in the same structure as they exist originally? Why are you -forced- to use the new three-folder-deep "hashing" mechanism? I assume it is there to prevent too many files from being dumped into the same folder, but there should be a way to turn it off, and the code should be fixed to allow subpages.

The second problem is that image-snapshot seems unable to handle PNG images. The main icon in the upper left of my wiki is a PNG. A file for it is created in the dump, but it is a 0-byte file; the original file is not copied. It seems the images are not copied directly, and that DumpHTML cannot handle PNGs.

This post was posted by Livxtrm~mediawikiwiki, but signed as Livxtrm.
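
To illustrate why a slash in a subpage title changes the number of ".." components, here is a minimal sketch. It is not DumpHTML's code: the `articles/f/o/o/...` layout and the `skins/main.css` target are made-up paths that only mimic a three-folder-deep output tree.

```python
import posixpath


def relative_link(from_file, to_file):
    """Relative URL from one dumped file to another, both given relative to the dump root."""
    start_dir = posixpath.dirname(from_file)
    return posixpath.relpath(to_file, start=start_dir)


# Hypothetical output paths, for illustration only.
plain_page = "articles/f/o/o/Foo.html"
subpage = "articles/f/o/o/Foo/Bar.html"  # the "/" in "Foo/Bar" adds one directory level
stylesheet = "skins/main.css"

plain_link = relative_link(plain_page, stylesheet)
subpage_link = relative_link(subpage, stylesheet)
print(plain_link)
print(subpage_link)
```

A generator that assumes a fixed depth for every article would emit the first form for both pages, which is one ".." too few for the subpage.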

83.167.103.98 (talkcontribs)

If you have a problem with images, try running with "sudo php dumpHTML.php ..." (yes, you can kill me now). DON'T TRY THIS ON A PRODUCTION SERVER, IT'S DANGEROUS! The problem seems to come from http://www.mediawiki.org/wiki/Manual:Image_Authorization, but at least with sudo it works as is.

141.5.11.5 (talkcontribs)

Running dumpHTML.php with root permissions doesn't seem to solve the problem (at least for me).

Reply to "Subpages are not exported correctly - PNGs are copied as 0 byte files"

Unicode diacritic character on dumped html

Peachey88 (Flood) (talkcontribs)

Characters in my dumped HTML files, mostly ones with diacritics, come out changed.

Examples:

  • Saṃyutta Nikāya -> Sa峁儁utta Nik膩ya
  • … -> 鈥�
  • Soṇadaṇḍa Sutta -> So峁嘺da峁囜笉a

Does anyone have a similar issue?

Thanks.

This post was posted by Peachey88 (Flood), but signed as Benzwu.
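
The garbled forms above look like UTF-8 bytes being read as GBK, which would mean the dumped files are fine and only the declared or guessed encoding is wrong. That reading is an assumption based on the examples, not something confirmed in this thread. A small sketch to check it:

```python
def misread_utf8_as(text, wrong_encoding="gbk"):
    """Show how UTF-8 text looks when its bytes are decoded with the wrong codec."""
    return text.encode("utf-8").decode(wrong_encoding, errors="replace")


def repair(garbled, wrong_encoding="gbk"):
    """Undo the misreading; this only works if no bytes were lost along the way."""
    return garbled.encode(wrong_encoding).decode("utf-8")


original = "Saṃyutta Nikāya"
garbled = misread_utf8_as(original)
print(garbled)           # compare with the garbled form quoted in the post
print(repair(garbled))   # round-trips back to the original text
```

If the printed garbled string matches what the browser shows, viewing the files as UTF-8, or making sure the pages declare a UTF-8 charset, would be the thing to try first.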

Reply to "Unicode diacritic character on dumped html"

Bug with german umlauts in filenames of images

Peachey88 (Flood) (talkcontribs)

When there is an umlaut in the filename of an image, the image is saved in the dump but under the wrong name, so the link in the HTML does not work. Does anybody know how to fix this?

This post was posted by Peachey88 (Flood), but signed as 212.114.205.190.

Reply to "Bug with german umlauts in filenames of images"
Peachey88 (Flood) (talkcontribs)

When I try to run the script, I always get the error message: default users are not allowed to read, please specify (--group=sysop). I also tried it with this option, but then I get the error message "the specified user group is not allowed to read". Any ideas? :)

Peachey88 (Flood) (talkcontribs)
187.78.14.203 (talkcontribs)

I had to look in the database of my MediaWiki installation to find the user group (open LocalSettings.php and search for "Database settings" to get the connection details). Connect to the database using a client and look for a table named user_groups. This table maps user IDs to user groups. Try different user group names until one works. :p
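
The lookup described above can be sketched as follows. An in-memory SQLite table stands in for the wiki database so the example runs anywhere; the rows are made up. On a real installation the table may carry a prefix (`$wgDBprefix`), and implicit groups such as `user` and `*` are not stored in `user_groups`, so they will not appear in the result.

```python
import sqlite3


def list_explicit_groups(conn):
    """Return the distinct group names stored in the user_groups table."""
    rows = conn.execute(
        "SELECT DISTINCT ug_group FROM user_groups ORDER BY ug_group"
    )
    return [row[0] for row in rows]


# Stand-in database with hypothetical rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_groups (ug_user INTEGER, ug_group TEXT)")
conn.executemany(
    "INSERT INTO user_groups VALUES (?, ?)",
    [(1, "sysop"), (1, "bureaucrat"), (2, "sysop")],
)

groups = list_explicit_groups(conn)
print(groups)
```

Each name returned is a candidate value for --group.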

Reply to "Error"
Peachey88 (Flood) (talkcontribs)

I have MediaWiki 1.16.0 with the Lockdown extension installed, and a username and password are needed to view it. It is a Windows system, so I used the modified version of DumpHTML that produces hashed filenames. How can I provide a username and password to DumpHTML? On first use, DumpHTML asked me to provide a --group parameter, and I did. DumpHTML then produced a lot of output, but I can't log in on the index page of the static wiki. Furthermore, many (most) pages and their paths are missing from the static wiki. The "login required" page (Anmeldung erforderlich in German) exists many times over: every page shown is this one, each under a different path and filename. I tried giving 'read' permission to * in LocalSettings.php, but then the extension produces a static wiki without any styles or pictures, and many linked pages and paths still do not exist.

Peachey88 (Flood) (talkcontribs)

Get the version for MW 1.16; it has a --group parameter. Use the group "user" (sysop doesn't work for me).

This post was posted by Peachey88 (Flood), but signed as 212.114.205.190.

Reply to "Howto provide login data"
Peachey88 (Flood) (talkcontribs)

If the error is

DB connection error: No database connection

it may be a problem logging into the database. Looking in the PostgreSQL logs revealed that I had to adapt pg_hba.conf.

This post was posted by Peachey88 (Flood), but signed as Albert25.

Peachey88 (Flood) (talkcontribs)

This is the error message I get when I try to execute the dumpHTML.php file on my local machine. Does anybody know a fix for that?

Peachey88 (Flood) (talkcontribs)

I have a similar problem which I can't solve:

DB connection error: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. (localhost)

MediaWiki 13.3, Win7, Mowes webserver; I just installed PHP 5.3 without webserver support to execute dumpHTML.php.
I can trace the problem to includes/db/Loadbalancer.php, function reallyOpenConnection: $db = new $class( $host, $user, $password, $dbname, 1, $flags ); simply times out.
The database class is DatabaseMysql (I can't find the source); host, database name, user and password are correct.
Any help?
Reply to "PostgreSQL and Mediawiki 1.12.0"

Fatal error, running DumpHTML from Shell

91.57.77.46 (talkcontribs)

Hello :)

We searched and found many people with the same error, but no solutions. Perhaps someone can help us.

We get the following error when we try to run dumpHTML.php from the shell (different working folders, same error):

Fatal error: require_once() [<a href='function.require'>function.require</a>]: Failed opening required '__DIR__/Maintenance.php' (include_path='.:/usr/local/lib/php') in /.../mediawiki/maintenance/commandLine.inc on line 24

kind regards, markus h.

Kelson (talkcontribs)

Try the latest versions of DumpHTML and MediaWiki, and run DumpHTML as an extension (not from the maintenance directory).

91.57.89.200 (talkcontribs)

Thank you for your answer, but that was not the problem.

Our PHP version was too old (5.2.17). We updated PHP to 5.3.10 and now the script runs.

BUT: it only creates index.html, and all the links lead back to the online version, which of course makes no sense for an offline copy ;) Does anyone know how we can force dumpHTML to export ALL pages as HTML files and make the links relative?

kind regards, markus h.

Reply to "Fatal error, running DumpHTML from Shell"