Manual talk:Backing up a wiki

Note: Some of the information in this page was adapted from Manual:Moving a wiki. robchurch | talk 20:59, 11 April 2007 (UTC)

alternative cronjob command
Apparently I don't have rights to make changes on the page .... Sytange: you can post, and after that everything is disappeared....

Because if you don't have the nice-program the following alternative cronjob:

/usr/bin/mysqldump -u [username] --password=[password] [databasename] | /bin/gzip > [databasename].gz


 * [username] - this is your database username
 * [password] - this is the password for your database
 * [databasename] - the name of your database

don't forget to remove the []

You can use other names for the file: [database].gz and/or put a number before it.

If you use different numbers for each cronjob, you can schedule it (for instance: a cronjob for each day of the week and name it 1[databasename], 2[databasename] etc. For uploading the .gz file with the database manager (in my situation with DirectAdmin) the name of the file is not critical, but it has to be a .gz file. --Livingtale 13:33, 22 September 2008 (UTC)

"php dumpbackup.php --full" returns "DB connection error: Unknown error"
Peter Blaise asks:

I run:
 * C:\www\apache2\htdocs\mediawiki\maintenance>php dumpbackup.php --full
 * DB connection error: Unknown error

... and when I search the drive for new files, I see nothing's been created. How should an XML export / backup work? HELP, please!
 * -- Peter Blaise peterblaise 13:34, 20 June 2007 (UTC)


 * Learn to use a shell. You will notice that the dumpBackup.php script spits out XML. This XML should be saved into a file. The standard means of doing this is to redirect standard output to a file. On Windows (and also on POSIX-compliant shells), this is done using the > operator, e.g.


 * php dumpBackup.php --full > backup.xml


 * We expect our users to have at least a basic working knowledge of their computers. robchurch | talk 14:11, 21 June 2007 (UTC)


 * ...and geeks are expected to read carefully before bashing other users: Peters problem is not the output file, his problem is the DB connection error: Unknown error. Got the same under Kubuntu Feisty, seems something is wrong with the out-of-the-box installation. Cheers, 88.73.85.21 13:45, 21 July 2007 (UTC)

I'm getting the same error on an FC5 installation. Does anyone know the cause of this error. So far my Google searched haven't turned up any relevant information. Zeekec 20:32, 1 August 2007 (UTC)
 * I tried creating an AdminSettings.php file as suggested, but still get the same error. Zeekec 15:27, 2 August 2007 (UTC)


 * And I'm too...
 * I've exactly the same error but need dumpBackup.php for Lucene integration.
 * Can nobody explain the DB connection error: Unknown error- Problem?

Edit:
 * OK Guys,

After big trouble and consideration of this script I've found a solution for this/my and our Problem. The Problem exists, because of the for dumpBackup.php required File "includes/backup.inc". This File does the main-backup-work and uses some MediaWiki-Variables($wg...). This is really no Problem, if dumpBackup.php runs with mediaWiki but as standalone console-script, it will miss this $wg..-Parameters. So dumpBackup.php uses empty strings for $wgDBtype,$wgDBadminuser,$wgDBadminpassword,$wgDBname,$wgDebugDumpSql and this causes the DB connection error: Unknown error while running. I've solved this Problem with a self-written php-wrapper-script, which only initializes this Variables and then simply include dumpBackup.php and now it works fine. This is my php-wrapper-script: <?php
 * 1) dumpBackupInit - Wrapper Script to run the mediaWiki xml-dump "dumpBackup.php" correctly
 * 2) @author: Stefan Furcht
 * 3) @version: 1.0
 * 4) @require: /srv/www/htdocs/wiki/maintenance/dumpBackup.php

$wgDBtype = 'mysql'; $wgDBadminuser="[MySQL-Username]"; $wgDBadminpassword ="[MySQL-Usernames-Password]"; $wgDBname = '[mediaWiki-Database-scheme]'; $wgDebugDumpSql='true';
 * 1) The following Variables musst be set, to get dumpBackup.php at work
 * 1) you'll find this Values in the DB-section into your mediaWiki-Config: LocalSettings.php

require_once("/srv/www/htdocs/wiki/maintenance/dumpBackup.php"); ?>
 * 1) XML-Dumper 'dumpBackup.php' requires the setted Vars to run
 * 2) simply include the original dumpBackup-Script

Now you can use this script as like as the dumpBackup.php with exception it will (hopefully) now run correctly. Example:  php dumpBackupInit.php --current > WikiDatabaseDump.xml 

I hope this will help you. Please excuse my properly bad english

Regards -Stefan-

Another (simpler) solution.

Simply add the above mentioned variables to you LocalSettings.php You will notice that most of them are already there. The ones that need to be added are: $wgDBadminuser="[MySQL-Username]"; $wgDBadminpassword ="[MySQL-Usernames-Password]";

Tested to work with MediaWiki 1.11.0

-Rammer-

Similar problem with dumpBackup
Docduke says:
 * I have been using MediaWiki for about a year, and have about 20 MB of pages in 1.6.6. I have been backing it up regularly with mysqldump.  This computer is still running fine, but I would like to be able to move the data to another computer.  I installed 1.10.1 on the new computer, restored the data to mysql, and found that Table 'wikidb.objectcache' doesn't exist (localhost).  Rooting around some more, I found out about dumpBackup.  On the new computer, php dumpBackup.php --full > wikidump works.  On 1.6.6 it does not, and reports (Can't contact the database server: Unknown error).  Looking inside dumpBackup, I find, near the beginning:

require_once( 'command_line.inc' ); require_once( 'SpecialExport.php' ); require_once( 'maintenance/backup.inc' );
 * What does this do? Well, the PHP Manual says that it includes the files, and the link to "require" says that if it does not find them, it will fail.  However, looking around for these files, 'command_line.inc' and 'backup.inc' are in 'maintenance', but 'SpecialExport.php' is in 'includes'.  But how does php find the files?  'SpecialExport is in a different folder, and the prepended 'maintenance' edge on the third file suggests the program should be run from the root folder for the wiki.  In any case, the first and third are inconsistent.

It would be very helpful to readers if someone who has actually used dumpBackup can explain how it is supposed to be used, and what sets the environment so that files such as these can be found. The environment is very likely relevant, since the "keys" to the database come from LocalDefaults.
 * Docduke 2 August 2007 0147 GMT. P.S. The gremlins are working overtime tonight.  I logged into MediaWiki about a half hour ago, now it won't let me in.  I even had it send me a new password, and it won't let me in with that either!
 * Docduke 01:16, 2 August 2007 (UTC) [It let me in on a different computer]
 * Digging deeper ... In version 1.6.1, the "Can't contact the database server" message comes from line 1829 of includes/Database.php. My guess is that the "real" people adapt a copy of "index.php" to initialize the environment, then call dumpBackup from that environment in an automated backup script.  Either a copy of that script, or some indication of what is in it, would be greatly appreciated!
 * Docduke 02:48, 2 August 2007 (UTC)

I still have no idea how dumpBackup sets its environment, but I have found the solution to my problem, and probably that of the other folks who have reported backup failures. It is necessary to create an AdminSettings.php file at the wiki root from a copy of AdminSettings.sample, inserting a valid username, password pair for a DBuser with full privileges. Then dumpBackup runs when started in either the root folder or the "maintenance" folder, in version 1.6.6. It runs in 1.10.1 without an AdminSettings file because the LocalDefaults file has a username, password pair with administrator privileges. [Hint: I tried update.php. It failed, but the resulting error message said to check the AdminSettings file.]
 * Docduke 03:18, 2 August 2007 (UTC)

information needed on how to verify and restore a backup
Peter Blaise says:

Great overview, but so much more needed:
 * - how to verify our backup or export
 * - how to restore or import our data again

And we need specific, unambiguous, differentiating definitions of these words. Let me try:


 * "backup / restore":file copy, done from outside the program using the operating system utilities


 * "export / import":data copy, done inside the program using MediaWiki program interface utilities

... or something like that. Then we can clearly write specific steps (the "by doing what") for each way.

I agree that backing up everything is probably best, but, how does someone KNOW what's been customized and belongs to them, and what's standard and can be replaced from a fresh reinstall of the master software? Does any restore or reinstall routine intelligently preserve existing data? Or, do these process merely overwrite anything in their way? So, if I backup today, then crash tomorrow, and then restore yesterday's backup, and something is missing or it doesn't function properly, what should I do? I'd probably then reinstall from scratch to rebuild an empty MediaWiki structure. Then I'd try restoring from my backup again to fill in the supposedly completely rebuilt but empty structure. What if even that fails? What gets clobbered? How do I "know"?

Should I reinstall an empty MediaWiki and try to "import" data? What if I don't have a data "export", but I only have a file "backup" copy? How do I then reconfigure my customized choices? Do all my users come back with a restore or import?

So, it probably makes sense to preserve an mirror image off line and copy everything or one file at a time from there during troubleshooting if something's not working in the main system. My MediaWiki is small at the moment - ~500,000 words in ~4,500 sections, ALL files = ~120 MB, so making multiple copies to CD daily is acceptable, even ~5 backups to a CD will fit, total materials cost per year of ~$60 or less.

I suggest people try renaming directory structures to pull their main MediaWiki off line and TRY restoring their backup to a fresh directory structure to verify if their chosen backup routine works or not. If not, figure out why not before trusting our backups!


 * -- Peter Blaise peterblaise 11:21, 19 April 2007 (UTC)

--


 * I have moved the paragraph you added for now, as it's not quite correct:
 * See also Verifying a wiki backup and Restoring a wiki from a backup and Combining wikis from multiple backups. Note also that this page addresses backing up DATA only.  Your MediaWiki installation also probably includes much custom configuration in the form of changes to various supporting files that are as yet NOT incorporated into any database table, including CSS cascading style sheet files and PHP script files and JS Java script files.  You must backup and restore / re-integrate these as separate steps.  (Does someone want to write an extension to import and export all support files into supplemental database tables so everything is all in one place?)


 * The statement that the page is about data backup only is plain wrong - there's an extra section about backing up files; It could probably be more detailed, though. The only file that is "custom" by default is LocalSettings.php, and the uploaded files in the images directory of course.
 * Also, while red links are generally a good thing, they should "invite" people to write pages that actually make sense. First of all, those pages should be in the Manual namespace. "Verifying a wiki backup" isn't really possible, or rather, it's the same as Restoring a wiki from a backup (which is a page we should probably write soon). And Combining wikis from multiple backups probably doesn't make sense as a page of its own, and should be addressed in Restoring a wiki from a backup which should also explain what is overwritten when, or not.
 * There's also Manual:Moving a wiki, with which the contents of future pages should be coordinated.
 * Sticking everything into the database isn't really possible - the configuration must be outside, because it has to tell mediawiki how to access the database in the first place. Customized JS and CSS can go into the database (as MediaWiki:common.js and MediaWiki:common.css respectively), and that's the preferred way. Skins that need custom PHP code cannot reside in the database, program code needs to be in files (for technical as well as security reasons). Same for extensions. -- Duesentrieb ⇌ 11:32, 19 April 2007 (UTC)

Import config files, backup, verify, restore, automate
Peter Blaise says:


 * Why not auto-import text config files into the database?

Thanks, Duesentrieb. I see your points. Yes, there's mention of some support files, but I have dozens not mentioned! In my pushing for the maturation of MediaWiki, including it's support universe, I suggest that it could use a feature to automatically import copies of our custom config files into the main database even though it needs master copies of those files outside, in the operating system, in order to run properly. Your suggestion of manually creating copies as articles is interesting. How about automation, anyone?


 * Why not create an auto verify of backup?

I also see that you agree with me that there is no MediaWiki "verify" or "compare" option for backup or export, and as you say, so all we can do is try to restore or import and check it manually (against what, our memory of how the wiki behaved before?). Again, I'm pointing out a difference between "mature" applications we may have experienced before MediaWiki, and MediaWiki. My operating system backup application has a verify / compare feature after backup, MediaWiki has ... what?


 * Let's call items by their names, not shorthand.

Also, to reduce confusion and invite and enhance the quick success of newcomers, may I suggest sticking to a specific and complete nomenclature? You say you moved my comments to the "talk" pages, which I could not find. But, thankfully, I eventually found your link that brought me to the "discussion" page, which I'd already been looking at, and was already using. Why not call it the "discussion/talk" page? Thanks.
 * -- Peter Blaise peterblaise 18:02, 20 April 2007 (UTC)

--


 * The phrases "talk page" and "discussion page" are synonymous and interchangeable in MediaWiki wiki culture. robchurch | talk 00:54, 29 April 2007 (UTC)

--

Peter Blaise says:

So ... newbies, non-wiki culture people, are not invited to make the wiki their home? Rob, I'm suggesting that we recognize how offputting and success-inhibiting "jargon" is to newcomers. I suggest that we all foster growth by welcoming newcomers, and welcoming and recognizing criticism, not chiding them for their "not getting it, not fitting in". I'm suggesting that any Wiki is not owned by first comers, but is owned by anyone at any moment who is reading and offering their edits, their input at any time. I think this coincides with the intended Wiki "culture" more so than elitist jargon and belligerent exclusivity of first-comers against newbies.
 * "Why can't we all just get along?"
 * -- Rodney King

"Talk" and "discussion" ado not function as synonyms in that they are not interchangeable. Note this page's URL says "talk" yet this page's tab says "discussion". Try typing "discussion" into the URL and try looking for a "talk" tab. No can do. Is it so hard to say, "I moved your comment to the discussion/talk page" AND give a link, rather than presume others can figure out the ambiguity on their own, and then dismiss them when they suggest a way around getting lost?
 * -- Peter Blaise peterblaise 10:39, 1 May 2007 (UTC)

--


 * The problem is that internally, these pages are "talk" pages, and the namespace's english name has always been "talk" (and user_talk, etc). But the (english) text on the tab at the top of each page was at some point decided to be labeled "discussion" instead. This label can be changed by editing MediaWiki:Talk. Most other languages seem to use the equivalent of "Diskussion" on the label, and also as the Namespace name. Some languages may use something else entirely.
 * This is inconsistent and confusing, yes. But it's a fact that in the context of MediaWiki, "talk" and "discussion" are interchangable; Making "Discussion" an alias for the "Talk" namespace may be nice, but it would break backwards compatibility. This confusion isn't easy to resolve. It's better to explain it and live with it. -- Duesentrieb ⇌ 12:57, 1 May 2007 (UTC)

back up with phpmyadmin
If your host will not allow you to access such tools and you can only use phpmyadmin, or if this does not work for you, you might want to:
 * 1) export your full mysql wiki from phpmyadmin export functionality. Save the exported file locally.
 * 2) edit the exported file, and change the following if it applies for you:
 * 3) search and replace to change your tables prefixes (e.g., because prefixing is no more required on your new host)
 * 4) to work around the "latin1 in mysql > 4.1" character set problem, search and replace latin1 character set with utf8 one's. This might cause some strange behaviors afterwards because I'm not sure that media wiki won't be disturbed by the column encoding changing without warning. But apparently, for me, it works (and I found no other way to do it).
 * 5) * please note: as utf8 encoding take more space (three bytes per character) than latin1 (two, I think), some keys might become too large (my mysql installation does not allow keys > 1000 bytes). For these fields I didn't change the encoding (luckily these tables were empty at migration time). You can just do this by trial and error: phpmyadmin will warn you at import time if some key is too big.
 * 6) you might want to transform the utf8 to latin1 back with ALTER TABLE statements (phpmyadmin can do that for you). This will not revert the changes you just made as this time the contents will be re-encoded also.

Maybe this should be checked by some mediawiki expert and, if judged a good advice, integrated into the manual? I spent a full day searching for this workaround!

--

Thanks for the idea : worth trying but not sufficient for me though :(

NewMorning 20:10, 16 June 2008 (UTC)

latin1
The following line contradicts itself, no? Use the option --default-character-set=latin1 on the mysqldump command line to avoid the conversion if you find it set to "latin1". Jidanni 04:46, 13 December 2007 (UTC)

The latin warning has me confused as well. I think that section should be rewritten/clarified by someone who understands it. Where do I enter 'SHOW CREATE TABLE text'?. It doesn't look like a valid sql command to me and gives me an error when I run it. You can see which character set your tables are using with a statement like SHOW CREATE TABLE text. The last line will include a DEFAULT CHARSET clause. --71.107.96.222 17:40, 22 March 2008 (UTC)

No answer still.

I agree with the above: SHOW CREATE TABLE text; is not a working command

replacing "text" or skipping "text" neither

--Livingtale 09:06, 20 September 2008 (UTC)

Rewrite this page
This page is written very badly, it jumps all over and I can barely figure out what to do, please make them into more short and simple steps. PatPeter 20:51, 10 December 2007 (UTC)

Corruption Section for MySQL 4.1 is unclear
The section that discusses possible corruption due to nonstandard character encoding is unclear. I cannot seem to make out if it is saying that the dump may be corrupted or if my actual database may be corrupted.

It discusses doing a conversion prior to dumping, but does not say if this conversion is reflected back to the DB.

I thank you for the documentation that is here, but could you please clarify this issue. --Vaccano 17:50, 30 January 2008 (UTC)


 * I agree, this section is not clear at all!
 * I'm setting up a new wiki, my host has mysql 5.
 * In phpMyAdmin, here are the server variables I have :
 * character set client = utf8
 * character set connection = utf8
 * character set database = latin1
 * character set filesystem = binary
 * character set results = utf8
 * character set server = latin1
 * character set system = utf8
 * What should I do ? The connection, client, results and system are utf8 but server is latin1. Do I need a conversion ?
 * --Iubito 17:24, 1 March 2008 (UTC)
 * --Iubito 17:24, 1 March 2008 (UTC)

You can check for instant in your "categorylinks" table : mine is full of accent transformed in strange caracters. Impossible to figure out what to do with phpMyAdmin though...

NewMorning 20:10, 16 June 2008 (UTC)

I'm a WikiNewb....help!!
Can I just back up my wamp file and everything in it on an external hard disk?

I mean, the backup that I'm used to consists of moving files to a different disk. I know that there must be more to it than this when it comes to a wiki, but, frankly, I don't know what the heck I'm doing.

I downloaded MediaWiki two days ago to use as a database for personal journals, etc. and know how to edit and create new "articles" within the database. I plan on creating new "articles" everyday for the rest of my life and suspect that my computer's hardware will not last that long. So, I not only want to back up all of my database information, in case of some tragedy, but eventually will want move it to a different computer, altogehter.

What am I to do?

==

My table categorylinks is corrupted !
I can't figure out how this is possible : I first thought that the dump had corrupted the table. But when I checked inside PHPMyAdmin, I realised that the french accents, correctly written in the wiki, where corrupted in the table ! This is annoying since my wiki was supposed to be a test version, and I need to backup it for a new server where strange caracters give strange caracters on screen ! I only have access to PhpMyAdmin, and wonder what I can do with that : any suggestion ? NewMorning 20:18, 16 June 2008 (UTC)

In the end it's not that bad : I extracted my corrupted database and imported as well with bad caracters : they appear correctly in th other wiki ! Strange that I can't have them corrected in the DB though... I also could read the database extraction using a text converter to UTF8, but if inserted corrected in the other DB the wiki sets strange caracters again! --NewMorning 04:32, 22 June 2008 (UTC)


 * phpmyadmin probably got it wrong. actually, it doesn't have a way to know how the caracters in the tables are encoded, since mediawiki (per default) stores them as binary.
 * Generally, be very carfull about character encoding: no matter what client you use (phpmyadmin, php cli client, whatever), mysql nearly always performs some conversion on the characters. which is supposed to make them "look right" for you, but quite often screws things up. -- Duesentrieb ⇌ 11:37, 22 June 2008 (UTC)

Thanks for answering, I got it through anyway : export was weard (strange accents) but I imported it the same way and it worked ! I used PHPmyAdmin both times, and both time had set everything to UTF8. The export itsefl was readable with a text editor tranforming it to UTF8, but I could not import it afterwards : it was nice in the table, but awful in the wiki !

dumpBackup.php seems to be generating invalid xml
I'm trying to export my MediaWiki content to TWiki and the conversion program fails with "junk after document element at line 9626, column 2, byte 907183 at ... (very long error message)" The problem seems to be that the xml produced by dumpBackup.php is invalid. I tested this by pointing Firefox to the xml dump and it stops at the same place. I've looked at the raw xml code and I don't see anything obviously wrong. Any ideas what might be the problem? I'm using MediaWiki 1.12 and FreeBSD 6.2
 * please run xmllint --noout
 * xmllint is standard on most linux distributions, don't know about bsd. if you can't find it, please find some other xml checker and run it over the file. the hope is that it will produce a more meaningful error message. -- Duesentrieb ⇌ 19:42, 6 August 2008 (UTC)
 * Also, please provide the xml code around the given location (use head and tail, or something similar) -- Duesentrieb ⇌ 19:44, 6 August 2008 (UTC)

Here's the output of xmllint --noout (from a Linux box)--Ldillon 15:55, 7 August 2008 (UTC) Basically, there is a at line 1, a corresponding at line 9808, and a new at line 9809, where the parser errors. I did a quick grep -c \<page && grep -c \<\/page and I can at least say there are the same number of open and close tags. I'd include more but, even with the code and nowiki tags, this page tries to parse the text.

If the first tag is, then something is very wrong - should not be the top level tag (and there must only be one top level tag in an xml document, hence the error). The first tag in the file should be a tag, which wraps everything - compare the output of Special:Export/Test. What'S the exact command used to generate these dumps? do you perhaps use --skip-header? That would generate such an incomplete xml snippet. -- Duesentrieb ⇌ 18:55, 7 August 2008 (UTC)

I've tried a bunch of things to get this working, including the --skip-header and --skip-footer flags, because I was getting header and footer "junk" that I didn't need when I tried to import to TWiki. Sorry if I do not completly understand what the flags are supposed to do; I didn't see any documentation that said otherwise so I took it at face-value. I'm still getting errors on import when i omit the flags, but the xml passes "xmllint --noout" so I guess the problem lies elsewhere. Thank you for your feedback.--Ldillon 21:19, 7 August 2008 (UTC)

Bringing the wiki offline?
Is it possible to bring a MediaWiki site offline or to put it in a "read only" mode so no changes are made during a database dump? --Kaotic 12:50, 20 August 2008 (UTC)


 * Manual:$wgReadOnly -- Duesentrieb ⇌ 13:55, 20 August 2008 (UTC)

I've taken your script and added the ability to place the wiki into read/write mode. let me know what you think. User:Kaotic/WikiBackup --Kaotic 09:51, 21 August 2008 (UTC)