2007

First Note

Note: Some of the information in this page was adapted from Manual:Moving a wiki. robchurch | talk 20:59, 11 April 2007 (UTC)

Information needed on how to verify and restore a backup

Great overview, but so much more needed:
- how to verify our backup or export
- how to restore or import our data again

And we need specific, unambiguous, differentiating definitions of these words. Let me try:

"backup / restore"
file copy, done from outside the program using the operating system utilities
"export / import"
data copy, done inside the program using MediaWiki program interface utilities

... or something like that. Then we can clearly write specific steps (the "by doing what") for each way.

I agree that backing up everything is probably best, but how does someone KNOW what's been customized and belongs to them, and what's standard and can be replaced from a fresh reinstall of the master software? Does any restore or reinstall routine intelligently preserve existing data? Or do these processes merely overwrite anything in their way? So, if I back up today, then crash tomorrow, and then restore yesterday's backup, and something is missing or it doesn't function properly, what should I do? I'd probably then reinstall from scratch to rebuild an empty MediaWiki structure. Then I'd try restoring from my backup again to fill in the supposedly completely rebuilt but empty structure. What if even that fails? What gets clobbered? How do I "know"?

Should I reinstall an empty MediaWiki and try to "import" data? What if I don't have a data "export", but I only have a file "backup" copy? How do I then reconfigure my customized choices? Do all my users come back with a restore or import?

So, it probably makes sense to preserve a mirror image offline and copy everything, or one file at a time, from there during troubleshooting if something's not working in the main system. My MediaWiki is small at the moment - ~500,000 words in ~4,500 sections, ALL files = ~120 MB - so making multiple copies to CD daily is acceptable. Even ~5 backups will fit on one CD, for a total materials cost per year of ~$60 or less.

I suggest people try renaming directory structures to pull their main MediaWiki off line and TRY restoring their backup to a fresh directory structure to verify if their chosen backup routine works or not. If not, figure out why not before trusting our backups!

-- Peter Blaise peterblaise 11:21, 19 April 2007 (UTC)

I have moved the paragraph you added for now, as it's not quite correct:

See also Verifying a wiki backup and Restoring a wiki from a backup and Combining wikis from multiple backups. Note also that this page addresses backing up DATA only. Your MediaWiki installation also probably includes much custom configuration in the form of changes to various supporting files that are as yet NOT incorporated into any database table, including CSS cascading style sheet files and PHP script files and JS Java script files. You must backup and restore / re-integrate these as separate steps. (Does someone want to write an extension to import and export all support files into supplemental database tables so everything is all in one place?)

The statement that the page is about data backup only is plain wrong - there's an extra section about backing up files; it could probably be more detailed, though. The only file that is "custom" by default is LocalSettings.php, plus the uploaded files in the images directory of course.

Also, while red links are generally a good thing, they should "invite" people to write pages that actually make sense. First of all, those pages should be in the Manual namespace. "Verifying a wiki backup" isn't really possible, or rather, it's the same as Restoring a wiki from a backup (which is a page we should probably write soon). And Combining wikis from multiple backups probably doesn't make sense as a page of its own, and should be addressed in Restoring a wiki from a backup which should also explain what is overwritten when, or not.

There's also Manual:Moving a wiki, with which the contents of future pages should be coordinated.

Sticking everything into the database isn't really possible - the configuration must be outside, because it has to tell MediaWiki how to access the database in the first place. Customized JS and CSS can go into the database (as MediaWiki:Common.js and MediaWiki:Common.css respectively), and that's the preferred way. Skins that need custom PHP code cannot reside in the database; program code needs to be in files (for technical as well as security reasons). Same for extensions. -- Duesentrieb ⇌ 11:32, 19 April 2007 (UTC)

Import config files, backup, verify, restore, automate

Peter Blaise says:

Why not auto-import text config files into the database?

Thanks, Duesentrieb. I see your points. Yes, there's mention of some support files, but I have dozens not mentioned! In my pushing for the maturation of MediaWiki, including its support universe, I suggest that it could use a feature to automatically import copies of our custom config files into the main database, even though it needs master copies of those files outside, in the operating system, in order to run properly. Your suggestion of manually creating copies as articles is interesting. How about automation, anyone?

Why not create an auto verify of backup?

I also see that you agree with me that there is no MediaWiki "verify" or "compare" option for backup or export, and, as you say, all we can do is try to restore or import and check it manually (against what, our memory of how the wiki behaved before?). Again, I'm pointing out a difference between "mature" applications we may have experienced before MediaWiki, and MediaWiki. My operating system backup application has a verify / compare feature after backup, MediaWiki has ... what?

Let's call items by their names, not shorthand.

Also, to reduce confusion and invite and enhance the quick success of newcomers, may I suggest sticking to a specific and complete nomenclature? You say you moved my comments to the "talk" pages, which I could not find. But, thankfully, I eventually found your link that brought me to the "discussion" page, which I'd already been looking at, and was already using. Why not call it the "discussion/talk" page? Thanks.

-- Peter Blaise peterblaise 18:02, 20 April 2007 (UTC)

The phrases "talk page" and "discussion page" are synonymous and interchangeable in MediaWiki wiki culture. robchurch | talk 00:54, 29 April 2007 (UTC)

Peter Blaise says:

So ... newbies, non-wiki culture people, are not invited to make the wiki their home? Rob, I'm suggesting that we recognize how offputting and success-inhibiting "jargon" is to newcomers. I suggest that we all foster growth by welcoming newcomers, and welcoming and recognizing criticism, not chiding them for their "not getting it, not fitting in". I'm suggesting that any Wiki is not owned by first comers, but is owned by anyone at any moment who is reading and offering their edits, their input at any time. I think this coincides with the intended Wiki "culture" more so than elitist jargon and belligerent exclusivity of first-comers against newbies.

"Why can't we all just get along?"

-- Rodney King

"Talk" and "discussion" do not function as synonyms in that they are not interchangeable. Note this page's URL says "talk" yet this page's tab says "discussion". Try typing "discussion" into the URL and try looking for a "talk" tab. No can do. Is it so hard to say, "I moved your comment to the discussion/talk page" AND give a link, rather than presume others can figure out the ambiguity on their own, and then dismiss them when they suggest a way around getting lost?

-- Peter Blaise peterblaise 10:39, 1 May 2007 (UTC)

The problem is that internally, these pages are "talk" pages, and the namespace's English name has always been "talk" (and user_talk, etc). But at some point it was decided to label the (English) tab at the top of each page "discussion" instead. This label can be changed by editing MediaWiki:Talk. Most other languages seem to use the equivalent of "Diskussion" on the label, and also as the namespace name. Some languages may use something else entirely.

This is inconsistent and confusing, yes. But it's a fact that in the context of MediaWiki, "talk" and "discussion" are interchangeable. Making "Discussion" an alias for the "Talk" namespace may be nice, but it would break backwards compatibility. This confusion isn't easy to resolve. It's better to explain it and live with it. -- Duesentrieb ⇌ 12:57, 1 May 2007 (UTC)

"php dumpbackup.php --full" returns "DB connection error: Unknown error"

Peter Blaise asks:

I run:

C:\www\apache2\htdocs\mediawiki\maintenance>php dumpbackup.php --full

DB connection error: Unknown error

... and when I search the drive for new files, I see nothing's been created. How should an XML export / backup work? HELP, please!
-- Peter Blaise peterblaise 13:34, 20 June 2007 (UTC)

Learn to use a shell. You will notice that the dumpBackup.php script spits out XML. This XML should be saved into a file. The standard means of doing this is to redirect standard output to a file. On Windows (and also on POSIX-compliant shells), this is done using the > operator, e.g.

php dumpBackup.php --full > backup.xml

We expect our users to have at least a basic working knowledge of their computers. robchurch | talk 14:11, 21 June 2007 (UTC)

...and geeks are expected to read carefully before bashing other users: Peter's problem is not the output file, his problem is the DB connection error: Unknown error. I got the same under Kubuntu Feisty; it seems something is wrong with the out-of-the-box installation. Cheers, 88.73.85.21 13:45, 21 July 2007 (UTC)

I'm getting the same error on an FC5 installation. Does anyone know the cause of this error? So far my Google searches haven't turned up any relevant information. Zeekec 20:32, 1 August 2007 (UTC)

I tried creating an AdminSettings.php file as suggested, but still get the same error. Zeekec 15:27, 2 August 2007 (UTC)

Me too...

I get exactly the same error, but I need dumpBackup.php for Lucene integration.

Can nobody explain the "DB connection error: Unknown error" problem? --11 September 2007

OK guys,
After a lot of trouble and study of this script, I've found a solution for this problem. It is caused by the file "includes/backup.inc", which dumpBackup.php requires. This file does the main backup work and uses some MediaWiki variables ($wg...). That is no problem when dumpBackup.php runs within MediaWiki, but as a standalone console script it lacks these $wg... parameters. So dumpBackup.php uses empty strings for $wgDBtype, $wgDBadminuser, $wgDBadminpassword, $wgDBname and $wgDebugDumpSql, and this causes the "DB connection error: Unknown error" at run time. I've solved this with a self-written PHP wrapper script, which only initializes these variables and then simply includes dumpBackup.php, and now it works fine.

This is my php-wrapper-script:

 <?php
 ## dumpBackupInit - wrapper script to run the MediaWiki XML dump "dumpBackup.php" correctly
 ## @author: Stefan Furcht
 ## @version: 1.0
 ## @require: /srv/www/htdocs/wiki/maintenance/dumpBackup.php

 # The following variables must be set for dumpBackup.php to work.
 # You'll find these values in the database section of your MediaWiki config, LocalSettings.php.
 $wgDBtype = 'mysql';
 $wgDBadminuser = "[MySQL-Username]";
 $wgDBadminpassword = "[MySQL-Usernames-Password]";
 $wgDBname = '[mediaWiki-Database-scheme]';
 $wgDebugDumpSql = 'true';

 # dumpBackup.php needs the variables set above,
 # so simply include the original script after defining them.
 require_once("/srv/www/htdocs/wiki/maintenance/dumpBackup.php");
 ?>

Now you can use this script just like dumpBackup.php, except that it will (hopefully) run correctly. Example: php dumpBackupInit.php --current > WikiDatabaseDump.xml

I hope this will help you. Please excuse my probably bad English.

Regards -Stefan- 12 September 2007

Another (simpler) solution.

Simply add the above-mentioned variables to your LocalSettings.php

You will notice that most of them are already there. The ones that need to be added are:

   $wgDBadminuser="[MySQL-Username]";
   $wgDBadminpassword ="[MySQL-Usernames-Password]";

Tested to work with MediaWiki 1.11.0

-Rammer- 9 November 2007

back up with phpmyadmin

If your host will not allow you to access such tools and you can only use phpMyAdmin, or if this does not work for you, you might want to:

Export your full MySQL wiki database with phpMyAdmin's export functionality. Save the exported file locally.
Edit the exported file, and change the following if it applies to you:
1. Search and replace to change your table prefixes (e.g., because prefixing is no longer required on your new host).
2. To work around the "latin1 in MySQL > 4.1" character set problem, search and replace the latin1 character set with utf8. This might cause some strange behavior afterwards, because I'm not sure that MediaWiki won't be disturbed by the column encoding changing without warning. But apparently, for me, it works (and I found no other way to do it).
  - Please note: as utf8 takes more space (MySQL reserves up to three bytes per character) than latin1 (one byte per character), some keys might become too large (my MySQL installation does not allow keys > 1000 bytes). For these fields I didn't change the encoding (luckily these tables were empty at migration time). You can just do this by trial and error: phpMyAdmin will warn you at import time if some key is too big.
3. You might want to transform the utf8 back to latin1 with ALTER TABLE statements (phpMyAdmin can do that for you). This will not revert the changes you just made, as this time the contents will be re-encoded also.

Maybe this should be checked by some mediawiki expert and, if judged a good advice, integrated into the manual? I spent a full day searching for this workaround! --OlivierMiR 7 July 2007

Thanks for the idea : worth trying but not sufficient for me though :(

NewMorning 20:10, 16 June 2008 (UTC)

latin1

The following line contradicts itself, no? Use the option --default-character-set=latin1 on the mysqldump command line to avoid the conversion if you find it set to "latin1". Jidanni 04:46, 13 December 2007 (UTC)

The latin warning has me confused as well. I think that section should be rewritten/clarified by someone who understands it.

Where do I enter 'SHOW CREATE TABLE text'? It doesn't look like a valid SQL command to me and gives me an error when I run it.

You can see which character set your tables are using with a statement like SHOW CREATE TABLE text. The last line will include a DEFAULT CHARSET clause.

--71.107.96.222 17:40, 22 March 2008 (UTC)

No answer still.

I agree with the above:

SHOW CREATE TABLE text; is not a working command.

Neither replacing "text" nor omitting it works.

--Livingtale 09:06, 20 September 2008 (UTC)

Rewrite this page

This page is written very badly. It jumps all over the place and I can barely figure out what to do. Please break it into shorter, simpler steps. PatPeter 20:51, 10 December 2007 (UTC)

2008

Corruption Section for MySQL 4.1 is unclear

The section that discusses possible corruption due to nonstandard character encoding is unclear. I cannot seem to make out if it is saying that the dump may be corrupted or if my actual database may be corrupted.

It discusses doing a conversion prior to dumping, but does not say if this conversion is reflected back to the DB.

I thank you for the documentation that is here, but could you please clarify this issue? --Vaccano 17:50, 30 January 2008 (UTC)

I agree, this section is not clear at all!

I'm setting up a new wiki, my host has mysql 5.

In phpMyAdmin, here are the server variables I have :

character set client = utf8
character set connection = utf8
character set database = latin1
character set filesystem = binary
character set results = utf8
character set server = latin1
character set system = utf8

What should I do ? The connection, client, results and system are utf8 but server is latin1. Do I need a conversion ?

--Iubito 17:24, 1 March 2008 (UTC)

You can check, for instance, in your "categorylinks" table: mine is full of accents transformed into strange characters. Impossible to figure out what to do with phpMyAdmin though...

NewMorning 20:10, 16 June 2008 (UTC)

I'm a WikiNewb....help!!

Can I just back up my wamp file and everything in it on an external hard disk?

I mean, the backup that I'm used to consists of moving files to a different disk. I know that there must be more to it than this when it comes to a wiki, but, frankly, I don't know what the heck I'm doing.

I downloaded MediaWiki two days ago to use as a database for personal journals, etc. and know how to edit and create new "articles" within the database. I plan on creating new "articles" every day for the rest of my life and suspect that my computer's hardware will not last that long. So, I not only want to back up all of my database information, in case of some tragedy, but eventually will want to move it to a different computer altogether.

What am I to do? --13 June 2008

My table categorylinks is corrupted !

I can't figure out how this is possible: I first thought that the dump had corrupted the table. But when I checked inside phpMyAdmin, I realised that the French accents, correctly written in the wiki, were corrupted in the table! This is annoying since my wiki was supposed to be a test version, and I need to back it up for a new server where strange characters give strange characters on screen! I only have access to phpMyAdmin, and wonder what I can do with that: any suggestions? NewMorning 20:18, 16 June 2008 (UTC)

In the end it's not that bad: I exported my corrupted database and imported it, bad characters and all, and they appear correctly in the other wiki! Strange that I can't have them corrected in the DB though... I could also read the database export by converting it to UTF-8 with a text converter, but if I import the corrected text into the other DB, the wiki shows strange characters again!

--NewMorning 04:32, 22 June 2008 (UTC)

phpMyAdmin probably got it wrong. Actually, it doesn't have a way to know how the characters in the tables are encoded, since MediaWiki (by default) stores them as binary.

Generally, be very careful about character encoding: no matter what client you use (phpMyAdmin, PHP CLI client, whatever), MySQL nearly always performs some conversion on the characters, which is supposed to make them "look right" for you, but quite often screws things up. -- Duesentrieb ⇌ 11:37, 22 June 2008 (UTC)

Thanks for answering, I got it through anyway: the export was weird (strange accents) but I imported it the same way and it worked! I used phpMyAdmin both times, and both times had set everything to UTF-8. The export itself was readable with a text editor after converting it to UTF-8, but I could not import it afterwards: it was nice in the table, but awful in the wiki! --NewMorning 22 June 2008

dumpBackup.php seems to be generating invalid xml

I'm trying to export my MediaWiki content to TWiki and the conversion program fails with "junk after document element at line 9626, column 2, byte 907183 at ... (very long error message)" The problem seems to be that the xml produced by dumpBackup.php is invalid. I tested this by pointing Firefox to the xml dump and it stops at the same place. I've looked at the raw xml code and I don't see anything obviously wrong. Any ideas what might be the problem? I'm using MediaWiki 1.12 and FreeBSD 6.2 --Ldillon 6 August 2008

please run xmllint --noout <filename>

xmllint is standard on most Linux distributions; I don't know about BSD. If you can't find it, please find some other XML checker and run it over the file. The hope is that it will produce a more meaningful error message. -- Duesentrieb ⇌ 19:42, 6 August 2008 (UTC)

Also, please provide the XML code around the given location (use head and tail, or something similar) -- Duesentrieb ⇌ 19:44, 6 August 2008 (UTC)

Here's the output of xmllint --noout (from a Linux box)

 MediaWiki_dump.xml:9809: parser error : Extra content at the end of the document
 <page>
 ^

Basically, there is a <page> at line 1, a corresponding </page> at line 9808, and a new <page> at line 9809, where the parser errors. I did a quick grep -c \<page && grep -c \<\/page and I can at least say there are the same number of open and close <page> tags.

     </revision>
   </page>
 <page>
     <title>Cacti Information</title>
     <id>3</id>
    <revision>

I'd include more but, even with the code and nowiki tags, this page tries to parse the text. --Ldillon 15:55, 7 August 2008 (UTC)

If the first tag is <page>, then something is very wrong - <page> should not be the top level tag (and there must only be one top level tag in an XML document, hence the error). The first tag in the file should be a <mediawiki> tag, which wraps everything - compare the output of Special:Export/Test. What's the exact command used to generate these dumps? Do you perhaps use --skip-header? That would generate such an incomplete XML snippet. -- Duesentrieb ⇌ 18:55, 7 August 2008 (UTC)
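The single-root rule described above can be checked with a few lines of Python. This is only a minimal sketch: the two sample strings are made-up miniature dumps, not real dumpBackup.php output, and the helper name is_well_formed is hypothetical. It uses the standard library parser, which rejects a document with more than one top-level element.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature dumps; real dumpBackup.php output is far larger.
FRAGMENT = "<page><title>A</title></page>\n<page><title>B</title></page>"
WRAPPED = "<mediawiki>\n" + FRAGMENT + "\n</mediawiki>"


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as one complete XML document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(FRAGMENT))  # two top-level <page> elements
print(is_well_formed(WRAPPED))   # one <mediawiki> root wrapping both pages
```

A fragment like the first string is what a dump without its header and footer would look like, which matches the "extra content at the end of the document" error quoted earlier in this thread.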

I've tried a bunch of things to get this working, including the --skip-header and --skip-footer flags, because I was getting header and footer "junk" that I didn't need when I tried to import to TWiki. Sorry if I do not completely understand what the flags are supposed to do; I didn't see any documentation that said otherwise so I took it at face value. I'm still getting errors on import when I omit the flags, but the XML passes "xmllint --noout" so I guess the problem lies elsewhere. Thank you for your feedback. --Ldillon 21:19, 7 August 2008 (UTC)

Bringing the wiki offline?

Is it possible to bring a MediaWiki site offline or to put it in a "read only" mode so no changes are made during a database dump? --Kaotic 12:50, 20 August 2008 (UTC)

Manual:$wgReadOnly -- Duesentrieb ⇌ 13:55, 20 August 2008 (UTC)

I've taken your script [1] and added the ability to place the wiki into read/write mode. Let me know what you think. User:Kaotic/WikiBackup --Kaotic 09:51, 21 August 2008 (UTC)

alternative cronjob command

Apparently I don't have rights to make changes on the page... Strange: you can post, and after that everything has disappeared...

If you don't have the nice program, here is an alternative cron job:

/usr/bin/mysqldump -u [username] --password=[password] [databasename] | /bin/gzip > [databasename].gz

[username] - this is your database username
[password] - this is the password for your database
[databasename] - the name of your database

don't forget to remove the []

You can use other names for the file: [database].gz and/or put a number before it.

If you use a different number for each cron job, you can schedule them (for instance, a cron job for each day of the week, named 1[databasename], 2[databasename], etc.). For uploading the .gz file with the database manager (in my situation, DirectAdmin), the name of the file is not critical, but it has to be a .gz file. --Livingtale 13:33, 22 September 2008 (UTC)
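The numbering scheme suggested here (one dump per weekday, overwritten a week later) could also be derived from the date instead of maintaining seven separate cron jobs. A minimal Python sketch, assuming a hypothetical helper backup_filename and a made-up database name "wikidb"; it only builds the file name and does not run mysqldump:

```python
from datetime import date


def backup_filename(database: str, day: date) -> str:
    """Prefix the dump name with the ISO weekday (Monday=1 ... Sunday=7)."""
    return f"{day.isoweekday()}{database}.gz"


# 22 September 2008 was a Monday, so this prints "1wikidb.gz".
print(backup_filename("wikidb", date(2008, 9, 22)))
```

With names like these, at most seven dumps exist at any time, and each one is replaced after a week.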

Files list

The section about file system should contain a comprehensive list of files and directories to back up, or link to such a list. --Florent Georges 00:13, 17 November 2008 (UTC)

Agree

Although the instructions are indeed sufficient for backing up a wiki, compressing the entire $IP is overkill. That section should at least tell which directories must be copied. In a cursory check, $IP/images is the only one guaranteed to exist in all MediaWiki wikis and must be present in all backups. Extensions vary greatly among wiki instances, and may be treated as data by some. I believe they should be managed as code and kept in SCM or in Docker images beside MediaWiki itself.

Think of a container scenario: pieces that hardly change should be in the image, data goes into volumes. Whatever goes into volumes is the portion that must go into backups and should therefore be listed under the "Filesystem" section.

--Cybermandrake (talk) 13:03, 7 October 2019 (UTC)

2009

Charset problem after switching to wgDBmysql5 = true

Hello,

After years, we finally solved our charset issue and have been able to switch back to "wgDBmysql5 = false".

One of our symptoms was that some special characters (like "φ") were silently converted (corrupted) into literal question marks (?).

How we resolved it is documented here in English and in French.

I hope this will help!

Jean-Luc

xml backup/restore + database backup/restore or one of the other?

I'm confused: should I back up (and restore) both the XML and the SQL database (MySQL), or just one or the other?

As per the article, an XML dump does not include site-related data (user accounts, image metadata, logs, etc), so you'd be better off performing an SQL dump. In a discussion about MediaWiki backups, I'd describe an XML dump as more of an 'export', and in this sense it can be used as a fallback to save off the rendition of wiki content (with or without historical revisions), but it is not capable of restoring your MediaWiki installation to a running state on a new server, as it existed previously. -- Gth-au 03:44, 3 November 2009 (UTC)

crontab?

Since the manual is meant for non-experts too, a word like crontab maybe shouldn't be used, or should be defined clearly. Personally I have no idea what a crontab is.

The word is now linked to the "cron" Wikipedia article. --Emufarmers (T|C) 05:00, 16 May 2009 (UTC)

The explanation of cron is useful; however, while a simple backup may be attempted by non-experts, investigations should be done by someone who is at least familiar with your environment's operating system, database engine, webserver and MediaWiki installation. -- Gth-au 03:44, 3 November 2009 (UTC)

Maintenance / Optimisation / or normal part of Backup?

Can the reference to data that need not reside in the wiki backup content be clarified? Specifically, this list discussion linked to from the article mentions content that could be rebuilt upon restoration, without explaining how that rebuild would be performed, nor how long that may typically take in a small/large wiki implementation. Comments are also made that during the rebuild period the user experience would be curtailed in some areas (search, whatlinkshere, category views; any other areas?) - would users see a warning that the wiki is undergoing a rebuild and to come back later? What happens to edits made during the rebuild process? I see the reduction of backup data as a useful method of achieving faster backups and restoration (thus faster verification of backups, too), but I'm curious for more detail. Conversely, if the suggestion has no merit and is purely an advanced topic for a possible future feature, perhaps it shouldn't be presented in the article. --Gth-au 04:30, 3 November 2009 (UTC)

Empirical Backup Procedure Needed

The backup procedure described in this article is full of vague statements and suggestions that 'might' be necessary or desirable. Front and center there should be a process for hosted installations to back up files (FTP) and database (SQL). Then, separately, expand on other options as differing methods to achieve the same goal, such as: if the wiki owner has filesystem access, then that enables more than just FTP to back up the files; and using a specific database engine toolset enables other database backup options, and so on. I'm interested in assisting in improving this area of the documentation, but is this the appropriate place to brainstorm it out? Is there a beta discussion area to present test backup results etc. that would be more appropriate? -- Gth-au 03:44, 3 November 2009 (UTC)

Well said, I second this request --SomaticJourney 11:43, 19 February 2010 (UTC)

I agree as well. This page is all over the place. Just give me two examples, one for windows and one for other systems. JB

Empirical Backup Verification Needed

I saw an earlier suggestion for this (see top), but it degenerated into the inherited 'talk vs. discussion' issue. The need for users/admins to be confident their backup has worked is still a clear requirement and a very valuable feature to aim for, IMHO. As the various backup methods resolve down to files, I'd posit that there are two levels of verification desired: (1) confirmation that none of the backup steps ended in error; and (2) an expanded test to prove the content from the backup repository can be restored to a working system, possibly with some kind of verification tool between the original site and the other. Item (1) can be confirmed with cross-checks such as file counts (per directory & grand total), and database backups could be confirmed via row counts (per table & grand total) and log files that record the tasks executed, their actions and their return codes. Item (2) is more involved, but it would be a desirable goal - perhaps more suitable for a page like Manual:Restoring a wiki though? -- Gth-au 03:44, 3 November 2009 (UTC)
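The file-level cross-check in item (1) could look roughly like the following Python sketch. Nothing here is a MediaWiki feature: the helper names manifest and compare are made up, SHA-256 checksums are used in addition to plain file counts (an assumption, since counts alone miss damaged files), and the demonstration runs on two throwaway directory trees rather than a real wiki.

```python
import hashlib
import tempfile
from pathlib import Path


def manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare(original: Path, restored: Path) -> list[str]:
    """List relative paths that are missing, extra, or different."""
    a, b = manifest(original), manifest(restored)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Tiny demonstration with two throwaway directory trees.
with tempfile.TemporaryDirectory() as tmp:
    live, copy = Path(tmp, "live"), Path(tmp, "copy")
    for tree in (live, copy):
        (tree / "images").mkdir(parents=True)
        (tree / "LocalSettings.php").write_text("<?php // settings\n")
        (tree / "images" / "logo.png").write_bytes(b"\x89PNG")
    (copy / "images" / "logo.png").write_bytes(b"damaged")  # simulate a bad copy
    print(len(manifest(live)), compare(live, copy))
```

The length of a manifest gives the grand-total file count, and an empty list from compare means the two trees hold the same files with the same contents. Database row counts would need a separate check against the database itself.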

Yes, very important need. I agree Gth-au --SomaticJourney 11:46, 19 February 2010 (UTC)

Tips for Shared Hosting wikis

For those of us who use GoDaddy and other shared hosting sites (with permission issues): what is the best possible method to back up the wiki? --SomaticJourney 13:04, 19 February 2010 (UTC)

about Backing up the wiki from public caches...

Someone posted about backing up the wiki from public caches... (added by --Diego Grez ^{return fire} 02:51, 26 May 2010 (UTC)) (content restored to Manual:Restoring wiki code from cached HTML)

You haven't explained why you removed this content, so I'm going to restore it. Then, you can add your comments on the discussion page. Jdpipe 08:09, 26 May 2010 (UTC)

Because it should be merged there, not on another page anyway. --Diego Grez ^{return fire} 18:00, 26 May 2010 (UTC)

My content is on a different topic to 'proper' backups... IMHO instead of deleting the content, you should have issued a merge request. 150.203.43.41 02:06, 27 May 2010 (UTC)

New discussion is at Manual talk:Restoring wiki code from cached HTML Jdpipe 02:33, 27 May 2010 (UTC)

XML dump... images?

It's not clear from the main page whether or not the XML dump includes the image files. Maybe it should be explicitly stated if the images need to be handled separately? Jdpipe 08:22, 26 May 2010 (UTC)

2010

Background on Latin-1 to UTF-8 Conversion and Character Set Problems

Problems with the character set (i.e., special characters like umlauts not working) appear to be widespread when converting from latin-1 to utf-8. The tips on the main page sure helped me a lot. However, to solve my problems I needed more background information. Here I report what I found out.

An encoding (also inaccurately called a character set) describes how a character is represented as one or more bytes. The latin-1 encoding has 256 characters, thus it always uses exactly one byte per character. This does not support the characters of many languages existing on Earth. The utf-8 encoding supports all characters of all languages practically used. It uses one to four bytes per character (MySQL's older utf8 character set stores only the one- to three-byte subset). Utf-8 encodes the ASCII characters (i.e., A to Z, a to z and the most common punctuation characters) in one byte, and it uses the same value for this as in the latin-1 encoding.

Mediawiki uses the utf-8 encoding, thus allowing all special characters to be used in a wiki page. Internally, Mediawiki stores a page as a string of bytes in its data base. Mediawiki (at least as of version 1.16.0) does not convert the encoding in any way when storing a page in its data base.

The data base must accept a string of bytes from Mediawiki when storing a page, and the data base must return the exact same string when retrieving the page. It does not matter what the data base thinks the encoding of the page is, as long as it returns the same string that was stored. Therefore, Mediawiki can have a page in utf-8 encoding, and it can store it in a data base which thinks the strings are characters encoded in latin-1. And historically, it was necessary to do exactly this when no data base was available that supported utf-8.

The problems started when an updated MySQL data base did not return the same string of bytes on retrieval. In particular, this can happen after backing up and restoring a MySQL data base. The reasons for this behaviour are as follows.

One popular way of backing up a MySQL data base is to use the program mysqldump. It generates SQL code that describes the contents of the data base. If you feed this SQL code back into a data base, for example by using the code as the standard input for the program mysql, then the data base contents will be recreated.

One problem with the mysqldump/mysql approach is that the SQL code of course contains the special characters from the wiki pages, but that the SQL code does not specify the encoding used for wiki page table content. (To be more precise, this is true only for tables using the InnoDB engine. Tables using the MyISAM engine get a pure ASCII representation immune to encoding problems.) Therefore, backup/restore work only if both sides use the same encoding.

A harmless side effect is that the generated SQL code can contain broken special characters even if everything works. This happens when the data base internally stores the wiki pages in latin-1 encoding, but the default encoding for talking to data base clients like mysqldump is utf-8. Then the MySQL server "converts" the bytes in the database from latin-1 to utf-8 when dumping and converts them back when restoring. For example, an umlaut character, which is represented in utf-8 by two bytes, gets a four-byte representation in the SQL code, which no editor can display correctly. This works because every byte can be interpreted as a latin-1 character (even if it really is one of the several bytes of an utf-8 character).
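The two-byte to four-byte effect can be reproduced outside MySQL. The following is only a sketch that uses iconv as a stand-in for the server's conversion; the function names are made up for the example.

```shell
#!/bin/sh
# Sketch: reproduce the "double encoding" of an umlaut with iconv.
# iconv stands in for the MySQL server's conversion; names are hypothetical.

# "ü" in UTF-8 is the two bytes 0xC3 0xBC (octal 303 274).
utf8_umlaut() { printf '\303\274'; }

# Misread those bytes as latin-1 and convert them "to" UTF-8, as on dump.
double_encode() { iconv -f ISO-8859-1 -t UTF-8; }

# The reverse conversion, as on restore.
undo_double_encode() { iconv -f UTF-8 -t ISO-8859-1; }

utf8_umlaut | wc -c                                       # 2 bytes
utf8_umlaut | double_encode | wc -c                       # 4 bytes
utf8_umlaut | double_encode | undo_double_encode | wc -c  # 2 bytes again
```

The round trip is lossless because every byte value is a valid latin-1 character, which is the point made above.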

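In my case, the change from MySQL version 5.0 to version 5.1 came with a change of the default encoding from latin-1 to utf-8. (Upstream MySQL kept latin1 as its built-in server default until version 8.0, so a change like this probably comes from the distribution's or host's configuration.)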

You can specify the encoding explicitly when you need to move your data base contents to a data base with a different default encoding:

mysqldump --user=root --password --default-character-set=latin1 --skip-set-charset wikidb > mywiki.sql

(This assumes that your data base is named "wikidb", and that the internal representation is set to latin-1. As a consequence, the MySQL server returns the data as-is, i.e. without any conversion.) You can read mywiki.sql into another data base, which uses utf-8 by default, by typing:

mysql --user=root --password --default-character-set=latin1 wikidb < mywiki.sql

(Again, this reads the data as-is, because MySQL thinks that no conversion is necessary, since the SQL code specifies that the individual fields shall be stored in latin-1 representation.)

However, having stored the data unmodified in the utf-8 data base is not sufficient. When the data base server is asked to retrieve a wiki page, it will notice that it is stored in latin-1 encoding, while it talks to its clients in utf-8. Therefore, the data base server will "convert" the data, thus breaking it on delivery.

You can fix this problem by changing the specifications for the internal encoding of the data that are written into the SQL code of a data base dump. Editing the SQL code manually would be tedious and error-prone. A better solution is to use an automated stream editor like sed, which comes with all Linux/Unix distributions (and with Cygwin on Windows).

The stream editor must find all occurrences of latin-1 data base field definitions and replace them. You could choose a utf-8 encoding, but I chose to mark the fields as "binary", i.e. without a specific encoding. The reason is that this is what Mediawiki really puts into the data base. The command line for this is:

sed < mywiki.sql > mywiki-patched.sql \
    -e 's/character set latin1 collate latin1_bin/binary/gi'

Additionally, you should change the default encoding for each table from latin-1 to utf-8. Therefore, you extend the above command line like this:

sed < mywiki.sql > mywiki-patched.sql \
    -e 's/character set latin1 collate latin1_bin/binary/gi' \
    -e 's/CHARSET=latin1/CHARSET=utf8/gi'

But you should still make some more modifications. As explained on the main page, there is a restriction on the length of sort keys that might be violated when a wiki page character is converted from latin-1 to utf-8. (I did not really understand this particular aspect, since there should not be any actual conversion when things are done as described by me above.) If you don't experience the problem, you might skip the fix, but I suppose it does not hurt to shorten the sort key limit in any case. You can do all substitutions using the following command line:

sed < mywiki.sql > mywiki-patched.sql \
    -e 's/character set latin1 collate latin1_bin/binary/gi' \
    -e 's/CHARSET=latin1/CHARSET=utf8/gi' \
    -e 's/`cl_sortkey` varchar([0-9]*)/`cl_sortkey` varchar(70)/gi'

(Note that the regular expression [0-9]* matches a string of digits of any length. This subsumes the three separate substitutions given on the main page.)

Finally, the main page says that the content of the table named math could cause problems and can safely be deleted, since it is a cache only and not needed when upgrading. The complete command line including this deletion of math is:

sed < mywiki.sql > mywiki-patched.sql \
    -e 's/character set latin1 collate latin1_bin/binary/gi' \
    -e 's/CHARSET=latin1/CHARSET=utf8/gi' \
    -e 's/`cl_sortkey` varchar([0-9]*)/`cl_sortkey` varchar(70)/gi' \
    -e '/^INSERT INTO `math/d'

Please note that the example given there (and here) assumes that you have defined an empty table name prefix for your Mediawiki data base tables. If not, you have to prepend that prefix. For example, if your prefix is mw_, you have to write mw_math.
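To make the prefix explicit, the same substitutions can be wrapped in a small function. This is only a sketch: `patch_dump` is a name made up here, it reads the dump on standard input, and the `i` flag needs GNU sed.

```shell
#!/bin/sh
# Sketch: the substitutions from above with a configurable table prefix.
# patch_dump is a hypothetical helper; usage:
#   patch_dump PREFIX < mywiki.sql > mywiki-patched.sql
# PREFIX is the MediaWiki table prefix ("" if none, e.g. "mw_").
# The "i" flag of the s command is a GNU sed extension.
patch_dump() {
    prefix=$1
    sed \
        -e 's/character set latin1 collate latin1_bin/binary/gi' \
        -e 's/CHARSET=latin1/CHARSET=utf8/gi' \
        -e 's/`cl_sortkey` varchar([0-9]*)/`cl_sortkey` varchar(70)/gi' \
        -e "/^INSERT INTO \`${prefix}math\`/d"
}
```

Unlike the one-liner above, the last expression also matches the closing backquote, so only the math table itself is dropped and not other tables whose names merely start with "math".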

I hope this background information is helpful. Please correct any mistakes or omissions. If it helps, someone could link this material from the main page. (It is probably too long to be put there directly.)

Bigoak 09:39, 27 August 2010 (UTC)

Ubuntu 10.10 - Step by Step Instructions

Details

Howdy folks. Ubuntu has some funky restrictions with it (like cPanel) but it was the distro of choice at the time of launch. I thought I'd drop some step-by-step instructions here for the next Ubuntu user. Hope it helps someone; this is the procedure I've followed so far (my Wiki is only used by about 20 people, so it's comfortable for me to work with it manually).

Assumptions:

- You have installed PhpMyAdmin

- Your wiki is running as expected

- You are using Ubuntu 10.10 (should be fine on earlier versions)

- You are using the default directories

- You are comfortable with PHP and Ubuntu

Notes:

- I tested this myself

- Backed up and deleted my SQL tables

- Loaded the backup to make sure it works

Preparing the Wiki

Step 1 - Turn Wiki to Read-Only

1a - Launch Terminal

1b - Enter: gksu nautilus

1c - Go to /var/www/LocalSettings.php

1d - Add the flag: $wgReadOnly = 'Site Maintenance';
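For those who prefer the terminal to nautilus, Step 1 could be sketched as follows. The helper names and the `backup-lock` marker comment are made up for this example, and it assumes LocalSettings.php does not end with a closing `?>` tag.

```shell
#!/bin/sh
# Sketch: toggle read-only mode from the shell instead of a file manager.
# Helper names and the "backup-lock" marker are hypothetical.
# Assumes LocalSettings.php has no closing "?>" tag at the end.

# set_read_only FILE [MESSAGE]: append a marked $wgReadOnly line.
set_read_only() {
    file=$1
    msg=${2:-Site Maintenance}
    cp "$file" "$file.bak"    # keep a copy of the untouched settings
    printf "\n\$wgReadOnly = '%s'; // backup-lock\n" "$msg" >> "$file"
}

# clear_read_only FILE: remove the line added by set_read_only.
clear_read_only() {
    file=$1
    sed '/backup-lock$/d' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

On the setup described here this would be run with sudo against /var/www/LocalSettings.php. Note that the mv in `clear_read_only` may change the file's owner, so check the permissions afterwards.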

Backup MySQL

Restore MySQL

Backup File System

Step 1 - Launch: gksu nautilus

Step 2 - Browse to /var/www/

Step 3 - Right click on folder mediawiki and select Compress

Step 4 - Save file - and the backup can now be downloaded from the web server's root to the backup machine by appending the filename to the URL

Example: Hostname = mediawiki

Download: http://mediawiki/mediawiki_backup_30-12-2010.tar.gz

Restore File System

Step 1 - Launch: gksu nautilus

Step 2 - Move the faulty mediawiki folder out of /var/www

Step 3 - Extract the backup to /var/www/mediawiki
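The same backup and restore steps can be done with tar from the terminal. A rough sketch, with made-up function names and placeholder paths:

```shell
#!/bin/sh
# Sketch: command-line version of the file system backup/restore steps.
# Function names are hypothetical.

# backup_tree PARENT NAME ARCHIVE
# Archive the folder PARENT/NAME into ARCHIVE, then list the archive
# back as a basic sanity check.
backup_tree() {
    tar -czf "$3" -C "$1" "$2" && tar -tzf "$3" > /dev/null
}

# restore_tree ARCHIVE PARENT NAME ASIDE
# Move a faulty PARENT/NAME into the ASIDE directory (instead of
# deleting it), then extract ARCHIVE into PARENT.
restore_tree() {
    if [ -e "$2/$3" ]; then
        mv "$2/$3" "$4/" || return 1
    fi
    tar -xzf "$1" -C "$2"
}
```

For example, `backup_tree /var/www mediawiki /root/mediawiki_backup.tar.gz`. Writing the archive outside /var/www is a deliberate choice in this sketch: an archive inside the web root can be downloaded by anyone who guesses its name, and it contains LocalSettings.php with the database password.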

Thanks to SpiralOfYarn for the handy links.

Cheers, KermEd 15:28, 30 December 2010 (UTC)

2012

Avoid phpmyadmin 'gzipped'

As I commented here, I found phpMyAdmin would fail silently half way through exporting, if I set compression to 'gzipped'. I notice the above advice mentions 'zipped' format, which may suffer from the same problem. The problem seemed to crop up during export of the binary BLOB fields of the text table. In my case this was several gigabytes of data, but seemed to fail fairly near the beginning of that table. Exporting without compression worked fine. -- Harry Wood (talk) 02:46, 8 October 2012 (UTC)
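A quick way to catch a truncated export is to test the archive and look at the last table it reached. This is just a sketch with made-up helper names; a silently aborted export might still be a valid gzip stream, which is why the second check is there.

```shell
#!/bin/sh
# Sketch: two cheap checks on a gzipped SQL export. Names are hypothetical.

# check_gzip FILE: succeed only if FILE is a complete, readable gzip stream.
check_gzip() {
    gzip -t "$1" 2>/dev/null
}

# last_table FILE: print the last CREATE TABLE line found in the export,
# to compare against the last table you expect in your wiki's schema.
last_table() {
    gzip -dc "$1" 2>/dev/null | grep '^CREATE TABLE' | tail -n 1
}
```

If `check_gzip` fails, or `last_table` stops at a table in the middle of the schema, the export was cut short.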

Clarification on using Charco needed

Step 2 of the conversion under Windows says to use Charco to convert the database. That's great, but Charco doesn't have a single 'latin1' option. It has 'DOSLatin1', 'ISOLatin1', and 'WindowsLatin1'. I suppose I could take a guess and go with 'WindowsLatin1' since the wiki was created on a Windows system, but I think I'd rather be safe than sorry and decided to ask first which of the 3 would be the one to use. --Korby (talk) 11:22, 17 October 2012 (UTC)

Tables

Is section Tables still relevant? Maybe it would be useful to have an explanation of which character sets to look out for, by MediaWiki version. When installing 1.19.2 -- 1.20.2 there is no option for latin1, it's utf8 or a default of binary. --Robkam (talk) 22:16, 22 December 2012 (UTC)

2013

Character sets

I've moved some overly detailed information here that used to be on the main page. Graham87 (talk) 14:05, 19 June 2013 (UTC)

Character set

Warning: In some common configurations of MySQL 4.1 and later, mysqldump can corrupt MediaWiki's stored text. If your database's character set is set to "latin1" rather than "UTF-8", mysqldump in 4.1+ will apply a character set conversion step which can corrupt text containing non-English characters as well as punctuation like "smart quotes" and long dashes used in English text.

You can see which character set your tables are using with a mysql statement like SHOW CREATE TABLE text; (including the semicolon). The last line will include a DEFAULT CHARSET clause.
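When scripting this check, the charset can be pulled out of the statement's output with sed. A small sketch; `table_charset` is a name made up here, and it simply reads whatever text it is given on standard input.

```shell
#!/bin/sh
# Sketch: extract the DEFAULT CHARSET value from SHOW CREATE TABLE output.
# table_charset is a hypothetical helper reading text on standard input.
# Possible use (not run here):
#   mysql -u root -p -e 'SHOW CREATE TABLE text\G' wikidb | table_charset
table_charset() {
    sed -n 's/.*DEFAULT CHARSET=\([A-Za-z0-9_]*\).*/\1/p'
}
```

It prints nothing when there is no DEFAULT CHARSET clause, which is the case handled in the next paragraph.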

If the last line does not include a DEFAULT CHARSET clause, there is another way, provided you know that nobody has changed the character set of the database server since it was installed and that the wiki's database was created using the server's default character set. The STATUS command displays the database server's default character set next to Server characterset:. Here is an example output:

mysql> status
- - - - - - - - -
mysql  Ver 12.22 Distrib 4.0.20a, for Win95/Win98 (i32)

Connection id:          13601
Current database:
Current user:           root@localhost
SSL:                    Not in use
Server version:         4.0.20a-nt
Protocol version:       10
Connection:             localhost via TCP/IP
Client characterset:    latin1
Server characterset:    latin1
TCP port:               3306
Uptime:                 27 days 4 hours 58 min 26 sec

Use the option --default-character-set=latin1 on the mysqldump command line to avoid the conversion if you find it set to "latin1".

Like this:

nice -n 19 mysqldump -u $USER -p$PASSWORD --default-character-set=$CHARSET $DATABASE -c | nice -n 19 gzip -9 > ~/backup/wiki-sql-$(date '+%a').sql.gz

Also one can try --default-character-set=binary . “Convert latin1 to UTF-8 in MySQL” on Gentoo Linux Wiki has more information.

Latin-1 to UTF-8 conversion

In the following I intentionally use different input and output file names for commands using sed because the -i (inplace) option of sed throws problems on very big dumps. The described procedure was used several times and works 100% reliably. The steps do not change your existing database. You can use the old wiki with the old database until your new wiki runs with the new database, the UTF-8 copy clone of the old one. This section is contributed and updated by --Wikinaut 10:31, 18 February 2010 (UTC). Feedback is welcome.

When you want to upgrade from a rather old Mediawiki installation with Latin-1 to UTF-8 which might be tricky depending on your operating system and MySQL settings - in my example from Mediawiki 1.5 (2004) to 1.15.1 (2009) - perform the following steps as found in the article Convert a MySQL DB from latin1 to UTF8 and further adapted to Mediawiki specialities (DBNAME is the name of your wiki database):

mysqldump -u root -p --opt --default-character-set=latin1 --skip-set-charset DBNAME > DBNAME.sql

Then use sed to change character settings latin1 to utf8:

 sed  -e 's/character set latin1 collate latin1_bin/character set utf8 collate utf8_bin/g' -e 's/CHARSET=latin1/CHARSET=utf8/g' DBNAME.sql > DBNAME2.sql

Every character in MySQL's utf8 character set needs up to 3 bytes.
Relevant sources:

MediaWiki maintenance/tables.sql
I found different MediaWiki versions already using different cl_sortkey length and took this into account below
"Specified key was too long; max key length is 1000 bytes": "Truncate so that the cl_sortkey key fits in 1000 bytes"

A further problem which prevents reimporting the database was the math table (ERROR line 389: Duplicate entry '' for key 1 when trying to import the mysqldump). I solved it by simply deleting the math table content, as this is only a cache and need not to be imported when upgrading.

There may also be issues with custom tables introduced by extensions which may prevent the sed command from changing the information as required for them too. As a result an error like e.g. ERROR 1253 (42000) at line 1198: COLLATION 'latin1_general_ci' is not valid for CHARACTER SET 'utf8' may occur on importing the database dump.

sed -e 's/`cl_sortkey` varchar(255)/`cl_sortkey` varchar(70)/gi' DBNAME2.sql > DBNAME21.sql
sed -e 's/`cl_sortkey` varchar(86)/`cl_sortkey` varchar(70)/gi' DBNAME21.sql > DBNAME22.sql
sed -e 's/`cl_sortkey`(128)/`cl_sortkey`(70)/gi' DBNAME22.sql > DBNAME23.sql
sed -e '/^INSERT INTO `math/d' DBNAME23.sql > DBNAME3.sql

From here I then created a new database DBNEW and then imported the dumpfile

mysql -u root -p -e "create database DBNEW"
mysql -u root -p --default-character-set=utf8 DBNEW < DBNAME3.sql

Now start a fresh MediaWiki installation and use your new wiki database name DBNEW - actually the UTF-8 converted copy of your untouched old DBNAME wiki - and the database copy will be automatically upgraded to the recent MediaWiki database scheme. Several successful conversions from MediaWiki 1.5 to MediaWiki 1.15.1 under PHP 5.2.12 (apache2handler) and MySQL 4.1.13 have been made.

Latin-1 to UTF-8 conversion under Windows

1. Dump your Database as usual.
2. Convert your Database using the character set conversion utility Charco.
3. Replace all occurrences of latin1 with utf8 inside the dump.
4. Import the dump into a new DB or overwrite the old.
5. Ready.

Tested under WindowsXP. Mediawiki 1.13.2 dumped under EasyPHP 1.8.0.1. Converted with Charco 0.8.1. Imported to XAMPP 1.7.3. Updated to Mediawiki 1.15.1.

Latin-1 to UTF-8 conversion under Mac

First export your Database as usual, separated into schema and data. You can use the terminal command mysqldump, which the official installer places in /usr/local/mysql/. Note that in the following lines, the lack of spaces between -u and username and -p and password is deliberate:
- ./mysqldump --default-character-set=latin1 --skip-set-charset -d -uuser -ppassword DBNAME > ~/db_schema.sql
- ./mysqldump --default-character-set=latin1 --skip-set-charset -t -uuser -ppassword DBNAME > ~/db_data.sql
The database exports are now in your personal folder. Convert both exports with Charco from ISOlatin1 to UTF-8. Append "_utf8" to the output file names and fix the .txt extension that Charco enforces back to .sql.
Open the file ~/db_schema_utf8.sql with Text Editor and replace each "DEFAULT CHARSET=latin1" phrase with "DEFAULT CHARSET=utf8"
Make a new database using Sequel Pro, with encoding "UTF-8 Unicode (utf8)".
- Import the ~/db_schema_utf8.sql file into your new database
- Import the ~/db_data_utf8.sql file into your new database
- ensure that your wiki user has access to the new database by adding a relevant row to the db table of the mysql system database
Change the variable $wgDBname in your LocalSettings.php to reflect the name of the new database. Then test if everything works. If not, flip back to the old database and try a different method.
(Optional) Delete the old database, and (also optional) rename the new database to the old database and revert the change in $wgDBname.

This sequence was adapted from Khelll's Blog, and used for Mediawiki 1.19.2 and MySQL 5.1.57. It will fix encodings that already show up as garbled under an updated wiki installation as well.
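Where Charco is not at hand, the conversion and the schema edit can probably be done with iconv and sed instead. This is a sketch of that substitution, not the tested procedure above; the function names are made up, and ISO-8859-1 is assumed to correspond to Charco's "ISOlatin1".

```shell
#!/bin/sh
# Sketch: iconv/sed stand-ins for the Charco and Text Editor steps.
# Not the tested procedure; function names are hypothetical.

# to_utf8 IN OUT: re-encode a dump file from ISO-8859-1 (latin1) to UTF-8.
to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"
}

# fix_schema_charset IN OUT: switch the table defaults in the schema dump.
fix_schema_charset() {
    sed 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/g' "$1" > "$2"
}
```

For instance, `to_utf8 ~/db_data.sql ~/db_data_utf8.sql`, and for the schema `to_utf8` followed by `fix_schema_charset` on its output. Keep the original exports until the new database has been checked.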

Repairing corrupted character sets

In case your database's character set got corrupted (see warning above), an easy way to fix the corrupted characters and remedy the situation for future backups has been posted in this source

Directly changing all latin1-encoded columns to UTF-8 won't help, as MySQL will just transform the erroneous characters directly. The remedy is to change the wrongly encoded latin1 string type (char/varchar/TEXT) into a binary type (binary/varbinary/BLOB). A conversion into a UTF8-encoded string type (char/varchar/TEXT) will then fix all your previously erroneous characters to their proper representation.

In short: latin1 char/varchar/TEXT -> binary/varbinary/BLOB -> UTF8 char/varchar/TEXT

Also don't forget to change the default charset for your database and the single tables to UTF-8, so your character sets won't get corrupted again.
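For a single TEXT column, the two-step conversion could be generated like this. It is a sketch only: `repair_text_column` is a made-up helper that prints SQL rather than running it, and it covers TEXT columns only (char/varchar columns need their lengths, and any column needs its NOT NULL or DEFAULT attributes, restated by hand).

```shell
#!/bin/sh
# Sketch: print the latin1 TEXT -> BLOB -> utf8 TEXT statements for one column.
# repair_text_column is hypothetical; it only prints SQL for review.
# MODIFY replaces the whole column definition, so attributes such as
# NOT NULL must be added to the printed statements where needed.
repair_text_column() {
    printf 'ALTER TABLE `%s` MODIFY `%s` BLOB;\n' "$1" "$2"
    printf 'ALTER TABLE `%s` MODIFY `%s` TEXT CHARACTER SET utf8;\n' "$1" "$2"
}
```

The output can be reviewed and then piped into the mysql client, after taking a backup of the database first. Table and column names passed to the function are whatever your own schema uses.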

2014

Unclear instructions

"Other parameters might be useful such as ..." It would be far better if someone would explain exactly why these parameters would be useful.

2017

How do I download a wiki from Special:Export?

I do a lot of work on a wiki. The guy who owns it hasn't been around for the past few years, and the website expires this November. How can I use Special:Export in October to download the entire wiki, so that I can recreate it if he turns out to have died or something? Edit: Or more regular backups -- I don't know when the hosting will run out. Banaticus (talk) 00:42, 5 August 2017 (UTC)

Whoops... well, using Special:Export you can export pages given a list of pages or categories. Get a list of pages first (looking at Special:AllPages) and paste them in the input textarea. Note, however, that it won't export images (only the file description pages, but not the files themselves), nor logs. There's a different approach, which will recreate all the possible details of the wiki on a new host, including logs, files and even deleted pages if you have a user with the required permissions: This is Grabbers. I'm now involved in moving an entire wiki from wikia outside of it, and I'm fixing some of those scripts. I'll update the repository when I finish (probably before the end of this month). Feel free to ping me if you need assistance with this. --Ciencia Al Poder (talk) 09:14, 7 August 2017 (UTC)

2021

There was an attempt to archive this talk page

I tried to archive older discussions from years ago to a subpage, but I couldn't because Structured Discussions is in place: an attempt to create Manual talk:Backing up a wiki/Archive 1 takes me to an SD board. So instead I followed what an earlier user did and just created level-1 headings for years. 朝彦 (Asahiko) (talk) 02:56, 20 January 2021 (UTC)

mysqldump may require additional options to suppress errors

While following the instructions on this page for my local MediaWiki instance, mysqldump failed with the following error:

mysqldump: Error: 'Access denied; you need (at least one of) the PROCESS privilege(s) for this operation' when trying to dump tablespaces

This is likely due to a change in MySQL:

Incompatible Change: Access to the INFORMATION_SCHEMA.FILES table now requires the PROCESS privilege.
This change affects users of the mysqldump command, which accesses tablespace information in the FILES table, and thus now requires the PROCESS privilege as well. Users who do not need to dump tablespace information can work around this requirement by invoking mysqldump with the --no-tablespaces option. (Bug #30350829)

I also got this error:

mysqldump: Couldn't execute 'SELECT COLUMN_NAME,                       JSON_EXTRACT(HISTOGRAM, '$."number-of-buckets-specified"')                FROM information_schema.COLUMN_STATISTICS                WHERE SCHEMA_NAME = 'wikidb' AND TABLE_NAME = 'actor';': Unknown table 'column_statistics' in information_schema (1109)

which can be worked around by setting --column-statistics=0. I have not understood the implications and side-effects of this workaround yet, but am writing this down for reference. 朝彦 (Asahiko) (talk) 02:56, 20 January 2021 (UTC)

Same here, I was wondering if --no-tablespaces will have an effect on being able to restore from backup. However, checking the original schema, it doesn't seem to use the CREATE TABLESPACE command, so I hope it will be ok. Adam Millerchip (talk) 11:56, 7 February 2021 (UTC)

I know this is an old issue, but I stumbled across it today. Has anyone experience with backups that use the --no-tablespaces option? Are they fully functional or are we going to have a bad time? --87.157.189.207 22:30, 27 December 2022 (UTC)

Yes, it's safe to add the --no-tablespaces option, unless you modify $wgDBTableOptions to add tablespace information. In the worst case scenario, tables would be restored on the default tablespace. --Ciencia Al Poder (talk) 11:42, 29 December 2022 (UTC)
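Since not every mysqldump build knows both options, a backup script could add them only when they are supported. A sketch, with made-up helper names, that assumes `mysqldump --help` lists the long option names:

```shell
#!/bin/sh
# Sketch: add the workaround options only if this mysqldump knows them.
# Helper names are hypothetical; assumes "mysqldump --help" lists the
# long option names.

# dump_has_option NAME: true if --NAME appears in mysqldump's help text.
dump_has_option() {
    mysqldump --help 2>/dev/null | grep -q -e "--$1"
}

# dump_options: print the workaround options supported by this mysqldump.
dump_options() {
    opts=""
    for o in no-tablespaces column-statistics; do
        if dump_has_option "$o"; then
            case $o in
                column-statistics) opts="$opts --column-statistics=0" ;;
                *)                 opts="$opts --$o" ;;
            esac
        fi
    done
    echo "$opts"
}
```

A dump command might then look like `mysqldump $(dump_options) -u wikiuser -p wikidb > wikidb.sql`, where the user and database names are placeholders.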
