User:DanielRenfro/MySQL

This is a page about mysql for the Hu laboratory. This page used to be specifically about replication, but replication has been moved into a subsection and other sections added.

=Usage=

Starting and Stopping
I think there are about 1.2e5 different ways to start and stop mysql, some of which are:
 * 1) using mysqld_safe script
 * 2) * Can be run to start a mysqld daemon. mysqld_safe adds some safety features such as restarting the server when an error occurs and logging runtime information to an error log file.
 * 3) using mysqldadmin
 * 4) * not used to start mysqld daemons, but can be used to do other things like flush-tables, start/stop replication, shutdown. always good in a pinch.
 * 5) using launchctl (for macs)
 * using the launchctl wrapper: sudo /opt/local/etc/LaunchDaemons/org.macports.mysql5/mysql5.wrapper [start|stop]
 * --or--
 * using launchctl directly: sudo launchctl load -w /opt/local/etc/LaunchDaemons/org.macports.mysql5/org.macports.mysql5.plist
 * 1) * this method is typically used by the system and not used by the system-admin
 * 2) using mysql.server script (for linux)
 * 3) * similar to the aforementioned launchctl method, but for systems that use V-style run directories
 * 4) using mysqld_multi[5]
 * 5) * for starting one or more daemons

=General Documentation= These are things that are good to know, but didn't fit in other sections.

mysqld_multi

 * This program is a perl script for starting/stopping multiple daemons at the same time.
 * It logs to /opt/local/share/mysql5/mysqld_multi.log (for macports)
 * mysqld_multi at dev.mysql.com

mysql.server & mysqld_multi.server

 * Scripts that are typically put into /etc/init.d (like on Linux systems) so that the MySQL server(s) are started automatically upon system start.
 * For Macintosh computers, I don't think these are relevant, as Macintosh has migrated to the launchd system of starting/stopping processes when the system boots.
 * See /Library/LaunchDaemons/org.macports.mysql5.plist, and/or /opt/local/etc/LaunchDaemons/org.macports.mysql5/org.macports.mysql5.plist</tt> (the first might be a soft-link to the latter)

MacPorts
These might be helpful one day:
 * https://trac.macports.org/wiki/howto/MAMP#mysql
 * https://trac.macports.org/wiki/howto/MySQLWorkbench
 * Snow Leopard: http://stackoverflow.com/questions/4798464/installing-mysql-on-snow-leopard-using-macports
 * http://hennessynet.com/2008/01/15/installing-macports-and-mysql-on-leopard.html

Paths
Macports installs MySQL in the following places: /opt/local/etc/my.cnf /opt/local/bin/mysql*5 /opt/local/var/db/mysql5/ /opt/local/lib/mysql5/ ... probably lots of other places

The difference between % and localhost
When setting up users, you can choose between three options for the host:</tt>
 * This means any user connecting over TCP/IP.
 * 1) localhost
 * This means any user connecing via the socket - which means they must be on this computer.
 * This is totally different than connecting via 127.0.0.1 (which would use the permissions set for user@%)
 * 1) string - hostname</tt>
 * A specific host, over TCP/IP
 * A specific host, over TCP/IP

When does it read my.cnf</tt>
It seems like sometimes the mysql binaries read the configuration directives that are stored in [/opt/local]/etc/my.cnf</tt> and sometimes they don't. I haven't figured out why this is; maybe just some programs/scripts know about my.cnf.

types

 * shared locks / read locks
 * many clients can read a shared-locked resource


 * exclusive locks / write locks
 * only one client can access/change this

Storage Engines
=Replication= It is important to understand how MySQL replication works. There is a lot of documentation on the web, but here are a few notes I have made through my process of learning:

This process is as simple as:
 * 1) configure the master
 * 2) configure the slave
 * 3) start both servers
 * 4) connect the slave to the master

Russell Dyer, author of MySQL in a Nutshell explains on this website the rational for having a simple replication setup. One of the difficulties with a large and active MySQL database is making clean backups without having to bring the server down. Otherwise, a backup may slow down the system and there may be inconsistency with data, since related tables may be changed while another is being backed up. Taking the server down will ensure consistency of data, but it means interruption of service to users. Sometimes this is necessary and unavoidable, but daily server outages for backing up data may be unacceptable. A simple alternative method to ensure reliable backups without having to shut down the server daily is to set up replication for MySQL.

For the Hu-laboratory we're going to set up a single master and slave pair on the same server. This means we're dealing with two issues:
 * 1) replication between master &amp; slave
 * 2) running concurrent mysqld</tt> processes at the same time

Chapter 1 - Why Replication?
Why would you want to set up replication?
 * High availability
 * In case of catastrophe or system-failure, you can (automagically) fall back to a slave.


 * Load Balancing
 * For large systems you can spread the load out over several machines/daemons, thereby increasing efficiency and speed.


 * Off-site processing
 * Don't disturb the master server when doing database-intensive processing.

For us it is a matter of availability. We want to have one database (the master) that is always available, instead of having a long period of latency for backups. With this system we can backup the slave at any time without affecting what goes on with the production sites.

Chapter 2 - Replication Basics
In this tutorial we will be using a single master server and a single slave server. The master is the control-point of replication and all changes should happen on the master and get "pushed" to the slave. (In actuality, the slave will ask for changes and execute them.) This is not a synchronous mechanism - meaning the changes to the slave might not happen in the same order that they took place on the master. In theory this sounds bad, but in practice it turns out to be alright. Synchronous replication is slower and has higher fidelity, but is harder to setup. For our purposes using the more simple method is sufficient.

Essentially the replication process consists of these steps:
 * 1) an SQL query is sent to the MASTER. MySQL records this query in the bin.log file along with the log position number
 * 2) the SLAVE database checks (through an IO process) the bin.log file for any changes
 * 3) when a change is encountered the SLAVE copies the SQL statement to it's relay.log file, along with the position number and records this number in the master.info file
 * 4) the SLAVE executes the SQL andqueries the MASTER
 * 5) the SLAVE checks the results of both queries for consistency and issues any errors if necessary

Chapter 3 - The bin-log</tt> file(s)
The binary log is a set of files that keep track of what changes have been made on the master server so that the same changes can be made on the slave. Here's a brief section from MySQL High Availability (available through the library as an e-book): The previous chapter included a very brief introduction to the binary log. In this chapter, we will fill in more details and give a more thorough description of the binary log structure, the replication event format, and how to use the mysqlbinlog tool to investigate and work with the contents of binary logs.

The binary log records changes made to the database so that the same changes can be made on any of the slaves as well. Since the binary log normally keeps a record of all changes, you can also use it for auditing purposes to see what happened in the database, and for PITR by playing back the binary log to a server, repeating changes that were recorded in the binary log.

The binary log contains only statements that could change the database. Note that statements that do not change the database but that could potentially change the database are logged. The most notable statements are those that optionally make a change, such as DROP TABLE IF EXISTS or CREATE TABLE IF NOT EXISTS, along with statements such as DELETE and UPDATE that have WHERE conditions that don’t happen to match any rows.

SELECT statements are not normally logged, since they do not make any changes to any database. There are, however, exceptions.

Transactions on a server are not normally executed in sequence, one after the other, but are rather interleaved and executed in parallel. To ensure that two transactions do not conflict and generate an inconsistent result, the server ensures the transaction execution is serializable, meaning the transactions are executed in such a way that the execution yields the same result as if they were executed in a serial order—that is, in a fixed order, one transaction after another.

The binary log records each transaction in the order that the commit took place on the master. Although transactions may be interleaved on the master, each appears as an uninterrupted sequence in the binary log, the order determined by the time of the commit.

So, we'll need to configure the master/slave pair to use the bin-log.

Replication Settings
On the master we will need a few options configured in the my.cnf</tt> file...usually found at /etc/my.cnf</tt>. The necessary options needed are: log-bin server-id and optionally: log-bin-index binlog-do-db binlog-ignore-db

The explanation for each parameter is as follows:
 * log-bin
 * the base name for all the files created by the binary log


 * log-bin-index
 * gives the name of the binary log index file, which keeps a list of all bin-log files
 * defaults to whatever the log-bin basename is


 * server-id
 * each server must have a unique id for identification purposes


 * binlog-do-db, binlog-ignore-db
 * a list of databases to replicate or skip, respectively.

On the slave all we really need is: server_id and optionally: replicate-do-db replicate-ignore-db replicate-do-table replicate-ignore-table read-only log-slave-updates skip-slave-start ... I think these options are pretty self-explanatory. If you want to read more, see the Replication and Binary Logging Options and Variables documentation section on the MySQL website.

concurrent mysqld</tt> processes
At this point let's take a moment to focus on setting up the two mysqld</tt> servers without worrying about replication. Running Multiple MySQL Instances on One Machine has some good information. At this point we have a few options:
 * 1) Compiling different binaries for each daemon we want to run with different settings.  (seems like overkill, and I don't want to have to keep up with many different binaries.)
 * 2) Instantiating the same binary many times with different settings:
 * 3) * set on the command line
 * 4) ** we would need a shell script to instantiate both daemons with different options because doing it manually is too much work and I forget things too easily
 * 5) ** we can't set all the options we want on the commandline
 * 6) * set in option files
 * 7) ** we would need one option file for each daemon running
 * 8) * or by setting environment variables (not permanent enough for me)

Each daemon will need unique settings for the following things:
 * --datadir
 * This is the most important! This path is where each daemon will keep its data. We can put each directory on a different physical disk to spread out the diskIO, or not.
 * --port / --bind-address
 * The port this daemon is bound to.
 * If we're going move the slave to another machine we need to substitute --bind-address for --port.
 * --socket
 * Path to the local unix socket.
 * --shared-memory-base-name
 * Used only on Windows
 * --pid-file
 * Path to the file which the daemon records it's process id.

In order to increase performance, it is suggested to spread the disk IO load over separate disks with this option: See http://dev.mysql.com/doc/refman/5.1/en/multiple-servers.html for more information.
 * --tmpdir

Chapter 6 - A note about managing multiple servers
There are some scripts packaged with mysql that can help us have concurrent processes: The mysqld_multi</tt> script is (as far as I can tell) a very basic wrapper around mysqld_safe</tt>. It loops and starts each server you tell it to.
 * mysqld-multi</tt>
 * mysqlmanager</tt>
 * Also you can manipulate/start/stop/etc. each server with mysqladmin</tt> -- assuming you give the script the correct socket so it knows which server to connect to.

Starting
After fiddling with this for a while, I have found the <tt>mysqld_multi</tt> script adequate for starting servers. The command goes something like this: sudo mysqld_multi start This will start all servers in the <tt>my.cnf</tt> file (which means both master and slave in our case.) Optionally you can put a "group number" as the last parameter. See the docs for an explanation.

Reporting what's going on
Either try: sudo mysqld_multi report or, (which I do a lot): ps aux | grep mysqld and sort through the processes.

Stopping
The <tt>mysqld_multi</tt> binary should stop the servers just like starting them, but I've found that sometimes it doesn't (for unknown reasons.) If this doesn't work: sudo mysqld_multi stop then try this (you'll have to stop each server independently): sudo mysqladmin --socket=/var/run/mysqld/mysql.sock       shutdown sudo mysqladmin --socket=/var/run/mysqld/mysql-slave.sock shutdown

If nothing works, you can <tt>kill</tt> the processes, but I'm afraid that might interrupt a transaction in the database.

Connecting
If you want to connect to the mysql server, you have two options:
 * 1) using the <tt>mysql</tt> command-line interface
 * 2) using phpMyAdmin through the web-browser

command line
If you want to use the command line you can connect to the master like this: mysql --user=$USERNAME --password=XXXXXX or if you want to be prompted for a password mysql --user=$USERNAME -p You can simplify some of this by creating a <tt>~/.my.cnf</tt> file that holds your settings. You can read more at MySQL's option files page. Here is a short sample:
 * 1) personal settings for MySQL programs

[client] user           = daniel password       = XXXXXXXXXXXXX

[mysql] auto-rehash line-numbers pager          = more

Connecting to the slave requires the <tt>--socket</tt> flag: mysql --user=$USERNAME -p --socket=/var/run/mysq/mysql-slave.sock In practice normal users should never need to log onto the SLAVE database, the backup-bot should be the only person to ever log-on. If you don't have a good reason for connecting to the slave, don't.

phpMyAdmin
I think most of the servers have phpMyAdmin installed. Have a look at the Servers section on the Hardware page and then on the server's page -- hopefully there is a URL to phpMyAdmin. Also ask.

Chapter 7 - Getting things running
So far we have covered the background of replication, learned about options, and written a <tt>my.cnf</tt> configuration file but have not actually set anything up. Now comes the fun part - implementation.

The process can be broken down into sizable pieces:
 * 1) set up the directories &amp; permissions
 * 2) configure the servers (if not already done)
 * 3) start the servers
 * 4) add a dedicated replication user to the master (with correct permissions)
 * 5) lock the master
 * 6) dump the master to an SQL file
 * 7) load the slave from SQL file
 * 8) connect the slave to the master
 * 9) start slave
 * 10) unlock master
 * 11) debug/test

=Backups=

Version 1

 * the first script written only dumped two databases. not sufficient for reliable backups.

Version 2

 * this version checks (all) the databases and repairs them if necessary. run-time is about 6 hours, which is a little long.

=Slow Queries= The slow query log usually is defined in <tt>/etc/my.cnf</tt>, and typically lives in <tt>/var/log/mysql/</tt>. It can be analyzed using the mk-query-digest from the [maatkit software suite. mk-query-digest /var/log/mysql/mysql-slow.log > mk-query-digest.slow.2011-07-12.out

=Notes=

Sandbox
The MySQL Sandbox stuff looks interesting. I installed this via cpan.

You can make a single sandbox: cd ~daniel/working/mysql/sandbox/ make_sandbox ./mysql-5.1.51-osx10.5-x86.tar.gz Or a replication sandbox cd ~daniel/working/mysql/sandbox/ make_replication_sandbox ./mysql-5.1.51-osx10.5-x86.tar.gz

Links
Helpful links:
 * http://dev.mysql.com/doc/refman/5.0/en/replication-howto-existingdata.html