(mysql.info) replication-faq

 
 6.10 Replication FAQ
 ====================
 
 *Q*: How do I configure a slave if the master is running and I do not
 want to stop it?
 
 *A*: There are several possibilities. If you have taken a snapshot
 backup of the master at some point and recorded the binary log filename
 and offset (from the output of `SHOW MASTER STATUS') corresponding to
 the snapshot, use the following procedure:
 
   1. Make sure that the slave is assigned a unique server ID.
 
   2. Execute the following statement on the slave, filling in
      appropriate values for each option:
 
           mysql> CHANGE MASTER TO
               ->     MASTER_HOST='MASTER_HOST_NAME',
               ->     MASTER_USER='MASTER_USER_NAME',
               ->     MASTER_PASSWORD='MASTER_PASS',
               ->     MASTER_LOG_FILE='RECORDED_LOG_FILE_NAME',
               ->     MASTER_LOG_POS=RECORDED_LOG_POSITION;
 
   3. Execute `START SLAVE' on the slave.
 
 If you do not have a backup of the master server, here is a quick
 procedure for creating one. All steps should be performed on the master
 host.
 
   1. Issue this statement to acquire a global read lock:
 
           mysql> FLUSH TABLES WITH READ LOCK;
 
   2. With the lock still in place, execute this command (or a variation
      of it):
 
           shell> tar zcf /tmp/backup.tar.gz /var/lib/mysql
 
   3. Issue this statement and record the output, which you will need
      later:
 
           mysql> SHOW MASTER STATUS;
 
   4. Release the lock:
 
           mysql> UNLOCK TABLES;
 
 An alternative to using the preceding procedure to make a binary copy
 is to make an SQL dump of the master. To do this, you can use
 `mysqldump --master-data' on your master and later load the SQL dump
 into your slave. However, this is slower than making a binary copy.
 
 Regardless of which of the two methods you use, afterward follow the
 instructions for the case when you have a snapshot and have recorded
 the log filename and offset. You can use the same snapshot to set up
 several slaves. Once you have the snapshot of the master, you can wait
 to set up a slave as long as the binary logs of the master are left
 intact. The two practical limitations on the length of time you can
 wait are the amount of disk space available to retain binary logs on
 the master and the length of time it takes the slave to catch up.
 
 You can also use `LOAD DATA FROM MASTER'. This is a convenient
 statement that transfers a snapshot to the slave and adjusts the log
 filename and offset all at once. Be warned, however, that it works only
 for `MyISAM' tables and it may hold a read lock for a long time. It is
 not yet implemented as efficiently as we would like. If you have large
 tables, the preferred method is still to make a binary snapshot on the
 master server after executing `FLUSH TABLES WITH READ LOCK'.
 
 *Q*: Does the slave need to be connected to the master all the time?
 
 *A*: No, it does not. The slave can go down or stay disconnected for
 hours or even days, and then reconnect and catch up on updates. For
 example, you can set up a master/slave relationship over a dial-up link
 where the link is up only sporadically and for short periods of time.
 The implication of this is that, at any given time, the slave is not
 guaranteed to be in synchrony with the master unless you take some
 special measures.
 
 *Q*: How do I know how late a slave is compared to the master? In other
 words, how do I know the date of the last statement replicated by the
 slave?
 
 *A*: You can read the `Seconds_Behind_Master' column in `SHOW SLAVE
 STATUS'. See  replication-implementation-details.
 
 When the slave SQL thread executes an event read from the master, it
 modifies its own time to the event timestamp. (This is why `TIMESTAMP'
 is well replicated.) In the `Time' column in the output of `SHOW
 PROCESSLIST', the number of seconds displayed for the slave SQL thread
 is the number of seconds between the timestamp of the last replicated
 event and the real time of the slave machine.  You can use this to
 determine the date of the last replicated event. Note that if your
 slave has been disconnected from the master for one hour, and then
 reconnects, you may immediately see `Time' values like 3600 for the
 slave SQL thread in `SHOW PROCESSLIST'. This is because the slave is
 executing statements that are one hour old.
 
 *Q*: How do I force the master to block updates until the slave catches
 up?
 
 *A*: Use the following procedure:
 
   1. On the master, execute these statements:
 
           mysql> FLUSH TABLES WITH READ LOCK;
           mysql> SHOW MASTER STATUS;
 
      Record the replication cooredinates (the log filename and offset)
      from the output of the `SHOW' statement.
 
   2. On the slave, issue the following statement, where the arguments
      to the `MASTER_POS_WAIT()' function are the replication coordinate
      values obtained in the previous step:
 
           mysql> SELECT MASTER_POS_WAIT('LOG_NAME', LOG_OFFSET);
 
      The `SELECT' statement blocks until the slave reaches the
      specified log file and offset. At that point, the slave is in
      synchrony with the master and the statement returns.
 
   3. On the master, issue the following statement to allow the master
      to begin processing updates again:
 
           mysql> UNLOCK TABLES;
 
 *Q*: What issues should I be aware of when setting up two-way
 replication?
 
 *A*: MySQL replication currently does not support any locking protocol
 between master and slave to guarantee the atomicity of a distributed
 (cross-server) update. In other words, it is possible for client A to
 make an update to co-master 1, and in the meantime, before it
 propagates to co-master 2, client B could make an update to co-master 2
 that makes the update of client A work differently than it did on
 co-master 1. Thus, when the update of client A makes it to co-master 2,
 it produces tables that are different from what you have on co-master
 1, even after all the updates from co-master 2 have also propagated.
 This means that you should not chain two servers together in a two-way
 replication relationship unless you are sure that your updates can
 safely happen in any order, or unless you take care of mis-ordered
 updates somehow in the client code.
 
 You should also realize that two-way replication actually does not
 improve performance very much (if at all) as far as updates are
 concerned. Each server must do the same number of updates, just as you
 would have a single server do. The only difference is that there is a
 little less lock contention, because the updates originating on another
 server are serialized in one slave thread.  Even this benefit might be
 offset by network delays.
 
 *Q*: How can I use replication to improve performance of my system?
 
 *A*: You should set up one server as the master and direct all writes
 to it. Then configure as many slaves as you have the budget and
 rackspace for, and distribute the reads among the master and the
 slaves. You can also start the slaves with the -skip-innodb, -skip-bdb,
 -low-priority-updates, and -delay-key-write=ALL options to get speed
 improvements on the slave end. In this case, the slave uses
 non-transactional `MyISAM' tables instead of `InnoDB' and `BDB' tables
 to get more speed by eliminating transactional overhead.
 
 *Q*: What should I do to prepare client code in my own applications to
 use performance-enhancing replication?
 
 *A*: If the part of your code that is responsible for database access
 has been properly abstracted/modularized, converting it to run with a
 replicated setup should be very smooth and easy. Change the
 implementation of your database access to send all writes to the
 master, and to send reads to either the master or a slave. If your code
 does not have this level of abstraction, setting up a replicated system
 gives you the opportunity and motivation to it clean up. Start by
 creating a wrapper library or module that implements the following
 functions:
 
    * `safe_writer_connect()'
 
    * `safe_reader_connect()'
 
    * `safe_reader_statement()'
 
    * `safe_writer_statement()'
 
 `safe_' in each function name means that the function takes care of
 handling all error conditions. You can use different names for the
 functions. The important thing is to have a unified interface for
 connecting for reads, connecting for writes, doing a read, and doing a
 write.
 
 Then convert your client code to use the wrapper library. This may be a
 painful and scary process at first, but it pays off in the long run.
 All applications that use the approach just described are able to take
 advantage of a master/slave configuration, even one involving multiple
 slaves. The code is much easier to maintain, and adding troubleshooting
 options is trivial. You need modify only one or two functions; for
 example, to log how long each statement took, or which statement among
 those issued gave you an error.
 
 If you have written a lot of code, you may want to automate the
 conversion task by using the `replace' utility that comes with standard
 MySQL distributions, or write your own conversion script. Ideally, your
 code uses consistent programming style conventions. If not, then you
 are probably better off rewriting it anyway, or at least going through
 and manually regularizing it to use a consistent style.
 
 *Q*: When and how much can MySQL replication improve the performance of
 my system?
 
 *A*: MySQL replication is most beneficial for a system that processes
 frequent reads and infrequent writes. In theory, by using a
 single-master/multiple-slave setup, you can scale the system by adding
 more slaves until you either run out of network bandwidth, or your
 update load grows to the point that the master cannot handle it.
 
 To determine how many slaves you can use before the added benefits
 begin to level out, and how much you can improve performance of your
 site, you need to know your query patterns, and to determine
 empirically by benchmarking the relationship between the throughput for
 reads (reads per second, or `reads') and for writes (`writes') on a
 typical master and a typical slave. The example here shows a rather
 simplified calculation of what you can get with replication for a
 hypothetical system.
 
 Let's say that system load consists of 10% writes and 90% reads, and we
 have determined by benchmarking that `reads' is 1200 - 2 × `writes'.
 In other words, the system can do 1,200 reads per second with no
 writes, the average write is twice as slow as the average read, and the
 relationship is linear. Let us suppose that the master and each slave
 have the same capacity, and that we have one master and N slaves.  Then
 we have for each server (master or slave):
 
 `reads = 1200 - 2 × writes'
 
 `reads = 9 × writes / (N + 1)' (reads are split, but writes go to all
 servers)
 
 `9 × writes / (N + 1) + 2 × writes = 1200'
 
 `writes = 1200 / (2 + 9/(N+1))'
 
 The last equation indicates the maximum number of writes for N slaves,
 given a maximum possible read rate of 1,200 per minute and a ratio of
 nine reads per write.
 
 This analysis yields the following conclusions:
 
    * If N = 0 (which means we have no replication), our system can
      handle about 1200/11 = 109 writes per second.
 
    * If N = 1, we get up to 184 writes per second.
 
    * If N = 8, we get up to 400 writes per second.
 
    * If N = 17, we get up to 480 writes per second.
 
    * Eventually, as N approaches infinity (and our budget negative
      infinity), we can get very close to 600 writes per second,
      increasing system throughput about 5.5 times. However, with only
      eight servers, we increase it nearly four times.
 
 Note that these computations assume infinite network bandwidth and
 neglect several other factors that could be significant on your system.
 In many cases, you may not be able to perform a computation similar to
 the one just shown that accurately predicts what will happen on your
 system if you add N replication slaves. However, answering the
 following questions should help you decide whether and by how much
 replication will improve the performance of your system:
 
    * What is the read/write ratio on your system?
 
    * How much more write load can one server handle if you reduce the
      reads?
 
    * For how many slaves do you have bandwidth available on your
      network?
 
 *Q*: How can I use replication to provide redundancy or high
 availability?
 
 *A*: With the currently available features, you would have to set up a
 master and a slave (or several slaves), and to write a script that
 monitors the master to check whether it is up. Then instruct your
 applications and the slaves to change master in case of failure. Some
 suggestions:
 
    * To tell a slave to change its master, use the `CHANGE MASTER TO'
      statement.
 
    * A good way to keep your applications informed as to the location
      of the master is by having a dynamic DNS entry for the master.
      With `bind' you can use `nsupdate' to dynamically update your DNS.
 
    * Run your slaves with the -log-bin option and without
      -log-slave-updates. In this way, the slave is ready to become a
      master as soon as you issue `STOP SLAVE'; `RESET MASTER', and
      `CHANGE MASTER TO' statement on the other slaves. For example,
      assume that you have the following setup:
 
                  WC
                   \
                    v
            WC----> M
                  / | \
                 /  |  \
                v   v   v
               S1   S2  S3
 
      In this diagram, `M' means the master, `S' the slaves, `WC' the
      clients issuing database writes and reads; clients that issue only
      database reads are not represented, because they need not switch.
      `S1', `S2', and `S3' are slaves running with -log-bin and without
      -log-slave-updates. Because updates received by a slave from the
      master are not logged in the binary log unless -log-slave-updates
      is specified, the binary log on each slave is empty initially. If
      for some reason `M' becomes unavailable, you can pick one of the
      slaves to become the new master. For example, if you pick `S1',
      all `WC' should be redirected to `S1', which will log updates to
      its binary log. `S2' and `S3' should then replicate from `S1'.
 
      The reason for running the slave without -log-slave-updates is to
      prevent slaves from receiving updates twice in case you cause one
      of the slaves to become the new master. Suppose that `S1' has
      -log-slave-updates enabled. Then it will write updates that it
      receives from `M' to its own binary log. When `S2' changes from
      `M' to `S1' as its master, it may receive updates from `S1' that
      it has already received from `M'
 
      Make sure that all slaves have processed any statements in their
      relay log. On each slave, issue `STOP SLAVE IO_THREAD', then check
      the output of `SHOW PROCESSLIST' until you see `Has read all relay
      log'. When this is true for all slaves, they can be reconfigured
      to the new setup. On the slave `S1' being promoted to become the
      master, issue `STOP SLAVE' and `RESET MASTER'.
 
      On the other slaves `S2' and `S3', use `STOP SLAVE' and `CHANGE
      MASTER TO MASTER_HOST='S1'' (where `'S1'' represents the real
      hostname of `S1'). To `CHANGE MASTER', add all information about
      how to connect to `S1' from `S2' or `S3' (USER, PASSWORD, PORT).
      In `CHANGE MASTER', there is no need to specify the name of `S1''s
      binary log or binary log position to read from: We know it is the
      first binary log and position 4, which are the defaults for
      `CHANGE MASTER'.  Finally, use `START SLAVE' on `S2' and `S3'.
 
      Then instruct all `WC' to direct their statements to `S1'. From
      that point on, all updates statements sent by `WC' to `S1' are
      written to the binary log of `S1', which then contains every update
      statement sent to `S1' since `M' died.
 
      The result is this configuration:
 
                  WC
                 /
                 |
            WC   |  M(unavailable)
             \   |
              \  |
               v v
                S1<--S2  S3
                 ^       |
                 +-------+
 
      When `M' is up again, you must issue on it the same `CHANGE
      MASTER' as that issued on `S2' and `S3', so that `M' becomes a
      slave of `S1' and picks up all the `WC' writes that it missed
      while it was down. To make `M' a master again (because it is the
      most powerful machine, for example), use the preceding procedure
      as if `S1' was unavailable and `M' was to be the new master.
      During this procedure, do not forget to run `RESET MASTER' on `M'
      before making `S1', `S2', and `S3' slaves of `M'.  Otherwise, they
      may pick up old `WC' writes from before the point at which `M'
      became unavailable.
 
      Note that there is no synchronization between the different slaves
      to a master. Some slaves might be ahead of others. This means that
      the concept outlined in the previous example might not work. In
      practice, however, the relay logs of different slaves will most
      likely not be far behind the master, so it would work, anyway (but
      there is no guarantee).
 
 *Q*: Does replication work on mixed operating systems (for example, the
 master runs on Linux while slaves run on Mac OS and Windows)?
 
 *A*: Yes.
 
 *Q*: Does replication work on mixed hardware architectures (for
 example, the master runs on a 64-bit machine while slaves run on 32-bit
 machines)?
 
 *A*: Yes.
Info Catalog
(mysql.info) replication-rules
(mysql.info) replication
(mysql.info) replication-problems
automatically generated byinfo2html