1. Replication Models

This document describes two alternative replication models that can be used when configuring MySQL for failover protection:

  • Master-Slave (M-S)

  • Master-Master in Active-Passive mode (M-M-A-P)

[Important]Important

Please consult with your internal DBAs. They already know how to do this. Setting up database replication in a reliable and highly available way requires accommodation to your specific application environment, and these instructions are meant to be a guide, not a complete specification guaranteed to be sufficient in all circumstances. This information is provided "AS IS" and is not warranteed by Hortonworks Inc.

 1.1. Master-Slave Model

The Master-Slave model is the most basic replication model, where one box acts as a master, generating transaction records which are then replicated over to the slave. This creates a cold standby that can then be used as a backup or as a replacement for the master when it goes down. Manual configuration can also result in swapping around the order of the master and the slave.

  • Pros: Robust, well-understood. Allows for backups, simple to configure and manage.

  • Cons: Requires explicit manual intervention to switch it over when failure of the master happens.

 1.1.1. Master-Slave Runtime Configurations

Table 7.1. M-S Runtime Configurations: Master
Configurations Set on Master Comments

server_id

Change from default 1 to a unique id.

log_bin

On, set to a filename where all transactions are recorded for transfer to the slave.

binlog_format

ROW. The default isolation level for the Hive metastore is repeatable-read, and bin logs are not supported in STATEMENT or MIXED modes with that setting.

innodb_flush_logs_at_trx_commit

1 — recommended setting, default with newer MySQL versions.

innodb_support_xa

1 — recommended setting, default with newer MySQL versions.

Table 7.2. M-S Runtime Configurations: Slave
Configurations Set on Slave Comments

server_id

Change from default 1 to a unique id separate from the master.

read_only

Turned on — prevents any application from accidentally attempting to write as a result of a misconfiguration.

relay_log

On, set to a filename to use.

log_slave_updates

Turned on — this is not strictly necessary and is mostly used for slave chains, but makes it much easier to recover and rebuild the original master after a failure; therefore using log_slave_updates is recommended.

log_bin

On — again not strictly necessary on the slave, but useful to have when rebuilding.

binlog_format

ROW. The default isolation level for the Hive metastore is repeatable-read, and bin logs are not supported in STATEMENT or MIXED modes with that setting.

Master setting

Set to the ip/name of the master (done directly in MySQL, not in my.cnf).

 1.2. Master-Master-Active-Passive Model

First of all, it should be noted that MySQL does not support what is often considered to be a multi-master model, in that no slave can receive replication records from more than one master. However, it is possible for us to set up two machines which each refer to each other as master, and as long as they each have only one master (each other), we can have multiple "masters". It is also possible to have a Master-chain setup, but that brings up more problems than it solves, so we'll restrict ourselves primarily to the 2-master model in this discussion, and compare it to a typical M-S setup.

While a Master-Master model does support, in theory, updates to both masters, it is recommended to not use it in that fashion, as it can lead to inconsistencies between the two masters in cases of updates involving auto-increment fields or updates that update the same field. Also, there isn't any major benefit to writing across two masters in terms of write scaling, since the update will have to be processed on both computers anyway. For the most part, the main advantage that an M-M model gives over an M-S model is simply the ability to switch over to a secondary master upon failure of the primary master and to continue using the application that uses the database, and eventually, when the original master comes back up, it can recover much more easily than in an M-S situation. Thus, for the purposes of using a MySQL database as a backend for the Hive metastore, it is recommended that the M-M model not be used in what's referred to as Active-Active mode, but rather, that it be used in an Active-Passive mode, where writes happen to only one master at any given time.

 1.2.1. M-M-A-P Runtime Configurations

Table 7.3. M-M-A-P Runtime Configurations: Master-A
Configurations Set on Master-A Comments

server_id

Change from default 1 to a unique id.

read_only

Turned off — this will be our "active" master.

relay_log

On, set to a filename to use.

log_slave_updates

Turned on.

log_bin

On, set to a filename to use.

binlog_format

ROW. The default isolation level for the Hive metastore is repeatable-read, and bin logs are not supported in STATEMENT or MIXED modes with that setting.

innodb_flush_logs_at_trx_commit

1 — recommended setting, default with newer MySQL versions.

innodb_support_xa

1 — recommended setting, default with newer MySQL versions.

Master setting

Set to master-B (done directly in MySQL, not in my.cnf).

Table 7.4. M-M-A-P Runtime Configurations: Master-B
Configurations Set on Master-B Comments

server_id

Change from default 1 to a unique id separate from the master-A.

read_only

Turned on — prevents any application from accidentally attempting to write as a result of a misconfiguration.

relay_log

On, set to a filename to use.

log_slave_updates

Turned on.

log_bin

On, set to a filename to use.

binlog_format

ROW. The default isolation level for the Hive metastore is repeatable-read, and bin logs are not supported in STATEMENT or MIXED modes with that setting.

innodb_flush_logs_at_trx_commit

1 — recommended setting, default with newer MySQL versions.

innodb_support_xa

1 — recommended setting, default with newer MySQL versions.

Master setting

Set to master-A (done directly in MySQL, not in my.cnf).


loading table of contents...