Keepalived+MySQL双主热备实现高可用方案配置

2018-06-03 20次浏览 发表评论

双机热备是指两台机器都在运行,但并不是两台机器都同时在提供服务。当提供服务的一台出现故障的时候,另外一台会马上自动接管并且提供服务,而且切换的时间非常短。

Keepalived是一个免费开源的,用C编写的类似于layer3, 4 & 7交换机制软件,具备我们平时说的第3层、第4层和第7层交换机的功能。主要提供loadbalancing(负载均衡)和 high-availability(高可用)功能,负载均衡实现需要依赖Linux的虚拟服务内核模块(ipvs),而高可用是通过VRRP协议实现多台机器之间的故障转移服务。

方案拓扑图

环境描述:

  • MySQL安装推荐使用lnmp一键安装包安装MySQL服务器,方便快捷。
  • 服务器系统:CentOS Linux release 7.5.1804 (Core) 
  • Master A:192.168.0.109        安装mysql和keepalived
  • Master B:192.168.0.108        安装mysql和keepalived
  • VIP:192.168.0.50

MySQL配置部分

Master A 节点配置

在my.cnf文件的[mysqld]配置区域添加下面内容:

# 服务器的唯一ID,一般用IP区分
server-id = 109

# 复制过滤,不需要备份的数据库
binlog-ignore-db = mysql,information_schema

# 开启二进制日志功能,名字最好有含义(比如项目名)
log-bin = mysql-bin

# 为 0 时性能最好,风险大;为 1 时写入性能降低5倍以上,数据安全性高
sync_binlog = 1

# 为每个 session 分配的内存,在事务过程中用来存储二进制日志的缓存
binlog_cache_size = 1M

# 主从复制的格式(mixed,statement,row,默认格式是statement)
binlog_format = mixed

# 二进制日志自动删除/过期的天数,默认值为 0,表示不自动删除
expire_logs_days = 10

# 跳过主从复制中遇到的所有错误或指定类型的错误,避免 slave 端复制中断
# 如:1062错误是指一些主键重复,1032错误是因为主从数据库数据不一致
slave_skip_errors = all

# 作为从服务器时的中继日志
relay_log = mysql-relay-bin

# log_slave_updates 表示 slave 将复制事件写进自己的二进制日志
log_slave_updates = 1

# 主键自增规则,避免主从同步ID重复的问题
auto_increment_increment = 2  # 自增因子(每次加2),一般为MySQL服务器数量
auto_increment_offset = 1     # 自增偏移(从1开始),第一台MySQL写1,第二台写2

配置好之后重启MySQL:

/etc/init.d/mysql restart

登录MySQL:

/usr/local/mysql/bin/mysql -uroot -p
-- 创建数据库同步用户,并授予相应的权限
mysql> grant replication slave, replication client on *.* to 'syncroot'@'192.168.0.%' identified by 'sync123456';

-- 刷新授权表信息
mysql> flush privileges;

-- 查看binlog文件的position(偏移)和File(日志文件)的值,配置 Master B 的 Slave 时需要用到。
mysql> show master status;
+------------------+----------+--------------+--------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB         |
+------------------+----------+--------------+--------------------------+
| mysql-bin.000009 |      372 |              | mysql,information_schema |
+------------------+----------+--------------+--------------------------+

配置 Master A 的 Slave :

-- 在 Master A 上执行同步操作,必须要在 Master B上创建同步账户之后才能执行。
-- master_user 和 master_password 是在 Master B 上执行 grant replication slave... 创建的用户和密码。
-- master_log_file 和 master_log_pos 是在 Master B 上运行 show master status; 命令执行结果对应File和Position字段的值。
mysql> change master to master_host='192.168.0.108', master_user='syncroot', master_password='sync123456', master_port=3306, master_log_file='mysql-bin.000008', master_log_pos=372, master_connect_retry=30;

Master B 节点配置

在[mysqld]中增加以下配置项:

# 服务器的唯一ID,一般用IP区分
server-id = 108

# 复制过滤,不需要备份的数据库
binlog-ignore-db = mysql,information_schema

# 开启二进制日志功能,名字最好有含义(比如项目名)
log-bin = mysql-bin

# 为 0 时性能最好,风险大;为 1 时写入性能降低5倍以上,数据安全性高
sync_binlog = 1

# 为每个 session 分配的内存,在事务过程中用来存储二进制日志的缓存
binlog_cache_size = 1M

# 主从复制的格式(mixed,statement,row,默认格式是statement)
binlog_format = mixed

# 二进制日志自动删除/过期的天数,默认值为 0,表示不自动删除
expire_logs_days = 10

# 跳过主从复制中遇到的所有错误或指定类型的错误,避免 slave 端复制中断
# 如:1062错误是指一些主键重复,1032错误是因为主从数据库数据不一致
slave_skip_errors = all

# 作为从服务器时的中继日志
relay_log = mysql-relay-bin

# log_slave_updates 表示 slave 将复制事件写进自己的二进制日志
log_slave_updates = 1

# 主键自增规则,避免主从同步ID重复的问题
auto_increment_increment = 2  # 自增因子(每次加2),一般为MySQL服务器数量
auto_increment_offset = 2     # 自增偏移(从1开始),第一台MySQL写1,第二台写2

配置好之后同样重启MySQL:

/etc/init.d/mysql restart

登录MySQL:

/usr/local/mysql/bin/mysql -uroot -p
-- 创建数据库同步用户,并授予相应的权限
mysql> grant replication slave, replication client on *.* to 'syncroot'@'192.168.0.%' identified by 'sync123456';

-- 刷新授权表信息
mysql> flush privileges;

-- 查看binlog文件的position(偏移)和File(日志文件)的值,配置 Master A 的 Slave 时需要用到。
mysql> show master status;
+------------------+----------+--------------+--------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB         |
+------------------+----------+--------------+--------------------------+
| mysql-bin.000008 |      372 |              | mysql,information_schema |
+------------------+----------+--------------+--------------------------+

配置 Master B 的 Slave :

-- master_log_file 和 master_log_pos是Master A节点上执行 show master status; 对应File和position的值。 change master to mysql> master_host='192.168.0.109', master_user='syncroot', master_password='sync123456', master_port=3306, master_log_file='mysql-bin.000009', master_log_pos=372, master_connect_retry=30;

至此可以分别开启 Master A Slave Master B Slave:

-- 开启 Master A Slave:
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

-- 查看作为从节点的状态信息
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.108
                  Master_User: syncroot
                  Master_Port: 3306
                Connect_Retry: 30
              Master_Log_File: mysql-bin.000009
          Read_Master_Log_Pos: 34442
               Relay_Log_File: mysql-relay-bin.000011
                Relay_Log_Pos: 11150
        Relay_Master_Log_File: mysql-bin.000009
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 34442
              Relay_Log_Space: 11306
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 108
1 row in set (0.00 sec)

-- 同样开启 Master B 的 Slave,查看状态信息,此处省略。

MySQL部分配置完毕,可以在两个节点上新建数据库,分别写入数据进行验证,此处省略...

Keepalived配置部分

在 Master A 和 Master B 分别执行以下步骤安装 Keepalived:

#安装依赖包
yum -y install libnl libnl-devel libnfnetlink-devel openssl-devel

#下载、编译安装Keepalived
cd /usr/local/src/
wget http://www.keepalived.org/software/keepalived-1.4.5.tar.gz
tar zxvf keepalived-1.4.5.tar.gz
cd keepalived-1.4.5/
./configure --prefix=/usr/local/keepalived
make && make install

#复制启动和配置文件到相应的目录
cp keepalived/etc/init.d/keepalived /etc/rc.d/init.d/
cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
mkdir /etc/keepalived/
cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
cp /usr/local/keepalived/sbin/keepalived /usr/sbin/

#设为开机启动
echo "/etc/init.d/keepalived start" >> /etc/rc.local

#备份默认配置文件
cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak

#清空配置文件
> /etc/keepalived/keepalived.conf

在 Master A 和 Master B 上分别创建MySQL的监测脚本,内容如下:

vim /etc/keepalived/check-mysql.sh

#!/bin/bash

#此脚本可以用多种方式检测MySQL是否存活,并可以加判断实现发送报警信息到运维人员
#if netstat -na | grep "LISTEN" | grep "0 0.0.0.0:3306" > /dev/null; then
#    exit 0
#else
#    因为keepalived会2秒执行一次脚本,需要实现报警,这里必须让keepalived自杀,不然会重复报警。
#    /etc/init.d/keepalived stop
#    发短信、发邮件代码..此处省略...
#    exit 1
#fi
#
netstat -na | grep "LISTEN" | grep "0 0.0.0.0:3306" > /dev/null

Master A 的配置文件内容:

vim /etc/keepalived/keepalived.conf

# Keepalive Config for MySQL Master A
global_defs {
    # 运行Keepalived机器的一个标识
    router_id HA_MYSQL_SRV
}

vrrp_script chk_mysql_port {
    # 检测MySQL是否存活的脚本
    script "/etc/keepalived/check-mysql.sh"

    #脚本执行间隔,每2s检测一次
    interval 2

    #脚本结果导致的优先级变更,检测失败(脚本返回非0)则优先级减20
    weight -20

    #检测连续2次失败才算确定是真失败。会用weight减少优先级(1-255之间)
    fall 2

    #检测1次成功就算成功。但不修改优先级
    rise 1
}

vrrp_instance VI_1 {

    state MASTER

    # 指定虚拟ip的网卡接口
    interface ens33

    # 发送多播包的地址,设为本机网卡地址
    mcast_src_ip 192.168.0.109

    # VRRP组名,两个节点必须一致,以指明各节点同属一VRRP组
    virtual_router_id 51

    # 数字越大,优先级越高,在同一个vrrp_instance下,MASTER优先级必须大于BACKUP优先级。这样MASTER故障恢复后,就可以将VIP资源再次抢回来。
    priority 100

    # 播信息发送间隔,两个节点需一致
    advert_int 1

    authentication {
        # 认证方式
        auth_type PASS

        # 认证密码
        auth_pass qazwsx
    }

    # 设置虚拟IP,两个节点需设置一样
    virtual_ipaddress {
        192.168.0.50
    }

    track_script {
        chk_mysql_port
    }
}

Master B 的配置文件内容:

vim /etc/keepalived/keepalived.conf

# Keepalive Config for MySQL Master B
global_defs {
    # 运行Keepalived机器的一个标识
    router_id HA_MYSQL_SRV
}

vrrp_script chk_mysql_port {
    # 检测MySQL是否存活的脚本
    script "/etc/keepalived/check-mysql.sh"

    #脚本执行间隔,每2s检测一次
    interval 2

    #脚本结果导致的优先级变更,检测失败(脚本返回非0)则优先级减20
    weight -20

    #检测连续2次失败才算确定是真失败。会用weight减少优先级(1-255之间)
    fall 2

    #检测1次成功就算成功。但不修改优先级
    rise 1
}

vrrp_instance VI_1 {

    state MASTER

    # 指定虚拟ip的网卡接口
    interface ens33

    # 发送多播包的地址,设为本机网卡地址
    mcast_src_ip 192.168.0.108

    # VRRP组名,两个节点必须一致,以指明各节点同属一VRRP组
    virtual_router_id 51

    # 数字越大,优先级越高,在同一个vrrp_instance下,MASTER优先级必须大于BACKUP优先级。这样MASTER故障恢复后,就可以将VIP资源再次抢回来。
    priority 90

    # 播信息发送间隔,两个节点需一致
    advert_int 1

    authentication {
        # 认证方式
        auth_type PASS

        # 认证密码
        auth_pass qazwsx
    }

    # 设置虚拟IP,两个节点需设置一样
    virtual_ipaddress {
        192.168.0.50
    }

    track_script {
        chk_mysql_port
    }
}

分别开启两个节点的Keepalived,开机启动顺序要保证MySQL在前,Keepalived在后。

/etc/init.d/keepalived start
/etc/init.d/keepalived status
 keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Sun 2018-06-03 21:06:58 CST; 2h 7min ago
  Process: 76004 ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 76005 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─76005 /usr/local/keepalived/sbin/keepalived -D
           ├─76006 /usr/local/keepalived/sbin/keepalived -D
           └─76007 /usr/local/keepalived/sbin/keepalived -D

Jun 03 22:56:21 localhost.localdomain Keepalived_vrrp[76007]: Sending gratuitous ARP on ens33 for 192.168.0.50
Jun 03 22:56:21 localhost.localdomain Keepalived_vrrp[76007]: Sending gratuitous ARP on ens33 for 192.168.0.50
Jun 03 22:56:21 localhost.localdomain Keepalived_vrrp[76007]: Sending gratuitous ARP on ens33 for 192.168.0.50
Jun 03 22:56:21 localhost.localdomain Keepalived_vrrp[76007]: Sending gratuitous ARP on ens33 for 192.168.0.50
Jun 03 22:56:26 localhost.localdomain Keepalived_vrrp[76007]: Sending gratuitous ARP on ens33 for 192.168.0.50
Jun 03 22:56:26 localhost.localdomain Keepalived_vrrp[76007]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on ens33 for 192.168.0.50
Jun 03 22:56:26 localhost.localdomain Keepalived_vrrp[76007]: Sending gratuitous ARP on ens33 for 192.168.0.50
Jun 03 22:56:26 localhost.localdomain Keepalived_vrrp[76007]: Sending gratuitous ARP on ens33 for 192.168.0.50
Jun 03 22:56:26 localhost.localdomain Keepalived_vrrp[76007]: Sending gratuitous ARP on ens33 for 192.168.0.50
Jun 03 22:56:26 localhost.localdomain Keepalived_vrrp[76007]: Sending gratuitous ARP on ens33 for 192.168.0.50

此时使用ip add命令查看VIP漂移在哪个节点上:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:e1:c5:4e brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.109/24 brd 192.168.0.255 scope global dynamic ens33
       valid_lft 127920sec preferred_lft 127920sec
    inet 192.168.0.50/32 scope global ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fee1:c54e/64 scope link 
       valid_lft forever preferred_lft forever

可以看出VIP漂移在Master A节点上。

1条评论

您可以在这里写下评论