MHA实现MySQL8高可用

Mysql的一主多从架构,无法实现故障自动转移,自动提升从库为主库。

一、介绍及规划

1.1 介绍

MHA(mha4mysql-manager)是一款用于 MySQL 的主高可用性 (MHA) 管理器和工具,用于自动化主故障转移和快速主切换。

由两部分构成,

1.2 规划

注意:需提前搞定一主两从的架构,详见MySQL8主从复制读写分离

主机名 ip 用途 系统
mysql101 192.168.10.101 master(node) centos7
mysql102 192.168.10.102 slave01(node) centos7
mysql103 192.168.10.103 slave02(node) centos7
mha-manager 192.168.10.104 管理器 centos7
192.168.10.100 虚拟IP
1
2
3
4
5
6
cat >> /etc/hosts <<EOF
192.168.10.101 mysql101
192.168.10.102 mysql102
192.168.10.103 mysql103
192.168.10.104 mha-manager
EOF

1.3 验证互信

1
ssh-keygen
1
cat ~/.ssh/id_rsa.pub
1
cat >> ~/.ssh/authorized_keys
1
2
3
4
ssh mysql101 hostname
ssh mysql102 hostname
ssh mysql103 hostname
ssh mha-manager hostname

1.4 mysql 的配置

  • mysql101
1
egrep -v '^$|^#' /etc/my.cnf
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
[root@mysql101 ~]# egrep -v '^$|^#' /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
server-id=101
port=3306
gtid_mode=ON
enforce_gtid_consistency=ON
log_bin=mysql-bin
binlog_format=ROW
expire_logs_days=7
log_slave_updates=ON
skip_name_resolve = 1
  • mysql102
1
egrep -v '^$|^#' /etc/my.cnf
1
2
3
4
5
6
7
8
9
10
11
12
13
14
[root@mysql102 ~]# egrep -v '^$|^#' /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
server-id = 102
port = 3306
gtid_mode = ON
enforce_gtid_consistency = ON
log_bin = mysql-bin
binlog_format = ROW
read_only = 1
skip_name_resolve = 1
  • mysql103
1
egrep -v '^$|^#' /etc/my.cnf
1
2
3
4
5
6
7
8
9
10
11
12
13
14
[root@mysql103 ~]# egrep -v '^$|^#' /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
server-id = 103
port = 3306
gtid_mode = ON
enforce_gtid_consistency = ON
log_bin = mysql-bin
binlog_format = ROW
read_only = 1
skip_name_resolve = 1

1.5 一主两从

  1. 重启数据库服务
1
systemctl restart mysqld
  1. 主数据库(mysql101) 创建主从复制账号
1
mysql -u root -p'Root21..'
1
2
3
4
5
-- 创建复制专用用户
CREATE USER 'repl'@'%' IDENTIFIED WITH mysql_native_password BY 'Repl21..';
-- 赋予复制权限
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
FLUSH PRIVILEGES;
  1. 从数据库(mysql102)启动复制,并查看状态(着重看 sqlio 线程)
1
mysql -u root -p'Root21..'
1
2
3
4
5
6
7
CHANGE MASTER TO
MASTER_HOST='192.168.10.101',
MASTER_USER='repl',
MASTER_PASSWORD='Repl21..',
MASTER_PORT=3306,
MASTER_AUTO_POSITION=1; -- 【核心】这句话告诉它:用 GTID 自动找位置
START SLAVE;
1
SHOW SLAVE STATUS\G;
  1. 验证主从复制

主库操作数据

1
2
3
4
CREATE DATABASE test_db;
USE test_db;
CREATE TABLE user (id INT, name VARCHAR(20));
INSERT INTO user VALUES (1, 'qiankong');

从库验证数据

1
2
3
SHOW DATABASES; -- 看看有没有 test_db
USE test_db;
SELECT * FROM user; -- 看看有没有 'qiankong'

二、安装部署

不要忘记配置好主从。

2.1 安装 node 及其依赖

  • mysql101
  • mysql102
  • mysql103
  • mha-manager
1
yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRS

:这里使用了 g.bravexist.cn 反向代理github的 release

1
yum install -y https://g.bravexist.cn/https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm

2.2 安装 manager

  • mha-manager :需要先安装 node 及其依赖。
1
yum install -y https://g.bravexist.cn/https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

2.3 主节点创建 mha 用户

1
mysql -uroot -p'Root21..'
1
2
3
CREATE USER 'mha'@'192.168.10.%' IDENTIFIED WITH mysql_native_password BY 'Mha@21..';
GRANT ALL PRIVILEGES ON *.* TO 'mha'@'192.168.10.%';
FLUSH PRIVILEGES;

三、MHA基本配置使用(案例一)

3.1 修改配置

  1. 修改配置文件
1
vi /etc/masterha_default.cnf
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
[server default]
user=mha
password=Mha@21..
ssh_user=root
repl_user=repl
repl_password=Repl21..
ping_interval=3

manager_workdir=/var/log/mha
manager_log=/var/log/mha/manager.log
# master_ip_failover_script=/usr/local/bin/master_ip_failover

[server1]
hostname=mysql101
port=3306
candidate_master=1
master_binlog_dir=/var/log/mysql/

[server2]
hostname=mysql102
port=3306
candidate_master=1
master_binlog_dir=/var/log/mysql/

[server3]
hostname=mysql103
port=3306
candidate_master=1
# candidate_master 候选主库资格
# no_master=1 # mha 和从节点在一起时取消注释,不选举为主
master_binlog_dir=/var/log/mysql/

3.2 配置启动

  1. 检查SSH
1
masterha_check_ssh --conf=/etc/masterha_default.cnf
  1. 检查主从复制
1
masterha_check_repl --conf=/etc/masterha_default.cnf
  1. 启动MHA
1
nohup masterha_manager --conf=/etc/masterha_default.cnf < /dev/null >/var/log/mha/mha_manager.log 2>&1 &
1
ps -ef |grep masterha_manager

3.3 验证自动切换

  1. 验证
1
systemctl stop mysqld
1
mysql -uroot -p'Root21..'
1
show variables like "read_only";
1
SHOW SLAVE STATUS\G;
1
masterha_check_repl --conf=/etc/masterha_default.cnf

3.4 重新加入集群

  1. 重新启动挂掉的主
1
systemctl start mysqld
1
mysql -uroot -p'Root21..'
1
2
3
4
5
6
7
CHANGE MASTER TO
MASTER_HOST='192.168.10.102',
MASTER_USER='repl',
MASTER_PASSWORD='Repl21..',
MASTER_PORT=3306,
MASTER_AUTO_POSITION=1; -- 【核心】这句话告诉它:用 GTID 自动找位置
START SLAVE;
1
SHOW SLAVE STATUS\G;
1
SET GLOBAL read_only = 1;

注意,只能进行一次主从切换,进程就会挂掉,需要重新启动。--ignore_last_failover 重要参数。

1
rm -f /var/log/mha/masterha_default.failover.complete
1
nohup masterha_manager --conf=/etc/masterha_default.cnf < /dev/null >/var/log/mha/mha_manager.log 2>&1 &

效果同上两步

1
nohup masterha_manager --conf=/etc/masterha_default.cnf --ignore_last_failover < /dev/null >/var/log/mha/mha_manager.log 2>&1 &

3.5 命令汇总

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# 查看mha的状态
masterha_check_status --conf=/etc/masterha_default.cnf

# 查看SSH
masterha_check_ssh --conf=/etc/masterha_default.cnf

# 查看REPL复制
masterha_check_repl --conf=/etc/masterha_default.cnf

# 启动
nohup masterha_manager --conf=/etc/masterha_default.cnf < /dev/null >/var/log/mha/mha_manager.log 2>&1 &

# 启动时额外的参数(配置文件中移除死掉的master)
# --remove_dead_master_conf

# 忽略上次切主,否则会生成一个文件,代表切换过主,第二次不会启动成功。,
# --ignore_last_failover

# 停止
masterha_stop --conf=/etc/masterha_default.cnf

四、虚拟IP自动切换(案例二)

4.1 创建脚本

1
vim  /usr/local/bin/master_ip_failover

注意:修改网卡和ip地址

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.10.100/24';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}

sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
  1. 转为Linux下字符
1
yum install dos2unix -y
1
dos2unix /usr/local/bin/master_ip_failover
  1. 添加可执行权限
1
chmod +x /usr/local/bin/master_ip_failover 

4.2 修改配置文件

取消注释

1
vi /etc/masterha_default.cnf
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
[server default]
user=mha
password=Mha@21..
ssh_user=root
repl_user=repl
repl_password=Repl21..
ping_interval=3

manager_workdir=/var/log/mha
manager_log=/var/log/mha/manager.log
# 引用脚本位置
master_ip_failover_script=/usr/local/bin/master_ip_failover

[server1]
hostname=mysql101
port=3306
candidate_master=1
master_binlog_dir=/var/log/mysql/

[server2]
hostname=mysql102
port=3306
candidate_master=1
master_binlog_dir=/var/log/mysql/

[server3]
hostname=mysql103
port=3306
candidate_master=1
# candidate_master 候选主库资格
# no_master=1 # mha 和从节点在一起时取消注释,不选举为主
master_binlog_dir=/var/log/mysql/

4.3 手动给master设置ip

1
ifconfig ens33:1 192.168.10.100/24

4.4 启动

注意,只能进行一次主从切换,进程就会挂掉,需要重新启动。--ignore_last_failover 重要参数。

1
rm -f /var/log/mha/masterha_default.failover.complete
1
nohup masterha_manager --conf=/etc/masterha_default.cnf < /dev/null >/var/log/mha/mha_manager.log 2>&1 &

效果同上两步

1
nohup masterha_manager --conf=/etc/masterha_default.cnf --ignore_last_failover < /dev/null >/var/log/mha/mha_manager.log 2>&1 &

4.5 验证自动切换

  1. 验证
1
systemctl stop mysqld
1
mysql -uroot -p'Root21..'
1
show variables like "read_only";
1
SHOW SLAVE STATUS\G;
1
masterha_check_repl --conf=/etc/masterha_default.cnf
1
ip addr

4.6 重新加入集群

  1. 重新启动挂掉的主
1
systemctl start mysqld
1
mysql -uroot -p'Root21..'
1
2
3
4
5
6
7
CHANGE MASTER TO
MASTER_HOST='192.168.10.102',
MASTER_USER='repl',
MASTER_PASSWORD='Repl21..',
MASTER_PORT=3306,
MASTER_AUTO_POSITION=1; -- 【核心】这句话告诉它:用 GTID 自动找位置
START SLAVE;
1
SHOW SLAVE STATUS\G;
1
SET GLOBAL read_only = 1;

五、主从数据自动补足

原理,设置一台 binlog server ,实时拉取主库的二进制日志,并且它不参与主库的选举。

5.1 设置 binlog server

  • 目前的 master
1
show master status;
  • mha-manager :选取的 binlog 服务器
  1. 创建对应的目录
1
mkdir -pv /data/binlog_backup
  1. 修改目录对应的权限
1
chown -R mysql:mysql /data/binlog_backup  # 假设你没创建mysql用户,用 root 也行,但要注意权限
  1. 手动拉取 binlog ,将进程放到后台

注意:主机名、密码、日志文件。

1
cd /data/binlog_backup
1
2
3
4
5
6
nohup mysqlbinlog -R \
--host=192.168.10.102 \
--user=mha \
--password='Mha@21..' \
--raw \
--stop-never mysql-bin.000001 &

或配置成 systemd 服务

1
vi /etc/systemd/system/mysql-binlog-server.service
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
[Unit]
Description=MySQL Binlog Server
After=network.target

[Service]
Type=simple
User=root
Group=root
# 你的备份目录,确保存在且有权限
WorkingDirectory=/data/binlog_backup

# 核心启动命令
# 注意:
# 1. 这里的 --host 先写当前的主库 (mysql101)
# 2. 这里的 mysql-bin.xxxx 需要你先确认主库当前的日志文件
# 3. 这里的密码不能使用 引号包裹
ExecStart=/usr/bin/mysqlbinlog -R --host=192.168.10.102 --user=mha --password=Mha@21.. --raw --stop-never mysql-bin.000001

# 关键设置:挂了自动重启
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
  1. 启动 systemd
1
systemctl enable --now mysql-binlog-server
  1. 验证拉取的binlog日志
1
ls -l /data/binlog_backup

5.2 MHA配置

注: 主要添加 binlog1 模块。

  1. 修改配置文件
1
vi /etc/masterha_default.cnf
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
[server default]
user=mha
password=Mha@21..
ssh_user=root
repl_user=repl
repl_password=Repl21..
ping_interval=3

manager_workdir=/var/log/mha
manager_log=/var/log/mha/manager.log
# 引用脚本位置
master_ip_failover_script=/usr/local/bin/master_ip_failover

[server1]
hostname=mysql101
port=3306
candidate_master=1
master_binlog_dir=/var/log/mysql/

[server2]
hostname=mysql102
port=3306
candidate_master=1
master_binlog_dir=/var/log/mysql/

[server3]
hostname=mysql103
port=3306
# candidate_master 候选主库资格
candidate_master=1
# mha 和从节点在一起时取消注释,不选举为主
# no_master=1
master_binlog_dir=/var/log/mysql/

[binlog1]
# 这里使用管理节点来额外存放 `binlog` 日志
no_master=1
hostname=mha-manager
master_binlog_dir=/data/binlog_backup

5.3 启动

  1. 查看MHA状态
1
masterha_check_status --conf=/etc/masterha_default.cnf
  1. 停止
1
masterha_stop --conf=/etc/masterha_default.cnf
  1. 启动
1
nohup masterha_manager --conf=/etc/masterha_default.cnf --ignore_last_failover< /dev/null >/var/log/mha/mha_manager.log 2>&1 &