快盘下载:好资源、好软件、快快下载吧!

快盘排行|快盘最新

当前位置:首页软件教程电脑软件教程 → MySQL 8.0 MGR网络抖动一例

MySQL 8.0 MGR网络抖动一例

时间:2022-09-19 18:55:46人气:作者:快盘下载我要评论

mysql 8.0 MGR网络抖动怎么办?

今天中午,线上一个MySQL8.0的MGR失联了一阵,之前其实没有遇到过这个场景,觉得挺新鲜,就记录了下当时的状态。

01

背景

首先介绍下我们这套环境,这套环境是由4个MySQL 8.0.20节点组成的MGR集群(建议配置奇数个节点,这套环境比较特殊),配置的是MGR的多主环境。

线上某个业务反馈连接MGR有连接报错,报错内容如下:

报错原因:业务向MGR中写入数据的时候,报错MGR开启了--super-read-only参数,写不进去。

正常运行的MGR是不会将节点设置为--super-read-only的,于是使用SQL查看了下MGR集群的状态,发现果然掉了一个节点。

正常节点:

11:55:29> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 18142f04-95fe-11ec-aff9-246e96be2d00 | 10.xx.xx.184 |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | 32a10c3f-ce69-11eb-bd13-fa163e311dfa | 10.xxx.1.236 |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | 63a8b9e3-95f9-11ec-bfc3-fa163e06c313 | 10.xxx.1.129 |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

可以看到,正常节点无法和异常的节点通信,当前组里面只有3个节点了。

异常节点:

select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | ca12c7a7-95fd-11ec-9979-6c92bf982df2 | 10.xx.xx.21 |        5562 | ERROR        |             | 8.0.20         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
1 row in set (0.00 sec)

可以看到,异常的节点无法和其他3个正常节点通信,而且自己的状态是ERROR状态。

02

排查过程

首先查看坏掉节点的MySQL日志,报错如下,为了方便理解,我对日志做了一些注释:

2022-03-30T08:35:12.689967+08:00 26 [Warning] [MY-010957] [Server] The replication timestamps have returned to normal values.
----------这部分日志说明它连接不上其他节点了----------------
2022-03-30T11:23:59.451419+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address 10.xxx.1.236:5562 has become unreachable.'
2022-03-30T11:23:59.466959+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address 10.xxx.17.37:5562 has become unreachable.'
2022-03-30T11:23:59.467008+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address 10.xxx.1.129:5562 has become unreachable.'
2022-03-30T11:23:59.467020+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address 10.xx.xx.184:5562 has become unreachable.'

---------无法连接其他节点,停止所有更新,并提示可以使用参数group_replication_force_members重开一个组--------------
2022-03-30T11:23:59.467218+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This
server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to forc
e a new group membership.'

----------这部分日志说明它网络恢复了----------------
2022-03-30T11:24:02.141493+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address 10.xxx.1.236:5562 is reachable again.'
2022-03-30T11:24:03.219209+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address 10.xxx.17.37:5562 is reachable again.'
2022-03-30T11:24:03.245901+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address 10.xxx.1.129:5562 is reachable again.'
2022-03-30T11:24:03.245941+08:00 0 [Warning] [MY-011494] [Repl] Plugin group_replication reported: 'Member with address 10.xx.xx.184:5562 is reachable again.'
2022-03-30T11:24:03.245956+08:00 0 [Warning] [MY-011498] [Repl] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the gro
up. Regular operation is restored and transactions are unblocked.'

--------由于网络原因,它被从MGR复制组中踢出了,状态变更成ERROR-------
2022-03-30T11:24:06.907099+08:00 0 [ERROR] [MY-011505] [Repl] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing membe
r status to ERROR.'

--------一些事务将会回滚--------------
2022-03-30T11:24:06.922962+08:00 0 [Warning] [MY-011630] [Repl] Plugin group_replication reported: 'Due to a plugin error, some transactions were unable to be certified and
will now rollback.'

--------MySQL被自动设置成read only模式-----------
2022-03-30T11:24:06.923035+08:00 0 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was de
tected.'

--------等待冲突检测,执行before_commit函数失败---------
2022-03-30T11:24:06.923078+08:00 312718417 [ERROR] [MY-011615] [Repl] Plugin group_replication reported: 'Error while waiting for conflict detection procedure to finish on
session 312718417'
2022-03-30T11:24:06.923081+08:00 397259854 [ERROR] [MY-011615] [Repl] Plugin group_replication reported: 'Error while waiting for conflict detection procedure to finish on
session 397259854'
2022-03-30T11:24:06.923165+08:00 312718417 [ERROR] [MY-010207] [Repl] Run function 'before_commit' in plugin 'group_replication' failed
2022-03-30T11:24:06.923163+08:00 397259854 [ERROR] [MY-010207] [Repl] Run function 'before_commit' in plugin 'group_replication' failed

-------由于自身状态ERROR无法加入组,提示修复错误或者重启MGR------
2022-03-30T11:24:08.587817+08:00 397259854 [ERROR] [MY-011601] [Repl] Plugin group_replication reported: 'Transaction cannot be executed while Group Replication is on ERROR
state. Check for errors and restart the plugin'

得了,那就按照提示,重启下MGR吧:

12:00:38 >stop group_replication;
Query OK, 0 rows affected (1.01 sec)

12:00:46 >start group_replication;                                  
Query OK, 0 rows affected (3.36 sec)


12:00:52 >select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 18142f04-95fe-11ec-aff9-246e96be2d00 | 10.xx.xx.184 |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | 32a10c3f-ce69-11eb-bd13-fa163e311dfa | 10.xxx.1.236 |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | 63a8b9e3-95f9-11ec-bfc3-fa163e06c313 | 10.xxx.1.129 |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | ca12c7a7-95fd-11ec-9979-6c92bf982df2 | 10.xx.xx.21  |        5562 | RECOVERING   | PRIMARY     | 8.0.20         |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
4 rows in set (0.00 sec)

select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 18142f04-95fe-11ec-aff9-246e96be2d00 | 10.xx.xx.184 |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | 32a10c3f-ce69-11eb-bd13-fa163e311dfa | 10.xxx.1.236 |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | 63a8b9e3-95f9-11ec-bfc3-fa163e06c313 | 10.xxx.1.129 |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | ca12c7a7-95fd-11ec-9979-6c92bf982df2 | 10.xx.xx.21  |        5562 | ONLINE       | PRIMARY     | 8.0.20         |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
4 rows in set (0.00 sec)

重启之后,可以看到,ERROR的状态变成了Recovering,说明节点正在执行恢复操作。不一会儿,就都变成Online状态了。

03

总结

MGR本身的自愈功能还是比较强的,多数派的协议保证了事务的强一致。

从运维经验上讲,一般情况下,MGR中比较棘手的问题就是认证冲突,报错内容形如:

ERROR 3101 (HY000): Plugin instructed the server to rollback the current transaction.

目前遇到认证冲突这种问题,可以通过重试的办法来解决。

如果是MGR深度用户,提几个建议:

1、使用MySQL8.0 社区版本的MGR,最好8.0.22以后

2、最好使用MGR单主模式,冲突会少些。如果想要体验多主模式,又遇到了不可解决的问题,可以退而求其次,多主模式下,集中在一个节点进行写入,利用它的故障自愈。

3、建议使用GreatSQL 替代社区版本MGR,修复了官方MGR的很多Bug。项目链接:https://gitee.com/GreatSQL/GreatSQL-Doc

今天内容就到这里吧。

相关文章

  • Linux配置HTTP服务

    Linux配置HTTP服务,配置Linux初级httpd服务...
  • Linux系统上安装Zookeeper

    Linux系统上安装Zookeeper,Zookeeper的安装:第一步:安装jdk(必须先安装jdk,否则启动不成功)第二步:解压缩zookeeper压缩包第三步:将conf文件夹下zoo_sample.cfg复制一份,改名为zoo.cfg第四步:修改配置dataDir属性,指定一个真实目录(进入zookeeper解压目录,创建data目录:mkdirdata)  第五步:启动zookeeper:bin/zkSe...

网友评论

快盘下载暂未开通留言功能。

关于我们| 广告联络| 联系我们| 网站帮助| 免责声明| 软件发布

Copyright 2019-2029 【快快下载吧】 版权所有 快快下载吧 | 豫ICP备10006759号公安备案:41010502004165

声明: 快快下载吧上的所有软件和资料来源于互联网,仅供学习和研究使用,请测试后自行销毁,如有侵犯你版权的,请来信指出,本站将立即改正。