PXC在插入数据时其它节点发生mysql crash并报UTC - mysqld got signal 11错误处理

来源:互联网 发布:手机联系人数据恢复 编辑:程序博客网 时间:2024/05/01 14:32


一.基本信息
节点一:10.16.24.107
节点二:10.16.24.108
节点三:10.16.24.109
Percona-XtraDB-Cluster-5.6.21

二.问题描述

在节点一和节点三执行inser into demo1 select 2后,另两个节点就宕机了,mysql进程退出;
在节点三上执行update demo1 set id=3 where id=2后,节点二宕机了,mysql进程退出;
接着在节点三执行delete from demo1 where id=1后,节点一和节点三宕机了,节点一mysql进程退出,节点三执行任何sql报ERROR 1047 (08S01): Unknown command
上面问题每次dml操作都能出现。

检查mysql的error log如下:
2016-06-22 16:51:26 8647 [Note] WSREP: (848d2655, 'tcp://0.0.0.0:4567') turning message relay requesting off
08:55:40 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=67108864
read_buffer_size=1048576
max_used_connections=1
max_threads=216
thread_count=2
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 317641 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fd8a4000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fd93c1ef998 thread_stack 0x30000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x35)[0x91c1e5]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x4b4)[0x688524]
/lib64/libpthread.so.0[0x3de880f710]
/usr/local/mysql/bin/mysqld(_ZN14Relay_log_info15cleanup_contextEP3THDb+0x34)[0x8f5fc4]
/usr/local/mysql/bin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0xa93)[0x8bb473]
/usr/local/mysql/bin/mysqld(_ZN9Log_event11apply_eventEP14Relay_log_info+0x77)[0x8bcc97]
/usr/local/mysql/bin/mysqld(_Z14wsrep_apply_cbPvPKvmjPK14wsrep_trx_meta+0x636)[0x5c03a6]
/usr/local/mysql/lib/libgalera_smm.so(_ZNK6galera9TrxHandle5applyEPvPF15wsrep_cb_statusS1_PKvmjPK14wsrep_trx_metaERS6_+0xb1)[0x7fd93
cf5a511]
/usr/local/mysql/lib/libgalera_smm.so(+0x1cc5a5)[0x7fd93cf925a5]
/usr/local/mysql/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM9apply_trxEPvPNS_9TrxHandleE+0x283)[0x7fd93cf92fc3]
/usr/local/mysql/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM11process_trxEPvPNS_9TrxHandleE+0x45)[0x7fd93cf97965]
/usr/local/mysql/lib/libgalera_smm.so(_ZN6galera15GcsActionSource8dispatchEPvRK10gcs_actionRb+0x2da)[0x7fd93cf7115a]
/usr/local/mysql/lib/libgalera_smm.so(_ZN6galera15GcsActionSource7processEPvRb+0x63)[0x7fd93cf71773]
/usr/local/mysql/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM10async_recvEPv+0x93)[0x7fd93cf8fad3]
/usr/local/mysql/lib/libgalera_smm.so(galera_recv+0x23)[0x7fd93cfa67a3]
/usr/local/mysql/bin/mysqld[0x5c1131]
/usr/local/mysql/bin/mysqld(start_wsrep_THD+0x36e)[0x5a863e]
/lib64/libpthread.so.0[0x3de88079d1]
/lib64/libc.so.6(clone+0x6d)[0x3de84e89dd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fd8a400fc60): 
Connection ID (thread ID): 1
Status: NOT_KILLED

You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
160622 16:55:40 mysqld_safe Number of processes running now: 0
160622 16:55:40 mysqld_safe WSREP: not restarting wsrep node automatically
160622 16:55:40 mysqld_safe mysqld from pid file /data/mysql/mysql_3306/data/mvxl0783.pid ended

三.问题分析
表结构如下
(product)root@localhost [demo]> show create table demo1\G
*************************** 1. row ***************************
       Table: demo1
Create Table: CREATE TABLE `demo1` (
  `id` int(11) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

检查表正常:
(product)root@localhost [demo]> check table demo1;
+------------+-------+----------+----------+
| Table      | Op    | Msg_type | Msg_text |
+------------+-------+----------+----------+
| demo.demo1 | check | status   | OK       |
+------------+-------+----------+----------+
1 row in set (0.01 sec)

尝试如下10个方法后,问题还是无法解决,通过baidu、google都没查到相应处理方法。

方法一:增加主键
alter table `demo1` add column `RID` int not null auto_increment primary key comment '主键' first;
方法二:将高版本的xtrabackup改成低版本的
yum remove percona-xtrabackup-24.x86_64
yum install percona-xtrabackup-2.2.11-1.el6.x86_64.rpm
方法三:  将wsrep_sst_method=xtrabackup-v2改成wsrep_sst_method=rsync
方法四:增加安装yum install openssl-devel
方法五:将在mysql用户下启动pxc改成在root用户下启动pxc
方法六:将在mysql用户下登入mysql库操作改成在root用户下登入mysql
方法七:删除数据文件目录下的grastate.dat和galera.cache
方法八:调整limit参数
vi /etc/security/limits.conf
mysql soft nproc -1
msyql hard nproc -1
* soft nofile 65535
* hard nofile 65535
* soft fsize  -1
* hard fsize  -1
root soft nproc -1
root hard nproc -1

方法九:
将innodb_buffer_pool_size由100M调整为15G

方法十:重新建表
drop table demo1;
CREATE TABLE `demo1` (
  `ID` int(11) NOT NULL AUTO_INCREMENT COMMENT '主键',
  `OID` int(11) NOT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;


四.问题解决
经历多次失败,有点绝望,打算升级到pxc 5.6.30版本,看问题是否还存在,但又不甘心去搞升级,于是尝试调整参数,采用比较安全的my.cnf文件配置试试,内容如下:
[mysql]

# CLIENT #
port                           = 3306
socket                         =  /tmp/mysql_3306.sock
[mysqld]
# GENERAL #
user                           = mysql
default-storage-engine         = InnoDB
socket                         = /tmp/mysql_3306.sock
pid-file                       = /data/mysql/mysql_3306/data/mysqld.pid
basedir         = /usr/local/mysql
#datadir         = /data/mysql/mysql_3306/data
tmpdir          = /data/mysql/mysql_3306/tmp
server-id       =1073306
# MyISAM #
key-buffer-size                = 32M
myisam-recover                 = FORCE,BACKUP
# SAFETY #
max-allowed-packet             = 16M
max-connect-errors             = 1000000
skip-name-resolve
sysdate-is-now                 = 1
innodb                         = FORCE
innodb-strict-mode             = 1
# DATA STORAGE #
datadir                        = /data/mysql/mysql_3306/data
# BINARY LOGGING #
log-bin                        = /data/mysql/mysql_3306/logs/mysql-bin
expire-logs-days               = 14
sync-binlog                    = 1
# CACHES AND LIMITS #
tmp-table-size                 = 32M
max-heap-table-size            = 32M
query-cache-type               = 0
query-cache-size               = 0
max-connections                = 500
thread-cache-size              = 50
open-files-limit               = 65535
table-definition-cache         = 1024
table-open-cache               = 2048
# INNODB #
innodb-flush-method            = O_DIRECT
innodb-log-files-in-group      = 2
innodb-log-file-size           = 64M
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table          = 1
innodb-buffer-pool-size        = 500M
# LOGGING #
log-error                      = error.log
slow-query-log                 = 1
slow-query-log-file            = slow.log
# Autoextend
#innodb_data_file_path           = ibdata1:128M;ibdata2:10M:autoextend
innodb_data_home_dir = /data/mysql/mysql_3306/data
innodb_log_group_home_dir = /data/mysql/mysql_3306/logs
innodb_data_file_path = ibdata1:100M:autoextend
#
default_storage_engine=innodb
binlog_format=row
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
#wsrep_provider_options="gcache.size=1999M"
wsrep_cluster_name=pcx_zengxw
wsrep_cluster_address=gcomm://10.16.24.107,10.16.24.108,10.16.24.109
wsrep_node_address=10.16.24.107
wsrep_provider=/usr/local/mysql/lib/libgalera_smm.so
wsrep_sst_method=rsync
#wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sst:zengxw1

用上面的my.cnf文件配置启动pxc,并重复上面的dml操作,居然三个节点都正常,不再出现crash问题,让我万分大喜。

五.排查原因

但要找到是什么参数引起上面问题,需要一个个参数排除,任重道远,想到前面经过近各种方法尝试无果后,都没有放弃,排查参数,应该是轻松和比较简单的事情了。
在上面安全的配置参数中,再将原my.cnf文件其它参数增加进来,一个个排除,最后定位到如下参数与pxc水火不容:
binlog_rows_query_log_events=1
当我将这个问题产生的原因是由于上面这个参数引起的,在QQ群中问其它同学时,后面有同学发了一个关于pxc 5.6版本的局限性,证实了这个问题,5.6版本不支持binlog_rows_query_log_events参数,摘取如下:
.binlog_rows_query_log_events variable not supported.
具体见:https://www.percona.com/doc/percona-xtradb-cluster/5.6/limitation.html

通过该案例处理,总结出:要掌握一个软件,先要了解软件及版本的局限性,同时,为了查找一个技术问题产生的原因和解决方法,要有不到成功不放弃的思想,万一成功了呢?

0 0
原创粉丝点击