MySQL PXC cluster split-brain and grastate.dat modification experiment



A MySQL PXC cluster was set up on three servers:


172.31.217.182 bd-dev-mingshuo-182


172.31.217.183 bd-dev-mingshuo-183


172.31.217.89 bd-dev-vertica-89


Shut down one node, 183, normally:


mysqladmin -uroot -poracle -S /u01/mysql/3307/data/mysql.sock -P3307 shutdown


Log on the node being shut down:




2018-09-27T07:33:13.222079Z 0 [Note] WSREP: Received shutdown signal. Will sleep for 10 secs before initiating shutdown. pxc_maint_mode switched to SHUTDOWN


2018-09-27T07:33:23.230509Z 0 [Note] WSREP: Stop replication


2018-09-27T07:33:23.230619Z 0 [Note] WSREP: Closing send monitor…


2018-09-27T07:33:23.230640Z 0 [Note] WSREP: Closed send monitor.


2018-09-27T07:33:23.230660Z 0 [Note] WSREP: gcomm: terminating thread


2018-09-27T07:33:23.230680Z 0 [Note] WSREP: gcomm: joining thread


2018-09-27T07:33:23.230827Z 0 [Note] WSREP: gcomm: closing backend


2018-09-27T07:33:23.231780Z 0 [Note] WSREP: Current view of cluster as seen by this node


view (view_id(NON_PRIM,12f1e199,11)


memb {


12f1e199,0


}


joined {


}


left {


}


partitioned {


2331d3d7,0


c05737fd,0


}


)


2018-09-27T07:33:23.231867Z 0 [Note] WSREP: Current view of cluster as seen by this node


view ((empty))


2018-09-27T07:33:23.232111Z 0 [Note] WSREP: gcomm: closed


2018-09-27T07:33:23.232165Z 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1


2018-09-27T07:33:23.232253Z 0 [Note] WSREP: Flow-control interval: [100, 100]


2018-09-27T07:33:23.232260Z 0 [Note] WSREP: Trying to continue unpaused monitor


2018-09-27T07:33:23.232264Z 0 [Note] WSREP: Received NON-PRIMARY.


2018-09-27T07:33:23.232268Z 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 27)


2018-09-27T07:33:23.232279Z 0 [Note] WSREP: Received self-leave message.


2018-09-27T07:33:23.232285Z 0 [Note] WSREP: Flow-control interval: [0, 0]


2018-09-27T07:33:23.232288Z 0 [Note] WSREP: Trying to continue unpaused monitor


2018-09-27T07:33:23.232291Z 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.


2018-09-27T07:33:23.232295Z 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 27)


2018-09-27T07:33:23.232302Z 0 [Note] WSREP: RECV thread exiting 0: Success


2018-09-27T07:33:23.232383Z 2 [Note] WSREP: New cluster view: global state: c057dbc5-c16e-11e8-a1a6-825ed9079934:27, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3


2018-09-27T07:33:23.232394Z 2 [Note] WSREP: Setting wsrep_ready to false


2018-09-27T07:33:23.232400Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.


2018-09-27T07:33:23.232439Z 2 [Note] WSREP: New cluster view: global state: c057dbc5-c16e-11e8-a1a6-825ed9079934:27, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 3


2018-09-27T07:33:23.232443Z 2 [Note] WSREP: Setting wsrep_ready to false


2018-09-27T07:33:23.232446Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.


2018-09-27T07:33:23.232472Z 2 [Note] WSREP: applier thread exiting (code:0)


2018-09-27T07:33:23.232479Z 0 [Note] WSREP: recv_thread() joined.


2018-09-27T07:33:23.232502Z 0 [Note] WSREP: Closing replication queue.


2018-09-27T07:33:23.232509Z 0 [Note] WSREP: Closing slave action queue.


2018-09-27T07:33:23.232517Z 0 [Note] Giving 2 client threads a chance to die gracefully


2018-09-27T07:33:25.232639Z 0 [Note] WSREP: Waiting for active wsrep applier to exit


2018-09-27T07:33:25.232758Z 1 [Note] WSREP: rollbacker thread exiting


2018-09-27T07:33:25.232994Z 0 [Note] Giving 0 client threads a chance to die gracefully


2018-09-27T07:33:25.233010Z 0 [Note] Shutting down slave threads


2018-09-27T07:33:25.233025Z 0 [Note] Forcefully disconnecting 0 remaining clients


2018-09-27T07:33:25.233044Z 0 [Note] Event Scheduler: Purging the queue. 0 events


2018-09-27T07:33:25.242788Z 0 [Note] WSREP: Service thread queue flushed.


2018-09-27T07:33:25.250399Z 0 [Note] WSREP: MemPool(SlaveTrxHandle): hit ratio: 0, misses: 0, in use: 0, in pool: 0


2018-09-27T07:33:25.250479Z 0 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 27)


2018-09-27T07:33:25.259428Z 0 [Note] Binlog end


2018-09-27T07:33:25.261702Z 0 [Note] Shutting down plugin ‘ngram’


2018-09-27T07:33:25.261721Z 0 [Note] Shutting down plugin ‘partition’


2018-09-27T07:33:25.261726Z 0 [Note] Shutting down plugin ‘ARCHIVE’


2018-09-27T07:33:25.261729Z 0 [Note] Shutting down plugin ‘BLACKHOLE’


2018-09-27T07:33:25.261733Z 0 [Note] Shutting down plugin ‘INNODB_SYS_VIRTUAL’


2018-09-27T07:33:25.261736Z 0 [Note] Shutting down plugin ‘INNODB_CHANGED_PAGES’


2018-09-27T07:33:25.261739Z 0 [Note] Shutting down plugin ‘INNODB_SYS_DATAFILES’


2018-09-27T07:33:25.261741Z 0 [Note] Shutting down plugin ‘INNODB_SYS_TABLESPACES’


2018-09-27T07:33:25.261744Z 0 [Note] Shutting down plugin ‘INNODB_SYS_FOREIGN_COLS’


2018-09-27T07:33:25.261746Z 0 [Note] Shutting down plugin ‘INNODB_SYS_FOREIGN’


2018-09-27T07:33:25.261749Z 0 [Note] Shutting down plugin ‘INNODB_SYS_FIELDS’


2018-09-27T07:33:25.261751Z 0 [Note] Shutting down plugin ‘INNODB_SYS_COLUMNS’


2018-09-27T07:33:25.261754Z 0 [Note] Shutting down plugin ‘INNODB_SYS_INDEXES’


2018-09-27T07:33:25.261756Z 0 [Note] Shutting down plugin ‘INNODB_SYS_TABLESTATS’


2018-09-27T07:33:25.261759Z 0 [Note] Shutting down plugin ‘INNODB_SYS_TABLES’


2018-09-27T07:33:25.261761Z 0 [Note] Shutting down plugin ‘INNODB_FT_INDEX_TABLE’


2018-09-27T07:33:25.261764Z 0 [Note] Shutting down plugin ‘INNODB_FT_INDEX_CACHE’


2018-09-27T07:33:25.261766Z 0 [Note] Shutting down plugin ‘INNODB_FT_CONFIG’


2018-09-27T07:33:25.261769Z 0 [Note] Shutting down plugin ‘INNODB_FT_BEING_DELETED’


2018-09-27T07:33:25.261771Z 0 [Note] Shutting down plugin ‘INNODB_FT_DELETED’


2018-09-27T07:33:25.261774Z 0 [Note] Shutting down plugin ‘INNODB_FT_DEFAULT_STOPWORD’


2018-09-27T07:33:25.261776Z 0 [Note] Shutting down plugin ‘INNODB_METRICS’


2018-09-27T07:33:25.261778Z 0 [Note] Shutting down plugin ‘INNODB_TEMP_TABLE_INFO’


2018-09-27T07:33:25.261781Z 0 [Note] Shutting down plugin ‘INNODB_BUFFER_POOL_STATS’


2018-09-27T07:33:25.261783Z 0 [Note] Shutting down plugin ‘INNODB_BUFFER_PAGE_LRU’


2018-09-27T07:33:25.261785Z 0 [Note] Shutting down plugin ‘INNODB_BUFFER_PAGE’


2018-09-27T07:33:25.261788Z 0 [Note] Shutting down plugin ‘INNODB_CMP_PER_INDEX_RESET’


2018-09-27T07:33:25.261790Z 0 [Note] Shutting down plugin ‘INNODB_CMP_PER_INDEX’


2018-09-27T07:33:25.261793Z 0 [Note] Shutting down plugin ‘INNODB_CMPMEM_RESET’


2018-09-27T07:33:25.261795Z 0 [Note] Shutting down plugin ‘INNODB_CMPMEM’


2018-09-27T07:33:25.261797Z 0 [Note] Shutting down plugin ‘INNODB_CMP_RESET’


2018-09-27T07:33:25.261800Z 0 [Note] Shutting down plugin ‘INNODB_CMP’


2018-09-27T07:33:25.261802Z 0 [Note] Shutting down plugin ‘INNODB_LOCK_WAITS’


2018-09-27T07:33:25.261805Z 0 [Note] Shutting down plugin ‘INNODB_LOCKS’


2018-09-27T07:33:25.261807Z 0 [Note] Shutting down plugin ‘INNODB_TRX’


2018-09-27T07:33:25.261809Z 0 [Note] Shutting down plugin ‘XTRADB_ZIP_DICT_COLS’


2018-09-27T07:33:25.261812Z 0 [Note] Shutting down plugin ‘XTRADB_ZIP_DICT’


2018-09-27T07:33:25.261814Z 0 [Note] Shutting down plugin ‘XTRADB_RSEG’


2018-09-27T07:33:25.261817Z 0 [Note] Shutting down plugin ‘XTRADB_INTERNAL_HASH_TABLES’


2018-09-27T07:33:25.261819Z 0 [Note] Shutting down plugin ‘XTRADB_READ_VIEW’


2018-09-27T07:33:25.261822Z 0 [Note] Shutting down plugin ‘InnoDB’


2018-09-27T07:33:25.261857Z 0 [Note] InnoDB: FTS optimize thread exiting.


2018-09-27T07:33:25.262097Z 0 [Note] InnoDB: Starting shutdown…


2018-09-27T07:33:25.362428Z 0 [Note] InnoDB: Dumping buffer pool(s) to /u01/mysql/3307/data/ib_buffer_pool


2018-09-27T07:33:25.363022Z 0 [Note] InnoDB: Buffer pool(s) dump completed at 180927 15:33:25


2018-09-27T07:33:25.562786Z 0 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool


2018-09-27T07:33:26.571050Z 0 [Note] InnoDB: Shutdown completed; log sequence number 2569669


2018-09-27T07:33:26.574169Z 0 [Note] InnoDB: Removed temporary tablespace data file: “ibtmp1”


2018-09-27T07:33:26.574193Z 0 [Note] Shutting down plugin ‘MyISAM’


2018-09-27T07:33:26.574210Z 0 [Note] Shutting down plugin ‘MRG_MYISAM’


2018-09-27T07:33:26.574222Z 0 [Note] Shutting down plugin ‘CSV’


2018-09-27T07:33:26.574233Z 0 [Note] Shutting down plugin ‘MEMORY’


2018-09-27T07:33:26.574254Z 0 [Note] Shutting down plugin ‘PERFORMANCE_SCHEMA’


2018-09-27T07:33:26.574287Z 0 [Note] Shutting down plugin ‘sha256_password’


2018-09-27T07:33:26.574296Z 0 [Note] Shutting down plugin ‘mysql_native_password’


2018-09-27T07:33:26.574304Z 0 [Note] Shutting down plugin ‘wsrep’


2018-09-27T07:33:26.574480Z 0 [Note] Shutting down plugin ‘binlog’


Log on a healthy node:


2018-09-27T07:33:22.505216Z 0 [Note] WSREP: declaring c05737fd at tcp://172.31.217.89:4567 stable


2018-09-27T07:33:22.505345Z 0 [Note] WSREP: forgetting 12f1e199 (tcp://172.31.217.183:4567)


2018-09-27T07:33:22.511586Z 0 [Note] WSREP: Node 2331d3d7 state primary


2018-09-27T07:33:22.512245Z 0 [Note] WSREP: Current view of cluster as seen by this node


view (view_id(PRIM,2331d3d7,12)


memb {


2331d3d7,0


c05737fd,0


}


joined {


}


left {


}


partitioned {


12f1e199,0


}


)


2018-09-27T07:33:22.512303Z 0 [Note] WSREP: Save the discovered primary-component to disk


2018-09-27T07:33:22.512547Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2


2018-09-27T07:33:22.513157Z 0 [Note] WSREP: forgetting 12f1e199 (tcp://172.31.217.183:4567)


2018-09-27T07:33:22.513241Z 0 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 9ccf0351-c227-11e8-ae6a-d3cac5b411a7


2018-09-27T07:33:22.514096Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 9ccf0351-c227-11e8-ae6a-d3cac5b411a7


2018-09-27T07:33:22.514647Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: 9ccf0351-c227-11e8-ae6a-d3cac5b411a7 from 0 (bd-dev-mingshuo-182)


2018-09-27T07:33:22.514661Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: 9ccf0351-c227-11e8-ae6a-d3cac5b411a7 from 1 (bd-dev-vertica-89)


2018-09-27T07:33:22.514669Z 0 [Note] WSREP: Quorum results:


version = 4,


component = PRIMARY,


conf_id = 11,


members = 2/2 (primary/total),


act_id = 27,


last_appl. = 0,


protocols = 0/8/3 (gcs/repl/appl),


group UUID = c057dbc5-c16e-11e8-a1a6-825ed9079934


2018-09-27T07:33:22.514675Z 0 [Note] WSREP: Flow-control interval: [141, 141]


2018-09-27T07:33:22.514679Z 0 [Note] WSREP: Trying to continue unpaused monitor


2018-09-27T07:33:22.514707Z 2 [Note] WSREP: New cluster view: global state: c057dbc5-c16e-11e8-a1a6-825ed9079934:27, view# 12: Primary, number of nodes: 2, my index: 0, protocol version 3


2018-09-27T07:33:22.514713Z 2 [Note] WSREP: Setting wsrep_ready to true


2018-09-27T07:33:22.514719Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.


2018-09-27T07:33:22.514727Z 2 [Note] WSREP: REPL Protocols: 8 (3, 2)


2018-09-27T07:33:22.514747Z 2 [Note] WSREP: Assign initial position for certification: 27, protocol version: 3


2018-09-27T07:33:22.514830Z 0 [Note] WSREP: Service thread queue flushed.


2018-09-27T07:33:27.691129Z 0 [Note] WSREP: cleaning up 12f1e199 (tcp://172.31.217.183:4567)


Insert data on node 182:


mysql> insert into t1 values (4,4);


Query OK, 1 row affected (0.01 sec)


mysql> select * from t1;


+---+------+
| a | b    |
+---+------+
| 1 |    1 |
| 2 |    2 |
| 3 |    3 |
| 4 |    4 |
+---+------+


4 rows in set (0.00 sec)


Start node 183, then connect to it and query the table:


mysql -S /u01/mysql/3307/data/mysql.sock -uroot -poracle -P3307


mysql> select * from t1;


+---+------+
| a | b    |
+---+------+
| 1 |    1 |
| 2 |    2 |
| 3 |    3 |
| 4 |    4 |
+---+------+


4 rows in set (0.00 sec)


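The incremental data has been synchronized.


Below is the part of the log where the increment is applied. It shows that one transaction (one writeset) was received via IST.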


2018-09-27T08:05:50.785769Z 0 [Note] WSREP: Signalling provider to continue on SST completion.


2018-09-27T08:05:50.785808Z 0 [Note] WSREP: Initialized wsrep sidno 2


2018-09-27T08:05:50.785833Z 0 [Note] WSREP: SST received: c057dbc5-c16e-11e8-a1a6-825ed9079934:27


2018-09-27T08:05:50.785872Z 2 [Note] WSREP: Receiving IST: 1 writesets, seqnos 27-28


2018-09-27T08:05:50.785985Z 0 [Note]


2018-09-27T08:05:50.785985Z 0 [Note] WSREP: Receiving IST… 0.0% (0/1 events) complete.


2018-09-27T08:05:50.877679Z 0 [Note] WSREP: Receiving IST…100.0% (1/1 events) complete.


2018-09-27T08:05:50.877904Z 2 [Note] WSREP: IST received: c057dbc5-c16e-11e8-a1a6-825ed9079934:28


2018-09-27T08:05:50.878589Z 0 [Note] WSREP: 1.0 (bd-dev-mingshuo-183): State transfer from 0.0 (bd-dev-mingshuo-182) complete.


2018-09-27T08:05:50.878603Z 0 [Note] WSREP: SST leaving flow control


2018-09-27T08:05:50.878608Z 0 [Note] WSREP: Shifting JOINER -> JOINED (TO: 28)


2018-09-27T08:05:50.879059Z 0 [Note] WSREP: Member 1.0 (bd-dev-mingshuo-183) synced with group.


2018-09-27T08:05:50.879072Z 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 28)


2018-09-27T08:05:50.879101Z 2 [Note] WSREP: Synchronized with group, ready for connections


2018-09-27T08:05:50.879115Z 2 [Note] WSREP: Setting wsrep_ready to true


Now test an abnormal shutdown of two nodes


Kill the mysqld processes on nodes 182 and 183 directly with kill -9.


On the surviving node, 89:


mysql> select * from ming.t1;


ERROR 1047 (08S01): WSREP has not yet prepared node for application use


mysql> insert into ming.t1 values(10,10);


ERROR 1047 (08S01): WSREP has not yet prepared node for application use


The surviving node can no longer serve reads or writes.


mysql> show status where Variable_name IN ('wsrep_local_state_uuid','wsrep_cluster_conf_id','wsrep_cluster_size','wsrep_cluster_status','wsrep_ready','wsrep_connected');


+------------------------+--------------------------------------+
| Variable_name          | Value                                |
+------------------------+--------------------------------------+
| wsrep_local_state_uuid | c057dbc5-c16e-11e8-a1a6-825ed9079934 |
| wsrep_cluster_conf_id  | 18446744073709551615                 |
| wsrep_cluster_size     | 1                                    |
| wsrep_cluster_status   | non-Primary                          |
| wsrep_connected        | ON                                   |
| wsrep_ready            | OFF                                  |
+------------------------+--------------------------------------+


6 rows in set (0.00 sec)


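The output shows:


wsrep_cluster_size=1 means the node sees only itself in the cluster.


wsrep_cluster_status=non-Primary means the node is not part of a Primary component, that is, it has lost quorum.


wsrep_connected=ON means the node is still connected to the replication layer and mysqld still accepts client connections.


wsrep_ready=OFF means the node can no longer serve queries. The failed select statement above confirms this.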


Whether the surviving node can serve reads depends on the wsrep_dirty_reads variable:


mysql> show variables like 'wsrep_dirty_reads';


+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| wsrep_dirty_reads | OFF   |
+-------------------+-------+


1 row in set (0.00 sec)


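wsrep_dirty_reads can be changed dynamically. If it is set to ON, the node can serve reads even while it is in the non-Primary state. Serving writes additionally requires promoting the node to Primary, which is done through a different option, covered later in this post.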
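The rule described above can be sketched as a small helper. This is a simplified model for illustration, not PXC's actual logic: the function name node_capabilities and the dict layout are assumptions, and the input values are the ones returned by SHOW STATUS earlier.

```python
def node_capabilities(status, dirty_reads=False):
    """Return (can_read, can_write) for one node.

    status: dict of wsrep status values as strings, e.g. from SHOW STATUS.
    dirty_reads: current value of the wsrep_dirty_reads variable.
    """
    ready = status.get("wsrep_ready", "OFF").upper() == "ON"
    primary = status.get("wsrep_cluster_status", "").lower() == "primary"
    # Writes need a ready node inside a Primary component.
    can_write = ready and primary
    # A node that is not ready still answers SELECTs when dirty reads are on.
    can_read = ready or dirty_reads
    return can_read, can_write


survivor = {
    "wsrep_cluster_size": "1",
    "wsrep_cluster_status": "non-Primary",
    "wsrep_connected": "ON",
    "wsrep_ready": "OFF",
}
print(node_capabilities(survivor))                    # (False, False)
print(node_capabilities(survivor, dirty_reads=True))  # (True, False)
```

With the values from node 89, turning wsrep_dirty_reads on only changes the read side; writes stay blocked until the node is part of a Primary component again.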


The surviving node keeps trying to reconnect to the other two nodes:


2018-09-28T02:57:37.209095Z 0 [Note] WSREP: (725136c0, ‘tcp://0.0.0.0:4567’) reconnecting to 2331d3d7 (tcp://172.31.217.182:4567), attempt 960


2018-09-28T02:58:12.714612Z 0 [Note] WSREP: (725136c0, ‘tcp://0.0.0.0:4567’) reconnecting to 252da778 (tcp://172.31.217.183:4567), attempt 900


2018-09-28T02:58:22.216078Z 0 [Note] WSREP: (725136c0, ‘tcp://0.0.0.0:4567’) reconnecting to 2331d3d7 (tcp://172.31.217.182:4567), attempt 990


2018-09-28T02:58:57.721850Z 0 [Note] WSREP: (725136c0, ‘tcp://0.0.0.0:4567’) reconnecting to 252da778 (tcp://172.31.217.183:4567), attempt 930


2018-09-28T02:59:07.223430Z 0 [Note] WSREP: (725136c0, ‘tcp://0.0.0.0:4567’) reconnecting to 2331d3d7 (tcp://172.31.217.182:4567), attempt 1020


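The node refuses reads and writes because PXC cannot, on its own, tell a crash apart from a network partition. I know that I killed the processes on the two nodes.


From PXC's point of view, however, the surviving node does not know the state of the other two nodes. They may be dead, or they may still be able to talk to each other and keep serving clients.


That would form two isolated islands that cannot reach each other. Holding only one of three votes, the surviving node has no quorum, so it goes non-Primary and stops serving reads and writes.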
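The majority rule behind this behaviour can be sketched in a few lines. This is a minimal sketch assuming every node has the same weight (Galera also supports weighted quorum, which is ignored here); the function name has_quorum is hypothetical.

```python
def has_quorum(reachable, previous_size):
    """True if `reachable` nodes form a strict majority of the last
    Primary component, assuming every node has equal weight."""
    return 2 * reachable > previous_size


# kill -9 on 182 and 183: node 89 still counts 3 members but sees only 1.
print(has_quorum(1, 3))  # False

# Graceful shutdown of 183: it leaves the membership first,
# so 182 and 89 are 2 of 2 and stay Primary.
print(has_quorum(2, 2))  # True
```

This matches the two experiments: after the clean shutdown the log reported "members = 2/2 (primary/total)", while after the kill the lone survivor dropped to non-Primary.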


Log on the surviving node after the two killed nodes were brought back up:


2018-09-28T03:04:57.003914Z 0 [Note] WSREP: (725136c0, ‘tcp://0.0.0.0:4567’) connection established to 252da778 tcp://172.31.217.183:4567


2018-09-28T03:05:03.215507Z 0 [Note] WSREP: declaring 252da778 at tcp://172.31.217.183:4567 stable


2018-09-28T03:05:03.216346Z 0 [Note] WSREP: Current view of cluster as seen by this node


view (view_id(NON_PRIM,252da778,30)


memb {


252da778,0


725136c0,0


}


joined {


}


left {


}


partitioned {


2331d3d7,0


}


)


2018-09-28T03:05:03.216630Z 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 1, memb_num = 2


2018-09-28T03:05:03.216710Z 0 [Note] WSREP: Flow-control interval: [141, 141]


2018-09-28T03:05:03.216718Z 0 [Note] WSREP: Trying to continue unpaused monitor


2018-09-28T03:05:03.216723Z 0 [Note] WSREP: Received NON-PRIMARY.


2018-09-28T03:05:03.216794Z 1 [Note] WSREP: New cluster view: global state: c057dbc5-c16e-11e8-a1a6-825ed9079934:33, view# -1: non-Primary, number of nodes: 2, my index: 1, protocol version 3


2018-09-28T03:05:03.216822Z 1 [Note] WSREP: Setting wsrep_ready to false


2018-09-28T03:05:03.216833Z 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.


2018-09-28T03:05:04.277523Z 0 [Note] WSREP: (725136c0, ‘tcp://0.0.0.0:4567’) connection established to 2331d3d7 tcp://172.31.217.182:4567


2018-09-28T03:05:04.279018Z 0 [Note] WSREP: (725136c0, ‘tcp://0.0.0.0:4567’) connection established to 2331d3d7 tcp://172.31.217.182:4567


2018-09-28T03:05:04.776965Z 0 [Note] WSREP: declaring 2331d3d7 at tcp://172.31.217.182:4567 stable


2018-09-28T03:05:04.777019Z 0 [Note] WSREP: declaring 252da778 at tcp://172.31.217.183:4567 stable


2018-09-28T03:05:04.777487Z 0 [Note] WSREP: re-bootstrapping prim from partitioned components


2018-09-28T03:05:04.778262Z 0 [Note] WSREP: Current view of cluster as seen by this node


view (view_id(PRIM,2331d3d7,31)


memb {


2331d3d7,0


252da778,0


725136c0,0


}


joined {


}


left {


}


partitioned {


}


)


2018-09-28T03:05:04.778307Z 0 [Note] WSREP: Save the discovered primary-component to disk


2018-09-28T03:05:04.778588Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3


2018-09-28T03:05:04.778629Z 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.


2018-09-28T03:05:05.277931Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 838d4806-c2cb-11e8-8bb1-eeeae1741165


2018-09-28T03:05:05.278435Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: 838d4806-c2cb-11e8-8bb1-eeeae1741165 from 0 (bd-dev-mingshuo-182)


2018-09-28T03:05:05.278463Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: 838d4806-c2cb-11e8-8bb1-eeeae1741165 from 1 (bd-dev-mingshuo-183)


2018-09-28T03:05:05.278470Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: 838d4806-c2cb-11e8-8bb1-eeeae1741165 from 2 (bd-dev-vertica-89)


2018-09-28T03:05:05.278490Z 0 [Warning] WSREP: Quorum: No node with complete state:


Version : 4


Flags : 0x1


Protocols : 0 / 8 / 3


State : NON-PRIMARY


Desync count : 0


Prim state : NON-PRIMARY


Prim UUID : 00000000-0000-0000-0000-000000000000


Prim seqno : -1


First seqno : -1


Last seqno : 33


Prim JOINED : 0


State UUID : 838d4806-c2cb-11e8-8bb1-eeeae1741165


Group UUID : c057dbc5-c16e-11e8-a1a6-825ed9079934


Name : ‘bd-dev-mingshuo-182’


Incoming addr: ‘172.31.217.182:3307’


Version : 4


Flags : 00


Protocols : 0 / 8 / 3


State : NON-PRIMARY


Desync count : 0


Prim state : NON-PRIMARY


Prim UUID : 00000000-0000-0000-0000-000000000000


Prim seqno : -1


First seqno : -1


Last seqno : 33


Prim JOINED : 0


State UUID : 838d4806-c2cb-11e8-8bb1-eeeae1741165


Group UUID : c057dbc5-c16e-11e8-a1a6-825ed9079934


Name : ‘bd-dev-mingshuo-183’


Incoming addr: ‘172.31.217.183:3307’


Version : 4


Flags : 0x2


Protocols : 0 / 8 / 3


State : NON-PRIMARY


Desync count : 0


Prim state : SYNCED


Prim UUID : 19faf204-c2c7-11e8-b642-52dd65ccae43


Prim seqno : 26


First seqno : 33


Last seqno : 33


Prim JOINED : 2


State UUID : 838d4806-c2cb-11e8-8bb1-eeeae1741165


Group UUID : c057dbc5-c16e-11e8-a1a6-825ed9079934


Name : ‘bd-dev-vertica-89’


Incoming addr: ‘172.31.217.89:3307’


2018-09-28T03:05:05.278511Z 0 [Note] WSREP: Partial re-merge of primary 19faf204-c2c7-11e8-b642-52dd65ccae43 found: 1 of 2.


2018-09-28T03:05:05.278520Z 0 [Note] WSREP: Quorum results:


version = 4,


component = PRIMARY,


conf_id = 26,


members = 3/3 (primary/total),


act_id = 33,


last_appl. = 0,


protocols = 0/8/3 (gcs/repl/appl),


group UUID = c057dbc5-c16e-11e8-a1a6-825ed9079934


2018-09-28T03:05:05.278540Z 0 [Note] WSREP: Flow-control interval: [173, 173]


2018-09-28T03:05:05.278544Z 0 [Note] WSREP: Trying to continue unpaused monitor


2018-09-28T03:05:05.278548Z 0 [Note] WSREP: Restored state OPEN -> SYNCED (33)


2018-09-28T03:05:05.278593Z 1 [Note] WSREP: New cluster view: global state: c057dbc5-c16e-11e8-a1a6-825ed9079934:33, view# 27: Primary, number of nodes: 3, my index: 2, protocol version 3


2018-09-28T03:05:05.278612Z 1 [Note] WSREP: Setting wsrep_ready to true


2018-09-28T03:05:05.278621Z 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.


2018-09-28T03:05:05.278661Z 1 [Note] WSREP: REPL Protocols: 8 (3, 2)


2018-09-28T03:05:05.278679Z 1 [Note] WSREP: Assign initial position for certification: 33, protocol version: 3


2018-09-28T03:05:05.278752Z 0 [Note] WSREP: Service thread queue flushed.


2018-09-28T03:05:05.278828Z 1 [Note] WSREP: Synchronized with group, ready for connections


2018-09-28T03:05:05.278849Z 1 [Note] WSREP: Setting wsrep_ready to true


2018-09-28T03:05:05.278863Z 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.


2018-09-28T03:05:05.279134Z 0 [Note] WSREP: Member 0.0 (bd-dev-mingshuo-182) synced with group.


2018-09-28T03:05:05.279179Z 0 [Note] WSREP: Member 1.0 (bd-dev-mingshuo-183) synced with group.


2018-09-28T03:05:07.290328Z 0 [Note] WSREP: (725136c0, ‘tcp://0.0.0.0:4567’) turning message relay requesting off


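How to recover after a split-brain (run this on one node only, the one that should form the new Primary component):


SET GLOBAL wsrep_provider_options='pc.bootstrap=YES';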


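Normal shutdown of all three nodes


Shut down the three nodes one by one, in the order 183, 182, 89.


Before starting them again, be sure to check the contents of each node's grastate.dat.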


Node 183:


root@bd-dev-mingshuo-183:/u01/mysql/3307/data#more grastate.dat


# GALERA saved state


version: 2.1


uuid: c057dbc5-c16e-11e8-a1a6-825ed9079934


seqno: 51


safe_to_bootstrap: 0


Node 182:


root@bd-dev-mingshuo-182:/opt/mysql/3307/data#more grastate.dat


# GALERA saved state


version: 2.1


uuid: c057dbc5-c16e-11e8-a1a6-825ed9079934


seqno: 51


safe_to_bootstrap: 0


Node 89:


root@bd-dev-vertica-89:/opt/mysql/3307/data#more grastate.dat


# GALERA saved state


version: 2.1


uuid: c057dbc5-c16e-11e8-a1a6-825ed9079934


seqno: 51


safe_to_bootstrap: 1


Note: safe_to_bootstrap=1 marks the node that can safely bootstrap the cluster, because it was the last node to leave. So node 89 must be started first.


mysqld_safe --defaults-file=/etc/my.cnf --wsrep-new-cluster &


mysqld_safe --defaults-file=/etc/my3307.cnf &


mysqld_safe --defaults-file=/etc/my3307.cnf &
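The check of the three grastate.dat files can be sketched as a script. This is an illustrative helper, not a PXC tool: the function names parse_grastate and pick_bootstrap_node are hypothetical, and the fallback to the highest seqno when no node is flagged is an assumption about what an operator would do.

```python
def parse_grastate(text):
    """Parse the 'key: value' lines of a grastate.dat file into a dict."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# GALERA saved state' header
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def pick_bootstrap_node(files):
    """files maps node name -> grastate.dat text. Prefer the node flagged
    safe_to_bootstrap: 1; otherwise fall back to the highest seqno."""
    states = {name: parse_grastate(text) for name, text in files.items()}
    flagged = [n for n, s in states.items() if s.get("safe_to_bootstrap") == "1"]
    if flagged:
        return flagged[0]
    return max(states, key=lambda n: int(states[n].get("seqno", "-1")))


TEMPLATE = """# GALERA saved state
version: 2.1
uuid: c057dbc5-c16e-11e8-a1a6-825ed9079934
seqno: {seqno}
safe_to_bootstrap: {flag}
"""

files = {
    "183": TEMPLATE.format(seqno=51, flag=0),
    "182": TEMPLATE.format(seqno=51, flag=0),
    "89": TEMPLATE.format(seqno=51, flag=1),
}
print(pick_bootstrap_node(files))  # 89
```

With the file contents shown above, the helper picks node 89, which is the node started with --wsrep-new-cluster.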


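Question: at startup, does MySQL PXC rely only on safe_to_bootstrap in grastate.dat to decide whether a node may bootstrap the cluster?


This is easy to test. Shut the cluster down as above, then change safe_to_bootstrap in node 183's grastate.dat to 1.


The steps are omitted here, but the cluster does start this way.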
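The manual edit of grastate.dat can also be expressed as a small text transformation. This is a sketch under the assumption that the file has the four-field layout shown earlier; the function name set_safe_to_bootstrap is hypothetical, and it only returns the new text rather than touching any file.

```python
import re


def set_safe_to_bootstrap(text, value):
    """Return grastate.dat text with safe_to_bootstrap set to 0 or 1."""
    if value not in (0, 1):
        raise ValueError("safe_to_bootstrap must be 0 or 1")
    # Match only the flag line; [ \t]* keeps the trailing newline intact.
    new_text, count = re.subn(
        r"(?m)^safe_to_bootstrap:[ \t]*\d+[ \t]*$",
        f"safe_to_bootstrap: {value}",
        text,
    )
    if count != 1:
        raise ValueError("expected exactly one safe_to_bootstrap line")
    return new_text


original = """# GALERA saved state
version: 2.1
uuid: c057dbc5-c16e-11e8-a1a6-825ed9079934
seqno: 51
safe_to_bootstrap: 0
"""
print(set_safe_to_bootstrap(original, 1))
```

Only the flag line changes; uuid and seqno are left as they were, which is exactly why the experiment that follows can lose data when the edited node is not the most advanced one.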


What if data changes after some of the nodes have been shut down?


After shutting down nodes 183 and 182, insert a row on node 89:


mysql> insert into ming.t1 values (16,16);


Query OK, 1 row affected (0.01 sec)


Then shut down node 89. The whole cluster is now down.


Edit grastate.dat on node 183, setting safe_to_bootstrap to 1:


root@bd-dev-mingshuo-183:/u01/mysql/3307/data#more grastate.dat


# GALERA saved state


version: 2.1


uuid: c057dbc5-c16e-11e8-a1a6-825ed9079934


seqno: 51


safe_to_bootstrap: 1


Start the cluster, beginning with node 183:


mysqld_safe --defaults-file=/etc/my3307.cnf --wsrep-new-cluster &


mysql> select * from ming.t1;


+----+------+
| a  | b    |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
|  5 |    5 |
|  6 |    6 |
|  7 |    7 |
|  8 |    8 |
|  9 |    9 |
| 10 |   10 |
| 11 |   11 |
| 12 |   12 |
| 13 |   13 |
| 14 |   14 |
| 15 |   15 |
+----+------+


15 rows in set (0.00 sec)


The row with a=16 has been lost.


Now start node 89 and see whether the lost row can be recovered:


2018-09-28T08:13:33.156722Z 0 [ERROR] WSREP: gcs/src/gcs_group.cpp:group_post_state_exchange():322: Reversing history: 52 -> 51, this member has applied 1 more events than the primary component.Data loss is possible. Aborting.


Node 89's seqno has reached 52, ahead of the other nodes' 51.
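The condition behind the "Reversing history" error can be modelled as a simple comparison. This is a hedged sketch of the idea only, not the code in gcs_group.cpp; the function name check_join and the return value are assumptions made for illustration.

```python
def check_join(local_seqno, group_seqno):
    """Model of the check behind the 'Reversing history' error: a joiner
    whose seqno is ahead of the Primary component must not join."""
    if local_seqno > group_seqno:
        ahead = local_seqno - group_seqno
        raise RuntimeError(
            f"Reversing history: {local_seqno} -> {group_seqno}, "
            f"this member has applied {ahead} more events "
            "than the primary component."
        )
    # Number of writesets the joiner still has to receive.
    return group_seqno - local_seqno


print(check_join(27, 28))  # 1, like the IST of seqnos 27-28 earlier

try:
    check_join(52, 51)  # node 89 joining the cluster bootstrapped from 183
except RuntimeError as err:
    print(err)
```

A joiner that is behind simply catches up, while a joiner that is ahead is rejected, because accepting it would mean silently discarding the transactions only it has applied.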


Change node 89's seqno to 51 (by editing its grastate.dat), then try starting node 89 again:


mysqld_safe --defaults-file=/etc/my.cnf &


mysql> select * from ming.t1;


+----+------+
| a  | b    |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
|  5 |    5 |
|  6 |    6 |
|  7 |    7 |
|  8 |    8 |
|  9 |    9 |
| 10 |   10 |
| 11 |   11 |
| 12 |   12 |
| 13 |   13 |
| 14 |   14 |
| 15 |   15 |
| 16 |   16 |
+----+------+


16 rows in set (0.01 sec)


But node 183 still has only 15 rows.


Delete a row on node 183:


mysql> delete from ming.t1 where a=11;


Query OK, 1 row affected (0.00 sec)


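The row with a=11 is removed on both running nodes.


Start node 182. It starts normally, and after startup its data matches node 183, which suggests that node 183 was chosen as the donor: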


2018-09-28T08:21:19.486736Z 2 [Note] WSREP: Check if state gap can be serviced using IST


2018-09-28T08:21:19.486832Z 2 [Note] WSREP: IST receiver addr using tcp://172.31.217.182:4568


2018-09-28T08:21:19.487026Z 2 [Note] WSREP:

Prepared IST receiver, listening at: tcp://172.31.217.182:4568


2018-09-28T08:21:19.487050Z 2 [Note] WSREP: State gap can be likely serviced using IST. SST request though present would be void.


2018-09-28T08:21:19.487984Z 0 [Note] WSREP: may fallback to sst. ist_seqno [51]


2018-09-28T08:21:19.488006Z 0 [Note] WSREP: Member 2.0 (bd-dev-mingshuo-182) requested state transfer from ‘*any*’.

Selected 0.0 (bd-dev-mingshuo-183)(SYNCED) as donor.


The log shows node 182 acting as the IST receiver, listening on port 4568, with node 183 selected as the donor. So it is no surprise that its data matches node 183.


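The data is now inconsistent across the nodes. How can this be fixed?


Delete the files in the data directory of the inconsistent node, then start it as usual so that it restores a full copy of the data via SST.


When starting a PXC node, you can choose the donor manually with the wsrep_sst_donor option.


Shut down the two nodes and restart them with wsrep_sst_donor set. The commands below are the ones used in this experiment and pass the IP address of node 89. Note that wsrep_sst_donor is documented as taking the donor's wsrep_node_name, which here would be bd-dev-vertica-89.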


mysqld_safe --defaults-file=/etc/my3307.cnf --wsrep_sst_donor=172.31.217.89 &


mysqld_safe --defaults-file=/etc/my3307.cnf --wsrep_sst_donor=172.31.217.89 &


