3. Redis Cluster Features: Fault Tolerance and Data Migration

Source: Internet  Published: 软件系统业务流程图  Editor: 程序博客网  Date: 2024/06/05 11:28

Preface:

This installment covers Redis fault tolerance and data migration (horizontal scaling).


Redis cluster information

In the previous chapter, adding the nodes to the cluster printed the following log:

[root@localhost src]# ./redis-trib.rb create --replicas 1 192.168.1.103:7000 192.168.1.103:7001 192.168.1.103:7002 192.168.1.103:7003 192.168.1.103:7004 192.168.1.103:7005
>>> Creating cluster
Connecting to node 192.168.1.103:7000: OK
Connecting to node 192.168.1.103:7001: OK
Connecting to node 192.168.1.103:7002: OK
Connecting to node 192.168.1.103:7003: OK
Connecting to node 192.168.1.103:7004: OK
Connecting to node 192.168.1.103:7005: OK
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.1.103:7000
192.168.1.103:7001
192.168.1.103:7002
Adding replica 192.168.1.103:7003 to 192.168.1.103:7000
Adding replica 192.168.1.103:7004 to 192.168.1.103:7001
Adding replica 192.168.1.103:7005 to 192.168.1.103:7002
M: 4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000
   slots:0-5460 (5461 slots) master
M: f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001
   slots:5461-10922 (5462 slots) master
M: 7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002
   slots:10923-16383 (5461 slots) master
S: 778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003
   replicates 4bc092eb4731152d15172b065c74c7a795fe6304
S: 907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004
   replicates f37ec54101536425ce8798e041ad75a582d7e153
S: b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005
   replicates 7b0ca3978858454051ad572aa816eec450f31a53
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join...
>>> Performing Cluster Check (using node 192.168.1.103:7000)
M: 4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000
   slots:0-5460 (5461 slots) master
M: f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001
   slots:5461-10922 (5462 slots) master
M: 7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002
   slots:10923-16383 (5461 slots) master
M: 778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003
   slots: (0 slots) master
   replicates 4bc092eb4731152d15172b065c74c7a795fe6304
M: 907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004
   slots: (0 slots) master
   replicates f37ec54101536425ce8798e041ad75a582d7e153
M: b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005
   slots: (0 slots) master
   replicates 7b0ca3978858454051ad572aa816eec450f31a53
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

From this output we can extract the following information:


Node   Node ID                                    Role     Master node   Slots
7000   4bc092eb4731152d15172b065c74c7a795fe6304   master   -             0-5460
7001   f37ec54101536425ce8798e041ad75a582d7e153   master   -             5461-10922
7002   7b0ca3978858454051ad572aa816eec450f31a53   master   -             10923-16383
7003   778e649f47fa98f6d1f6b1f1043812c6685dc4a8   slave    7000          -
7004   907feee1b665554cadc64921c7fcb8c05b8a5ab6   slave    7001          -
7005   b2bea8ede402e2112cced7d7cea52127f18edef2   slave    7002          -

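A Redis cluster provides 16384 slots. A slot can be thought of as a storage unit, and a single slot can hold many keys. In a cluster, all 16384 slots are assigned to the master nodes, which also explains why write operations cannot be performed on slave nodes.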
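To make the key-to-slot relationship concrete: Redis Cluster maps a key to a slot as CRC16(key) mod 16384, and when the key contains a non-empty hash tag such as {user1000}, only the text inside the braces is hashed. The sketch below illustrates that rule in Python. The function names crc16_xmodem and key_slot are made up for this example and are not part of Redis or of any client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) of a key, honouring {hash tags}."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag replaces the key as the hashed text.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    for key in ("foo", "{user1000}.following", "{user1000}.followers"):
        print(key, "->", key_slot(key))
```

On a live cluster, cluster keyslot <key> should report the same number, which is an easy way to cross-check the sketch.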


Connect to any node and query the cluster information:

192.168.1.103:7001> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
cluster_stats_messages_sent:10436
cluster_stats_messages_received:10106
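If you want to check this output from a script rather than by eye, the name:value lines are easy to parse. The following is a minimal sketch that works on the text of cluster info as shown above. The helper names parse_cluster_info and cluster_is_healthy, and the choice of which fields count as "healthy", are assumptions made for this example.

```python
def parse_cluster_info(text: str) -> dict:
    """Turn `cluster info` output into a dict; numeric values become ints."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip prompts and blank lines; every field starts with "cluster_".
        if not line.startswith("cluster_") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        info[name] = int(value) if value.isdigit() else value
    return info


def cluster_is_healthy(info: dict, total_slots: int = 16384) -> bool:
    """Assumed health rule: state ok, every slot served, no failed slots."""
    return (
        info.get("cluster_state") == "ok"
        and info.get("cluster_slots_ok") == total_slots
        and info.get("cluster_slots_fail") == 0
    )


if __name__ == "__main__":
    sample = """cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3"""
    info = parse_cluster_info(sample)
    print(info["cluster_known_nodes"], cluster_is_healthy(info))
```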

Query the cluster nodes:

192.168.1.103:7001> cluster nodes
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439569459017 4 connected
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439569460528 3 connected 10923-16383
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439569461031 5 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439569460025 1 connected 0-5460
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439569459520 6 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 myself,master - 0 0 2 connected 5461-10922
192.168.1.103:7001> 

This command is an effective way to inspect the state of the cluster.
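Each line of cluster nodes has the same field order: node ID, address, flags, master ID (or - for a master), ping sent, pong received, config epoch, link state, and then any slot ranges. A small parser along the following lines can turn that text into records. This is only a sketch: the Node type and parse_cluster_nodes are names invented here, and the split on "@" is an assumption that covers newer Redis versions, which append a cluster bus port to the address.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    node_id: str
    addr: str                 # "ip:port"
    flags: Tuple[str, ...]    # e.g. ("myself", "master")
    master_id: str            # "-" when the node is itself a master
    config_epoch: int
    link_state: str           # "connected" or "disconnected"
    slots: Tuple[str, ...]    # raw slot tokens such as "0-5460"


def parse_cluster_nodes(text: str) -> List[Node]:
    """Parse the text printed by `cluster nodes` into Node records."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 8:    # prompt lines and blanks have fewer fields
            continue
        nodes.append(Node(
            node_id=parts[0],
            addr=parts[1].split("@")[0],
            flags=tuple(parts[2].split(",")),
            master_id=parts[3],
            config_epoch=int(parts[6]),
            link_state=parts[7],
            slots=tuple(parts[8:]),
        ))
    return nodes


if __name__ == "__main__":
    sample = (
        "7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 "
        "master - 0 1439569460528 3 connected 10923-16383"
    )
    for node in parse_cluster_nodes(sample):
        print(node.addr, node.flags, node.slots)
```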


Cluster fault tolerance: election

Adding node node-7006

Start the node:

[root@localhost my-redis-cluster]# ls
node-7000.conf  node-7001.conf  node-7002.conf  node-7003.conf  node-7004.conf  node-7005.conf  node-7006.conf
[root@localhost my-redis-cluster]# redis-server  node-7006.conf 
[root@localhost my-redis-cluster]# ps -ef | grep redis
root      2382     1  0 8月14 ?       00:00:05 redis-server *:7000 [cluster]
root      2394     1  0 8月14 ?       00:00:05 redis-server *:7001 [cluster]
root      2398     1  0 8月14 ?       00:00:05 redis-server *:7002 [cluster]
root      2402     1  0 8月14 ?       00:00:05 redis-server *:7003 [cluster]
root      2408     1  0 8月14 ?       00:00:05 redis-server *:7004 [cluster]
root      2414     1  0 8月14 ?       00:00:05 redis-server *:7005 [cluster]
root      2929     1  0 00:29 ?        00:00:00 redis-server *:7006 [cluster]
root      2933  2248  0 00:29 pts/0    00:00:00 grep --color=auto redis

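At this point the node is not yet part of the cluster. To add it, use cluster meet <ip> <port>. Note that this command is run from a client (redis-cli) connected to a node that already belongs to the cluster.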

 

192.168.1.103:7001> cluster meet 192.168.1.103 7006
OK

Query the cluster nodes again:

192.168.1.103:7001> cluster nodes
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439570083805 4 connected
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439570085318 3 connected 10923-16383
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439570085822 5 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439570085822 1 connected 0-5460
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439570084814 6 connected
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 master - 0 1439570084309 0 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 myself,master - 0 0 2 connected 5461-10922


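A newly joined node is a master by default. Because this master has not been assigned any slots, it can be attached to another master and serve as one of that master's slaves.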


Next, let's see how to attach 7006 to 7000.

Connect a client to 7006:

mac:bin lkl$ ./redis-cli -c -h 192.168.1.103 -p 7006


Run cluster replicate <master-node-id>:

192.168.1.103:7006> cluster replicate 4bc092eb4731152d15172b065c74c7a795fe6304
OK

Query cluster nodes again:

192.168.1.103:7006> cluster nodes
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439570605532 2 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439570604025 1 connected 0-5460
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439570603524 1 connected
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439570604528 3 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439570603524 2 connected 5461-10922
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439570604025 3 connected 10923-16383
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 myself,slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 0 0 connected

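To improve fault tolerance, when a master goes down, the slaves under that master hold an election on their own. The winning slave is promoted to master and takes over the data of the failed master.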


Let's now walk through a Redis election.


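In production it is generally recommended to use an odd number of nodes, to improve fault tolerance. In Redis Cluster the votes that promote a slave are cast by the masters, so it is the number of masters that matters most here.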
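The reasoning behind the odd-number advice is plain majority arithmetic: a slave is promoted only after a majority of the masters vote for it, so going from 3 to 4 masters raises the majority without letting the cluster survive any more master failures. Here is a tiny sketch of that arithmetic. The function names are invented for the example, and it looks only at vote counting, not at slot coverage or other failure conditions.

```python
def votes_needed(masters: int) -> int:
    """Smallest number of master votes that forms a strict majority."""
    return masters // 2 + 1


def tolerated_master_failures(masters: int) -> int:
    """How many masters can be down while a majority can still vote."""
    return masters - votes_needed(masters)


if __name__ == "__main__":
    # 3 masters: majority 2, tolerates 1
    # 4 masters: majority 3, tolerates 1 (no gain over 3)
    # 5 masters: majority 3, tolerates 2
    for n in (3, 4, 5):
        print(n, votes_needed(n), tolerated_master_failures(n))
```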

7000 now has two slaves, 7003 and 7006. Here we add one more, 7007, under 7000. The steps are the same as before, so only the results are shown:

[root@localhost redis-3.0.3]# cd my-redis-cluster/
[root@localhost my-redis-cluster]# ls
node-7000.conf  node-7001.conf  node-7002.conf  node-7003.conf  node-7004.conf  node-7005.conf  node-7006.conf  node-7007.conf
[root@localhost my-redis-cluster]# redis-server node-7007.conf 
[root@localhost my-redis-cluster]# ps -ef | grep redis
root      2382     1  0 8月14 ?       00:00:07 redis-server *:7000 [cluster]
root      2394     1  0 8月14 ?       00:00:07 redis-server *:7001 [cluster]
root      2398     1  0 8月14 ?       00:00:07 redis-server *:7002 [cluster]
root      2402     1  0 8月14 ?       00:00:07 redis-server *:7003 [cluster]
root      2408     1  0 8月14 ?       00:00:07 redis-server *:7004 [cluster]
root      2414     1  0 8月14 ?       00:00:07 redis-server *:7005 [cluster]
root      2929     1  0 00:29 ?        00:00:01 redis-server *:7006 [cluster]
root      3128     1  0 00:47 ?        00:00:00 redis-server *:7007 [cluster]
root      3132  2248  0 00:48 pts/0    00:00:00 grep --color=auto redis

192.168.1.103:7006> cluster meet 192.168.1.103 7007
OK
192.168.1.103:7006> cluster nodes
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439571019922 2 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439571020927 1 connected 0-5460
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439571020424 1 connected
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439571020927 3 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439571019416 2 connected 5461-10922
054b9e2efd44f8704d1aa6851f75f316911b7b4d 192.168.1.103:7007 master - 0 1439571019214 0 connected
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439571020927 3 connected 10923-16383
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 myself,slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 0 0 connected

mac:bin lkl$ ./redis-cli -c -h 192.168.1.103 -p 7007
192.168.1.103:7007> cluster replicate 4bc092eb4731152d15172b065c74c7a795fe6304
OK
192.168.1.103:7007> cluster nodes
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439571183241 3 connected
054b9e2efd44f8704d1aa6851f75f316911b7b4d 192.168.1.103:7007 myself,slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 0 0 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439571182232 1 connected 0-5460
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439571183241 3 connected 10923-16383
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439571182736 2 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439571184249 2 connected 5461-10922
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439571182736 1 connected
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439571183745 1 connected

After these steps, 7007 is attached to 7000 as a slave.

Simulate a crash of 7000 by killing its process directly:

[root@localhost my-redis-cluster]# ps -ef | grep redis
root      2382     1  0 8月14 ?       00:00:07 redis-server *:7000 [cluster]
root      2394     1  0 8月14 ?       00:00:07 redis-server *:7001 [cluster]
root      2398     1  0 8月14 ?       00:00:07 redis-server *:7002 [cluster]
root      2402     1  0 8月14 ?       00:00:07 redis-server *:7003 [cluster]
root      2408     1  0 8月14 ?       00:00:07 redis-server *:7004 [cluster]
root      2414     1  0 8月14 ?       00:00:07 redis-server *:7005 [cluster]
root      2929     1  0 00:29 ?        00:00:01 redis-server *:7006 [cluster]
root      3128     1  0 00:47 ?        00:00:00 redis-server *:7007 [cluster]
root      3132  2248  0 00:48 pts/0    00:00:00 grep --color=auto redis
[root@localhost my-redis-cluster]# kill -9 2382


After waiting a short while, connect a client to any node and query the state of the cluster:

mac:bin lkl$ ./redis-cli -c -h 192.168.1.103 -p 7003
192.168.1.103:7003> cluster nodes
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439571326184 3 connected 10923-16383
054b9e2efd44f8704d1aa6851f75f316911b7b4d 192.168.1.103:7007 slave c75edec9024b2ef4397b70fde2d5227aa9135900 0 1439571324671 7 connected
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 master - 0 1439571324671 7 connected 0-5460
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 myself,slave c75edec9024b2ef4397b70fde2d5227aa9135900 0 0 4 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439571324671 2 connected 5461-10922
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439571325177 6 connected
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439571326183 5 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master,fail - 1439571226713 1439571225807 1 disconnected


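As you can see, the slaves held an election on their own: 7006 took over from 7000 and now owns slots 0-5460. The other two slaves of 7000 (7003 and 7007) now replicate 7006, and 7000 itself is marked master,fail.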
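The same conclusion can be drawn programmatically from the cluster nodes text: look for nodes carrying the fail flag, then see which live master lists the slot range. The sketch below does just that on a captured string. The names slot_ranges, summarize and owner_of are hypothetical helpers written for this post, and tokens in square brackets (slots in the middle of a migration) are simply skipped.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[int, int]


def slot_ranges(tokens: List[str]) -> List[Range]:
    """Convert tokens such as "0-5" or "6" into (low, high) pairs."""
    ranges = []
    for tok in tokens:
        if tok.startswith("["):      # migrating/importing marker, ignored here
            continue
        lo, _, hi = tok.partition("-")
        ranges.append((int(lo), int(hi or lo)))
    return ranges


def summarize(output: str) -> Tuple[List[str], Dict[str, List[Range]]]:
    """Return (addresses flagged fail, slot ranges per live master)."""
    failed, owners = [], {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue
        addr = parts[1].split("@")[0]
        flags = parts[2].split(",")
        if "fail" in flags:
            failed.append(addr)
        elif "master" in flags:
            owners[addr] = slot_ranges(parts[8:])
    return failed, owners


def owner_of(slot: int, owners: Dict[str, List[Range]]) -> Optional[str]:
    """Address of the live master that serves the slot, if any."""
    for addr, ranges in owners.items():
        if any(lo <= slot <= hi for lo, hi in ranges):
            return addr
    return None
```

Feeding it the output captured from 7003 above, summarize should report 192.168.1.103:7000 as failed, and owner_of(0, owners) should point at 192.168.1.103:7006.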


Data migration: moving slots

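As the business grows, the load on the Redis nodes grows with it. A Redis cluster can be scaled out horizontally by adding new master-slave groups that take over part of the load from the existing nodes. Because Redis Cluster stores data in slots, slots holding live Redis data can be migrated to the newly added master-slave group. The following shows how to work with slots.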


Move a given slot to the master with a given node ID:

192.168.1.103:7006> cluster setslot  6 node  7b0ca3978858454051ad572aa816eec450f31a53
OK
192.168.1.103:7006> cluster nodes
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439574079852 2 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master,fail - 1439571226672 1439571224258 1 disconnected
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave c75edec9024b2ef4397b70fde2d5227aa9135900 0 1439574077836 7 connected
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439574078342 3 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439574079852 2 connected 5461-10922
054b9e2efd44f8704d1aa6851f75f316911b7b4d 192.168.1.103:7007 slave c75edec9024b2ef4397b70fde2d5227aa9135900 0 1439574078342 7 connected
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439574078342 3 connected 6 10923-16383
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 myself,master - 0 0 7 connected 0-5 7-5460

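This is only an example: slot 6 now belongs to 7002, and 7006 keeps 0-5 and 7-5460. Note that cluster setslot <slot> node only changes which node owns the slot; it does not move any keys. Migration involves a number of other operations, which you can look up on your own.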
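For a slot that already holds keys, the sequence usually described in the Redis documentation is: mark the slot as importing on the destination, mark it as migrating on the source, copy the keys over with MIGRATE, and only then assign the slot with cluster setslot ... node. The sketch below just builds that list of commands as strings so the order is visible; it does not talk to Redis. The function name migration_plan, the batch size and the timeout are assumptions chosen for illustration, and <key> is a placeholder for each key returned by CLUSTER GETKEYSINSLOT.

```python
from typing import List, Tuple


def migration_plan(
    slot: int,
    src_addr: str,
    src_id: str,
    dst_addr: str,
    dst_id: str,
    batch: int = 100,        # assumed batch size for GETKEYSINSLOT
    timeout_ms: int = 5000,  # assumed MIGRATE timeout
) -> List[Tuple[str, str]]:
    """Return (node address, command) pairs in the order they would be sent."""
    dst_host, dst_port = dst_addr.rsplit(":", 1)
    return [
        # 1. Destination starts accepting keys for the slot.
        (dst_addr, f"CLUSTER SETSLOT {slot} IMPORTING {src_id}"),
        # 2. Source marks the slot as being moved away.
        (src_addr, f"CLUSTER SETSLOT {slot} MIGRATING {dst_id}"),
        # 3. Repeat these two until GETKEYSINSLOT returns no keys;
        #    MIGRATE is issued once per returned key.
        (src_addr, f"CLUSTER GETKEYSINSLOT {slot} {batch}"),
        (src_addr, f"MIGRATE {dst_host} {dst_port} <key> 0 {timeout_ms}"),
        # 4. Finalize ownership, destination first, then source.
        (dst_addr, f"CLUSTER SETSLOT {slot} NODE {dst_id}"),
        (src_addr, f"CLUSTER SETSLOT {slot} NODE {dst_id}"),
    ]


if __name__ == "__main__":
    plan = migration_plan(
        slot=6,
        src_addr="192.168.1.103:7006",
        src_id="c75edec9024b2ef4397b70fde2d5227aa9135900",
        dst_addr="192.168.1.103:7002",
        dst_id="7b0ca3978858454051ad572aa816eec450f31a53",
    )
    for addr, command in plan:
        print(f"{addr}> {command}")
```

In practice redis-trib.rb reshard automates this sequence, so the manual commands are mostly useful for understanding what happens underneath.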












