RabbitMQ Queue High-Availability Policies

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 17:54

1、 RabbitMQ will keep the existing master around until at least one other slave has synchronised (even if this is a long time). However, once synchronisation has occurred things will proceed just as if the node had failed: consumers will be disconnected from the master and will need to reconnect.

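Once a queue is made highly available, RabbitMQ waits until at least one slave has finished synchronising before the change takes effect; consumers are then disconnected from the old master and must reconnect.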

2、If you stop a RabbitMQ node which contains the master of a mirrored queue, some slave on some other node will be promoted to the master (assuming there is a synchronised slave; see below). If you continue to stop nodes then you will reach a point where a mirrored queue has no more slaves: it exists only on one node, which is now its master. If the mirrored queue was declared durable then, if its last remaining node is shutdown, durable messages in the queue will survive the restart of that node.

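If you keep shutting down the nodes that host the queue, the last remaining node ends up as the master; provided the queue is durable, its durable messages are still there after that node restarts.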

3、It's possible that when you shut down a master node that all available slaves are unsynchronised. A common situation in which this can occur is rolling cluster upgrades. By default, RabbitMQ will refuse to fail over to an unsynchronised slave on controlled master shutdown (i.e. explicit stop of the RabbitMQ service or shutdown of the OS) in order to avoid message loss; instead the entire queue will shut down as if the unsynchronised slaves were not there. An uncontrolled master shutdown (i.e. server or node crash, or network outage) will still trigger a failover even to an unsynchronised slave.

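Whether a slave can take over as master after the master's node goes down depends on how that node went down. If the machine crashed or the network failed, a slave takes over from the master even if it is unsynchronised; after a controlled shutdown, an unsynchronised slave is not promoted by default.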
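The rule quoted above is easy to get wrong, so here is a minimal sketch of it as a pure function. It only models the decision as the documentation describes it; the function name and parameters are made up for illustration and are not part of RabbitMQ. The `promote_on_shutdown` parameter stands for the `ha-promote-on-shutdown` policy key, with its documented default of `when-synced`.

```python
def fails_over(controlled_shutdown: bool,
               synced_slaves: int,
               unsynced_slaves: int,
               promote_on_shutdown: str = "when-synced") -> bool:
    """Return True if some slave is promoted when the master's node goes away.

    Hypothetical model of the documented behaviour, not RabbitMQ code.
    """
    if synced_slaves > 0:
        # A synchronised slave can always be promoted.
        return True
    if unsynced_slaves == 0:
        # No slaves at all: there is nothing to promote.
        return False
    if not controlled_shutdown:
        # Crash or network outage: failover happens even to an unsynchronised slave.
        return True
    # Controlled shutdown with only unsynchronised slaves:
    # refused by default, allowed when the policy says "always".
    return promote_on_shutdown == "always"


# Rolling upgrade case: the master is stopped cleanly, slaves are not yet in sync.
print(fails_over(controlled_shutdown=True, synced_slaves=0, unsynced_slaves=2))
```

With the default policy, the rolling-upgrade call at the bottom reports no failover, which is the "entire queue will shut down" case from the quote.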

If you would prefer to have master nodes fail over to unsynchronised slaves in all circumstances (i.e. you would choose availability of the queue over avoiding message loss) then you can set the ha-promote-on-shutdown policy key to always rather than its default value of when-synced.

You can configure the queue so that a slave takes over from the master no matter why the master went down.
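As a sketch of how that policy key might be set, the snippet below builds a `rabbitmqctl set_policy` command line from a Python dict. The policy name `ha-all` and the queue-name pattern `^ha\.` are assumptions chosen for the example; only `ha-promote-on-shutdown` and its values come from the text above, and `ha-mode: all` is the usual companion key for mirroring to every node.

```python
import json
import shlex


def set_policy_command(name: str, pattern: str, definition: dict) -> str:
    """Build a shell-safe `rabbitmqctl set_policy <name> <pattern> <definition>` line."""
    return shlex.join(
        ["rabbitmqctl", "set_policy", name, pattern, json.dumps(definition)]
    )


# Hypothetical policy: mirror queues named "ha.*" everywhere and always promote a slave.
definition = {"ha-mode": "all", "ha-promote-on-shutdown": "always"}
command = set_policy_command("ha-all", "^ha\\.", definition)
print(command)
```

Building the JSON with `json.dumps` and quoting with `shlex.join` avoids hand-escaping the definition, which is where these commands usually go wrong.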

4、It is possible to lose the master for a queue while all slaves for the queue are shut down. In normal operation the last node for a queue to shut down will become the master, and we want that node to still be the master when it starts again (since it may have received messages that no other slave saw).

However, when you invoke rabbitmqctl forget_cluster_node, RabbitMQ will attempt to find a currently stopped slave for each queue which has its master on the node we are forgetting, and "promote" that slave to be the new master when it starts up again. If there is more than one candidate, the most recently stopped slave will be chosen.

It's important to understand that RabbitMQ can only promote stopped slaves during forget_cluster_node, since any slaves that are started again will clear out their contents as described at "stopping nodes and synchronisation" above. Therefore when removing a lost master in a stopped cluster, you must invoke rabbitmqctl forget_cluster_node before starting slaves again.

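When all the nodes are down and the master is lost, rabbitmqctl forget_cluster_node can promote the most recently stopped slave to master, as long as it runs before that slave is started again.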
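The selection rule ("most recently stopped slave, and only slaves that are still stopped") can be sketched as follows. This is only an illustration of the rule as quoted; the `Slave` type, its fields, and the node names are hypothetical and say nothing about how RabbitMQ stores this information internally.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Slave:
    node: str
    # Time the slave was stopped; None means it has been started again
    # and has therefore already cleared its contents.
    stopped_at: Optional[float]


def pick_new_master(slaves: Sequence[Slave]) -> Optional[Slave]:
    """Pick the slave that forget_cluster_node would promote, per the quoted rule."""
    candidates = [s for s in slaves if s.stopped_at is not None]
    if not candidates:
        return None  # no stopped slave left to promote
    return max(candidates, key=lambda s: s.stopped_at)


slaves = [
    Slave("rabbit@a", stopped_at=100.0),
    Slave("rabbit@b", stopped_at=250.0),
    Slave("rabbit@c", stopped_at=None),  # restarted too early, no longer eligible
]
chosen = pick_new_master(slaves)
print(chosen.node if chosen else "no candidate")
```

The `rabbit@c` entry shows why the order of operations matters: a slave that was started before `forget_cluster_node` ran is out of the running.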

5、As discussed, for each mirrored queue there is one master and several slaves, each on a different node. The slaves apply the operations that occur to the master in exactly the same order as the master and thus maintain the same state. All actions other than publishes go only to the master, and the master then broadcasts the effect of the actions to the slaves. Thus clients consuming from a mirrored queue are in fact consuming from the master.

Apart from publishes, which reach the master and the slaves alike, every other action is applied to the master first, and the master then broadcasts its effect to the slaves.
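A toy model makes the two paths visible. It is a deliberate simplification under my own assumptions: the class and method names are invented, consuming is collapsed into a single step (real consumption is delivery followed by an ack), and there is no networking at all.

```python
from collections import deque


class MirroredQueue:
    """Toy model: one master and N slaves that stay in the same state."""

    def __init__(self, slave_count: int) -> None:
        self.master = deque()
        self.slaves = [deque() for _ in range(slave_count)]

    def publish(self, message: str) -> None:
        # Publishes reach the master and every slave.
        for mirror in (self.master, *self.slaves):
            mirror.append(message)

    def consume(self) -> str:
        # Consuming goes to the master only ...
        message = self.master.popleft()
        # ... and the master broadcasts the effect (removal) to the slaves,
        # which apply it in the same order.
        for slave in self.slaves:
            slave.popleft()
        return message


queue = MirroredQueue(slave_count=2)
queue.publish("a")
queue.publish("b")
first = queue.consume()
print(first, list(queue.master), [list(s) for s in queue.slaves])
```

Because every mirror applies the same operations in the same order, the slaves end up with exactly the master's remaining contents.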

6、Mirrored queues support both Publisher Confirms and Transactions. The semantics chosen are that in the case of both confirms and transactions, the action spans all mirrors of the queue. So in the case of a transaction, a tx.commit-ok will only be returned to a client when the transaction has been applied across all mirrors of the queue. Equally, in the case of publisher confirms, a message will only be confirmed to the publisher when it has been accepted by all of the mirrors. It is correct to think of the semantics as being the same as a message being routed to multiple normal queues, and of a transaction with publications within that similarly are routed to multiple queues.

With publisher confirms enabled, a message is confirmed to the publisher only after all of the mirrors have accepted it.
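On the client side nothing special is needed for mirrored queues; you just turn confirms on. The sketch below assumes `channel` is a pika 1.x `BlockingChannel` (or anything with the same `confirm_delivery` and `basic_publish` methods); the helper names and the queue name are made up. As far as I know, in that client a publish on a confirming channel blocks until the broker answers and raises an exception on a nack or an unroutable message, so a normal return means the message was confirmed.

```python
def enable_confirms(channel) -> None:
    """Put the channel into publisher-confirm mode (call once per channel)."""
    channel.confirm_delivery()


def publish_confirmed(channel, queue_name: str, body: bytes) -> None:
    """Publish via the default exchange straight to the named queue.

    Assumption: with a pika BlockingChannel in confirm mode this returns only
    after the broker has confirmed the message; for a mirrored queue that
    means all mirrors have accepted it. Exceptions are left to the caller.
    """
    channel.basic_publish(
        exchange="",             # default exchange routes by queue name
        routing_key=queue_name,
        body=body,
        mandatory=True,          # ask the broker to report unroutable messages
    )
```

Typical use would be `enable_confirms(channel)` right after opening the channel, then `publish_confirmed(channel, "ha.orders", b"...")` for each message. Expect confirms on a mirrored queue to be slower than on a plain one, since every mirror is involved.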

7、Clients that are consuming from a mirrored queue may wish to know that the queue from which they have been consuming has failed over. When a mirrored queue fails over, knowledge of which messages have been sent to which consumer is lost, and therefore all unacknowledged messages are redelivered with the redelivered flag set. Consumers may wish to know this is going to happen.

If so, they can consume with the argument x-cancel-on-ha-failover set to true. Their consuming will then be cancelled on failover and a consumer cancellation notification sent. It is then the consumer's responsibility to reissue basic.consume to start consuming again.

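After a mirrored queue fails over, unacknowledged messages are redelivered with the redelivered flag set. If a consumer has set x-cancel-on-ha-failover to true, then when the master dies and a slave is promoted, the consumer receives a cancellation notification and its consuming is cancelled; it is then up to the consumer to issue basic.consume again.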
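A consumer along these lines might look like the sketch below. It again assumes a pika 1.x `BlockingChannel`, whose `basic_consume` accepts an `arguments` dict and whose `add_on_cancel_callback` registers a handler for broker-sent cancellations; the function names are my own. The message handler treats the redelivered flag as a hint that the message may already have been processed before the failover.

```python
def consume_with_failover_cancel(channel, queue_name, on_message, on_cancel):
    """Start consuming and ask the broker to cancel this consumer on failover."""
    # Called when the broker sends a consumer cancellation notification.
    channel.add_on_cancel_callback(on_cancel)
    # Returns the consumer tag; the argument name comes from the quoted docs.
    return channel.basic_consume(
        queue=queue_name,
        on_message_callback=on_message,
        arguments={"x-cancel-on-ha-failover": True},
    )


def on_message(channel, method, properties, body):
    """Handle one delivery, flagging possible duplicates after a failover."""
    if method.redelivered:
        # Unacknowledged messages come back with this flag set after failover.
        print("possible duplicate:", body)
    channel.basic_ack(delivery_tag=method.delivery_tag)
```

In the cancel callback the consumer would call `consume_with_failover_cancel` again, since the broker does not resume consumption on its own.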

0 0