(Notes) Understanding and learning the distributed consensus protocol: Raft

Source: Internet | Published by: 结对编程恶搞 | Editor: 程序博客网 | Time: 2024/05/18 03:52

1. Understanding and learning the distributed consensus protocol: Raft

Key points: distributed, consistency, protocol

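1.1 What is distributed consensus? An example

Note: a "node" here means a server-side node as seen from the client (such a node can store data).
With a single node: think about how a client updates (or stores) a value.
With multiple nodes, every node must end up agreeing on the same value. How to guarantee that is the distributed consensus problem.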
Raft is a protocol for implementing distributed consensus.

1.2 A node can be in 1 of 3 states:

At any moment a node is in exactly one of these three states:
The Follower state,
the Candidate state,
or the Leader state.
All our nodes start in the follower state.
If followers don’t hear from a leader then they can become a candidate. # no heartbeat arrives from the leader: either there is no leader at all, or the leader has become unreachable
The candidate then requests votes from other nodes.
Nodes will reply with their vote.
The candidate becomes the leader if it gets votes from a majority of nodes.
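This process is called Leader Election. # one node requesting votes is easy to follow, but what if several nodes request votes at the same time? (see the split vote case in 1.3)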
All changes to the system now go through the leader.
Each change is added as an entry in the node’s log. # phase one
This log entry is currently uncommitted so it won’t update the node’s value.
To commit the entry the node first replicates it to the follower nodes…
then the leader waits until a majority of nodes have written the entry. # phase two
The entry is now committed on the leader node and the node state is “5”.
The leader then notifies the followers that the entry is committed.
The cluster has now come to consensus about the system state.
This process is called Log Replication.
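
The two phases above can be sketched roughly as follows. This is a minimal in-memory illustration, not a real Raft implementation: the `Entry` and `Leader` classes and their methods are hypothetical names, and there is no networking, persistence, or term handling.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    command: int                            # hypothetical command: "set the value to this number"
    acks: set = field(default_factory=set)  # ids of nodes that have written the entry
    committed: bool = False


class Leader:
    def __init__(self, node_id, cluster_size):
        self.node_id = node_id
        self.cluster_size = cluster_size
        self.log = []
        self.value = None  # node value; only committed entries may change it

    def propose(self, command):
        # Phase one: append to the leader's own log. The entry is uncommitted,
        # so the node value is left untouched.
        entry = Entry(command, acks={self.node_id})
        self.log.append(entry)
        return entry

    def on_ack(self, entry, follower_id):
        # Phase two: commit once a majority of the cluster has written the entry.
        entry.acks.add(follower_id)
        if not entry.committed and len(entry.acks) > self.cluster_size // 2:
            entry.committed = True
            self.value = entry.command


leader = Leader("A", cluster_size=3)
entry = leader.propose(5)
print(leader.value)       # still None: the entry is uncommitted
leader.on_ack(entry, "B")
print(leader.value)       # 5: two of three nodes have written the entry
```

Counting the leader's own copy in `acks` is a choice made for this sketch; it keeps the majority test a simple comparison against the cluster size.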

1.3 Elections:

In Raft there are two timeout settings which control elections.
First is the election timeout.
The election timeout is the amount of time a follower waits until becoming a candidate. # i.e. how long a follower tolerates not hearing from the leader
The election timeout is randomized to be between 150ms and 300ms.
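
A small sketch of why the randomization helps: each follower draws its own timeout, so one of them usually fires first. The function names here are made up for illustration; only the 150ms to 300ms range comes from the text above.

```python
import random

ELECTION_TIMEOUT_MIN_MS = 150
ELECTION_TIMEOUT_MAX_MS = 300


def new_election_timeout(rng=random):
    # Each follower draws its own timeout independently.
    return rng.uniform(ELECTION_TIMEOUT_MIN_MS, ELECTION_TIMEOUT_MAX_MS)


def first_to_time_out(node_ids, rng=random):
    # The node with the shortest timeout is the first to become a candidate.
    timeouts = {node: new_election_timeout(rng) for node in node_ids}
    return min(timeouts, key=timeouts.get), timeouts


candidate, timeouts = first_to_time_out(["A", "B", "C"])
print(candidate, timeouts)
```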
After the election timeout the follower becomes a candidate and starts a new election term… # the follower starts broadcasting "vote for me" requests
…votes for itself…
…and sends out Request Vote messages to other nodes.
If the receiving node hasn’t voted yet in this term then it votes for the candidate… # a node that receives the request simply replies with its vote
…and the node resets its election timeout.
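
The "one vote per term" rule can be sketched like this. The `Node` class is hypothetical, and `timeout_resets` is only a counter standing in for restarting a real timer. Real Raft also compares how up to date the candidate's log is before granting a vote; that check is left out because these notes do not cover it.

```python
class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.current_term = 0
        self.voted_for = None
        self.timeout_resets = 0  # stand-in for restarting the election timer

    def on_request_vote(self, candidate_id, term):
        if term < self.current_term:
            return False                 # request from an old term: reject
        if term > self.current_term:
            self.current_term = term     # a newer term starts with no vote cast
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            self.timeout_resets += 1     # granting a vote resets the election timeout
            return True
        return False                     # already voted for someone else this term


node = Node("B")
print(node.on_request_vote("A", term=1))  # True: first request in term 1
print(node.on_request_vote("C", term=1))  # False: B already voted in term 1
```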
Once a candidate has a majority of votes it becomes leader.
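The leader begins sending out Append Entries messages to its followers. # what a node does once it becomes leader: replicate its log (Append Entries)
These messages are sent in intervals specified by the heartbeat timeout. # open question: if there is a lot of data to replicate and a client reads from a follower at this point, won't it get stale ("dirty") data? Or are requests refused during the election? Or are they refused from the start of the election until replication has finished?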
Followers then respond to each Append Entries message.
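This election term will continue until a follower stops receiving heartbeats and becomes a candidate. # as long as the network and everything else stay healthy, this term (like a government's term in office) remains in effect and the cluster can operate and serve requests
Let’s stop the leader and watch a re-election happen. # the current leader has gone down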
Node A is now leader of term 2.
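Requiring a majority of votes guarantees that only one leader can be elected per term. # once the leader is down, it comes down to which follower requests votes first; each follower has a different timeout, so the candidacies are usually ordered in time
If two nodes become candidates at the same time then a split vote can occur. # the worst case: several nodes become candidates at "the same time". This is not exactly simultaneous (perhaps within the same second, but not at precisely the same instant), because every node has its own clock and there is no global clock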
Let’s take a look at a split vote example…
Two nodes both start an election for the same term… # worst case: the candidates conflict
…and each reaches a single follower node before the other.
Now each candidate has 2 votes and can receive no more for this term.
The nodes will wait for a new election and try again.
Node B received a majority of votes in term 5 so it becomes leader. # A and C both missed out, and B lucked into the leadership
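
The split vote can be checked with a simple tally. This sketch assumes a four-node cluster (the text only says each candidate ended up with 2 votes), and the `elect` function and the vote maps are invented for illustration.

```python
from collections import Counter


def elect(votes, cluster_size):
    """votes maps each voter id to the candidate it voted for in this term."""
    majority = cluster_size // 2 + 1
    for candidate, count in Counter(votes.values()).items():
        if count >= majority:
            return candidate
    return None  # split vote: nobody reached a majority, so the term has no leader


# Two candidates (A and C) each vote for themselves and win one follower.
split = {"A": "A", "B": "A", "C": "C", "D": "C"}
print(elect(split, cluster_size=4))   # None

# After new randomized timeouts, one candidate gets ahead in a later term.
retry = {"A": "B", "B": "B", "C": "B", "D": "C"}
print(elect(retry, cluster_size=4))   # B
```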

1.4 Log replication:

Once we have a leader elected we need to replicate all changes to our system to all nodes.
This is done by using the same Append Entries message that was used for heartbeats.
Let’s walk through the process.
First a client sends a change to the leader.
The change is appended to the leader’s log… # phase one
…then the change is sent to the followers on the next heartbeat. # end of phase one
An entry is committed once a majority of followers acknowledge it… # phase two
…and a response is sent to the client. # followers commit at almost the same moment the leader responds to the client (they learn of the commit from a later Append Entries message)
Now let’s send a command to increment the value by “2”.
Our system value is now updated to “7”.
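
The jump from "5" to "7" is just the committed log being applied in order. A minimal sketch, assuming a hypothetical command format of `("SET", n)` and `("ADD", n)` tuples:

```python
def apply_committed(log, commit_index, value=0):
    """Apply the first commit_index entries of the log, in order."""
    for op, arg in log[:commit_index]:
        if op == "SET":
            value = arg
        elif op == "ADD":
            value += arg
    return value


log = [("SET", 5), ("ADD", 2)]
print(apply_committed(log, commit_index=1))  # 5: only the first entry is committed
print(apply_committed(log, commit_index=2))  # 7: the increment is committed too
```

Because every node applies the same committed entries in the same order, every node arrives at the same value.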
Raft can even stay consistent in the face of network partitions.
Let’s add a partition to separate A & B from C, D & E.
Because of our partition we now have two leaders in different terms.
Let’s add another client and try to update both leaders. # multiple clients make the partition scenario more complicated
One client will try to set the value of node B to “3”.
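Node B cannot replicate to a majority so its log entry stays uncommitted. # the majority is counted over the whole cluster and includes the leader itself: with 5 nodes it is 3, but B can only reach A, so the entry exists on 2 nodes. A two-node cluster (one leader, one follower) needs both nodes for a majority, so it tolerates no failure; clusters are normally sized 2n+1 so that n nodes can fail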
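
The majority arithmetic for the partition example can be written out directly. The function names are made up for this sketch; the node counts are the ones from the example (A and B on one side, C, D and E on the other).

```python
def majority(cluster_size):
    # Smallest number of nodes that is more than half of the cluster.
    return cluster_size // 2 + 1


def can_commit(reachable_followers, cluster_size):
    # The leader's own copy of the entry counts toward the majority.
    return 1 + reachable_followers >= majority(cluster_size)


print(majority(5))        # 3
print(can_commit(1, 5))   # False: B can reach only A, so 2 copies out of 5
print(can_commit(2, 5))   # True: C can reach D and E, so 3 copies out of 5
print(can_commit(0, 2))   # False: a two-node cluster stalls if either node is cut off
```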
The other client will try to set the value of node C to “8”.
This will succeed because it can replicate to a majority.
Now let’s heal the network partition.
Both nodes A & B will roll back their uncommitted entries and match the new leader’s log. # outside of elections, followers communicate only with the leader; followers do not talk to each other
Our log is now consistent across our cluster.
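
The rollback can be sketched as "keep the shared prefix, replace the rest with the leader's entries". This is a simplification: real Raft reaches the same result incrementally through the consistency check carried by Append Entries messages. The `reconcile` function, the `(term, command)` tuples, and the term numbers are all assumptions made for illustration.

```python
def reconcile(follower_log, leader_log):
    """Return the follower's log after it has been made to match the leader's.

    Entries are (term, command) tuples.
    """
    i = 0
    while (i < len(follower_log) and i < len(leader_log)
           and follower_log[i] == leader_log[i]):
        i += 1
    # Everything after the first mismatch was never committed on the follower's
    # side of the partition, so it is discarded in favor of the leader's entries.
    return follower_log[:i] + leader_log[i:]


old_side = [(1, "SET 5"), (1, "ADD 2"), (1, "SET 3")]  # "SET 3" never reached a majority
new_side = [(1, "SET 5"), (1, "ADD 2"), (2, "SET 8")]  # "SET 8" was committed in the newer term
print(reconcile(old_side, new_side))
```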

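2. Questions still to be answered (questions that came up while brainstorming)

2.1 What is Leader Election?

2.2 What is Log Replication?

Describe a problem in a structured way: go from simple to complex, and refine the strategy step by step until it is detailed and complete.

In Raft, how many timeouts control elections, what are they, and what is each one for?

Categories of consistency: sequential consistency (a given client may not see the latest value, but it always sees the changes in order), eventual consistency, strong consistency, weak consistency.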

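Examples of single-Paxos and Multi-Paxos in clusters?
ZooKeeper is single-Paxos and Kafka is Multi-Paxos? (To verify: ZooKeeper uses its own atomic broadcast protocol, ZAB, and Kafka replicates partitions through in-sync replica sets, so neither is a direct Paxos implementation.)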

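Examples of decentralized and centralized systems? How do you tell whether a system is decentralized: by whether it broadcasts?
Is Elasticsearch decentralized?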

0 0