Accrual Failure Detector

Source: Internet. Editor: 程序博客网. Time: 2024/05/21 10:24

This article summarizes my understanding of the paper "The Phi Accrual Failure Detector" and its implementation in Cassandra.

Background

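Failure detection is widely recognized as a fundamental building block of distributed systems. Some have proposed providing failure detection as a basic service, similar to DNS or NTP (clock synchronization), which shows how important it is in distributed systems.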

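The main obstacle to making failure detection a generic service is that distributed applications need to tune failure detection to different QoS requirements, while traditional failure detection algorithms can only output a boolean result.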

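The innovation of the accrual failure detector is that its output is the degree of confidence that the monitored server has crashed, rather than a boolean. This suspicion level is a continuous value that changes over time, and each distributed application defines a suspicion threshold suited to its own QoS requirements. A low threshold detects a real crash quickly but produces more wrong suspicions; a high threshold takes longer to detect a real crash but is more accurate.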

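For example, consider a distributed system with one master server and several worker servers, where the master distributes many jobs to the workers. The master clearly needs to track the state of each worker. With an accrual failure detector, when the suspicion level of a worker reaches a low threshold, the master stops dispatching new jobs to it; when it reaches a moderate threshold, the master reassigns the jobs running on that worker to other workers; when it reaches a high threshold, the master releases the communication resources held for that worker (for example, by closing the socket).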
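The escalating reactions in the master/worker example can be sketched as follows. This is only an illustration: the `Worker` class, the `react` function, and the three threshold values are hypothetical names and numbers, not taken from the paper or from Cassandra, and the suspicion level is assumed to be supplied by some accrual failure detector.

```python
from dataclasses import dataclass, field

# Hypothetical threshold values, chosen only for illustration.
LOW, MODERATE, HIGH = 1.0, 3.0, 8.0


@dataclass
class Worker:
    name: str
    accepts_new_jobs: bool = True
    jobs: list = field(default_factory=list)
    connected: bool = True


def react(worker, suspicion, spare):
    """Apply the master's reactions for one worker given its suspicion level.

    `spare` is another worker that takes over jobs when needed.
    """
    if suspicion >= LOW:
        worker.accepts_new_jobs = False   # stop dispatching new jobs
    if suspicion >= MODERATE:
        spare.jobs.extend(worker.jobs)    # move running jobs to another worker
        worker.jobs.clear()
    if suspicion >= HIGH:
        worker.connected = False          # release resources, e.g. close the socket
```

The detector itself never decides that a worker is dead here; it only reports a value, and the application chooses which thresholds trigger which action.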

Basic concepts of failure detectors

■ Basic concept 1: Unreliable failure detectors

Failure detectors are unreliable because a crashed server is hard to distinguish from a slow one.

Property 1 (strong completeness): There is a time after which every process that crashes is permanently suspected by all correct processes.

Property 2 (eventual strong accuracy): There is a time after which correct processes are not suspected by any correct process.

■ Basic concept 2: Quality of service of failure detectors

Definition 1 (Detection time T_D): The detection time is the time that elapses since the crash of p and until q begins to suspect p permanently.

Definition 2 (Average mistake rate λ_M): This measures the rate at which a failure detector generates wrong suspicions.
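As a rough illustration of how these two metrics could be measured from an experiment trace, here is a minimal sketch. The function names and the trace representation (plain timestamps in seconds) are assumptions made for this example only.

```python
def detection_time(crash_time, permanent_suspicion_time):
    """T_D: time from the crash of p until q starts suspecting p permanently."""
    return permanent_suspicion_time - crash_time


def average_mistake_rate(wrong_suspicion_times, observation_seconds):
    """Wrong suspicions per second over a window in which p was actually alive."""
    if observation_seconds <= 0:
        raise ValueError("observation window must be positive")
    return len(wrong_suspicion_times) / observation_seconds
```

For instance, three wrong suspicions during a 600-second window in which p never crashed correspond to a mistake rate of 0.005 per second.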

■ Basic concept 3: Heartbeat failure detectors

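Q monitors P. P periodically sends heartbeats to Q, and the interval between heartbeats is denoted Δi. If Q receives no new heartbeat within Δto, it considers P to have crashed. Δtr denotes the transmission delay of a message over the network.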

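The first approach sets Δto to a fixed value. Drawback: if Δto is set too low, crashes are detected quickly but accuracy is poor; conversely, if it is set too high, crashes take longer to detect but accuracy is better.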
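A fixed-timeout heartbeat detector can be sketched in a few lines. The class and method names below are hypothetical, and timestamps are passed in explicitly (in seconds) so that the behaviour is easy to follow; a real detector would read a clock instead.

```python
class FixedTimeoutDetector:
    """Q's view of P: suspect P when no heartbeat arrived within the timeout."""

    def __init__(self, timeout, now):
        self.timeout = timeout        # plays the role of the fixed timeout value
        self.last_heartbeat = now     # arrival time of the most recent heartbeat

    def heartbeat(self, now):
        """Record that a heartbeat from P arrived at time `now`."""
        self.last_heartbeat = now

    def suspects(self, now):
        """Return True if P is currently suspected to have crashed."""
        return now - self.last_heartbeat > self.timeout
```

The trade-off described above is visible directly: a smaller `timeout` makes `suspects` turn true sooner after a real crash, but also after any heartbeat that is merely delayed.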

The second approach derives Δto from the network delay of heartbeats Δtr and the heartbeat interval Δi. Drawback: it depends on Δi, because "the regularity of the sending of heartbeats cannot be ensured and a short interval makes the timing inaccuracies due to operating system scheduling take more importance".

■ Basic concept 4: Adaptive failure detectors

An adaptive failure detector continuously adjusts itself as network conditions change.

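A. Chen-FD. It predicts the arrival time of the next heartbeat from the arrival intervals of recently received heartbeats. The authors provide two versions of the protocol: one based on synchronized clocks and one based on unsynchronized clocks.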

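B. Bertier-FD. The basic algorithm is the same as above, except that it extends Chen's algorithm by also taking the round-trip time into account. Tests show that Bertier-FD detects crashes faster than Chen-FD, but with lower accuracy.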

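C. How to choose the heartbeat interval Δi. Common sense suggests that Δi should be determined by the application's own requirements. However, Müller argues that it should be determined by the underlying system (network, OS). If Δi is much smaller than Δtr, the detection time is dominated by Δtr; reducing Δi does not shorten detection and instead increases network congestion, which raises Δtr and in turn the detection time. If Δi is much larger than Δtr, the detection time is dominated by Δi; increasing Δi lengthens detection without noticeably reducing network load. Therefore Δi should be roughly equal to the average Δtr.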
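To make the idea behind item A concrete, here is a deliberately simplified adaptive detector: it predicts the next heartbeat arrival as the last arrival plus the mean of recent inter-arrival intervals, then adds a constant safety margin. It is not the exact estimator from Chen's paper, nor Bertier's variant; the class name, the `window` size, and the `safety_margin` parameter are assumptions made for this sketch.

```python
from collections import deque


class AdaptiveTimeoutDetector:
    """Adapts its timeout to the recently observed heartbeat arrivals."""

    def __init__(self, window=100, safety_margin=0.5):
        # window + 1 arrival times yield `window` inter-arrival intervals.
        self.arrivals = deque(maxlen=window + 1)
        self.safety_margin = safety_margin

    def heartbeat(self, now):
        """Record the arrival time of a heartbeat."""
        self.arrivals.append(now)

    def expected_next_arrival(self):
        """Last arrival plus the mean recent interval, or None if unknown."""
        if len(self.arrivals) < 2:
            return None
        times = list(self.arrivals)
        mean_interval = (times[-1] - times[0]) / (len(times) - 1)
        return times[-1] + mean_interval

    def suspects(self, now):
        """Suspect P once the expected arrival plus the margin has passed."""
        expected = self.expected_next_arrival()
        if expected is None:
            return False  # not enough samples to judge yet
        return now > expected + self.safety_margin
```

Because the estimate is recomputed from a sliding window, the effective timeout grows when heartbeats start arriving further apart and shrinks again when the network recovers, which is the behaviour a fixed timeout cannot provide.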