Notes on Distributed Systems Principles, Part 1

Source: Internet  Editor: 程序博客网  Time: 2024/05/22 14:32

CAP Theorem


  • Consistency: all nodes see the same data at the same time
    A service that is consistent operates fully or not at all.

  • Availability: a guarantee that every request receives a response about whether it succeeded or failed
  • Partition tolerance: the system continues to operate despite arbitrary partitioning due to network failures
    No set of failures less than total network failure is allowed to cause the system to respond incorrectly.

  • A distributed system cannot provide all three CAP properties at the same time.

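    Problems Dynamo faced in its design, and their solutions

    Excerpted from Yang Chuanhui (杨传辉), 《大规模分布式存储系统》 (Large-Scale Distributed Storage Systems)

    Problem                              Technique
    Data distribution                    Improved consistent hashing (virtual nodes)
    Replication protocol                 Replicated-write protocol (tunable N, W, R parameters)
    Data conflict handling               Vector clocks
    Temporary failure handling           Hinted handoff
    Recovery after permanent failure     Merkle hash tree
    Membership and failure detection     Gossip-based membership and failure-detection protocol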

    DHT

    (To be added once organized.)

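    NWR strategy (quorum protocol)

    NWR is a strategy used in distributed storage systems to control the consistency level.
    * N: the number of replicas of each piece of data;
    * W: the number of replicas that must be updated successfully for an update to a data object to succeed;
    * R: the number of replicas that must be read when reading a piece of data
    * W+R>N : guarantees that a piece of data cannot be read and written by two different transactions at the same time, because any read set and any write set share at least one replica
    * W>N/2 : guarantees that two transactions cannot write the same piece of data concurrently, because any two write sets share at least one replica

    In a distributed system, a single point of failure for data is not acceptable: once the only remaining replica fails, the data may be permanently lost or corrupted. If N is set to 2, the failure of one storage node leaves a single copy, so N should be greater than 2 (that is, at least 3 replicas).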
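To make the two inequalities concrete, here is a minimal sketch that checks a given (N, W, R) configuration. The function name `check_nwr` and the keys of the returned dictionary are illustrative assumptions, not part of Dynamo or of any library.

```python
def check_nwr(n: int, w: int, r: int) -> dict:
    """Evaluate the two quorum conditions for a replica configuration.

    Hypothetical helper for illustration only.
    """
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must each be between 1 and N")
    return {
        # W + R > N: any read set and any write set share at least one replica.
        "read_write_overlap": w + r > n,
        # W > N/2: any two write sets share at least one replica.
        "write_write_overlap": 2 * w > n,
        # Smallest number of replicas a read set and a write set must share.
        "min_shared_replicas": max(0, w + r - n),
    }


if __name__ == "__main__":
    print(check_nwr(3, 2, 2))  # both conditions hold
    print(check_nwr(3, 1, 1))  # fast, but neither condition holds
```

With N = 3, the choice W = 2, R = 2 satisfies both conditions, while W = 1, R = 1 trades them away for lower latency.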

    The following is compiled from Carnegie Mellon University (CMU) course slides.

    Vector Clock

    Lamport’s Logical Clock

    • happened-before relation

      • if a and b are events in the same process, and a occurs before b, then a->b is true
      • if a is an event of message m being sent by a process, and b is the event of m being received by another process, then a->b
    • happened-before relation is transitive

      if a->b and b->c, then a->c

    • property of logical clock

      • if two events a and b occur within the same process and a->b, then the logical time values C(a) and C(b) assigned to them satisfy C(a) < C(b)
      • the clock time C must always go forward, and never backward
    • Lamport’s clock algorithm

      • when a message is being sent: each message carries a timestamp according to the sender’s logical clock
      • when a message is received: if the receiver’s logical clock is less than the sending timestamp carried in the message, adjust the receiver’s clock such that currentTime = timestamp + 1
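The send and receive rules above can be sketched as a small class. This is a minimal illustration, not code from the CMU slides: the class and method names are assumptions, and `receive` uses the common `max(local, timestamp) + 1` form, which gives `timestamp + 1` whenever the local clock is behind and otherwise just counts the receive as one more local event.

```python
class LamportClock:
    """Minimal Lamport logical clock (illustrative names)."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Local event: the clock only ever moves forward."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value is the message timestamp."""
        return self.tick()

    def receive(self, timestamp: int) -> int:
        """Jump past the message timestamp if the local clock is behind."""
        self.time = max(self.time, timestamp) + 1
        return self.time


if __name__ == "__main__":
    p1, p2 = LamportClock(), LamportClock()
    ts = p1.send()          # p1 stamps the message with its clock
    print(ts, p2.receive(ts))
```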

    Vector clock

    Lamport’s clock cannot guarantee perfect ordering of events by just observing the time values of two arbitrary events

    definition


    • vector clocks were proposed to overcome the limitation of Lamport’s clock (i.e., C(a) < C(b) doesn’t mean that a->b)
    • a vector clock for a system of N processes is an array of N integers

    • every process Pi stores its own vector clock VCi
    • Lamport’s time values for events are stored in VCi; VCi(a) is the vector assigned to an event a
    • VCi(a) < VCi(b) ==> a->b
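The comparison VC(a) < VC(b) is usually read entry by entry: every entry of VC(a) is less than or equal to the matching entry of VC(b), and at least one is strictly smaller. A minimal sketch under that reading follows; the function names `happened_before` and `concurrent` are illustrative assumptions.

```python
from typing import Sequence


def happened_before(vc_a: Sequence[int], vc_b: Sequence[int]) -> bool:
    """True if vc_a < vc_b entry-wise, i.e. event a causally precedes event b."""
    if len(vc_a) != len(vc_b):
        raise ValueError("vector clocks must have the same length")
    pairs = list(zip(vc_a, vc_b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def concurrent(vc_a: Sequence[int], vc_b: Sequence[int]) -> bool:
    """True if the clocks differ and neither event precedes the other."""
    return (
        list(vc_a) != list(vc_b)
        and not happened_before(vc_a, vc_b)
        and not happened_before(vc_b, vc_a)
    )


if __name__ == "__main__":
    print(happened_before([1, 0, 0], [2, 1, 0]))
    print(concurrent([2, 0, 0], [0, 1, 0]))
```

Two clocks that are concurrent in this sense are what a store such as Dynamo would treat as conflicting versions.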

    update algorithm


    • whenever there is a new event at Pi, increment VCi[i]
    • when a process Pi sends a message m to Pj:

    1. increment VCi[i]
    2. set m’s timestamp ts(m) to the vector VCi

    • when message m is received by process Pj:

    1. for each k in ts(m):
      VCj[k] = max(VCj[k], ts(m)[k])
    2. increment VCj[j]
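The update rules can be sketched as follows. On receive, the message timestamp is merged entry by entry into the receiver's own clock before the receiver's entry is incremented. The class name, method names, and the zero-based process ids are assumptions made for this illustration.

```python
from typing import List


class VectorClock:
    """Vector clock held by one process out of a fixed set (illustrative)."""

    def __init__(self, num_processes: int, pid: int) -> None:
        self.pid = pid                    # index of the owning process
        self.clock = [0] * num_processes  # one entry per process

    def local_event(self) -> None:
        """New local event at Pi: increment VCi[i]."""
        self.clock[self.pid] += 1

    def send(self) -> List[int]:
        """Increment VCi[i], then return a copy to use as ts(m)."""
        self.clock[self.pid] += 1
        return list(self.clock)

    def receive(self, ts: List[int]) -> None:
        """Merge ts(m) entry by entry, then increment VCj[j]."""
        self.clock = [max(mine, theirs) for mine, theirs in zip(self.clock, ts)]
        self.clock[self.pid] += 1


if __name__ == "__main__":
    p0, p1 = VectorClock(3, 0), VectorClock(3, 1)
    ts = p0.send()
    p1.local_event()
    p1.receive(ts)
    print(ts, p1.clock)
```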

    causal communication

    to enforce causally-ordered multicasting, the delivery of a message m sent from Pi to Pj is delayed until the following two conditions are met:
    * ts(m)[i] = VCj[i] + 1 (m is the next message Pj expects from Pi)
    * ts(m)[k] <= VCj[k] for all k != i (Pj has already seen every message that Pi had seen when it sent m)
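The two delivery conditions translate directly into a predicate. This is a minimal sketch; the function name `can_deliver` and its parameter names are assumptions, and process ids are taken to be zero-based list indices.

```python
from typing import Sequence


def can_deliver(ts: Sequence[int], vc_receiver: Sequence[int], sender: int) -> bool:
    """Check whether a message with timestamp ts from `sender` may be delivered."""
    if len(ts) != len(vc_receiver):
        raise ValueError("timestamp and clock must have the same length")
    # Condition 1: m is the next message expected from the sender.
    next_from_sender = ts[sender] == vc_receiver[sender] + 1
    # Condition 2: the receiver has seen everything the sender had seen.
    dependencies_seen = all(
        ts[k] <= vc_receiver[k] for k in range(len(ts)) if k != sender
    )
    return next_from_sender and dependencies_seen


if __name__ == "__main__":
    # A reply from process 1 that depends on an unseen message from process 0
    # must wait; the original message from process 0 can be delivered at once.
    print(can_deliver([1, 1, 0], [0, 0, 0], sender=1))
    print(can_deliver([1, 0, 0], [0, 0, 0], sender=0))
```

A receiver would typically keep undeliverable messages in a buffer and re-check them each time its own clock advances.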

    Merkle tree

    A Merkle tree is a tree in which every non-leaf node is labelled with the hash of the labels of its child nodes, and every leaf is labelled with the hash of its value.
    (To be added once organized.)
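As a small illustration of the definition, the sketch below computes the root label of a binary Merkle tree over a list of data blocks. Several choices here are assumptions made for the example rather than anything stated in the source: SHA-256 as the hash, a binary tree shape, duplicating the last node when a level has an odd number of nodes, and the function name `merkle_root`.

```python
import hashlib
from typing import List


def _hash(data: bytes) -> bytes:
    """Hash used for every label (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks: List[bytes]) -> bytes:
    """Return the root label of a binary Merkle tree built over `blocks`."""
    if not blocks:
        raise ValueError("need at least one data block")
    # Leaf labels: the hash of each block's value.
    level = [_hash(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad an odd level by repeating its last node.
            level.append(level[-1])
        # Parent label: hash of the concatenated labels of its two children.
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    replica_a = [b"k1=v1", b"k2=v2", b"k3=v3"]
    replica_b = [b"k1=v1", b"k2=XX", b"k3=v3"]
    print(merkle_root(replica_a) == merkle_root(replica_b))
```

Two replicas can compare root labels first; only when the roots differ do they need to walk down the tree to find the ranges that disagree, which is why the technique suits recovery after a permanent failure.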
