[Excerpt] Pseudocode for join algorithms

Source: Internet  Editor: 程序博客网  Date: 2024/05/16 11:00

Nested Loops Join algorithm:

for each row R1 in the outer table
  for each row R2 in the inner table
    if R1 joins with R2
      output (R1,R2)
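The loop above can be sketched in Python roughly as follows. This is a minimal illustration, not any particular engine's implementation: the function name `nested_loops_join`, the `predicate` callback, and the sample `customers` and `orders` rows are all assumptions made for the example.

```python
from typing import Callable, Iterable, Iterator, Sequence, Tuple


def nested_loops_join(
    outer: Iterable[tuple],
    inner: Sequence[tuple],
    predicate: Callable[[tuple, tuple], bool],
) -> Iterator[Tuple[tuple, tuple]]:
    """Yield every (r1, r2) pair for which the join predicate holds."""
    for r1 in outer:            # single pass over the outer input
        for r2 in inner:        # full rescan of the inner input per outer row
            if predicate(r1, r2):
                yield (r1, r2)  # emit the pair and keep scanning


# Hypothetical sample data: (customer_id, name) and (order_id, customer_id).
customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
orders = [(10, 1), (11, 1), (12, 3)]

nl_pairs = list(nested_loops_join(customers, orders, lambda c, o: c[0] == o[1]))
```

Because the predicate is an arbitrary callback, this form works for any join condition, not only equality; the cost is that the inner input is scanned once for every outer row.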

 

Left Outer Join algorithm:

for each row R1 in the outer table
  begin
    for each row R2 in the inner table
      if R1 joins with R2
        output (R1,R2)
    if R1 did not join
      output (R1,NULL)
  end
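A possible Python rendering of the left outer variant is shown here. It is only a sketch: the name `left_outer_nested_loops_join`, the `matched` flag, and the sample rows are assumptions for illustration, and `None` stands in for NULL. The key point is that the "did not join" check runs once per outer row, after the inner scan has finished.

```python
from typing import Callable, Iterable, Iterator, Optional, Sequence, Tuple


def left_outer_nested_loops_join(
    outer: Iterable[tuple],
    inner: Sequence[tuple],
    predicate: Callable[[tuple, tuple], bool],
) -> Iterator[Tuple[tuple, Optional[tuple]]]:
    """Yield matching pairs, plus (r1, None) for outer rows with no match."""
    for r1 in outer:
        matched = False                 # tracks whether r1 joined with anything
        for r2 in inner:
            if predicate(r1, r2):
                matched = True
                yield (r1, r2)
        if not matched:                 # checked only after the inner scan ends
            yield (r1, None)            # None plays the role of NULL


# Hypothetical sample data: (customer_id, name) and (order_id, customer_id).
lo_customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
lo_orders = [(10, 1), (11, 1), (12, 3)]

lo_pairs = list(
    left_outer_nested_loops_join(lo_customers, lo_orders, lambda c, o: c[0] == o[1])
)
```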

 

Merge Join algorithm:

get first row R1 from input 1
get first row R2 from input 2
while not at the end of either input
  begin
    if R1 joins with R2
      begin
        output (R1,R2)
        get next row R2 from input 2
      end
    else if R1<R2
      get next row R1 from input 1
    else
      get next row R2 from input 2
  end
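The merge loop might look like this in Python. This sketch makes two assumptions that the pseudocode leaves implicit: both inputs are already sorted on the join key, and input 1 has no duplicate join keys (a one-to-many join). With duplicates on both sides, an implementation would need to buffer or rewind rows, which is not shown. The names `merge_join`, `key1`, `key2`, and `_END` are hypothetical.

```python
from typing import Any, Callable, Iterable, Iterator, Tuple

_END = object()  # sentinel marking the end of an input


def merge_join(
    input1: Iterable[tuple],
    input2: Iterable[tuple],
    key1: Callable[[tuple], Any],
    key2: Callable[[tuple], Any],
) -> Iterator[Tuple[tuple, tuple]]:
    """One-to-many merge join over two inputs sorted on their join keys."""
    it1, it2 = iter(input1), iter(input2)
    r1 = next(it1, _END)
    r2 = next(it2, _END)
    while r1 is not _END and r2 is not _END:
        k1, k2 = key1(r1), key2(r2)
        if k1 == k2:
            yield (r1, r2)
            r2 = next(it2, _END)   # keep r1: later rows of input 2 may match it
        elif k1 < k2:
            r1 = next(it1, _END)   # input 1 is behind, advance it
        else:
            r2 = next(it2, _END)   # input 2 is behind, advance it


# Hypothetical sample data, both sorted by customer id.
mj_customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]   # (customer_id, name)
mj_orders = [(10, 1), (11, 1), (12, 3)]              # (order_id, customer_id)

mj_pairs = list(
    merge_join(mj_customers, mj_orders, key1=lambda c: c[0], key2=lambda o: o[1])
)
```

Each input is read exactly once, which is why the sort order matters: an unsorted input would cause matches to be skipped silently.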

 

 

Hash Join algorithm:

for each row R1 in the build table
  begin
    calculate hash value on R1 join key(s)
    insert R1 into the appropriate hash bucket
  end
for each row R2 in the probe table
  begin
    calculate hash value on R2 join key(s)
    for each row R1 in the corresponding hash bucket
      if R1 joins with R2
        output (R1,R2)
  end
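The build and probe phases can be sketched in Python as below. This is an in-memory illustration only: it assumes the build input fits in memory and says nothing about partitioning or spilling to disk. The names `hash_join`, `build_key`, and `probe_key` are hypothetical. Buckets are indexed by Python's built-in `hash()` so that the explicit key comparison inside the bucket mirrors the "if R1 joins with R2" step.

```python
from collections import defaultdict
from typing import Any, Callable, Iterable, Iterator, Tuple


def hash_join(
    build: Iterable[tuple],
    probe: Iterable[tuple],
    build_key: Callable[[tuple], Any],
    probe_key: Callable[[tuple], Any],
) -> Iterator[Tuple[tuple, tuple]]:
    """In-memory equijoin: hash the build input, then probe it row by row."""
    buckets = defaultdict(list)
    # Build phase: place every build row in the bucket for its key's hash.
    for r1 in build:
        buckets[hash(build_key(r1))].append(r1)
    # Probe phase: look up each probe row's bucket and compare the real keys,
    # because different keys can share a hash value.
    for r2 in probe:
        k2 = probe_key(r2)
        for r1 in buckets.get(hash(k2), ()):
            if build_key(r1) == k2:
                yield (r1, r2)


# Hypothetical sample data; neither input needs to be sorted.
hj_customers = [(3, "Cy"), (1, "Ann"), (2, "Bob")]   # (customer_id, name)
hj_orders = [(10, 1), (12, 3), (11, 1)]              # (order_id, customer_id)

hj_pairs = list(
    hash_join(hj_customers, hj_orders, build_key=lambda c: c[0], probe_key=lambda o: o[1])
)
```

Results come out in probe order. Choosing the smaller input as the build side keeps the hash table small, which is the usual reason for distinguishing build from probe.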

 

 

 

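Nested Loops Join works well for small data sets, Merge Join for medium-sized data sets, and Hash Join for large data sets. Hash Join parallelizes and scales better than the other join algorithms, and it responds quickly to data warehouse queries.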

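Hash Join and Merge Join share many characteristics. Like Merge Join, Hash Join requires at least one equijoin predicate, supports residual predicates, and supports all outer joins and semi-joins. Unlike Merge Join, Hash Join does not require sorted inputs; and although it supports Full Outer Join, it still requires an equijoin predicate to do so.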
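To illustrate how an equijoin predicate and a residual predicate divide the work, here is a small hedged sketch. The equijoin key decides which hash bucket to look in, and the residual predicate is evaluated only on rows that already match on that key. The function name `hash_join_with_residual`, the `residual` callback, and the account and payment rows are assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Callable, Iterable, Iterator, Tuple


def hash_join_with_residual(
    build: Iterable[tuple],
    probe: Iterable[tuple],
    build_key: Callable[[tuple], Any],
    probe_key: Callable[[tuple], Any],
    residual: Callable[[tuple, tuple], bool],
) -> Iterator[Tuple[tuple, tuple]]:
    """Hash join on an equality key, then filter matches with a residual predicate."""
    table = defaultdict(list)
    for r1 in build:
        table[build_key(r1)].append(r1)          # the equijoin key drives the hashing
    for r2 in probe:
        for r1 in table.get(probe_key(r2), ()):  # candidates already equal on the key
            if residual(r1, r2):                 # any remaining, non-equality condition
                yield (r1, r2)


# Hypothetical data: accounts are (account_id, limit), payments are (account_id, amount).
accounts = [(1, 100), (2, 50)]
payments = [(1, 80), (1, 120), (2, 60)]

# Join on account_id (equijoin) and keep payments above the limit (residual).
over_limit = list(
    hash_join_with_residual(
        accounts,
        payments,
        build_key=lambda a: a[0],
        probe_key=lambda p: p[0],
        residual=lambda a, p: p[1] > a[1],
    )
)
```

Without the equality on `account_id` there would be nothing to hash on, which is why at least one equijoin predicate is required.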
