hash join VS merge join
来源:互联网 发布:铭党规知敬畏守规矩 编辑:程序博客网 时间:2024/05/24 23:13
A "sort merge" join is performed by sorting the two data sets to be joined according to the join keys and then merging them together. The merge is very cheap, but the sort can be prohibitively expensive especially if the sort spills to disk. The cost of the sort can be lowered if one of the data sets can be accessed in sorted order via an index, although accessing a high proportion of blocks of a table via an index scan can also be very expensive in comparison to a full table scan.
A hash join is performed by hashing one data set into memory based on join columns and reading the other one and probing the hash table for matches. The hash join is very low cost when the hash table can be held entirely in memory, with the total cost amounting to very little more than the cost of reading the data sets. The cost rises if the hash table has to be spilled to disk in a one-pass sort, and rises considerably for a multipass sort.
The cost of a hash join can be reduced by partitioning both tables on the join key(s). This allows the optimiser to infer that rows from a partition in one table will only find a match in a particular partition of the other table, and for tables having n partitions the hash join is executed as n independent hash joins. This has the following effects:
- The size of each hash table is reduced, hence reducing the maximum amount of memory required and potentially removing the need for the operation to require temporary disk space.
- For parallel query operations the amount of inter-process messaging is vastly reduced, reducing CPU usage and improving performance, as each hash join can be performed by one pair of PQ processes.
- For non-parallel query operations the memory requirement is reduced by a factor of n, and the first rows are projected from the query earlier.
You should note that hash joins can only be used for equi-joins, but merge joins are more flexible.
In general, if you are joining large amounts of data in an equi-join then a hash join is going to be a better bet.
This topic is very well covered in the documentation.
http://download.oracle.com/docs/cd/B28359_01/server.111/b28274/optimops.htm#i51523
Ref: http://stackoverflow.com/questions/1111707/what-is-the-difference-between-a-hash-join-and-a-merge-join-oracle-rdbms
- hash join VS merge join
- Merge Join vs. Hash Join vs. Nested Loop
- sort merge join,hash join,netsloop join
- {LOOP | MERGE | HASH } JOIN
- Hash Join vs Nest Loop vs Sort Merge
- Sql优化(一) Merge Join vs. Hash Join vs. Nested Loop
- SQL优化(一) Merge Join VS. Hash Join VS. Nested Loop
- HASH JOIN ,SORT MERGE JOIN ,NESTED LOOP
- HASH JOIN/MERGE JOIN/NESTED LOOP
- 描述HASH JOIN ,MERGE JOIN ,NESTED LOOP
- Hash Join, Nested Loop, Sort-Merge Join
- HASH JOIN ,MERGE JOIN ,NESTED LOOP 比较
- HASH JOIN ,MERGE JOIN ,NESTED LOOP 比较
- hash join、nested loop,sort merge join
- HASH JOIN ,MERGE JOIN ,NESTED LOOP 比较
- HASH JOIN ,MERGE JOIN ,NESTED LOOP 比较
- HASH JOIN ,MERGE JOIN ,NESTED LOOP 比较
- HASH JOIN , MERGE JOIN ,NESTED LOOP
- 尽量用类型化的常量替代预处理器的 #DEFINE 方法
- 【数据结构】平衡二叉树
- jstl模板引擎1.0.8发布
- 设置myeclipse控制台输出到文件中
- 日期转换参数,值得收藏
- hash join VS merge join
- C++11 Concurrency Tutorial – Part 2 : Protect shared data
- sublime text3
- 游戏开发之二维向量角度计算
- win7 32位系统内存限制破解支持到128G内存
- cocos2dx Android发布问题汇总
- Linux网络防火墙【2】 Linux内核网络netfilter module
- SQLServer事务的隔离级别
- 百度网盘的使用技巧