HIVE中join、semi join、outer join举例详解

来源:互联网 发布:多特vs希腊数据 编辑:程序博客网 时间:2024/05/16 12:01

hive> select * from zz0; 
OK 
111111 
222222 
888888 
Time taken: 0.147 seconds 
hive> select * zz1; 
FAILED: Parse Error: line 0:-1 mismatched input '<EOF>' expecting FROM in 
from clause 

hive> select * from zz1; 
OK 
111111 
333333 
444444 
888888 

Time taken: 0.042 seconds 
hive> select * from zz0 join zz1 on zz0.uid = zz1.uid; 
Total MapReduce jobs = 1 
Launching Job 1 out of 1 
Number of reduce tasks not specified. Estimated from input data size: 1 
In order to change the average load for a reducer (in bytes): 
  set hive.exec.reducers.bytes.per.reducer=<number> 
In order to limit the maximum number of reducers: 
  set hive.exec.reducers.max=<number> 
In order to set a constant number of reducers: 
  set mapred.reduce.tasks=<number> 
Starting Job = job_201104010013_14456, Tracking URL = 
http://hmaster:50030/jobdetails.jsp?jobid=job_201104010013_14456 
Kill Command = /usr/local/hadoop/bin/hadoop job 
 -Dmapred.job.tracker=hmaster:9001 -kill job_201104010013_14456 
2011-04-09 14:31:39,082 Stage-1 map = 0%,  reduce = 0% 
2011-04-09 14:31:48,149 Stage-1 map = 100%,  reduce = 0% 
2011-04-09 14:32:02,261 Stage-1 map = 100%,  reduce = 100% 
Ended Job = job_201104010013_14456 
OK 
111111  111111 
888888  888888 
Time taken: 28.236 seconds 
hive> select * from zz0 left outer join zz1 on zz0.uid = zz1.uid; 
Total MapReduce jobs = 1 
Launching Job 1 out of 1 
Number of reduce tasks not specified. Estimated from input data size: 1 
In order to change the average load for a reducer (in bytes): 
  set hive.exec.reducers.bytes.per.reducer=<number> 
In order to limit the maximum number of reducers: 
  set hive.exec.reducers.max=<number> 
In order to set a constant number of reducers: 
  set mapred.reduce.tasks=<number> 
Starting Job = job_201104010013_14457, Tracking URL = 
http://hmaster:50030/jobdetails.jsp?jobid=job_201104010013_14457 
Kill Command = /usr/local/hadoop/bin/hadoop job 
 -Dmapred.job.tracker=hmaster:9001 -kill job_201104010013_14457 
2011-04-09 14:32:38,253 Stage-1 map = 0%,  reduce = 0% 
2011-04-09 14:32:46,301 Stage-1 map = 100%,  reduce = 0% 
2011-04-09 14:32:59,396 Stage-1 map = 100%,  reduce = 100% 
Ended Job = job_201104010013_14457 
OK 
111111  111111 
222222  NULL 
888888  888888 
Time taken: 26.103 seconds 
hive> select * from zz0 right outer join zz1 on zz0.uid = zz1.uid; 
Total MapReduce jobs = 1 
Launching Job 1 out of 1 
Number of reduce tasks not specified. Estimated from input data size: 1 
In order to change the average load for a reducer (in bytes): 
  set hive.exec.reducers.bytes.per.reducer=<number> 
In order to limit the maximum number of reducers: 
  set hive.exec.reducers.max=<number> 
In order to set a constant number of reducers: 
  set mapred.reduce.tasks=<number> 
Starting Job = job_201104010013_14458, Tracking URL = 
http://hmaster:50030/jobdetails.jsp?jobid=job_201104010013_14458 
Kill Command = /usr/local/hadoop/bin/hadoop job 
 -Dmapred.job.tracker=hmaster:9001 -kill job_201104010013_14458 
2011-04-09 14:33:16,092 Stage-1 map = 0%,  reduce = 0% 
2011-04-09 14:33:23,134 Stage-1 map = 50%,  reduce = 0% 
2011-04-09 14:33:24,145 Stage-1 map = 100%,  reduce = 0% 
2011-04-09 14:33:37,233 Stage-1 map = 100%,  reduce = 100% 
Ended Job = job_201104010013_14458 
OK 
NULL 
111111  111111 
NULL    333333 
NULL    444444 
888888  888888 
Time taken: 26.1 seconds 
hive> select * from zz1; 
OK 
111111 
333333 
444444 
888888 

Time taken: 0.042 seconds 
hive> select * from zz0 full outer join zz1 on zz0.uid = zz1.uid; 
Total MapReduce jobs = 1 
Launching Job 1 out of 1 
Number of reduce tasks not specified. Estimated from input data size: 1 
In order to change the average load for a reducer (in bytes): 
  set hive.exec.reducers.bytes.per.reducer=<number> 
In order to limit the maximum number of reducers: 
  set hive.exec.reducers.max=<number> 
In order to set a constant number of reducers: 
  set mapred.reduce.tasks=<number> 
Starting Job = job_201104010013_14459, Tracking URL = 
http://hmaster:50030/jobdetails.jsp?jobid=job_201104010013_14459 
Kill Command = /usr/local/hadoop/bin/hadoop job 
 -Dmapred.job.tracker=hmaster:9001 -kill job_201104010013_14459 
2011-04-09 14:34:36,255 Stage-1 map = 0%,  reduce = 0% 
2011-04-09 14:34:43,304 Stage-1 map = 100%,  reduce = 0% 
2011-04-09 14:34:58,416 Stage-1 map = 100%,  reduce = 100% 
Ended Job = job_201104010013_14459 
OK 
NULL 
111111  111111 
222222  NULL 
NULL    333333 
NULL    444444 
888888  888888 
Time taken: 28.105 seconds 



hive> select * from zz0 left semi join zz1 on zz0.uid = zz1.uid; 

1

8
-- 
Stay Hungry. Stay Foolish. 

原创粉丝点击