Hive实战之Youtube数据集
来源:互联网 发布:淘宝浏览量有什么用 编辑:程序博客网 时间:2024/04/30 09:01
大家可以看SparkSQL实战之Youtube数据集来进行比较
1 数据来源
本次实战的数据来自于”YouTube视频统计与社交网络”的数据集,是西蒙弗雷泽大学计算机学院在2008年所爬取的数据
数据集地址
1. 1 Youtube视频表格式如下:
数据之间采用”\t”作为分隔符
具体数据如下:
数据量大小为1G,条数为500万+
1.2 用户表
2 实战演练准备
2.1 环境搭建
使用环境为
hive-1.1.0-cdh5.4.5
hadoop-2.6.0-cdh5.4.5
演示形式为使用hive shell
2.2 数据清洗
我们一起来看看数据
主要的问题在于category和relatedIDs处理,由于Hive是支持array格式的,所以我们想到的是使用array来存储category和relatedIDs,但是我们发现category的分割符是”&”而realatedIDs的分隔符是”\t”,我们在创建表格的时候能够指定array的分隔符,但是只能指定一个,所以再将数据导入到Hive表格之前我们需要对数据进行一定转换和清洗
并且数据中肯定会存在一些不完整数据和一些奇怪的格式,所以数据的清洗是必要的,我在这里所使用的数据清洗方式是使用Spark进行清洗,也可以使用自定义UDF函数来进行清洗
数据清洗注意点
1)我们可以看到每行数据以”\t”作为分隔符,每行有十列数据,最后一列关联ID可以为空,那么我们对数据进行split之后数组的大小要大于8
2)数据中存在"uNiKXDA8eyQ KRQE 1035 News & Politics 107"
这样格式的数据,所以在处理category时需要注意 News & Politics中间的&
处理后的数据如下:
下面的实战都是基于数据清洗后的数据进行的
2.3 创建表格和数据导入
2.3.1 youtube表格的创建和导入
1)youtube1的创建,文件格式为textfile
create table youtube1(videoId string, uploader string, age int, category array, length int, views int, rate float, ratings int, comments int,relatedId array)
row format delimited
fields terminated by “\t”
collection items terminated by “&”
stored as textfile;
2)youtube2的创建,文件格式为orc
create table youtube2(videoId string, uploader string, age int, category array, length int, views int, rate float, ratings int, comments int,relatedId array)
row format delimited
fields terminated by “\t”
collection items terminated by “&”
stored as orc;
3)youtube3的创建,文件格式为orc,进行桶分区
create table youtube3(videoId string, uploader string, age int, category array, length int, views int, rate float, ratings int, comments int,relatedId array)
clustered by (uploader) into 8 buckets
row format delimited
fields terminated by “\t”
collection items terminated by “&”
stored as orc;
数据导入:
1)load data inpath “path” into table youtube1;
2)由于无法将textfile格式的数据导入到orc格式的表格,所以数据需要从youtube1导入到youtube2和youtube3:
insert into table youtube2 select * from youtube1;
insert into table youtube3 select * from youtube1;
2.3.2 user表格的创建和导入
1)user_tmp的创建,文件格式textfile,24buckets
create table user_tmp(uploader string,videos int,friends int)
clustered by (uploader) into 24 buckets
row format delimited
fields terminated by “\t”
stored as textfile;
2)user的创建,文件格式orc,24buckets
create table user(uploader string,videos int,friends int)
clustered by (uploader) into 24 buckets
row format delimited
fields terminated by “\t”
stored as orc;
user表的数据导入也是同理
数据导入:
1)load data inpath “path” into table user_tmp;
2)由于无法将textfile格式的数据导入到orc格式的表格,所以数据需要从user_tmp导入到user:
insert into table user select * from user_tmp;
3 实战需求
1)统计出观看数最多的10个视频
2)统计出视频类别热度的前10个类型
3)统计出视频观看数最高的50个视频的所属类别
4)统计出观看数最多的前N个视频所关联的视频的所属类别排行
5)筛选出每个类别中热度最高的前10个视频
6)筛选出每个类别中评分最高的前10个视频
7)找出用户中上传视频最多的10个用户的所有视频
8)筛选出每个类别中观看数Top10
4 实战演练
4.1 统计出观看数最多的10个视频
select * from youtube3 order by views desc limit 10;
结果如下:
hive> select * from youtube3 order by views desc limit 10;Query ID = hadoop_20170710155353_4bc057f0-bbd5-4bfe-a66c-ec0e17cb3ca9Total jobs = 1Launching Job 1 out of 1Number of reduce tasks determined at compile time: 1Starting Job = job_1499153664137_0101, Tracking URL = http://master:8088/proxy/application_1499153664137_0101/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0101Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 12017-07-10 15:53:55,297 Stage-1 map = 0%, reduce = 0%...2017-07-10 15:56:06,210 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 106.93 secMapReduce Total cumulative CPU time: 1 minutes 46 seconds 930 msecEnded Job = job_1499153664137_0101MapReduce Jobs Launched: Stage-Stage-1: Map: 4 Reduce: 1 Cumulative CPU: 106.93 sec HDFS Read: 574632526 HDFS Write: 2916 SUCCESSTotal MapReduce CPU Time Spent: 1 minutes 46 seconds 930 msecOKdMH0bHeiRNg judsonlaipply 415 ["Comedy"] 360 79897120 4.65 287260 131356 ["HSoVKUVOnfQ","pv5zWaTEVkI","v1U8f7TmYkA","ZCT3MIUTc3o","mDiBOF8XI44","P9LmHXXWiJs","fo_QVq2lGMs","EtGQgSY9Nn4","5P6UU6m3cqk","61heClTFc5w","h3gdSHGcUU4","OPmYbP0F4Zw","gsOaQGF7kiQ","H2gw9VE16mo","rGkIUlYEQT8","innfyQZHPpo","cu8tUy14zQo","cQ25-glGRzI","WqDbn2iLwwY","jvz0bvYmnto"]cQ25-glGRzI RCARecords 742 ["Music"] 227 77674728 4.45 188154 186858 ["otMB3WVQNVg","CAoo71VEYhs","6gqSk1UBFyo","pULa7F1gWZM","C2eUCfREAbw","neZw2CH1h9s","2fjW33W7424","aVnl9QCfNyE","DKhnmUdmz74","tPW70sgrHiY","W4kR8OQCrlQ","-WtcXpnMIT8","YhRZx2QIJKg","5iaYFN5eERA","KChJyERIclc","xsRWpK4pf90","lOq-Q60zWQs","ywNZPURFJNU","dMH0bHeiRNg","tvkh29RKFRY"]12Z3J1uzd0Q kaejane 404 ["Film","Animation"] 615 65341925 3.03 9189 5508 ["innfyQZHPpo","-_CSo1gOd48","1Al4crLW9EM","nhSZs-aAZbo","Af9aPlG2WFw","5P6UU6m3cqk","uBHsOTYO6AI","lW19DnWz6vg","vr3x_RRJdd4","5GE82tqcYYQ","D2kJZOfq7zk","lsO6D1rwrKc","dMH0bHeiRNg","kNLXjXxj3J8","vQbYvjmfbr4","G-HonZGBWus","MuOvqeABHvQ","-2caf6KlSyw","cQ25-glGRzI","pv5zWaTEVkI"]LpAI8TzQDes IMVUinc 848 ["Entertainment"] 68 65078772 1.38 5678 2499 ["c2eP-UADAi8","5P6UU6m3cqk","cQ25-glGRzI","FYbqcgd97NQ","12Z3J1uzd0Q","VS4wFOWFUt8","RB-wUgnyGv0","sBQLq2VmZcA","w2xUzv6iZWo","oOF3T9Zvizc","244qR7SvvX0","pv5zWaTEVkI","dMH0bHeiRNg","b3u65f4CRLk"]7AVHXe-ol-s internmarket 603 ["Music"] 264 60349673 3.16 1033 594 ["trrRo3kGRv0","CCsGkN1PEec","l5mQKJDO-nY","DP_4rQcU0G8","7xz5aOvP2YA","kCjKIus3iBA","nD0MILEXZP0","IQcyLMa716k","d05KUL8aW4E","szeeHnu2DM8","7b_wObF6vgs","UazqVaOg9uc","LRBD9l3Nv6Y","108YigkYFgM","_qCau44yfX0","NkkGj4_1m9A","kuyRSrklGU8","uLr7IJYyNnM","IEuyk_277xY","XmAaYo-CQNw"]244qR7SvvX0 donotasyoudo 960 ["Entertainment"] 6 57790943 1.45 66412 14913 ["v3ARyAb_1Bs","nhSZs-aAZbo","2pNTrYd-4FQ","dMH0bHeiRNg","kHmvkRoEowc","yh0xYO3dYx8","5GE82tqcYYQ","ktUSIJEiOug","6PrDw6T3-rM","-xEzGIuY7kw","innfyQZHPpo","pv5zWaTEVkI","E7QOUJfRs3Y","9oRQJK61MWs"]ePyRrb2-fzs 1988basti 826 ["Music"] 204 45984219 4.87 51039 27065 ["fm0T7_SGee4","h8oBykb_Pqs","iWg3IMN_rhU","MdOAr_4FJvc","xNp7OxgFJJM","nzaw_Gk9BAU","u9rl4apDFfY","fMzMm4sJYGo","b5eEa5a2Rio","aNR0tW_afdk","Kj_MceyjS6g","GojTUmjxVHU","CGPUuPHdHQg","SIM4DCn7AlE","ktUSIJEiOug","Dn6kGzRSjhU","gxxU68z8quM","HRSrFm_8YK8","2__Qdd11rfA","XewBty95JVg"]xsRWpK4pf90 universalmusicgroup 902 ["Music"] 240 44614530 4.73 99373 53441 ["a4X7eFbP3u4","4WAapKx2TvM","PgZgubuT2DM","FIUv3dOBbCk","Dk8NQB_T1ws","7NAbTHGIPP8","mekQxrnhQ3I","OouU4PFUw3Y","rwgXvL9o0QQ","4eeyhtlJp5A","ktUSIJEiOug","UZStqFeYk3Y","sF84pIhP5UM","FKXm3Qg7sBo","a1ti0KccvOg","ERV-wh4VwZI","iWg3IMN_rhU","i1PKmEaUqes","gsAN_Zfc1nc","CmBCLWkrysY"]ktUSIJEiOug aliciakeys 953 ["Music"] 251 43583367 4.85 103074 59672 ["_VD-PDzmW4M","vmjl5ARtQWM","X_SmJpAmirw","glBHQSasVMQ","IGj3T7C-RdE","6U_4ANczMPY","oh3Pkw6cokk","j97WOHviXbE","FqsGSvrcfDE","2PWfB4lurT4","Yci8zVNMEdg","sF84pIhP5UM","dDfEjBSWj54","Zdm7FJktDOM","a4X7eFbP3u4","jhPAK8HjcPI","ePyRrb2-fzs","8_8KJfSNgC4","kN9vm95SocU","4DC4Rb9quKk"]b3u65f4CRLk srcrecords 729 ["Music"] 256 43511791 4.76 55148 27818 ["4fZF9UsS8LY","5f7kfhHHH_A","H1vaszd6NnA","6Di-dAIR9RA","HiBcQeIax4g","JVciSFc5Wrw","Md6rURKhZmA","IG5ReXP0SSg","D9g2szHsoz0","EflOarvoeVs","7PxBGHjABnU","Gjq1g3j7WFc","mDvCcrU-Ob8","V9YE8dHMxvw","iWg3IMN_rhU","Un2dbprtZFE","WpgMuHwAdC4","RtQ5JENdx5I","ZEYgAgKVuO4","D_2jb8D8AOI"]Time taken: 172.566 seconds, Fetched: 10 row(s)
4.2 统计出视频类别热度的前10个类型
select tagId, count(a.videoid) as sum from (select videoid,tagId from youtube3 lateral view explode(category) catetory as tagId) a group by a.tagId order by sum desc limit 10;
结果:
hive> select tagId, count(a.videoid) as sum from (select videoid,tagId from youtube3 lateral view explode(category) catetory as tagId) a group by a.tagId order by sum desc limit 10;Query ID = hadoop_20170710155757_614ec9dd-ffa0-465d-8d62-c47b5ad585f0Total jobs = 2Launching Job 1 out of 2Number of reduce tasks not specified. Defaulting to jobconf value of: 2Starting Job = job_1499153664137_0102, Tracking URL = http://master:8088/proxy/application_1499153664137_0102/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0102Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 22017-07-10 15:58:23,797 Stage-1 map = 0%, reduce = 0%...2017-07-10 15:59:16,341 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 19.51 secMapReduce Total cumulative CPU time: 19 seconds 510 msecEnded Job = job_1499153664137_0102Launching Job 2 out of 2Number of reduce tasks determined at compile time: 1In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>In order to set a constant number of reducers: set mapreduce.job.reduces=<number>Starting Job = job_1499153664137_0103, Tracking URL = http://master:8088/proxy/application_1499153664137_0103/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0103Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 12017-07-10 15:59:51,645 Stage-2 map = 0%, reduce = 0%2017-07-10 16:00:18,373 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.97 sec2017-07-10 16:00:48,410 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.57 secMapReduce Total cumulative CPU time: 2 seconds 570 msecEnded Job = job_1499153664137_0103MapReduce Jobs Launched: Stage-Stage-1: Map: 4 Reduce: 2 Cumulative CPU: 19.51 sec HDFS Read: 47327614 HDFS Write: 900 SUCCESSStage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 2.57 sec HDFS Read: 5596 HDFS Write: 148 SUCCESSTotal MapReduce CPU Time Spent: 22 seconds 80 msecOKEntertainment 1304724Music 1274825Comedy 449652Blogs 447581People 447581Film 442109Animation 442109Sports 390619Politics 186753News 186753Time taken: 185.399 seconds, Fetched: 10 row(s)
4.3 统计出视频观看数最高的50个视频的所属类别
select tagId, count(a.videoid) as sum from (select videoid,tagId from (select * from youtube3 order by views desc limit 20) e lateral view explode(category) catetory as tagId) a group by a.tagId order by sum desc;
结果:
hive> select tagId, count(a.videoid) as sum from (select videoid,tagId from (select * from youtube3 order by views desc limit 20) e lateral view explode(category) catetory as tagId) a group by a.tagId order by sum desc;Query ID = hadoop_20170710160909_c6cbbe29-4df3-4c0b-ad70-bd34857acc80Total jobs = 3Launching Job 1 out of 3Number of reduce tasks determined at compile time: 1Starting Job = job_1499153664137_0104, Tracking URL = http://master:8088/proxy/application_1499153664137_0104/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0104Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 12017-07-10 16:09:48,197 Stage-1 map = 0%, reduce = 0%2017-07-10 16:10:17,734 Stage-1 map = 25%, reduce = 0%, Cumulative CPU 3.09 sec...2017-07-10 16:10:59,048 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 47.74 secMapReduce Total cumulative CPU time: 47 seconds 740 msecEnded Job = job_1499153664137_0104Launching Job 2 out of 3Number of reduce tasks not specified. Defaulting to jobconf value of: 2In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>In order to set a constant number of reducers: set mapreduce.job.reduces=<number>Starting Job = job_1499153664137_0105, Tracking URL = http://master:8088/proxy/application_1499153664137_0105/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0105Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 22017-07-10 16:11:35,860 Stage-2 map = 0%, reduce = 0%2017-07-10 16:12:04,633 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 1.07 sec2017-07-10 16:12:23,187 Stage-2 map = 100%, reduce = 50%, Cumulative CPU 2.61 sec2017-07-10 16:12:32,480 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 3.93 secMapReduce Total cumulative CPU time: 3 seconds 930 msecEnded Job = job_1499153664137_0105Launching Job 3 out of 3Number of reduce tasks determined at compile time: 1In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>In order to set a constant number of reducers: set mapreduce.job.reduces=<number>Starting Job = job_1499153664137_0106, Tracking URL = http://master:8088/proxy/application_1499153664137_0106/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0106Hadoop job information for Stage-3: number of mappers: 2; number of reducers: 12017-07-10 16:13:09,244 Stage-3 map = 0%, reduce = 0%2017-07-10 16:13:37,263 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 2.18 sec2017-07-10 16:14:05,048 Stage-3 map = 100%, reduce = 100%, Cumulative CPU 3.5 secMapReduce Total cumulative CPU time: 3 seconds 500 msecEnded Job = job_1499153664137_0106MapReduce Jobs Launched: Stage-Stage-1: Map: 4 Reduce: 1 Cumulative CPU: 47.74 sec HDFS Read: 57839263 HDFS Write: 228 SUCCESSStage-Stage-2: Map: 1 Reduce: 2 Cumulative CPU: 3.93 sec HDFS Read: 6065 HDFS Write: 324 SUCCESSStage-Stage-3: Map: 2 Reduce: 1 Cumulative CPU: 3.5 sec HDFS Read: 6600 HDFS Write: 53 SUCCESSTotal MapReduce CPU Time Spent: 55 seconds 170 msecOKMusic 12Comedy 3Entertainment 3Film 2Animation 2Time taken: 297.466 seconds, Fetched: 5 row(s)
4.4 统计出观看数最多的前50个视频所关联的视频的所属类别排行
思路:
- 首先筛选出前50个视频所关联的视频,
select * from youtube3 order by views desc limit 50 - 再将结果和youtube3进行join
select explode(relatedId) as videoId from (select * from youtube3 order by views desc limit 50) a) b join youtube3 c on b.videoId = c.videoId - 然后去重
select distinct(b.videoId),c.category from (select explode(relatedId) as videoId from (select * from youtube3 order by views desc limit 50) a) b join youtube3 c on b.videoId = c.videoId - 最后得出所有视频的类别排行
select tagId, count(e.videoid) as sum from (select videoid,tagId from (select distinct(b.videoId),c.category from (select explode(relatedId) as videoId from (select * from youtube3 order by views desc limit 50) a) b join youtube3 c on b.videoId = c.videoId) d lateral view explode(category) catetory as tagId) e group by tagId order by sum desc;
结果:
hive> select tagId, count(e.videoid) as sum from (select videoid,tagId from (select distinct(b.videoId),c.category from (select explode(relatedId) as videoId from (select * from youtube3 order by views desc limit 50) a) b join youtube3 c on b.videoId = c.videoId) d lateral view explode(category) catetory as tagId) e group by tagId order by sum desc;Query ID = hadoop_20170710170808_cfbfde68-c016-4c5d-860b-fe840a2d50cbTotal jobs = 7Launching Job 1 out of 7Number of reduce tasks determined at compile time: 1Starting Job = job_1499153664137_0111, Tracking URL = http://master:8088/proxy/application_1499153664137_0111/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0111Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 12017-07-10 17:08:48,662 Stage-1 map = 0%, reduce = 0%2017-07-10 17:09:25,175 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 20.83 sec...2017-07-10 17:10:15,276 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 81.01 secMapReduce Total cumulative CPU time: 1 minutes 21 seconds 10 msecEnded Job = job_1499153664137_0111Stage-10 is filtered out by condition resolver.Stage-11 is selected by condition resolver.Stage-2 is filtered out by condition resolver.17/07/10 17:10:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicableExecution log at: /tmp/hadoop/hadoop_20170710170808_cfbfde68-c016-4c5d-860b-fe840a2d50cb.log2017-07-10 05:10:23 Starting to launch local task to process map join; maximum memory = 5189795842017-07-10 05:10:34 Dump the side-table for tag: 0 with group count: 743 into file: file:/tmp/hadoop/146c18a2-9a94-440d-9302-72e50ede944e/hive_2017-07-10_17-08-08_628_1950388697756865753-1/-local-10009/HashTable-Stage-8/MapJoin-mapfile30--.hashtable2017-07-10 05:10:34 Uploaded 1 File to: file:/tmp/hadoop/146c18a2-9a94-440d-9302-72e50ede944e/hive_2017-07-10_17-08-08_628_1950388697756865753-1/-local-10009/HashTable-Stage-8/MapJoin-mapfile30--.hashtable (22611 bytes)2017-07-10 05:10:34 End of local task; Time Taken: 11.392 sec.Execution completed successfullyMapredLocal task succeededLaunching Job 3 out of 7Number of reduce tasks is set to 0 since there's no reduce operatorStarting Job = job_1499153664137_0112, Tracking URL = http://master:8088/proxy/application_1499153664137_0112/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0112Hadoop job information for Stage-8: number of mappers: 4; number of reducers: 02017-07-10 17:11:13,493 Stage-8 map = 0%, reduce = 0%2017-07-10 17:11:57,229 Stage-8 map = 100%, reduce = 0%, Cumulative CPU 15.79 secMapReduce Total cumulative CPU time: 15 seconds 790 msecEnded Job = job_1499153664137_0112Launching Job 4 out of 7Number of reduce tasks not specified. Defaulting to jobconf value of: 2Starting Job = job_1499153664137_0113, Tracking URL = http://master:8088/proxy/application_1499153664137_0113/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0113Hadoop job information for Stage-3: number of mappers: 3; number of reducers: 22017-07-10 17:12:31,982 Stage-3 map = 0%, reduce = 0%2017-07-10 17:13:17,467 Stage-3 map = 100%, reduce = 100%, Cumulative CPU 6.53 secMapReduce Total cumulative CPU time: 6 seconds 530 msecEnded Job = job_1499153664137_0113Launching Job 5 out of 7Number of reduce tasks not specified. Defaulting to jobconf value of: 2Starting Job = job_1499153664137_0114, Tracking URL = http://master:8088/proxy/application_1499153664137_0114/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0114Hadoop job information for Stage-4: number of mappers: 2; number of reducers: 22017-07-10 17:13:55,123 Stage-4 map = 0%, reduce = 0%2017-07-10 17:14:22,876 Stage-4 map = 50%, reduce = 0%, Cumulative CPU 0.91 sec2017-07-10 17:14:23,905 Stage-4 map = 100%, reduce = 0%, Cumulative CPU 1.99 sec2017-07-10 17:14:40,601 Stage-4 map = 100%, reduce = 50%, Cumulative CPU 3.35 sec2017-07-10 17:14:52,033 Stage-4 map = 100%, reduce = 100%, Cumulative CPU 4.96 secMapReduce Total cumulative CPU time: 4 seconds 960 msecEnded Job = job_1499153664137_0114Launching Job 6 out of 7Number of reduce tasks determined at compile time: 1Starting Job = job_1499153664137_0115, Tracking URL = http://master:8088/proxy/application_1499153664137_0115/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0115Hadoop job information for Stage-5: number of mappers: 2; number of reducers: 12017-07-10 17:15:27,647 Stage-5 map = 0%, reduce = 0%2017-07-10 17:15:54,560 Stage-5 map = 50%, reduce = 0%, Cumulative CPU 0.91 sec2017-07-10 17:15:55,587 Stage-5 map = 100%, reduce = 0%, Cumulative CPU 2.03 sec2017-07-10 17:16:23,357 Stage-5 map = 100%, reduce = 100%, Cumulative CPU 3.59 secMapReduce Total cumulative CPU time: 3 seconds 590 msecEnded Job = job_1499153664137_0115MapReduce Jobs Launched: Stage-Stage-1: Map: 4 Reduce: 1 Cumulative CPU: 81.01 sec HDFS Read: 463074822 HDFS Write: 26949 SUCCESSStage-Stage-8: Map: 4 Cumulative CPU: 15.79 sec HDFS Read: 47322670 HDFS Write: 30829 SUCCESSStage-Stage-3: Map: 3 Reduce: 2 Cumulative CPU: 6.53 sec HDFS Read: 43694 HDFS Write: 1126 SUCCESSStage-Stage-4: Map: 2 Reduce: 2 Cumulative CPU: 4.96 sec HDFS Read: 8888 HDFS Write: 725 SUCCESSStage-Stage-5: Map: 2 Reduce: 1 Cumulative CPU: 3.59 sec HDFS Read: 7001 HDFS Write: 207 SUCCESSTotal MapReduce CPU Time Spent: 1 minutes 51 seconds 880 msecOKMusic 399Entertainment 103Comedy 94People 36Blogs 36Film 19Animation 19Pets 12Animals 12UNA 11Style 6Politics 6News 6Howto 6Sports 4Vehicles 3Autos 3Travel 2Events 2Technology 1Science 1Time taken: 496.822 seconds, Fetched: 21 row(s)
4.5 筛选出某个类别(如music)中热度最高的前10个视频
思路:
- 创建一个表格,存储每个视频对应一个标签的信息
create table youtube_category(videoId string, uploader string, age int, categoryId string, length int, views int, rate float, ratings int, comments int,relatedId array)
row format delimited
fields terminated by “\t”
collection items terminated by “&”
stored as orc; - 将转换后的数据进行插入
insert into table youtube_category select videoid,uploader,age,categoryId,length,views,rate,ratings,comments,relatedId from youtube3 lateral view explode(category) catetory as categoryId; - 根据观看数和类别进行查询
select * from youtube_category where categoryId=”Music” order by views desc limit 10;
结果如下:
hive> create table youtube_category(videoId string, uploader string, age int, categoryId string, length int, views int, rate float, ratings int, comments int,relatedId array<string>) > row format delimited > fields terminated by "\t" > collection items terminated by "&" > stored as orc;OKTime taken: 0.256 secondshive> insert into youtube_category select videoid,uploader,age,categoryId,length,views,rate,ratings,comments,relatedId from youtube3 lateral view explode(category) catetory as categoryId;Query ID = hadoop_20170711091010_7c76d7bd-5af0-43f9-812f-e30c612ee60bTotal jobs = 1Launching Job 1 out of 1Number of reduce tasks is set to 0 since there's no reduce operatorStarting Job = job_1499153664137_0116, Tracking URL = http://master:8088/proxy/application_1499153664137_0116/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0116Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 02017-07-11 09:11:18,741 Stage-1 map = 0%, reduce = 0%...2017-07-11 09:19:05,239 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 402.71 secMapReduce Total cumulative CPU time: 6 minutes 42 seconds 710 msecEnded Job = job_1499153664137_0116Stage-4 is selected by condition resolver.Stage-3 is filtered out by condition resolver.Stage-5 is filtered out by condition resolver.Moving data to: hdfs://bydcluster1/user/hive/warehouse/youtube_category/.hive-staging_hive_2017-07-11_09-10-38_970_8191962088714912782-1/-ext-10000Loading data to table default.youtube_categoryTable default.youtube_category stats: [numFiles=4, numRows=6742209, totalSize=608748677, rawDataSize=10187373874]MapReduce Jobs Launched: Stage-Stage-1: Map: 4 Cumulative CPU: 404.25 sec HDFS Read: 574634998 HDFS Write: 608749047 SUCCESSTotal MapReduce CPU Time Spent: 6 minutes 44 seconds 250 msecOKTime taken: 511.914 secondshive> select * from youtube_category where categoryId="music" order by views desc limit 10;Query ID = hadoop_20170711091919_a48aa98b-9cfa-4f58-a106-9da85484f0ddTotal jobs = 1Launching Job 1 out of 1Number of reduce tasks determined at compile time: 1In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>In order to set a constant number of reducers: set mapreduce.job.reduces=<number>Starting Job = job_1499153664137_0117, Tracking URL = http://master:8088/proxy/application_1499153664137_0117/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0117Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 12017-07-11 09:20:37,607 Stage-1 map = 0%, reduce = 0%...2017-07-11 09:21:32,382 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 36.15 secMapReduce Total cumulative CPU time: 36 seconds 150 msecEnded Job = job_1499153664137_0117MapReduce Jobs Launched: Stage-Stage-1: Map: 3 Reduce: 1 Cumulative CPU: 36.15 sec HDFS Read: 607706126 HDFS Write: 0 SUCCESSTotal MapReduce CPU Time Spent: 36 seconds 150 msecOKTime taken: 96.836 secondshive> select * from youtube_category where catagoryId="Music" order by views desc limit 10;FAILED: SemanticException [Error 10004]: Line 1:37 Invalid table alias or column reference 'catagoryId': (possible column names are: videoid, uploader, age, categoryid, length, views, rate, ratings, comments, relatedid)hive> select * from youtube_category where catagoryid="Music" order by views desc limit 10;FAILED: SemanticException [Error 10004]: Line 1:37 Invalid table alias or column reference 'catagoryid': (possible column names are: videoid, uploader, age, categoryid, length, views, rate, ratings, comments, relatedid)hive> select * from youtube_category where categoryid="Music" order by views desc limit 10;Query ID = hadoop_20170711092525_53f32ccf-7d04-4615-b446-9009cf16dc7fTotal jobs = 1Launching Job 1 out of 1Number of reduce tasks determined at compile time: 1In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>In order to set a constant number of reducers: set mapreduce.job.reduces=<number>Starting Job = job_1499153664137_0118, Tracking URL = http://master:8088/proxy/application_1499153664137_0118/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0118Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 12017-07-11 09:26:08,727 Stage-1 map = 0%, reduce = 0%...2017-07-11 09:27:17,297 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 57.89 secMapReduce Total cumulative CPU time: 57 seconds 890 msecEnded Job = job_1499153664137_0118MapReduce Jobs Launched: Stage-Stage-1: Map: 3 Reduce: 1 Cumulative CPU: 58.14 sec HDFS Read: 607706243 HDFS Write: 3043 SUCCESSTotal MapReduce CPU Time Spent: 58 seconds 140 msecOKcQ25-glGRzI RCARecords 742 Music 227 77674728 4.45 188154 186858 ["otMB3WVQNVg","CAoo71VEYhs","6gqSk1UBFyo","pULa7F1gWZM","C2eUCfREAbw","neZw2CH1h9s","2fjW33W7424","aVnl9QCfNyE","DKhnmUdmz74","tPW70sgrHiY","W4kR8OQCrlQ","-WtcXpnMIT8","YhRZx2QIJKg","5iaYFN5eERA","KChJyERIclc","xsRWpK4pf90","lOq-Q60zWQs","ywNZPURFJNU","dMH0bHeiRNg","tvkh29RKFRY"]7AVHXe-ol-s internmarket 603 Music 264 60349673 3.16 1033 594 ["trrRo3kGRv0","CCsGkN1PEec","l5mQKJDO-nY","DP_4rQcU0G8","7xz5aOvP2YA","kCjKIus3iBA","nD0MILEXZP0","IQcyLMa716k","d05KUL8aW4E","szeeHnu2DM8","7b_wObF6vgs","UazqVaOg9uc","LRBD9l3Nv6Y","108YigkYFgM","_qCau44yfX0","NkkGj4_1m9A","kuyRSrklGU8","uLr7IJYyNnM","IEuyk_277xY","XmAaYo-CQNw"]ePyRrb2-fzs 1988basti 826 Music 204 45984219 4.87 51039 27065 ["fm0T7_SGee4","h8oBykb_Pqs","iWg3IMN_rhU","MdOAr_4FJvc","xNp7OxgFJJM","nzaw_Gk9BAU","u9rl4apDFfY","fMzMm4sJYGo","b5eEa5a2Rio","aNR0tW_afdk","Kj_MceyjS6g","GojTUmjxVHU","CGPUuPHdHQg","SIM4DCn7AlE","ktUSIJEiOug","Dn6kGzRSjhU","gxxU68z8quM","HRSrFm_8YK8","2__Qdd11rfA","XewBty95JVg"]xsRWpK4pf90 universalmusicgroup 902 Music 240 44614530 4.73 99373 53441 ["a4X7eFbP3u4","4WAapKx2TvM","PgZgubuT2DM","FIUv3dOBbCk","Dk8NQB_T1ws","7NAbTHGIPP8","mekQxrnhQ3I","OouU4PFUw3Y","rwgXvL9o0QQ","4eeyhtlJp5A","ktUSIJEiOug","UZStqFeYk3Y","sF84pIhP5UM","FKXm3Qg7sBo","a1ti0KccvOg","ERV-wh4VwZI","iWg3IMN_rhU","i1PKmEaUqes","gsAN_Zfc1nc","CmBCLWkrysY"]ktUSIJEiOug aliciakeys 953 Music 251 43583367 4.85 103074 59672 ["_VD-PDzmW4M","vmjl5ARtQWM","X_SmJpAmirw","glBHQSasVMQ","IGj3T7C-RdE","6U_4ANczMPY","oh3Pkw6cokk","j97WOHviXbE","FqsGSvrcfDE","2PWfB4lurT4","Yci8zVNMEdg","sF84pIhP5UM","dDfEjBSWj54","Zdm7FJktDOM","a4X7eFbP3u4","jhPAK8HjcPI","ePyRrb2-fzs","8_8KJfSNgC4","kN9vm95SocU","4DC4Rb9quKk"]b3u65f4CRLk srcrecords 729 Music 256 43511791 4.76 55148 27818 ["4fZF9UsS8LY","5f7kfhHHH_A","H1vaszd6NnA","6Di-dAIR9RA","HiBcQeIax4g","JVciSFc5Wrw","Md6rURKhZmA","IG5ReXP0SSg","D9g2szHsoz0","EflOarvoeVs","7PxBGHjABnU","Gjq1g3j7WFc","mDvCcrU-Ob8","V9YE8dHMxvw","iWg3IMN_rhU","Un2dbprtZFE","WpgMuHwAdC4","RtQ5JENdx5I","ZEYgAgKVuO4","D_2jb8D8AOI"]iWg3IMN_rhU TimbalandMusic 853 Music 216 43323757 4.8 58761 36142 ["SIM4DCn7AlE","GojTUmjxVHU","ePyRrb2-fzs","2__Qdd11rfA","P_TKpULdDvo","Dn6kGzRSjhU","1iLDIj0pDHk","xsRWpK4pf90","XewBty95JVg","y4jiyjDQGLY","h8oBykb_Pqs","xNp7OxgFJJM","b3u65f4CRLk","S-783rzHIS4","cZd1Js0QaOI","kZGEgVxyPHU","43o0vwAmFM8","BMMUWvavORI","v_-1peCW6Ok","9gkjyM7ZOUc"]innfyQZHPpo chai0322 468 Music 210 41564032 2.61 13564 6126 ["nhSZs-aAZbo","hh0nVc0NYTE","12Z3J1uzd0Q","V1s9queYhF8","1Al4crLW9EM","61e1h4vALS0","8h98jb9Lk74","Z-HjmP7BCVc","o_pIQIV_NuU","82BDV1gYjdM","Af9aPlG2WFw","cC27PTFP4V8","-_CSo1gOd48","00c08ijIlHo","cIKxlWuTviE","dMH0bHeiRNg","iWg3IMN_rhU","v3ARyAb_1Bs","5P6UU6m3cqk","QjA5faZF1A8"]Lt6o8NlrbHg seankingston 867 Music 257 41171303 4.74 101352 67579 ["qwflMOAaOf0","aVB5ViG_0yE","_QmajsQI9SM","Skz6h2gb-t8","t2dkCGDgVzk","HJgsbsMSe5w","BXhN2DbI0hc","sFJQAfW_jgw","RZJ32D-lrTE","U1bf7XLGGRE","ktUSIJEiOug","TWEez0U-9ag","BhkIjh-nuRo","a4X7eFbP3u4","b3u65f4CRLk","sm2fTDpuyyM","9pzro0T8rgY","TBzfqR4mrgE","c2cyose42jU","iWg3IMN_rhU"]QjA5faZF1A8 guitar90 308 Music 320 40294882 4.83 329108 169563 ["r2BOApUvFpw","ATub40Npxik","m7Jh1BV1EOc","owAj5LiXG5w","Ddn4MGaS3N4","GxplDa3M5Io","by8oyJztzwo","aZpD0btOZx8","pZ9jrBg4Lwc","heISA256CRo","6wpPk8qk3uQ","i4BYMvVvMg0","2xjJXT0C0X4","p-VR-cXghko","-9ao_vOsZkg","wfqu4YEufYc","WetVXbYRfWk","JdxkVQy7QLM","fdjamy75C4A","dVUgd8ot6BE"]Time taken: 111.631 seconds, Fetched: 10 row(s)
4.6 筛选出每个类别中评分最高的前10个视频
select * from youtube_category where categoryId=”Music” order by ratings desc limit 10;
结果如下:
hive> select * from youtube_category where categoryId="Music" order by ratings desc limit 10;Query ID = hadoop_20170711093838_6ab5573a-964f-486e-b0dc-77e5853fcfb5Total jobs = 1Launching Job 1 out of 1Number of reduce tasks determined at compile time: 1In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>In order to set a constant number of reducers: set mapreduce.job.reduces=<number>Starting Job = job_1499153664137_0119, Tracking URL = http://master:8088/proxy/application_1499153664137_0119/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0119Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 12017-07-11 09:39:27,903 Stage-1 map = 0%, reduce = 0%...2017-07-11 09:40:36,298 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 57.41 secMapReduce Total cumulative CPU time: 57 seconds 410 msecEnded Job = job_1499153664137_0119MapReduce Jobs Launched: Stage-Stage-1: Map: 3 Reduce: 1 Cumulative CPU: 57.41 sec HDFS Read: 607706243 HDFS Write: 2957 SUCCESSTotal MapReduce CPU Time Spent: 57 seconds 410 msecOKQjA5faZF1A8 guitar90 308 Music 320 40294882 4.83 329108 169563 ["r2BOApUvFpw","ATub40Npxik","m7Jh1BV1EOc","owAj5LiXG5w","Ddn4MGaS3N4","GxplDa3M5Io","by8oyJztzwo","aZpD0btOZx8","pZ9jrBg4Lwc","heISA256CRo","6wpPk8qk3uQ","i4BYMvVvMg0","2xjJXT0C0X4","p-VR-cXghko","-9ao_vOsZkg","wfqu4YEufYc","WetVXbYRfWk","JdxkVQy7QLM","fdjamy75C4A","dVUgd8ot6BE"]cQ25-glGRzI RCARecords 742 Music 227 77674728 4.45 188154 186858 ["otMB3WVQNVg","CAoo71VEYhs","6gqSk1UBFyo","pULa7F1gWZM","C2eUCfREAbw","neZw2CH1h9s","2fjW33W7424","aVnl9QCfNyE","DKhnmUdmz74","tPW70sgrHiY","W4kR8OQCrlQ","-WtcXpnMIT8","YhRZx2QIJKg","5iaYFN5eERA","KChJyERIclc","xsRWpK4pf90","lOq-Q60zWQs","ywNZPURFJNU","dMH0bHeiRNg","tvkh29RKFRY"]pv5zWaTEVkI OkGo 531 Music 184 32022043 4.83 121516 39974 ["dMH0bHeiRNg","x49WZRyXGe0","RbdbVhBGETQ","yRmqZRPgK1w","DjCL0_0Il7w","gq7r3F1SoX0","k66epna2Sss","1LNbzqoOPu4","vr3x_RRJdd4","ERV-wh4VwZI","xsRWpK4pf90","iAQZ_uui1SY","5P6UU6m3cqk","bav63MWNUKg"]ktUSIJEiOug aliciakeys 953 Music 251 43583367 4.85 103074 59672 ["_VD-PDzmW4M","vmjl5ARtQWM","X_SmJpAmirw","glBHQSasVMQ","IGj3T7C-RdE","6U_4ANczMPY","oh3Pkw6cokk","j97WOHviXbE","FqsGSvrcfDE","2PWfB4lurT4","Yci8zVNMEdg","sF84pIhP5UM","dDfEjBSWj54","Zdm7FJktDOM","a4X7eFbP3u4","jhPAK8HjcPI","ePyRrb2-fzs","8_8KJfSNgC4","kN9vm95SocU","4DC4Rb9quKk"]Lt6o8NlrbHg seankingston 867 Music 257 41171303 4.74 101352 67579 ["qwflMOAaOf0","aVB5ViG_0yE","_QmajsQI9SM","Skz6h2gb-t8","t2dkCGDgVzk","HJgsbsMSe5w","BXhN2DbI0hc","sFJQAfW_jgw","RZJ32D-lrTE","U1bf7XLGGRE","ktUSIJEiOug","TWEez0U-9ag","BhkIjh-nuRo","a4X7eFbP3u4","b3u65f4CRLk","sm2fTDpuyyM","9pzro0T8rgY","TBzfqR4mrgE","c2cyose42jU","iWg3IMN_rhU"]xsRWpK4pf90 universalmusicgroup 902 Music 240 44614530 4.73 99373 53441 ["a4X7eFbP3u4","4WAapKx2TvM","PgZgubuT2DM","FIUv3dOBbCk","Dk8NQB_T1ws","7NAbTHGIPP8","mekQxrnhQ3I","OouU4PFUw3Y","rwgXvL9o0QQ","4eeyhtlJp5A","ktUSIJEiOug","UZStqFeYk3Y","sF84pIhP5UM","FKXm3Qg7sBo","a1ti0KccvOg","ERV-wh4VwZI","iWg3IMN_rhU","i1PKmEaUqes","gsAN_Zfc1nc","CmBCLWkrysY"]K2cYWfq--Nw FrEckleStudios 841 Music 224 17005186 4.82 90991 50788 ["SyIC3Munnyw","cZd1Js0QaOI","lLYD_-A_X5E","alqM0IYeH54","wQVEPFzkhaM","oGECJP3phyY","nPLOiBM8hLk","VpBDqtUEWcM","MJPdVVOmbz4","bPZJYQXQsm8","nPBmXEO3yUU","3jzSh_MLNcY","_EXeBLvmllg","Cva_sGN_0VA","Sr2JneittqQ","SYpYmkcadRA","71eBjNbdXDg","DgBgnoEY4iM","bl6RJyZdBSU","FAK_jtOf70g"]-xEzGIuY7kw alyankovic 580 Music 171 24095019 4.84 90091 45517 ["HYokLWfqbaU","N26KWq7MmSc","JCAt9WcCFbM","Rt1_6uz_sVU","Nh9mVsBKwYs","p9Zt8mn14hY","8n7ncJEFuSw","FT060JGp9sQ","E6Zc9NyYH-k","ixyTNd-Ln38","zIllRdSzSug","GsfVw9xxoNY","XkDeJgGrtdU","fqz1ojIQTBk","5GE82tqcYYQ","ODdGhOOUOpI","v3ARyAb_1Bs","XbVtbc_XzrI","U1ULxKM75rY","Jw00EUh0GT4"]EwTZ2xpQwpA TayZonday 796 Music 292 16841569 4.23 83514 129200 ["2x2W12A8Qow","P6dUCOS1bM0","NattlyH0IeM","nTQOpibv_OA","9mSKBgvHdoE","hjD6iigdB-g","caIBKOztlAo","xUz2YMmiq0k","aWY3eYOX3U0","mD5_GUovjiM","1oFS-q8BIps","deXAEN70CDY","0pElTyjfxe0","eyDuGwlrFRs","m6SjPfc_xNA","qYGvGWY1FDs","N0amCfgnwY8","JPu4uErBFks","pgSA-ErKd8c","ZZgGGlOGyUg"]xWHf_vYZzQ8 universalmusicgroup 771 Music 262 25607299 4.79 83419 70529 ["jvz0bvYmnto","zElEs8yw7fw","9FcBnaLjxY4","95wgKdSJGDo","ueOgZPBXY4A","k3O5uy-MBBk","O3sGTaQ9s9c","aT434G38OBg","4PdDPrwIwhI","Y-VifE8EK8w","9wpxno6qUd0","B17SYROT3GI","wJtUuxmm-B0","LBDnkJ5h1ho","CfzpBjKpSPE","QF2AmC2xyXM","eoa6Gx4HxTc","mvPvcV44rCc"]Time taken: 109.053 seconds, Fetched: 10 row(s)
4.7 找出用户中上传视频最多的10个用户的所有视频
思路:
- 筛选出用户表中上传视频数最多的前十位
select * from user order by videos desc limit 10; - 将筛选结果跟youtube3进行join得出结果
select b.,a.videos,a.friends from (select from user order by videos desc limit 10) a join youtube3 b on a.uploader = b.uploader order by views desc limit 20;
结果如下:
hive> select b.*,a.videos,a.friends from (select * from user order by videos desc limit 10) a join youtube3 b on a.uploader = b.uploader order by views desc limit 20;Query ID = hadoop_20170711101616_d9ebb69e-7fbe-4f09-ad55-54f4dfd11ebeTotal jobs = 5Launching Job 1 out of 5Number of reduce tasks determined at compile time: 1In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>In order to set a constant number of reducers: set mapreduce.job.reduces=<number>Starting Job = job_1499153664137_0125, Tracking URL = http://master:8088/proxy/application_1499153664137_0125/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0125Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 12017-07-11 10:17:30,594 Stage-1 map = 0%, reduce = 0%2017-07-11 10:18:03,439 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 6.03 sec2017-07-11 10:18:25,025 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 10.28 secMapReduce Total cumulative CPU time: 10 seconds 280 msecEnded Job = job_1499153664137_0125Stage-8 is filtered out by condition resolver.Stage-9 is selected by condition resolver.Stage-2 is filtered out by condition resolver.17/07/11 10:18:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicableExecution log at: /tmp/hadoop/hadoop_20170711101616_d9ebb69e-7fbe-4f09-ad55-54f4dfd11ebe.log2017-07-11 10:18:33 Starting to launch local task to process map join; maximum memory = 5189795842017-07-11 10:18:44 Dump the side-table for tag: 0 with group count: 10 into file: file:/tmp/hadoop/146c18a2-9a94-440d-9302-72e50ede944e/hive_2017-07-11_10-16-50_653_2968482866464413222-1/-local-10007/HashTable-Stage-6/MapJoin-mapfile90--.hashtable2017-07-11 10:18:44 Uploaded 1 File to: file:/tmp/hadoop/146c18a2-9a94-440d-9302-72e50ede944e/hive_2017-07-11_10-16-50_653_2968482866464413222-1/-local-10007/HashTable-Stage-6/MapJoin-mapfile90--.hashtable (611 bytes)2017-07-11 10:18:44 End of local task; Time Taken: 11.244 sec.Execution completed successfullyMapredLocal task succeededLaunching Job 3 out of 5Number of reduce tasks is set to 0 since there's no reduce operatorStarting Job = job_1499153664137_0126, Tracking URL = http://master:8088/proxy/application_1499153664137_0126/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0126Hadoop job information for Stage-6: number of mappers: 4; number of reducers: 02017-07-11 10:19:25,285 Stage-6 map = 0%, reduce = 0%...2017-07-11 10:20:26,162 Stage-6 map = 100%, reduce = 0%, Cumulative CPU 44.09 secMapReduce Total cumulative CPU time: 44 seconds 90 msecEnded Job = job_1499153664137_0126Launching Job 4 out of 5Number of reduce tasks determined at compile time: 1Starting Job = job_1499153664137_0127, Tracking URL = http://master:8088/proxy/application_1499153664137_0127/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0127Hadoop job information for Stage-3: number of mappers: 2; number of reducers: 12017-07-11 10:21:01,746 Stage-3 map = 0%, reduce = 0%2017-07-11 10:21:30,462 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 4.27 sec2017-07-11 10:21:48,968 Stage-3 map = 100%, reduce = 100%, Cumulative CPU 6.36 secMapReduce Total cumulative CPU time: 6 seconds 360 msecEnded Job = job_1499153664137_0127MapReduce Jobs Launched: Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 10.28 sec HDFS Read: 9796823 HDFS Write: 434 SUCCESSStage-Stage-6: Map: 4 Cumulative CPU: 44.09 sec HDFS Read: 574641054 HDFS Write: 2133846 SUCCESSStage-Stage-3: Map: 2 Reduce: 1 Cumulative CPU: 6.36 sec HDFS Read: 2144607 HDFS Write: 6335 SUCCESSTotal MapReduce CPU Time Spent: 1 minutes 0 seconds 730 msecOKlUOe76YPY7M expertvillage 801 ["Howto","Style"] 119 1014983 3.36 596 689 ["5cEsa-rswrg","HOoQPbcF8Rk","ht3DMDx1spo","VFnf17cp3Eg","IP35rm_31RQ","hXuiuwZPX0M","s8pH5jgSuP8","KIEkwm_6DD4","ovFSahIPaVQ","rQtwcSDh3Hk","COeT3WR7SFc","ZNLBKRpWt7w","qncPBZ9DRRk","tOBqg3yDlyE","YgfWUXsbr7w","qBB3AsgjN1M","XX8ouWx6xm8","BtF9dYRuiGU","NRanI5qR81I","evfl0NGLjVE"] 86228 56592Aj2wx9PYxI expertvillage 815 ["Howto","Style"] 170 918553 3.82 1596 1533 ["hb5QaCfm7bg","YVDPSZe21KI","C1elpMjZ4wE","8gD1RG-tnNk","d-IDrRSVkCQ","IAOr6SYGAyw","3mbx03mP5eg","o-pN8qAiZhQ","K_qw03-3gFg","JgsO3A_vbtk","ffKr98Ium-M","WKYcBKa8_7Q","dgeX8CZmqvo","jgUCs72RDHs","PaNhROW-wEo","WhRCVm-1r2k","5I5O8P-r5Rk","reJSIZ3ugsE","-51iHvqP0Rs","9wItsn3r_kc"] 86228 5659VBGHer3yFyc libertaddigitaltv 998 ["News","Politics"] 29 828616 4.68 1337 10763 ["HL9p8I3lOFA","Fb3HZ7K1gTk","AARX6XOA5h0","t3DPDKbRxio","VAmz8MNZdlk","WrC6_uBe4eA","d9X8DYOA5DE","pZuZv8UK7YA","NwswzwoA4pA","g1J2gvjp4fs","3SMhrQZdABc","h6M42Il4kN8","izM_JwEkWPs","_FM_IoPIFq4","uMZCCPtl_Do","5cZFinVFubQ","om0iHwO4d6o","lUZxlXkbaxM","iSJ3qKpC-3o","tRdq9_Si0Ws"] 6874 107jnqD8wvyE expertvillage 776 ["Howto","Style"] 217 574365 4.55 1073 1526 ["3wNpOp50uGI","-taU9d26wT4","_kly-fVUi1I","kZtRmnly_sU","LKe50AJvhz8","0GAUnuuBkW4","eM6Bpz6-dio","i9xf62PKC5M","mwFfk7igYC0","ilLTA4p4MMw","qyoLuTjguJA","aC-KOYQsIvU","0V6LWQZjYRk","_uZkvzYEXp0","auQbi_fkdGE","qgiUSEpg8Xc","4I79yc3uKfE","7Lz-XGjynN8","taHOwsdQXWI","2Aj2wx9PYxI"] 86228 5659NDwZcLGekE8 expertvillage 801 ["Howto","Style"] 94 491582 4.35 553 588 ["CEmYX76UJy0","0sF-PoXR9ZU","qDfoNYpozxk","T4P3yp6mFfY","rzcj_Wq6BIE","A2LTVhqHAdo","4gyH0mJPqY8","m3PsQCMz3sY","wckYj_PAhgM","YkvwdM9Xstc","Ph2UAjGfeB4","vT_uXPfSAVE","89mAouUu8YM","0t5pPZ4AXX0","sLNFN75CYbw","QQGHXEPfMcQ","p6gcz4hkOmw","qk6R-X-YmUw","NR_9VlHFOu8","0gA_3BAxtVM"] 86228 5659CEmYX76UJy0 expertvillage 801 ["Howto","Style"] 90 422284 4.48 448 459 ["NDwZcLGekE8","jRAKgd50Jh0","0sF-PoXR9ZU","IjjJWXgqWL4","m3PsQCMz3sY","8Di58l1dfPA","DpEVqy954OY","dmYMeImNy1s","qk6R-X-YmUw","wckYj_PAhgM","qZSD3JLPURk","YkvwdM9Xstc","fdSFh_z0HJY","n5LDyWyHJTo","Dmm5O1DPQYc","A2LTVhqHAdo","I2mM0r76b7o","iE16Z9Gh3TQ","rzcj_Wq6BIE","8szPfWiK-ag"] 86228 5659E6VWi7IaroA expertvillage 946 ["Howto","Style"] 228 338889 4.46 300 251 ["fLy3M2jfe4w","cC1FBmbYgMk","xEJhInSxsQg","RxMLoqzKZ8s","Z9a--4RQ9O8","7neOXKLrgzk","JS-lS9ngMtY","aKIFfgPLY8I","EsqPNhtkWIo","NJUPNtZN2rc","vV7pajY0zOk","GTtZapaRDbg","-0UVDD6ZwYE","B9jNaUP59Vo","2QDGPe50E4Y","lJDZO_pX4aw","wuD2CemlvP0","ZTx2ZFI1spg","CzZghYdupiQ","5ueEKSJ8J-U"] 86228 56590sF-PoXR9ZU expertvillage 801 ["Sports"] 77 326365 4.21 512 663 ["24FJ58Q22D4","m3PsQCMz3sY","qk6R-X-YmUw","CEmYX76UJy0","NDwZcLGekE8","QiIunH47qew","I2mM0r76b7o","c_qSPdLPReM","5eYBaQ5Cg-c","fdSFh_z0HJY","8szPfWiK-ag","Dmm5O1DPQYc","A2LTVhqHAdo","n-Cvzk84X8o","T4P3yp6mFfY","1vo3CCBIcaY","fnGFR1AFCJc","n5LDyWyHJTo","YkvwdM9Xstc","emNfLmLTQb0"] 86228 5659ohdiRHLSw5Y expertvillage 447 ["Howto","Style"] 78 321521 3.79 139 102 ["12_nJamoyTk","dXLhjYgMZ68","ukzFwRoFj4k","sOUgrJo2kIg","HhO39nCDfMg","3iVP0tzwhVc","MFNA0PqLynY","ZlcMzLhiBjg","7F8ajh_DDYs","6vaPIz6S6sk","-L_uhGXOtF0","-libzR5AV58","CSdSH7XKTtQ","KZX5jXPAWIk","Vkj5fbrQpNs","YCgnFIk5Acg","hjPDmVf6KJw","BZnhMl85dq4","iuqVpMdb1NM","GO2_3q6euug"] 86228 56590sF-PoXR9ZU expertvillage 801 ["Howto","Style"] 77 294737 4.21 438 599 ["24FJ58Q22D4","m3PsQCMz3sY","CEmYX76UJy0","NDwZcLGekE8","A2LTVhqHAdo","qk6R-X-YmUw","5eYBaQ5Cg-c","fdSFh_z0HJY","jGdXcitOUzY","n-Cvzk84X8o","wckYj_PAhgM","I2mM0r76b7o","QMZaxtjhZ5k","Dmm5O1DPQYc","vT_uXPfSAVE","8szPfWiK-ag","QiIunH47qew","YkvwdM9Xstc","DpEVqy954OY","c_qSPdLPReM"] 86228 56593Xc-kxSznZ0 expertvillage 430 ["Howto","Style"] 107 292329 3.33 565 964 ["9YAKfkVvvEc","wEkcPjBaHjs","zMl3ixv1kHw","Q4ZUPEgbzu0","-jIEGZwLPvo","YxdxGLBCCRA","Jp0LaJ_ftT0","PgCRWvwdFHM","yZL2MOFk6I0","qDJk3-ofbk0","e4nHAAuDiPE","cuRk9bTbU6A","z53cPMciVio","MREw0dIWaQ0","237AQAvXtw8","a3dnOupmt7Y","3-va5g_QVss","mFdaiY8YhEY","D6li4tYmKf0","bEVHBHKqBfQ"] 86228 5659VxYkdycN3WE expertvillage 801 ["Howto","Style"] 77 257702 4.19 153 137 ["ZpCE7mnYJK0","NYDdCGtbr8E","904S9W9AZg8","Rl5KEODnCfI","7-i2gl9288I","TC1XlhRwhsU","rD_4nCKf5Ns","pFKZZ120fyU","oN4Esih54V8","N1Ov6cCUXkc","rDM97S4jPiw","IK2uLNND3i0","OZXbM6kRuuA","5pvyXkFTa18","bExUJAnCLZw","VoREkZvkyI8","BWhfpzAItx4","EEybT3IeyzA","QteH0KOJx0g","dbXHvd_SGkk"] 86228 5659-libzR5AV58 expertvillage 447 ["Howto","Style"] 391 249682 3.34 73 62 ["-L_uhGXOtF0","4LvTYBbgMjE","-wO07skEauo","CSdSH7XKTtQ","kzQDoXKDgTM","e1kzLj-FZ6k","Guhgb3pWRAQ","KIbyySDSGEc","DGUBzs-GNxE","KZX5jXPAWIk","ohdiRHLSw5Y","6vaPIz6S6sk","5MrAuq_F2dU","gOv-yPqZ3LE","BZnhMl85dq4","oCDdDCqygOg","12_nJamoyTk","7F8ajh_DDYs","HhO39nCDfMg","X3-6IT-7S80"] 86228 5659HL9p8I3lOFA libertaddigitaltv 998 ["News","Politics"] 17 235313 4.61 114 581 ["VBGHer3yFyc","Fb3HZ7K1gTk","Z3Pe8ff37gk","t3DPDKbRxio","pZuZv8UK7YA","_FM_IoPIFq4","1P_EQjd7lq4","yEzNx2g7ors","_neKPvLdG8Y","7i_WDprxACU","zCM0nsym0cE","om0iHwO4d6o","h6M42Il4kN8","Bjk1McjrWtg","VAmz8MNZdlk","-t2hwljZmA8","gH6PHFrA6g8","ysAwzxqbWD8","U5lX_EIFOOA","4zu_X_3qHLA"] 6874 1oE5ZhBHy6Rs expertvillage 776 ["Howto","Style"] 143 234551 4.64 358 280 ["rCWRUtqJgzI","476vNb6thyM","eg3eo8x_9rw","grK43Poye1U","HGncDLMNPRI","iylaKSDfLSc","JFqiDcvRW2Y","XrbXGRLIDTc","NmxoR9lsu-c","JW8FJefw82k","n0Mi550FDng","QXsW44Wh6tY","6oJDYT5MlEA","IJy05ssELhg","TvT-M7tTQHE","GYNVEWCO_X0","5pJ_o19GVRA","q_Nn_CnvD9s","uqpQNkFxVAU","KKP8vWeEc30"] 86228 5659wOnP_oAXUMA expertvillage 429 ["Howto","Style"] 120 228842 3.56 236 601 ["BlDWdfTAx8o","WF1kRLJ_4B8","6bu9csQC45c","xTAYAl4g7HE","871VTEF7qIY","UheCchftswc","TKg_cdwq9l4","ADQ4Bjq42wE","cSJCDcAKShA","SYklbxHP2tI","fktuPPNkQpQ","L6267-JZu3E","_IwJ3Aj2XrQ","hf-4ppNmH-U","aO8evzfTR8E","E-kNUEv0YgA","Hsr1-xcXFL8","VCiLZMjo_PQ","rMEgKgZOJi8","YeYU_OE5oIU"] 86228 5659MVPCc4eODys expertvillage 448 ["Howto","Style"] 69 225879 4.29 171 315 ["iPPgmvcsJNw","xc8sysT31mQ","svt_fnKE_80","kBjUDCyDCuI","ysa50-plo48","yutDlQL9Ct0","gCra4qOrjFw","CN1QJDGinwE","dFJjaj7pXsA","LvJzFdcYSag","4I79yc3uKfE","AcryZ1o4RQE","WyyB0M6wAV8","A0sOVKbYR20","8XCyd0XEejc","5BmVKKpu_CU","8_QNzZ9WPRo","pNiX_l-HEGM","CQJSZs-euZU","ZwHHYzK8UCg"] 86228 56597neOXKLrgzk expertvillage 946 ["Howto","Style"] 111 215943 4.07 215 314 ["xEJhInSxsQg","PSR5kinMX3Y","E6VWi7IaroA","5ovGW0CzQp4","DeMcNhz0Vz8","CzZghYdupiQ","72EYsQEw35k","tScm-eZInBE","Hz4lnt7d88s","dgLUbXSqwSs","G3P4VFrB_zM","pU0O2-o67C0","pDGP8Q3Zzb8","hQhdK8l6CuY","VK9VflDsunI","LWgzG8I9bWY","csL7WNTWK9U","LE93Q5k5-fI","Uirspi_t3xM","wGLNJK4zcdE"] 86228 5659AaO1aLwk53Y expertvillage 443 ["Howto","Style"] 156 204450 4.14 210 290 ["Q8YYZGvKM-8","Bn59zha-uAQ","BjnkKuI-YAY","aQeXHXkL6ow","Qtp2ibwd2Ms","nstbrkjk3OM","Odj4ATUPh9w","v2mm2-9JQ-I","328YVJVtMKU","B-jzkyKxDLo","yB-yGNxNEZg","b0msCCnzmNA","sfh5AJhRto8","SEILiSkGzrY","jm4uxgVDU6o","svOBjuvvy-k","y_1nj8fBQOE","zb8ArkvxOxo","VgWayeJdV0w","Yor1dHH6orE"] 86228 5659m3PsQCMz3sY expertvillage 801 ["Howto","Style"] 99 203912 4.38 177 207 ["CEmYX76UJy0","0sF-PoXR9ZU","NDwZcLGekE8","qk6R-X-YmUw","DpEVqy954OY","Ph2UAjGfeB4","fdSFh_z0HJY","YkvwdM9Xstc","8szPfWiK-ag","Dmm5O1DPQYc","T4P3yp6mFfY","I2mM0r76b7o","7GtVwKZBf9U","mugebyUqK6o","qDfoNYpozxk","XA7dRqkp8YA","a28QcMXjHyI","n5LDyWyHJTo","ovoNU_CbmfA","Q2j6MlACIoc"] 86228 5659Time taken: 300.414 seconds, Fetched: 20 row(s)
4.8 筛选出每个类别中观看数Top10
select a.* from (select videoId,categoryId,views,row_number() over(partition by categoryId order by views desc) rank from youtube_category) a where rank<=10;
hive> select a.* from (select videoId,categoryId,views,row_number() over(partition by categoryId order by views desc) rank from youtube_category) a where rank<=10;Query ID = hadoop_20170713170101_76ffdd80-e72c-4c7f-b9ad-9b8c3f7e17abTotal jobs = 1Launching Job 1 out of 1Number of reduce tasks not specified. Estimated from input data size: 3Starting Job = job_1499153664137_0143, Tracking URL = http://master:8088/proxy/application_1499153664137_0143/Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0143Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 32017-07-13 17:01:52,690 Stage-1 map = 0%, reduce = 0%2017-07-13 17:02:24,374 Stage-1 map = 44%, reduce = 0%, Cumulative CPU 10.49 sec...2017-07-13 17:03:08,803 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 89.07 secMapReduce Total cumulative CPU time: 1 minutes 29 seconds 70 msecEnded Job = job_1499153664137_0143MapReduce Jobs Launched: Stage-Stage-1: Map: 3 Reduce: 3 Cumulative CPU: 89.07 sec HDFS Read: 73070026 HDFS Write: 7487 SUCCESSTotal MapReduce CPU Time Spent: 1 minutes 29 seconds 70 msecOKqruSOZq-wJg Activism 2673823 1tFxk7glmMbo Activism 1063928 2dYN278GB2Kg Activism 831836 3n-Akqik3hME Activism 778987 4LtqRAYWjz2Q Activism 768422 52v-whrgAbf4 Activism 742172 6ja4d02s0WKQ Activism 733014 7IpAOYskH1s8 Activism 714580 8ieuVTOJyLBs Activism 640994 96CXBgbP_ZKs Activism 562163 1012Z3J1uzd0Q Animation 65341925 1bFytHZHFXhA Animation 38937813 2sdUUx5FdySs Animation 16151661 3vQbYvjmfbr4 Animation 12114231 4rNKefRRvV7g Animation 11884311 56HIavxnUHls Animation 11735507 6gm5DHKI8o5o Animation 9381674 7316BF17k5d8 Animation 9372064 8ZuQMn6Z0k_w Animation 9195588 96B26asyGKDo Animation 8915176 10dMH0bHeiRNg Comedy 79897120 15P6UU6m3cqk Comedy 42525795 2Tx1XIm6q4r4 Comedy 38378910 3Q5im0Ssyyus Comedy 21170471 4k66epna2Sss Comedy 17944367 5AYxu_MQSTTY Comedy 15665340 6pYak2F1hUYA Comedy 15634357 7_OBlgSz8sSM Comedy 14567743 8wCF3ywukQYA Comedy 14227802 9sHzdsFiBbFc Comedy 13606292 10Mz9BlgiV45I Education 2807232 1VnGb6UQvwJs Education 2441432 2j6lADdhmAec Education 2306689 3pYfTQpsErsM Education 2066429 47GNE6AhL0qI Education 2037257 5NiGZf15W9xc Education 1867126 6-FeAK-q5Cok Education 1741504 7N5P6o2YaqVI Education 1656512 8Dl-agXA63nw Education 1529566 9JX3VmDgiFnY Education 1373008 10cQ25-glGRzI Music 77674728 17AVHXe-ol-s Music 60349673 2ePyRrb2-fzs Music 45984219 3xsRWpK4pf90 Music 44614530 4ktUSIJEiOug Music 43583367 5b3u65f4CRLk Music 43511791 6iWg3IMN_rhU Music 43323757 7innfyQZHPpo Music 41564032 8Lt6o8NlrbHg Music 41171303 9QjA5faZF1A8 Music 40294882 10qruSOZq-wJg Nonprofits 2673823 1tFxk7glmMbo Nonprofits 1063928 2dYN278GB2Kg Nonprofits 831836 3n-Akqik3hME Nonprofits 778987 4LtqRAYWjz2Q Nonprofits 768422 52v-whrgAbf4 Nonprofits 742172 6ja4d02s0WKQ Nonprofits 733014 7IpAOYskH1s8 Nonprofits 714580 8ieuVTOJyLBs Nonprofits 640994 96CXBgbP_ZKs Nonprofits 562163 10SkELRp4wKPs Politics 12730681 1AFFQrUyi8-s Politics 11953598 2YgW7or1TuFk Politics 8516433 3JgiGrXpOhYg Politics 8211043 4up5jmbSjWkw Politics 7869023 5hr23tpWX8lM Politics 7209885 6Kje7NUNebL8 Politics 6707868 7I4u3449L5VI Politics 6675798 8jjXyqcx-mYY Politics 6462051 9a9Vde3FHMmc Politics 6028642 108h98jb9Lk74 UNA 33880568 1AR5yq0aMxI8 UNA 18409445 2DH7FVB15EPU UNA 15487752 3Qvfx6UiAAG0 UNA 13977592 4n-FdoZdfpmE UNA 12241555 5MddPeH1DAvY UNA 12211192 6sI8Sus_KRpY UNA 10460047 714oqm9ywLIw UNA 10344366 8LytRWuhn4BM UNA 9899270 9iSByZARPJSU UNA 9316897 10Oro28yg7W74 Gaming 727015 10to1HDxtkM4 Gaming 529346 2M-w-gLVfi30 Gaming 396471 3i6aH2F1WjsY Gaming 382350 4EgbUSsblCSQ Gaming 266718 57SMnWr9MqwQ Gaming 210147 6NQMBIRipp5A Gaming 207783 7vi1lVqJSbsM Gaming 186709 8aId2hK4pI2M Gaming 186054 9tWPWcyLdrko Gaming 181979 10sLGLum5SyKQ Howto 31121122 1eMhGpzyFdhE Howto 16320501 2KPOOWvP_dd8 Howto 11131527 391wuBqlny50 Howto 8579613 4mr5ghuaTK14 Howto 8361812 5GfPJeDssBOM Howto 6101232 66gmP4nk0EOE Howto 4970382 7mM-30cmM33s Howto 4936417 8STQ3nhXuuEM Howto 4655542 9XZGgeGHU1Bs Howto 4424698 10LU8DDYz68kM Pets 27721690 1epUk3T2Kfno Pets 10352882 2z3U0udLH974 Pets 9461084 3kkT7A3jegBc Pets 9269896 4TZ860P4iTaM Pets 9009434 57tRWRSfcDuQ Pets 8538635 6Qit3ALTelOo Pets 7939352 7Zi9GOvR3Ynw Pets 7351184 8Kxa0mnDj0bs Pets 7289545 9PadauuWF94w Pets 6271287 10W1czBcnX1Ww Science 3234852 1D99NHb6B03s Science 3176792 2tk_F2Y-F2kE Science 3121903 3nhyH7lQ6D2k Science 2879861 4JCbKv9yiLiQ Science 2672391 5U5vs2ly_grk Science 2611389 6M0ODskdEPnQ Science 2555284 7p4ebtj1jR7c Science 2536109 88wTlureUMP8 Science 2477804 9NZNTgglPbUA Science 2230729 10vt4X7zFfv4k Sports 12598542 1P-bWsOK-h98 Sports 12101588 2OS5tQvQOB-Y Sports 9047732 3P9LmHXXWiJs Sports 7415813 4NIKdK-T-jZM Sports 7175403 5euMu1SKi-ak Sports 7040023 6zKQgTiqhPbw Sports 7017699 7yeXoxNP8_xY Sports 6422039 8q8t7iSGAKik Sports 6312667 9COcczatkNP4 Sports 6258611 10sLGLum5SyKQ Style 31121122 1eMhGpzyFdhE Style 16320501 2KPOOWvP_dd8 Style 11131527 391wuBqlny50 Style 8579613 4mr5ghuaTK14 Style 8361812 5GfPJeDssBOM Style 6101232 66gmP4nk0EOE Style 4970382 7mM-30cmM33s Style 4936417 8STQ3nhXuuEM Style 4655542 9XZGgeGHU1Bs Style 4424698 10ZeBd_F2Bz5Y Vehicles 8623041 1ju6t-yyoU8s Vehicles 7325889 2uUurALr_Ckk Vehicles 7258142 3WShY1ObPvhQ Vehicles 7047156 4q3idQKi5EqM Vehicles 6470816 59JWywFpZkg4 Vehicles 6465830 6tth9krDtxII Vehicles 6394563 7npTRXr4Sgxg Vehicles 5450317 8aCamHfJwSGU Vehicles 5354163 9S-ppGiwc0wQ Vehicles 5039415 10LU8DDYz68kM Animals 27721690 1epUk3T2Kfno Animals 10352882 2z3U0udLH974 Animals 9461084 3kkT7A3jegBc Animals 9269896 4TZ860P4iTaM Animals 9009434 57tRWRSfcDuQ Animals 8538635 6Qit3ALTelOo Animals 7939352 7Zi9GOvR3Ynw Animals 7351184 8Kxa0mnDj0bs Animals 7289545 9PadauuWF94w Animals 6271287 10ZeBd_F2Bz5Y Autos 8623041 1ju6t-yyoU8s Autos 7325889 2uUurALr_Ckk Autos 7258142 3WShY1ObPvhQ Autos 7047156 4q3idQKi5EqM Autos 6470816 59JWywFpZkg4 Autos 6465830 6tth9krDtxII Autos 6394563 7npTRXr4Sgxg Autos 5450317 8aCamHfJwSGU Autos 5354163 9S-ppGiwc0wQ Autos 5039415 10v3ARyAb_1Bs Blogs 31812447 15GE82tqcYYQ Blogs 30209692 2ervaMPt4Ha0 Blogs 23859297 3uWow42TCwzg Blogs 22389389 4-_CSo1gOd48 Blogs 21176701 5D2kJZOfq7zk Blogs 20458159 6nhSZs-aAZbo Blogs 20423158 7GuMMfgWhm3g Blogs 18634621 84jbkRGPxvaM Blogs 15980887 9hMnk7lh9M3o Blogs 13553069 10LpAI8TzQDes Entertainment 65078772 1244qR7SvvX0 Entertainment 57790943 21uwOL4rB-go Entertainment 39883413 3w2xUzv6iZWo Entertainment 25222946 4vr3x_RRJdd4 Entertainment 25093671 5lj3iNxZ8Dww Entertainment 23737579 6RB-wUgnyGv0 Entertainment 23067889 7lsO6D1rwrKc Entertainment 21758300 87iYWxfNSjYk Entertainment 19029677 95pGJCkCDK5A Entertainment 18858695 10p0aQvKDA1K0 Events 12239023 1bNF_P281Uu4 Events 9125026 2AlPqL7IUT6M Events 6441558 3J833f9fqWBA Events 4776832 4xIvIWJbzimo Events 4284770 5QuTj9a04o-s Events 4053317 6eejQPUyeNiY Events 3573812 73QL97xldoXc Events 3010296 8r43yCiKlbCo Events 2806995 9z42fchrzhHY Events 2565257 1012Z3J1uzd0Q Film 65341925 1bFytHZHFXhA Film 38937813 2sdUUx5FdySs Film 16151661 3vQbYvjmfbr4 Film 12114231 4rNKefRRvV7g Film 11884311 56HIavxnUHls Film 11735507 6gm5DHKI8o5o Film 9381674 7316BF17k5d8 Film 9372064 8ZuQMn6Z0k_w Film 9195588 96B26asyGKDo Film 8915176 10SkELRp4wKPs News 12730681 1AFFQrUyi8-s News 11953598 2YgW7or1TuFk News 8516433 3JgiGrXpOhYg News 8211043 4up5jmbSjWkw News 7869023 5hr23tpWX8lM News 7209885 6Kje7NUNebL8 News 6707868 7I4u3449L5VI News 6675798 8jjXyqcx-mYY News 6462051 9a9Vde3FHMmc News 6028642 10v3ARyAb_1Bs People 31812447 15GE82tqcYYQ People 30209692 2ervaMPt4Ha0 People 23859297 3uWow42TCwzg People 22389389 4-_CSo1gOd48 People 21176701 5D2kJZOfq7zk People 20458159 6nhSZs-aAZbo People 20423158 7GuMMfgWhm3g People 18634621 84jbkRGPxvaM People 15980887 9hMnk7lh9M3o People 13553069 10W1czBcnX1Ww Technology 3234852 1D99NHb6B03s Technology 3176792 2tk_F2Y-F2kE Technology 3121903 3nhyH7lQ6D2k Technology 2879861 4JCbKv9yiLiQ Technology 2672391 5U5vs2ly_grk Technology 2611389 6M0ODskdEPnQ Technology 2555284 7p4ebtj1jR7c Technology 2536109 88wTlureUMP8 Technology 2477804 9NZNTgglPbUA Technology 2230729 10p0aQvKDA1K0 Travel 12239023 1bNF_P281Uu4 Travel 9125026 2AlPqL7IUT6M Travel 6441558 3J833f9fqWBA Travel 4776832 4xIvIWJbzimo Travel 4284770 5QuTj9a04o-s Travel 4053317 6eejQPUyeNiY Travel 3573812 73QL97xldoXc Travel 3010296 8r43yCiKlbCo Travel 2806995 9z42fchrzhHY Travel 2565257 10Time taken: 121.381 seconds, Fetched: 250 row(s)
5 总结
上面的8个例子向大家展示Hive中简单和稍微复杂的操作,有的启动一个Job就可以完成,有的则需要启动多个Job才能完成,我们也可以看到,启动的Job越多运行时间就会越长,但是实际工作中的操作只会远比我们所演示要更加复杂,越是复杂的操作就更加需要去优化,来达到减少运行时间的目的,所以下一篇我们来看看Hive的优化实践
- Hive实战之Youtube数据集
- SparkSQL实战之Youtube数据集
- 大数据时代--Hive实战之Hive命令
- Hive数据挖掘实战教程
- 5 大数据实战-hive实战分析
- Hive实战之日志分析
- Hive实战之日志分析
- 【分享】YouTube随机抽样用户数据集
- YouTube-8M 数据集简介
- Hive之数据查询
- Hive之数据查询
- Hive之 数据存储
- 基于hive的日志数据统计实战
- 基于hive的日志数据统计实战
- 基于Hive的日志数据统计实战
- 基于 hive 的日志数据统计实战
- 基于hive的日志数据统计实战
- 基于 hive 的日志数据统计实战
- TestController
- 字符串操作
- union 查询树结构
- springmvc.xml web.xml
- java--静态绑定和动态绑定
- Hive实战之Youtube数据集
- mysql索引总结----mysql 索引类型以及创建
- 在fragmnet中获取父activity为null
- Java基础----抽象类
- STL之优先队列
- C语言中的static 详细分析
- hdu1394
- spring的事务控制
- Http请求头和响应头