Lateral View语法

来源:互联网 发布:彩虹岛禁用小草网络 编辑:程序博客网 时间:2024/04/29 07:10

语法

lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)*  

数据准备

假设我们有一张表pageAds,它有两列数据,第一列是pageid string,第二列是adid_list,即用逗号分隔的广告ID集合。

mahao@ubuntu:~$ cat pageAds.txt "front_page"    1,2,3"contact_page"  3,4,5hive> CREATE TABLE pageAds(pageid STRING,adid_list Array<INT>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',';OKTime taken: 3.458 secondshive> LOAD DATA LOCAL INPATH 'pageAds.txt' INTO TABLE pageAds;Loading data to table default.pageadsOKTime taken: 1.377 secondshive> SELECT * FROM pageAds;OK"front_page"    [1,2,3]"contact_page"  [3,4,5]Time taken: 2.127 seconds, Fetched: 2 row(s)hive> 

统计所有广告ID出现的次数。

首先要拆分广告ID,explode()指出要拆分的行,AS子句指出拆分后的列名:

hive> SELECT pageid,adid FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;OK"front_page"    1"front_page"    2"front_page"    3"contact_page"  3"contact_page"  4"contact_page"  5

按照adid分组,进行统计:

hive>SELECT adid,count(1) FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid GROUP BY adid;OK1   12   13   24   15   1
多个lateral view语句

一个FROM语句后可以跟多个lateral view语句,后面的lateral view语句可以引用它前面的所有表和列名,例子如下:

表的数据:

hive> select * from baseTable;OK[1,2]   ["'a'","'b'","'c'"][3,4]   ["'d'","'e'","'f'"]

两个lateral view语句:

hive> select mycol1,col2 from baseTable lateral view explode(col1) tb1 as mycol1    > lateral view explode(col2) tb2 as mycol2;OK1   ["'a'","'b'","'c'"]1   ["'a'","'b'","'c'"]1   ["'a'","'b'","'c'"]2   ["'a'","'b'","'c'"]2   ["'a'","'b'","'c'"]2   ["'a'","'b'","'c'"]3   ["'d'","'e'","'f'"]3   ["'d'","'e'","'f'"]3   ["'d'","'e'","'f'"]4   ["'d'","'e'","'f'"]4   ["'d'","'e'","'f'"]4   ["'d'","'e'","'f'"]

注意上面语句中,两个lateral view按照出现的次序被执行。
转自http://yugouai.iteye.com/blog/1849902

0 0
原创粉丝点击