Hive lateral view and explode explained
Source: Internet  Editor: 程序博客网  Time: 2024/05/29 07:37
1.explode
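explode() is a built-in table-generating function (UDTF). Used in the SELECT expression list on an array column, it produces one output row for every element of every input array.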
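A minimal sketch of explode() on an array column. The table and column names (myTable, myCol, myNewCol) and the sample values are illustrative assumptions, and the CTE only stands in for a real table so that the query is self-contained (this assumes a Hive version that supports CTEs and SELECT without FROM):

```sql
-- Hypothetical two-row table with a single array<int> column.
WITH myTable AS (
  SELECT array(100, 200, 300) AS myCol
  UNION ALL
  SELECT array(400, 500, 600) AS myCol
)
-- One output row per array element: six rows in total
-- (100, 200, 300, 400, 500, 600; row order is not guaranteed).
SELECT explode(myCol) AS myNewCol
FROM myTable;
```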
The usage with Maps is similar: each key-value pair becomes one output row with two columns, the key and the value.
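The map case can be sketched the same way. Again, myMapTable, myMap, myMapKey and myMapValue are illustrative names, not taken from the original table:

```sql
-- Hypothetical one-row table with a map<string,int> column.
WITH myMapTable AS (
  SELECT map('a', 1, 'b', 2) AS myMap
)
-- Each map entry becomes one row with a key column and a value column:
-- ('a', 1) and ('b', 2).
SELECT explode(myMap) AS (myMapKey, myMapValue)
FROM myMapTable;
```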
In one sentence: explode splits a complex array or map value held in a single Hive row into multiple rows.
Usage example:
Table xxx has a column mvt of type string, with data in the following format:
[{"eid":"38","ex":"affirm_time_Android","val":"1","vid":"31","vr":"var1"},{"eid":"42","ex":"new_comment_Android","val":"1","vid":"34","vr":"var1"},{"eid":"40","ex":"new_rpname_Android","val":"1","vid":"1","vr":"var1"},{"eid":"19","ex":"hotellistlpage_Android","val":"1","vid":"1","vr":"var01"},{"eid":"29","ex":"bookhotelpage_Android","val":"0","vid":"1","vr":"var01"},{"eid":"17","ex":"trainMode_Android","val":"1","vid":"1","vr":"mode_Android"},{"eid":"44","ex":"ihotelList_Android","val":"1","vid":"36","vr":"var1"},{"eid":"47","ex":"ihotelDetail_Android","val":"0","vid":"38","vr":"var1"}]
Let's give explode a quick try:
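A minimal sketch of such a query, assuming the string is first stripped of its square brackets and then split on the `},{` boundary between objects. The CTE stands in for table xxx and holds a shortened, hypothetical mvt value so that the statement is self-contained:

```sql
-- Stand-in for table xxx with a shortened sample value of mvt.
WITH xxx AS (
  SELECT '[{"eid":"38","vr":"var1"},{"eid":"42","vr":"var1"},{"eid":"40","vr":"var1"}]' AS mvt
)
-- 1. regexp_replace removes the surrounding [ and ].
-- 2. split cuts the string at every "},{" and returns an array<string>.
-- 3. explode turns that array into one row per piece.
SELECT explode(split(regexp_replace(mvt, '\\[|\\]', ''), '\\},\\{'))
FROM xxx;
```

Because the separator `},{` is consumed by split, the pieces lose the braces at the cut points: only the first piece keeps its opening `{` and only the last keeps its closing `}`.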
The final result looks like this:
{"eid":"38","ex":"affirm_time_Android","val":"1","vid":"31","vr":"var1"
"eid":"42","ex":"new_comment_Android","val":"1","vid":"34","vr":"var1"
"eid":"40","ex":"new_rpname_Android","val":"1","vid":"1","vr":"var1"
"eid":"19","ex":"hotellistlpage_Android","val":"1","vid":"1","vr":"var01"
"eid":"29","ex":"bookhotelpage_Android","val":"0","vid":"1","vr":"var01"
"eid":"17","ex":"trainMode_Android","val":"1","vid":"1","vr":"mode_Android"
"eid":"44","ex":"ihotelList_Android","val":"1","vid":"36","vr":"var1"
"eid":"47","ex":"ihotelDetail_Android","val":"0","vid":"38","vr":"var1"}
{"eid":"38","ex":"affirm_time_Android","val":"1","vid":"31","vr":"var1"
"eid":"42","ex":"new_comment_Android","val":"1","vid":"34","vr":"var1"
2.lateral view
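The Hive wiki explains it as follows: a lateral view is used in conjunction with user-defined table-generating functions such as explode(). It first applies the UDTF to each row of the base table and then joins the resulting output rows to the input row, forming a virtual table with the supplied table alias.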
Example
Consider a base table named pageAds. It has two columns: pageid (the name of the page) and adid_list (an array of ads appearing on the page).
Suppose the table holds two rows, and the user would like to count the total number of times an ad appears across all pages.
A lateral view with explode() can be used to convert adid_list into separate rows using the query:
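A sketch of that query. The two sample rows in the CTE are assumed values chosen for illustration, and the aliases adTable and adid are likewise illustrative:

```sql
-- Hypothetical contents of pageAds: two pages, three ads each.
WITH pageAds AS (
  SELECT 'front_page' AS pageid, array(1, 2, 3) AS adid_list
  UNION ALL
  SELECT 'contact_page' AS pageid, array(3, 4, 5) AS adid_list
)
-- explode() runs once per input row; the lateral view joins each generated
-- adid back to the pageid of the row it came from.
-- With the sample rows this gives six rows:
--   front_page 1, front_page 2, front_page 3,
--   contact_page 3, contact_page 4, contact_page 5
SELECT pageid, adid
FROM pageAds
LATERAL VIEW explode(adid_list) adTable AS adid;
```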
The resulting output has one row per (pageid, adid) pair.
Then in order to count the number of times a particular ad appears, count/group by can be used:
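The aggregation can be sketched as follows, using the same assumed sample rows for pageAds:

```sql
-- Hypothetical contents of pageAds: two pages, three ads each.
WITH pageAds AS (
  SELECT 'front_page' AS pageid, array(1, 2, 3) AS adid_list
  UNION ALL
  SELECT 'contact_page' AS pageid, array(3, 4, 5) AS adid_list
)
-- Group the exploded rows by ad id and count them.
-- With the sample rows, ad 3 is counted twice and ads 1, 2, 4, 5 once each.
SELECT adid, count(1) AS cnt
FROM pageAds
LATERAL VIEW explode(adid_list) adTable AS adid
GROUP BY adid;
```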
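The resulting output has one row per distinct adid together with its count.
As this shows, lateral view and UDTFs such as explode are natural partners: explode splits one row holding a complex structure into multiple rows, and lateral view makes those rows available alongside the other columns for all kinds of aggregation.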
3.Example
Back to the example from section 1: the data we got from explode above is not valid JSON. By combining lateral view with explode we can turn each piece back into well-formed JSON:
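One way to do this, sketched under assumptions: the pieces are produced by the same split as in section 1, and a CASE expression puts back whichever brace the split removed. The CTE stands in for table xxx; the id column, the shortened mvt value, and the aliases addTable, mvq and mvq_json are hypothetical:

```sql
-- Stand-in for table xxx: an id column plus a shortened sample mvt string.
WITH xxx AS (
  SELECT 'row1' AS id,
         '[{"eid":"38","vr":"var1"},{"eid":"42","vr":"var1"},{"eid":"40","vr":"var1"}]' AS mvt
)
SELECT id,
       CASE
         -- middle piece: both braces were consumed by the split
         WHEN instr(mvq, '{') = 0 AND instr(mvq, '}') = 0 THEN concat('{', mvq, '}')
         -- last piece: only the opening brace is missing
         WHEN instr(mvq, '{') = 0 THEN concat('{', mvq)
         -- first piece: only the closing brace is missing
         WHEN instr(mvq, '}') = 0 THEN concat(mvq, '}')
         -- single-element array: nothing was cut
         ELSE mvq
       END AS mvq_json
FROM xxx
LATERAL VIEW explode(split(regexp_replace(mvt, '\\[|\\]', ''), '\\},\\{')) addTable AS mvq;
```

The lateral view is what allows id to be selected next to the exploded pieces; explode() alone in the SELECT list could not be combined with another column. The brace check relies on the objects being flat, as they are in the sample data; nested objects would need a different approach.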
Query result:
xxx    {"eid":"38","ex":"affirm_time_Android","val":"1","vid":"31","vr":"var1"}
xxx    {"eid":"42","ex":"new_comment_Android","val":"1","vid":"34","vr":"var1"}
xxx    {"eid":"40","ex":"new_rpname_Android","val":"1","vid":"1","vr":"var1"}
xxx    {"eid":"19","ex":"hotellistlpage_Android","val":"1","vid":"1","vr":"var01"}
xxx    {"eid":"29","ex":"bookhotelpage_Android","val":"0","vid":"1","vr":"var01"}
xxx    {"eid":"17","ex":"trainMode_Android","val":"1","vid":"1","vr":"mode_Android"}
xxx    {"eid":"44","ex":"ihotelList_Android","val":"1","vid":"36","vr":"var1"}
xxx    {"eid":"47","ex":"ihotelDetail_Android","val":"0","vid":"38","vr":"var1"}
xxx    {"eid":"38","ex":"affirm_time_Android","val":"1","vid":"31","vr":"var1"}
xxx    {"eid":"42","ex":"new_comment_Android","val":"1","vid":"34","vr":"var1"}
4.Ending
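Lateral View usually appears together with a UDTF. It gets around the restriction that a UDTF cannot appear in the SELECT list alongside other columns.
Multiple Lateral Views can be chained, which produces something like a Cartesian product of the generated rows.
The OUTER keyword makes a UDTF that generates no rows yield a row with NULL instead, so the source row is not lost.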
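The last two points can be sketched as follows. All table names, column names and values here are made up for illustration:

```sql
-- Two lateral views on the same row combine like a Cartesian product:
-- 2 numbers x 2 letters = 4 output rows for the single input row.
WITH t AS (
  SELECT 'a' AS id, array(1, 2) AS nums, array('x', 'y') AS letters
)
SELECT id, n, l
FROM t
LATERAL VIEW explode(nums) numTable AS n
LATERAL VIEW explode(letters) letterTable AS l;

-- explode() of an empty array emits no rows, so a plain LATERAL VIEW would
-- drop the input row; with OUTER the row is kept and the generated column
-- is NULL.
WITH t AS (
  SELECT 'b' AS id
)
SELECT id, a
FROM t
LATERAL VIEW OUTER explode(array()) emptyTable AS a;
```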