hive中的分析函数的典型应用
来源:互联网 发布:lol老是网络断开连接 编辑:程序博客网 时间:2024/06/05 20:21
大家:
好!今天看到了一个hive的分析函数的题,感觉很有意思。把答案整理了整理,分享出来,希望对大家有用。需求如下所示:
起初我看到这道题时,感觉应该要用到分析函数,但就不知道怎么用。好在最终,写出来了,思路如下所示:
----测试表的表结构以及数据如下所示:
hive> desc sales;
OK
id int
produce_name string
start_time date
end_time date
days int
Time taken: 0.354 seconds, Fetched: 5 row(s)
hive> select * from sales;
OK
1 nike 2011-09-01 2011-09-055
2 nike 2011-09-03 2011-09-064
3 nike 2011-09-09 2011-09-157
4 oppo 2011-08-04 2011-08-052
5 oppo 2011-08-04 2011-08-1512
6 vivo 2011-08-15 2011-08-217
7 vivo 2011-09-02 2011-09-1211
Time taken: 0.223 seconds, Fetched: 7 row(s)
---第一步:求出每个开始时间的前一次促销的开始时间和结束时间
select id,produce_name,start_time,end_time,
lag(start_time) over(partition by produce_name order by id) before_start_time,
lag(end_time) over(partition by produce_name order by id) before_end_time
from sales;
---第一步:执行结果(第一步)
1 nike 2011-09-01 2011-09-05NULL NULL
2 nike 2011-09-03 2011-09-062011-09-01 2011-09-05
3 nike 2011-09-09 2011-09-152011-09-03 2011-09-06
4 oppo 2011-08-04 2011-08-05NULL NULL
5 oppo 2011-08-04 2011-08-152011-08-04 2011-08-05
6 vivo 2011-08-15 2011-08-21NULL NULL
7 vivo 2011-09-02 2011-09-122011-08-15 2011-08-21
---第二步 依据前一次促销的开始时间和结束时间,开始时间统一为最早的,为后面分组做准备
select produce_name,start_time,max(end_time) end_time from
(select id,produce_name,case when start_time>=before_start_time and start_time<=before_end_time then before_start_time else start_time end as start_time, end_time
from (select id,produce_name,start_time,end_time,
lag(start_time) over(partition by produce_name order by id) before_start_time,
lag(end_time) over(partition by produce_name order by id) before_end_time
from sales) t) d
group by produce_name,start_time;
--第二步 执行结果(第二步)
nike 2011-09-012011-09-06
nike 2011-09-092011-09-15
oppo 2011-08-042011-08-15
vivo 2011-08-152011-08-21
vivo 2011-09-022011-09-12
---第三步 依据合并后的开始时间,算出每个时间段内的促销天数之和
select produce_name,start_time,end_time,datediff(end_time,start_time)+1 days from
(select produce_name,start_time,max(end_time) end_time from
(select id,produce_name,case when start_time>=before_start_time and start_time<=before_end_time then before_start_time else start_time end as start_time, end_time
from (select id,produce_name,start_time,end_time,
lag(start_time) over(partition by produce_name order by id) before_start_time,
lag(end_time) over(partition by produce_name order by id) before_end_time
from sales) t) d
group by produce_name,start_time) e;
--执行结果(第三步)
nike 2011-09-012011-09-06 6
nike 2011-09-092011-09-15 7
oppo 2011-08-042011-08-15 12
vivo 2011-08-152011-08-21 7
vivo 2011-09-022011-09-12 11
--第四步:依据产品名称,求出最终的促销天数之和
select produce_name,sum(days) from
(select produce_name,start_time,end_time,datediff(end_time,start_time)+1 days from
(select produce_name,start_time,max(end_time) end_time from
(select id,produce_name,case when start_time>=before_start_time and start_time<=before_end_time then before_start_time else start_time end as start_time, end_time
from (select id,produce_name,start_time,end_time,
lag(start_time) over(partition by produce_name order by id) before_start_time,
lag(end_time) over(partition by produce_name order by id) before_end_time
from sales) t) d
group by produce_name,start_time) e) f
group by produce_name;
---执行结果(第四步)
nike 13
oppo 12
vivo 18
说明: 个人的见解,希望对大家有帮助!
- hive中的分析函数的典型应用
- hive分析函数应用
- 函数指针的典型应用
- hive中的分析函数以及时间戳的使用
- 运营效果分析的典型应用
- SG函数的一个典型应用
- HIVE 窗口及分析函数 应用场景
- HIVE 窗口及分析函数 应用场景
- Hive 分析函数lead、lag实例应用
- HIVE 窗口及分析函数 应用场景
- HIVE 窗口及分析函数 应用场景
- hive中的时间函数具体应用
- Hive中的简单窗口函数应用
- Hive的分析函数操作
- Hive典型应用场景之行列转换
- 构造函数与析构函数的典型应用
- SSIS典型应用场景分析
- SSIS典型应用场景分析
- Cognos服务启动过程
- HDU 1059:Dividing
- 筛选求素数和普通求法(C/python略)
- 第十一周项目1图基本算法库
- mysql 主备
- hive中的分析函数的典型应用
- 双重while循环实现打印数字11-99
- iLnkView隐私政策
- linux用户管理
- rpath和runpath的区别
- Magento获取商品CategoryIds
- ArrayList源码分析
- CAS客户端使用Ajax登陆(即保留原有客户端登录页面)
- 单精度实型和双精度实型的有效位数