Using Hive Built-in Functions (1)

Source: Internet  Editor: 程序博客网  Date: 2024/05/22 14:03

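I have used a number of these functions at work; here is a summary:

1、  Splitting and merging rows

1.  Splitting: explode(ARRAY)

Returns: multiple rows, one per array element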

SELECT explode(myCol) AS myNewCol FROM myTable;

Notes: 1. When a UDTF is used in the SELECT list, the SELECT may not contain any other expressions;

2. UDTFs cannot be nested;

3. UDTFs do not support GROUP BY / CLUSTER BY / DISTRIBUTE BY / SORT BY;
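
The first restriction is usually worked around with LATERAL VIEW, which joins each source row to the rows the UDTF produces for it. A minimal sketch, assuming myTable also has an id column (hypothetical; the original query only mentions myCol):

```sql
-- Assumes myTable(id INT, myCol ARRAY<STRING>); the id column is hypothetical.
-- LATERAL VIEW pairs every input row with each element produced by explode().
SELECT t.id, e.myNewCol
FROM myTable t
LATERAL VIEW explode(t.myCol) e AS myNewCol;
```

With this form the exploded value behaves like an ordinary column, so it can be selected next to other columns and grouped or sorted in the same query.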

2.  Merging with deduplication: collect_set(col)

Returns: an array with duplicate values removed

Raw data in table xjt2:

id    name
1     one
2     two
2     two
2     two
3     three
4     four
4     four
1     oneone
1     one two
2     twotwo
2     twoone

select id,collect_set(name) from xjt2 group by id;

1                   ["one two","oneone","one"]

2                   ["twoone","twotwo","two"]

3                   ["three"]

4                   ["four"]

select collect_set(name) from xjt2;

["three","one","four","two"]

Note: when collect_set(col) appears in the SELECT list together with other columns, those columns must be listed in GROUP BY (group by other_fields);
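
A common follow-up is to turn the collected array into a single delimited string. The sketch below reuses the xjt2 table from above and assumes name is a STRING column, since concat_ws only accepts strings and arrays of strings:

```sql
-- collect_set(name) yields ARRAY<STRING>; concat_ws joins its elements
-- with the given separator into one STRING per group.
SELECT id, concat_ws(',', collect_set(name)) AS names
FROM xjt2
GROUP BY id;
```

The order of elements inside the array is not guaranteed, so the joined string may list the names in any order.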

2、  Date and time functions

1.      Get the current Unix timestamp: unix_timestamp()

Return type: BIGINT

select unix_timestamp() from xjt1;

1383012276

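2.      Convert a date string to a Unix timestamp: unix_timestamp(string date)

Return type: BIGINT. The string must be in yyyy-MM-dd HH:mm:ss format and is interpreted in the default time zone; per the Hive documentation, 0 is returned if the conversion fails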

select unix_timestamp('2013-01-13 00:00:00') from xjt1;

1358006400

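3.      Convert a date string in a given format (pattern) to a Unix timestamp: unix_timestamp(string date, string pattern)

Return type: BIGINT; per the Hive documentation, 0 is returned if the conversion fails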

select unix_timestamp('20130113','yyyyMMdd') from xjt1;

1358006400

4.      Convert a Unix timestamp to a date string: from_unixtime(BIGINT, 'format')

select from_unixtime(unix_timestamp(),'yyyyMMdd') from xjt1;

20131029
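
The two conversions can be chained to reformat a date string. A minimal sketch using the same xjt1 table as the other examples; LIMIT 1 is added only so that a single row is printed:

```sql
-- Parse a compact date string, then render it with a different pattern.
-- Both calls use the session's default time zone, so the date is preserved.
SELECT from_unixtime(unix_timestamp('20131029', 'yyyyMMdd'), 'yyyy-MM-dd')
FROM xjt1
LIMIT 1;
-- 2013-10-29
```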

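5.      Extract the date part: to_date(); the year: year(); the month: month(); the day of the month: day()

Return type: STRING for to_date(); INT for year(), month(), and day()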

select to_date('1990-10-10 00:00:00') from xjt1;

1990-10-10
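
The other three functions in this group are called the same way and return integer values. A short sketch against the same xjt1 table:

```sql
-- Each function extracts one field from the date/timestamp string.
SELECT year('1990-10-10 00:00:00'),   -- 1990
       month('1990-10-10 00:00:00'),  -- 10
       day('1990-10-10 00:00:00')     -- 10
FROM xjt1
LIMIT 1;
```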

6.      Add days to a date: date_add(string startdate, int days)

Return type: STRING

select date_add('2013-10-29',10) from xjt1;

2013-11-08

7.      Subtract days from a date: date_sub(string startdate, int days)

Return type: STRING

select date_sub('2013-10-29',10) from xjt1;

2013-10-19

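8.      Difference between two dates: datediff(string enddate, string startdate)

Return type: INT (the number of days from the start date to the end date; the end date is the first argument)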

select datediff('2013-10-29','2013-12-10') from xjt1;

-42
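
The result above is negative because the earlier date was passed first. The sketch below shows the arguments in the usual order and how the three date functions relate to each other, again using xjt1 only as a dummy source table:

```sql
-- datediff(end, start): positive when end is later than start.
-- date_add and date_sub are inverses of each other for the same day count.
SELECT datediff('2013-12-10', '2013-10-29'),                -- 42
       datediff(date_add('2013-10-29', 10), '2013-10-29'),  -- 10
       date_sub(date_add('2013-10-29', 10), 10)             -- 2013-10-29
FROM xjt1
LIMIT 1;
```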

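3、  Conditional function: CASE

Return type: T (the type of the THEN/ELSE result expressions)

Syntax: CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END

Description: if a equals b, return c; if a equals d, return e; otherwise return f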
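
As an illustration, the sketch below applies CASE to the xjt2 table from section 1, assuming its id column is an INT. The label and bucket aliases and their values are made up for the example. The second query uses the searched form, in which every WHEN carries its own boolean condition instead of being compared with a single expression:

```sql
-- Simple form: id is compared with each WHEN value in turn.
SELECT id,
       CASE id WHEN 1 THEN 'first' WHEN 2 THEN 'second' ELSE 'other' END AS label
FROM xjt2;

-- Searched form: each WHEN is an independent boolean condition.
SELECT id,
       CASE WHEN id < 3 THEN 'low' ELSE 'high' END AS bucket
FROM xjt2;
```

If ELSE is omitted and no branch matches, the result is NULL.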

4、  String splitting function: split(string str, string pat)

Return type: ARRAY of strings; pat is a regular expression

select split('hello world hello hive',' ') from xjt1;

["hello","world","hello","hive"]
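
Because the result is an array, individual elements can be read by index (starting at 0), and because the separator is a regular expression, metacharacters such as the dot need escaping. A short sketch against the same xjt1 table:

```sql
-- [n] indexes into the returned array; size() counts its elements.
-- '\\.' matches a literal dot; an unescaped '.' would match any character.
SELECT split('hello world hello hive', ' ')[0],  -- hello
       size(split('a,b,c', ',')),                -- 3
       split('a.b.c', '\\.')[1]                  -- b
FROM xjt1
LIMIT 1;
```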

