Hive UDF Example

Source: Internet. Editor: 程序博客网. Time: 2024/05/17 06:39

Stripping leading and trailing characters from a string

1. Source code

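package com.hive.udf;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class Trim extends UDF {
    // Reused for every row so that no new Text object is allocated per call.
    private Text res = new Text();

    // strip(str): removes leading and trailing whitespace.
    public Text evaluate(String str) {
        if (str == null) {
            return null;
        }
        res.set(StringUtils.strip(str));
        return res;
    }

    // strip(str, stripChars): removes any character found in stripChars
    // from both ends of str.
    public Text evaluate(Text str, String stripChars) {
        if (str == null) {
            return null;
        }
        res.set(StringUtils.strip(str.toString(), stripChars));
        return res;
    }
}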

2. Export the jar

3. Register the jar in Hive

Method 1:

hive> add jar Trim.jar;

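Method 2:

Pass the --auxpath option on the command line when starting Hive. The value after --auxpath is the path to the jar.

For example, if the jar is in the directory /usr/hive/myjar/, Hive can be started in any of the following ways:

hive --auxpath /usr/hive/myjar/Trim.jar or

hive --auxpath /usr/hive/myjar/*.jar (with this form the jar name must match the class name, otherwise an error occurs) or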

hive --auxpath /usr/hive/myjar/

4. Create an alias for the Java class

hive> create temporary function strip as "com.hive.udf.Trim";

5. Testing

hive> select strip(" dfdf") from users limit 1;

Result: dfdf

hive> select strip("abc123456ab234a","abc") from users limit 1;

Result: 123456ab234
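
The two results can also be checked without a Hive session. The following is a minimal plain-Java sketch of the stripping rule the UDF relies on: it removes from both ends every character that appears in the given set, or whitespace when no set is given. The class name StripSketch and its methods are hypothetical and written only for illustration. They are not part of Hive or commons-lang, and the sketch only approximates StringUtils.strip for the inputs shown here.

```java
// Plain-Java sketch of the stripping behaviour used by the UDF.
// StripSketch is a hypothetical name, not part of Hive or commons-lang.
final class StripSketch {

    // Removes from both ends of s every character that appears in stripChars.
    // A null stripChars means "strip whitespace", like the one-argument call.
    static String strip(String s, String stripChars) {
        if (s == null) {
            return null;
        }
        int start = 0;
        int end = s.length();
        // Advance past strippable characters at the front.
        while (start < end && shouldStrip(s.charAt(start), stripChars)) {
            start++;
        }
        // Back up over strippable characters at the end.
        while (end > start && shouldStrip(s.charAt(end - 1), stripChars)) {
            end--;
        }
        return s.substring(start, end);
    }

    private static boolean shouldStrip(char c, String stripChars) {
        return stripChars == null
                ? Character.isWhitespace(c)
                : stripChars.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(strip(" dfdf", null));            // dfdf
        System.out.println(strip("abc123456ab234a", "abc")); // 123456ab234
    }
}
```

In the second call the characters "ab" in the middle of the string are kept, because stripping stops at the first character on each end that is not in the set.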

2 0