Hive UDF: A Hands-On Example with Explanation

Source: Internet | Editor: 程序博客网 | Time: 2024/06/05 08:05

How a Hive UDF Works

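Hive's own query language, HQL, covers most needs, but special requirements call for writing a user-defined function (UDF) yourself. The following is a complete example.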

The class must extend org.apache.hadoop.hive.ql.exec.UDF and implement an evaluate method.
The code is as follows:

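package cn.itcast.hive.udf;

import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class NationUDF extends UDF {

    // Lookup table: English country name -> Chinese display name.
    public static final Map<String, String> nationMap = new HashMap<String, String>();

    static {
        nationMap.put("China", "中国");
        nationMap.put("Japan", "日本");
        nationMap.put("USA", "美国");
    }

    // Reused across rows so that no new Text is allocated per call.
    private final Text text = new Text();

    // Called once per row: just as sum(income) yields 1000,
    // getNation(nation) yields 中国 for "China".
    public Text evaluate(Text nation) {
        // A NULL column value arrives as null; pass it through.
        if (nation == null) {
            return null;
        }
        String name = nationMap.get(nation.toString());
        if (name == null) {
            name = "火星人"; // "Martian": fallback for unknown countries
        }
        text.set(name);
        return text;
    }
}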
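Hive looks up the evaluate method by reflection, which is why the UDF base class declares no abstract method to override. The sketch below illustrates that idea in plain Java, without Hadoop or Hive on the classpath. It is a simplified illustration, not Hive's actual resolver: NationLookup, callEvaluate, and EvaluateDispatchSketch are hypothetical names, String stands in for Text, and the lookup table is trimmed to a single entry.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

class EvaluateDispatchSketch {

    // Hypothetical stand-in for NationUDF: same lookup idea, but with
    // String instead of Text so it compiles without Hive dependencies.
    static class NationLookup {
        private static final Map<String, String> NATIONS = new HashMap<>();

        static {
            NATIONS.put("China", "中国"); // table trimmed to one entry
        }

        public String evaluate(String nation) {
            if (nation == null) {
                return null;
            }
            String name = NATIONS.get(nation);
            return name != null ? name : "火星人"; // fallback for unknown names
        }
    }

    // Finds a public method named "evaluate" that takes one String and
    // invokes it, roughly how a framework can call a method by name only.
    static String callEvaluate(Object udf, String arg) {
        try {
            Method m = udf.getClass().getMethod("evaluate", String.class);
            return (String) m.invoke(udf, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable evaluate(String)", e);
        }
    }

    public static void main(String[] args) {
        Object udf = new NationLookup();
        // One call per input value, as Hive would do once per row.
        for (String nation : new String[] {"China", "Mars"}) {
            System.out.println(nation + "\t" + callEvaluate(udf, nation));
        }
    }
}
```

Because the method is found by name and argument type, a UDF class can offer several evaluate overloads for different argument types.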

Steps for calling the custom function:
1. Add the JAR (run this in the Hive command line)

add jar /root/NUDF.jar;

2. Create a temporary function

create temporary function getNation as 'cn.itcast.hive.udf.NationUDF';

3. Call it

select id, name, getNation(nation) from beauties;

4. Save a query result to another table

create table result row format delimited fields terminated by '\t' as select * from beauties order by id desc;

5. Use the UDF and write the query result to another table

create table result_beauties row format delimited fields terminated by '\t' as select id, getNation(nation) from beauties;
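The clause row format delimited fields terminated by '\t' means each row of the new table is stored as one line of text with a tab between fields. The sketch below mimics that layout in plain Java as a rough picture of the stored lines. DelimitedRowSketch and toLine are hypothetical names, the sample rows are invented for illustration, and \N is used for NULL on the assumption that the table keeps Hive's default text null marker.

```java
class DelimitedRowSketch {

    // Joins the fields of one row with the delimiter. A null field is
    // written as \N, assumed here to be the table's null marker.
    static String toLine(char delimiter, Object... fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(delimiter);
            }
            line.append(fields[i] == null ? "\\N" : fields[i].toString());
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Hypothetical (id, getNation(nation)) pairs, not data from the article.
        System.out.println(toLine('\t', 1, "中国"));
        System.out.println(toLine('\t', 2, null));
    }
}
```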
