Hive UDAF Example


Finding the maximum value of a single column.


1. Source code

package com.hive.udaf;

import org.apache.hadoop.hive.ql.exec.UDAF;
import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;
import org.apache.hadoop.io.IntWritable;

public class Maximum extends UDAF {

    public static class MaximumIntUDAFEvaluator implements UDAFEvaluator {

        // Running maximum; stays null until the first non-null value arrives.
        private IntWritable result;

        // Called before aggregation starts (and on evaluator reuse) to reset state.
        public void init() {
            result = null;
        }

        // Called once per input row; NULL values are skipped.
        public boolean iterate(IntWritable value) {
            if (value == null) {
                return true;
            }
            if (result == null) {
                result = new IntWritable(value.get());
            } else {
                result.set(Math.max(result.get(), value.get()));
            }
            return true;
        }

        // Returns the partial aggregate to be shipped to the reducer.
        public IntWritable terminatePartial() {
            return result;
        }

        // Combines a partial aggregate from another task; max is associative,
        // so merging is just another iterate step.
        public boolean merge(IntWritable other) {
            return iterate(other);
        }

        // Returns the final aggregation result.
        public IntWritable terminate() {
            return result;
        }
    }
}

2. Export the jar
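
For reference, a minimal compile-and-package sketch (the classpath wildcards and the jar name maximum.jar are assumptions; adjust the paths to your Hive and Hadoop installation):

javac -cp "$HIVE_HOME/lib/*:$HADOOP_HOME/share/hadoop/common/*" com/hive/udaf/Maximum.java
jar cf maximum.jar com/hive/udaf/*.class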


3. Register the jar with Hive

hive --auxpath /usr/hive/myjar/
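
Alternatively, the jar can be added from inside a running Hive session (the jar name maximum.jar is an assumption):

hive> add jar /usr/hive/myjar/maximum.jar;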


4. Create an alias for the Java class

hive> create temporary function maxvalue as "com.hive.udaf.Maximum";
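
A temporary function only exists for the current session and must be re-created after reconnecting. To confirm the alias was registered:

hive> show functions;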


5. Testing

hive> select age from userinfos;

10
20
30
56
60
70
80
88


hive> select maxvalue(age) from userinfos;

Result: 88


6. Error summary

At first, the userinfos table structure was imported directly by sqoop, and running the query failed with the following error:

hive> select maxvalue(age) from userinfos;
FAILED: NoMatchingMethodException No matching method for class com.hive.udaf.Maximum with (bigint). Possible choices: _FUNC_(int)


The message indicates an argument type mismatch. Inspecting the Hive table structure showed that the sqoop-imported schema had changed the field type: the age column is INTEGER in the MySQL table, but after syncing to Hive it became BIGINT. After dropping the userinfos table, creating it by hand, and reloading the data, the test passed.
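
Two lighter-weight workarounds are also possible (sketches, not from the original post). First, cast the column in the query so the argument matches the evaluator's int signature:

hive> select maxvalue(cast(age as int)) from userinfos;

Second, add an overloaded evaluator to the Maximum class so the UDAF also accepts bigint; with the old UDAF bridge, Hive selects the evaluator whose iterate() signature matches the argument type. This requires an extra import of org.apache.hadoop.io.LongWritable:

// Hypothetical second evaluator inside the Maximum class, handling
// BIGINT (LongWritable) input alongside the existing int evaluator.
public static class MaximumLongUDAFEvaluator implements UDAFEvaluator {

    private LongWritable result;

    public void init() {
        result = null;
    }

    public boolean iterate(LongWritable value) {
        if (value == null) {
            return true;
        }
        if (result == null) {
            result = new LongWritable(value.get());
        } else {
            result.set(Math.max(result.get(), value.get()));
        }
        return true;
    }

    public LongWritable terminatePartial() {
        return result;
    }

    public boolean merge(LongWritable other) {
        return iterate(other);
    }

    public LongWritable terminate() {
        return result;
    }
}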


Why the field types change when sqoop syncs a MySQL table structure to Hive remains to be investigated; if any colleague knows the reason, please leave a comment, thanks.

