Importing a Hive Table into Elasticsearch

Source: Internet. Editor: 程序博客网. Time: 2024/06/15 14:24

1. Add elasticsearch-hadoop-hive-2.1.2.jar to Hive. For how to add third-party JARs to Hive, see: http://blog.csdn.net/qianshangding0708/article/details/50381966

2. Create an Elasticsearch external table in Hive:

@Test
public void testESTable() {
    try {
        HiveHelper.excuteNonQuery("CREATE EXTERNAL TABLE es_user(id String ,name String ,age int ,create_date String) "
                + "STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' "
                + "TBLPROPERTIES('es.resource' = 'es_hive/user_{create_date}',"
                + "'es.index.auto.create' = 'true',"
                + "'es.nodes' = '10.0.1.75:9200,10.0.1.76:9200,10.0.1.77:9200')");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
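The HiveHelper class used in these tests is not shown in the post. The following is a minimal sketch of what such a helper might look like, assuming it wraps a Hive JDBC connection to HiveServer2. The class name HiveHelperSketch, the host localhost, port 10000 and database default are all assumptions for illustration, not the author's actual code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical stand-in for the HiveHelper used in this post (the original class is not shown).
final class HiveHelperSketch {
    // Assumed HiveServer2 location; replace with your own host, port and database.
    static final String DEFAULT_URL = jdbcUrl("localhost", 10000, "default");

    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs a statement that returns no result set (DDL, LOAD DATA, INSERT).
    // The method name keeps the spelling used in the post.
    // Needs the hive-jdbc driver on the classpath at run time.
    static void excuteNonQuery(String sql) throws SQLException {
        try (Connection conn = DriverManager.getConnection(DEFAULT_URL);
             Statement stmt = conn.createStatement()) {
            stmt.execute(sql);
        }
    }
}
```

Unlike the tests in this post, the sketch lets SQLException propagate, so a failed statement would fail the calling test instead of only printing a stack trace.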

To make the SQL easier to read, here is the statement again on its own:

CREATE EXTERNAL TABLE es_user (
    id String,
    name String,
    age INT,
    create_date String
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
    'es.resource' = 'es_hive/user_{create_date}',
    'es.index.auto.create' = 'true',
    'es.nodes' = '10.0.1.75:9200,10.0.1.76:9200,10.0.1.77:9200'
);

es.index.auto.create: whether the index is created automatically if it does not already exist.

es.nodes: the addresses of the Elasticsearch cluster nodes.
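The third property, es.resource, names the target as index/type, and the {create_date} part is a pattern filled in from each row's create_date field. The sketch below only illustrates how such a pattern would be expected to expand for one row; it is not the connector's own code, and the class and method names are made up for this example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: mimics the expansion of {field} placeholders in es.resource.
final class ResourcePatternSketch {
    // Matches a placeholder such as {create_date} and captures the field name.
    private static final Pattern FIELD = Pattern.compile("\\{([^}]+)\\}");

    // Replaces every {field} in the resource string with that field's value from the row.
    static String resolve(String resource, Map<String, String> row) {
        Matcher m = FIELD.matcher(resource);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = row.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("missing field: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For a row whose create_date is 2015-12-22, the resource 'es_hive/user_{create_date}' would resolve to index es_hive and type user_2015-12-22.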


3. Create the Hive table

@Test
public void testHiveTable() {
    try {
        HiveHelper.excuteNonQuery("CREATE TABLE IF NOT EXISTS hive_user(id String ,name String ,age int) "
                + "PARTITIONED BY (create_date String) "
                + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' "
                + "STORED AS TEXTFILE");
    } catch (Exception e) {
        e.printStackTrace();
    }
}

SQL statement:

CREATE TABLE IF NOT EXISTS hive_user (
    id String,
    name String,
    age int
)
PARTITIONED BY (create_date String)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
STORED AS TEXTFILE;


4. Upload the data file and load it into the Hive table (hive_user)

Raw data: kkk.txt

1,fish1,1
2,fish2,2
3,fish3,3
4,fish4,4
5,fish5,5
6,fish6,6
7,fish7,7
8,fish8,8
9,fish9,9
Upload kkk.txt to the /fish/hive/ directory on HDFS.
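If the sample file is not at hand, it can be regenerated locally before uploading. The sketch below writes the same nine comma-separated rows (id, name, age) to kkk.txt in the current directory; the class name and the local output location are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Illustration only: recreates the sample data file used in this post.
final class SampleDataSketch {
    // Builds the rows "1,fish1,1" through "<count>,fish<count>,<count>".
    static List<String> rows(int count) {
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            lines.add(i + ",fish" + i + "," + i);
        }
        return lines;
    }

    // Writes the rows to the target file, one row per line.
    static Path write(Path target, int count) throws IOException {
        return Files.write(target, rows(count), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        write(Paths.get("kkk.txt"), 9);
    }
}
```

The resulting file could then be copied to HDFS with, for example, hdfs dfs -put kkk.txt /fish/hive/.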


@Test
public void testLoadHiveTable() {
    try {
        HiveHelper.excuteNonQuery("LOAD DATA INPATH '/fish/hive/kkk.txt' INTO TABLE hive_user PARTITION(CREATE_DATE='2015-12-22')");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
This loads the file into the Hive table.

Check the hive_user table:

hive> select * from hive_user;
OK
1    fish1    1    2015-12-22
2    fish2    2    2015-12-22
3    fish3    3    2015-12-22
4    fish4    4    2015-12-22
5    fish5    5    2015-12-22
6    fish6    6    2015-12-22
7    fish7    7    2015-12-22
8    fish8    8    2015-12-22
9    fish9    9    2015-12-22
Time taken: 0.041 seconds, Fetched: 9 row(s)

OK, the data has been loaded into Hive.


5. Insert the Hive table's data into Elasticsearch

@Test
public void testInsertElasticSearch() {
    try {
        HiveHelper.excuteNonQuery("INSERT OVERWRITE TABLE es_user "
                + "SELECT s.id, s.name, s.age, s.create_date "
                + "FROM hive_user s where s.create_date='2015-12-22'");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
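The statement above hard-codes one partition. To export other partitions, the same INSERT could be generated per date. The following is a small sketch under that assumption; the class and method names are hypothetical, and the date is parsed first so that only a valid yyyy-MM-dd value ends up inside the SQL string.

```java
import java.time.LocalDate;

// Illustration only: builds the INSERT statement from this post for a given partition date.
final class EsInsertSketch {
    // Returns the INSERT for one create_date partition.
    // LocalDate.parse throws DateTimeParseException for anything that is not a valid
    // ISO date (yyyy-MM-dd), which also keeps stray quotes out of the statement.
    static String insertForDate(String createDate) {
        LocalDate date = LocalDate.parse(createDate);
        return "INSERT OVERWRITE TABLE es_user "
                + "SELECT s.id, s.name, s.age, s.create_date "
                + "FROM hive_user s WHERE s.create_date='" + date + "'";
    }
}
```

The returned string can be passed to the same helper call used in the tests, for example HiveHelper.excuteNonQuery(EsInsertSketch.insertForDate("2015-12-22")).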
Check Elasticsearch:

The data has been uploaded to Elasticsearch successfully.
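One way to confirm the result without a UI is to ask Elasticsearch for a document count. The sketch below is a minimal example that assumes the first node from es.nodes is reachable over HTTP and that the cluster version still supports the index/type/_count endpoint (as the Elasticsearch releases paired with elasticsearch-hadoop 2.1.x do); the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Illustration only: queries the document count for the index/type written in this post.
final class EsCountSketch {
    // Builds http://<node>/<index>/<type>/_count.
    static URI countUri(String node, String index, String type) {
        return URI.create("http://" + node + "/" + index + "/" + type + "/_count");
    }

    // Sends a GET request and returns the raw JSON response body.
    static String fetch(URI uri) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name())) {
            // "\\A" makes the scanner return the whole stream as one token.
            return scanner.useDelimiter("\\A").hasNext() ? scanner.next() : "";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch(countUri("10.0.1.75:9200", "es_hive", "user_2015-12-22")));
    }
}
```

If the insert in step 5 ran once against an empty index, the count field in the response would be expected to match the 9 rows loaded into hive_user.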

For more details, see: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/hive.html
