Hive JDBC
Disclaimer
Original article: http://blog.csdn.net/nsrainbow/article/details/43002387. For the latest tutorials, follow the original author's blog for a better reading experience.
When working with Hive, sooner or later you need to call it from your own programs. Below I walk through an example of querying Hive data from Java.
Preparing the service
To connect to Hive over JDBC, Hive's Thrift server must be running first; otherwise the connection fails with "Could not establish connection to localhost:10000/default: java.net.ConnectException: Connection refused".
hive --service hiveserver2 (note the two dashes before "service"; the hive2 JDBC driver used below talks to HiveServer2)
Preparing the data
Create a text file called a.txt with the following contents:
- 1,terry
- 2,alex
- 3,jimmy
- 4,mike
- 5,kate
and upload it to the /data/ directory on the Hive server.
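If you have shell access, the file can be created like this (a sketch using /tmp for illustration; the example code later expects it at /data/a.txt on the Hive server, so copy it there, e.g. with scp):

```shell
# Build the sample file locally, then copy it to /data/a.txt on the Hive server.
cat > /tmp/a.txt <<'EOF'
1,terry
2,alex
3,jimmy
4,mike
5,kate
EOF
wc -l < /tmp/a.txt   # 5 lines, one per row
```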
Calling Hive over JDBC
Loading the driver
Load the driver (this covers only the hive2 JDBC driver):
- Class.forName("org.apache.hive.jdbc.HiveDriver");
Connecting to the database
- Connection con = DriverManager.getConnection("jdbc:hive2://host1:10000/default", "hive", "");
- Here host1 is the hostname of the Hive server
- 10000 is the default HiveServer2 port
- default is the default database
- hive is the default user name; the default password is empty
SQL statements
Dropping a table
- stmt.execute("drop table if exists " + tableName);
Creating a table
- stmt.execute("create table " + tableName + " (key int, value string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'");
Querying data
- ResultSet res = stmt.executeQuery("select * from " + tableName);
Loading data
- stmt.execute("load data local inpath '" + filepath + "' into table " + tableName);
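A side note on the '\054' in the create-table statement: in a Java string literal, \054 is an octal escape that the compiler resolves to a comma before the SQL is ever sent, so Hive simply sees FIELDS TERMINATED BY ',', matching the comma-separated a.txt above. A quick check:

```java
public class DelimiterDemo {
    public static void main(String[] args) {
        // "\054" is a Java octal escape: character 054 (octal) = 44 (decimal) = ','
        String delim = "\054";
        System.out.println(delim.equals(","));  // prints "true"
    }
}
```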
Example
Setting up the project
First open Eclipse and create a Maven project.
pom.xml
- <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
- xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
- <modelVersion>4.0.0</modelVersion>
-
- <groupId>org.crazycake</groupId>
- <artifactId>play-hive</artifactId>
- <version>0.0.1-SNAPSHOT</version>
- <packaging>jar</packaging>
-
- <name>play-hive</name>
- <url>http://maven.apache.org</url>
-
- <properties>
- <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
- </properties>
-
- <dependencies>
- <dependency>
- <groupId>junit</groupId>
- <artifactId>junit</artifactId>
- <version>3.8.1</version>
- <scope>test</scope>
- </dependency>
- <dependency>
- <groupId>org.apache.hive</groupId>
- <artifactId>hive-jdbc</artifactId>
- <version>0.14.0</version>
- </dependency>
- <dependency>
- <groupId>org.apache.hadoop</groupId>
- <artifactId>hadoop-common</artifactId>
- <version>2.2.0</version>
- </dependency>
- </dependencies>
-
- <build>
- <plugins>
- <plugin>
- <artifactId>maven-compiler-plugin</artifactId>
- <version>2.0.2</version>
- <configuration>
- <source>1.6</source>
- <target>1.6</target>
- <encoding>UTF-8</encoding>
- <optimize>true</optimize>
- <compilerArgument>-nowarn</compilerArgument>
- </configuration>
- </plugin>
- <plugin>
- <groupId>org.apache.maven.plugins</groupId>
- <artifactId>maven-shade-plugin</artifactId>
- <version>2.3</version>
- <configuration>
- <transformers>
- <transformer
- implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer">
- </transformer>
- </transformers>
- </configuration>
- <executions>
- <execution>
- <phase>package</phase>
- <goals>
- <goal>shade</goal>
- </goals>
- </execution>
- </executions>
- </plugin>
- </plugins>
- </build>
- </project>
The most important parts are these two dependencies:
- <dependency>
- <groupId>org.apache.hive</groupId>
- <artifactId>hive-jdbc</artifactId>
- <version>0.14.0</version>
- </dependency>
- <dependency>
- <groupId>org.apache.hadoop</groupId>
- <artifactId>hadoop-common</artifactId>
- <version>2.2.0</version>
- </dependency>
The rest does not matter much.
Creating the table, loading and querying data
Create a class called HiveJdbcClient:
- package org.crazycake.play_hive;
-
- import java.sql.Connection;
- import java.sql.DriverManager;
- import java.sql.ResultSet;
- import java.sql.SQLException;
- import java.sql.Statement;
-
- /**
-  * Simple HiveServer2 JDBC client: creates a table, loads data into it and queries it.
-  */
- public class HiveJdbcClient {
-
-     private static String driverName = "org.apache.hive.jdbc.HiveDriver";
-
-     public static void main(String[] args) throws SQLException {
-         // Register the Hive JDBC driver
-         try {
-             Class.forName(driverName);
-         } catch (ClassNotFoundException e) {
-             e.printStackTrace();
-             System.exit(1);
-         }
-
-         // Connect to HiveServer2 as user "hive" with an empty password
-         Connection con = DriverManager.getConnection("jdbc:hive2://host1:10000/default", "hive", "");
-         Statement stmt = con.createStatement();
-         String tableName = "testhivedrivertable";
-
-         // Drop the table if it already exists, then recreate it (comma-delimited)
-         stmt.execute("drop table if exists " + tableName);
-         stmt.execute("create table " + tableName + " (key int, value string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'");
-
-         // Verify the table was created
-         String sql = "show tables '" + tableName + "'";
-         System.out.println("Running: " + sql);
-         ResultSet res = stmt.executeQuery(sql);
-         if (res.next()) {
-             System.out.println(res.getString(1));
-         }
-
-         // Describe the table's columns
-         sql = "describe " + tableName;
-         System.out.println("Running: " + sql);
-         res = stmt.executeQuery(sql);
-         while (res.next()) {
-             System.out.println(res.getString(1) + "\t" + res.getString(2));
-         }
-
-         // Load a.txt from the local filesystem of the HiveServer2 host
-         String filepath = "/data/a.txt";
-         sql = "load data local inpath '" + filepath + "' into table " + tableName;
-         System.out.println("Running: " + sql);
-         stmt.execute(sql);
-
-         // Query the loaded rows
-         sql = "select * from " + tableName;
-         System.out.println("Running: " + sql);
-         res = stmt.executeQuery(sql);
-         while (res.next()) {
-             System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
-         }
-
-         con.close();
-     }
- }
The output is:
- Running: show tables 'testhivedrivertable'
- testhivedrivertable
- Running: describe testhivedrivertable
- key int
- value string
- Running: load data local inpath '/data/a.txt' into table testhivedrivertable
- Running: select * from testhivedrivertable
- 1 terry
- 2 alex
- 3 jimmy
- 4 mike
- 5 kate
Calling Hive from Java is really quite simple: over JDBC you just execute the same statements you would type into the hive shell. Keep in mind that the statements run in the environment of the Hive server machine, so any path you write (such as the one in load data local inpath) is resolved against the Hive server's filesystem. That is why a.txt must be uploaded to the server for this code to run correctly.
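One caveat: because tableName is concatenated straight into the SQL string, a user-supplied name could inject extra SQL. A minimal sketch of guarding against that (isSafeIdentifier is a hypothetical helper, not part of the Hive JDBC API):

```java
public class IdentifierCheck {
    // Hypothetical helper: accept only names made of letters, digits and
    // underscores, starting with a letter or underscore.
    static boolean isSafeIdentifier(String name) {
        return name != null && name.matches("[A-Za-z_][A-Za-z0-9_]*");
    }

    public static void main(String[] args) {
        System.out.println(isSafeIdentifier("testhivedrivertable")); // true
        System.out.println(isSafeIdentifier("t; drop table users")); // false
    }
}
```

Only if the name passes the check should it be spliced into a statement such as "drop table if exists " + tableName.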