Hive-JDBCli

Source: Internet | Editor: 程序博客网 | Time: 2024/05/21 00:55


Using hive-site.xml to automatically connect to HiveServer2

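Starting with Hive 2.2.0 (HIVE-14063), BeeLine can use the hive-site.xml present on the classpath, together with an additional user configuration file, to automatically generate a connection URL from the configuration properties in hive-site.xml. Not all URL properties can be derived from hive-site.xml, so to use this feature the user must create a configuration file named "beeline-hs2-connection.xml", which is a Hadoop XML format file. This file supplies user-specific connection properties for the connection URL. BeeLine looks for this file in ${user.home}/.beeline/ (Unix-based operating systems) or in the ${user.home}\beeline\ directory (Windows). If the file is not found there, BeeLine looks for it in ${HIVE_CONF_DIR} and then in /etc/hive/conf, in that order (see HIVE-16335, which fixes this location from the /etc/conf/hive used in Hive 2.2.0). Once the file is found, BeeLine uses beeline-hs2-connection.xml together with the hive-site.xml on the classpath to determine the connection URL.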
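The lookup order described above can be sketched in a few lines of Java. This is only an illustration of the search order, not BeeLine's actual code: the class name BeelineConfigLocator and its methods are hypothetical, and the HIVE_CONF_DIR value is read from the environment as an assumption.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper (not part of Hive): mirrors the lookup order described in the text.
final class BeelineConfigLocator {
    static final String FILE_NAME = "beeline-hs2-connection.xml";

    // Builds the candidate locations in the order they are searched.
    static List<Path> candidates(String userHome, String hiveConfDir, boolean windows) {
        List<Path> paths = new ArrayList<>();
        // Per-user directory: ".beeline" on Unix-like systems, "beeline" on Windows.
        paths.add(Paths.get(userHome, windows ? "beeline" : ".beeline", FILE_NAME));
        // HIVE_CONF_DIR is only a candidate when it is actually set.
        if (hiveConfDir != null && !hiveConfDir.isEmpty()) {
            paths.add(Paths.get(hiveConfDir, FILE_NAME));
        }
        // Last fallback named in the text.
        paths.add(Paths.get("/etc/hive/conf", FILE_NAME));
        return paths;
    }

    // Returns the first candidate that exists as a regular file, if any.
    static Optional<Path> locate(String userHome, String hiveConfDir, boolean windows) {
        return candidates(userHome, hiveConfDir, windows).stream()
                .filter(Files::isRegularFile)
                .findFirst();
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        Optional<Path> found = locate(
                System.getProperty("user.home"), System.getenv("HIVE_CONF_DIR"), windows);
        System.out.println(found.map(Path::toString).orElse("no " + FILE_NAME + " found"));
    }
}
```

Running the main method prints the first matching path, which is a quick way to check which copy of the file would be picked up on a given machine.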

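The URL connection properties in beeline-hs2-connection.xml must have the prefix "beeline.hs2.connection." followed by the URL property name. For example, to provide the property ssl, the property key in beeline-hs2-connection.xml should be "beeline.hs2.connection.ssl". The sample beeline-hs2-connection.xml below provides the user and password values for the BeeLine connection URL. In this case the remaining properties, such as the HS2 hostname and port, Kerberos configuration properties, SSL properties and transport mode, are picked up from the hive-site.xml on the classpath. If the password is empty, remove the beeline.hs2.connection.password property. In most cases the configuration values below in beeline-hs2-connection.xml, combined with a correct hive-site.xml on the classpath, should be sufficient to connect to HiveServer2.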

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>beeline.hs2.connection.user</name>
  <value>hive</value>
</property>
<property>
  <name>beeline.hs2.connection.password</name>
  <value>hive</value>
</property>
</configuration>
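
To make the prefix rule concrete, here is a minimal Java sketch that takes property keys as they appear in the file and strips the "beeline.hs2.connection." prefix to recover the URL property names. The class ConnectionPropertyPrefix is hypothetical and only illustrates the naming convention; it does not show how BeeLine reads the file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: illustrates the key naming convention only.
final class ConnectionPropertyPrefix {
    static final String PREFIX = "beeline.hs2.connection.";

    // Keeps only prefixed keys and returns them with the prefix removed.
    static Map<String, String> urlProperties(Map<String, String> fileProperties) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fileProperties.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                result.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> fromFile = new LinkedHashMap<>();
        fromFile.put("beeline.hs2.connection.user", "hive");
        fromFile.put("beeline.hs2.connection.ssl", "true");
        // Each entry is printed as urlPropertyName=value.
        urlProperties(fromFile).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```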

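When a property is present in both beeline-hs2-connection.xml and hive-site.xml, the value derived from beeline-hs2-connection.xml takes precedence. For example, the beeline-hs2-connection.xml file below provides the principal value for a BeeLine connection in a Kerberos-enabled environment. In this case, as far as the connection URL is concerned, the value of beeline.hs2.connection.principal overrides the value of HiveConf.ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL from hive-site.xml.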

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>beeline.hs2.connection.hosts</name>
  <value>localhost:10000</value>
</property>
<property>
  <name>beeline.hs2.connection.principal</name>
  <value>hive/dummy-hostname@domain.com</value>
</property>
</configuration>
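
The precedence rule amounts to a two-layer merge in which the user file wins. The sketch below shows that idea with plain maps; the class ConnectionPropertyMerge is hypothetical, and the hive-site.xml principal value used in main is a made-up placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: shows "user file overrides hive-site.xml" as a map merge.
final class ConnectionPropertyMerge {
    static Map<String, String> merge(Map<String, String> fromHiveSite,
                                     Map<String, String> fromUserFile) {
        // Start from the values derived from hive-site.xml ...
        Map<String, String> merged = new LinkedHashMap<>(fromHiveSite);
        // ... then let beeline-hs2-connection.xml overwrite any duplicates.
        merged.putAll(fromUserFile);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> hiveSite = new LinkedHashMap<>();
        hiveSite.put("principal", "hive/_HOST@EXAMPLE.COM"); // placeholder value
        Map<String, String> userFile = new LinkedHashMap<>();
        userFile.put("hosts", "localhost:10000");
        userFile.put("principal", "hive/dummy-hostname@domain.com");
        System.out.println(merge(hiveSite, userFile));
    }
}
```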
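
For the properties beeline.hs2.connection.hosts, beeline.hs2.connection.hiveconf and beeline.hs2.connection.hivevar, the property value is a comma-separated list of values. For example, the following beeline-hs2-connection.xml provides the hiveconf and hivevar values in comma-separated format.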

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>beeline.hs2.connection.user</name>
<value>hive</value>
</property>
<property>
<name>beeline.hs2.connection.hiveconf</name>
<value>hive.cli.print.current.db=true, hive.cli.print.header=true</value>
</property>
<property>
<name>beeline.hs2.connection.hivevar</name>
<value>testVarName1=value1, testVarName2=value2</value>
</property>
</configuration>
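
As a rough illustration of the comma-separated format, the following Java sketch splits such a value into items and, for hiveconf and hivevar, into key/value pairs. The class CommaSeparatedValues is hypothetical; trimming whitespace around each item is an assumption based on the spaces in the sample values above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: parses the comma-separated property values shown above.
final class CommaSeparatedValues {
    // Splits on commas and trims each item, e.g. for beeline.hs2.connection.hosts.
    static List<String> parseList(String value) {
        List<String> items = new ArrayList<>();
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }

    // Parses "k1=v1, k2=v2" into an ordered map, e.g. for hiveconf and hivevar.
    static Map<String, String> parseKeyValues(String value) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String item : parseList(value)) {
            int eq = item.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + item);
            }
            result.put(item.substring(0, eq).trim(), item.substring(eq + 1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseKeyValues("testVarName1=value1, testVarName2=value2"));
    }
}
```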

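When beeline-hs2-connection.xml is present and no other arguments are provided, BeeLine automatically connects to the URL generated from the configuration files. When connection arguments (-u, -n or -p) are provided, BeeLine uses them and does not use beeline-hs2-connection.xml to connect automatically. Removing or renaming beeline-hs2-connection.xml disables this feature.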

Steps:

Load the HiveServer2 JDBC driver. As of 1.2.0, applications no longer need to explicitly load JDBC drivers using Class.forName().

For example:
```
Class.forName("org.apache.hive.jdbc.HiveDriver");
```
Connect to the database by creating a Connection object with the JDBC driver.

For example:
```
Connection cnct = DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "<password>");
```
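The default <port> is 10000. In non-secure configurations, specify a <user> for the query to run as. The <password> field value is ignored in non-secure mode.
```
Connection cnct = DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "");
```
In Kerberos secure mode, the user information is based on the Kerberos credentials.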
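
The two connection modes differ mainly in the URL that is passed to DriverManager.getConnection. The sketch below builds both forms; the class HiveUrls is hypothetical, and the ";principal=" suffix follows the Kerberos URL form given in the HiveServer2 client documentation rather than anything stated in this article.

```java
// Hypothetical helper: builds HiveServer2 JDBC URLs as plain strings.
final class HiveUrls {
    // Default HiveServer2 port mentioned in the text.
    static final int DEFAULT_PORT = 10000;

    // Non-secure mode: the user name is passed separately to getConnection.
    static String plain(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Kerberos mode: the HiveServer2 service principal is appended to the URL,
    // and the client identity comes from the Kerberos credentials.
    static String kerberos(String host, int port, String database, String serverPrincipal) {
        return plain(host, port, database) + ";principal=" + serverPrincipal;
    }

    public static void main(String[] args) {
        System.out.println(plain("localhost", DEFAULT_PORT, "default"));
        System.out.println(kerberos("localhost", DEFAULT_PORT, "default",
                "hive/dummy-hostname@domain.com"));
    }
}
```

In Kerberos mode the resulting URL would typically be passed to DriverManager.getConnection(url) without a user name or password.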

Submit SQL to the database by creating a Statement object and using its executeQuery() method.

For example:

Statement stmt = cnct.createStatement();
ResultSet rset = stmt.executeQuery("SELECT foo FROM bar");

Process the result set, if necessary.
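
The steps above can be put together in one compact sketch that also closes the connection, statement and result set, which the fragments above leave out. The class HiveQuerySketch is hypothetical; the host, user name and the table and column names (bar, foo) are the placeholders used earlier, and running it assumes a reachable HiveServer2 and the Hive JDBC driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: connect, query, process, and clean up.
final class HiveQuerySketch {
    // Processes the result set by collecting the first column of every row.
    static List<String> firstColumn(ResultSet rs) throws SQLException {
        List<String> values = new ArrayList<>();
        while (rs.next()) {
            values.add(rs.getString(1));
        }
        return values;
    }

    // Opens a connection, submits the SQL and returns the first column.
    // try-with-resources closes the ResultSet, Statement and Connection
    // in reverse order, even when an exception is thrown.
    static List<String> run(String url, String user, String sql) throws SQLException {
        try (Connection cnct = DriverManager.getConnection(url, user, "");
             Statement stmt = cnct.createStatement();
             ResultSet rset = stmt.executeQuery(sql)) {
            return firstColumn(rset);
        }
    }

    public static void main(String[] args) throws SQLException {
        // Assumed non-secure local server; adjust host, user and query as needed.
        for (String value : run("jdbc:hive2://localhost:10000/default", "hive",
                "SELECT foo FROM bar")) {
            System.out.println(value);
        }
    }
}
```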

Required JAR files

If you use Maven, add the following dependencies:

<dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>0.11.0</version>
</dependency>
 
<dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.2.0</version>
</dependency>

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;

public class HiveJdbcClient {
  private static String driverName = "org.apache.hive.jdbc.HiveDriver";

  /**
   * @param args
   * @throws SQLException
   */
  public static void main(String[] args) throws SQLException {
    try {
      Class.forName(driverName);
    } catch (ClassNotFoundException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
      System.exit(1);
    }
    //replace "hive" here with the name of the user the queries should run as
    Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
    Statement stmt = con.createStatement();
    String tableName = "testHiveDriverTable";
    stmt.execute("drop table if exists " + tableName);
    stmt.execute("create table " + tableName + " (key int, value string)");

    // show tables
    String sql = "show tables '" + tableName + "'";
    System.out.println("Running: " + sql);
    ResultSet res = stmt.executeQuery(sql);
    if (res.next()) {
      System.out.println(res.getString(1));
    }

    // describe table
    sql = "describe " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1) + "\t" + res.getString(2));
    }

    // load data into table
    // NOTE: filepath has to be local to the hive server
    // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
    String filepath = "/tmp/a.txt";
    sql = "load data local inpath '" + filepath + "' into table " + tableName;
    System.out.println("Running: " + sql);
    stmt.execute(sql);

    // select * query
    sql = "select * from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
    }

    // regular hive query
    sql = "select count(1) from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1));
    }
  }
}

Running the JDBC sample code

# Then on the command-line
$ javac HiveJdbcClient.java
 
# To run the program using remote hiveserver in non-kerberos mode, we need the following jars in the classpath
# from hive/build/dist/lib
#     hive-jdbc*.jar
#     hive-service*.jar
#     libfb303-0.9.0.jar
#     libthrift-0.9.0.jar
#     log4j-1.2.16.jar
#     slf4j-api-1.6.1.jar
#     slf4j-log4j12-1.6.1.jar
#     commons-logging-1.0.4.jar
#
#
# To run the program using kerberos secure mode, we need the following jars in the classpath
#     hive-exec*.jar
#     commons-configuration-1.6.jar (This is not needed with Hadoop 2.6.x and later).
#  and from hadoop
#     hadoop-core*.jar (use hadoop-common*.jar for Hadoop 2.x)
#
# To run the program in embedded mode, we need the following additional jars in the classpath
# from hive/build/dist/lib
#     hive-exec*.jar
#     hive-metastore*.jar
#     antlr-runtime-3.0.1.jar
#     derby.jar
#     jdo2-api-2.1.jar
#     jpox-core-1.2.2.jar
#     jpox-rdbms-1.2.2.jar
# and from hadoop/build
#     hadoop-core*.jar
# as well as hive/build/dist/conf, any HIVE_AUX_JARS_PATH set, 
# and hadoop jars necessary to run MR jobs (eg lzo codec)
 
$ java -cp $CLASSPATH HiveJdbcClient

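Alternatively, you can run the following bash script, which seeds the data file and builds the classpath before invoking the client. The script also adds all the additional jars needed to use HiveServer2 in embedded mode.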

#!/bin/bash
HADOOP_HOME=/your/path/to/hadoop
HIVE_HOME=/your/path/to/hive
 
echo -e '1\x01foo' > /tmp/a.txt
echo -e '2\x01bar' >> /tmp/a.txt
 
HADOOP_CORE=$(ls $HADOOP_HOME/hadoop-core*.jar)
CLASSPATH=.:$HIVE_HOME/conf:$(hadoop classpath)
 
for i in ${HIVE_HOME}/lib/*.jar ; do
    CLASSPATH=$CLASSPATH:$i
done
 
java -cp $CLASSPATH HiveJdbcClient
