HiveServer2 Explained

Source: Internet | Published by: mac C language programming software | Editor: 程序博客网 | Time: 2024/06/07 03:43

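1 HiveServer2 Overview

HiveServer2 (HS2) is a service that lets clients run queries against Hive. Unlike HiveServer1, it supports concurrent requests from multiple clients as well as authentication.

The Hive CLI and hive -e are fairly limited ways of working with Hive. HS2 lets remote clients written in languages such as Java and Python submit requests to Hive and retrieve the results.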

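HS2 uses TThreadPoolServer for TCP mode and a Jetty server for HTTP mode.

TThreadPoolServer allocates one worker thread per TCP connection. Each thread stays bound to its connection even while the connection is idle, so there is a potential performance problem: a large number of connections produces a large number of threads. A future version may switch to TThreadedSelectorServer.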
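The cost of the thread-per-connection model can be seen without Thrift at all. The following is a minimal sketch, assuming a plain java.util.concurrent pool stands in for the server's worker pool; the class IdleConnectionDemo and its simulated "connections" are hypothetical and are not Hive or Thrift code. Each simulated connection does no work, yet it keeps one worker thread occupied until it is closed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo, not Hive code: models "one worker thread per connection".
final class IdleConnectionDemo {

    // Opens `connections` simulated idle connections and returns how many
    // worker threads the pool holds while all of them are still open.
    static int threadsHeldByIdleConnections(int connections) {
        // A SynchronousQueue never buffers tasks, so every accepted
        // connection is handed to its own worker thread.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, connections, 60L, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch allOpen = new CountDownLatch(connections);
        CountDownLatch closeAll = new CountDownLatch(1);
        try {
            for (int i = 0; i < connections; i++) {
                pool.execute(() -> {
                    allOpen.countDown();
                    // The connection is idle, but its worker stays blocked on it.
                    await(closeAll);
                });
            }
            await(allOpen);
            return pool.getPoolSize();
        } finally {
            closeAll.countDown(); // "close" every connection
            pool.shutdown();
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("worker threads held: " + threadsHeldByIdleConnections(20));
    }
}
```

The number of threads grows one-for-one with the number of open connections, which is the behaviour a selector-based server is meant to avoid.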

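HTTP mode is intended for deployments where a proxy such as HAProxy sits between the client and the server, mainly for load balancing or similar reasons.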

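2 HiveServer2 Configuration Parameters

hive.server2.transport.mode: HS2 transport mode, either binary or http

hive.server2.thrift.http.port: HTTP listening port, default 10001

hive.server2.thrift.http.min.worker.threads: minimum number of worker threads in HTTP mode

hive.server2.thrift.http.max.worker.threads: maximum number of worker threads in HTTP mode

hive.server2.thrift.worker.keepalive.time: keepalive time of an idle worker thread

When the number of worker threads exceeds the minimum, idle worker threads are killed once they have been idle for longer than this keepalive time.

hive.server2.thrift.max.message.size: maximum size of a message that HS2 will accept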

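hive.server2.authentication: HS2 authentication mechanism:

NONE: no authentication check

LDAP: LDAP-based authentication

KERBEROS: Kerberos-based authentication

CUSTOM: custom authentication provider

PAM: Pluggable Authentication Modules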

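hive.server2.thrift.port: TCP listening port, default 10000

hive.server2.thrift.min.worker.threads: minimum number of worker threads in TCP mode, default 5

hive.server2.thrift.max.worker.threads: maximum number of worker threads in TCP mode, default 500

hive.server2.thrift.bind.host: host that the TCP interface binds to, default localhost

hive.server2.thrift.http.max.idle.time: maximum idle time for a connection on the server in HTTP mode

hive.server2.thrift.http.worker.keepalive.time: keepalive time of an idle worker thread in HTTP mode

hive.server2.async.exec.threads: number of threads in the asynchronous execution thread pool, that is, the maximum concurrency it allows; default 50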
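The interplay of the minimum, maximum and keepalive settings above is easier to see in a running pool. The following is a minimal sketch, assuming the three settings behave like the core size, maximum size and keep-alive time of java.util.concurrent.ThreadPoolExecutor; the class WorkerKeepAliveDemo is hypothetical and is not Hive code. It loads the pool up to its maximum, lets every worker go idle, and reports the pool size before and after the keepalive time has passed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo, not Hive code: min/max worker threads and keepalive
// expressed as a java.util.concurrent.ThreadPoolExecutor.
final class WorkerKeepAliveDemo {

    // Returns {pool size under full load, pool size after the idle period}.
    static int[] peakAndSettledPoolSize(int minWorkers, int maxWorkers, long keepAliveMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                minWorkers, maxWorkers, keepAliveMillis, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>());
        CountDownLatch allBusy = new CountDownLatch(maxWorkers);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Keep maxWorkers tasks running at once so the pool grows to its maximum.
            for (int i = 0; i < maxWorkers; i++) {
                pool.execute(() -> {
                    allBusy.countDown();
                    await(release);
                });
            }
            await(allBusy);
            int peak = pool.getPoolSize();

            // Let every worker go idle, then wait (at most 10 s) for the
            // workers above the minimum to time out.
            release.countDown();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(10);
            while (pool.getPoolSize() > minWorkers && System.nanoTime() < deadline) {
                sleep(20);
            }
            return new int[] {peak, pool.getPoolSize()};
        } finally {
            pool.shutdown();
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = peakAndSettledPoolSize(5, 50, 200L);
        System.out.println("under load: " + sizes[0] + ", after keepalive: " + sizes[1]);
    }
}
```

Only the threads above the minimum are reclaimed; the minimum number of workers stays alive however long the pool is idle.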

3 HiveServer2 Clients

3.1 beeline

3.1.1 Using beeline in scripts

When beeline is run from a script, options such as -u, -n, -p and -e tell it directly what to do. The available options are:

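-u <database URL>: URL of the database to connect to

-n <username>: user name for the connection

-p <password>: password for the connection

-d <driver class>: name of the driver class to use

-i <init file>: script file used for initialization

-e <query>: query to execute

-f <exec file>: script file to execute

-w/--password-file <password file>: read the password from a file

--hiveconf <key=value>: set a Hive configuration property

--hivevar name=value: set a Hive variable

--showHeader=[true/false]: whether to show column names in query results

--autoCommit=[true/false]: whether to commit transactions automatically

--force=[true/false]: whether to keep running a script after an error

--outputformat=[table/vertical/csv2/tsv2/dsv]: format used to display results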

Examples:

beeline -u jdbc:hive2://hadoop-all-02:10000 -n hadoop -p hadoop -e 'USE hadoop; SELECT * FROM emp'

Or:

beeline -u jdbc:hive2://hadoop-all-02:10000 -n hadoop -p hadoop -e 'SELECT * FROM hadoop.emp'

beeline -u jdbc:hive2://hadoop-all-02:10000 -n hadoop -p hadoop -e 'SELECT * FROM hadoop.emp' \
-d org.apache.hive.jdbc.HiveDriver

beeline -u jdbc:hive2://hadoop-all-02:10000 -n hadoop -p hadoop -f '/opt/shell/hs2.sql'
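The same invocations can be assembled from a JVM program instead of a shell script. The following is a minimal sketch, assuming beeline is on the PATH; the class BeelineCommand and its method are hypothetical helpers, and only the -u, -n, -p and -e options listed above are used. Because each argument is passed as a separate list element, the query needs no shell quoting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive: builds a beeline command line.
final class BeelineCommand {

    // Returns the argument list for running a single query with beeline.
    static List<String> queryCommand(String url, String user, String password, String query) {
        List<String> cmd = new ArrayList<>();
        cmd.add("beeline");
        cmd.add("-u");
        cmd.add(url);
        cmd.add("-n");
        cmd.add(user);
        cmd.add("-p");
        cmd.add(password);
        cmd.add("-e");
        cmd.add(query);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = queryCommand(
                "jdbc:hive2://hadoop-all-02:10000", "hadoop", "hadoop",
                "SELECT * FROM hadoop.emp");
        ProcessBuilder builder = new ProcessBuilder(cmd).inheritIO();
        // Calling builder.start() would launch beeline. It is left out here so
        // that the sketch also runs on a machine without a Hive installation.
        System.out.println(String.join(" ", builder.command()));
    }
}
```

Passing the password with -p exposes it in the process list, so a script meant for shared machines may prefer the -w/--password-file option instead.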

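3.1.2 Using beeline interactively

!connect: open a new connection to the database

!close: close the current database connection

!closeall: close all open connections

!columns: list all columns of the specified table

!commit: commit the current transaction

!describe: describe a table

!dropall: drop all tables in the current database

!indexes: list the indexes of the specified table

!list: list the current connections

!outputformat: set the output format

!procedures: list all stored procedures

!properties: connect to the database described in the specified properties file

!quit: exit the program

!rollback: roll back the transaction

!run: run the script in the specified file

!set: set a variable

!sh: run a Linux shell command

!tables: list all tables in the database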

Examples:

!connect jdbc:hive2://hadoop-all-02:10000 hadoop hadoop

!sh rm -rf /opt/shell

3.2 JDBC

3.2.1 JDBC connection URL format, TCP mode

jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName;initFile=<file>;sess_var_list?hive_conf_list#hive_var_list

The URL can name several HiveServer2 instances; multiple instances are separated by commas.
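The pieces of this URL format can be put together programmatically. The following is a minimal sketch, assuming each of the three variable lists is a semicolon-separated sequence of key=value pairs; the class HiveBinaryUrl is a hypothetical helper, the second host name and the deptno variable are made up for illustration, and no escaping of special characters is attempted. Whether a multi-host list is honored depends on how the servers are set up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the Hive JDBC driver.
final class HiveBinaryUrl {

    // Builds jdbc:hive2://host:port[,host:port...]/db[;sess][?conf][#vars].
    static String build(List<String> hostPorts, String dbName,
                        Map<String, String> sessionVars,
                        Map<String, String> hiveConf,
                        Map<String, String> hiveVars) {
        if (hostPorts.isEmpty()) {
            throw new IllegalArgumentException("at least one host:port is required");
        }
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(String.join(",", hostPorts))
                .append('/')
                .append(dbName);
        if (!sessionVars.isEmpty()) {
            url.append(';').append(join(sessionVars));
        }
        if (!hiveConf.isEmpty()) {
            url.append('?').append(join(hiveConf));
        }
        if (!hiveVars.isEmpty()) {
            url.append('#').append(join(hiveVars));
        }
        return url.toString();
    }

    // Joins the entries as key=value pairs separated by semicolons.
    private static String join(Map<String, String> vars) {
        StringJoiner joiner = new StringJoiner(";");
        for (Map.Entry<String, String> entry : vars.entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> hiveVars = new LinkedHashMap<>();
        hiveVars.put("deptno", "10"); // illustrative Hive variable
        String url = build(
                Arrays.asList("hadoop-all-02:10000", "hadoop-all-03:10000"),
                "hadoop",
                Collections.<String, String>emptyMap(),
                Collections.<String, String>emptyMap(),
                hiveVars);
        System.out.println(url);
    }
}
```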

3.2.2 JDBC connection URL format, HTTP mode

jdbc:hive2://<host>:<port>/<db>;transportMode=http;httpPath=<http_endpoint>

http_endpoint: the HTTP endpoint configured in hive-site.xml, default cliservice

The default port is 10001.
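As a small illustration of the HTTP format, the following sketch fills in the two defaults mentioned above. The class HiveHttpUrl is a hypothetical helper, not part of the Hive JDBC driver, and it assumes the server keeps the default port 10001 and the default endpoint cliservice unless told otherwise.

```java
// Hypothetical helper, not part of the Hive JDBC driver.
final class HiveHttpUrl {

    static final int DEFAULT_HTTP_PORT = 10001;
    static final String DEFAULT_HTTP_PATH = "cliservice";

    // Builds an HTTP-mode URL with an explicit port and endpoint.
    static String build(String host, int port, String db, String httpPath) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db
                + ";transportMode=http;httpPath=" + httpPath;
    }

    // Builds an HTTP-mode URL using the default port and endpoint.
    static String build(String host, String db) {
        return build(host, DEFAULT_HTTP_PORT, db, DEFAULT_HTTP_PATH);
    }

    public static void main(String[] args) {
        System.out.println(build("hadoop-all-02", "hadoop"));
    }
}
```

The resulting string can be passed to DriverManager.getConnection in the same way as a TCP-mode URL.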

Code example:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class HiveJDBCTools {
    public static final String HIVE_DRIVER = "org.apache.hive.jdbc.HiveDriver";
    public static final String HIVE_URL = "jdbc:hive2://hadoop-all-02:10000/hadoop";
    public static final String USERNAME = "hadoop";
    public static final String PASSWORD = "hadoop";

    // Loads the Hive JDBC driver and opens a connection; returns null on failure.
    public static Connection getConnection() {
        try {
            Class.forName(HIVE_DRIVER);
            return DriverManager.getConnection(HIVE_URL, USERNAME, PASSWORD);
        } catch (ClassNotFoundException e) {
            // The Hive JDBC driver is not on the classpath.
            e.printStackTrace();
        } catch (SQLException e) {
            // The connection to HiveServer2 could not be opened.
            e.printStackTrace();
        }
        return null;
    }

    // Prepares a statement on the given connection; returns null on failure.
    public static PreparedStatement prepare(Connection conn, String sql) {
        try {
            return conn.prepareStatement(sql);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return null;
    }

    // Closes the result set, statement and connection; null arguments are skipped.
    public static void close(ResultSet rs, PreparedStatement ps, Connection conn) {
        try {
            if (rs != null) {
                rs.close();
            }
            if (ps != null) {
                ps.close();
            }
            if (conn != null) {
                conn.close();
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
        System.out.println("Resource Closed!!");
    }
}

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class HiveQueryTools {
    // Runs the query and prints every row as one tab-separated line.
    public static void query(String sql) {
        Connection conn = HiveJDBCTools.getConnection();
        if (conn == null) {
            return;
        }
        PreparedStatement ps = HiveJDBCTools.prepare(conn, sql);
        if (ps == null) {
            // Release the connection that was already opened.
            HiveJDBCTools.close(null, null, conn);
            return;
        }
        ResultSet rs = null;
        try {
            rs = ps.executeQuery();
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                for (int i = 0; i < columns; i++) {
                    // JDBC column indexes start at 1.
                    System.out.print(rs.getString(i + 1));
                    System.out.print("\t");
                }
                System.out.println();
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
        HiveJDBCTools.close(rs, ps, conn);
    }

    public static void main(String[] args) {
        String sql = "SELECT e.ename,e.job,d.dname,d.loc FROM emp e JOIN dept d ON e.deptno = d.deptno";
        HiveQueryTools.query(sql);
    }
}

package com.hive.client;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcClient {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    /**
     * @param args
     * @throws SQLException
     */
    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            // The Hive JDBC driver is not on the classpath, so nothing else can run.
            e.printStackTrace();
            System.exit(1);
        }
        // replace "hadoop" here with the name of the user the queries should run as
        Connection con = DriverManager.getConnection("jdbc:hive2://hadoop-all-02:10000/hadoop", "hadoop", "hadoop");
        Statement stmt = con.createStatement();
        String tableName = "testHiveDriverTable";
        stmt.execute("drop table if exists " + tableName);
        stmt.execute("create table " + tableName + " (key int, value string)");

        // show tables
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
            System.out.println(res.getString(1));
        }

        // describe table
        sql = "describe " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1) + "\t" + res.getString(2));
        }

        // load data into table
        // NOTE: filepath has to be local to the hive server
        // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
        String filepath = "/tmp/a.txt";
        sql = "load data local inpath '" + filepath + "' into table " + tableName;
        System.out.println("Running: " + sql);
        stmt.execute(sql);

        // select * query
        sql = "select * from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
        }

        // regular hive query
        sql = "select count(1) from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1));
        }
    }
}
