thrift.transport.TTransport.TTransportException: TSocket read 0 bytes when Python connects to HiveServer2 through pyhs2 to operate a Hive database
Source: Internet  Editor: 程序博客网  Time: 2024/06/06 01:25
When Python uses the pyhs2 package to connect to HiveServer2 and operate a Hive database, the script fails with the following error:
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes
or
Required field 'sessionHandle' is unset! Struct:TExecuteStatementReq(sessionHandle:null, statement:USE default, confOverlay:{})
The HiveServer2 run log shows the following errors:
2017-10-12T14:24:03,540 WARN [HiveServer2-Handler-Pool: Thread-39] service.CompositeService: Failed to open session
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: a6 is not allowed to impersonate anonymous
………………
ERROR [HiveServer2-Handler-Pool: Thread-39] server.TThreadPoolServer: Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
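Preface: before we begin
I recently started working on big-data projects, and because I am interested in the related software (Hadoop, HBase, Hive, Thrift, ZooKeeper and so on), I used gaps in my work to set up a pseudo-distributed environment for them on my local Mac. A few days ago I finished, without trouble, a small Python program that operates HBase through Thrift, so next I wanted to operate a Hive database from Python in the same way. The program did work in the end, but the road there was bumpy, mainly because of the problem described below. Searching Baidu turned up few answers, and the ones I found were baffling. It cost me almost a whole day until, with help from guru Luo, the problem was solved. This post only documents how that problem was solved. Thanks again to guru Luo and Brother Feng for their help. The main lesson: make full use of the logs!
Enough chit-chat; on to the main text. This post first states the problem to be solved, then tracks down the cause of the error, then gives the cause analysis and the solution, and finally adds a note on the difference between HiveServer and HiveServer2 in Hive.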
localhost:bin a6$ pwd
/Users/a6/Applications/apache-hive-2.3.0-bin/bin
localhost:bin a6$ hive --service hiveserver2 &
localhost:bin a6$ sudo pip install pyhs2
Password:
The directory '/Users/a6/Library/Caches/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/Users/a6/Library/Caches/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Requirement already satisfied: pyhs2 in /Library/Python/2.7/site-packages
Requirement already satisfied: sasl in /Library/Python/2.7/site-packages (from pyhs2)
Requirement already satisfied: thrift in /Library/Python/2.7/site-packages/thrift-0.10.0-py2.7-macosx-10.12-intel.egg (from pyhs2)
Requirement already satisfied: six in /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python (from sasl->pyhs2)
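# Python 2.7 script (print statements), matching the interpreter used above
import pyhs2

# Connect to the local HiveServer2 started earlier (default Thrift port 10000)
with pyhs2.connect(host='localhost',
                   port=10000,
                   authMechanism="NOSASL",
                   user='a6',
                   password=''
                   # password='anonymous'
                   ) as conn:
    with conn.cursor() as cur:
        # Show databases
        print "connect hive database success"
        print cur.getDatabases()
        print "read data success"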
/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7 /Users/a6/Downloads/PycharmProjects/test_use_hbase_by_thrift/test11.py
Traceback (most recent call last):
  File "/Users/a6/Downloads/PycharmProjects/test_use_hbase_by_thrift/test11.py", line 13, in <module>
    print "sucess"
  File "/Library/Python/2.7/site-packages/pyhs2/connections.py", line 58, in __exit__
    self.close()
  File "/Library/Python/2.7/site-packages/pyhs2/connections.py", line 78, in close
    self.client.CloseSession(req)
  File "/Library/Python/2.7/site-packages/pyhs2/TCLIService/TCLIService.py", line 184, in CloseSession
    return self.recv_CloseSession()
  File "/Library/Python/2.7/site-packages/pyhs2/TCLIService/TCLIService.py", line 195, in recv_CloseSession
    (fname, mtype, rseqid) = self._iprot.readMessageBegin()
  File "build/bdist.macosx-10.12-intel/egg/thrift/protocol/TBinaryProtocol.py", line 134, in readMessageBegin
  File "build/bdist.macosx-10.12-intel/egg/thrift/protocol/TBinaryProtocol.py", line 217, in readI32
  File "build/bdist.macosx-10.12-intel/egg/thrift/transport/TTransport.py", line 60, in readAll
  File "build/bdist.macosx-10.12-intel/egg/thrift/transport/TTransport.py", line 161, in read
  File "build/bdist.macosx-10.12-intel/egg/thrift/transport/TSocket.py", line 132, in read
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes
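The message "TSocket read 0 bytes" means that a socket read on the client returned an empty buffer, that is, the other end closed the connection. The following is a minimal sketch of that condition using only the standard library. No Hive or Thrift is involved, and the helper name read_after_peer_close is made up for illustration.

```python
import socket


def read_after_peer_close():
    """Return what recv() yields once the other end has closed the connection."""
    # socketpair() gives two connected sockets in one process,
    # standing in for the Python client and the server.
    client, server = socket.socketpair()
    try:
        server.close()            # the "server" drops the connection
        return client.recv(4096)  # an empty result signals end-of-stream
    finally:
        client.close()


data = read_after_peer_close()
print(len(data))  # 0
```

Thrift's Python TSocket raises TTransportException with this message when it sees such an empty read, so the traceback only says that the server hung up. The actual reason has to be looked up on the server side.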
2. Finding the Hive execution log through the web UI
localhost:conf a6$ pwd
/Users/a6/Applications/apache-hive-2.3.0-bin/conf
localhost:conf a6$ vi hive-site.xml
<property>
  <name>hive.server2.webui.host</name>
  <value>0.0.0.0</value>
  <description>The host address the HiveServer2 WebUI will listen on</description>
</property>
<property>
  <name>hive.server2.webui.port</name>
  <value>10002</value>
  <description>The port the HiveServer2 WebUI will listen on. This can be set to 0 or a negative integer to disable the web UI</description>
</property>
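With these two properties, the address of the web UI can be derived from the configuration. The sketch below reads them from an inline sample fragment rather than the real file, and it assumes the browser runs on the same machine as HiveServer2, hence localhost. The function names read_properties and webui_url are illustrative only.

```python
import xml.etree.ElementTree as ET

# Sample fragment mirroring the two properties above; a real hive-site.xml
# wraps its <property> elements in a <configuration> root like this.
HIVE_SITE = """
<configuration>
  <property>
    <name>hive.server2.webui.host</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>hive.server2.webui.port</name>
    <value>10002</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return the name -> value pairs of a Hadoop-style configuration."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def webui_url(props, browse_host="localhost"):
    """Build the URL to open in a browser; 0.0.0.0 is a listen address only."""
    host = props.get("hive.server2.webui.host", browse_host)
    if host == "0.0.0.0":
        host = browse_host  # assumption: browsing from the same machine
    port = int(props["hive.server2.webui.port"])
    if port <= 0:
        return None  # per the property description, the web UI is disabled
    return "http://{}:{}/".format(host, port)


print(webui_url(read_properties(HIVE_SITE)))  # http://localhost:10002/
```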
2017-10-12T14:20:45,755 INFO [HiveServer2-Handler-Pool: Thread-42] session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: Thread-42
2017-10-12T14:20:45,760 WARN [HiveServer2-Handler-Pool: Thread-42] thrift.ThriftCLIService: Error opening session:
org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: a6 is not allowed to impersonate anonymous
    at org.apache.hive.service.cli.session.SessionManager.createSession(SessionManager.java:419) ~[hive-service-2.3.0.jar:2.3.0]
    at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:362) ~[hive-service-2.3.0.jar:2.3.0]
    at org.apache.hive.service.cli.CLIService.openSessionWithImpersonation(CLIService.java:193) ~[hive-service-2.3.0.jar:2.3.0]
    at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:440) ~[hive-service-2.3.0.jar:2.3.0]
    at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:322) ~[hive-service-2.3.0.jar:2.3.0]
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1377) ~[hive-exec-2.3.0.jar:2.3.0]
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1362) ~[hive-exec-2.3.0.jar:2.3.0]
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-2.3.0.jar:2.3.0]
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-2.3.0.jar:2.3.0]
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) ~[hive-service-2.3.0.jar:2.3.0]
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-2.3.0.jar:2.3.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: a6 is not allowed to impersonate anonymous
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:89) ~[hive-service-2.3.0.jar:2.3.0]
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) ~[hive-service-2.3.0.jar:2.3.0]
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) ~[hive-service-2.3.0.jar:2.3.0]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692) ~[hadoop-common-2.6.5.jar:?]
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) ~[hive-service-2.3.0.jar:2.3.0]
    at com.sun.proxy.$Proxy37.open(Unknown Source) ~[?:?]
    at org.apache.hive.service.cli.session.SessionManager.createSession(SessionManager.java:410) ~[hive-service-2.3.0.jar:2.3.0]
    ... 13 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: a6 is not allowed to impersonate anonymous

III. Cause analysis and solution
When Python connects to Hive, it reports the following message:
Required field 'sessionHandle' is unset! Struct:TExecuteStatementReq(sessionHandle:null, statement:USE default, confOverlay:{})
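When this message appears, it means something is wrong with the parameters used to connect to Hive.
In my case the problem was the Hive account, which did not have sufficient permissions.
Check the Hive username and the other connection details.
It can also be caused by the project's hive-jdbc version not matching the server's version; switching to the version that matches the server fixes it. PS: early Hive versions had many bugs, so the latest version is recommended.
My error was caused by the user running the Hive query being different from the user under which Hadoop and Hive were configured. The HiveServer2 log makes this concrete: the server runs as a6 and tries to impersonate the connecting user (anonymous), which Hadoop rejects with "User: a6 is not allowed to impersonate anonymous" unless a6 is configured as a proxy user.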
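Since the decisive clue is a single phrase buried in a long stack trace, it can help to pull it out of the log programmatically. The sketch below assumes the message keeps the exact wording seen in the log above. The pattern and the function name find_impersonation_failure are illustrative and not part of any Hive tooling.

```python
import re

# Pattern for the AuthorizationException text seen in the HiveServer2 log.
IMPERSONATION_RE = re.compile(r"User: (\S+) is not allowed to impersonate (\S+)")


def find_impersonation_failure(log_text):
    """Return (proxy_user, impersonated_user) from the first match, or None."""
    match = IMPERSONATION_RE.search(log_text)
    return match.groups() if match else None


line = ("java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException"
        "(org.apache.hadoop.security.authorize.AuthorizationException): "
        "User: a6 is not allowed to impersonate anonymous")
print(find_impersonation_failure(line))  # ('a6', 'anonymous')
```

The first name is the account that needs hadoop.proxyuser.<name>.hosts and hadoop.proxyuser.<name>.groups entries in core-site.xml.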
2. Solution
1). Edit the Hadoop configuration file etc/hadoop/core-site.xml and add the following properties:
<!--10-12 add-->
<property>
  <name>hadoop.proxyuser.a6.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.a6.groups</name>
  <value>*</value>
</property>
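2). The final configuration was shown as a screenshot in the original post. With the properties in place, start Hadoop again so that the new settings are loaded: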
localhost:hadoop a6$ pwd
/Users/a6/Applications/hadoop-2.6.5/etc/hadoop
localhost:hadoop a6$ sh ../../sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
17/10/12 15:07:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /Users/a6/Applications/hadoop-2.6.5/logs/hadoop-a6-namenode-localhost.out
localhost: starting datanode, logging to /Users/a6/Applications/hadoop-2.6.5/logs/hadoop-a6-datanode-localhost.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /Users/a6/Applications/hadoop-2.6.5/logs/hadoop-a6-secondarynamenode-localhost.out
17/10/12 15:08:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /Users/a6/Applications/hadoop-2.6.5/logs/yarn-a6-resourcemanager-localhost.out
localhost: starting nodemanager, logging to /Users/a6/Applications/hadoop-2.6.5/logs/yarn-a6-nodemanager-localhost.out
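If the error persists after Hadoop is back up, it may be worth confirming that core-site.xml really contains both proxy-user entries for the account HiveServer2 runs as. The following sketch works on an inline sample rather than the real file. The function name missing_proxyuser_keys is hypothetical, and the check only looks for the presence of the keys, not for whether their values are appropriate.

```python
import xml.etree.ElementTree as ET

# Sample mirroring the core-site.xml addition from step 1).
CORE_SITE = """
<configuration>
  <property>
    <name>hadoop.proxyuser.a6.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.a6.groups</name>
    <value>*</value>
  </property>
</configuration>
"""


def missing_proxyuser_keys(core_site_xml, user):
    """Return the hadoop.proxyuser.<user>.* keys that are absent or empty."""
    root = ET.fromstring(core_site_xml)
    props = {p.findtext("name"): p.findtext("value") for p in root.iter("property")}
    wanted = ["hadoop.proxyuser.{}.hosts".format(user),
              "hadoop.proxyuser.{}.groups".format(user)]
    return [key for key in wanted if not props.get(key)]


print(missing_proxyuser_keys(CORE_SITE, "a6"))  # []
```

An empty list means both keys are set for that user. Any key that is returned still needs to be added to core-site.xml.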
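In my earlier study and practice with Hive I always used the CLI or hive -e. That approach only allows HiveQL to be used for queries, updates and similar operations, and it is rather clumsy and limited. Fortunately Hive provides a thin-client implementation: through HiveServer or HiveServer2, a client can operate on the data in Hive without starting the CLI. Both allow remote clients to submit requests to Hive and fetch the results using a variety of programming languages such as Java and Python. Both HiveServer and HiveServer2 are built on Thrift, but HiveServer is sometimes called the Thrift server, whereas HiveServer2 is not.
Since HiveServer already exists, why is HiveServer2 needed? Because HiveServer cannot handle concurrent requests from more than one client. This limitation comes from the Thrift interface that HiveServer uses and cannot be fixed by modifying the HiveServer code. HiveServer was therefore rewritten in Hive 0.11.0 as HiveServer2, which solved the problem. HiveServer2 supports multi-client concurrency and authentication, and provides better support for open API clients such as JDBC and ODBC.
Since HiveServer2 is the more capable of the two, I will focus on it, while also taking a brief look at how HiveServer is used. Enter hive --service help on the command line to see the usage. A specific service such as cli, hiveserver or hiveserver2 can be started with hive <parameters> --service serviceName <serviceparameters>.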