SparkSQL fails to write data to Hive under Kerberos authentication


While debugging SparkSQL against Hive today I hit an error; it appeared as soon as the HiveContext was initialized:
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
Some context on our setup: the whole cluster is secured with Kerberos, so a hive-site.xml containing the Kerberos settings must be present under /etc/spark/conf/. Our cluster also runs Hive on the Tez engine, so hive-site.xml contains Tez configuration as well.
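For illustration, a hive-site.xml of the kind described above might look like the sketch below. The property values here are placeholders (EXAMPLE.COM is not our realm), not our actual configuration; the point is that Kerberos entries and the Tez engine entry live side by side in the same file:

```xml
<configuration>
  <!-- Kerberos settings (placeholder values) -->
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.kerberos.principal</name>
    <value>hive/_HOST@EXAMPLE.COM</value>
  </property>
  <!-- Tez engine setting: this is the kind of entry that must be
       removed before handing the file to Spark -->
  <property>
    <name>hive.execution.engine</name>
    <value>tez</value>
  </property>
</configuration>
```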
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS];
Host Details : local host is: "NM-304-SA5212M4-BIGDATA-552/10.142.116.52"; destination host is: "NM-304-RH5885V3-BIGDATA-007":8032;
17/06/13 13:34:05 INFO RetryInvocationHandler: Exception while invoking getNewApplication of class ApplicationClientProtocolPBClientImpl over rm2 after 3 fail over attempts. Trying to fail over after sleeping for 43105ms.
java.net.ConnectException: Call From NM-304-SA5212M4-BIGDATA-2016-244.BIGDATA.CHINATELECOM.CN/10.142.116.192 to NM-304-RH5885V3-BIGDATA-008:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy31.getNewApplication(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)

The root cause: when running in spark on yarn-cluster mode, the user must ship Spark's Hive connection configuration, i.e. the hive-site.xml file, along with the job; it can be passed with --files when submitting the Spark application. One important caveat: the hive-site.xml handed to Spark must have its Tez configuration removed, otherwise the job fails. Our user had pointed --files at the hive-site.xml under /etc/hive/conf/, which still contained the Tez settings, and sure enough the error above appeared. After switching to the Tez-free hive-site.xml under /etc/spark/conf/, the job ran correctly. Note that this problem does not occur in spark on yarn-client mode or local mode.
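The Tez entries can be removed by hand, or mechanically. As a convenience sketch (the function name and file paths below are our own choices, not part of the original fix), a small script can copy hive-site.xml while dropping every Tez-related property:

```python
import xml.etree.ElementTree as ET

def strip_tez_properties(src_path, dst_path):
    """Copy a Hadoop-style hive-site.xml, dropping every <property>
    whose <name> mentions tez or whose <value> is 'tez' (the engine
    setting), so Spark does not try to load Tez classes."""
    tree = ET.parse(src_path)
    root = tree.getroot()  # the top-level <configuration> element
    for prop in list(root.findall("property")):
        name = prop.findtext("name", default="")
        value = prop.findtext("value", default="")
        if "tez" in name.lower() or value.strip().lower() == "tez":
            root.remove(prop)
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)
```

Run something like `strip_tez_properties("/etc/hive/conf/hive-site.xml", "hive-site-spark.xml")` once, then point --files at the cleaned copy.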