Oozie error: Py4JJavaError Secu…

Source: the Internet. Editor: 程序博客网. Time: 2024/06/05 04:53
After fixing the missing py4j.zip and pyspark.zip problem, the job still returned exit code 1:

【Back to the problem of exit code 1】
An answer at http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/SparkStreaming-ExitCodeException-exitCode-13/m-p/32832
explains:
I got the solution.
In my Spark Streaming application I had set SparkConf.setMaster("local[*]") and in spark-submit I was providing --master yarn-cluster.
So there was conflict in both the masters and it was remaining in ACCEPTED state and exiting.
【Attempt 1】Remove setMaster("local[*]"), re-upload to Oozie and run the job again
【Still the same error】
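
Before re-uploading, it can help to confirm that no hard-coded master is left anywhere in the job script. The sketch below is only an illustration: the helper name find_hardcoded_masters and its regular expression are assumptions made here, not part of Spark or Oozie, and it only catches setMaster calls whose argument is a literal string.

```python
import re

# Matches calls such as .setMaster("local[*]") or .setMaster('yarn-cluster').
SET_MASTER = re.compile(r"""\.setMaster\(\s*["']([^"']+)["']\s*\)""")


def find_hardcoded_masters(source):
    """Return (line_number, master) pairs for every hard-coded setMaster call."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Ignore comments (naive: this also cuts at a '#' inside a string literal).
        code = line.split("#", 1)[0]
        for match in SET_MASTER.finditer(code):
            hits.append((number, match.group(1)))
    return hits


if __name__ == "__main__":
    script = 'conf = SparkConf().setAppName("demo").setMaster("local[*]")\n'
    for number, master in find_hardcoded_masters(script):
        print("line %d hard-codes master %r" % (number, master))
```

An empty result means the master is left to whatever the launcher (here Oozie) passes in.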

【Attempt 2: try deleting the conflicting javax.servlet】
First, search under the oozie directory:
$ find oozie -name 'javax*.*'
oozie/hadooplib/share/hadoop/mapreduce/lib/javax.inject-1.jar
oozie/hadooplib/share/hadoop/yarn/lib/javax.inject-1.jar
oozie/oozie-4.3.0/oozie-server/webapps/oozie/WEB-INF/lib/javax.inject-1.jar
oozie/oozie-4.3.0/libext/javax.inject-1.jar
oozie/oozie-4.3.0/share/lib/hive/javax.inject-1.jar
oozie/oozie-4.3.0/share/lib/hive2/javax.inject-1.jar
oozie/oozie-4.3.0/share/lib/spark/javax.servlet-3.0.0.v201112011016.jar
oozie/oozie-4.3.0/lib/javax.inject-1.jar
Next, search the spark and hadoop directories in turn:
$ find hadoop-2.7.2 -name 'javax*.*'
hadoop-2.7.2/share/hadoop/mapreduce/lib/javax.inject-1.jar
hadoop-2.7.2/share/hadoop/yarn/lib/javax.inject-1.jar
【Nothing at all was found under spark】
$ find spark-1.6.2-bin-hadoop2.6 -name 'javax*.*'
$
【Following a tutorial, first delete javax.servlet-3.0.0.v201112011016.jar from the Oozie sharelib on HDFS】 http://blog.csdn.net/shuxue051/article/details/47256171
Restart Oozie and run the job again.
【Error: javax.servlet cannot be found】
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoClassDefFoundError: javax/servlet/FilterRegistration

【Search all of /usr/share again for javax.servlet】
share]$ find -name 'javax.servlet*.*'
./presto-server-0.152/lib/javax.servlet-api-3.1.0.jar
./apache-hive-2.1.0-bin/lib/javax.servlet-3.0.0.v201112011016.jar
./sqoop-1.99.7-bin-hadoop200/server/lib/javax.servlet-api-3.1.0.jar
./oozie/oozie-4.3.0/share/lib/spark/javax.servlet-3.0.0.v201112011016.jar
./spark-2.0.0/jars/javax.servlet-api-3.1.0.jar
./apache-drill-1.8.0/jars/classb/javax.servlet-api-3.1.0.jar

Two versions turn up:
javax.servlet-api-3.1.0.jar (presto, sqoop, spark-2.0.0, apache-drill)
javax.servlet-3.0.0.v201112011016.jar (apache-hive, oozie)
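
Searching by file name only finds jars whose name contains javax.servlet, while the class itself can also be bundled inside a jar with an unrelated name. The following is a minimal sketch of searching by class instead. It assumes the directories are on the local file system (not HDFS), and jars_containing is a hypothetical helper written for this note.

```python
import os
import zipfile


def jars_containing(root, class_name):
    """Return sorted paths of .jar files under root that contain class_name."""
    entry = class_name.replace(".", "/") + ".class"
    found = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".jar"):
                continue
            path = os.path.join(folder, name)
            try:
                with zipfile.ZipFile(path) as jar:
                    if entry in jar.namelist():
                        found.append(path)
            except zipfile.BadZipFile:
                pass  # skip files that are not valid archives
    return sorted(found)
```

For example, jars_containing("/usr/share", "javax.servlet.FilterRegistration") would list every jar that can supply the class named in the error, whatever the jar is called.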
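【Copy the javax.servlet-api-3.1.0.jar version into the Oozie sharelib on HDFS】
The program then kept running without ever stopping. After killing it, the following error showed up: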
Exception in thread "dag-scheduler-event-loop" java.lang.NoSuchMethodError: com.esotericsoftware.kryo.Kryo.setInstantiatorStrategy(Lorg/objenesis/strategy/InstantiatorStrategy;)V
       at com.twitter.chill.KryoBase.setInstantiatorStrategy(KryoBase.scala:86)
       at com.twitter.chill.EmptyScalaKryoInstantiator.newKryo(ScalaKryoInstantiator.scala:59)
       at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:84)
       at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273)
       at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:258)
       at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:174)
According to https://github.com/twitter/chill/issues/209 , this is probably a version conflict.
【So javax.servlet-api-3.1.0.jar is presumably not right either】

【Final solution】
Exit code 1 can have countless causes; you have to track them down in YARN using the applicationId of the run.
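
As a rough sketch of that step: the YARN CLI can print the aggregated logs of a finished application with yarn logs -applicationId, and the interesting lines are usually the ones naming an Error or Exception class. The helper names yarn_logs_command and error_lines and the regular expression are assumptions made for this illustration.

```python
import re

# Lines that name a Java (or Py4J) class ending in Error or Exception.
ERROR_LINE = re.compile(r"\b[\w.$]+(?:Error|Exception)\b")


def yarn_logs_command(application_id):
    """Build the argv for the YARN CLI call that prints aggregated logs."""
    return ["yarn", "logs", "-applicationId", application_id]


def error_lines(log_text):
    """Return the de-duplicated log lines that mention an Error or Exception."""
    seen = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if ERROR_LINE.search(stripped) and stripped not in seen:
            seen.append(stripped)
    return seen
```

With the Hadoop client on the PATH, the output of subprocess.run(yarn_logs_command(app_id), capture_output=True, text=True).stdout could be passed to error_lines to get a short list of candidate root causes.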
For the problem hit this time, the YARN log showed:
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package
       …… …… ……
The cause is a version problem with javax.servlet.FilterRegistration.
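
The wording of the SecurityException suggests that classes of the javax.servlet package are being loaded from jars with different signers, for instance one signed jar and one unsigned jar. A signed jar carries signature files (.SF plus a .RSA, .DSA or .EC block) in META-INF, so a quick check is to list them. This is a minimal sketch with a hypothetical helper name; whether a particular servlet jar is signed is something to verify, not something assumed here.

```python
import zipfile

# File suffixes used for JAR signature files and signature blocks.
SIGNATURE_SUFFIXES = (".SF", ".RSA", ".DSA", ".EC")


def signature_files(jar_path):
    """Return the META-INF signature entries of a jar (empty list if unsigned)."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name for name in jar.namelist()
            if name.upper().startswith("META-INF/")
            and name.upper().endswith(SIGNATURE_SUFFIXES)
        )
```

Running it over each jar that contains javax/servlet classes shows which ones are signed and therefore cannot share that package with classes from another jar.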
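【Step 1: replace javax.servlet-3.0.0.v201112011016.jar in the Oozie sharelib on HDFS with javax.servlet-api-3.1.0.jar】
【After that the run never stops, and the YARN log shows nothing (because the application is still running); the real problem turns out to involve the Kryo package】
You can kill the application and then check the log,
or open the YARN web UI and click into each running container to view its log page.
The log says: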
Exception in thread "dag-scheduler-event-loop" java.lang.NoSuchMethodError: com.esotericsoftware.kryo.Kryo.setInstantiatorStrategy(Lorg/objenesis/strategy/InstantiatorStrategy;)V
       at com.twitter.chill.KryoBase.setInstantiatorStrategy(KryoBase.scala:86)
       at com.twitter.chill.EmptyScalaKryoInstantiator.newKryo(ScalaKryoInstantiator.scala:59)
       at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:84)
       at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273)
       at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:258)
       at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:174)
       at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:201)
       at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102)
       …… …… ……
【This problem is truly infuriating】
The Oozie sharelib on HDFS does contain kryo-2.22.jar. After loading it into NetBeans,
the setInstantiatorStrategy method of the class com.esotericsoftware.kryo.Kryo looks like this:
public void setInstantiatorStrategy(InstantiatorStrategy strategy) {
   [Compiled Code]
}
I then downloaded kryo-2.24.0.jar from the Internet.
There, the setInstantiatorStrategy method of com.esotericsoftware.kryo.Kryo looks like this:
public void setInstantiatorStrategy(org.objenesis.strategy.InstantiatorStrategy strategy) {
   [Compiled Code]
}
WTF?!?!?!?!??!
InstantiatorStrategy???!?!?
org.objenesis.strategy.InstantiatorStrategy?!?!!?
And you can't tell they are the same thing?!?!?!?

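The difference shows up in the imports. Version 2.22 begins with the following imports, where InstantiatorStrategy is a shaded (relocated) copy under com.esotericsoftware.shaded: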
package com.esotericsoftware.kryo;

import com.esotericsoftware.kryo.factories.SerializerFactory;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import com.esotericsoftware.kryo.util.FastestStreamFactory;
import com.esotericsoftware.kryo.util.IdentityMap;
import com.esotericsoftware.kryo.util.IntArray;
import com.esotericsoftware.kryo.util.ObjectMap;
import com.esotericsoftware.shaded.org.objenesis.instantiator.ObjectInstantiator;
import com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy;
import java.util.ArrayList;
whereas version 2.24.0 begins with these imports, with no shaded Objenesis classes among them:
package com.esotericsoftware.kryo;

import com.esotericsoftware.kryo.factories.SerializerFactory;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import com.esotericsoftware.kryo.util.IdentityMap;
import com.esotericsoftware.kryo.util.IntArray;
import com.esotericsoftware.kryo.util.ObjectMap;
import java.util.ArrayList;
?!@?#!?@¥?#@¥?@#¥?%¥#%…… Damn!!!!!
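
Why the two signatures are not interchangeable: the JVM links a call by method name plus descriptor, and the descriptor spells out the fully qualified parameter types. The sketch below rebuilds both descriptors from the type names shown above; type_descriptor and method_descriptor are small helpers written for this illustration (they do not handle nested classes), not part of any library.

```python
PRIMITIVES = {"void": "V", "boolean": "Z", "byte": "B", "char": "C",
              "short": "S", "int": "I", "long": "J", "float": "F", "double": "D"}


def type_descriptor(java_type):
    """Translate a Java type name into its JVM descriptor form."""
    if java_type.endswith("[]"):
        return "[" + type_descriptor(java_type[:-2])
    if java_type in PRIMITIVES:
        return PRIMITIVES[java_type]
    return "L" + java_type.replace(".", "/") + ";"


def method_descriptor(parameter_types, return_type):
    """Build the (params)return descriptor the JVM uses to link a method."""
    params = "".join(type_descriptor(t) for t in parameter_types)
    return "(" + params + ")" + type_descriptor(return_type)


# Descriptor that chill asks for (the one quoted in the NoSuchMethodError).
plain = method_descriptor(["org.objenesis.strategy.InstantiatorStrategy"], "void")
# Descriptor implied by the shaded import in kryo-2.22.jar.
shaded = method_descriptor(
    ["com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy"], "void")

print(plain)
print(shaded)
print(plain == shaded)
```

The first descriptor is the one quoted in the NoSuchMethodError. If kryo-2.22.jar only offers the second, the lookup fails even though the Java source looks identical.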

Fine, so will swapping the jar fix it?
【So you happily delete kryo-2.22.jar and upload kryo-2.24.0.jar】
Restart Oozie, run the job once more, and it reports:
Exception in thread "dag-scheduler-event-loop" java.lang.NoClassDefFoundError: com/esotericsoftware/minlog/Log
       at com.esotericsoftware.kryo.util.DefaultClassResolver.register(DefaultClassResolver.
       at com.esotericsoftware.kryo.Kryo.register(Kryo.
       at com.esotericsoftware.kryo.Kryo.<init>(Kryo.
       at com.esotericsoftware.kryo.Kryo.<init>(Kryo.
       at com.twitter.chill.KryoBase.<init>(KryoBase.scala:32)
       at com.twitter.chill.EmptyScalaKryoInstantiator.newKryo(ScalaKryoInstantiator.scala:57)
       at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:84)
       at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273)
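【Argh! The class com.esotericsoftware.minlog.Log exists only inside kryo-2.22.jar!】
So apparently both kryo jars have to sit in the sharelib; one does not replace the other!
I am speechless, speechless, and speechless again.

It is like releasing a new version and removing a feature that the old version had,
while someone else's code depends on features of both the new and the old version.
Which copy should a client use for the features that exist in both versions? Are they identical?
Will they conflict?
And once a conflict happens, how will the problem spread?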
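【Final solution】
1. Replace javax.servlet-3.0.0.v201112011016.jar in the Oozie sharelib on HDFS with javax.servlet-api-3.1.0.jar.
2. (Superseded, see the correction that follows.) Make sure the spark sharelib of Oozie on HDFS contains both kryo-2.22.jar and kryo-2.24.0.jar
(in this article that is /user/oozie/share/lib/spark).
2, corrected. I was misleading people and stand corrected. Running the py job again did fail because the two jars conflict: the Kryo class still could not be found. The right answer is to download Minlog from the Kryo project page and put it on HDFS (instead of keeping kryo-2.22.jar):
https://github.com/EsotericSoftware/kryo/blob/master/build/minlog-1.2.jar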
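
To confirm the end state, one option is to check that each class named in the errors above is provided by exactly one jar. The sketch below assumes the sharelib directory has first been copied to a local folder (for example with hdfs dfs -get); class_providers and check_classes are hypothetical helpers written for this note.

```python
import os
import zipfile
from collections import defaultdict


def class_providers(lib_dir):
    """Map each .class entry to the sorted jar names in lib_dir that contain it."""
    providers = defaultdict(list)
    for name in sorted(os.listdir(lib_dir)):
        if not name.endswith(".jar"):
            continue
        with zipfile.ZipFile(os.path.join(lib_dir, name)) as jar:
            for entry in jar.namelist():
                if entry.endswith(".class"):
                    providers[entry].append(name)
    return dict(providers)


def check_classes(lib_dir, required):
    """Return {class_name: jars} for required classes that are missing or duplicated."""
    providers = class_providers(lib_dir)
    problems = {}
    for class_name in required:
        jars = providers.get(class_name.replace(".", "/") + ".class", [])
        if len(jars) != 1:
            problems[class_name] = jars
    return problems
```

Calling check_classes on the local copy with com.esotericsoftware.kryo.Kryo and com.esotericsoftware.minlog.Log should return an empty dict once kryo-2.24.0.jar and minlog-1.2.jar are the only providers. An empty list for a class means it is missing, and two or more jar names mean a conflict like the one described above.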