Running a Python Spark job on Oozie…

Trying to run a PySpark program on Oozie:
First, configure yarn-env.sh to fix the "pyspark library not found" family of errors:
export SPARK_HOME=/usr/share/spark

$ hdfs dfs -copyFromLocal py4j.zip /user/oozie/share/lib/spark
$ hdfs dfs -copyFromLocal pyspark.zip /user/oozie/share/lib/spark
[Problem not solved]

First get the job running with a standalone spark-submit, then deal with running it through Oozie.
A standalone spark-submit with no extra arguments succeeds.
Adding --master yarn-cluster fails, and the 8088 web UI shows this error:

Application application_1486993422162_0016 failed 2 times due to AM Container for appattempt_1486993422162_0016_000002 exited with exitCode: -1000
For more detailed output, check application tracking page: http://bigdata-master:8088/cluster/app/application_1486993422162_0016 Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://bigdata/user/hadoop/.sparkStaging/application_1486993422162_0016/spark1.py
java.io.FileNotFoundException: File does not exist: hdfs://bigdata/user/hadoop/.sparkStaging/application_1486993422162_0016/spark1.py
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.

[Attempt 1] In the .py file, comment out
#    conf = conf.setMaster("local[*]")
so Spark picks the master on its own.
Then run the command again:
spark-submit --master yarn-cluster pythonApp/lib/spark1.py
[Success: 8088 no longer reports an error!]
[Failure: with local[*] removed, a plain spark-submit now dies with: 17/02/15 16:18:11 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.]
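The post never shows spark1.py itself; judging by the "Pi is roughly ..." output further down, it is presumably the stock Monte-Carlo pi estimator, something along these lines (a sketch, not the actual file):

from operator import add
from random import random

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("spark1")
#    conf = conf.setMaster("local[*]")   # commented out, per Attempt 1
sc = SparkContext(conf=conf)

n = 100000

def inside(_):
    # sample a point in the unit square; hit if it lands inside the circle
    x, y = random() * 2 - 1, random() * 2 - 1
    return 1 if x * x + y * y < 1 else 0

count = sc.parallelize(range(n), 2).map(inside).reduce(add)
print("Pi is roughly %f" % (4.0 * count / n))
sc.stop()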



[Attempt 2 (not tried)] On top of Attempt 1:
add the file via the SparkContext sc (note that addFile is a SparkContext method, not a SparkConf one):
sc.addFile("hdfs:<filepath_on_hdfs>/optimize-spark.py")
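For the record, addFile lives on SparkContext, and the shipped file is then located through SparkFiles. A minimal sketch (the HDFS path is hypothetical):

from pyspark import SparkConf, SparkContext, SparkFiles

conf = SparkConf().setAppName("addfile-demo")
sc = SparkContext(conf=conf)

# ship the file to the driver and every executor
sc.addFile("hdfs://bigdata/tmp/optimize-spark.py")  # hypothetical path

# resolve the local copy on whichever node this runs
print(SparkFiles.get("optimize-spark.py"))

sc.stop()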

[Back on Oozie, it still errors out: py4j.zip and pyspark.zip not found]
[Attempt 1] Edit the job.properties:
change master from local[*] to yarn-cluster.
[Neither change fixes the missing-pyspark problem]

[Attempt 2] Add <file> tags to see whether they pull the files in automatically:
py4j.zip  pyspark.zip  spark1.py
<file>${nameNode}/user/oozie/${examplesRoot}/apps/pythonApp/lib/pyspark.zip</file>
<file>${nameNode}/user/oozie/${examplesRoot}/apps/pythonApp/lib/py4j.zip</file>
[Adding them triggers an error]
Error: E0701 : E0701: XML schema error, cvc-complex-type.2.4.a: Invalid content was found starting with element 'file'. One of '{"uri:oozie:spark-action:0.1":spark-opts, "uri:oozie:spark-action:0.1":arg}' is expected.
(Evidently the spark-action 0.1 schema has no place for <file> at that point; only spark-opts and arg are allowed.)
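Since the 0.1 schema does accept <spark-opts>, one workaround is to hand the zips to spark-submit through that element instead. A sketch of the action, assuming the usual spark-action skeleton (for pyspark, <jar> carries the .py path):

<spark xmlns="uri:oozie:spark-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <master>yarn-cluster</master>
    <name>pythonApp</name>
    <jar>${nameNode}/user/oozie/${examplesRoot}/apps/pythonApp/lib/spark1.py</jar>
    <spark-opts>--py-files pyspark.zip,py4j.zip</spark-opts>
</spark>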
[Attempt 3: add them inside the .py]
conf.addFile("hdfs://bigdata/user/oozie/examples/apps/pythonApp/lib/pyspark.zip")
conf.addFile("hdfs://bigdata/user/oozie/examples/apps/pythonApp/lib/py4j.zip")
[Failed. Unsurprising in hindsight: SparkConf has no addFile method, so this cannot work as written]
Whether the py4j.zip and pyspark.zip go into the local lib,
or into <hdfs>/user/oozie/share/lib/spark,
or into the local $OOZIE_HOME/share/lib/,
or get unzipped into lib: nothing works; every variant fails.

[Attempt 4]
Add /usr/share/spark/python/lib to sys.path:
import sys
import random
del sys.path[9]
sys.path.append("/usr/share/spark/python/lib")
[Problem] Submitted directly with spark-submit, this now fails:
$ spark-submit spark1.py
Traceback (most recent call last):
  File "/home/hadoop/oozie/oozie-4.3.0/examples/apps/pythonApp/lib/spark1.py", line 8, in <module>
    from pyspark import SparkConf, SparkContext
  File "/home/hadoop/oozie/oozie-4.3.0/examples/apps/pythonApp/lib/pyspark/__init__.py", line 41, in <module>
    from pyspark.context import SparkContext
  File "/home/hadoop/oozie/oozie-4.3.0/examples/apps/pythonApp/lib/pyspark/context.py", line 21, in <module>
    import shutil
  File "/usr/share/anaconda2/lib/python2.7/shutil.py", line 12, in <module>
    import collections
  File "/usr/share/anaconda2/lib/python2.7/collections.py", line 8, in <module>
    from _collections import deque, defaultdict
ImportError: No module named _collections
[Submitting with --master yarn-cluster, however, succeeds]
[Failed: even so, it still fails when run through Oozie]
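In hindsight, splicing Spark's Python libs in by path would have been less fragile than deleting a sys.path entry by index. A sketch, assuming the stock layout under /usr/share/spark and the py4j-0.9-src.zip bundled with this Spark version:

import os
import sys

spark_home = os.environ.get("SPARK_HOME", "/usr/share/spark")
# prepend Spark's Python sources and the bundled py4j zip
sys.path.insert(0, os.path.join(spark_home, "python"))
sys.path.insert(0, os.path.join(spark_home, "python", "lib", "py4j-0.9-src.zip"))

from pyspark import SparkConf, SparkContext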


[Attempt 5] Run the job by launching spark-submit from a shell. What was actually in the workflow at this point:
 <class>org.lzl.MainClass</class>
<jar>hdfs://bigdata/user/oozie/examples/apps/sparkHello/lib/OozieHelowod.jar</jar>
<arg>hdfs://bigdata/user/oozie/examples/input-data/text/data.txt</arg>
<arg>hdfs://bigdata/user/oozie/examples/output-data/spark/new/</arg>
hdfs://bigdata
[Failed, and there isn't even a log showing how it failed]
/home/hadoop/oozie/oozie-4.3.0/examples/apps/pythonApp/lib/spark1.py
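For comparison, actually launching spark-submit from a shell would mean an Oozie shell action wrapping a submit script. A sketch, where submit.sh is a hypothetical wrapper around the spark-submit command used earlier:

<shell xmlns="uri:oozie:shell-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec>submit.sh</exec>
    <file>${nameNode}/user/oozie/${examplesRoot}/apps/pythonApp/submit.sh#submit.sh</file>
</shell>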

[Attempt 6] Per http://blog.csdn.net/xyf123/article/details/50853578,
following the tip at the end of that post: add SPARK_HOME to the profile and to Spark's config files.
Configure this on every YARN node:
ssh bigdata-6
cd $HADOOP_HOME/etc/hadoop
cp yarn-env.sh yarn-env_backup2017_2_15_1800.sh
echo $SPARK_HOME
export SPARK_HOME=/usr/share/spark



/usr/share/spark
[Back on the master]
Configure the environment variables for /usr/share/oozie/oozie-4.3.0:
export SPARK_HOME=/usr/share/spark
export OOZIE_HOME=/usr/share/oozie/oozie-4.3.0
export CATALINA_HOME=/usr/share/oozie/oozie-4.3.0/oozie-server
export OOZIE_URL=http://bigdata-master:11000
export OOZIE_CONFIG=/usr/share/oozie/oozie-4.3.0/conf
After starting Oozie and submitting the py job, it still fails, still complaining about the missing zip files.



Continuing the work on taurus.
[Attempt 7] Copy the zips into the lib directory of Oozie's web app under Tomcat:
oozie-server/webapps/oozie/WEB-INF/lib
[Still not solved]


And the bizarre thing is that today:
$ spark-submit spark1.py --py-files py4j.zip,pyspark.zip
Pi is roughly 3.328000
$ spark-submit spark1.py --py-files py4j.zip,pyspark.zip --master yarn-cluster
Pi is roughly 3.232000
$ spark-submit spark1.py --master yarn-cluster
Pi is roughly 3.296000
$ spark-submit spark1.py
Pi is roughly 2.848000
$
Run standalone, every variant now goes through.


[Attempt 8] Following https://oozie.apache.org/docs/4.3.0/AG_Install.html#Oozie_Share_Lib,
move the zip files from
/user/oozie/share/lib/spark to
/user/oozie/share/lib/spark/lib.
Runs then fail with file-not-found:
ActionExecutorException: JA008: File does not exist: hdfs://bigdata/user/oozie/share/lib/spark/py4j.zip#py4j.zip
ActionExecutorException: JA008: File does not exist: hdfs://bigdata/user/oozie/share/lib/spark/pyspark.zip#pyspark.zip
[Presumably fallout from the --py-files py4j.zip,pyspark.zip option added earlier; the #py4j.zip suffixes are the distributed-cache aliases for files still expected at the old location]



[Attempt 9]
Don't you dare rename py4j.zip: Oozie's own source (under sharelib/spark/src/main/…) looks these dependencies up with hard-coded patterns!
………………
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SparkMain extends LauncherMain {
    private static final String MASTER_OPTION = "--master";
    private static final String MODE_OPTION = "--deploy-mode";
    private static final String JOB_NAME_OPTION = "--name";
    private static final String CLASS_NAME_OPTION = "--class";
    private static final String VERBOSE_OPTION = "--verbose";
    private static final String EXECUTOR_CLASSPATH = "spark.executor.extraClassPath=";
    private static final String DRIVER_CLASSPATH = "spark.driver.extraClassPath=";
    private static final String DIST_FILES = "spark.yarn.dist.files=";
    private static final String JARS_OPTION = "--jars";
    private static final String PY_FILES = "--py-files";
    private static final Pattern[] PYSPARK_DEP_FILE_PATTERN = {Pattern.compile("py4\\S*src.zip"),
            Pattern.compile("pyspark.zip")};
    private String sparkJars = null;
………………
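Those two patterns explain the behaviour. A quick standalone check (in Python; the regex syntax is the same):

import re

pat = re.compile(r"py4\S*src.zip")
print(bool(pat.match("py4j-0.9-src.zip")))  # True: the stock file name matches
print(bool(pat.match("py4j.zip")))          # False: the renamed file is invisible to Oozie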

And worse: it can't tell you which file is the problem!! Even with pyspark.zip in place, it still insists both files are missing!!!! Utterly idiotic!!!!!!!
Rename the file back to py4j-0.9-src.zip (the only sort of name py4\S*src.zip will ever match).
[OK, the missing py4j.zip and pyspark.zip errors are finally gone]


New problem, though it no longer complains about those two files:
Went into Oozie's spark sharelib on HDFS and deleted the assorted py folders and zips. It turns out the py4j folder (the folder! the folder! important things get said three times!) cannot be deleted; it is still a dependency.
The pyspark folder (the one the zip unpacks into) cannot be deleted either.
[And that's not all]
[The zips under pythonApp/lib cannot be omitted either; they must be present]

[The run returns an exit with 1 error]
It looks like a package conflict. Remembering all the py4j and pyspark zips uploaded everywhere while chasing the missing-package errors, I deleted the lib folder I had created myself under /user/oozie/share/lib/spark on HDFS.
After deleting it, it [errors]: cannot find /user/oozie/share/lib/spark/lib/py4j.zip?!?!
That makes no sense: py4j.zip is a name I invented, not some fixed dependency.
I decided to restart Oozie, figuring the sharelib had changed underneath it and Oozie hadn't noticed.
[The missing py4j.zip and pyspark.zip problem is solved for good; after the restart the dependency errors are gone]
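(In hindsight, Oozie can also be told to re-scan the sharelib without a full restart. Assuming the OOZIE_URL configured above, something like

$ oozie admin -oozie http://bigdata-master:11000/oozie -sharelibupdate

should have the same effect.)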