How-to: resolve the Spark "/usr/bin/python: No module named pyspark" issue
Source: Internet  Editor: 程序博客网  Time: 2024/06/03 17:36
Error:
Error from python worker:
/usr/bin/python: No module named pyspark
PYTHONPATH was:
/home/hadoop/tmp/nm-local-dir/usercache/chenfangfang/filecache/43/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
Root cause:
I am using JDK 1.7.0_45. PySpark on YARN has a known issue that stops pyspark from working when Spark is built with JDK 7: https://issues.apache.org/jira/browse/SPARK-1520. The Spark shipped with CDH 5.4.1 does not have this issue: although CDH 5.4.1 is announced as using JDK 1.7.0_45, its Spark was built with JDK 6.
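As background, my understanding of SPARK-1520 is that a JDK 7 build can write the very large assembly jar in the Zip64 format once it has too many entries, and the Python worker's zip importer then fails to read it. The following is a minimal sketch for checking whether a jar is in that range. The function names and the `ZIP_ENTRY_LIMIT` constant are hypothetical, chosen for illustration; the limit reflects the 16-bit entry count of the classic ZIP format.

```python
import sys
import zipfile

# Assumption: the classic (non-Zip64) ZIP format stores the entry count
# in a 16-bit field, so archives at or above this count need Zip64.
ZIP_ENTRY_LIMIT = 65535


def count_jar_entries(jar_path):
    """Return the number of entries recorded in the jar's central directory."""
    with zipfile.ZipFile(jar_path) as jar:
        return len(jar.infolist())


def likely_needs_zip64(jar_path):
    """Return True if the entry count alone would force the Zip64 format."""
    return count_jar_entries(jar_path) >= ZIP_ENTRY_LIMIT


if __name__ == "__main__" and len(sys.argv) == 2:
    path = sys.argv[1]
    print("entries:", count_jar_entries(path))
    print("likely needs Zip64:", likely_needs_zip64(path))
```

Run it with the path to the assembly jar as the only argument. A large entry count only suggests the jar is affected; it does not prove which JDK wrote it.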
Solution:
Rebuilding Spark with JDK 6 is not practical for us, because the build itself runs into problems. One workable alternative is to regenerate the assembly jar with the JDK 6 jar tool, as follows:
unzip -d foo spark/lib/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar
cd foo
$JAVA6_HOME/bin/jar cvmf META-INF/MANIFEST.MF ../spark/lib/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar .
Don't forget the dot at the end of the last command.
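After repacking, it may be worth confirming that Python can locate the pyspark package inside the jar before resubmitting the job. This is a minimal sketch, assuming a Python 3 interpreter; the function name `pyspark_importable_from` is hypothetical. It asks Python's import machinery to search only the given jar, which mirrors what the worker does with its PYTHONPATH.

```python
import importlib.machinery
import sys


def pyspark_importable_from(jar_path):
    """Return True if this interpreter can find the pyspark package in jar_path."""
    # Search only jar_path, so a pyspark installed elsewhere is not picked up.
    spec = importlib.machinery.PathFinder.find_spec("pyspark", [jar_path])
    return spec is not None


if __name__ == "__main__" and len(sys.argv) == 2:
    jar = sys.argv[1]
    if pyspark_importable_from(jar):
        print("pyspark can be located in", jar)
    else:
        print("pyspark NOT found in", jar)
```

The result only reflects the interpreter that runs the check. Since the workers in the error above use /usr/bin/python, a more direct test on the cluster node is to set PYTHONPATH to the jar and try `import pyspark` with that same interpreter.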