Spark build and packaging

Building and packaging Spark 2.1.1 with Maven
All of the commands below are run as the root user.

At least 4 GB of memory is recommended for the build VM.

Before building, install some compression/decompression tools and their development headers:
yum install -y snappy snappy-devel bzip2 bzip2-devel lzo lzo-devel lzop openssl openssl-devel

1. Install Maven 3.3.9 and Java 8, and configure their environment variables (a sketch of the setup follows).
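
A minimal sketch of that environment setup, assuming the JDK and Maven were unpacked under /usr/local (both paths are hypothetical; adjust them to your layout):
# Hypothetical install paths; point these at your actual JDK and Maven directories
export JAVA_HOME=/usr/local/jdk1.8.0_121
export MAVEN_HOME=/usr/local/apache-maven-3.3.9
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH
# Verify that both toolchains resolve
java -version
mvn -version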

2. Install R
First install the EPEL repository:
yum list epel*
yum install epel-release
Then install R:
yum list R
yum -y install R
Without R installed, ./dev/make-distribution.sh … fails with:
Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.4.0:exec (sparkr-pkg) on project spark-core_2.10: Command execution failed. Process exited with an error: 127 (Exit value: 127) -> [Help 1]
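
Exit code 127 means the shell could not find a command, so it is worth checking that R is actually on the PATH before starting a long build:
# Both should resolve once the yum install above has finished
which R Rscript
R --version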

3. Set the Maven options
JDK 1.7:
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
JDK 1.8 (Java 8 removed the permanent generation, so MaxPermSize is dropped):
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
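
export only affects the current shell. If you build from fresh sessions, one option (an assumption about your setup, since the commands here run as root) is to persist it:
# Optional: make MAVEN_OPTS survive new login shells
echo 'export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"' >> /etc/profile
source /etc/profile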

4. Change into the root of the unpacked Spark source tree:
cd /root/spark-2.1.1

5. Use Scala 2.11 here; building with 2.10 fails. Switch the Scala version before the build:
./dev/change-scala-version.sh 2.11
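
A quick way to confirm the switch took effect is to look at the parent artifact in the root POM; in the Spark 2.1.x sources it carries the Scala suffix (treat the exact artifact name as an assumption and check your own pom.xml):
# Should now show spark-parent_2.11 rather than spark-parent_2.10
grep -m1 "spark-parent" pom.xml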

6. From the same source root, run the build:
./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -Dscala-2.11 -Phive -Phive-thriftserver -DskipTests clean package
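
If the build succeeds, the compiled jars are collected under the assembly module; a quick sanity check, assuming the standard Spark 2.x layout for a Scala 2.11 build:
# Path follows the usual Spark 2.x build output layout
ls assembly/target/scala-2.11/jars | head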

7. Package the distribution:
./dev/make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pyarn
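
The script writes a tarball into the source root, conventionally named spark-<version>-bin-<name>.tgz. A short smoke test of the result might look like this (verify the file name against what was actually produced):
tar -zxf spark-2.1.1-bin-custom-spark.tgz
cd spark-2.1.1-bin-custom-spark
# Run the bundled Pi example locally to confirm the distribution works
./bin/run-example SparkPi 10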