Spark 2.2.0 Source Build Errors


Environment:
OS: Linux (CentOS 7)
Spark: spark-2.2.0.tgz (downloaded from the Apache website)
Scala: scala-2.11.8.tgz
Hadoop: hadoop-2.9.0


Build method: the spark-2.2.0/dev/make-distribution.sh script from the unpacked source tree.

Command: ./dev/make-distribution.sh --name spark-2.2.0-hadoop2.9.0 --tgz -Pyarn -Phadoop-2.9 -Phive -Phive-thriftserver -Dhadoop.version=2.9.0


Conclusion first: for this version of the source, three changes are needed in pom.xml:

1: Addition (the original post marked the added line in red; it is the empty <relativePath></relativePath> element):

  <parent>
    <groupId>org.apache</groupId>
    <artifactId>apache</artifactId>
    <version>14</version>
    <relativePath></relativePath>
  </parent>


2: Addition (search for the <hadoop.version> keyword and add this alongside the existing profiles):

   The entire block below is new:

     <profile>
      <id>hadoop-2.9</id>
      <properties>
        <hadoop.version>2.9.0</hadoop.version>
      </properties>
    </profile>


3: Modification (this is the root cause of the build failure; the original post marked the changed line in red — <useZincServer> is switched from true to false):

    <configuration>
      <scalaVersion>${scala.version}</scalaVersion>
      <recompileMode>incremental</recompileMode>
      <useZincServer>false</useZincServer>
    </configuration>


Notes:

#1:

Without this change, mvn -X shows the following error:

[WARNING] 'parent.relativePath' of POM org.apache.spark:spark-parent_2.11:2.2.0 (/software/spark/spark-2.2.0/pom.xml) points at org.apache.spark:spark-parent_2.11 instead of org.apache:apache, please verify your project structure @ org.apache.spark:spark-parent_2.11:2.2.0, /software/spark/spark-2.2.0/pom.xml, line 22, column 11

Details:

http://maven.apache.org/ref/3.0.3/maven-model/maven.html#class_parent

Set the value to an empty string in case you want to disable the feature and always resolve the parent POM from the repositories.


#2:

The default pom.xml has no profile for Hadoop 2.9.0, the version I am using.

Search pom.xml for hadoop.version first and confirm it contains a profile matching your installed Hadoop version (note: the minor version has to match too); if it does not, add or edit one by hand.
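As a quick check, you can grep the POM for the Hadoop versions it already declares. This is only a sketch: the miniature pom fragment and the /tmp path below are stand-ins; against a real checkout you would grep spark-2.2.0/pom.xml directly.

```shell
# Stand-in pom fragment (in a real build, grep spark-2.2.0/pom.xml instead).
cat > /tmp/pom-sample.xml <<'EOF'
<profile>
  <id>hadoop-2.7</id>
  <properties>
    <hadoop.version>2.7.3</hadoop.version>
  </properties>
</profile>
EOF

# List every declared Hadoop version; if yours (e.g. 2.9.0) is missing,
# add a matching profile by hand as shown above.
grep -n '<hadoop.version>' /tmp/pom-sample.xml
```

If the grep prints nothing for your version, that is exactly the situation fix #2 addresses.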


#3:

Error 3 is the root cause of the build failure; 1 and 2 were simply fixed along the way during the investigation, so 3 gets a detailed explanation below.



The error shown after the failed build looked like this.

It is not quite the same as the errors commonly found on Baidu: those usually occur while compiling Spark Project Core, whereas my build failed right at the start, in Spark Project Tags.


[error] Compile failed at 2017-11-25 23:09:53 [0.531s]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [ 14.880 s]
[INFO] Spark Project Tags ................................. FAILURE [  1.220 s]
[INFO] Spark Project Sketch ............................... SKIPPED
[INFO] Spark Project Networking ........................... SKIPPED
[INFO] Spark Project Shuffle Streaming Service ............ SKIPPED
[INFO] Spark Project Unsafe ............................... SKIPPED
[INFO] Spark Project Launcher ............................. SKIPPED
[INFO] Spark Project Core ................................. SKIPPED
[INFO] Spark Project ML Local Library ..................... SKIPPED
[INFO] Spark Project GraphX ............................... SKIPPED
[INFO] Spark Project Streaming ............................ SKIPPED
[INFO] Spark Project Catalyst ............................. SKIPPED
[INFO] Spark Project SQL .................................. SKIPPED
[INFO] Spark Project ML Library ........................... SKIPPED
[INFO] Spark Project Tools ................................ SKIPPED
[INFO] Spark Project Hive ................................. SKIPPED
[INFO] Spark Project REPL ................................. SKIPPED
[INFO] Spark Project YARN Shuffle Service ................. SKIPPED
[INFO] Spark Project YARN ................................. SKIPPED
[INFO] Spark Project Hive Thrift Server ................... SKIPPED
[INFO] Spark Project Assembly ............................. SKIPPED
[INFO] Spark Project External Flume Sink .................. SKIPPED
[INFO] Spark Project External Flume ....................... SKIPPED
[INFO] Spark Project External Flume Assembly .............. SKIPPED
[INFO] Spark Integration for Kafka 0.8 .................... SKIPPED
[INFO] Kafka 0.10 Source for Structured Streaming ......... SKIPPED
[INFO] Spark Project Examples ............................. SKIPPED
[INFO] Spark Project External Kafka Assembly .............. SKIPPED
[INFO] Spark Integration for Kafka 0.10 ................... SKIPPED
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 17.839 s
[INFO] Finished at: 2017-11-25T23:09:53+08:00
[INFO] Final Memory: 52M/376M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (scala-compile-first) on project spark-tags_2.11: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed. CompileFailed -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :spark-tags_2.11


So I tried the following:

1. Since the build failed right at the start, I suspected the Scala version (I had initially installed scala-2.12.2), so I installed the officially recommended 2.11.8.

Later I found that even without a pre-installed Scala, the build downloads Scala 2.11.8 on its own anyway.


2. Since the error message revealed nothing:

[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (scala-compile-first) on project spark-tags_2.11: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed. CompileFailed -> [Help 1]

I ran the command below to get detailed output:

../build/mvn -X -Pyarn -Phadoop-2.9 -Dhadoop.version=2.9.0 -Phive -Phive-thriftserver -Dscala-2.12.2 -DskipTests clean package

Detailed error output:

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11.089 s
[INFO] Finished at: 2017-11-25T22:24:24+08:00
[INFO] Final Memory: 52M/369M
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "hadoop-2.9" could not be activated because it does not exist.
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (scala-compile-first) on project spark-tags_2.11: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed. CompileFailed -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (scala-compile-first) on project spark-tags_2.11: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed.
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: org.apache.maven.plugin.PluginExecutionException: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed.
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:145)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
... 20 more
Caused by: Compile failed via zinc server
at sbt_inc.SbtIncrementalCompiler.zincCompile(SbtIncrementalCompiler.java:136)
at sbt_inc.SbtIncrementalCompiler.compile(SbtIncrementalCompiler.java:86)
at scala_maven.ScalaCompilerSupport.incrementalCompile(ScalaCompilerSupport.java:303)
at scala_maven.ScalaCompilerSupport.compile(ScalaCompilerSupport.java:119)
at scala_maven.ScalaCompilerSupport.doExecute(ScalaCompilerSupport.java:99)
at scala_maven.ScalaMojoSupport.execute(ScalaMojoSupport.java:482)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
... 21 more
[ERROR] 
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :spark-tags_2.11
[hadoop@hadoop1 spark-2.2.0]$ 


The first "Caused by" was no different from the initial error and still revealed nothing, so I searched on the second one ("Compile failed via zinc server") and found two people who had hit the same problem, one of them on the Apache Spark user mailing list. The reply is here:

(http://mail-archives.apache.org/mod_mbox/spark-user/201605.mbox/%3Ctencent_1CB911BF4B21DB77007D341E@qq.com%3E)



First, that question was asked in 2016, on an older Spark version than mine. So even though Baidu turned up nothing, it at least suggested this is not a hard-to-fix problem or bug, or a newer release would have fixed it already; that gave me some confidence.

The reply on the list ("stop Zinc"), however, still did not say how to actually do that.

Following that lead, I found that the <useZincServer>false</useZincServer> setting in the POM can be used to turn Zinc off.

That solved the problem.

Zinc is a server used to speed up compilation. The Spark website used to have a clearly titled section on Zinc; that content now lives under "Speeding up Compilation" on the "Building Spark" page.
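The POM change can also be applied with a one-liner. This is only a sketch: the stand-in file below mimics the relevant scala-maven-plugin configuration, and in a real checkout you would point the sed at spark-2.2.0/pom.xml (after backing it up). The in-place -i flag assumes GNU sed, which CentOS 7 ships.

```shell
# Stand-in for the scala-maven-plugin <configuration> block in pom.xml.
cat > /tmp/zinc-config.xml <<'EOF'
<configuration>
  <scalaVersion>${scala.version}</scalaVersion>
  <recompileMode>incremental</recompileMode>
  <useZincServer>true</useZincServer>
</configuration>
EOF

# Disable the Zinc server (GNU sed; back up the real pom.xml first).
sed -i 's|<useZincServer>true</useZincServer>|<useZincServer>false</useZincServer>|' /tmp/zinc-config.xml

# The line now reads <useZincServer>false</useZincServer>.
grep '<useZincServer>' /tmp/zinc-config.xml
```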



I also took a detour during the build: I dug the lines below out of the error output and investigated them for ages. Yes... they are clearly tagged [info], and there I was chasing [info] lines before reading the [error] ones. A bit of desperate flailing that cost me time. Learn from my mistake (emmmmm... probably nobody else would do this anyway): check the [error] lines first, and don't burn time here.


[INFO] --- scala-maven-plugin:3.2.2:compile (scala-compile-first) @ spark-tags_2.11 ---
[INFO] Using zinc server for incremental compilation
[warn] Pruning sources from previous analysis, due to incompatible CompileSetup.
[info] Compiling 2 Scala sources and 6 Java sources to /software/spark/spark-2.2.0/common/tags/target/scala-2.11/classes...
[info] Error occurred during initialization of VM
[info] java.lang.Error: Properties init: Could not determine current working directory.
[info] at java.lang.System.initProperties(Native Method)
[info] at java.lang.System.initializeSystemClass(System.java:1166)
[info] 
[error] Compile failed at 2017-11-25 22:41:22 [0.423s]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [  3.540 s]
[INFO] Spark Project Tags ................................. FAILURE [  0.833 s]
[INFO] Spark Project Sketch ............................... SKIPPED


Finally, while investigating I came across a veteran's blog post saying that building Spark really is tricky, and being stuck for a day or even several days is normal — dues you have to pay on the way to becoming an expert. That made me feel less discouraged and gave me a bit of confidence.

It looks simple written up like this, but it cost me a whole evening plus a morning.

So if you are also stuck digging into a build error, don't lose heart!

