Problems encountered with Spark (TODO)


java.io.FileNotFoundException: File does not exist

java.io.FileNotFoundException: File does not exist: hdfs://master:9000/user/hmaster/.sparkStaging/application_1498791665418_0041/spark-assembly-1.4.0-hadoop2.6.0.jar
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.
        ApplicationMaster host: N/A
        ApplicationMaster RPC port: -1
        queue: root.hmaster
        start time: 1499311838058
        final status: FAILED
        tracking URL: http://master:8088/cluster/app/application_1498791665418_0041
        user: hmaster
Exception in thread "main" org.apache.spark.SparkException: Application application_1498791665418_0041 finished with failed status
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:841)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:867)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Causes: so far I have mainly run into these two:
1. The code calls setMaster("local") for local testing; this should be removed when running on the cluster.
2. The JDK version used to compile and package the job differs from the one configured on the cluster.
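For the first cause, one approach is to leave the master unset in code and supply it only at submit time; the class name and jar path below are hypothetical placeholders, not from the original post:

```shell
# Pass the master via spark-submit instead of hard-coding setMaster("local").
# --class and the jar name are illustrative placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  myapp.jar

# For the second cause, compare the JDK used for packaging with the one on
# each cluster node:
java -version
javac -version
```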

In Spark YARN mode, the job keeps running and never reports an error
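When a YARN-mode job just keeps running with nothing useful in the driver console, the real error is often only in the container logs, which can be pulled with the YARN CLI (the application id below is the one from the trace above; substitute your own):

```shell
# Fetch the aggregated container logs for an application.
# Take the real id from the tracking URL or from `yarn application -list`.
yarn logs -applicationId application_1498791665418_0041
```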

In Spark YARN mode, very few executor nodes are brought up
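If YARN starts fewer executors than expected, the executor count and sizing can be requested explicitly at submit time. The numbers below are illustrative, not tuned values, and the class/jar names are placeholders:

```shell
# Ask YARN for a specific number and size of executors explicitly.
spark-submit \
  --master yarn \
  --num-executors 10 \
  --executor-cores 2 \
  --executor-memory 2g \
  --class com.example.MyApp \
  myapp.jar
```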

In Spark standalone mode, no error is reported, but the job hangs at some task in some stage

My current feeling is that standalone mode is not a good fit for some of the more complex computations. I previously tested one dataset whose computation was fairly complex: standalone mode, chasing speed, ran every node at full throttle, which caused CPU contention that kept climbing.
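One way to keep standalone mode from running all nodes at full throttle is to cap what a single application may consume. A sketch of the relevant settings (the values are illustrative, not recommendations), e.g. in conf/spark-defaults.conf:

```
# Cap the total cores one application may take across the cluster.
spark.cores.max        16
# Bound per-executor memory so one job cannot monopolize a worker.
spark.executor.memory  2g
```

The worker side can also be bounded with SPARK_WORKER_CORES in conf/spark-env.sh, so each worker offers fewer cores than the machine physically has.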