Spark问题14之Spark stage retry问题
来源:互联网 发布:ubuntu命令行中文乱码 编辑:程序博客网 时间:2024/06/07 16:18
更多代码请见:https://github.com/xubo245
基因数据处理系列之SparkBWA
1.解释
1.1 简述
当partitions超过节点数量的时候Lost executor的问题,已经提交到SparkBWA中,https://github.com/citiususc/SparkBWA/issues/35
另外发现,tmp里面有临时文件没有删除,而且stage retry
未解决
2.记录
完整报错:
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 0 'bwa'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Algorithm found 1 'mem'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 1 'mem'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 2 '-t'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 3 '2'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename parameter -f found 4 '-f'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 4 '-f'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename found 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 6 '/home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 7 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_1'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 8 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_2'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[0]: bwa.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[1]: mem.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[2]: -t.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[3]: 2.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[4]: /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[5]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_1.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[6]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_2.[M::bwa_idx_load_from_disk] read 0 ALT contigs[M::process] read 10284 sequences (1028400 bp)...[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 0 'bwa'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Algorithm found 1 'mem'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 1 'mem'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 2 '-t'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 3 '2'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename parameter -f found 4 '-f'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 4 '-f'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename found 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-12.sam'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-12.sam'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 6 '/home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 7 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_1'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 8 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_2'[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[0]: bwa.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[1]: mem.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[2]: -t.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[3]: 2.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[4]: /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[5]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_1.[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[6]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_2.[M::bwa_idx_load_from_disk] read 0 ALT contigs[M::process] read 10286 sequences (1028600 bp)...[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 4579, 0, 0)[M::mem_pestat] skip orientation FF as there are not enough pairs[M::mem_pestat] analyzing insert size distribution for orientation FR...[M::mem_pestat] (25, 50, 75) percentile: (191, 198, 205)[M::mem_pestat] low and high boundaries for computing mean and std.dev: (163, 233)[M::mem_pestat] mean and std.dev: (198.15, 10.05)[M::mem_pestat] low and high boundaries for proper pairs: (149, 247)[M::mem_pestat] skip orientation RF as there are not enough pairs[M::mem_pestat] skip orientation RR as there are not enough pairs[M::mem_process_seqs] Processed 10284 reads in 3.555 CPU sec, 1.835 real sec[main] Version: 0.7.15-r1140[main] CMD: bwa mem -t 2 /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_1 /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_2[main] Real time: 2.076 sec; CPU: 16.998 sec[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Return code from BWA 0.17/02/13 19:24:28 ERROR BwaAlignmentBase: getOutputFile:/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam17/02/13 19:24:28 ERROR BwaAlignmentBase: outputSamFileName:/xubo/project/alignment/sparkBWA/output/g38chr1/standaloneT20170213192403L100c100000Nhs20Paired12/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam17/02/13 19:24:29 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocksjava.io.IOException: Failed to connect to Mcnode1/219.219.220.180:44294 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:193) at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156) at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:88) at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140) at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120) at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:97) at org.apache.spark.storage.ShuffleBlockFetcherIterator.sendRequest(ShuffleBlockFetcherIterator.scala:152) at org.apache.spark.storage.ShuffleBlockFetcherIterator.initialize(ShuffleBlockFetcherIterator.scala:265) at org.apache.spark.storage.ShuffleBlockFetcherIterator.<init>(ShuffleBlockFetcherIterator.scala:112) at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:43) at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300) at org.apache.spark.rdd.RDD.iterator(RDD.scala:264) at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:96) at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:95) at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:29) at com.github.sparkbwa.BwaPairedAlignment.call(BwaPairedAlignment.java:94) at com.github.sparkbwa.BwaPairedAlignment.call(BwaPairedAlignment.java:33) at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction2$1.apply(JavaPairRDD.scala:1024) at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102) at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:727) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:727) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300) at org.apache.spark.rdd.RDD.iterator(RDD.scala:264) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) at org.apache.spark.scheduler.Task.run(Task.scala:88) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745)Caused by: java.net.ConnectException: Connection refused: Mcnode1/219.219.220.180:44294 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224) at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) ... 1 more[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 4601, 0, 0)[M::mem_pestat] skip orientation FF as there are not enough pairs[M::mem_pestat] analyzing insert size distribution for orientation FR...[M::mem_pestat] (25, 50, 75) percentile: (192, 199, 205)[M::mem_pestat] low and high boundaries for computing mean and std.dev: (166, 231)[M::mem_pestat] mean and std.dev: (198.69, 9.94)[M::mem_pestat] low and high boundaries for proper pairs: (153, 244)[M::mem_pestat] skip orientation RF as there are not enough pairs[M::mem_pestat] skip orientation RR as there are not enough pairs[M::mem_process_seqs] Processed 10286 reads in 6.262 CPU sec, 4.261 real sec[main] Version: 0.7.15-r1140[main] CMD: bwa mem -t 2 /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_1 /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_2[main] Real time: 4.478 sec; CPU: 20.226 sec*** Error in `/usr/lib/jvm/jdk1.7.0/bin/java': munmap_chunk(): invalid pointer: 0x00007fc3702d3710 ***
参考
【1】https://github.com/xubo245 【2】http://blog.csdn.net/xubo245/
0 0
- Spark问题14之Spark stage retry问题
- Spark job, stage, task, partition相关问题
- Spark DAG之划分Stage
- spark问题
- spark问题
- Spark问题
- spark 问题
- spark 问题
- Spark源码分析之三:Stage划分
- Spark源码分析之四:Stage提交
- Spark Stage 的划分
- spark如何划分stage
- Spark Stage的划分
- Spark源码--Stage
- spark定制之七:解决spark 1.0.1 import问题
- 【Spark】Stage生成和Stage源码浅析
- spark学习5之sbt问题
- Spark问题4之Excutor lost
- ExtJS 布局组件
- jQuery节点操作方法
- Android OkHttp完全解析
- cvc-complex-type.2.4.a: Invalid content was found starting with element
- 在Android Studio 中导入opencv2.4.9
- Spark问题14之Spark stage retry问题
- Hibernate异常--->关于Hibernate 映射当中的重复映射问题
- 【Codeforces Round #403】Codeforces 781C Underground Lab
- 基于Spark之上的基础环境设置
- Unity iOS打开AppStore评星页面,浅谈Application.OpenURL()方法。
- Longest Common Prefix问题及解法
- Java多线程之内存可见性
- Pycharm3的注册码
- POJ 2719 Faulty Odometer G++