基因数据处理82之cs-bwamem处理SRR003161(参考基因组为GRCH38chr1)

来源:互联网 发布:网络管理员好考吗 知乎 编辑:程序博客网 时间:2024/06/06 06:52

core用不少了,只用了4个,实际可以14个。

1.由于GRCH过大,及其内存小,运行不了全基因组匹配

2.上传:

spark-submit  --class cs.ucla.edu.bwaspark.BWAMEMSpark --master spark://219.219.220.149:7077  /home/hadoop/xubo/tools/cloud-scale-bwamem-0.2.1/target/cloud-scale-bwamem-0.2.0-assembly.jar upload-fastq   0 5 /xubo/alignment/data/SRR003161.fastq /xubo/data/alignment/data/SRR003161Upload.fastqhadoop@Master:~/xubo/project/alignment$ spark-submit  --class cs.ucla.edu.bwaspark.BWAMEMSpark --master spark://219.219.220.149:7077  /home/hadoop/xubo/tools/cloud-scale-bwamem-0.2.1/target/cloud-scale-bwamem-0.2.0-assembly.jar upload-fastq   0 5 /xubo/alignment/data/SRR003161.fastq /xubo/data/alignment/data/SRR003161Upload.fastqcommand: upload-fastqMap('isPairEnd -> 0, 'filePartNum -> 5, 'inFilePath1 -> /xubo/alignment/data/SRR003161.fastq, 'outFilePath -> /xubo/data/alignment/data/SRR003161Upload.fastq)Upload FASTQ command line arguments: 0 5 /xubo/alignment/data/SRR003161.fastq  /xubo/data/alignment/data/SRR003161Upload.fastq 250000[WARNING] Avro: Invalid default for field comment: null not a "bytes"SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".                SLF4J: Defaulting to no-operation (NOP) logger implementationSLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.Upload FASTQ to HDFS Finished!!!

3.cs-bwamem比对:

hadoop@Master:~/xubo/project/alignment$ spark-submit --executor-memory 4g --class cs.ucla.edu.bwaspark.BWAMEMSpark --total-executor-cores 20 --master spark://219.219.220.149:7077  --conf spark.driver.host=219.219.220.149 --conf spark.driver.cores=4 --conf spark.driver.maxResultSize=4g --conf spark.storage.memoryFraction=0.7  --conf spark.akka.threads=2 --conf spark.akka.frameSize=1024 /home/hadoop/xubo/tools/cloud-scale-bwamem-0.2.1/target/cloud-scale-bwamem-0.2.0-assembly.jar cs-bwamem -bfn 1 -bPSW 1 -sbatch 10 -bPSWJNI 1  -oChoice 2 -oPath /xubo/data/alignment/output/SRR003161.adam -localRef 1  -isSWExtBatched 1  0 /home/hadoop/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem/GRCH38BWAindex/GRCH38chr1L3556522.fasta  /xubo/data/alignment/data/SRR003161Upload.fastqcommand: cs-bwamemMap('isPSWJNI -> 1, 'localRef -> 1, 'batchedFolderNum -> 1, 'isPSWBatched -> 1, 'subBatchSize -> 10, 'inFASTQPath -> /xubo/data/alignment/data/SRR003161Upload.fastq, 'inFASTAPath -> /home/hadoop/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem/GRCH38BWAindex/GRCH38chr1L3556522.fasta, 'outputPath -> /xubo/data/alignment/output/SRR003161.adam, 'isSWExtBatched -> 1, 'isPairEnd -> 0, 'outputChoice -> 2)CS- BWAMEM command line arguments: false /home/hadoop/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem/GRCH38BWAindex/GRCH38chr1L3556522.fasta /xubo/data/alignment/data/SRR003161Upload.fastq 1 true 10 true ./target/jniNative.so 2 /xubo/data/alignment/output/SRR003161.adamHDFS master: hdfs://Master:9000Input HDFS folder number: 23Head line: @RG  ID:foo  SM:barRead Group ID: fooLoad Index FilesLoad BWA-MEM optionsOutput choice: 2SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".                SLF4J: Defaulting to no-operation (NOP) logger implementationSLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.CS-BWAMEM Finished!!!  Jun 9, 2016 1:44:32 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 516/06/09 14:37:41 WARN QueuedThreadPool: 6 threads could not be stopped16/06/09 14:37:48 WARN QueuedThreadPool: 1 threads could not be stopped

4.移动数据:

hadoop@Master:~/xubo/project/alignment$ hadoop fs -mv /xubo/data/alignment/data/SRR003161Upload.fastq /xubo/alignment/data/SRR003161Upload.fastqhadoop@Master:~/xubo/project/alignment$ hadoop fs -mv /xubo/data/alignment/output/SRR003161.adam /xubo/alignment/output/SRR003161.adam

5.merge:

hadoop@Master:~/xubo/project/alignment$ spark-submit --executor-memory 4g --class cs.ucla.edu.bwaspark.BWAMEMSpark --total-executor-cores 4 --master spark://219.219.220.149:7077  --conf spark.driver.host=219.219.220.149 --conf spark.driver.cores=4 --conf spark.driver.maxResultSize=6g --conf spark.storage.memoryFraction=0.7  --conf spark.akka.threads=2 --conf spark.akka.frameSize=1024 /home/hadoop/xubo/tools/cloud-scale-bwamem-0.2.1/target/cloud-scale-bwamem-0.2.0-assembly.jar merge hdfs://219.219.220.149:9000 /xubo/alignment/output/SRR003161.adam /xubo/alignment/output/SRR003161.merge.adamcommand: mergeTotal number of new file partitions18SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".                SLF4J: Defaulting to no-operation (NOP) logger implementationSLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.Jun 9, 2016 3:09:34 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:10:08 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:10:17 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:10:24 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:10:32 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:10:39 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:10:47 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:10:55 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:11:03 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:11:10 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:11:18 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:11:25 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:11:32 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:11:40 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:11:49 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:11:55 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:11:56 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:12:03 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:12:11 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:12:18 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:12:26 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:12:34 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5Jun 9, 2016 3:12:41 PM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 5

6.数据读取统计:
集群运行: 大数据集:

hadoop@Master:~/xubo/project/alignment/CountAlignment$ cat load.sh     #!/usr/bin/env bash      spark-submit   \--class  org.gcdss.cli.alignment.CountAlignment \--master spark://219.219.220.149:7077 \--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \--conf spark.kryo.registrator=org.bdgenomics.adam.serialization.ADAMKryoRegistrator \--jars /home/hadoop/cloud/adam/lib/adam-apis_2.10-0.18.3-SNAPSHOT.jar,/home/hadoop/cloud/adam/lib/adam-cli_2.10-0.18.3-SNAPSHOT.jar,/home/hadoop/cloud/adam/lib/adam-core_2.10-0.18.3-SNAPSHOT.jar,/home/hadoop/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem/BWAMEMSparkAll/gcdss-cli-0.0.3-SNAPSHOT.jar \--executor-memory 4096M \--total-executor-cores 20 BWAMEMSparkAll.jar \/xubo/alignment/output/SRR003161.merge.adam

运行结果:

hadoop@Master:~/xubo/project/alignment/CountAlignment$ ./load.sh start main:+--------------------+---------+---------+----+----------------+--------------------+--------------------+----------------+---------------------+-------------------+----------+----------+----------+----------+-----------+------------+-------------------------+-------------+------------------+------------------+----------------+------------------+----------------------+--------------------+--------+--------------------+---------------+---------------------------+----------------------+-----------------------+--------------------+----------------------+------------------+------------------------------------+-------------------+-----------------------+-----------------+------------------+----------------+----------+|              contig|    start|      end|mapq|        readName|            sequence|                qual|           cigar|basesTrimmedFromStart|basesTrimmedFromEnd|readPaired|properPair|readMapped|mateMapped|firstOfPair|secondOfPair|failedVendorQualityChecks|duplicateRead|readNegativeStrand|mateNegativeStrand|primaryAlignment|secondaryAlignment|supplementaryAlignment|mismatchingPositions|origQual|          attributes|recordGroupName|recordGroupSequencingCenter|recordGroupDescription|recordGroupRunDateEpoch|recordGroupFlowOrder|recordGroupKeySequence|recordGroupLibrary|recordGroupPredictedMedianInsertSize|recordGroupPlatform|recordGroupPlatformUnit|recordGroupSample|mateAlignmentStart|mateAlignmentEnd|mateContig|+--------------------+---------+---------+----+----------------+--------------------+--------------------+----------------+---------------------+-------------------+----------+----------+----------+----------+-----------+------------+-------------------------+-------------+------------------+------------------+----------------+------------------+----------------------+--------------------+--------+--------------------+---------------+---------------------------+----------------------+-----------------------+--------------------+----------------------+------------------+------------------------------------+-------------------+-----------------------+-----------------+------------------+----------------+----------+|[chr1,248956422,n...|230167209|230167288|   3|SRR003161.900001|TCAGGAAGGCTTTGGGT...|DDDDDDDDDDDDDDDDD...|     263S79M161S|                    0|                  0|     false|     false|      true|     false|      false|       false|                    false|        false|             false|             false|            true|             false|                 false|       5G41T12C0A7G9|    null|NM:i:5 AS:i:54 XS...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|231374587|231374657|   0|SRR003161.900001|GCTCACTGCAGCCTCAA...|<=====<<<<====;66...|     269H70M164H|                  269|                164|     false|     false|      true|     false|      false|       false|                    false|        false|             false|             false|           false|              true|                 false|        9G7C24A12A14|    null|NM:i:4 AS:i:50 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...| 30946793| 30946858|   0|SRR003161.900001|AGCTACTCAGGAGGCTG...|====<:66666::;:::...|     171H65M267H|                  171|                267|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|           8T11T5G38|    null|NM:i:3 AS:i:50 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|245252150|245252217|   0|SRR003161.900001|CTGTAGTCCTAGCTACT...|>>><:::::=====<:6...|     161H67M275H|                  161|                275|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|         9C7A0G11T36|    null|NM:i:4 AS:i:47 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|224087945|224088001|   0|SRR003161.900001|CCTGTAGTCCTAGCTAC...|>>>><:::::=====<:...|     160H56M287H|                  160|                287|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|            10C20T24|    null|NM:i:2 AS:i:46 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|155567224|155567270|   0|SRR003161.900001|GTGATTCTCCCGCCTCA...|./4555:7.11:77;::...|     300H46M157H|                  300|                157|     false|     false|      true|     false|      false|       false|                    false|        false|             false|             false|           false|              true|                 false|                  46|    null|NM:i:0 AS:i:46 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...| 83991503| 83991589|   0|SRR003161.900001|GGCACGGTGGTACACCT...|::::>>>>>>>>>>>>>...|     146H86M271H|                  146|                271|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|24C11G8T5G3G1T11A...|    null|NM:i:8 AS:i:46 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|235432538|235432598|   0|SRR003161.900001|GGAGGCTGAGGCGGGAG...|66::;:::;77:11.7:...|     180H60M263H|                  180|                263|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|           11T5G5T36|    null|NM:i:3 AS:i:45 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|200393343|200393386|   0|SRR003161.900001|GATTCTCCCGCCTCAGC...|4555:7.11:77;:::;...|     302H43M158H|                  302|                158|     false|     false|      true|     false|      false|       false|                    false|        false|             false|             false|           false|              true|                 false|                  43|    null|NM:i:0 AS:i:43 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|244608406|244608484|   0|SRR003161.900001|ACACCTGTAGTCCTAGC...|>>>>>>><:::::====...|     157H78M268H|                  157|                268|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|  40G2T0G1T12G1C0A15|    null|NM:i:7 AS:i:43 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...| 99915639| 99915700|   0|SRR003161.900001|GTACACCTGTAGTCCTA...|>>>>>>>>><:::::==...|     155H61M287H|                  155|                287|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|         32A4A4T5T12|    null|NM:i:4 AS:i:41 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...| 26229317| 26229393|   0|SRR003161.900001|ACCTGTAGTCCTAGCTA...|>>>>><:::::=====<...|     159H76M268H|                  159|                268|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|  9T1C8G11T24G1C0A15|    null|NM:i:7 AS:i:41 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|207776701|207776752|   0|SRR003161.900001|CACTGCAGCCTCAAACT...|===<<<<====;666:;...|     272H51M180H|                  272|                180|     false|     false|      true|     false|      false|       false|                    false|        false|             false|             false|           false|              true|                 false|            14C24A11|    null|NM:i:2 AS:i:41 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...| 64969003| 64969063|   0|SRR003161.900001|GGAGGCTGAGGCGGGAG...|66::;:::;77:11.7:...|     180H60M263H|                  180|                263|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|          9T2T4G5T36|    null|NM:i:4 AS:i:40 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|215011363|215011437|   0|SRR003161.900001|GCTCACTGCAGCCTCAA...|<=====<<<<====;66...|     269H74M160H|                  269|                160|     false|     false|      true|     false|      false|       false|                    false|        false|             false|             false|           false|              true|                 false| 10T4G1C24A0T6T12G10|    null|NM:i:7 AS:i:39 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|202096162|202096200|   0|SRR003161.900001|TAGCTCACTGCAGCCTC...|6:<=====<<<<====;...|     267H38M198H|                  267|                198|     false|     false|      true|     false|      false|       false|                    false|        false|             false|             false|           false|              true|                 false|                  38|    null|NM:i:0 AS:i:38 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|201518889|201518942|   0|SRR003161.900001|GCCTCAGCCTCCTGAGT...|:77;:::;::66666:<...|311H36M2D15M141H|                  311|                141|     false|     false|      true|     false|      false|       false|                    false|        false|             false|             false|           false|              true|                 false|            36^TG5A9|    null|NM:i:3 AS:i:38 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|160381379|160381422|   0|SRR003161.900001|TATCCTAGCTCACTGCA...|:<<666:<=====<<<<...|     262H43M198H|                  262|                198|     false|     false|      true|     false|      false|       false|                    false|        false|             false|             false|           false|              true|                 false|                37A5|    null|NM:i:1 AS:i:38 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...|202041920|202041962|   0|SRR003161.900001|CCTGTAGTCCTAGCTAC...|>>>><:::::=====<:...|     160H42M301H|                  160|                301|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|                6A35|    null|NM:i:1 AS:i:37 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null||[chr1,248956422,n...| 42275579| 42275615|   0|SRR003161.900001|TGAGCCCAGGAGTTTGA...|/455...47;;:666;=...|     204H36M263H|                  204|                263|     false|     false|      true|     false|      false|       false|                    false|        false|              true|             false|           false|              true|                 false|                  36|    null|NM:i:0 AS:i:36 RG...|            foo|                       null|                  null|                   null|                null|                  null|              null|                                null|               null|                   null|              bar|              null|            null|      null|+--------------------+---------+---------+----+----------------+--------------------+--------------------+----------------+---------------------+-------------------+----------+----------+----------+----------+-----------+------------+-------------------------+-------------+------------------+------------------+----------------+------------------+----------------------+--------------------+--------+--------------------+---------------+---------------------------+----------------------+-----------------------+--------------------+----------------------+------------------+------------------------------------+-------------------+-----------------------+-----------------+------------------+----------------+----------+only showing top 20 rows

24808265

附录:

问题:

hadoop@Master:~/xubo/project/alignment$ spark-submit --executor-memory 4g --class cs.ucla.edu.bwaspark.BWAMEMSpark --total-executor-cores 20 --master spark://219.219.220.149:7077  --conf spark.driver.host=219.219.220.149 --conf spark.driver.cores=4 --conf spark.driver.maxResultSize=4g --conf spark.storage.memoryFraction=0.7  --conf spark.akka.threads=2 --conf spark.akka.frameSize=1024 /home/hadoop/xubo/tools/cloud-scale-bwamem-0.2.1/target/cloud-scale-bwamem-0.2.0-assembly.jar cs-bwamem -bfn 1 -bPSW 1 -sbatch 10 -bPSWJNI 1  -oChoice 2 -oPath /xubo/data/alignment/output/SRR003161.adam -localRef 1  -isSWExtBatched 1  0 /xubo/ref/GRCH38Index/GCA_000001405.15_GRCh38_full_analysis_set.fna  /xubo/alignment/data/SRR003161Upload.fastqcommand: cs-bwamemMap('isPSWJNI -> 1, 'localRef -> 1, 'batchedFolderNum -> 1, 'isPSWBatched -> 1, 'subBatchSize -> 10, 'inFASTQPath -> /xubo/alignment/data/SRR003161Upload.fastq, 'inFASTAPath -> /xubo/ref/GRCH38Index/GCA_000001405.15_GRCh38_full_analysis_set.fna, 'outputPath -> /xubo/data/alignment/output/SRR003161.adam, 'isSWExtBatched -> 1, 'isPairEnd -> 0, 'outputChoice -> 2)CS- BWAMEM command line arguments: false /xubo/ref/GRCH38Index/GCA_000001405.15_GRCh38_full_analysis_set.fna /xubo/alignment/data/SRR003161Upload.fastq 1 true 10 true ./target/jniNative.so 2 /xubo/data/alignment/output/SRR003161.adamException in thread "main" java.io.FileNotFoundException: File /xubo/alignment/data/SRR003161Upload.fastq does not exist.    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:697)    at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)    at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)    at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)    at cs.ucla.edu.bwaspark.FastMap$.memMain(FastMap.scala:103)    at cs.ucla.edu.bwaspark.BWAMEMSpark$.main(BWAMEMSpark.scala:301)    at cs.ucla.edu.bwaspark.BWAMEMSpark.main(BWAMEMSpark.scala)    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)    at java.lang.reflect.Method.invoke(Method.java:606)    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) hadoop@Master:~/xubo/project/alignment$ spark-submit --executor-memory 4g --class cs.ucla.edu.bwaspark.BWAMEMSpark --total-executor-cores 20 --master spark://219.219.220.149:7077  --conf spark.driver.host=219.219.220.149 --conf spark.driver.cores=4 --conf spark.driver.maxResultSize=4g --conf spark.storage.memoryFraction=0.7  --conf spark.akka.threads=2 --conf spark.akka.frameSize=1024 /home/hadoop/xubo/tools/cloud-scale-bwamem-0.2.1/target/cloud-scale-bwamem-0.2.0-assembly.jar cs-bwamem -bfn 1 -bPSW 1 -sbatch 10 -bPSWJNI 1  -oChoice 2 -oPath /xubo/data/alignment/output/SRR003161.adam -localRef 1  -isSWExtBatched 1  0 /xubo/ref/GRCH38Index/GCA_000001405.15_GRCh38_full_analysis_set.fna  /xubo/data/alignment/data/SRR003161Upload.fastqcommand: cs-bwamemMap('isPSWJNI -> 1, 'localRef -> 1, 'batchedFolderNum -> 1, 'isPSWBatched -> 1, 'subBatchSize -> 10, 'inFASTQPath -> /xubo/data/alignment/data/SRR003161Upload.fastq, 'inFASTAPath -> /xubo/ref/GRCH38Index/GCA_000001405.15_GRCh38_full_analysis_set.fna, 'outputPath -> /xubo/data/alignment/output/SRR003161.adam, 'isSWExtBatched -> 1, 'isPairEnd -> 0, 'outputChoice -> 2)CS- BWAMEM command line arguments: false /xubo/ref/GRCH38Index/GCA_000001405.15_GRCh38_full_analysis_set.fna /xubo/data/alignment/data/SRR003161Upload.fastq 1 true 10 true ./target/jniNative.so 2 /xubo/data/alignment/output/SRR003161.adamHDFS master: hdfs://Master:9000Input HDFS folder number: 23Head line: @RG  ID:foo  SM:barRead Group ID: fooLoad Index FilesException in thread "main" java.lang.OutOfMemoryError: Java heap space    at cs.ucla.edu.bwaspark.datatype.BinaryFileReadUtil$.readIntArray(BinaryFileReadUtil.scala:151)    at cs.ucla.edu.bwaspark.datatype.BWTType.BWTLoad(BWTType.scala:147)    at cs.ucla.edu.bwaspark.datatype.BWTType.load(BWTType.scala:54)    at cs.ucla.edu.bwaspark.datatype.BWAIdxType.load(BWAIdxType.scala:58)    at cs.ucla.edu.bwaspark.FastMap$.memMain(FastMap.scala:119)    at cs.ucla.edu.bwaspark.BWAMEMSpark$.main(BWAMEMSpark.scala:301)    at cs.ucla.edu.bwaspark.BWAMEMSpark.main(BWAMEMSpark.scala)    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)    at java.lang.reflect.Method.invoke(Method.java:606)    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)hadoop@Master:~/xubo/project/alignment$ spark-submit --executor-memory 4g --class cs.ucla.edu.bwaspark.BWAMEMSpark --total-executor-cores 20 --master spark://219.219.220.149:7077  --conf spark.driver.host=219.219.220.149 --conf spark.driver.cores=4 --conf spark.driver.maxResultSize=4g --conf spark.storage.memoryFraction=0.7  --conf spark.akka.threads=2 --conf spark.akka.frameSize=1024 /home/hadoop/xubo/tools/cloud-scale-bwamem-0.2.1/target/cloud-scale-bwamem-0.2.0-assembly.jar cs-bwamem -bfn 1 -bPSW 1 -sbatch 10 -bPSWJNI 1  -oChoice 2 -oPath /xubo/data/alignment/output/SRR003161.adam -localRef 1  -isSWExtBatched 1  0 /xubo/ref/GRCH38Index/GCA_000001405.15_GRCh38_full_analysis_set.fna  /xubo/data/alignment/data/SRR003161Upload.fastqcommand: cs-bwamemMap('isPSWJNI -> 1, 'localRef -> 1, 'batchedFolderNum -> 1, 'isPSWBatched -> 1, 'subBatchSize -> 10, 'inFASTQPath -> /xubo/data/alignment/data/SRR003161Upload.fastq, 'inFASTAPath -> /xubo/ref/GRCH38Index/GCA_000001405.15_GRCh38_full_analysis_set.fna, 'outputPath -> /xubo/data/alignment/output/SRR003161.adam, 'isSWExtBatched -> 1, 'isPairEnd -> 0, 'outputChoice -> 2)CS- BWAMEM command line arguments: false /xubo/ref/GRCH38Index/GCA_000001405.15_GRCh38_full_analysis_set.fna /xubo/data/alignment/data/SRR003161Upload.fastq 1 true 10 true ./target/jniNative.so 2 /xubo/data/alignment/output/SRR003161.adamHDFS master: hdfs://Master:9000Input HDFS folder number: 23Head line: @RG  ID:foo  SM:barRead Group ID: fooLoad Index FilesException in thread "main" java.lang.OutOfMemoryError: Java heap space    at cs.ucla.edu.bwaspark.datatype.BinaryFileReadUtil$.readIntArray(BinaryFileReadUtil.scala:151)    at cs.ucla.edu.bwaspark.datatype.BWTType.BWTLoad(BWTType.scala:147)    at cs.ucla.edu.bwaspark.datatype.BWTType.load(BWTType.scala:54)    at cs.ucla.edu.bwaspark.datatype.BWAIdxType.load(BWAIdxType.scala:58)    at cs.ucla.edu.bwaspark.FastMap$.memMain(FastMap.scala:119)    at cs.ucla.edu.bwaspark.BWAMEMSpark$.main(BWAMEMSpark.scala:301)    at cs.ucla.edu.bwaspark.BWAMEMSpark.main(BWAMEMSpark.scala)    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)    at java.lang.reflect.Method.invoke(Method.java:606)    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)

参考

【1】https://github.com/xubo245/AdamLearning【2】https://github.com/bigdatagenomics/adam/ 【3】https://github.com/xubo245/SparkLearning【4】http://spark.apache.org【5】http://stackoverflow.com/questions/28166667/how-to-pass-d-parameter-or-environment-variable-to-spark-job  【6】http://stackoverflow.com/questions/28840438/how-to-override-sparks-log4j-properties-per-driver

研究成果:

【1】 [BIBM] Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Chao Wang, and Xuehai Zhou, "Distributed Gene Clinical Decision Support System Based on Cloud Computing", in IEEE International Conference on Bioinformatics and Biomedicine. (BIBM 2017, CCF B)【2】 [IEEE CLOUD] Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Xuehai Zhou. Efficient Distributed Smith-Waterman Algorithm Based on Apache Spark (CLOUD 2017, CCF-C).【3】 [CCGrid] Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Jinhong Zhou, Xuehai Zhou. DSA: Scalable Distributed Sequence Alignment System Using SIMD Instructions. (CCGrid 2017, CCF-C).【4】more: https://github.com/xubo245/Publications

Help

If you have any questions or suggestions, please write it in the issue of this project or send an e-mail to me: xubo245@mail.ustc.edu.cnWechat: xu601450868QQ: 601450868
阅读全文
'); })();
0 0
原创粉丝点击
热门IT博客
热门问题 老师的惩罚 人脸识别 我在镇武司摸鱼那些年 重生之率土为王 我在大康的咸鱼生活 盘龙之生命进化 天生仙种 凡人之先天五行 春回大明朝 姑娘不必设防,我是瞎子 火麻仁茶 火麻仁的功效 火麻仁油哪里有卖 火麻仁作用 火麻仁是什么 火麻仁怎么吃 火麻仁用量 火麻仁是什么东西 火麻仁的图片 火麻仁的功效与作用 火麻仁的作用与功效 火麻仁油多少钱一斤 炒火麻仁的功效与作用 麻仁和火麻仁一样吗 火麻仁油的功效与作用 中药火麻仁的功效与作用 火麻仁的功效与作用及禁忌 火麻仁的功效与作用吃法 火麻仁和麻仁的区别 火麻茶 火麻油价格 火麻种子 火麻油厂家 火麻仔 火麻功效 麻仁丸价格 火麻怎么吃 火麻是什么 火麻图片 火麻是什么东西 麻仁丸图片 什么人不宜服用火麻仁 火麻油是麻油吗 火麻油的价格 火麻油能减肥吗 火麻油使用方法 火麻油的功效与作用 火麻油的功效 火麻油超市有卖的吗 火麻油图片 什么是火麻油