Gene Data Processing 63: SNAP on reads longer than 400 bp after modifying the default setting
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 03:02
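After changing 400 to 4000 in Read.h, SNAP runs on reads longer than 400 bp, but the mapping rate is very low for the 1000 bp reads (about 8%, compared with about 90% for the 500 bp reads). bwamem does well on the same data; that is recorded in the next post.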
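Before rebuilding with a larger limit, it can help to check how many reads in an input file actually exceed it. The following is a minimal sketch, assuming a plain 4-line-per-record FASTQ; the helper name `count_long_reads` and the synthetic reads are hypothetical and not part of SNAP.

```python
import io

def count_long_reads(fastq_handle, max_len):
    """Return (total_reads, reads_longer_than_max_len).

    Assumes a simple FASTQ layout with exactly four lines per record.
    """
    total = 0
    too_long = 0
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each record holds the sequence
            total += 1
            if len(line.strip()) > max_len:
                too_long += 1
    return total, too_long

# Synthetic data for illustration only: one 500 bp read and one 100 bp read.
records = [("r1", 500), ("r2", 100)]
text = "".join(f"@{name}\n{'A' * n}\n+\n{'I' * n}\n" for name, n in records)

# 400 is the default limit mentioned in the post, 4000 the modified value.
for limit in (400, 4000):
    total, too_long = count_long_reads(io.StringIO(text), limit)
    print(f"limit {limit}: {too_long} of {total} reads are longer")
```

For a real file, the same function could be called on `open("g38l500N10000.fq")` instead of the in-memory example.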
xubo@xubo:~/xubo/data/alignment/cs-bwamem$ snap-aligner single snapindex/ g38l500N10000.fq -o g38l500N10000.snap1.sam
Welcome to SNAP version 1.0beta.23.
Loading index from directory... 32s. 248957422 bases, seed size 20
Aligning.
Total Reads    Aligned, MAPQ >= 10    Aligned, MAPQ < 10    Unaligned           Too Short/Too Many Ns    Reads/s    Time in Aligner (s)
10,000         8,933 (89.33%)         98 (0.98%)            969 (9.69%)         0 (0.00%)                4,068      2

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ snap-aligner single snapindex/ g38l1000N10000.fq -o g38l1000N10000.snap1.sam
Welcome to SNAP version 1.0beta.23.
Loading index from directory... 33s. 248957422 bases, seed size 20
Aligning.
Total Reads    Aligned, MAPQ >= 10    Aligned, MAPQ < 10    Unaligned           Too Short/Too Many Ns    Reads/s    Time in Aligner (s)
10,000         796 (7.96%)            8 (0.08%)             9,196 (91.96%)      0 (0.00%)                2,608      4

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ snap-aligner snapindex/ g38l500N100000.fq -o g38l500N100000.snap.sam
Welcome to SNAP version 1.0beta.23.
Invalid command: snapindex/
Usage: snap-aligner <command> [<options>]
Commands:
   index    build a genome index
   single   align single-end reads
   paired   align paired-end reads
   daemon   run in daemon mode--accept commands remotely
Type a command without arguments to see its help.

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ snap-aligner single snapindex/ g38l500N100000.fq -o g38l500N100000.snap.sam
Welcome to SNAP version 1.0beta.23.
Loading index from directory... 34s. 248957422 bases, seed size 20
Aligning.
Total Reads    Aligned, MAPQ >= 10    Aligned, MAPQ < 10    Unaligned           Too Short/Too Many Ns    Reads/s    Time in Aligner (s)
100,000        88,891 (88.89%)        1,083 (1.08%)         10,026 (10.03%)     0 (0.00%)                4,200      24

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ snap-aligner single snapindex/ g38l1000N100000.fq -o g38l1000N100000.snap1.sam
Welcome to SNAP version 1.0beta.23.
Loading index from directory... 33s. 248957422 bases, seed size 20
Aligning.
Total Reads    Aligned, MAPQ >= 10    Aligned, MAPQ < 10    Unaligned           Too Short/Too Many Ns    Reads/s    Time in Aligner (s)
100,000        7,786 (7.79%)          67 (0.07%)            92,145 (92.14%)     2 (0.00%)                2,390      42

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ snap-aligner single snapindex/ g38l1000N1000000.fq -o g38l1000N1000000.snap1.sam
Welcome to SNAP version 1.0beta.23.
Loading index from directory... 32s. 248957422 bases, seed size 20
Aligning.
Total Reads    Aligned, MAPQ >= 10    Aligned, MAPQ < 10    Unaligned           Too Short/Too Many Ns    Reads/s    Time in Aligner (s)
1,000,000      78,762 (7.88%)         602 (0.06%)           920,610 (92.06%)    26 (0.00%)               2,420      413
Statistics:
xubo@xubo:~/xubo/data/alignment/cs-bwamem$ samtools flagstat g38l500N10000.snap1.sam
10000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
9031 + 0 mapped (90.31% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ samtools flagstat g38l500N100000.snap1.sam
[E::hts_open_format] fail to open file 'g38l500N100000.snap1.sam'
samtools flagstat: Cannot open input file "g38l500N100000.snap1.sam": No such file or directory

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ samtools flagstat g38l500N100000.snap.sam
100000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
89974 + 0 mapped (89.97% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ samtools flagstat g38l1000N10000.snap1.sam
10000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
804 + 0 mapped (8.04% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ samtools flagstat g38l1000N100000.snap1.sam
100000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
7853 + 0 mapped (7.85% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

xubo@xubo:~/xubo/data/alignment/cs-bwamem$ samtools flagstat g38l1000N1000000.snap1.sam
1000000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
79364 + 0 mapped (7.94% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
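The SNAP summaries and the flagstat counts can be cross-checked with a little arithmetic: in each run, the two "Aligned" columns reported by SNAP should add up to the "mapped" count from samtools flagstat. The sketch below uses only the numbers recorded in this post; the function name `mapped_rate` is a hypothetical helper for illustration.

```python
# Counts copied from the SNAP summaries and samtools flagstat output in this post.
runs = [
    # (input file, aligned MAPQ>=10, aligned MAPQ<10, total reads, flagstat mapped)
    ("g38l500N10000.fq", 8933, 98, 10000, 9031),
    ("g38l500N100000.fq", 88891, 1083, 100000, 89974),
    ("g38l1000N10000.fq", 796, 8, 10000, 804),
    ("g38l1000N100000.fq", 7786, 67, 100000, 7853),
    ("g38l1000N1000000.fq", 78762, 602, 1000000, 79364),
]

def mapped_rate(high, low, total):
    """Percentage of reads aligned at any MAPQ."""
    return 100.0 * (high + low) / total

for name, high, low, total, flagstat_mapped in runs:
    # SNAP's two aligned columns should sum to flagstat's mapped count.
    consistent = (high + low == flagstat_mapped)
    print(f"{name}: {mapped_rate(high, low, total):.2f}% mapped, "
          f"matches flagstat: {consistent}")
```

Grouping the rows by read length makes the gap plain: the 500 bp inputs map at roughly 90%, the 1000 bp inputs at roughly 8%.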
Research output:
【1】 [BIBM] Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Chao Wang, and Xuehai Zhou, "Distributed Gene Clinical Decision Support System Based on Cloud Computing", in IEEE International Conference on Bioinformatics and Biomedicine. (BIBM 2017, CCF B)
【2】 [IEEE CLOUD] Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Xuehai Zhou. Efficient Distributed Smith-Waterman Algorithm Based on Apache Spark (CLOUD 2017, CCF-C).
【3】 [CCGrid] Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Jinhong Zhou, Xuehai Zhou. DSA: Scalable Distributed Sequence Alignment System Using SIMD Instructions. (CCGrid 2017, CCF-C).
【4】 more: https://github.com/xubo245/Publications
Help
If you have any questions or suggestions, please open an issue in this project or send me an e-mail: xubo245@mail.ustc.edu.cn
WeChat: xu601450868
QQ: 601450868