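Genomic Data Processing 21: BWA-SW with the reference split into blocks, each block indexed and then aligned (reference split into four segments, 250 reads)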

Source: Internet. Editor: 程序博客网. Time: 2024/06/07 16:13

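1. Timing analysis

When the reference is a single chromosome segment, the first alignment took roughly 3-5 s in my tests (the first runs in the log below took 2.9 s and 2.5 s). Against the combined chr1-4 reference, the first alignment took about 20 s.

After several consecutive runs, the single-segment alignment dropped to about 1 s and the chr1-4 alignment dropped to about 2 s.

I do not understand why the first alignment takes noticeably longer while the following ones get faster.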
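One possible explanation, not verified in this post, is file caching: on the first run the index files have to be read from disk, and on later runs the operating system may serve them from memory. The log is at least consistent with this, because the slow runs show a large gap between wall-clock ("Real time") and CPU time, while the fast runs show almost none. A minimal sketch of that comparison, using four timing pairs copied from the log (the function name `off_cpu_seconds` is just an illustrative choice):

```python
# (real seconds, CPU seconds) copied from "[main] Real time" lines in the log.
runs = {
    "chr1, first run": (2.885, 1.118),
    "chr1, repeat run": (1.068, 1.022),
    "chr1-4, first run": (20.819, 3.195),
    "chr1-4, repeat run": (2.034, 1.970),
}


def off_cpu_seconds(real, cpu):
    """Wall-clock time during which the process was not running on the CPU,
    for example while waiting for disk reads."""
    return real - cpu


for label, (real, cpu) in runs.items():
    wait = off_cpu_seconds(real, cpu)
    print(f"{label:20s} real={real:7.3f}s cpu={cpu:6.3f}s off-CPU={wait:7.3f}s")
```

If caching is indeed the cause, the CPU time should stay roughly constant across runs while only the off-CPU part shrinks, which is close to what the numbers show.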


Commands run:

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 2.885 sec; CPU: 1.118 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; CPU: 1.022 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; CPU: 1.017 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; CPU: 1.019 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 2.511 sec; CPU: 1.056 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 0.999 sec; CPU: 0.950 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.017 sec; CPU: 0.964 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.009 sec; CPU: 0.965 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.071 sec; CPU: 1.019 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.072 sec; CPU: 1.015 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; CPU: 1.018 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000chr1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.065 sec; CPU: 1.017 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000chr1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.070 sec; CPU: 1.017 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000chr1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.050 sec; CPU: 1.009 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000chr2.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.017 sec; CPU: 0.969 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000chr2.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.015 sec; CPU: 0.969 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000chr2.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.023 sec; CPU: 0.966 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq >SRR003161h1000chr3.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq
[main] Real time: 0.940 sec; CPU: 0.885 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq >SRR003161h1000chr3.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq
[main] Real time: 0.933 sec; CPU: 0.888 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq >SRR003161h1000chr3.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq
[main] Real time: 0.915 sec; CPU: 0.872 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq >SRR003161h1000chr4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq
[main] Real time: 0.918 sec; CPU: 0.871 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq >SRR003161h1000chr4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq
[main] Real time: 0.919 sec; CPU: 0.868 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq >SRR003161h1000chr4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq
[main] Real time: 0.889 sec; CPU: 0.853 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 20.819 sec; CPU: 3.195 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 17.380 sec; CPU: 2.803 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 14.140 sec; CPU: 2.454 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 4.305 sec; CPU: 2.166 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.034 sec; CPU: 1.970 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.059 sec; CPU: 1.995 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.079 sec; CPU: 2.000 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ rm SRR003161h1000chr1-4.sam
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.046 sec; CPU: 1.997 sec
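With this many repeated runs, it may be easier to compare the first and the fastest run per reference programmatically than by eye. The sketch below assumes the bwa stderr output has been saved to a text file with one message per line; the function name `real_times_by_ref` and the short `sample` string are illustrative and not part of the original workflow.

```python
import re
from collections import defaultdict

# The two bwa log lines of interest: the command line and its timing line.
CMD_RE = re.compile(r"\[main\] CMD: bwa bwasw (\S+) \S+")
TIME_RE = re.compile(r"\[main\] Real time: ([\d.]+) sec; CPU: ([\d.]+) sec")


def real_times_by_ref(log_text):
    """Return {reference file: [real seconds of each run, in log order]}."""
    times = defaultdict(list)
    ref = None
    for line in log_text.splitlines():
        cmd = CMD_RE.search(line)
        if cmd:
            ref = cmd.group(1)  # remember which reference this run used
            continue
        timing = TIME_RE.search(line)
        if timing and ref is not None:
            times[ref].append(float(timing.group(1)))
    return dict(times)


# Two runs copied from the log above, as a small self-contained input.
sample = """\
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 20.819 sec; CPU: 3.195 sec
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.034 sec; CPU: 1.970 sec
"""

for ref, secs in real_times_by_ref(sample).items():
    print(ref, "first:", secs[0], "fastest:", min(secs))
```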

2. Accuracy analysis:

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr1.sam
264 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
105 + 0 mapped (39.77% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr2.sam
260 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
83 + 0 mapped (31.92% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr3.sam
256 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
80 + 0 mapped (31.25% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr4.sam
254 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
58 + 0 mapped (22.83% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr1-4.sam
264 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
146 + 0 mapped (55.30% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
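Two things are worth keeping in mind when reading these numbers. First, the totals (254 to 264) exceed the 250 input reads, probably because bwasw can emit more than one alignment record for a read, so the percentages are per SAM record rather than per read. Second, the four per-segment mapped counts add up to 105 + 83 + 80 + 58 = 326, far more than the 146 mapped against the combined chr1-4 reference, which suggests that some reads are reported on more than one segment when the segments are aligned separately; the per-segment results therefore cannot simply be added together. The sketch below shows one way to pull the total and mapped counts out of flagstat text and recompute the percentage; the function name `mapped_fraction` and the shortened `sample` are illustrative assumptions.

```python
import re

# "N + M in total ..." and "N + M mapped (...)" lines of samtools flagstat.
# Anchoring at line start keeps "with itself and mate mapped" from matching.
TOTAL_RE = re.compile(r"^(\d+) \+ \d+ in total", re.MULTILINE)
MAPPED_RE = re.compile(r"^(\d+) \+ \d+ mapped \(", re.MULTILINE)


def mapped_fraction(flagstat_text):
    """Return (mapped records, total records, mapped percentage)."""
    total = int(TOTAL_RE.search(flagstat_text).group(1))
    mapped = int(MAPPED_RE.search(flagstat_text).group(1))
    return mapped, total, 100.0 * mapped / total


# First lines of the chr1 flagstat output shown above.
sample = """\
264 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
105 + 0 mapped (39.77% : N/A)
"""

mapped, total, pct = mapped_fraction(sample)
print(f"{mapped}/{total} records mapped = {pct:.2f}%")
```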

3. The alignment result files are too long, so they are not pasted here.
