R and Bioconductor: ShortRead (reading FASTQ) and Rsamtools
Source: Internet. Time: 2024/06/05 03:49
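I taught myself this material from the Johns Hopkins University course "Bioconductor for Genomic Data Science" on Coursera. It is very good, and I recommend it.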
ShortRead (reading FASTQ)
library(ShortRead)
fl <- system.file(package="ShortRead", "extdata", "E-MTAB-1147",
"ERR127302_1_subset.fastq.gz")
fqFile <- FastqFile(fl)
reads <- readFastq(fqFile)
sread(reads)[1:2]  # the read sequences
quality(reads)[1:2]  # the base quality strings
id(reads)[1:2]  # the read IDs
as(quality(reads), "matrix")[1:2, 1:10]  # convert the quality strings to a matrix of integer scores
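Once the qualities are in matrix form, ordinary matrix functions apply. As a minimal sketch (the names `qmat` and `cycleMeans` are my own, not from the course), the mean quality at each sequencing cycle can be computed like this:

```r
library(ShortRead)

# Same example file as above, shipped with the ShortRead package
fl <- system.file(package = "ShortRead", "extdata", "E-MTAB-1147",
                  "ERR127302_1_subset.fastq.gz")
reads <- readFastq(fl)

# One row per read, one column per cycle (position in the read)
qmat <- as(quality(reads), "matrix")

# Mean quality per cycle; na.rm = TRUE guards against reads shorter than the longest one
cycleMeans <- colMeans(qmat, na.rm = TRUE)
head(cycleMeans, 10)
```

A drop in these means towards the end of the read is a common reason to trim reads before alignment.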
Rsamtools
library(Rsamtools)
bamPath <- system.file("extdata","ex1.bam",package = "Rsamtools")
bamFile <- BamFile(bamPath)
bamFile
seqinfo(bamFile)
aln <- scanBam(bamFile)
length(aln)
class(aln)
names(aln)
aln <- aln[[1]]
names(aln)
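lapply(aln, function(xx) xx[1])  # first element of each field, i.e. the first alignment
yieldSize(bamFile) <- 1
bamFile  # yieldSize is now 1: each scanBam() call reads a single record
open(bamFile)
scanBam(bamFile)[[1]]$seq  # first read
scanBam(bamFile)[[1]]$seq  # second read
scanBam(bamFile)[[1]]$seq  # third read
close(bamFile)
yieldSize(bamFile) <- NA  # reset yieldSize before the range-based queries below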
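The same mechanism lets you walk through a large BAM file in chunks instead of loading it all at once. The following is only a sketch: the chunk size of 1000 and the names `bf` and `total` are arbitrary choices of mine. It counts the records chunk by chunk and stops when a chunk comes back empty.

```r
library(Rsamtools)

bamPath <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# Ask for at most 1000 records per scanBam() call (arbitrary chunk size)
bf <- BamFile(bamPath, yieldSize = 1000)
open(bf)

total <- 0L
repeat {
  # Only read the 'pos' field to keep each chunk small
  chunk <- scanBam(bf, param = ScanBamParam(what = "pos"))[[1]]
  n <- length(chunk$pos)
  if (n == 0L) break  # an empty chunk means the file is exhausted
  total <- total + n
}
close(bf)

total  # number of records seen across all chunks
```

The total should agree with `countBam(bamPath)$records`, which counts the records in one pass.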
gr <- GRanges(seqnames = "seq2",
ranges = IRanges(start = c(100,1000),end =c(1500,2000)))
gr
params<- ScanBamParam(which = gr,what = scanBamWhat())
scanBamWhat()
aln <- scanBam(bamFile,param = params)
names(aln)
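head(aln[[1]]$pos)  # some reads start before position 100 but still overlap the range
bamView <- BamViews(bamPath)  # a BamViews object can reference several BAM files
bamView
aln <- scanBam(bamView)  # read the BAM file(s) in the view
names(aln[[1]][[1]])
bamRanges(bamView) <- gr  # restrict the BamViews to the ranges in gr
aln <- scanBam(bamView)
names(aln[[1]])
quickBamFlagSummary(bamFile)  # quick summary of the flag fields in the BAM file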
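If you only need to know how many reads fall in each range, and not the reads themselves, `countBam()` accepts the same kind of `ScanBamParam`. A small sketch using the same two ranges (the variable name `counts` is mine):

```r
library(Rsamtools)

bamPath <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# The same two ranges on seq2 as in the scanBam() example
gr <- GRanges(seqnames = "seq2",
              ranges = IRanges(start = c(100, 1000), end = c(1500, 2000)))

# One row per range; 'records' is the number of alignments overlapping it
counts <- countBam(BamFile(bamPath), param = ScanBamParam(which = gr))
counts[, c("space", "start", "end", "records")]
```

Note that the two ranges overlap between positions 1000 and 1500, so a read in that interval is counted in both rows.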