Aligning SNP Chip Probes Back to a Gene Sequence

Source: Internet | Editor: 程序博客网 | Time: 2024/06/13 21:32

Aligning SNP flanking sequences to a gene sequence

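To find out how many SNP markers fall within a particular gene, align the SNP probe sequences back to the gene sequence. The work is done in R and on Linux, and requires BWA, samtools, and related tools to be installed.
The steps are as follows: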

# Extract the SNP flanking sequences into a FASTA file for BWA alignment
setwd("/Users/zhanghuairen/600Ksnp/")
library(stringr)

da <- read.csv(file = "chr1.csv", header = TRUE, stringsAsFactors = FALSE)  # SNP chip annotation file
head(da)

sink("snp_marker_seq.fa")  # create the FASTA file; the cat() output below is written to it
for (i in seq_len(nrow(da))) {
  # Split the flank at the bracketed allele field
  flank_seq <- unlist(str_split(da$Flank[i], "\\[.*\\]"))
  # Header line: probe set ID and physical position, separated by "|"
  cat(str_c(">", paste(da$Probe.Set.ID[i], da$Physical.Position[i], sep = "|")))
  cat("\n")
  # Sequence line: left and right flanks joined, without the SNP site
  cat(paste(flank_seq[1], flank_seq[2], sep = ""))
  cat("\n")
}
sink()

# Build the BWA index for the gene sequence (only needs to run once).
# The default index prefix is gene2.fasta, which is what bwa mem looks for below.
system("bwa index gene2.fasta")

# Align the SNP marker flanking sequences to the gene sequence
system("bwa mem gene2.fasta snp_marker_seq.fa > snpMarke_align.sam")

# Keep only the aligned records (-F 4 drops unmapped reads)
system("samtools view -F 4 snpMarke_align.sam > snpMarkerFiltered.sam")

# Count the aligned records
system("wc -l snpMarkerFiltered.sam")
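The line count from wc -l equals the number of SNP markers only if every marker produces a single alignment record. bwa mem may also report supplementary alignments for the same query, so counting distinct query names is a safer estimate. The following is a minimal sketch of that idea, not part of the original workflow: the SAM records, marker names, and the count_mapped_markers helper are all made up for illustration, and awk is used so that the example runs without samtools.

```shell
#!/bin/sh
# Hypothetical SAM records (tab-separated); marker names and positions are invented.
# Columns: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
sam_file=$(mktemp)
printf 'AX-001|1200\t0\tgene2\t101\t60\t70M\t*\t0\t0\t*\t*\n'       > "$sam_file"
printf 'AX-001|1200\t2048\tgene2\t900\t0\t30M40H\t*\t0\t0\t*\t*\n' >> "$sam_file"
printf 'AX-002|5300\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*\n'               >> "$sam_file"
printf 'AX-003|7800\t16\tgene2\t455\t60\t70M\t*\t0\t0\t*\t*\n'     >> "$sam_file"

# Count distinct marker names among mapped records.
# int(flag / 4) % 2 == 1 tests the 0x4 (unmapped) bit using only POSIX awk.
count_mapped_markers() {
    awk -F '\t' '
        /^@/ { next }                   # skip header lines, if present
        int($2 / 4) % 2 == 1 { next }   # drop unmapped records (same effect as -F 4)
        !seen[$1]++ { n++ }             # count each query name once
        END { print n + 0 }
    ' "$1"
}

n_markers=$(count_mapped_markers "$sam_file")
echo "distinct aligned markers: $n_markers"
rm -f "$sam_file"
```

In this toy input three records are mapped, but two of them belong to the same marker, so the function reports 2 where wc -l on the filtered file would report 3. On real data, the same function could be pointed at snpMarke_align.sam or snpMarkerFiltered.sam.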