Smith-Waterman (SW) algorithm
Source: Internet  Editor: 程序博客网  Time: 2024/04/29 06:28
1. What is the Smith-Waterman (SW) algorithm?
The Smith-Waterman (SW) algorithm is in essence a variant of the Needleman-Wunsch (NW) algorithm in which penalties are assigned to mismatched pairs, insertions and deletions. Assigning penalties to mismatches and gaps narrows the scope of the algorithm: rather than lining up the entire sequences, it examines all subsequences of the two sequences and returns only the highest-scoring subsequence alignment(s) found.
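The above is the definition of the Smith-Waterman (SW) algorithm. This article uses the simplest possible concrete example to show how the SW scoring matrix is computed and how the backtracking works.
2. The two main steps of the Smith-Waterman (SW) algorithm
1. Compute the scoring matrix
2. Backtrack through the scoring matrix to find the most similar parts of the two strings
3. Worked example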
A = “CGATCGATCGATATAGTG”
B = “TAGCTAGATCCGAGAT”
These two sequences form the matrix.
What we need to work out now is how this scoring matrix is computed.
In the SW system, the score of a cell depends on several user-specified weights: for matches, mismatches, gaps, and gap extensions. Changing these weights can drastically alter the outcome of an alignment. For example, if a large weight is assigned to the mismatch penalty and a smaller weight to gap penalties, the resulting alignment tends to contain few or no mismatches and a large number of gaps. Conversely, maximizing gap penalties and minimizing mismatch penalties can result in alignments containing more mismatches and a small number of gaps.
Now let's see how the value at the position marked ? in the figure above is computed.
In the subsequent scoring of A and B, the following weights were used:
Match score = 10
Mismatch score = -5
Gap penalty = 10
Gap extension penalty = 8
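At the ? position the vertical (row) character is T and the horizontal (column) character is G. Since T != G, the mismatch score applies, so the score for this pair is -5.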
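The comparison above can be written as a small lookup. This is a minimal sketch using the match and mismatch weights listed in this article; the function name `substitution_score` is hypothetical and chosen only for illustration.

```python
# Weights taken from the list given in the article.
MATCH_SCORE = 10
MISMATCH_SCORE = -5

def substitution_score(row_char, col_char):
    """Return the score for pairing one character from each sequence."""
    return MATCH_SCORE if row_char == col_char else MISMATCH_SCORE

# The cell marked "?" pairs T (row) with G (column), which is a mismatch.
print(substitution_score("T", "G"))  # -5
```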
Compute a candidate value from each of these three blocks separately, then take the largest one as the value at “?”.
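The figure showing the three blocks is not reproduced here. In the general form of the Smith-Waterman recurrence, the three blocks correspond to the diagonal neighbour, the cells above in the same column, and the cells to the left in the same row, with 0 as a floor. The symbols H (scoring matrix), s (match or mismatch score) and W_k (penalty for a gap of length k) are notation assumed for this sketch, not taken from the article's figure:

```latex
H_{i,j} = \max \begin{cases}
H_{i-1,j-1} + s(a_i, b_j) & \text{(diagonal block)} \\
\max_{k \ge 1} \left( H_{i-k,j} - W_k \right) & \text{(column block)} \\
\max_{l \ge 1} \left( H_{i,j-l} - W_l \right) & \text{(row block)} \\
0
\end{cases}
```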
To score cell M13,13 (labeled with a “?”), take the maximum of the candidates given by the equation in Figure 10. That equation gives the following possible scores for this cell: 22 (diagonal score + mismatch score: 27-5), 22 (greatest column gap score: 40-(10+(8*1))), 11 (greatest row gap score: 45-(10+(8*3))), and 0. The largest of these is 22, so 22 is entered into the cell. Scoring then proceeds to the right and down.
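The arithmetic above, and the rule of proceeding to the right and down, can be sketched in Python. This is an illustrative sketch, not the code behind the article's figures. It assumes that a gap reaching back k cells costs gap penalty + extension penalty * (k - 1); the article does not state how gap length maps to the number of extensions, so that mapping and the names `gap_cost` and `score_matrix` are assumptions.

```python
MATCH, MISMATCH = 10, -5
GAP_OPEN, GAP_EXTEND = 10, 8

def gap_cost(length):
    """Assumed cost of a gap spanning `length` cells (length >= 1)."""
    return GAP_OPEN + GAP_EXTEND * (length - 1)

def score_matrix(a, b):
    """Fill the scoring matrix; rows follow `a`, columns follow `b`."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # row 0 and column 0 stay at 0
    for i in range(1, rows):               # top to bottom
        for j in range(1, cols):           # left to right
            same = a[i - 1] == b[j - 1]
            diagonal = H[i - 1][j - 1] + (MATCH if same else MISMATCH)
            # Best score reachable by a gap from any cell above / to the left.
            column_gap = max(H[i - k][j] - gap_cost(k) for k in range(1, i + 1))
            row_gap = max(H[i][j - l] - gap_cost(l) for l in range(1, j + 1))
            H[i][j] = max(diagonal, column_gap, row_gap, 0)
    return H

# The four candidates for the "?" cell, using the numbers quoted in the text.
candidates = [27 + MISMATCH, 40 - (10 + 8 * 1), 45 - (10 + 8 * 3), 0]
print(candidates, max(candidates))  # [22, 22, 11, 0] 22

# A tiny check on identical sequences: the diagonal accumulates matches.
H = score_matrix("GAT", "GAT")
print(H[3][3])  # 30
```

Scanning every earlier cell in the row and column keeps the sketch close to the "greatest column gap score" and "greatest row gap score" wording, at the cost of cubic running time.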
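4. Backtracking through the scoring matrix
How do we backtrack?
1. Start from the position of the largest value. The largest value in the whole scoring matrix is 67, so we backtrack from 67 toward the upper left. Why does the backtrack land on 57? Define the coordinates of 67 as (x, y). To backtrack from 67, search the row and the column that pass through the point (x-1, y-1) (the red box in the figure) and take the largest value found there as the backtrack position for 67. That leads to 57. Then continue backtracking from 57 in the same way, and stop at the first position whose value is zero.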
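The search rule described above can be sketched as follows. This is one reading of the rule, with assumptions: the search covers the part of row x-1 and column y-1 that lies at or before (x-1, y-1), ties go to the diagonal cell, and the function name `traceback_path` is hypothetical. A standard Smith-Waterman traceback instead follows the candidate that actually produced each cell, which can differ from taking the largest nearby value.

```python
def traceback_path(H):
    """Return the visited cells, from the maximum back to the last non-zero cell."""
    cells = [(r, c) for r in range(len(H)) for c in range(len(H[0]))]
    # Start at the largest value in the whole matrix.
    i, j = max(cells, key=lambda rc: H[rc[0]][rc[1]])
    path = []
    while H[i][j] > 0:
        path.append((i, j))
        if i == 0 or j == 0:  # safeguard: nothing lies further up or left
            break
        # Row i-1 from column j-1 leftwards (diagonal cell first) ...
        candidates = [(i - 1, c) for c in range(j - 1, -1, -1)]
        # ... plus column j-1 from row i-2 upwards.
        candidates += [(r, j - 1) for r in range(i - 2, -1, -1)]
        i, j = max(candidates, key=lambda rc: H[rc[0]][rc[1]])
    return path

# Small hand-made matrix (not the article's): a pure diagonal path.
H = [
    [0, 0, 0, 0],
    [0, 10, 0, 0],
    [0, 0, 20, 2],
    [0, 0, 2, 30],
]
print(traceback_path(H))  # [(3, 3), (2, 2), (1, 1)]
```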
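2. Let's practice backtracking one more value
A’ G - - A T C G A T C G - A T A T
B’ G C T A - - G A T C C G A G A T
Read it like this: going from 10 to 15 takes 3 steps horizontally, covering G C T A,
but only one step vertically, so the vertical sequence has to wait two steps before it reaches A.
So the result is (each wait is written as - in A’):
G C T A
G wait wait A
Other cases follow in the same way.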
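The reading rule above, where skipped characters are paired with a wait written as "-", can be sketched like this. The function name `align_from_path` and the 1-based (row, column) cell convention are assumptions for illustration; rows index the vertical sequence and columns the horizontal one, and the path is given from the start of the alignment to its end.

```python
def align_from_path(path, a, b):
    """Turn a list of 1-based (row, col) cells into two gapped strings."""
    top, bottom = [], []
    prev_i, prev_j = path[0][0] - 1, path[0][1] - 1
    for i, j in path:
        # Rows skipped between two path cells: `a` advances, `b` waits.
        for r in range(prev_i + 1, i):
            top.append(a[r - 1])
            bottom.append("-")
        # Columns skipped between two path cells: `b` advances, `a` waits.
        for c in range(prev_j + 1, j):
            top.append("-")
            bottom.append(b[c - 1])
        # The path cell itself pairs one character from each sequence.
        top.append(a[i - 1])
        bottom.append(b[j - 1])
        prev_i, prev_j = i, j
    return " ".join(top), " ".join(bottom)

# The step from the article: one move vertically, three moves horizontally.
vertical, horizontal = align_from_path([(1, 1), (2, 4)], "GA", "GCTA")
print(vertical)    # G - - A
print(horizontal)  # G C T A
```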