Dynamic Programming and the Longest Common Subsequence (LCS)

Source: Internet. Editor: 程序博客网. Date: 2024/05/22 12:52

Hallmarks of dynamic programming:

1. Optimal substructure

An optimal solution to a problem contains optimal solutions to its subproblems.

2. Overlapping subproblems

A recursive solution contains a small number of distinct subproblems, each repeated many times.
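To make the second hallmark concrete, the sketch below counts the calls made by a naive recursion for the LCS length (the problem treated in the next section). The names `lcs_naive` and `count_naive_calls` and the two input strings are illustrative assumptions, not taken from the original post. The inputs share no characters, so every call branches twice.

```c
/* Number of calls made by the naive recursion (incremented on every call). */
static long naive_calls = 0;

/* Naive recursive LCS length of x[0..i-1] and y[0..j-1], with no table. */
static int lcs_naive(const char *x, const char *y, int i, int j)
{
    naive_calls++;
    if (i == 0 || j == 0)
        return 0;
    if (x[i - 1] == y[j - 1])
        return lcs_naive(x, y, i - 1, j - 1) + 1;
    int up = lcs_naive(x, y, i - 1, j);
    int left = lcs_naive(x, y, i, j - 1);
    return up > left ? up : left;
}

/* Runs the naive recursion on two hypothetical length-8 strings that
   have no character in common and returns the number of calls made. */
long count_naive_calls(void)
{
    naive_calls = 0;
    lcs_naive("AAAAAAAA", "BBBBBBBB", 8, 8);
    return naive_calls;
}
```

For these inputs there are only 9 × 9 = 81 distinct (i, j) subproblems, yet the recursion makes 25739 calls, which is exactly the repetition that a table removes.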


Longest common subsequence


1. Memoization algorithm

Pseudocode

LCS(x, y, i, j)  // ignoring base cases
    if c[i][j] = NULL
        then if x[i] = y[j]
                 then c[i][j] = LCS(x, y, i-1, j-1) + 1
                 else c[i][j] = max{LCS(x, y, i-1, j), LCS(x, y, i, j-1)}
    return c[i][j]
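The memoized recursion above could be written in C roughly as follows. This is a minimal sketch: the names `lcs_memo`, `lcs_memo_length`, `memo` and the bound `MAXN` are assumptions for illustration, the value -1 plays the role of NULL, and the base cases that the pseudocode omits are handled explicitly.

```c
#include <string.h>

#define MAXN 50   /* assumed upper bound on the input lengths */

/* memo[i][j] caches the LCS length of x[0..i-1] and y[0..j-1];
   -1 means "not computed yet" (the NULL of the pseudocode). */
static int memo[MAXN + 1][MAXN + 1];

static int lcs_memo(const char *x, const char *y, int i, int j)
{
    if (i == 0 || j == 0)            /* base case: one prefix is empty */
        return 0;
    if (memo[i][j] != -1)            /* already solved: reuse the answer */
        return memo[i][j];
    if (x[i - 1] == y[j - 1]) {
        memo[i][j] = lcs_memo(x, y, i - 1, j - 1) + 1;
    } else {
        int up = lcs_memo(x, y, i - 1, j);
        int left = lcs_memo(x, y, i, j - 1);
        memo[i][j] = up > left ? up : left;
    }
    return memo[i][j];
}

/* Returns the LCS length of x and y, or -1 if an input exceeds MAXN. */
int lcs_memo_length(const char *x, const char *y)
{
    int m = (int)strlen(x), n = (int)strlen(y);
    if (m > MAXN || n > MAXN)
        return -1;
    memset(memo, -1, sizeof memo);   /* all bytes 0xFF, so every int is -1 */
    return lcs_memo(x, y, m, n);
}
```

Calling `lcs_memo_length("ABCBDAB", "BDCABA")` returns 4. Each (i, j) pair is solved at most once, so the running time is O(mn).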

2. Dynamic programming method

Traceback example (rows: GAC, columns: AGCAT; the arrows in each cell point to the cell or cells its value was derived from)

      Ø     A     G     C     A     T
Ø     0     0     0     0     0     0
G     0     ←↑0   ↖1    ←1    ←1    ←1
A     0     ↖1    ←↑1   ←↑1   ↖2    ←2
C     0     ↑1    ←↑1   ↖2    ←↑2   ←↑2

Pseudocode

function LCSLength(X[1..m], Y[1..n])
    C = array(0..m, 0..n)
    for i := 0..m
        C[i,0] := 0
    for j := 0..n
        C[0,j] := 0
    for i := 1..m
        for j := 1..n
            if X[i] = Y[j]
                C[i,j] := C[i-1,j-1] + 1
            else
                C[i,j] := max(C[i,j-1], C[i-1,j])
    return C[m,n]

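#include <stdio.h>

int c[50][50];   /* c[i][j] = LCS length of x[0..i-1] and y[0..j-1] */

/* Fill the table c for x (length m) and y (length n). */
void LCSlength(char x[], char y[], int m, int n)
{
    int i, j;
    for (i = 0; i <= m; i++)      /* first column: empty prefix of y */
        c[i][0] = 0;
    for (j = 0; j <= n; j++)      /* first row: empty prefix of x */
        c[0][j] = 0;
    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++) {
            if (x[i] == y[j])
                c[i + 1][j + 1] = c[i][j] + 1;
            else if (c[i + 1][j] > c[i][j + 1])
                c[i + 1][j + 1] = c[i + 1][j];
            else
                c[i + 1][j + 1] = c[i][j + 1];
        }
}

/* Trace back through c and write one LCS into lcs (the caller terminates it). */
void LCS(char *lcs, char *x, char *y, int m, int n)
{
    int i, j, k;
    LCSlength(x, y, m, n);
    i = m - 1;
    j = n - 1;
    k = c[m][n] - 1;              /* fill lcs from its last character */
    while (i >= 0 && j >= 0) {
        if (x[i] == y[j]) {
            lcs[k--] = x[i];
            i--;
            j--;
        } else if (c[i][j + 1] > c[i + 1][j]) {
            i--;                  /* the cell above holds the larger value */
        } else {
            j--;
        }
    }
}

int main(void)
{
    char x[7] = {'A', 'B', 'C', 'B', 'D', 'A', 'B'};
    char y[6] = {'B', 'D', 'C', 'A', 'B', 'A'};
    char lcs[6];
    LCS(lcs, x, y, 7, 6);
    lcs[c[7][6]] = '\0';
    printf("%d %s\n", c[7][6], lcs);
    return 0;
}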

Recursive traceback of one LCS

Pseudocode

function backtrack(C[0..m,0..n], X[1..m], Y[1..n], i, j)
    if i = 0 or j = 0
        return ""
    else if X[i] = Y[j]
        return backtrack(C, X, Y, i-1, j-1) + X[i]
    else
        if C[i,j-1] > C[i-1,j]
            return backtrack(C, X, Y, i, j-1)
        else
            return backtrack(C, X, Y, i-1, j)
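A possible C rendering of this recursive traceback is sketched below. The names `fill_table`, `backtrack_one`, `one_lcs`, `tab` and the bound `MAXN` are assumptions for illustration; indices are shifted so that the pseudocode's X[i] is `x[i - 1]`.

```c
#include <string.h>

#define MAXN 50   /* assumed upper bound on the input lengths */

static int tab[MAXN + 1][MAXN + 1];   /* tab[i][j]: LCS length of the prefixes */

static void fill_table(const char *x, const char *y, int m, int n)
{
    for (int i = 0; i <= m; i++) tab[i][0] = 0;
    for (int j = 0; j <= n; j++) tab[0][j] = 0;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1])
                tab[i][j] = tab[i - 1][j - 1] + 1;
            else
                tab[i][j] = tab[i][j - 1] > tab[i - 1][j] ? tab[i][j - 1]
                                                          : tab[i - 1][j];
        }
}

/* Writes one LCS of x[0..i-1] and y[0..j-1] to the front of out and
   returns how many characters were written (no terminator). */
static int backtrack_one(const char *x, const char *y, int i, int j, char *out)
{
    if (i == 0 || j == 0)
        return 0;
    if (x[i - 1] == y[j - 1]) {
        int len = backtrack_one(x, y, i - 1, j - 1, out);
        out[len] = x[i - 1];          /* append the matched character */
        return len + 1;
    }
    if (tab[i][j - 1] > tab[i - 1][j])
        return backtrack_one(x, y, i, j - 1, out);
    return backtrack_one(x, y, i - 1, j, out);   /* ties go up */
}

/* Stores one LCS in out (needs room for min(m, n) + 1 chars) and returns
   its length, or -1 if an input exceeds MAXN. */
int one_lcs(const char *x, const char *y, char *out)
{
    int m = (int)strlen(x), n = (int)strlen(y);
    if (m > MAXN || n > MAXN)
        return -1;
    fill_table(x, y, m, n);
    int len = backtrack_one(x, y, m, n, out);
    out[len] = '\0';
    return len;
}
```

With x = "ABCBDAB" and y = "BDCABA" this tie-breaking rule yields "BCBA". A different rule for equal neighbours can yield a different LCS of the same length.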

Traceback of all LCSs

Pseudocode

function backtrackAll(C[0..m,0..n], X[1..m], Y[1..n], i, j)
    if i = 0 or j = 0
        return {""}
    else if X[i] = Y[j]
        return {Z + X[i] for all Z in backtrackAll(C, X, Y, i-1, j-1)}
    else
        R := {}
        if C[i,j-1] ≥ C[i-1,j]
            R := backtrackAll(C, X, Y, i, j-1)
        if C[i-1,j] ≥ C[i,j-1]
            R := R ∪ backtrackAll(C, X, Y, i-1, j)
        return R
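C has no built-in set type, so the sketch below swaps the set-returning recursion for a depth-first walk that fills a buffer from its last position and stores each finished string once. The names `all_lcs`, `lcs_result`, `collect`, `add_result` and the bounds `MAXN` and `MAXOUT` are assumptions for illustration. Like the pseudocode, it can take exponential time when many paths tie.

```c
#include <string.h>

#define MAXN 20     /* assumed cap on the input lengths */
#define MAXOUT 64   /* assumed cap on the number of distinct LCSs kept */

static int tbl[MAXN + 1][MAXN + 1];
static char results[MAXOUT][MAXN + 1];
static int nresults;

/* Store s unless it is already present (set semantics) or the store is full. */
static void add_result(const char *s)
{
    for (int k = 0; k < nresults; k++)
        if (strcmp(results[k], s) == 0)
            return;
    if (nresults < MAXOUT)
        strcpy(results[nresults++], s);
}

/* buf[pos..] holds the tail of the LCS built so far; pos equals tbl[i][j]. */
static void collect(const char *x, const char *y, int i, int j,
                    char *buf, int pos)
{
    if (i == 0 || j == 0) {           /* table value is 0 here, so pos is 0 */
        add_result(buf + pos);
        return;
    }
    if (x[i - 1] == y[j - 1]) {
        buf[pos - 1] = x[i - 1];
        collect(x, y, i - 1, j - 1, buf, pos - 1);
        return;
    }
    if (tbl[i][j - 1] >= tbl[i - 1][j])   /* left keeps the optimum */
        collect(x, y, i, j - 1, buf, pos);
    if (tbl[i - 1][j] >= tbl[i][j - 1])   /* up keeps the optimum */
        collect(x, y, i - 1, j, buf, pos);
}

/* Returns the number of distinct LCSs found, or -1 if an input is too long. */
int all_lcs(const char *x, const char *y)
{
    int m = (int)strlen(x), n = (int)strlen(y);
    char buf[MAXN + 1];
    if (m > MAXN || n > MAXN)
        return -1;
    for (int i = 0; i <= m; i++) tbl[i][0] = 0;
    for (int j = 0; j <= n; j++) tbl[0][j] = 0;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1])
                tbl[i][j] = tbl[i - 1][j - 1] + 1;
            else
                tbl[i][j] = tbl[i][j - 1] > tbl[i - 1][j] ? tbl[i][j - 1]
                                                          : tbl[i - 1][j];
        }
    nresults = 0;
    buf[tbl[m][n]] = '\0';
    collect(x, y, m, n, buf, tbl[m][n]);
    return nresults;
}

/* Returns the k-th stored LCS, or NULL if k is out of range. */
const char *lcs_result(int k)
{
    return (k >= 0 && k < nresults) ? results[k] : NULL;
}
```

For x = "ABCBDAB" and y = "BDCABA" the walk finds three strings: BDAB, BCAB and BCBA.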



Related problems:

1. Shortest common supersequence

u is a common supersequence of x and y if and only if both x and y are subsequences of u.

Given two sequences X = < x1,...,xm > and Y = < y1,...,yn >, a sequence U = < u1,...,uk > is a common supersequence of X and Y if U is a supersequence of both X and Y. In other words, a shortest common supersequence of strings x and y is a shortest string z such that both x and y are subsequences of z.

For example, if X[1..m] = abcbdab and Y[1..n] = bdcaba, the lcs is Z[1..r] = bcba. By inserting the non-lcs symbols while preserving the symbol order, we get the scs: U[1..t] = abdcabdab.

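Relationship to LCS

The length of a shortest common supersequence follows directly from the LCS length: |SCS(X,Y)| = m + n - |LCS(X,Y)|. In the example above, 7 + 6 - 4 = 9.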
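One way to build a shortest common supersequence is to walk the LCS table backwards, emitting each matched character once and every other character as it is passed. The sketch below assumes this approach; the names `scs`, `is_subsequence`, `t` and the bound `MAXN` are illustrative and do not come from the original post.

```c
#include <string.h>

#define MAXN 50   /* assumed upper bound on the input lengths */

static int t[MAXN + 1][MAXN + 1];   /* LCS-length table */

/* Returns 1 if s is a subsequence of u, 0 otherwise. */
int is_subsequence(const char *s, const char *u)
{
    while (*s && *u) {
        if (*s == *u)
            s++;
        u++;
    }
    return *s == '\0';
}

/* Writes one shortest common supersequence of x and y into out (needs room
   for m + n + 1 chars) and returns its length, or -1 if an input is too long. */
int scs(const char *x, const char *y, char *out)
{
    int m = (int)strlen(x), n = (int)strlen(y);
    if (m > MAXN || n > MAXN)
        return -1;
    for (int i = 0; i <= m; i++) t[i][0] = 0;
    for (int j = 0; j <= n; j++) t[0][j] = 0;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1])
                t[i][j] = t[i - 1][j - 1] + 1;
            else
                t[i][j] = t[i][j - 1] > t[i - 1][j] ? t[i][j - 1] : t[i - 1][j];
        }
    int len = m + n - t[m][n];
    int i = m, j = n, k = len;
    out[k] = '\0';
    while (i > 0 && j > 0) {
        if (x[i - 1] == y[j - 1]) {          /* shared symbol: emit it once */
            out[--k] = x[i - 1];
            i--;
            j--;
        } else if (t[i - 1][j] >= t[i][j - 1]) {
            out[--k] = x[--i];               /* symbol that only x contributes */
        } else {
            out[--k] = y[--j];               /* symbol that only y contributes */
        }
    }
    while (i > 0) out[--k] = x[--i];         /* leftover prefix of x */
    while (j > 0) out[--k] = y[--j];         /* leftover prefix of y */
    return len;
}
```

For x = "abcbdab" and y = "bdcaba" the result has length 9, and `is_subsequence` confirms that both inputs are subsequences of it. The exact string depends on how ties are broken, so it need not equal the supersequence quoted in the text.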


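2. Edit distance / Levenshtein distance

The edit distance, also called the Levenshtein distance, between two strings is the minimum number of edit operations needed to turn one into the other. The permitted operations are replacing one character with another, inserting a character, and deleting a character.

The edit distance when only insertion and deletion are allowed (no substitution), or when a substitution costs twice as much as an insertion or deletion, is: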

d'(X,Y) = n + m - 2 \cdot \left|LCS(X,Y)\right|.
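The formula can be checked with a few lines of C. This is a minimal sketch; the names `indel_distance`, `lcs_len` and the bound `MAXN` are assumptions for illustration.

```c
#include <string.h>

#define MAXN 50   /* assumed upper bound on the input lengths */

static int lcs_len[MAXN + 1][MAXN + 1];

/* Insertion/deletion-only edit distance: n + m - 2 * |LCS(x, y)|.
   Returns -1 if an input exceeds MAXN. */
int indel_distance(const char *x, const char *y)
{
    int m = (int)strlen(x), n = (int)strlen(y);
    if (m > MAXN || n > MAXN)
        return -1;
    for (int i = 0; i <= m; i++) lcs_len[i][0] = 0;
    for (int j = 0; j <= n; j++) lcs_len[0][j] = 0;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1])
                lcs_len[i][j] = lcs_len[i - 1][j - 1] + 1;
            else
                lcs_len[i][j] = lcs_len[i][j - 1] > lcs_len[i - 1][j]
                                    ? lcs_len[i][j - 1]
                                    : lcs_len[i - 1][j];
        }
    /* delete the m - L symbols outside the LCS, then insert the n - L missing ones */
    return m + n - 2 * lcs_len[m][n];
}
```

For "ABCBDAB" and "BDCABA" the LCS length is 4, so the distance is 7 + 6 - 2 · 4 = 5.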

